STRING_AGG missing functionality #2

gabotechs · 2025-02-15T16:17:30Z

…nality

…#2) When scanning an exact list of remote Parquet files, the ListingTable fetched file metadata (via `head` calls) sequentially. The cause was `stream::iter(file_list).flatten()`, which drains each one-item stream in order. For remote blob stores, where each `head` call can take tens to hundreds of milliseconds, this sequential behavior significantly increased the time needed to create the physical plan.

This commit replaces the sequential flattening with concurrent merging using `stream::iter(file_list).flatten_unordered(meta_fetch_concurrency)`. With this change, the `head` requests run concurrently (up to the configured `meta_fetch_concurrency` limit), reducing latency when creating the physical plan.

The loss of ordering introduced by `flatten_unordered` is acceptable because the file list is fully sorted by path in `split_files` before being returned. Tests have also been updated to check that metadata fetching occurs concurrently.
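For readers who want to see the idea in isolation, here is a minimal sketch of bounded-concurrency metadata fetching followed by a sort on path. It is not the code from this PR: it uses only the standard library (scoped threads instead of `flatten_unordered` on a stream), and `FileMeta`, `head`, and `fetch_all` are hypothetical names with a simulated latency standing in for a real object store request.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the metadata a `head` call would return.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileMeta {
    pub path: String,
    pub size: u64,
}

/// Simulated `head` request: sleeps to mimic remote latency.
/// (Assumption: a real implementation would call the object store here.)
fn head(path: &str, latency: Duration) -> FileMeta {
    thread::sleep(latency);
    FileMeta {
        path: path.to_string(),
        size: path.len() as u64,
    }
}

/// Fetches metadata with at most `limit` requests in flight, then sorts by
/// path so the arrival order of the responses does not leak to the caller.
pub fn fetch_all(paths: &[&str], limit: usize, latency: Duration) -> Vec<FileMeta> {
    // Shared cursor: each worker claims the next unprocessed index.
    let next = AtomicUsize::new(0);
    let results = Mutex::new(Vec::with_capacity(paths.len()));
    let workers = limit.max(1).min(paths.len().max(1));

    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= paths.len() {
                    break;
                }
                let meta = head(paths[i], latency);
                // Results are pushed in completion order, which is arbitrary.
                results.lock().unwrap().push(meta);
            });
        }
    });

    let mut metas = results.into_inner().unwrap();
    // Restore a deterministic order, mirroring the sort by path described above.
    metas.sort_by(|a, b| a.path.cmp(&b.path));
    metas
}
```

Calling `fetch_all(&["c.parquet", "a.parquet", "b.parquet"], 2, Duration::from_millis(50))` returns the three entries sorted by path regardless of which simulated request finished first. With `limit` set to 1 the calls run one after another, which corresponds to the sequential behavior this change removes.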

github-actions · 2025-04-17T02:07:13Z

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

gabotechs added 2 commits February 15, 2025 17:04

Add a STRING_AGG implementation based on ARRAY_AGG for reusing funcio…

50c930e

…nality

Add a STRING_AGG implementation based on ARRAY_AGG for reusing funcio…

84e4fbb

…nality

github-actions bot added the Stale label Apr 17, 2025

gabotechs closed this Apr 17, 2025
