Fix concurrency issue with load snapshot from DeltaLakeMetadata #25588
Description
If getTableHandle is called concurrently for the same table, the getSnapshot method can hit a concurrency issue.
Steps to reproduce:
CREATE TABLE my_catalog_delta.my_schema.my_table_1 AS SELECT 1 AS a;
If you repeat it a few times, you will notice that the query sometimes fails with a "latestTableVersions changed concurrently" exception.
This happens because there are multiple concurrent calls to getSnapshot: the first call has updated the latestTableVersions map but not yet the queriedSnapshots map, so the second call sees the two maps in an inconsistent state.
Additional context and related issues
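To illustrate the race described above, here is a minimal sketch of one way to keep the two maps consistent: update both under a single lock. It is not necessarily the change made in this PR. The class SnapshotCacheSketch, the loadSnapshot helper, the countFailures driver, and the simplified String/Long types are all hypothetical; only the map names latestTableVersions and queriedSnapshots and the method name getSnapshot come from the description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the snapshot cache; not the actual Trino code.
public class SnapshotCacheSketch
{
    // Map names mirror the description; key and value types are simplified assumptions.
    private final Map<String, Long> latestTableVersions = new HashMap<>();
    private final Map<String, String> queriedSnapshots = new HashMap<>();

    // Both maps are read and written while holding the same lock, so no caller
    // can observe a version entry without its matching snapshot entry.
    public synchronized String getSnapshot(String table, long version)
    {
        Long cachedVersion = latestTableVersions.get(table);
        if (cachedVersion != null && cachedVersion == version) {
            String snapshot = queriedSnapshots.get(table);
            if (snapshot == null) {
                // Without the lock, a second caller could land here between the two puts below.
                throw new IllegalStateException("latestTableVersions changed concurrently");
            }
            return snapshot;
        }
        String snapshot = loadSnapshot(table, version);
        latestTableVersions.put(table, version);
        queriedSnapshots.put(table, snapshot);
        return snapshot;
    }

    // Hypothetical placeholder for reading the table's transaction log.
    private static String loadSnapshot(String table, long version)
    {
        return table + "@" + version;
    }

    // Calls getSnapshot for the same table from several threads and counts failures.
    public static int countFailures(int threads)
            throws InterruptedException
    {
        SnapshotCacheSketch cache = new SnapshotCacheSketch();
        AtomicInteger failures = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int i = 0; i < threads; i++) {
            executor.execute(() -> {
                try {
                    start.await();
                    for (int j = 0; j < 1000; j++) {
                        cache.getSnapshot("my_table_1", 0L);
                    }
                }
                catch (Exception e) {
                    failures.incrementAndGet();
                }
            });
        }
        start.countDown();
        executor.shutdown();
        if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("timed out waiting for workers");
        }
        return failures.get();
    }

    public static void main(String[] args)
            throws InterruptedException
    {
        System.out.println("failures: " + countFailures(8));
    }
}
```

A single lock is the simplest option and serializes all callers; a per-table lock or a single map holding a combined version-and-snapshot value would be alternatives if contention matters.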
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: