[SPARK-51717][SS] [RocksDB] Fix SST mismatch corruption that can happen for second snapshot created for a new query #50512
What changes were proposed in this pull request?
Fix error: Sst file size mismatch ... MANIFEST-000005 may be corrupted.
This is an edge case in SST file reuse that can occur only for the first-ever RocksDB checkpoint, when the following conditions hold:
The problem originates in step 3: the file manager loads v0 differently from other versions. When loading any other version, deleting an existing local file also removes it from the file mapping. For v0, however, the file manager simply deletes the local directory, and the file mapping was never cleared. As a result, the old x.sst was still present in the file mapping at step 4. This PR clears the file mapping when loading v0 and adds a size check as an extra safeguard.
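The mechanism described above can be illustrated with a toy model. This is a minimal sketch, not Spark's actual file manager: the class `ToyFileManager`, its methods, the `clear_mapping_on_v0` flag, the remote file name `x-old.sst`, and the file sizes are all hypothetical and exist only to show how a stale mapping entry can lead to reusing a file whose size no longer matches.

```python
class ToyFileManager:
    """Toy model of a local-file-to-remote-file mapping (hypothetical, not Spark code)."""

    def __init__(self, clear_mapping_on_v0):
        self.local_files = {}    # local SST name -> size in bytes
        self.file_mapping = {}   # local SST name -> (remote name, size in bytes)
        self.clear_mapping_on_v0 = clear_mapping_on_v0

    def add_local_file(self, name, size, remote_name):
        # Record a local file together with the remote file it was uploaded as.
        self.local_files[name] = size
        self.file_mapping[name] = (remote_name, size)

    def load_version(self, version, wanted=None):
        wanted = wanted or {}
        if version == 0:
            # v0 path: the whole local directory is wiped.
            self.local_files.clear()
            if self.clear_mapping_on_v0:
                # The fix: also forget every mapping entry.
                self.file_mapping.clear()
            return
        # Other versions: each deleted local file is also dropped from the mapping.
        for name in list(self.local_files):
            if name not in wanted:
                del self.local_files[name]
                self.file_mapping.pop(name, None)

    def reusable_remote(self, name, size, check_size):
        # Return the remote file to reuse for a local file, or None to force a new upload.
        entry = self.file_mapping.get(name)
        if entry is None:
            return None
        remote_name, mapped_size = entry
        if check_size and mapped_size != size:
            return None
        return remote_name


def stale_reuse(clear_mapping_on_v0, check_size):
    fm = ToyFileManager(clear_mapping_on_v0)
    fm.add_local_file("x.sst", 100, "x-old.sst")  # old x.sst, 100 bytes
    fm.load_version(0)                            # reload v0: local dir wiped
    # A new, different x.sst (250 bytes) now exists locally under the same name.
    return fm.reusable_remote("x.sst", 250, check_size)


buggy = stale_reuse(clear_mapping_on_v0=False, check_size=False)
fixed = stale_reuse(clear_mapping_on_v0=True, check_size=False)
guarded = stale_reuse(clear_mapping_on_v0=False, check_size=True)
```

In this sketch, `buggy` resolves to the old remote file even though the local file has a different size, which is the kind of mismatch the error message reports. Clearing the mapping on the v0 path (`fixed`) or comparing sizes before reuse (`guarded`) each prevents the stale reuse on its own; the PR applies both.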
Why are the changes needed?
The bug can corrupt the checkpoint, which causes the query to fail.
Does this PR introduce any user-facing change?
No
How was this patch tested?
New test included
Was this patch authored or co-authored using generative AI tooling?
No