Check that digesting consumes the expected number of bytes. #21

Open
wants to merge 1 commit into
base: master

@arvi18 arvi18 commented Apr 28, 2025

In Bazel, the time a file is `stat`ed to find its size can be very far from the time the file's digest is computed, which creates a window for shenanigans. Allow passing the expected size into `Path.getDigest` and raise an error there if the number of digested bytes doesn't match the previously seen size.

Summary by CodeRabbit

  • New Features

    • File digest computation now supports explicit file size validation, improving accuracy and error detection for mismatched file sizes during hashing.
  • Bug Fixes

    • Enhanced error reporting when file size does not match the expected value during digest operations.
  • Tests

    • Updated tests to verify file digest behavior with explicit file size parameters and to check for proper error handling on size mismatches.
  • Refactor

    • Digest-related methods and their usage were updated to consistently accept and utilize file size information.

@arvi18
Author

arvi18 commented Apr 28, 2025

If file size differs, the digest must be different. Can you share more details about the motivation/background for this PR?

@arvi18
Author

arvi18 commented Apr 28, 2025

The potential problem is that Bazel doesn't compute the file size by counting how many bytes are fed into the hash function. Rather, the file is `stat`ed. Thus, there's a race between computing the size of the file and computing the digest itself.
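
To make the window concrete, here is a minimal sketch using only the JDK. It is not Bazel code: `StatDigestRace`, `statThenDigest`, and the `betweenSteps` hook are hypothetical names introduced for illustration. It records the size from a stat, lets something run in between, then counts the bytes actually fed to the hash.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical demo class; not part of Bazel. */
final class StatDigestRace {
  /** Returns {sizeFromStat, bytesFedToHash} for one stat-then-digest sequence. */
  static long[] statThenDigest(Path file, Runnable betweenSteps) throws IOException {
    long sizeFromStat = Files.size(file); // step 1: size comes from file metadata
    betweenSteps.run(); // the window in which the file may change
    MessageDigest md;
    try {
      md = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-256 is required on every JDK
    }
    long bytesFed = 0;
    byte[] buf = new byte[8192];
    try (InputStream in = Files.newInputStream(file)) { // step 2: hash the contents
      int n;
      while ((n = in.read(buf)) != -1) {
        md.update(buf, 0, n);
        bytesFed += n; // count what the hash function actually saw
      }
    }
    // The hash value is irrelevant for this demo; only the two counts matter.
    return new long[] {sizeFromStat, bytesFed};
  }
}
```

If the hook appends to the file, the two numbers disagree, which is exactly the condition the proposed check would turn into an error.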

@arvi18
Author

arvi18 commented Apr 28, 2025

In other words, I'm trying to add a best-effort consistency check.

@arvi18
Author

arvi18 commented Apr 28, 2025

Internally, we embed the size into the digest (i.e. the file size can be derived from the digest) so I can understand the change. But I am not sure whether missing the file size could be a problem in practice.

Considering lots of effort will be needed to import this PR, I will let @meisterT make the decision.

@arvi18
Author

arvi18 commented Apr 28, 2025

If we decide to add this, we should consider merging it with --experimental_guard_against_concurrent_changes (which only applies when uploading action results to a remote/disk cache). In other words, I don't think we should have two separate mechanisms to detect unexpected changes to build outputs.

@arvi18
Author

arvi18 commented Apr 28, 2025

Steps can likely be taken to make this easier to merge if needed. Note I added a Path.getDigest overload taking a size; the no-argument version is preserved, meaning no callers of Path.getDigest() will break. I did modify FileSystem.getDigest on the assumption that it is a less prominent interface than Path. If Google having many FileSystem implementations is a difficulty, I could similarly add an overload for FileSystem.getDigest.

With regard to --experimental_guard_against_concurrent_changes (bazelbuild#3360 (comment)), the problem is that no one sets that flag. Unlike that flag, this change does not add any more I/O to Bazel; it's a simple internal comparison of two integers.
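
A rough sketch of the overload arrangement described in this comment, under stated assumptions: `SketchPath` is a hypothetical in-memory stand-in rather than Bazel's `Path`, and treating `-1` as "size unknown" is inferred from the review notes rather than confirmed by the diff. The no-argument method keeps working, and the check itself is a comparison of two integers.

```java
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical stand-in for a path type; names and the -1 sentinel are assumptions. */
final class SketchPath {
  static final long UNKNOWN_SIZE = -1;
  private final byte[] contents; // in-memory stand-in for file contents

  SketchPath(byte[] contents) {
    this.contents = contents.clone();
  }

  /** Existing no-argument entry point: old callers compile unchanged and skip the check. */
  byte[] getDigest() throws IOException {
    return getDigest(UNKNOWN_SIZE);
  }

  /** New overload: fails if the number of digested bytes differs from expectedSize. */
  byte[] getDigest(long expectedSize) throws IOException {
    byte[] digest;
    try {
      digest = MessageDigest.getInstance("SHA-256").digest(contents);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-256 is required on every JDK
    }
    long digested = contents.length; // bytes fed to the hash function
    if (expectedSize != UNKNOWN_SIZE && digested != expectedSize) {
      throw new IOException(
          "expected " + expectedSize + " bytes but digested " + digested);
    }
    return digest;
  }
}
```

The design point is that the extra parameter costs no additional I/O: the byte count is already known by the time the digest is finished.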

@arvi18
Author

arvi18 commented Apr 28, 2025

Internally, we embed the size into the digest (i.e. the file size can be derived from the digest) so I can understand the change.

Alternatively, would there be a way to align the hash functions available in Bazel with this construction?

coderabbitai bot commented Apr 28, 2025

Walkthrough

This change introduces file size awareness to digest computation throughout the codebase. The primary update is the addition of an expectedSize or fileSize parameter to various getDigest and related methods in the filesystem and digest utility classes. This parameter enables explicit verification that the number of bytes read during digest calculation matches the expected file size, enhancing correctness and potentially optimizing digest strategies for large files. The changes propagate through core filesystem classes, digest utilities, and test code, updating method signatures, internal calls, and relevant assertions to use the new parameter. No control flow or error handling is significantly altered beyond the size check.

Changes

| File(s) / Path(s) | Change Summary |
|---|---|
| src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java | Updated getDigestWithManualFallback and manuallyComputeDigest to accept a fileSize parameter; added getDigestWithManualFallbackWhenSizeUnknown; internal logic now uses file size for digest computation decisions. |
| src/main/java/com/google/devtools/build/lib/vfs/FileSystem.java | Changed getDigest(PathFragment) to getDigest(PathFragment, long expectedSize); new implementation verifies that bytes read match expectedSize and throws if not. |
| src/main/java/com/google/devtools/build/lib/vfs/Path.java | Added getDigest(long expectedSize) method; updated existing getDigest() to delegate to new method; updated directory digest logic to use file size where available. |
| src/main/java/com/google/devtools/build/lib/vfs/JavaIoFileSystem.java, .../UnixFileSystem.java, .../PathTransformingDelegateFileSystem.java, .../testing/vfs/SpiedFileSystem.java | Updated getDigest method signatures to accept expectedSize; updated calls to superclasses accordingly. |
| src/main/java/com/google/devtools/build/lib/analysis/ConfiguredRuleClassProvider.java | Updated BundledFileSystem's getDigest to accept expectedSize; updated getFastDigest to call new signature. |
| src/main/java/com/google/devtools/build/lib/remote/RemoteActionFileSystem.java | Changed getDigest to accept expectedSize; updated internal calls to use new parameter. |
| src/main/java/com/google/devtools/build/lib/exec/RunfilesTreeUpdater.java | Replaced getDigestWithManualFallback with getDigestWithManualFallbackWhenSizeUnknown for manifest digest comparison. |
| src/main/java/com/google/devtools/build/lib/exec/SpawnLogContext.java, .../actions/FileArtifactValue.java, .../remote/util/DigestUtil.java, .../skyframe/ActionOutputMetadataStore.java | Updated calls to digest utility methods to provide file size as argument. |
| src/test/java/com/google/devtools/build/lib/vfs/DigestUtilsTest.java, .../FileSystemTest.java, .../skyframe/FileFunctionTest.java, .../skyframe/FileArtifactValueTest.java, .../skyframe/ArtifactFunctionTest.java, .../skyframe/ArtifactFunctionTestCase.java, .../exec/SingleBuildFileCacheTest.java, .../exec/SpawnLogContextTestBase.java, .../analysis/StubbableFSBuildViewTest.java, .../remote/RemoteActionFileSystemTest.java, .../remote/RemoteExecutionServiceTest.java, .../unix/UnixDigestHashAttributeNameTest.java | Updated test code to match new digest method signatures; added/updated assertions for size mismatch; updated mock and override methods to include expectedSize parameter. |

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant Path
    participant FileSystem
    participant DigestUtils

    Caller->>Path: getDigest(expectedSize)
    Path->>FileSystem: getDigest(pathFragment, expectedSize)
    FileSystem->>DigestUtils: (if fallback needed) getDigestWithManualFallback(path, expectedSize, xattrProvider)
    DigestUtils->>FileSystem: manuallyComputeDigest(path, expectedSize)
    FileSystem-->>DigestUtils: digest bytes (after verifying size)
    DigestUtils-->>FileSystem: digest bytes
    FileSystem-->>Path: digest bytes
    Path-->>Caller: digest bytes

Poem

A hop, a skip, a measured byte,
Now digests check sizes right!
From filesystem to test and tool,
Each method learns a size-wise rule.
With every nibble, every read,
The rabbit ensures the counts succeed.
Code hops ahead, precise and spry—
Digesting truth, and hopping high! 🐇

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0c98a2 and 7760121.

📒 Files selected for processing (26)
  • src/main/java/com/google/devtools/build/lib/actions/FileArtifactValue.java (1 hunks)
  • src/main/java/com/google/devtools/build/lib/analysis/ConfiguredRuleClassProvider.java (1 hunks)
  • src/main/java/com/google/devtools/build/lib/exec/RunfilesTreeUpdater.java (1 hunks)
  • src/main/java/com/google/devtools/build/lib/exec/SpawnLogContext.java (1 hunks)
  • src/main/java/com/google/devtools/build/lib/remote/RemoteActionFileSystem.java (2 hunks)
  • src/main/java/com/google/devtools/build/lib/remote/util/DigestUtil.java (1 hunks)
  • src/main/java/com/google/devtools/build/lib/skyframe/ActionOutputMetadataStore.java (1 hunks)
  • src/main/java/com/google/devtools/build/lib/testing/vfs/SpiedFileSystem.java (1 hunks)
  • src/main/java/com/google/devtools/build/lib/unix/UnixFileSystem.java (1 hunks)
  • src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java (2 hunks)
  • src/main/java/com/google/devtools/build/lib/vfs/FileSystem.java (3 hunks)
  • src/main/java/com/google/devtools/build/lib/vfs/JavaIoFileSystem.java (1 hunks)
  • src/main/java/com/google/devtools/build/lib/vfs/Path.java (3 hunks)
  • src/main/java/com/google/devtools/build/lib/vfs/PathTransformingDelegateFileSystem.java (1 hunks)
  • src/test/java/com/google/devtools/build/lib/analysis/StubbableFSBuildViewTest.java (1 hunks)
  • src/test/java/com/google/devtools/build/lib/exec/SingleBuildFileCacheTest.java (1 hunks)
  • src/test/java/com/google/devtools/build/lib/exec/SpawnLogContextTestBase.java (1 hunks)
  • src/test/java/com/google/devtools/build/lib/remote/RemoteActionFileSystemTest.java (7 hunks)
  • src/test/java/com/google/devtools/build/lib/remote/RemoteExecutionServiceTest.java (10 hunks)
  • src/test/java/com/google/devtools/build/lib/skyframe/ArtifactFunctionTest.java (1 hunks)
  • src/test/java/com/google/devtools/build/lib/skyframe/ArtifactFunctionTestCase.java (1 hunks)
  • src/test/java/com/google/devtools/build/lib/skyframe/FileArtifactValueTest.java (1 hunks)
  • src/test/java/com/google/devtools/build/lib/skyframe/FileFunctionTest.java (2 hunks)
  • src/test/java/com/google/devtools/build/lib/unix/UnixDigestHashAttributeNameTest.java (1 hunks)
  • src/test/java/com/google/devtools/build/lib/vfs/DigestUtilsTest.java (6 hunks)
  • src/test/java/com/google/devtools/build/lib/vfs/FileSystemTest.java (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (6)
src/main/java/com/google/devtools/build/lib/exec/SpawnLogContext.java (1)
src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java (1)
  • DigestUtils (32-230)
src/main/java/com/google/devtools/build/lib/skyframe/ActionOutputMetadataStore.java (1)
src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java (1)
  • DigestUtils (32-230)
src/main/java/com/google/devtools/build/lib/remote/util/DigestUtil.java (1)
src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java (1)
  • DigestUtils (32-230)
src/main/java/com/google/devtools/build/lib/exec/RunfilesTreeUpdater.java (1)
src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java (1)
  • DigestUtils (32-230)
src/main/java/com/google/devtools/build/lib/actions/FileArtifactValue.java (1)
src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java (1)
  • DigestUtils (32-230)
src/main/java/com/google/devtools/build/lib/vfs/Path.java (1)
src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java (1)
  • DigestUtils (32-230)
🔇 Additional comments (60)
src/main/java/com/google/devtools/build/lib/exec/RunfilesTreeUpdater.java (1)

141-142: LGTM: Update to use size-unaware digest method variant

The change correctly replaces calls to DigestUtils.getDigestWithManualFallback with DigestUtils.getDigestWithManualFallbackWhenSizeUnknown for comparing manifest digests. This is appropriate since the exact file size is not known or needed when comparing these manifests, and the new method properly indicates that the file size is unknown by passing -1 to the underlying implementation.

src/main/java/com/google/devtools/build/lib/exec/SpawnLogContext.java (1)

180-181: LGTM: Adding file size to digest computation

The change appropriately updates the DigestUtils.getDigestWithManualFallback call to include the newly computed fileSize parameter, which was retrieved earlier at line 175. This allows the digest computation to verify that the file size matches what was expected and potentially optimize reads for large files.

src/main/java/com/google/devtools/build/lib/skyframe/ActionOutputMetadataStore.java (1)

560-560: LGTM: Passing file size to manual digest computation

This change correctly adds value.getSize() to the DigestUtils.manuallyComputeDigest call, ensuring the digest computation can verify that the bytes read match the expected file size. This is part of a larger effort to ensure file digest consistency and improve handling of file size discrepancies.

src/main/java/com/google/devtools/build/lib/actions/FileArtifactValue.java (1)

276-276: LGTM: Adding size parameter to digest computation

The change appropriately passes the size parameter to DigestUtils.getDigestWithManualFallback. This allows the digest computation to verify that the number of bytes read matches the expected file size, which can help detect file modifications that occur between the size check and digest computation.

src/test/java/com/google/devtools/build/lib/skyframe/ArtifactFunctionTestCase.java (1)

183-183: Updated getFastDigest method to pass expected size parameter

The method now passes -1 as the expectedSize parameter to align with the updated method signature for getDigest that checks whether the number of bytes consumed during digest computation matches the expected file size.

src/test/java/com/google/devtools/build/lib/unix/UnixDigestHashAttributeNameTest.java (1)

45-47: Added file size parameter to DigestUtils.getDigestWithManualFallback call

The test has been modified to include 123 as the file size parameter in the call to getDigestWithManualFallback, reflecting the API change that now supports size verification during digest computation.

src/test/java/com/google/devtools/build/lib/analysis/StubbableFSBuildViewTest.java (1)

140-140: Added expected size parameter to getDigest call

The getFastDigest method now passes -1 as the expectedSize parameter to the getDigest method, consistent with the updated API that now checks file size during digest computation.

src/test/java/com/google/devtools/build/lib/vfs/FileSystemTest.java (1)

1797-1798: Extracted buffer size variable for better readability

The hardcoded buffer size is now extracted into a variable named size, which improves code readability and makes it easier to reference in the new assertions.

src/main/java/com/google/devtools/build/lib/remote/util/DigestUtil.java (1)

74-76: Updated compute method to verify file size during digest computation.

The method now passes the fileSize parameter to DigestUtils.getDigestWithManualFallback, allowing verification that the actual number of bytes read matches the expected size. This helps detect race conditions where a file might change between size determination and digest computation.

src/test/java/com/google/devtools/build/lib/remote/RemoteExecutionServiceTest.java (10)

388-389: Added expectedSize parameter to getDigest call.

Updated test assertion to pass -1 as the expectedSize parameter to align with the new method signature. Using -1 likely indicates that size verification should be skipped or handled differently in this test context.


425-429: Added expectedSize parameter to getDigest calls.

Updated test assertions to pass -1 as the expectedSize parameter to accommodate the modified method signature.


1154-1157: Added expectedSize parameter to getDigest calls.

Updated test assertions to match the new method signature that accepts an expected file size parameter.


1188-1191: Added expectedSize parameter to getDigest calls.

Updated test assertions to include the -1 parameter for expectedSize, maintaining consistent usage of this default value throughout the test file.


1238-1241: Added expectedSize parameter to getDigest assertions.

Consistently updated test assertions with the -1 parameter for cases where exact size verification isn't needed in the test.


1287-1290: Added expectedSize parameter to getDigest calls in directory testing context.

Consistently updated test assertions to use the -1 parameter for the file size.


1370-1373: Added expectedSize parameter to stdout/stderr digest assertions.

Updated test assertions for stdout and stderr path digest verification to include the required file size parameter.


1406-1409: Added expectedSize parameter to inlined stdout/stderr digest verification.

Updated test assertions for stdout and stderr digests with the new file size parameter.


1451-1454: Added expectedSize parameter to in-memory output digest assertions.

Updated assertions to verify file digests in the context of in-memory outputs.


1562-1566: Added expectedSize parameter to path-mapped digest assertions.

Updated digest verification for path-mapped outputs to include the file size parameter.

src/main/java/com/google/devtools/build/lib/testing/vfs/SpiedFileSystem.java (1)

61-62: Updated getDigest method signature to include expectedSize parameter.

Modified the method signature to accept a long expectedSize parameter, allowing for size verification during digest computation. This change correctly delegates the call to the superclass implementation.

src/test/java/com/google/devtools/build/lib/skyframe/FileArtifactValueTest.java (1)

189-190: Updated getDigest method signature in test filesystem implementation.

Modified the overridden method signature to include the expectedSize parameter, aligning with changes to the parent class. The implementation still throws the same exception as before, maintaining the test's behavior while adapting to the API change.

src/main/java/com/google/devtools/build/lib/vfs/PathTransformingDelegateFileSystem.java (1)

229-232: LGTM - Digest verification with expected size.

The method has been updated to accept a new expectedSize parameter and correctly passes it to the delegate file system, aligning with the PR objective to verify byte counts during digest computation.

src/test/java/com/google/devtools/build/lib/exec/SingleBuildFileCacheTest.java (1)

64-67: LGTM - Updated test implementation.

The anonymous file system implementation now correctly passes the expected size parameter to the parent class, maintaining consistency with the updated API.

src/test/java/com/google/devtools/build/lib/exec/SpawnLogContextTestBase.java (2)

88-90: Verify -1 is the appropriate value for unknown size.

The code now passes -1 as the expected size to super.getDigest(). This appears to indicate "size unknown" or "skip size verification". Ensure this matches the expected behavior defined in the parent implementation.

Can you confirm that -1 is the documented/expected value to use when size verification should be skipped, and that this is consistent with other callers in the codebase?


92-95: LGTM - Updated method signature.

The signature update is consistent with the broader codebase changes for size-aware digest computation.

src/main/java/com/google/devtools/build/lib/vfs/JavaIoFileSystem.java (1)

423-431: LGTM - Consistent implementation of signature change.

The implementation correctly passes the new expectedSize parameter to the parent method while preserving the profiling logic, maintaining consistency with the broader digest verification updates.

src/main/java/com/google/devtools/build/lib/unix/UnixFileSystem.java (1)

448-453: Appropriate update to add the expected size parameter in getDigest.

The method has been properly updated to accept the new expectedSize parameter and passes it through to the superclass implementation. This change aligns with the broader effort to make digest computation size-aware throughout the codebase.

src/main/java/com/google/devtools/build/lib/remote/RemoteActionFileSystem.java (1)

436-447: Method signature and implementation updated correctly.

The getDigest method has been properly modified to include the expectedSize parameter and passes this parameter to the local filesystem's getDigest method. This ensures that size verification occurs during digest computation when accessing files through the remote action file system.

src/test/java/com/google/devtools/build/lib/skyframe/ArtifactFunctionTest.java (1)

122-126: Test method updated to match new API.

The override of getDigest in the anonymous CustomInMemoryFs class has been appropriately updated to include the expected size parameter and passes it through to the superclass when needed. This maintains the test's functionality while adapting to the API changes.

src/main/java/com/google/devtools/build/lib/analysis/ConfiguredRuleClassProvider.java (2)

128-130: Updated getFastDigest to use the new getDigest signature.

The implementation now calls getDigest with a default value of -1 for the expected size parameter. This is an appropriate default value for cases where the file size isn't known or relevant.


132-135: Added expectedSize parameter to getDigest method signature.

The method has been updated to include the expected size parameter, maintaining API compatibility with other FileSystem implementations. Note that this implementation doesn't actually use the size parameter since it computes digests based on path strings rather than file contents (as documented in comments above).

src/main/java/com/google/devtools/build/lib/vfs/FileSystem.java (3)

22-23: Addition of new import for Funnels.

The added import supports the new implementation of getDigest() that uses Funnels.asOutputStream() to efficiently copy file contents into a hasher.


37-37: Addition of Locale import for string formatting.

The Locale import is used to ensure consistent string formatting regardless of the system's default locale, which is a good practice for error messages.


352-372: Enhanced digest calculation with size verification.

The method signature has been modified to accept an expected file size parameter, and the implementation now verifies that exactly the expected number of bytes is read during digest computation. This guards against recording a digest that is inconsistent with the previously observed size when a file changes size between the stat and digest operations.

Implementation details:

  • Creates a ByteSource that reads from the file
  • Copies the content to a hasher using Funnels
  • Checks if the bytes read match the expected size
  • Throws an exception with a detailed error message if there's a mismatch

This approach helps detect race conditions that might otherwise go unnoticed.
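
The steps listed above can be approximated with the JDK alone. This sketch deliberately swaps Guava's `ByteSource` and `Funnels` for `DigestInputStream` plus `InputStream.transferTo` (Java 11 or later, for `OutputStream.nullOutputStream`), so it is an analogue of the pattern and not the PR's code. `SizeCheckedDigest`, the `-1` sentinel, the SHA-256 choice, and the message wording are all assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

/** Hypothetical helper; a JDK-only analogue of a size-verifying digest. */
final class SizeCheckedDigest {
  static final long UNKNOWN_SIZE = -1; // assumed sentinel meaning "skip the check"

  static byte[] digest(Path file, long expectedSize) throws IOException {
    MessageDigest md;
    try {
      md = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // SHA-256 is required on every JDK
    }
    long copied;
    // Every byte read through the wrapper is also fed to the MessageDigest.
    try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
      // transferTo returns the number of bytes it moved, i.e. the bytes digested.
      copied = in.transferTo(OutputStream.nullOutputStream());
    }
    if (expectedSize != UNKNOWN_SIZE && copied != expectedSize) {
      // Locale.ROOT keeps the number formatting independent of the default locale.
      throw new IOException(
          String.format(
              Locale.ROOT,
              "digesting %s consumed %d bytes, expected %d",
              file, copied, expectedSize));
    }
    return md.digest();
  }
}
```

A caller that has already stat'ed the file would pass that size; a caller with no size at hand would pass `UNKNOWN_SIZE` and get the old behavior.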

src/test/java/com/google/devtools/build/lib/remote/RemoteActionFileSystemTest.java (9)

24-24: Added import for Mockito's anyLong matcher.

This import is necessary to support the updated method verification calls that now need to match the long parameter in the getDigest method.


582-583: Updated Mockito verification to match new method signature.

The verification has been updated to check that getDigest(any(), anyLong()) is never called, reflecting the new method signature with the expectedSize parameter.


584-585: Updated digest method call with expected size parameter.

Changed to pass -1 as the expectedSize parameter, indicating that no size verification should be performed in this test case.


597-598: Updated Mockito verification for never calling getDigest.

Similar to the previous change, this updates the verification to account for the new method signature with the expectedSize parameter.


599-600: Updated getDigest call with expected size parameter.

Changed to include -1 as the expectedSize parameter for this test case involving remote artifacts.


610-611: Updated getDigest call with expected size parameter for remote output tree test.

Consistently passing -1 as the expectedSize parameter to maintain the same test behavior while accommodating the new method signature.


621-622: Updated getDigest call for local filesystem test.

Modified to include the -1 expectedSize parameter for testing getDigest on files from the local filesystem.


631-632: Updated exception assertion test for getDigest.

The assertion has been updated to reflect the new method signature, ensuring it tests that the correct exception is thrown when attempting to get a digest for a non-existent file.


654-655: Updated getDigest call in symlink test.

This change ensures that symlink following behavior continues to work correctly with the new method signature by passing -1 for the expectedSize parameter.

src/test/java/com/google/devtools/build/lib/skyframe/FileFunctionTest.java (2)

898-901: Updated overridden getDigest method signature in test.

The test method has been updated to match the new signature with the expectedSize parameter. It properly increments the digest call counter and delegates to the parent implementation with the same parameters.
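
As an illustration of the kind of test double described here, the following sketch counts digest calls while forwarding both arguments unchanged. `SketchFileSystem` and `CountingFileSystem` are hypothetical classes, not the ones in FileFunctionTest, and the placeholder digest does no real hashing.

```java
import java.io.IOException;

/** Hypothetical minimal file system exposing a size-aware digest method. */
class SketchFileSystem {
  byte[] getDigest(String path, long expectedSize) throws IOException {
    // Placeholder digest derived from the path length; no real hashing happens.
    return new byte[] {(byte) path.length()};
  }
}

/** Test double that records calls and delegates with the same parameters. */
final class CountingFileSystem extends SketchFileSystem {
  int digestCalls = 0;
  long lastExpectedSize = Long.MIN_VALUE; // sentinel meaning "never called"

  @Override
  byte[] getDigest(String path, long expectedSize) throws IOException {
    digestCalls++;
    lastExpectedSize = expectedSize;
    return super.getDigest(path, expectedSize); // forward expectedSize untouched
  }
}
```

Forwarding `expectedSize` as received matters: a double that substituted its own value would silently disable or alter the size check under test.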


1819-1820: Updated call to getDigest in getFastDigest method.

Modified to pass -1 as the expectedSize parameter when calling getDigest, indicating that no size verification should be performed for fast digest calculations.

src/main/java/com/google/devtools/build/lib/vfs/Path.java (4)

688-690: Overloaded getDigest method to support expected size verification.

The default implementation now delegates to the new size-aware method with -1 as the default value, maintaining backward compatibility.


692-706: Well-documented size-aware digest method enhances consistency.

This new overload improves reliability by allowing callers to specify an expected file size, which can be used to detect inconsistencies when the actual bytes read don't match expectations. This helps catch race conditions between file size checking and digest computation.


735-737: File size now explicitly passed to digest calculation.

By passing the stat-determined file size to the digest calculation, we ensure consistency between the file size stat and the actual bytes digested.


750-752: Using appropriate fallback for unknown size of linked files.

For symbolic links resolved to regular files, the code correctly uses the specialized method for unknown sizes, as we can't determine the size without additional I/O operations.

src/test/java/com/google/devtools/build/lib/vfs/DigestUtilsTest.java (8)

52-53: Test class updated to match new method signature.

The test implementation has been properly updated to accommodate the new expectedSize parameter in the getDigest method.

Also applies to: 62-63


67-68: Updated fast digest fallback to use size-aware method.

The mock implementation correctly passes -1 as the size parameter when providing fast digests through the standard method.


78-83: Test method calls updated to pass file size parameter.

Test methods have been updated to explicitly pass the file size to DigestUtils.getDigestWithManualFallback.


112-115: Another test implementation updated with size parameter.

This part of the test harness has been modified to accept the new parameter and pass it through to the superclass implementation.


123-131: Test case updated to include file size in method calls.

The test now correctly retrieves and passes the file size when calling DigestUtils.getDigestWithManualFallback.


137-142: Cache-related test updated with size parameter.

The cache test has been properly updated to include file size in method calls after clearing the cache.


156-158: Mock getDigest method updated in test class.

The test implementation for the manuallyComputeDigest test case has been updated to match the new method signature.


163-163: Test assertion updated to pass file size parameter.

The test now explicitly passes the file size when calling DigestUtils.manuallyComputeDigest.

src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java (3)

163-171: Enhanced digest method to accept file size parameter.

The updated getDigestWithManualFallback method now accepts a file size parameter, which can be used to optimize digest calculation strategies (serial vs. parallel) based on file size.


192-196: Updated manual digest calculation to use file size.

The manuallyComputeDigest method now accepts a file size parameter, allowing for optimization of the digest calculation process.


210-213: Passing file size to path's getDigest method.

The method now correctly passes the file size parameter to the underlying path's getDigest method and updates the error message to include the size for better debugging.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@gemini-code-assist gemini-code-assist bot left a comment

Hello @arvi18, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request introduces a mechanism to verify that the number of bytes digested from a file matches the expected file size. This is done to prevent potential security issues where the file's content changes between the time its size is determined and the time its digest is computed. The changes involve modifying the getDigest methods in FileSystem and related classes to accept an expectedSize parameter, and throwing an IOException if the number of digested bytes doesn't match the expected size. Additionally, the DigestUtils class is updated to include methods that utilize this new size-checking mechanism.

Highlights

  • Digest Verification: Introduces a check to ensure the number of bytes digested matches the expected file size.
  • Modified getDigest Methods: Updates getDigest methods in FileSystem and related classes to accept an expectedSize parameter.
  • DigestUtils Updates: Adds new methods to DigestUtils to utilize the size-checking mechanism during digest computation.
  • Exception Handling: Throws an IOException if the digested bytes don't match the expected size, preventing potential security issues.
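
The size check described above can be sketched in plain Java. This is a minimal illustration, not the PR's actual code: the changelog below indicates the real change lives in FileSystem.getDigest and uses Guava's Funnels.asOutputStream, whereas this sketch uses only the JDK's MessageDigest. The class name SizeCheckedDigest, the method name digest, the constant UNKNOWN_SIZE, and the choice of SHA-256 are all assumptions made for the example; only the -1 "size unknown" convention comes from the PR.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

class SizeCheckedDigest {
  /** Sentinel meaning "size unknown, skip the check" (the PR passes -1 for this). */
  static final long UNKNOWN_SIZE = -1;

  /** Hashes the whole stream and fails if the byte count differs from expectedSize. */
  static byte[] digest(InputStream in, long expectedSize) throws IOException {
    MessageDigest md;
    try {
      md = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 must be available on every conforming Java platform.
      throw new IllegalStateException(e);
    }
    byte[] buffer = new byte[8192];
    long total = 0;
    int n;
    while ((n = in.read(buffer)) != -1) {
      md.update(buffer, 0, n);
      total += n; // count exactly the bytes that were fed into the hash
    }
    if (expectedSize != UNKNOWN_SIZE && total != expectedSize) {
      throw new IOException(
          String.format(
              Locale.ROOT, "digested %d bytes but expected %d bytes", total, expectedSize));
    }
    return md.digest();
  }

  public static void main(String[] args) throws IOException {
    byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
    byte[] hash = digest(new ByteArrayInputStream(data), data.length);
    System.out.println("digest length in bytes: " + hash.length);
  }
}
```

Counting the bytes in the same loop that feeds the hash means the check covers exactly what was digested, rather than relying on a second size lookup that could itself be stale.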

Changelog

  • src/main/java/com/google/devtools/build/lib/actions/FileArtifactValue.java
    • Modified line 276 to pass the file size to DigestUtils.getDigestWithManualFallback.
  • src/main/java/com/google/devtools/build/lib/analysis/ConfiguredRuleClassProvider.java
    • Modified line 129 to pass -1 as the expected size to getDigest.
    • Modified line 133 to add expectedSize parameter to getDigest.
  • src/main/java/com/google/devtools/build/lib/exec/RunfilesTreeUpdater.java
    • Modified line 141 to use DigestUtils.getDigestWithManualFallbackWhenSizeUnknown.
  • src/main/java/com/google/devtools/build/lib/exec/SpawnLogContext.java
    • Modified line 181 to pass the file size to DigestUtils.getDigestWithManualFallback.
  • src/main/java/com/google/devtools/build/lib/remote/RemoteActionFileSystem.java
    • Modified line 437 to add expectedSize parameter to getDigest.
    • Modified line 446 to pass expectedSize to localFs.getPath(path).getDigest().
  • src/main/java/com/google/devtools/build/lib/remote/util/DigestUtil.java
    • Modified line 75 to pass the file size to DigestUtils.getDigestWithManualFallback.
  • src/main/java/com/google/devtools/build/lib/skyframe/ActionOutputMetadataStore.java
    • Modified line 560 to pass the file size to DigestUtils.manuallyComputeDigest.
  • src/main/java/com/google/devtools/build/lib/testing/vfs/SpiedFileSystem.java
    • Modified line 61 to add expectedSize parameter to getDigest.
  • src/main/java/com/google/devtools/build/lib/unix/UnixFileSystem.java
    • Modified line 449 to add expectedSize parameter to getDigest.
    • Modified line 453 to pass expectedSize to super.getDigest.
  • src/main/java/com/google/devtools/build/lib/vfs/DigestUtils.java
    • Modified line 167 to add fileSize parameter to getDigestWithManualFallback.
    • Added getDigestWithManualFallbackWhenSizeUnknown method on line 183.
    • Modified line 196 to add fileSize parameter to manuallyComputeDigest.
    • Modified line 210 to pass fileSize to path.getDigest.
    • Modified line 212 to include fileSize in the Preconditions check.
  • src/main/java/com/google/devtools/build/lib/vfs/FileSystem.java
    • Added import for com.google.common.hash.Funnels on line 23.
    • Added import for java.util.Locale on line 37.
    • Modified line 352 to add expectedSize parameter to getDigest.
    • Modified line 361 to use Funnels.asOutputStream.
    • Modified line 362 to check if expectedSize matches the number of bytes copied.
    • Modified line 363 to throw an IOException if the sizes don't match.
  • src/main/java/com/google/devtools/build/lib/vfs/JavaIoFileSystem.java
    • Modified line 423 to add expectedSize parameter to getDigest.
    • Modified line 427 to pass expectedSize to super.getDigest.
  • src/main/java/com/google/devtools/build/lib/vfs/Path.java
    • Modified line 688 to call getDigest(-1).
    • Added getDigest(long expectedSize) method on line 704.
    • Modified line 735 to pass stat.getSize() to DigestUtils.getDigestWithManualFallback.
    • Modified line 751 to use DigestUtils.getDigestWithManualFallbackWhenSizeUnknown.
  • src/main/java/com/google/devtools/build/lib/vfs/PathTransformingDelegateFileSystem.java
    • Modified line 230 to add expectedSize parameter to getDigest.
    • Modified line 231 to pass expectedSize to delegateFs.getDigest.
  • src/test/java/com/google/devtools/build/lib/analysis/StubbableFSBuildViewTest.java
    • Modified line 140 to pass -1 as the expected size to getDigest.
  • src/test/java/com/google/devtools/build/lib/exec/SingleBuildFileCacheTest.java
    • Modified line 64 to add expectedSize parameter to getDigest.
    • Modified line 66 to pass expectedSize to super.getDigest.
  • src/test/java/com/google/devtools/build/lib/exec/SpawnLogContextTestBase.java
    • Modified line 89 to pass -1 as the expected size to super.getDigest.
    • Modified line 93 to add expectedSize parameter to getDigest.
  • src/test/java/com/google/devtools/build/lib/remote/RemoteActionFileSystemTest.java
    • Added import for org.mockito.ArgumentMatchers.anyLong on line 24.
    • Modified line 582 to verify that getDigest is never called with any long.
    • Modified line 584 to pass -1 as the expected size to getDigest.
    • Modified line 597 to verify that getDigest is never called with any long.
    • Modified line 599 to pass -1 as the expected size to getDigest.
    • Modified line 610 to pass -1 as the expected size to getDigest.
    • Modified line 621 to pass -1 as the expected size to getDigest.
    • Modified line 631 to pass -1 as the expected size to getDigest.
    • Modified line 654 to pass -1 as the expected size to getDigest.
  • src/test/java/com/google/devtools/build/lib/remote/RemoteExecutionServiceTest.java
    • Modified line 388 to pass -1 as the expected size to getDigest.
    • Modified line 425 to pass -1 as the expected size to getDigest.
    • Modified line 428 to pass -1 as the expected size to getDigest.
    • Modified line 1154 to pass -1 as the expected size to getDigest.
    • Modified line 1156 to pass -1 as the expected size to getDigest.
    • Modified line 1188 to pass -1 as the expected size to getDigest.
    • Modified line 1190 to pass -1 as the expected size to getDigest.
    • Modified line 1238 to pass -1 as the expected size to getDigest.
    • Modified line 1240 to pass -1 as the expected size to getDigest.
    • Modified line 1287 to pass -1 as the expected size to getDigest.
    • Modified line 1289 to pass -1 as the expected size to getDigest.
    • Modified line 1370 to pass -1 as the expected size to getDigest.
    • Modified line 1372 to pass -1 as the expected size to getDigest.
    • Modified line 1406 to pass -1 as the expected size to getDigest.
    • Modified line 1408 to pass -1 as the expected size to getDigest.
    • Modified line 1451 to pass -1 as the expected size to getDigest.
    • Modified line 1453 to pass -1 as the expected size to getDigest.
    • Modified line 1562 to pass -1 as the expected size to getDigest.
    • Modified line 1565 to pass -1 as the expected size to getDigest.
  • src/test/java/com/google/devtools/build/lib/skyframe/ArtifactFunctionTest.java
    • Modified line 122 to add expectedSize parameter to getDigest.
  • src/test/java/com/google/devtools/build/lib/skyframe/ArtifactFunctionTestCase.java
    • Modified line 183 to pass -1 as the expected size to getDigest.
  • src/test/java/com/google/devtools/build/lib/skyframe/FileArtifactValueTest.java
    • Modified line 189 to add expectedSize parameter to getDigest.
  • src/test/java/com/google/devtools/build/lib/skyframe/FileFunctionTest.java
    • Modified line 898 to add expectedSize parameter to getDigest.
    • Modified line 900 to pass expectedSize to super.getDigest.
    • Modified line 1819 to pass -1 as the expected size to getDigest.
  • src/test/java/com/google/devtools/build/lib/unix/UnixDigestHashAttributeNameTest.java
    • Modified line 46 to pass a file size of 123 to DigestUtils.getDigestWithManualFallback.
  • src/test/java/com/google/devtools/build/lib/vfs/DigestUtilsTest.java
    • Modified line 52 to add expectedSize parameter to getDigest.
    • Modified line 62 to pass expectedSize to super.getDigest.
    • Modified line 67 to pass -1 as the expected size to super.getDigest.
    • Modified line 79 to pass fileSize1 to DigestUtils.getDigestWithManualFallback.
    • Modified line 83 to pass fileSize2 to DigestUtils.getDigestWithManualFallback.
    • Modified line 112 to add expectedSize parameter to getDigest.
    • Modified line 114 to pass expectedSize to super.getDigest.
    • Modified line 124 to pass file.getFileSize() to DigestUtils.getDigestWithManualFallback.
    • Modified line 129 to pass file.getFileSize() to DigestUtils.getDigestWithManualFallback.
    • Modified line 137 to pass file.getFileSize() to DigestUtils.getDigestWithManualFallback.
    • Modified line 160 to pass a file size of 8 to DigestUtils.manuallyComputeDigest.
  • src/test/java/com/google/devtools/build/lib/vfs/FileSystemTest.java
    • Added assertions to test that xFile.getDigest throws an IOException when the expected size does not match the actual size.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


A digest's tale,
Size known, no room to fail,
Integrity's keep.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces a mechanism to verify that the number of bytes digested matches the expected size, enhancing the integrity of file digests in Bazel. This is a valuable addition to prevent potential shenanigans due to timing discrepancies. The changes are well-structured and address the core issue effectively.

Summary of Findings

  • Method Naming Consistency: The introduction of getDigestWithManualFallbackWhenSizeUnknown alongside getDigestWithManualFallback raises a question about naming consistency. Consider whether a single method with a default value for the fileSize parameter would be more maintainable.
  • Exception Message Clarity: The exception message in FileSystem.getDigest could be improved by including the file size unit (e.g., bytes) for better clarity.
  • Test Coverage: While the changes include modifications to several test files, it's important to ensure that the new functionality is adequately tested, especially the case where the digested bytes do not match the expected size.

Merge Readiness

The pull request is well-structured and addresses an important issue. However, before merging, it would be beneficial to address the naming consistency of the digest methods and enhance the clarity of the exception message. Also, ensure that the new functionality is adequately tested. I am unable to approve this pull request, and users should have others review and approve this code before merging. Given the medium severity issues, I recommend that the pull request not be merged until those are addressed.

Comment on lines +173 to 186
/**
 * Gets the digest of {@code path}, using a constant-time xattr call if the filesystem supports
 * it, and calculating the digest manually otherwise.
 *
 * <p>Unlike {@link #getDigestWithManualFallback}, will not rate-limit manual digesting of files,
 * so only use this method if the file size is truly unknown and you don't expect many concurrent
 * manual digests of large files.
 *
 * @param path Path of the file.
 */
public static byte[] getDigestWithManualFallbackWhenSizeUnknown(
    Path path, XattrProvider xattrProvider) throws IOException {
  return getDigestWithManualFallback(path, -1, xattrProvider);
}

medium

Consider whether it would be better to have a single getDigestWithManualFallback method with an optional expectedSize parameter (e.g., defaulting to -1) instead of having two separate methods. This might improve maintainability and reduce code duplication.
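
Java has no default parameter values, so a "single method with an optional parameter" is usually spelled as an overload that delegates with a sentinel. The sketch below shows that shape under stated assumptions: DigestOverloads, digest, and UNKNOWN_SIZE are hypothetical names, java.nio.file.Path stands in for Bazel's own Path type, and SHA-256 is an arbitrary choice. It is not the DigestUtils API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestOverloads {
  /** Sentinel meaning "size unknown" (the PR passes -1 for this). */
  static final long UNKNOWN_SIZE = -1;

  /** Convenience overload: the caller has no trusted size, so the check is skipped. */
  static byte[] digest(Path path) throws IOException {
    return digest(path, UNKNOWN_SIZE);
  }

  /** Full form: verifies the byte count unless expectedSize is UNKNOWN_SIZE. */
  static byte[] digest(Path path, long expectedSize) throws IOException {
    // Reading the whole file keeps the sketch short; real code would stream it.
    byte[] contents = Files.readAllBytes(path);
    if (expectedSize != UNKNOWN_SIZE && contents.length != expectedSize) {
      throw new IOException(
          "read " + contents.length + " bytes from " + path
              + " but expected " + expectedSize + " bytes");
    }
    try {
      return MessageDigest.getInstance("SHA-256").digest(contents);
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 must be available on every conforming Java platform.
      throw new IllegalStateException(e);
    }
  }
}
```

One trade-off worth weighing: according to the Javadoc quoted above, the size-unknown variant also skips rate limiting of manual digests. A distinct, descriptive method name makes that difference visible at call sites, which a silent overload would not.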

Comment on lines +1803 to +1804
assertThrows(IOException.class, () -> xFile.getDigest(size + 1));
assertThrows(IOException.class, () -> xFile.getDigest(size - 1));

medium

It would be helpful to add a test case that specifically checks if an IOException is thrown when xFile.getDigest is called with an expectedSize that does not match the actual file size.
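
For context, the scenario such a test guards against is a file changing between the size lookup and the digest. A framework-free sketch of that scenario follows. Everything in it is hypothetical and for illustration only (StaleSizeDemo, digestExpecting, detectsLateWrite, the temp file, SHA-256); the PR's own test uses assertThrows against xFile.getDigest, as quoted above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StaleSizeDemo {
  /** Hypothetical helper: hashes the file and fails if the byte count differs. */
  static byte[] digestExpecting(Path file, long expectedSize) throws IOException {
    MessageDigest md;
    try {
      md = MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
    long total = 0;
    byte[] buffer = new byte[4096];
    try (InputStream in = Files.newInputStream(file)) {
      int n;
      while ((n = in.read(buffer)) != -1) {
        md.update(buffer, 0, n);
        total += n;
      }
    }
    if (total != expectedSize) {
      throw new IOException(
          "digested " + total + " bytes but expected " + expectedSize + " bytes");
    }
    return md.digest();
  }

  /** Returns true if a write that lands after the size lookup is detected. */
  static boolean detectsLateWrite() throws IOException {
    Path file = Files.createTempFile("stale-size-demo", ".txt");
    try {
      Files.write(file, "abc".getBytes(StandardCharsets.UTF_8));
      long sizeAtStat = Files.size(file); // the "stat" step
      // Simulate another writer modifying the file before it is digested.
      Files.write(file, "def".getBytes(StandardCharsets.UTF_8), StandardOpenOption.APPEND);
      try {
        digestExpecting(file, sizeAtStat);
        return false; // the stale size went unnoticed
      } catch (IOException expected) {
        return true; // the mismatch was reported
      }
    } finally {
      Files.deleteIfExists(file);
    }
  }
}
```

Note that a size check only catches modifications that change the length; a same-length overwrite would still pass this particular check.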

@visz11
Collaborator

visz11 commented Apr 28, 2025

@coderabbitai full review

coderabbitai bot commented Apr 28, 2025

✅ Actions performed

Full review triggered.

3 participants