Skip to content

Cache dex syntethic context in dex builder #20

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

arvi18
Copy link

@arvi18 arvi18 commented Apr 28, 2025

There is a data race when using workers when there are multiple libraries that include the same class. This is more likely to happen when building multiple android_binary targets in a single invocation. When a cached entry is found, only the dex content is used and the sythetic info is ignored. In those instances, the dex archive will be incorrectly created and might lead to failures during dex merging

Fixes: bazelbuild/rules_android#300

Summary by CodeRabbit

  • New Features
    • Improved dexing process by caching both dex byte content and associated synthetic context information, enhancing efficiency in repeated builds.
  • Tests
    • Added a new test to verify correct caching of synthetic context data during dexing operations.

There is a data race when using workers when there are multiple
libraries that include the same class. This is more likely to happen
when building multiple android_binary targets in a single invocation.
When a cached entry is found, only the dex content is used and the
sythetic info is ignored. In those instances, the dex archive will
be incorrectly created and might lead to failures during dex merging
@arvi18
Copy link
Author

arvi18 commented Apr 28, 2025

I imported this and started running our internal tests against it. Also, is there a test we can add to src/test/java/com/google/devtools/build/android/r8 for this case?

@arvi18
Copy link
Author

arvi18 commented Apr 28, 2025

I imported this and started running our internal tests against it. Also, is there a test we can add to src/test/java/com/google/devtools/build/android/r8 for this case?

Added a new test. I had to expose processRequest to the test. Otherwise I'd have to start a new process to in order to go through the workers flow.

Copy link

coderabbitai bot commented Apr 28, 2025

Walkthrough

This change refactors the CompatDexBuilder class to enhance how dexing cache entries are handled. Instead of caching only the raw dex byte arrays, the cache now stores both the dex content and associated synthetic context information using a new DexingEntryR8 class. The relevant methods and cache types are updated accordingly. Additionally, the visibility of the ContextConsumer class is increased, and a setter for it is added in DexConsumer. A new test, compileWithCachedSyntheticLambdas, is introduced to verify correct caching and restoration of synthetic context information during dexing.

Changes

File(s) Change Summary
src/tools/android/java/com/google/devtools/build/android/r8/CompatDexBuilder.java Refactored dex cache to store DexingEntryR8 (dex bytes + context); updated related methods and types; made ContextConsumer public; added setter in DexConsumer; introduced DexingEntryR8 class.
src/test/java/com/google/devtools/build/android/r8/CompatDexBuilderTest.java Added test compileWithCachedSyntheticLambdas to verify caching and restoration of synthetic context info during dexing.

Sequence Diagram(s)

sequenceDiagram
    participant Test as CompatDexBuilderTest
    participant Builder as CompatDexBuilder
    participant Cache as Guava Cache
    participant DexConsumer
    participant Output as Output ZIP

    Test->>Builder: create CompatDexBuilder instance
    Test->>Cache: create Guava Cache
    loop For each input JAR
        Test->>Builder: processRequest(cache, ...)
        Builder->>Cache: lookup DexingKeyR8
        alt Cache miss
            Builder->>DexConsumer: create and process
            DexConsumer->>Builder: provide dex bytes + context
            Builder->>Cache: store DexingEntryR8 (bytes + context)
        else Cache hit
            Cache->>Builder: return DexingEntryR8
            Builder->>DexConsumer: restore context from entry
        end
        Builder->>Output: write .class.dex and synthetic-contexts.map
    end
    Test->>Output: verify expected files and mapping
Loading

Assessment against linked issues

Objective Addressed Explanation
Ensure synthetic context information is preserved and restored when dexing with cached entries to prevent incomplete META-INF/synthetic-contexts.map and related merge failures (bazelbuild#300)
Provide a mechanism for the cache to store both dex bytes and synthetic context info, and restore context info on cache hit (bazelbuild#300)
Add tests to verify caching and restoration of synthetic context information (bazelbuild#300)

Poem

In the warren where code does hop,
A cache of context now won't stop!
Dex bytes and context, side by side,
In DexingEntryR8, they both reside.
With tests that check the mapping true,
No synthetic bunny will go askew.
🐇✨ The builder leaps—bugs bid adieu!

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @arvi18, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request addresses a data race condition that occurs when using workers with multiple libraries containing the same class. This is more likely when building multiple android_binary targets in a single invocation. The issue arises because when a cached entry is found, only the dex content is used, and the synthetic info is ignored, leading to incorrect dex archive creation and potential failures during dex merging. To resolve this, the PR caches the dex synthetic context in the dex builder. The changes involve modifying CompatDexBuilder.java to store and retrieve the ContextConsumer along with the dex bytecode in the cache. A new test case, compileWithCachedSyntheticLambdas, is added to CompatDexBuilderTest.java to verify the fix.

Highlights

  • Data Race Fix: Addresses a data race when using workers with multiple libraries containing the same class by caching dex synthetic context.
  • Cache Implementation: Modifies CompatDexBuilder.java to store and retrieve ContextConsumer along with dex bytecode in the cache.
  • Test Case Addition: Adds compileWithCachedSyntheticLambdas test case to CompatDexBuilderTest.java to verify the fix.
  • DexingEntryR8 Creation: Introduces DexingEntryR8 to encapsulate both dex content and ContextConsumer for caching.

Changelog

Click here to see the changelog
  • src/test/java/com/google/devtools/build/android/r8/CompatDexBuilderTest.java
    • Added imports for DiagnosticsHandler, Cache, CacheBuilder, PrintWriter, and Arrays.
    • Added a new test case compileWithCachedSyntheticLambdas to verify caching of synthetic context information.
  • src/tools/android/java/com/google/devtools/build/android/r8/CompatDexBuilder.java
    • Added @VisibleForTesting annotation.
    • Made ContextConsumer a public static class.
    • Modified DexConsumer to hold a non-final ContextConsumer.
    • Added setter method setContextConsumer in DexConsumer.
    • Changed the dex cache to store DexingEntryR8 instead of byte[].
    • Modified processRequest and dexEntries to use the new cache type.
    • Updated dexEntry to retrieve and set the ContextConsumer from the cache.
    • Created DexingEntryR8 to store both dex content and ContextConsumer in the cache.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


A race condition's dread,
Threads entangled, ahead.
Cache the context,
Avoid the pretext,
Now the build runs instead.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request addresses a data race issue in the CompatDexBuilder when using workers with multiple libraries containing the same class. The changes involve caching synthetic context information along with dexed classes to ensure correct dex archive creation. The new test case compileWithCachedSyntheticLambdas validates this fix. Overall, the changes seem well-structured and address the reported issue.

Summary of Findings

  • Cache Key Considerations: The current cache key DexingKeyR8 includes classfileContent. While this ensures that changes to the class file invalidate the cache, it might be too granular. Consider whether other factors (e.g., compiler flags) should also be part of the key, or whether the entire classfile content is necessary, or if a hash of the classfile would suffice.
  • Error Handling in processRequest: The processRequest method catches several exceptions but only prints a stack trace for IOException, InterruptedException, and ExecutionException. Consider logging the exception message for OptionsParsingException as well for consistency.

Merge Readiness

The pull request appears to address the data race issue effectively. The addition of the compileWithCachedSyntheticLambdas test provides confidence in the fix. However, before merging, it would be beneficial to consider the cache key considerations and error handling improvements mentioned in the review comments. I am unable to directly approve this pull request, and recommend that others review and approve this code before merging.

Comment on lines 160 to 161
// .class & class.dex files are around 1-5 KB, so this fits ~30K-35K class-dex pairs.
.maximumWeight(min(Runtime.getRuntime().maxMemory() - 25 * ONE_MEG, 200 * ONE_MEG))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The cache weight calculation includes key.classfileContent().length. Is it necessary to include the entire class file content length in the weight? Could a hash of the class file content be used instead to reduce memory usage and improve performance?

                      return key.classfileContent().length + value.dexContent().length;

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🔭 Outside diff range comments (1)
src/tools/android/java/com/google/devtools/build/android/r8/CompatDexBuilder.java (1)

77-113: 💡 Verification agent

🧩 Analysis chain

Public ContextConsumer exposes mutable state outside the builder

ContextConsumer is now public, yet its mutable fields (sythesizedPrimaryClass, contextOfSynthesizedClass) are package-private and can be modified from anywhere in the codebase.
This breaks encapsulation and makes incorrect external mutation (or even accidental reads during mutation) possible, which can easily re-introduce the data-race the PR is trying to solve.

Two minimal mitigations:

-public static class ContextConsumer implements SyntheticInfoConsumer {
+// Keep visibility package-private unless tests require otherwise.
+static class ContextConsumer implements SyntheticInfoConsumer {

AND/OR wrap the two fields in private volatile and expose them only through the existing accessor, keeping the mutable state hidden.

A more future-proof approach is to store the immutable mapping string in the cache instead of the whole ContextConsumer, then construct a fresh, empty consumer on every cache hit.
That avoids sharing mutable objects across threads entirely.


🏁 Script executed:

#!/bin/bash
# Show the declaration and context of ContextConsumer
grep -n "class ContextConsumer" src/tools/android/java/com/google/devtools/build/android/r8/CompatDexBuilder.java
# List all references to ContextConsumer in the repo (to check external usage)
rg -n "ContextConsumer" .

Length of output: 1935


Limit ContextConsumer visibility and encapsulate its mutable state

To prevent accidental external mutation (and re-introducing the data race), update CompatDexBuilder.java as follows:

• In src/tools/android/java/com/google/devtools/build/android/r8/CompatDexBuilder.java at line 77, change the nested class to package-private:

- public static class ContextConsumer implements SyntheticInfoConsumer {
+ static class ContextConsumer implements SyntheticInfoConsumer {

• Hide and synchronize its mutable fields by making them private volatile, exposing them only through getContextMapping():

- ClassReference sythesizedPrimaryClass = null;
- ClassReference contextOfSynthesizedClass = null;
+ private volatile ClassReference sythesizedPrimaryClass = null;
+ private volatile ClassReference contextOfSynthesizedClass = null;

These two changes ensure callers can’t mutate internal state directly.

🧹 Nitpick comments (4)
src/tools/android/java/com/google/devtools/build/android/r8/CompatDexBuilder.java (2)

157-169: Cache weigher ignores synthetic-context size

Weigher only counts dexContent().length, ignoring the (typically small) size of the synthetic-context map held inside ContextConsumer.
If large amounts of synthetic data are ever stored, the cache may exceed its intended memory bound.

-                      return key.classfileContent().length + value.dexContent().length;
+                      return key.classfileContent().length
+                          + value.dexContent().length
+                          // synthetic info is very small but make the cost explicit
+                          + 64;

(optional – at least leave a comment clarifying why it is ignored).


409-425: DexingEntryR8 stores a mutable object – breaks AutoValue immutability contract

AutoValue classes are intended to be value holders; embedding a mutable ContextConsumer violates that expectation and the class’ generated equals/hashCode semantics (instance identity vs. content equality).
Caching an immutable DTO (e.g. the mapping string or an ImmutableList<String>) would avoid surprises and simplify equality checks in the future.

src/test/java/com/google/devtools/build/android/r8/CompatDexBuilderTest.java (2)

135-139: PrintWriter is never closed

PrintWriter pw = new PrintWriter(System.err); opens a buffered writer that should be flushed/closed.
Since it wraps System.err you may omit it altogether and pass null to processRequest, or use try-with-resources:

-    DiagnosticsHandler diagnostics = new DiagnosticsHandler() {};
-    PrintWriter pw = new PrintWriter(System.err);
+    DiagnosticsHandler diagnostics = new DiagnosticsHandler() {};
+    try (PrintWriter pw = new PrintWriter(System.err)) {
+      ...
+    }

150-176: Missing existence assertion for outputZipB

The second build artefact is later inspected but its existence is never checked.
Add a quick guard to fail fast if the build silently failed:

 final Path outputZipB = temp.getRoot().toPath().resolve("outB.zip");
 ...
 assertThat(Files.exists(outputZipB)).isTrue();
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0c98a2 and 6dabad1.

📒 Files selected for processing (2)
  • src/test/java/com/google/devtools/build/android/r8/CompatDexBuilderTest.java (2 hunks)
  • src/tools/android/java/com/google/devtools/build/android/r8/CompatDexBuilder.java (11 hunks)
🔇 Additional comments (1)
src/tools/android/java/com/google/devtools/build/android/r8/CompatDexBuilder.java (1)

135-138: 🛠️ Refactor suggestion

setContextConsumer is unsynchronised – potential visibility issues

setContextConsumer writes a reference that is subsequently read from another thread when getContextConsumer() is invoked after Future#get.
Although today both calls happen in the same thread, the contract of ExecutorService does not guarantee that future changes will respect this ordering. Marking the field volatile (or synchronising the setter/getter) eliminates any out-of-order hazard at negligible cost.

-    ContextConsumer contextConsumer = new ContextConsumer();
+    volatile ContextConsumer contextConsumer = new ContextConsumer();

Likely an incorrect or invalid review comment.

Comment on lines +359 to 372
DexingEntryR8 cachedDexEntry = null;
byte[] classFileBytes = null;
try (InputStream stream = zipFile.getInputStream(classEntry)) {
classFileBytes = ByteStreams.toByteArray(stream);
if (dexCache != null) {
// If the cache exists, check for cache validity.
cachedDexBytes =
cachedDexEntry =
dexCache.getIfPresent(DexingKeyR8.create(mode, minSdkVersion, classFileBytes));
}
if (cachedDexBytes != null) {
if (cachedDexEntry != null) {
// Cache hit: quit early and return the data
consumer.setBytes(cachedDexBytes);
consumer.setBytes(cachedDexEntry.dexContent());
consumer.setContextConsumer(cachedDexEntry.contextConsumer());
return consumer;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Shared ContextConsumer instance may be read concurrently

On a cache hit the same ContextConsumer instance is handed out to every new DexConsumer:

consumer.setContextConsumer(cachedDexEntry.contextConsumer());

Even though we currently never mutate it after caching, future refactors (or a D8 upgrade) could call acceptSyntheticInfo again, causing racy writes on a cached object.
Safer pattern:

-        consumer.setContextConsumer(cachedDexEntry.contextConsumer());
+        // copy-by-value to guarantee immutability
+        ContextConsumer ctxCopy = new ContextConsumer();
+        ctxCopy.sythesizedPrimaryClass =
+            cachedDexEntry.contextConsumer().sythesizedPrimaryClass;
+        ctxCopy.contextOfSynthesizedClass =
+            cachedDexEntry.contextConsumer().contextOfSynthesizedClass;
+        consumer.setContextConsumer(ctxCopy);

(or cache the String mapping as proposed earlier).

Comment on lines +145 to +149
compatDexBuilder.processRequest(dexCache, diagnostics, Arrays.asList(args), pw);

args = new String[] {"--input_jar", inputJar, "--output_zip", outputZipB.toString()};
compatDexBuilder.processRequest(dexCache, diagnostics, Arrays.asList(args), pw);

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Return value of processRequest is ignored

processRequest returns a non-zero exit code on failure.
Asserting the result will surface unexpected errors during the test run and prevent false positives.

-int exitA = compatDexBuilder.processRequest(dexCache, diagnostics, Arrays.asList(args), pw);
-assertThat(exitA).isEqualTo(0);
+int exitA = compatDexBuilder.processRequest(dexCache, diagnostics, Arrays.asList(args), pw);
+assertThat(exitA).isEqualTo(0);

Repeat for the second invocation.

Committable suggestion skipped: line range outside the PR's diff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Conflicting version of R8/D8 in Dexer/Desugar classpath
2 participants