-
-
Notifications
You must be signed in to change notification settings - Fork 585
fix: Fall back to directory based runfiles using relative paths #2621
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thanks for the fix @mering . I think this looks fine, but some surrounding changes to make:
|
b5f6f7d
to
cf7f3af
Compare
@rickeylev Thanks for the review. See my answers below:
|
This will also work out of the box when the runfiles tree is packaged into a container image.
cf7f3af
to
d060f57
Compare
Overall LGTM. CI is failing -- that test just needs to be updated. It's assuming the only two ways to find the runfiles root is the environment variables, but we're adding a 3rd attempt to look based upon its own file path.
Thanks for explaining. So basically, copying Yeah, that should work -- the structure is preserved; I don't see why it wouldn't. I've updated the description to reflect this.
What you're probably seeing is the bootstrap will set the RUNFILES_DIR environment variable (or manifest env var, depending on what it figures out). er but -- if the bootstrap is always setting those env vars, how is runfiles not seeing them? |
exit 1 | ||
fi | ||
|
||
# Test invocation without RUNFILES environment variables set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I'm misunderstanding the test setup, but this doesn't look like a supported execution mode for any runfiles library: The runfiles tree lives next to the .sh
file, which invokes the Python binary launcher as a subprocess but doesn't set any environment variables.
@mering Could you share a reproducer for the original issue that motivated this PR? I am happy to debug it, but I strongly suspect that the root cause lies elsewhere and strictly following the lookup procedures of other runfiles libraries is the best way to avoid nasty surprises.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The situation is essentially the same as calling bazel-bin/foo
directly (no guarantees about what the environment has set/not set), which is supported. Having e.g. a sh_binary with a data dependency that calls it as a subprocess is the same, just with an extra layer.
The bootstrap logic should be taking care of this, though. It jumps through lots of hoops to find the runfiles directory/manifest and sets a runfiles environment variable, which the runfiles library should see later when Create() is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not the same situation since the py_binary lives in the runfiles tree of the sh_binary, it doesn't have its own sibling runfiles tree. Transitive runfiles trees are no longer created, which is why cooperation via environment variables is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The py_binary is a data dependency of the sh_binary, so the py runfiles are merged with the sh runfiles. The result is the py binary within the sh runfiles uses the sh runfiles tree -- this is correct, since that's where its runfiles are.
Cooperation between the two processes using env vars isn't necessary: the py bootstrap is already performing the logic to find its runfiles root. There's actually a test of this over in tests/bootstrap_impls already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, that's because of this loop: https://github.com/bazel-contrib/rules_python/blame/d713ba704e9a6442c409134f7a701c0b6e1a9fe0/python/private/stage1_bootstrap_template.sh#L77
This is non-standard logic that most runfiles libraries don't contain. It may work well, it may be non-hermetic in edge cases, I'm not entirely sure. I'll think about this some more. It does mean that a Python process indirectly invoked within the runfiles of another binary will work, but if it runs a tool that uses a runfiles library without this trick that one won't work without the env vars.
Since setting env vars ensures hermeticity across languages, I would personally always set them.
@@ -339,6 +342,10 @@ def Create(env: Optional[Dict[str, str]] = None) -> Optional["Runfiles"]: | |||
directory = env_map.get("RUNFILES_DIR") | |||
if directory: | |||
return CreateDirectoryBased(directory) | |||
|
|||
directory = _FindPythonRunfilesRoot() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may not find the correct runfiles tree: It's an internal helper meant to find the tree that contains the Python files, solely for the purpose of identifying the calling repo. This tree could be entirely different from the runfiles tree, e.g. on Windows when using a Python ZIP.
This makes the runfiles libraries work when they're repackaged and extracted elsewhere.
e.g. a py_binary is packaged using
pkg_tar
, then extracted somewhere else.The
bazel-bin/{BIN,BIN.runfiles}
structure (sans the bazel-bin prefix) is preserved, and then e.g../BIN
is run. This is equivalent to runningbazel-bin/BIN
(i.e nopresumption about environment variables can be made), just the location differs.
This makes it work out of the box when the runfiles tree is packaged into a container image as follows: