fix: Fall back to directory based runfiles using relative paths #2621

mering · 2025-02-21T11:26:30Z

This makes the runfiles libraries work when they're repackaged and extracted elsewhere.

For example, a py_binary is packaged using pkg_tar and then extracted somewhere else.
The bazel-bin/{BIN,BIN.runfiles} structure (sans the bazel-bin prefix) is preserved, and then e.g. ./BIN is run. This is equivalent to running bazel-bin/BIN (i.e., no assumptions can be made about environment variables); only the location differs.

This makes it work out of the box when the runfiles tree is packaged into a container image as follows:

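# Load paths assume the default module names for rules_python, rules_pkg and rules_oci.
load("@rules_python//python:py_binary.bzl", "py_binary")
load("@rules_pkg//pkg:tar.bzl", "pkg_tar")
load("@rules_oci//oci:defs.bzl", "oci_image")

py_binary(
    name = "binary",
    srcs = ["main.py"],
    data = ["runfile.txt"],
)

pkg_tar(
    name = "layer",
    srcs = [":binary"],
    strip_prefix = "/",
    include_runfiles = True,
)

oci_image(
    name = "image",
    tars = [":layer"],
)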
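
To make the extracted layout concrete, here is a minimal Python sketch of the check that such a fallback relies on: given the path of the binary, look for a sibling `<binary>.runfiles` directory. Both `find_sibling_runfiles` and `make_extracted_layout` are hypothetical helpers written for illustration; they are not part of rules_python, and the real change may locate the tree differently.

```python
import os
import tempfile
from typing import Optional


def find_sibling_runfiles(binary_path: str) -> Optional[str]:
    """Return `<binary_path>.runfiles` if that directory exists, else None.

    Hypothetical helper for illustration; not part of rules_python.
    """
    candidate = os.path.abspath(binary_path) + ".runfiles"
    return candidate if os.path.isdir(candidate) else None


def make_extracted_layout(name: str = "binary") -> str:
    """Create `<tmp>/<name>` and `<tmp>/<name>.runfiles/` in a fresh temp dir.

    This mimics the {BIN,BIN.runfiles} structure after extraction and
    returns the path of the (dummy) binary.
    """
    root = tempfile.mkdtemp()
    binary = os.path.join(root, name)
    with open(binary, "w") as f:
        f.write("#!/bin/sh\n")
    os.makedirs(binary + ".runfiles")
    return binary
```

Calling `find_sibling_runfiles(make_extracted_layout())` returns the sibling directory without consulting any environment variable, which is the property the container use case depends on.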

rickeylev · 2025-02-22T22:25:26Z

Thanks for the fix @mering. I think this looks fine, but a few surrounding changes are needed:

1. Can you describe the problematic case and why this fixes it?
2. Tests: This looks easy to test if we just have a shell script set up the environment (unset the RUNFILES env vars), call the binary, and check that it worked. tests/support/sh_py_run_test.bzl#sh_py_run_test should make this pretty easy. There are examples of using it in tests/bootstrap_impls.
3. CHANGELOG.md needs to be updated.

mering · 2025-02-26T11:52:42Z

@rickeylev Thanks for the review. See my answers below:

1. When used in a container image without involving Bazel at all, the environment variables are not set but runfiles are in the expected folder BIN.runfiles as it was packaged via pkg_tar with include_runfiles = True. I updated the description with an example.
2. I tried doing this but for some reason the $RUNFILES_DIR is still set within Python even though I unset it in the shell script.
3. I added this to the CHANGELOG.

This will also work out of the box when the runfiles tree is packaged into a container image.

rickeylev · 2025-04-25T19:55:16Z

Overall LGTM. CI is failing -- that test just needs to be updated. It assumes the only two ways to find the runfiles root are the environment variables, but we're adding a third attempt that looks based on the library's own file path.
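
For reference, the lookup order described here (two environment variables, then a path-based fallback) might be sketched as follows. This is a standalone illustration, not the code in python/runfiles/runfiles.py: `choose_runfiles_source` and its `find_root_from_file_path` callback are hypothetical names, and checking RUNFILES_MANIFEST_FILE before RUNFILES_DIR is assumed from the usual runfiles convention rather than shown in this thread.

```python
from typing import Callable, Mapping, Optional, Tuple


def choose_runfiles_source(
    env: Mapping[str, str],
    find_root_from_file_path: Callable[[], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Return ("manifest" | "directory", path), or None if nothing is found.

    Hypothetical sketch of a three-step lookup.
    """
    # Attempt 1: an explicit manifest file (assumed to take precedence).
    manifest = env.get("RUNFILES_MANIFEST_FILE")
    if manifest:
        return ("manifest", manifest)

    # Attempt 2: an explicit runfiles directory.
    directory = env.get("RUNFILES_DIR")
    if directory:
        return ("directory", directory)

    # Attempt 3 (the new fallback): derive the root from a file path.
    directory = find_root_from_file_path()
    if directory:
        return ("directory", directory)

    return None
```

A test that only covers the first two attempts would expect `None` when both variables are unset, which is the kind of assumption that stops holding once a third attempt exists.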

> When used in a container image without involving Bazel at all, the environment variables are not set but runfiles are in the expected folder BIN.runfiles as it was packaged via pkg_tar with include_runfiles = True. I updated the description with an example.

Thanks for explaining. So basically, copying {BIN,BIN.runfiles} somewhere else, and then running e.g. ./BIN.

Yeah, that should work -- the structure is preserved; I don't see why it wouldn't. I've updated the description to reflect this.

> I tried doing this but for some reason the $RUNFILES_DIR is still set within Python even though I unset it in the shell script.

What you're probably seeing is that the bootstrap sets the RUNFILES_DIR environment variable (or the manifest env var, depending on what it figures out).

Er, but if the bootstrap is always setting those env vars, how is runfiles not seeing them?
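
The effect described above, where a variable unset by the caller is set again by an intermediate launcher, can be reproduced with a small Python sketch. The `LAUNCHER` script is only a stand-in for the bootstrap, and the path `/hypothetical/BIN.runfiles` is made up; the real bootstrap computes the value and is not reproduced here.

```python
import os
import subprocess
import sys

# Stand-in for a bootstrap: it sets RUNFILES_DIR itself (if missing) and then
# starts the "real" program, which simply prints what it sees.
LAUNCHER = r'''
import os, subprocess, sys
env = dict(os.environ)
env.setdefault("RUNFILES_DIR", "/hypothetical/BIN.runfiles")
out = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('RUNFILES_DIR'))"],
    env=env, capture_output=True, text=True, check=True,
)
sys.stdout.write(out.stdout)
'''


def run_with_runfiles_vars_unset() -> str:
    """Run the launcher with all RUNFILES_* variables removed from the env."""
    env = {k: v for k, v in os.environ.items() if not k.startswith("RUNFILES_")}
    result = subprocess.run(
        [sys.executable, "-c", LAUNCHER],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Even though the outer caller strips every RUNFILES_* variable, the innermost process still observes RUNFILES_DIR, because the launcher in between set it.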

fmeum · 2025-04-30T08:11:48Z

tests/runfiles/run_binary_with_runfiles_test.sh

+  exit 1
+fi
+
+# Test invocation without RUNFILES environment variables set


Sorry if I'm misunderstanding the test setup, but this doesn't look like a supported execution mode for any runfiles library: The runfiles tree lives next to the .sh file, which invokes the Python binary launcher as a subprocess but doesn't set any environment variables.

@mering Could you share a reproducer for the original issue that motivated this PR? I am happy to debug it, but I strongly suspect that the root cause lies elsewhere and strictly following the lookup procedures of other runfiles libraries is the best way to avoid nasty surprises.

rickeylev · 2025-04-30

The situation is essentially the same as calling bazel-bin/foo directly (no guarantees about what the environment has set/not set), which is supported. Having e.g. a sh_binary with a data dependency that calls it as a subprocess is the same, just with an extra layer.

The bootstrap logic should be taking care of this, though. It jumps through lots of hoops to find the runfiles directory/manifest and sets a runfiles environment variable, which the runfiles library should see later when Create() is called.

fmeum · 2025-04-30

It's not the same situation: the py_binary lives in the runfiles tree of the sh_binary and doesn't have its own sibling runfiles tree. Transitive runfiles trees are no longer created, which is why cooperation via environment variables is required.

rickeylev · 2025-04-30

The py_binary is a data dependency of the sh_binary, so the py runfiles are merged with the sh runfiles. The result is that the py binary within the sh runfiles uses the sh runfiles tree -- this is correct, since that's where its runfiles are.

Cooperation between the two processes using env vars isn't necessary: the py bootstrap is already performing the logic to find its runfiles root. There's actually a test of this over in tests/bootstrap_impls already.

fmeum · 2025-04-30

I see, that's because of this loop: https://github.com/bazel-contrib/rules_python/blame/d713ba704e9a6442c409134f7a701c0b6e1a9fe0/python/private/stage1_bootstrap_template.sh#L77

This is non-standard logic that most runfiles libraries don't contain. It may work well, or it may be non-hermetic in edge cases; I'm not entirely sure. I'll think about this some more. It does mean that a Python process indirectly invoked within the runfiles of another binary will work, but if it runs a tool that uses a runfiles library without this trick, that tool won't work without the env vars.

Since setting env vars ensures hermeticity across languages, I would personally always set them.
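
As a rough Python rendering of the idea behind that kind of search (the actual bootstrap loop is shell and handles more cases, such as symlinks, which are omitted here), a binary that lives inside another target's runfiles tree can find the root by walking up its own path until it reaches a directory whose name ends in `.runfiles`. `enclosing_runfiles_root` is a hypothetical name used only for this sketch.

```python
import os
from typing import Optional


def enclosing_runfiles_root(binary_path: str) -> Optional[str]:
    """Walk up from the binary's directory to the nearest `*.runfiles` ancestor.

    Hypothetical helper: a purely lexical check that does not touch the
    filesystem or resolve symlinks.
    """
    current = os.path.dirname(os.path.abspath(binary_path))
    while True:
        if current.endswith(".runfiles"):
            return current
        parent = os.path.dirname(current)
        if parent == current:
            # Reached the filesystem root without finding a runfiles tree.
            return None
        current = parent
```

A tool whose runfiles library has no such search sees nothing unless RUNFILES_DIR or RUNFILES_MANIFEST_FILE is passed down explicitly, which is the cross-language concern raised above.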

fmeum · 2025-04-30T08:13:19Z

python/runfiles/runfiles.py

@@ -339,6 +342,10 @@ def Create(env: Optional[Dict[str, str]] = None) -> Optional["Runfiles"]:
        directory = env_map.get("RUNFILES_DIR")
        if directory:
            return CreateDirectoryBased(directory)
+
+        directory =  _FindPythonRunfilesRoot()


This may not find the correct runfiles tree: It's an internal helper meant to find the tree that contains the Python files, solely for the purpose of identifying the calling repo. This tree could be entirely different from the runfiles tree, e.g. on Windows when using a Python ZIP.

mering requested review from rickeylev and aignas as code owners February 21, 2025 11:26

mering force-pushed the runfiles-relative branch from b5f6f7d to cf7f3af Compare February 26, 2025 11:50

mering added 3 commits April 25, 2025 13:28

Fall back to directory based runfiles using relative paths

e9f93c9

This will also work out of the box when the runfiles tree is packaged into a container image.

Add changelog

27fd83c

Add test which unsets RUNFILES env vars

d060f57

mering force-pushed the runfiles-relative branch from cf7f3af to d060f57 Compare April 25, 2025 13:28

rickeylev approved these changes Apr 25, 2025

rickeylev and others added 3 commits April 25, 2025 13:01

Update runfiles_test.py test

e28a9a4

Merge branch 'main' into runfiles-relative

3584c16

Update CHANGELOG.md

d189545

fmeum reviewed Apr 30, 2025
