whl_library.BuildWheelFromSource uses pip to download dependencies and does not inject credential helper information. #2640

Open
dougthor42 opened this issue Feb 28, 2025 · 4 comments

@dougthor42
Collaborator

🐞 bug report

Affected Rule

pip.parse and the underlying code

Is this a regression?

Not that I can tell.

Description

When using a Python package index that requires authentication, whl_library.BuildWheelFromSource passes that index argument to the pip wheel command.

This is a problem when using the Bazel downloader (experimental_index_url et al.). Bazel can authenticate to the package index by injecting headers from a credential helper, so it downloads the source tarball successfully. However, rules_python then tries to build a wheel from that tarball by running pip wheel --index-url "${INDEX_NEEDING_AUTH}" .... The build fails because pip has no credentials for the private index and cannot fetch the build dependencies.
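To make the failure mode concrete, here is a rough Python model of how the pip command in the error log below appears to be assembled. It is only a sketch inferred from the logged command line, not the actual wheel_installer.py code; the function names and the index.example.invalid host are made up for illustration.

```python
import json
from urllib.parse import urlsplit


def build_pip_wheel_command(python, extra_pip_args_json, requirements_file):
    """Assemble a pip wheel command shaped like the one in the error log.

    extra_pip_args_json uses the {"arg": [...]} shape seen in the log.
    """
    extra_pip_args = json.loads(extra_pip_args_json)["arg"]
    return [
        python, "-m", "pip", "--isolated", "wheel", "--no-deps",
        *extra_pip_args,
        "-r", requirements_file,
    ]


def urls_missing_password(command):
    """Return the https URLs in the command that carry no password."""
    return [
        arg for arg in command
        if arg.startswith("https://") and urlsplit(arg).password is None
    ]


# Hypothetical index URL: a username is present, but no secret.
EXTRA = json.dumps({"arg": [
    "--index-url", "https://oauth2accesstoken@index.example.invalid/simple",
    "--find-links", ".",
]})
command = build_pip_wheel_command("python", EXTRA, "requirements.txt")
```

The index URL reaches pip with a username but no password, and nothing equivalent to Bazel's credential helper is involved at that point, which matches the 401 responses in the log.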

🔬 Minimal Reproduction

Hard to repro if you don't have a private index readily available, but the gist is:

  1. Use only package indexes that require authentication
  2. Enable the experimental Bazel downloader via experimental_index_url
  3. Attempt to install a package that meets both of the following conditions. pygraphviz is a good example of such a package.
    1. Does not have a wheel available on the package index
    2. Requires an additional package (such as setuptools) in order to build a wheel.

🔥 Exception or Error

root@b4337bd118dd:/bazel_starter# time bazel build --nobuild --config=local --remote_cache= --bes_backend= //...
INFO: Repository rules_python~~pip~pypi_312_pygraphviz_sdist_8b0b9207 instantiated at:
  <builtin>: in <toplevel>
Repository rule whl_library defined at:
  /root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~/python/private/pypi/whl_library.bzl:469:30: in <toplevel>
INFO: repository @@rules_python~~pip~pypi_312_pygraphviz_sdist_8b0b9207' used the following cache hits instead of downloading the corresponding file.
 * Hash '8b0b9207954012f3b670e53b8f8f448a28d12bdbbcf69249313bd8dbe680152f' for https://REDACTED_1/pygraphviz/pygraphviz-1.12.tar.gz
If the definition of 'repository @@rules_python~~pip~pypi_312_pygraphviz_sdist_8b0b9207' was updated, verify that the hashes were also updated.
ERROR: /root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~/python/private/repo_utils.bzl:79:16: An error occurred during the fetch of repository 'rules_python~~pip~pypi_312_pygraphviz_sdist_8b0b9207':
   Traceback (most recent call last):
        File "/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~/python/private/pypi/whl_library.bzl", line 245, column 40, in _whl_library_impl
                pypi_repo_utils.execute_checked(
        File "/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~/python/private/pypi/pypi_repo_utils.bzl", line 133, column 38, in _execute_checked
                return repo_utils.execute_checked(
        File "/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~/python/private/repo_utils.bzl", line 216, column 29, in _execute_checked
                return _execute_internal(fail_on_error = True, *args, **kwargs)
        File "/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~/python/private/repo_utils.bzl", line 147, column 27, in _execute_internal
                return logger.fail((
        File "/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~/python/private/repo_utils.bzl", line 89, column 39, in lambda
                fail = lambda message_cb: _log(-1, "FAIL", message_cb, fail),
        File "/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~/python/private/repo_utils.bzl", line 79, column 16, in _log
                printer("\nrules_python:{} {}:".format(
Error in fail:
rules_python:whl_library(@@rules_python~~pip~pypi_312_pygraphviz_sdist_8b0b9207) FAIL: repo.execute: whl_library.BuildWheelFromSource(rules_python~~pip~pypi_312_pygraphviz_sdist_8b0b9207, pygraphviz==1.12): end: failure:
  command: /root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~python~python_3_12_2_host/python -m python.private.pypi.whl_installer.wheel_installer --requirement pygraphviz==1.12 --isolated --extra_pip_args "{\"arg\":[\"--index-url\",\"https://oauth2accesstoken@REDACTED_2/simple\",\"--extra-index-url\",\"https://oauth2accesstoken@REDACTED_1/simple\",\"--find-links\",\".\"]}" --pip_data_exclude "{\"arg\":[]}" --environment "{\"arg\":{}}"
  return code: 1
  working dir: <default: /root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~pip~pypi_312_pygraphviz_sdist_8b0b9207>
  timeout: 600
  environment:
PYTHONPATH="/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__build:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__click:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__colorama:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__importlib_metadata:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__installer:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__more_itertools:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__packaging:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__pep517:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__pip:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__pip_tools:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__pyproject_hooks:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__setuptools:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__tomli:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__wheel:/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~internal_deps~pypi__zipp"
CPPFLAGS="-isystem /root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~python~python_3_12_2_host/include/python3.12"
===== stdout start =====
Looking in indexes: https://****@REDACTED_2/simple, https://****@REDACTED_1/simple
Looking in links: .
Processing ./pygraphviz-1.12.tar.gz (from -r /tmp/tmp192c1dbh (line 1))
  File was already downloaded /root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~pip~pypi_312_pygraphviz_sdist_8b0b9207/pygraphviz-1.12.tar.gz
  Installing build dependencies: started
  Installing build dependencies: finished with status 'error'
===== stdout end =====
===== stderr start =====
WARNING: 401 Error, Credentials not correct for https://REDACTED_2/simple/pygraphviz/
WARNING: 401 Error, Credentials not correct for https://REDACTED_1/simple/pygraphviz/
  error: subprocess-exited-with-error

  × pip subprocess to install build dependencies did not run successfully.
  │ exit code: 1
  ╰─> [6 lines of output]
      Looking in indexes: https://****@REDACTED_2/simple, https://****@REDACTED_1/simple
      Looking in links: .
      WARNING: 401 Error, Credentials not correct for https://REDACTED_2/simple/setuptools/
      WARNING: 401 Error, Credentials not correct for https://REDACTED_1/simple/setuptools/
      ERROR: Could not find a version that satisfies the requirement setuptools>=61.2 (from versions: none)
      ERROR: No matching distribution found for setuptools>=61.2
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~/python/private/pypi/whl_installer/wheel_installer.py", line 205, in <module>
    main()
  File "/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~/python/private/pypi/whl_installer/wheel_installer.py", line 190, in main
    subprocess.run(pip_args, check=True, env=env)
  File "/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~python~python_3_12_2_x86_64-unknown-linux-gnu/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/root/.cache/bazel/_bazel_root/dd66987607986e0bbc3aa1b8af741d50/external/rules_python~~python~python_3_12_2_host/python', '-m', 'pip', '--isolated', 'wheel', '--no-deps', '--index-url', 'https://oauth2accesstoken@REDACTED_2/simple', '--extra-index-url', 'https://oauth2accesstoken@REDACTED_1/simple', '--find-links', '.',
 '-r', '/tmp/tmp192c1dbh']' returned non-zero exit status 1.
===== stderr end =====

🌍 Your Environment

Operating System:

gLinux (based on Debian testing)

Output of bazel version:

$ bazel version
Bazelisk version: v1.20.0
Starting local Bazel server and connecting to it...
Build label: 7.4.1
Build target: @@//src/main/java/com/google/devtools/build/lib/bazel:BazelServer
Build time: Mon Nov 11 21:24:53 2024 (1731360293)
Build timestamp: 1731360293
Build timestamp as int: 1731360293

Rules_python version:

1.1.0

Anything else relevant?

Internal bug: b/399782261

I think I see a couple potential paths forward:

  1. support keyring when building wheels
    • At least with some private registries, they support using keyring as an auth provider. This may allow the internal pip wheel command to auth to the registry.
  2. support using the Bazel downloader, and thus the credential helper, when building wheels
    • Manually replacing pip wheel's downloading action with a separate Bazel download task might work, but only if people are using the Bazel downloader
  3. support passing a secret value (the access token) down to the pip wheel command and injecting that secret into the (extra) index url.
    • As far as I know, all package indexes support auth via URL credential injection a-la https://oauth2accesstoken:${SECRET}@private_index.com/simple. It might be possible to add an attribute to pip.parse that injects that secret.
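As a rough illustration of option 3, the sketch below shows what injecting a secret into an index URL could look like. It is plain Python rather than Starlark, the function name `inject_token` is hypothetical, and it does not reflect any existing rules_python attribute or API.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def inject_token(index_url, token, username="oauth2accesstoken"):
    """Return index_url with `user:token@` credentials in the netloc.

    An existing username in the URL is kept; otherwise `username` is used.
    The token is percent-encoded so characters like '/' or '@' stay valid.
    IPv6 literal hosts are not handled in this sketch.
    """
    parts = urlsplit(index_url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = "{}:{}".format(host, parts.port)
    user = parts.username or username
    netloc = "{}:{}@{}".format(quote(user, safe=""), quote(token, safe=""), host)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For example, `inject_token("https://oauth2accesstoken@private_index.com/simple", "s3cr3t")` yields `https://oauth2accesstoken:s3cr3t@private_index.com/simple`. The resulting string contains the secret, so it would have to be kept out of logs and out of any cached or committed files.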
@aignas
Collaborator

aignas commented Feb 28, 2025

@dougthor42, I think you had some experiments with adding more deps to pip.parse to include credential support, but that was somewhat hard to complete.

Related, I think we should focus on #2410 so that we don't have a need to extend the pip dependencies to support credentials.

Potentially related: #2328, where at some point we removed the --no-index-url from the code branch in question.

github-merge-queue bot pushed a commit that referenced this issue Apr 24, 2025
…process (#2817)

While making a local patch to work around #2640, I found that I had a
need for running a subprocess (`gcloud auth print-access-token`) via
`repo_utils.execute_checked_stdout`. However, doing so would log that
access token when debug logging was enabled via
`RULES_PYTHON_REPO_DEBUG=1`. This is a security concern for us, so I
hacked in an option to allow a particular `execute_(un)checked(_stdout)`
call to disable logging stdout, stderr, or both.

I figure this might be useful to others so I thought I'd upstream it.

`execute_(un)checked(_stdout)` now support `log_stdout` and `log_stderr`
bools that default to `True` (which is the same behavior as before this
PR).

When the subprocess writes to stdout and `log_stdout = False`, the
logged message will show:

```
===== stdout start =====
<log_stdout = False; skipping>
===== stdout end =====
```

If the subprocess does not write to stdout, the debug log shows the same
as before:

```
<stdout empty>
```

The above also applies for stderr, with text adjusted accordingly.
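
The logging behavior described in this commit message can be modeled in a few lines of Python. This is only a sketch of the described output format, not the Starlark code in repo_utils.bzl, and `format_stream` is a made-up name.

```python
def format_stream(name, text, log_enabled=True):
    """Format one captured stream (name is "stdout" or "stderr") for the debug log.

    - Empty output is reported as "<stdout empty>" regardless of the flag.
    - Non-empty output is wrapped in start/end markers; when logging is
      disabled, the body is replaced with a placeholder.
    """
    if not text:
        return "<{} empty>".format(name)
    if log_enabled:
        body = text.rstrip("\n")
    else:
        body = "<log_{} = False; skipping>".format(name)
    return "===== {0} start =====\n{1}\n===== {0} end =====".format(name, body)
```

With `log_enabled=False`, a token printed by the subprocess never reaches the formatted log text, while the markers still show that something was written.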
@dougthor42
Collaborator Author

FYI: I'm currently working around this with the following patch, with some things redacted. Hopefully I didn't accidentally remove any lines in the copy-paste-redact or else the patch line numbers will be all mucked up 🙃.

diff --git a/python/private/pypi/whl_library.bzl b/python/private/pypi/whl_library.bzl
index 630dc851..6189946d 100644
--- a/python/private/pypi/whl_library.bzl
+++ b/python/private/pypi/whl_library.bzl
@@ -289,6 +289,38 @@ def _whl_library_impl(rctx):
         if filename.endswith(".whl"):
             whl_path = rctx.path(rctx.attr.filename)
         else:
+            # HACK ALERT
+            # We need to inject a secret into the pypi index url if we are not using any public
+            # package index.
+            # TODO: Remove this hack when rules_python issue is fixed and released.
+            _index_path = "us-python.pkg.dev/REDACTED/REDACTED/simple"
+            _url = "https://oauth2accesstoken@{}".format(_index_path)
+            if _url in extra_pip_args:
+                print("{} is not a wheel and we are using Airlock. Injecting gcloud access token.".format(filename))
+
+                # N.B.: This is a nontrivial time cost of ~1s/call, but we only have ~13 sdist deps.
+                secret = repo_utils.execute_checked_stdout(
+                    rctx,
+                    op = "GetGCloudAuthAccessToken",
+                    arguments = ["gcloud", "auth", "print-access-token"],
+                    log_stdout = False,
+                )
+
+                # extra_pip_args is a list. It's safe enough to assume that the element after
+                # the "--index-url" string is the URL because otherwise the value
+                # would not be attached to the correct CLI arg.
+                # We must remove --index-url from `extra_pip_args` because of variable
+                # precedence: CLI args take precedence over env vars (see
+                # https://pip.pypa.io/en/stable/topics/configuration/#precedence-override-order).
+                extra_pip_args.remove(_url)
+                arg_index = extra_pip_args.index("--index-url")
+                extra_pip_args.remove(extra_pip_args[arg_index])
+
+                prefix = "https://oauth2accesstoken:{}@".format(secret.strip())
+                environment["PIP_INDEX_URL"] = prefix + _index_path
+
+            # END HACK
+
             # It is an sdist and we need to tell PyPI to use a file in this directory
             # and, allow getting build dependencies from PYTHONPATH, which we
             # setup in this repository rule, but still download any necessary
diff --git a/python/private/repo_utils.bzl b/python/private/repo_utils.bzl
index eee56ec8..3c8fa287 100644
--- a/python/private/repo_utils.bzl
+++ b/python/private/repo_utils.bzl
@@ -338,10 +338,21 @@ def _cwd_to_str(mrctx, kwargs):
     return cwd
 
 def _env_to_str(environment):
-    if not environment:
+    # HACK ALERT
+    # TODO: Remove this hack when rules_python issue is fixed and released.
+    _env = dict(environment)  # N.B.: .copy not supported
+    if "PIP_INDEX_URL" in environment.keys():
+        # This could definitely be better so that we log something like
+        # "PIP_INDEX_URL=https://oauthaccesstoken:<secret>@<index host>/..."
+        # but it's not worth it right now.
+        _env["PIP_INDEX_URL"] = "<REDACTED>"
+
+    # END HACK
+
+    if not _env:
         env_str = " <default environment>"
     else:
-        env_str = "\n".join(["{}={}".format(k, repr(v)) for k, v in environment.items()])
+        env_str = "\n".join(["{}={}".format(k, repr(v)) for k, v in _env.items()])
         env_str = "\n" + env_str
     return env_str
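
For readers who want the gist of the patch without reading the diff, here is a loose Python rendering of its two ideas: moving the index URL from the CLI arguments into PIP_INDEX_URL, and redacting the secret before logging. The function names are hypothetical, and the redaction shown is the "could be better" variant mentioned in the patch comment, which the patch itself does not implement.

```python
from urllib.parse import urlsplit, urlunsplit


def move_index_url_to_env(pip_args, env, authed_url):
    """Drop `--index-url <value>` from pip_args and set PIP_INDEX_URL instead.

    The CLI flag has to go because pip gives command-line options
    precedence over environment variables.
    """
    args = list(pip_args)
    if "--index-url" in args:
        i = args.index("--index-url")
        del args[i:i + 2]  # remove the flag and the value that follows it
    new_env = dict(env)
    new_env["PIP_INDEX_URL"] = authed_url
    return args, new_env


def redact_url_password(url):
    """Replace the password part of a URL with <REDACTED> for logging."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = "{}:{}".format(host, parts.port)
    netloc = "{}:<REDACTED>@{}".format(parts.username, host)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Logging `redact_url_password(env["PIP_INDEX_URL"])` instead of a bare `<REDACTED>` would keep the index host visible in debug output while still hiding the token.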

@rickeylev
Collaborator

@aignas
Collaborator

aignas commented May 2, 2025

FYI, ~/.netrc is picked up by default.
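
For anyone unfamiliar with the file format, the sketch below parses a netrc entry with Python's standard `netrc` module. The host, login, and token are placeholders, and `read_netrc_entry` is a made-up helper. The comment above does not say which tool reads the file; pip's documentation describes falling back to `~/.netrc` when the URL itself carries no credentials, so whether it helps with the `oauth2accesstoken@` URLs in this issue is untested here.

```python
import netrc
import os
import tempfile

# Placeholder entry; a real file would live at ~/.netrc with mode 0600.
NETRC_TEXT = (
    "machine private-index.example.invalid "
    "login oauth2accesstoken password example-token\n"
)


def read_netrc_entry(text, host):
    """Parse netrc-formatted text and return (login, password) for host, or None."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "netrc")
        with open(path, "w") as f:
            f.write(text)
        auth = netrc.netrc(path).authenticators(host)
    if auth is None:
        return None
    login, _account, password = auth
    return login, password
```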
