Skip to content

More robust shebang manipulation #12599

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1 task done
jaraco opened this issue Mar 26, 2024 · 21 comments
Closed
1 task done

More robust shebang manipulation #12599

jaraco opened this issue Mar 26, 2024 · 21 comments
Labels
C: scripts How scripts and entry points are managed resolution: wrong project Should be reported elsewhere type: feature request Request for a new feature

Comments

@jaraco
Copy link
Member

jaraco commented Mar 26, 2024

What's the problem this feature will solve?

When pip installs (non-"console script") scripts to an environment with spaces in the pathname, it supplies the interpreter name in the script but doesn't escape the space(s) in any way, leading to failures:

 draft @ mkdir 'foo bar'
 draft @ py -m venv 'foo bar/.venv'
 draft @ 'foo bar/.venv/bin/pip' install docutils
Collecting docutils
  Using cached docutils-0.20.1-py3-none-any.whl.metadata (2.8 kB)
Using cached docutils-0.20.1-py3-none-any.whl (572 kB)
Installing collected packages: docutils
Successfully installed docutils-0.20.1
 draft @ cd 'foo bar'
 foo bar @ .venv/bin/rst2html.py
xonsh: subprocess mode: command not found: '/Users/jaraco/draft/foo'
 foo bar @ head -n 1 .venv/bin/rst2html.py
#!/Users/jaraco/draft/foo bar/.venv/bin/python3.12

Although the root cause is a long-standing problem with Unix itself, some shells like xonsh do support them:

 foo bar @ sed -i -e '1{s/^#!/#!"/;s/$/"/;}' .venv/bin/rst2html.py
 foo bar @ head -n 1 .venv/bin/rst2html.py
#!"/Users/jaraco/draft/foo bar/.venv/bin/python3.12"
 foo bar @ .venv/bin/rst2html.py --help > /dev/null && echo done
done

Moreover, the Python launcher for Windows also supports them.

Lack of support for spaces in filenames leads to problems like pypa/pipx#1198 when applications attempt to install to the platform-recommended directories.

Describe the solution you'd like

When rewriting the shebangs for scripts during install, wrap the interpreter in quotes when it contains spaces.

Other shells like bash and zsh will still choke on the space, but at least they'll have the opportunity to potentially honor them, and in the meantime, users of shells like xonsh can enjoy a working executable. It's one small step toward honoring valid pathnames and strengthening a precedent for a workable solution that those shells could honor.

Alternative Solutions

One might be tempted to suggest that users should just avoid paths with spaces in filenames, and although that kind of bullying is the norm in Linux, on other platforms like Windows and macOS, the platform guidance itself specifies to install scripts and applications and other things in paths with spaces (e.g. C:\Program Files, ~/Library/Application Support).

Another option might be to apply the somewhat wonky exec wrapper used for console scripts:

 foo bar @ head -n 3 .venv/bin/pipx
#!/bin/sh
'''exec' "/Users/jaraco/draft/foo bar/.venv/bin/python3.12" "$0" "$@"
' '''

Additional context

none

Code of Conduct

@uranusjr
Copy link
Member

Note that pip already has access to the shell script hack via vendored distlib. So we can maybe reuse that logic to minimise things to implement.

@uranusjr uranusjr added C: scripts How scripts and entry points are managed and removed S: needs triage Issues/PRs that need to be triaged labels Mar 27, 2024
@jaraco
Copy link
Member Author

jaraco commented Apr 22, 2024

Today I learned that although the shebang hack in distlib does in fact allow a script with spaces in the interpreter path to be executed by bash:

 jaraco.vcs main @ bash
bash-5.2$ head -n 3 $(which hg)
#!/bin/sh
'''exec' "/Users/jaraco/Library/Application Support/pipx/venvs/mercurial/bin/python" "$0" "$@"
' '''
bash-5.2$ hg --version -q
Mercurial Distributed SCM (version 6.7.2)

It doesn't solve the issue for other forms of execution, such as through subprocess.Popen:

 @ subprocess.check_output(['hg', '--version'])
FileNotFoundError: [Errno 2] No such file or directory: 'hg'

I don't yet understand why the subprocess invocation is unable to honor the sh+exec hack.

Note, I generated that header with:

pip-run distlib -- -c "import distlib.scripts as scr; sm = scr.ScriptMaker(None, None); print(sm._build_shebang(scr.enquote_executable('/Users/jaraco/Library/Application Support/pipx/venvs/mercurial/bin/python').encode(), b'').decode())"

@jaraco
Copy link
Member Author

jaraco commented Apr 28, 2024

I don't yet understand why the subprocess invocation is unable to honor the sh+exec hack.

Oh. The issue seems to be related to me using subprocess.*/shutil.which inside a xonsh context. If I execute it in a native Python context, the program executes as expected.

 ~ @ py -c "import subprocess; print(subprocess.check_output(['hg', '--version'])[:41])"
b'Mercurial Distributed SCM (version 6.7.2)'

It took only a small amount of investigation to determine the cause - for whatever reason, under xonsh when launched as the user's default shell, os.environ only contains a few entries.

@eabase
Copy link

eabase commented May 8, 2024

OMG, I really hope people doesn't have to use and resort to these kinds of hacks:

#!/bin/sh
'''exec' "/Users/jaraco/draft/foo bar/.venv/bin/python3.12" "$0" "$@"
' '''

It breaks all portability possible!
However, much liked creds to those who thought about this.

@flying-sheep
Copy link

flying-sheep commented Apr 8, 2025

The solution exists and is extremely simple for Linux and MacOS:

#!/usr/bin/env -S "/Users/me/Library/Application Support/pipx/venvs/tada/bin/python"

Why don’t we just use that?

Supported by at least GNU coreutils1, uutils2, and MacOS’s default installation.

Footnotes

  1. The ones every relevant Linux (Alpine doesn’t by default but it packages coreutils, so pip on Alpine could just start depending on that.).

  2. The Rust ones that Ubuntu is switching to.

@notatallshaw
Copy link
Member

Reading through the history of this it's not clear to me what is being reported as a pip bug?

Does someone have a MRE where pip is producing the wrong shebang?

@flying-sheep
Copy link

@Gitznik implies it’s installing something on macOS, but I can’t trivially reproduce that: pypa/pipx#1327 (comment)

@Gitznik can you provide a MRE?

@pfmoore
Copy link
Member

pfmoore commented Apr 8, 2025

Why don’t we just use that?

This has been discussed before, and people have pointed to Unix systems (freebsd?) that don't support multiple arguments in shebang lines. If you look at Wikipedia, it states under "Argument splitting":

Command arguments are split in different ways across platforms. Some systems do not split up the arguments

I don't have details, and I don't really have much interest in what we end up doing here, but basically the answer is "because no-one is able to confirm that it won't break someone".

@flying-sheep
Copy link

flying-sheep commented Apr 8, 2025

So if the pipx maintainers can provide a MRE where pip’s approach breaks and we can’t fix it using only POSIX stuff, the only possible solutions that I can think of would be

  1. creating native binaries instead of scripts (we already do that on windows, right?), or
  2. doing something horrible like giving Python a flag that makes it behave like env -S instead of doing anything python-related

@pfmoore
Copy link
Member

pfmoore commented Apr 8, 2025

or:

  1. Don't use paths with spaces in them if you want portability.

@flying-sheep
Copy link

flying-sheep commented Apr 8, 2025

Absolutely not accepting this, we should have moved past this 40 years ago as a profession.

In the last 12 years, I had exactly one very minor problem while having my dev directory contain a space: Chrome didn’t build, which officially makes it the most crusty build system I’ve encountered in that whole time. Like, I did a lot of very experimental things in there, trying to build many many venerable projects with old build systems. No problem, except with Chrome.

Only pipx suddenly complaining to me made me aware that somehow something still hasn’t solved this very basic problem. Not solving it is not an option for me.

PS: I dimly recall that once or twice, something broke, but a 2-character-diff PR by me that was merged within a few days fixed the problem forever. I’m not counting these. Only Chrome was so bad that it wasn’t fixed immediately.

@pfmoore
Copy link
Member

pfmoore commented Apr 8, 2025

Absolutely not accepting this, we should have moved past this 40 years ago as a profession.

Agreed. Tell the Unix systems that don't handle shebangs with spaces in the path.

On a more practical note, if anyone can provide a solution that doesn't break systems that have limited shebang support, we'll gladly accept a PR (or more likely, a new release of distlib that contains such a PR, as I think we just inherit distlib's behaviour here). Personally, I'd even be willing to accept a sufficiently compelling argument that we don't need to support such limited systems - although I'd be likely to just revert the change if it turned out that we did break someone's environment.

@flying-sheep
Copy link

flying-sheep commented Apr 9, 2025

As said:

  1. The simple way is to rely on env -S and tell people for whom that doesn’t work to bully their distro into adding support. I think FreeBSD 6.0+, Linux with GNU coreutils or uutils, MacOS is an impressive array of platforms (see below for windows)
  2. A variation would be to somehow detect if a platform supports it and fall back to current behavior if not
  3. The hard way is maybe to do the same as on windows and create executables instead of scripts.

Which one do you think makes more sense to pursue, or do you have any other ideas?

PS: I really don’t see why “just don’t use spaces” is being brought up here at all. That’s a user-side workaround. Users have been informed that it exists as a crutch until the bug is fixed. Isn’t an issue tracker about actually fixing the bug?

@pfmoore
Copy link
Member

pfmoore commented Apr 9, 2025

Personally I think (2) is the only reasonable option. I don’t believe “tell people to bully their OS1” is productive or helpful. Speaking as a Windows user, I don’t like (3), and I wish Windows had shebang support in the OS. Bullying Microsoft to add it wouldn’t work, though…

Footnotes

  1. Note that it’s the OS, specifically the kernel, and not the distribution, that defines shebang semantics.

@uranusjr
Copy link
Member

uranusjr commented Apr 9, 2025

IIRC arguments did not even work on macOS back when this was reported. Maybe this had changed, but compatibility is still a thing. I also think 2 is the only practical option.

@flying-sheep
Copy link

OK, so Linux actually came last.

  • FreeBSD had it first, since 6.0 (2005)
  • OS X / MacOS adopted it in at least 10.11 “El Capitan” (2015)
  • GNU coreutils adopted it in 8.30 (2018)

@pfmoore
Copy link
Member

pfmoore commented Apr 9, 2025

What are you referring to here? Are you talking about the -S option of the env command? That's irrelevant. The problem we're trying to get you to understand is that not all operating system kernels allow spaces in the shebang.

So, suppose you have

#!/usr/bin/env -S "some command line"

There are operating system kernels out there which will treat that as a request to run the executable '/usr/bin/env -S "some command line"'. That is, the file 'env -S "some command line"' in /usr/bin. What env command options exist is irrelevant, as the operating system isn't running the env command - it's giving "command not found" because there's no command in /usr/bin containing two spaces, a hyphen and two " characters in its name!

@flying-sheep
Copy link

flying-sheep commented Apr 10, 2025

Totally missed that, sorry. OK, let me get this straight, please correct me if I misunderstood anything:

  1. pip rewrites shebangs for console scripts using the “sh triple-quote” hack.
  2. pip does not rewrite non-console script shebangs for unknown reasons. This issue is about that.
  3. Despite @eabase implying the opposite, the “sh triple-quote” hack is actually extremely portable.1
  4. The only reason why pipx uses non-standard paths on MacOS is because of point 2 – non-console scripts break because pip doesn’t rewrite their shebangs using the “sh triple-quote” hack (this is the biggest guess of all of them: @Gitznik please comment on this, I still have no clue what exactly pipx did that for)

Footnotes

    1. The “sh triple-quote” hack uses the most compatible shebang possible: #!/bin/sh is short, has no spaces or arguments, and uses a Path guaranteed to exist on any POSIX-compliant platform.
    2. The hack’s shell code only uses POSIX constructs and is therefore 100% portable across everything we can hope to support except Windows, for which we have separate support anyway.

    If something is not even POSIX compliant or doesn’t support that shebang, its name is better “Windows” because it’s an extremely exotic and weird platform.

@pfmoore
Copy link
Member

pfmoore commented Apr 10, 2025

The "sh triple-quote" hack is implemented by distlib, which we use. Non-console scripts aren't acually a thing on Unix (which is the only place the sh trick is used) are they? Why would you use gui-scripts on Unix? They only make a difference on Windows.

If you want distlib to use the sh trick for GUI scripts on Unix, then you should open an issue there.

Ultimately, this is nothing to do with pip, it's all implemented by distlib. (We could override distlib's behaviour, but as a matter of policy we avoid doing that).

@pfmoore pfmoore added the resolution: wrong project Should be reported elsewhere label Apr 10, 2025
@pfmoore
Copy link
Member

pfmoore commented Apr 10, 2025

As far as I can tell, the OP's problem was actually related to xonsh. I'm not sure why pipx got brought into the discussion, or how it's relevant. I suggest that if there is a problem in pipx (the linked comment makes a claim about how pip behaves, but I'm not sure it's accurate based on this thread) then someone should open a new issue, with a reproducible example, and explaining clearly why it's a pip issue and not a distlib issue (given what's been said here about pip simply inheriting distlib's behaviour).

Closing this, as the original issue seems to have been resolved.

@flying-sheep
Copy link

OK, I re-filed the bug after creating a reproducer and making sure that it’s really a pip bug: #13389

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C: scripts How scripts and entry points are managed resolution: wrong project Should be reported elsewhere type: feature request Request for a new feature
Projects
None yet
Development

No branches or pull requests

6 participants