Constrain the matrix of shebang, platforms and ICs as much as possible. #1540

Closed
jsirois opened this issue Dec 17, 2021 · 8 comments · Fixed by #2296

Comments

@jsirois
Member

jsirois commented Dec 17, 2021

Building and booting a PEX can involve the interplay of 4 concepts currently:

  1. The interpreter(s) the Pex CLI is run with.
  2. The platforms specified.
  3. The interpreter constraints specified.
  4. The shebang the resulting PEX file is built with.

These all need to be in alignment to produce a PEX that "boots" properly. Further, they need to be particular beyond that in order to squeeze performance out of some combinations of those items.

Pex cannot ensure alignment, let alone maximally performing alignment, in all cases. It can probably catch a few out-of-alignment cases though and - preferably - automatically bring them into alignment, or - less preferably - warn of potential issues. For the latter, care will be needed not to provide unwanted nannying for a knowledgeable user or too sharp a knife (the current situation) for the casual user.

An example of automated alignment would be basing the default shebang for a single --platform PEX, or else a multi-platform PEX where all platforms share the same major / minor versions in their platform tag, off that major / minor version pair. I.e., for --platform linux-x86_64-cp-37-cp37m --platform macosx-10.13-x86_64-cp-37-m use a default shebang of #!/usr/bin/env python3.7. This is in contrast to the behavior today, where the shebang defaults to #!/usr/bin/env pythonX.Y, with X / Y being the major / minor versions of the interpreter used to run the Pex CLI, which need bear no relation to the platforms selected.
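The shebang derivation described above can be sketched roughly as follows. This is only an illustration, not Pex's implementation: the function names `python_version_from_platform` and `default_shebang` are hypothetical, and the parsing assumes the `<platform>-<impl>-<pyver>-<abi>` layout of the two example platform strings, with a `pyver` that always carries both a major and a minor version (e.g. `37`, `310` or `3.7`).

```python
from typing import Iterable, Optional, Tuple


def python_version_from_platform(platform: str) -> Tuple[int, int]:
    """Extract (major, minor) from a platform string such as
    `linux-x86_64-cp-37-cp37m` (hypothetical helper, assumed layout)."""
    # The leading platform component may itself contain dashes
    # (e.g. `macosx-10.13-x86_64`), so split from the right.
    _platform, _impl, pyver, _abi = platform.rsplit("-", 3)
    if "." in pyver:
        major, minor = pyver.split(".")[:2]
    else:
        # Assumes a single-digit major version: `310` means 3.10.
        major, minor = pyver[0], pyver[1:]
    return int(major), int(minor)


def default_shebang(platforms: Iterable[str]) -> Optional[str]:
    """Return a shebang if all platforms agree on major / minor, else None."""
    versions = {python_version_from_platform(p) for p in platforms}
    if len(versions) != 1:
        # No platforms, or platforms that disagree: nothing safe to derive.
        return None
    major, minor = versions.pop()
    return "#!/usr/bin/env python{}.{}".format(major, minor)


print(default_shebang([
    "linux-x86_64-cp-37-cp37m",
    "macosx-10.13-x86_64-cp-37-m",
]))
```

When the platforms disagree, the sketch returns `None`, which is the point where a tool could fall back to its current default or warn.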

The answer to what can be and should be constrained will change once #1020 is implemented, but even shy of that the status quo can be improved to prevent this style of unambiguous misalignment.

@stuhood

stuhood commented Dec 17, 2021

  4. The shebang the resulting PEX file is built with.

Lightly related, but a thing I've wondered about: how awkward would it be to replace the bootstrap/shebang entrypoint with a bash script to locate Pythons, rather than a Python script? The bootstrap code is already intentionally small I presume, and that would remove one dimension here.

@jsirois
Member Author

jsirois commented Dec 17, 2021

What cases that aren't currently handled would be your goal to handle with such a bash header? It does seem that /usr/bin/env bash is more universally available than /usr/bin/env python or /usr/bin/env python3; so it would concretely start to launch successfully on more machines - but it seems that particular issue has been a very narrow corner of issues. So do I have that right, you're proposing:

  1. bash header finds a python compatible with the Pex bootstrap code (today Python 2.7 & 3.5+), then re-execs using that
  2. PEX bootstrap now potentially re-execs again to match ICs, or after it has run the #1020 (Decouple PEX runtime interpreter selection from buildtime interpreter selection) logic
  3. PEX finally runs using the correct interpreter

@stuhood

stuhood commented Dec 20, 2021

That sounds right, although I suppose that I hadn't considered that both steps 1 and 2 would be necessary (implementing interpreter constraints in bash would be a pain). The goal would mostly be to remove a dimension from the description, since shebang setting has been a non-trivial complication (since a user has to choose exactly one, and even /usr/bin/env python is not universally available, since some systems only install python3).

@jsirois
Member Author

jsirois commented Dec 21, 2021

... setting has been a non-trivial complication (since a user has to choose exactly one ...

Yeah - I think the error here is auto-selecting a hashbang. Since these aren't in fact universal, Pex should probably force you to pick one to make it clear you should think about this and know the right answer for your fleet. There are exactly 2 cases I'm aware of where users can't answer this because the fleet is not theirs - the Pex PEX and Pants' PEX. For the former, Pants uses your suggestion (externalized though) to find the bootstrap Python to run the Pex PEX with.

@jsirois
Member Author

jsirois commented Apr 13, 2022

I realized step 1 in this list above could be done in Python code that emits a bash header. That simplifies the bash greatly.
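As a rough illustration of that idea, the Python side can do the hard work (choosing and ordering candidate binary names) and emit a shell header that only has to loop over them. The template and the `render_sh_header` function below are hypothetical and much simpler than a real boot header; they only show the division of labor.

```python
import shlex

# A deliberately tiny POSIX sh header. Doubled braces are literal braces
# after str.format(), so `${{python}}` renders as `${python}`.
HEADER_TEMPLATE = """\
#!/usr/bin/env sh
set -eu

for python in {names}; do
    if command -v "${{python}}" >/dev/null 2>&1; then
        exec "${{python}}" "$0" "$@"
    fi
done
echo >&2 "Failed to find a suitable python binary on the PATH."
exit 1
"""


def render_sh_header(python_names):
    """Render the header for an already ordered list of binary names.

    The ordering (e.g. by interpreter constraints) is assumed to have been
    decided in Python before this is called.
    """
    names = " ".join(shlex.quote(name) for name in python_names)
    return HEADER_TEMPLATE.format(names=names)


print(render_sh_header(["python3.10", "python3", "python"]))
```

Because the list is computed at build time, the shell never needs to understand version constraints; it just tries names in the order it was given.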

I still think that Pex needs to keep its Python-first tradition, both to not break current users and to make sure a PEX by default can ship to a machine with Python and work - it would be bad to land on a machine with Python on the PATH but no shell and fail to run. It would also hinder the many-times-stalled effort to run out of the box on Windows.

So, if put behind a flag, this is viable. In fact, it cleans up a whole mess of code over in Pants designed to avoid ~50ms of leftover overhead in the PEX zip figuring out it has an already installed ~/.pex/venvs/... and re-execing into it. It effectively puts that code in the PEX, where it belongs and where non-Pants users can benefit too:

Numbers look good. Does not affect the cold case but is snappy for the --seed (Pants use case) and warm cases:

Cold:

$ hyperfine \
  -p 'python -mpex cowsay -c cowsay -o cowsay.pex && rm -rf ~/.pex' \
  -p 'python -mpex cowsay -c cowsay -o cowsay.venv.pex --venv && rm -rf ~/.pex' \
  -p 'python -mpex cowsay -c cowsay -o cowsay.bash.pex --bash-boot && rm -rf ~/.pex' \
  -p 'python -mpex cowsay -c cowsay -o cowsay.bash.venv.pex --bash-boot --venv && rm -rf ~/.pex' \
  './cowsay.pex Moo' \
  './cowsay.venv.pex Moo' \
  './cowsay.bash.pex Moo' \
  './cowsay.bash.venv.pex Moo'
Benchmark 1: ./cowsay.pex Moo
  Time (mean ± σ):     358.6 ms ±   4.7 ms    [User: 329.0 ms, System: 29.9 ms]
  Range (min … max):   352.7 ms … 368.6 ms    10 runs
 
Benchmark 2: ./cowsay.venv.pex Moo
  Time (mean ± σ):     607.9 ms ±   5.7 ms    [User: 557.2 ms, System: 50.8 ms]
  Range (min … max):   600.5 ms … 618.7 ms    10 runs
 
Benchmark 3: ./cowsay.bash.pex Moo
  Time (mean ± σ):     362.7 ms ±   5.0 ms    [User: 335.5 ms, System: 27.7 ms]
  Range (min … max):   355.3 ms … 370.4 ms    10 runs
 
Benchmark 4: ./cowsay.bash.venv.pex Moo
  Time (mean ± σ):     614.0 ms ±  11.1 ms    [User: 556.1 ms, System: 58.1 ms]
  Range (min … max):   604.6 ms … 642.7 ms    10 runs
 
Summary
  './cowsay.pex Moo' ran
    1.01 ± 0.02 times faster than './cowsay.bash.pex Moo'
    1.70 ± 0.03 times faster than './cowsay.venv.pex Moo'
    1.71 ± 0.04 times faster than './cowsay.bash.venv.pex Moo'

Seed:

$ hyperfine \
  -p 'rm -rf ~/.pex && python -mpex cowsay -c cowsay -o cowsay.pex --seed' \
  -p 'rm -rf ~/.pex && python -mpex cowsay -c cowsay -o cowsay.venv.pex --venv --seed' \
  -p 'rm -rf ~/.pex && python -mpex cowsay -c cowsay -o cowsay.bash.pex --bash-boot --seed' \
  -p 'rm -rf ~/.pex && python -mpex cowsay -c cowsay -o cowsay.bash.venv.pex --bash-boot --venv --seed' \
  './cowsay.pex Moo' \
  './cowsay.venv.pex Moo' \
  './cowsay.bash.pex Moo' \
  './cowsay.bash.venv.pex Moo'
Benchmark 1: ./cowsay.pex Moo
  Time (mean ± σ):     321.6 ms ±   3.1 ms    [User: 295.4 ms, System: 26.7 ms]
  Range (min … max):   313.8 ms … 323.7 ms    10 runs
 
Benchmark 2: ./cowsay.venv.pex Moo
  Time (mean ± σ):      75.1 ms ±   0.9 ms    [User: 66.9 ms, System: 9.1 ms]
  Range (min … max):    74.0 ms …  76.6 ms    10 runs
 
Benchmark 3: ./cowsay.bash.pex Moo
  Time (mean ± σ):     285.9 ms ±   4.5 ms    [User: 261.9 ms, System: 24.5 ms]
  Range (min … max):   280.2 ms … 295.0 ms    10 runs
 
Benchmark 4: ./cowsay.bash.venv.pex Moo
  Time (mean ± σ):      16.0 ms ±   0.6 ms    [User: 13.0 ms, System: 3.0 ms]
  Range (min … max):    15.6 ms …  17.7 ms    10 runs
 
Summary
  './cowsay.bash.venv.pex Moo' ran
    4.68 ± 0.19 times faster than './cowsay.venv.pex Moo'
   17.82 ± 0.74 times faster than './cowsay.bash.pex Moo'
   20.04 ± 0.79 times faster than './cowsay.pex Moo'

Warm:

$ hyperfine \
  -w 2 \
  -p 'python -mpex cowsay -c cowsay -o cowsay.pex' \
  -p 'python -mpex cowsay -c cowsay -o cowsay.venv.pex --venv' \
  -p 'python -mpex cowsay -c cowsay -o cowsay.bash.pex --bash-boot' \
  -p 'python -mpex cowsay -c cowsay -o cowsay.bash.venv.pex --bash-boot --venv' \
  -p 'python -mpex cowsay -c cowsay -o cowsay.bash.venv.pex --bash-boot --venv' \
  './cowsay.pex Moo' \
  './cowsay.venv.pex Moo' \
  './cowsay.bash.pex Moo' \
  './cowsay.bash.venv.pex Moo' \
  '/home/jsirois/.pex/venvs/s/bccd8b55/venv/pex Moo'
Benchmark 1: ./cowsay.pex Moo
  Time (mean ± σ):     201.1 ms ±   2.2 ms    [User: 180.1 ms, System: 21.8 ms]
  Range (min … max):   196.6 ms … 203.8 ms    10 runs
 
Benchmark 2: ./cowsay.venv.pex Moo
  Time (mean ± σ):      73.5 ms ±   0.5 ms    [User: 63.1 ms, System: 11.4 ms]
  Range (min … max):    73.0 ms …  74.7 ms    10 runs
 
Benchmark 3: ./cowsay.bash.pex Moo
  Time (mean ± σ):     163.6 ms ±   3.6 ms    [User: 145.0 ms, System: 19.5 ms]
  Range (min … max):   160.0 ms … 170.9 ms    10 runs
 
Benchmark 4: ./cowsay.bash.venv.pex Moo
  Time (mean ± σ):      14.0 ms ±   0.3 ms    [User: 12.1 ms, System: 1.9 ms]
  Range (min … max):    13.6 ms …  14.5 ms    10 runs
 
Benchmark 5: /home/jsirois/.pex/venvs/s/bccd8b55/venv/pex Moo
  Time (mean ± σ):      12.4 ms ±   0.2 ms    [User: 11.9 ms, System: 0.5 ms]
  Range (min … max):    12.0 ms …  12.5 ms    10 runs
 
Summary
  '/home/jsirois/.pex/venvs/s/bccd8b55/venv/pex Moo' ran
    1.13 ± 0.03 times faster than './cowsay.bash.venv.pex Moo'
    5.95 ± 0.08 times faster than './cowsay.venv.pex Moo'
   13.24 ± 0.33 times faster than './cowsay.bash.pex Moo'
   16.28 ± 0.27 times faster than './cowsay.pex Moo'

N.B.: I threw in direct execution of the venv pex script in the warm perf comparison. So we lose ~2ms by not executing the venv pex script directly. This is the same penalty Pants experiences today - it just writes its own bash redirection script instead of using one embedded in the PEX header like this.
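For reference, the ~2ms figure falls out of the warm means reported above. A small sketch of the arithmetic, using only the mean times copied from that benchmark (the dictionary keys and the `overhead_ms` helper are just illustrative names):

```python
# Mean wall-clock times in milliseconds, copied from the warm benchmark above.
warm_means_ms = {
    "./cowsay.pex": 201.1,
    "./cowsay.venv.pex": 73.5,
    "./cowsay.bash.pex": 163.6,
    "./cowsay.bash.venv.pex": 14.0,
    "venv/pex (direct)": 12.4,
}


def overhead_ms(candidate, baseline="venv/pex (direct)"):
    """Extra mean time of `candidate` over directly running the venv script."""
    return round(warm_means_ms[candidate] - warm_means_ms[baseline], 1)


for name in warm_means_ms:
    print("{:<26} +{} ms".format(name, overhead_ms(name)))
```

The shell-booted venv PEX comes out 1.6 ms above direct execution, versus 61.1 ms for the Python-booted venv PEX.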

The interpreter selection for --bash-boot is very robust and fast despite the large list of binary names to try. In my tests, removing the p from the binary names so that the bash boot script has to rip through all of them before failing, I find that takes 2-3ms:

$ head -71 cowsay.bash.venv.pex
#!/usr/bin/env sh
# N.B.: This script should stick to syntax defined for POSIX `sh` and
# avoid non-builtins:
#   https://pubs.opengroup.org/onlinepubs/9699919799/idx/shell.html
set -eu

VENV="1"

# N.B.: This ensures tilde-expansion of the DEFAULT_PEX_ROOT value.
DEFAULT_PEX_ROOT="$(echo ~/.pex)"

PEX_ROOT="${PEX_ROOT:-${DEFAULT_PEX_ROOT}}"
PEX="${PEX_ROOT}/venvs/346aee797b51ee3f468179886b3a5db7c6d723a3/0c2af63c3815d1d03077ee9c1f2cbc64e6c7925d/pex"

if [ -n "${VENV}" -a -x "${PEX}" ]; then
    exec "${PEX}" "$@"
fi

find_python() {
    for python in \
"python3.10" \
"python2.7" \
"python3.5" \
"python3.6" \
"python3.7" \
"python3.8" \
"python3.9" \
"python3.11" \
"python3" \
"python2" \
"python" \
    ; do
        if command -v "${python}" 2>/dev/null; then
            return
        fi
    done
}

python_exe="$(find_python)"
if [ -n "${python_exe}" ]; then
    if [ -z "${VENV}" -a -e "${PEX}" ]; then
        exec "${python_exe}" "${PEX}" "$@"
    else
        # The slow path, run the PEX zipapp so it can rebuild its fast
        # path layout under the PEX_ROOT.
        exec "${python_exe}" "$0" "$@"
    fi
else
    echo >&2 "Failed to find any of these python binaries on the PATH:"
    for python in \
"python3.10" \
"python2.7" \
"python3.5" \
"python3.6" \
"python3.7" \
"python3.8" \
"python3.9" \
"python3.11" \
"python3" \
"python2" \
"python" \
    ; do
        echo >&2 "${python}"
    done
    echo >&2 "Either adjust your $PATH which is currently:"
    echo >&2 "${PATH}"
    echo >&2 -n "Or else install an appropriate Python that provides "
    echo >&2 "one of the binaries in this list."
    exit 1
fi

@jsirois
Member Author

jsirois commented Apr 13, 2022

And busybox crushes all comers:

$ hyperfine -L interpreter 'sh,bash,ksh,dash,zsh,busybox ash,python' '{interpreter} ./cowsay.bash.venv.pex Moo'
Benchmark 1: sh ./cowsay.bash.venv.pex Moo
  Time (mean ± σ):      13.5 ms ±   0.4 ms    [User: 11.4 ms, System: 2.2 ms]
  Range (min … max):    12.8 ms …  15.9 ms    188 runs
 
Benchmark 2: bash ./cowsay.bash.venv.pex Moo
  Time (mean ± σ):      13.5 ms ±   0.3 ms    [User: 11.9 ms, System: 1.8 ms]
  Range (min … max):    12.8 ms …  14.7 ms    195 runs
 
Benchmark 3: ksh ./cowsay.bash.venv.pex Moo
  Time (mean ± σ):      13.2 ms ±   0.3 ms    [User: 11.6 ms, System: 1.7 ms]
  Range (min … max):    12.5 ms …  14.3 ms    196 runs
 
Benchmark 4: dash ./cowsay.bash.venv.pex Moo
  Time (mean ± σ):      12.3 ms ±   0.4 ms    [User: 10.4 ms, System: 2.1 ms]
  Range (min … max):    11.8 ms …  13.5 ms    207 runs
 
Benchmark 5: zsh ./cowsay.bash.venv.pex Moo
  Time (mean ± σ):      13.6 ms ±   0.4 ms    [User: 11.2 ms, System: 2.4 ms]
  Range (min … max):    12.9 ms …  15.4 ms    193 runs
 
Benchmark 6: busybox ash ./cowsay.bash.venv.pex Moo
  Time (mean ± σ):      12.1 ms ±   0.3 ms    [User: 10.5 ms, System: 1.7 ms]
  Range (min … max):    11.4 ms …  13.2 ms    217 runs
 
Benchmark 7: python ./cowsay.bash.venv.pex Moo
  Time (mean ± σ):      74.2 ms ±   0.8 ms    [User: 64.4 ms, System: 10.7 ms]
  Range (min … max):    72.7 ms …  75.9 ms    38 runs
 
Summary
  'busybox ash ./cowsay.bash.venv.pex Moo' ran
    1.02 ± 0.04 times faster than 'dash ./cowsay.bash.venv.pex Moo'
    1.09 ± 0.04 times faster than 'ksh ./cowsay.bash.venv.pex Moo'
    1.11 ± 0.04 times faster than 'bash ./cowsay.bash.venv.pex Moo'
    1.11 ± 0.04 times faster than 'sh ./cowsay.bash.venv.pex Moo'
    1.12 ± 0.04 times faster than 'zsh ./cowsay.bash.venv.pex Moo'
    6.10 ± 0.18 times faster than 'python ./cowsay.bash.venv.pex Moo'

jsirois added a commit that referenced this issue Apr 14, 2022
Allow users to choose `sh` as the boot mechanism for their PEXes. Not
only is `/bin/sh` probably more widely available than any given Python
shebang, but it's also much faster.

Relates to #1115 and #1540
@stuhood

stuhood commented Apr 14, 2022

Super awesome.

In the spirit of making progress on this ticket (and understanding that what you said about wanting to support environments without shells installed means that python-shebang will need to stick around as an option), it seems like maybe --bash-boot would be a reasonable default... it's certainly easier to get right than shebang setting.

@jsirois
Member Author

jsirois commented Apr 14, 2022

Pants can definitely choose that as a default for its users. Pex cannot until the Pex 3 release - my policy is to only fix bugs and add features but not break users without bumping major.

jsirois added a commit to jsirois/pex that referenced this issue Dec 5, 2023
Although its not always possible to derive an appropriate shebang that
will work for multiplatform PEXes, we now do so when possible and warn
when we cannot.

Fixes pex-tool#1540
jsirois added a commit that referenced this issue Dec 5, 2023
Although it's not always possible to derive an appropriate shebang that
will work for multi-platform PEXes, we now do so when possible and warn
when we cannot.

Fixes #1540