Add --scie option to produce native PEX exes. #2466

Merged: 12 commits, Jul 17, 2024

Member

@jsirois jsirois commented Jul 15, 2024

You can now specify `--scie {eager,lazy}` when building a PEX file and
one or more additional native executable PEX scies will be produced
alongside the PEX file. These PEX scies will contain a portable CPython
interpreter from [Python Standalone Builds][PBS] in the `--scie eager`
case and will instead fetch a portable CPython interpreter just in time
on first boot on a given machine if needed in the `--scie lazy` case.

Although Pex will pick the target platforms and target portable CPython
interpreter version automatically, if more control is desired over which
platforms are targeted and which Python version is used, then
`--scie-platform`, `--scie-pbs-release`, and `--scie-python-version` can
be specified.

Closes pex-tool#636
Closes pex-tool#1007
Closes pex-tool#2096

[PBS]: https://github.com/indygreg/python-build-standalone
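
For readers who drive Pex from a build script, the options described above can be assembled as in the following minimal sketch. The helper name `pex_scie_argv` is hypothetical, and the sketch assumes `--scie-platform` may be repeated; only the flag names themselves come from the description. It builds the argument list and does not run Pex.

```python
from typing import List, Optional, Sequence


def pex_scie_argv(
    requirements: Sequence[str],
    output: str,
    style: str = "eager",
    platforms: Sequence[str] = (),
    pbs_release: Optional[str] = None,
    python_version: Optional[str] = None,
) -> List[str]:
    """Assemble a `pex` command line that also emits a scie (hypothetical helper)."""
    if style not in ("eager", "lazy"):
        raise ValueError("--scie accepts only 'eager' or 'lazy', got {!r}".format(style))
    argv = ["pex", *requirements, "--scie", style, "-o", output]
    # Assumption: the platform option can be given once per targeted platform.
    for platform in platforms:
        argv += ["--scie-platform", platform]
    if pbs_release:
        argv += ["--scie-pbs-release", pbs_release]
    if python_version:
        argv += ["--scie-python-version", python_version]
    return argv


print(" ".join(pex_scie_argv(["cowsay"], "cowsay.pex", style="lazy")))
```

Passing the resulting list to `subprocess.run` would perform the build, provided `pex` is on the `PATH`.
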
Member Author

jsirois commented Jul 15, 2024

Reviewers - yet another big one. Thanks in advance for any time you can spare. This 1st commit has no tests; those are coming in a bit, but I wanted to get this out in case you wanted to start reading. There has been pretty extensive manual testing, both for perf (see the resulting binding command, which brings perf down to --sh-boot levels in all cases) and for feature-matrix complexity.

"args": ["{scie.bindings.configure:PEX}"],
}
],
"bindings": [
Member Author

@jsirois jsirois Jul 15, 2024

This is important for perf. It's pretty bad to package up your "native Python executable" and see it take ~70ms (for cowsay) when a plain --venv --sh-boot cowsay PEX gets ~20ms. The binding gets any PEX that is scie'd up to --sh-boot perf levels.

@attr.s(frozen=True)
class ScieConfiguration(object):
@classmethod
def from_tags(
Member Author

This is unused, but would / will be used by pex3 scie create pre-existing.pex. The idea is to get the wheel tags for all the distributions in the pre-existing PEX and use those to determine the platforms to target.
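
The idea of deriving target platforms from wheel tags can be illustrated with a rough sketch. This is not the `ScieConfiguration.from_tags` implementation; the function names are hypothetical, the `macos-x86_64` platform name is an assumption modeled on the platform names that appear elsewhere in this thread, and the wheel names in the usage lines are illustrative.

```python
from typing import Iterable, Optional, Set


def scie_platform(wheel_platform_tag: str) -> Optional[str]:
    """Map one wheel platform tag to a scie platform name, or None if unrecognized."""
    if wheel_platform_tag.startswith(("manylinux", "linux")):
        os_name = "linux"
    elif wheel_platform_tag.startswith("macosx"):
        os_name = "macos"
    else:
        return None
    if wheel_platform_tag.endswith(("aarch64", "arm64")):
        return os_name + "-aarch64"
    if wheel_platform_tag.endswith("x86_64"):
        return os_name + "-x86_64"
    return None


def platforms_for_wheels(wheel_names: Iterable[str]) -> Optional[Set[str]]:
    """Intersect the platforms supported by each platform-specific wheel.

    Returns None when every wheel is tagged "any", meaning no restriction.
    """
    result = None  # type: Optional[Set[str]]
    for wheel_name in wheel_names:
        # The platform tag is the last dash-separated field before ".whl";
        # a compressed tag set joins several platform tags with ".".
        platform_field = wheel_name[: -len(".whl")].rsplit("-", 1)[-1]
        tags = platform_field.split(".")
        if tags == ["any"]:
            continue
        supported = {p for p in map(scie_platform, tags) if p}
        result = supported if result is None else result & supported
    return result


wheels = [
    "example_native-1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "example_pure-1.0-py3-none-any.whl",
]
print(platforms_for_wheels(wheels))
```

Intersecting rather than unioning reflects that a PEX can only run where all of its platform-specific distributions are usable.
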

@sureshjoshi
Collaborator

Looking to carve off some time this evening to review this, but before I start, would it be safe to say that this is a (strict?) subset of the equivalent functionality when using science + a lift.toml to create a naively packaged pex + interpreter (e.g. excluding busy box functionality and custom bindings)?

Member Author

jsirois commented Jul 16, 2024

Yes. You'll find a nod to this and a pointer to science docs in the --scie help string as a consequence (i.e.: if you need to get more fancy, go there instead). Text starts here: https://github.com/pex-tool/pex/pull/2466/files#diff-bbf96d2c6fdcaa284ebb9e1fc92f6485b122a18e9ed241d96e943c5a90fbe168R62

jsirois added 2 commits July 15, 2024 19:08
This stresses the full matrix of basic cases (no `--scie-*` options).
Comment on lines +62 to +65
"scie making for a larger file, but requiring no internet access to boot. If you have "
"customization needs not addressed by the Pex `--scie*` options, consider using "
"`science` to build your scies (which is what Pex uses behind the scenes); see: "
"https://science.scie.app.".format(lazy=ScieStyle.LAZY, eager=ScieStyle.EAGER)
Member Author

@jsirois jsirois Jul 16, 2024

@sureshjoshi I see that your Pants plugin accepts an optional custom lift manifest, parses it if present, then injects bits into it. I think to support that sort of thing in a principled way, I'd have to parse the user supplied manifest and confirm they do not set the following keys:

  • ptex
  • scie_jump
  • files: with matching names
  • interpreters or interpreter_groups: with matching ids
  • commands: with a default command (I use this to launch the PEX)
  • bindings: with a matching name (needed for the default command to work)

Additionally, I'd have to advertise that I bind ptex to "ptex" for lazy scies, and always bind configure:PYTHON and configure:PEX.

Without all this I don't see how the user supplied manifest can work with Pex needs fruitfully. Can you think of any other corners? Perhaps I'm overthinking. Do you need this functionality?
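
The checks listed above could look roughly like the following sketch, which inspects an already-parsed `[lift]` table. The function name, the idea of passing reserved names in as parameters, and the assumption that a default command is one without a `name` are all illustrative and not taken from Pex or science.

```python
from typing import Any, Dict, List, Set


def find_conflicts(
    user_lift: Dict[str, Any],
    reserved_files: Set[str],
    reserved_interpreter_ids: Set[str],
    reserved_bindings: Set[str],
) -> List[str]:
    """Return a description of every user manifest entry Pex would need to own."""
    conflicts = []
    for key in ("ptex", "scie_jump"):
        if key in user_lift:
            conflicts.append("{} must not be set".format(key))
    for entry in user_lift.get("files", []):
        if entry.get("name") in reserved_files:
            conflicts.append("file name {!r} is reserved".format(entry["name"]))
    for key in ("interpreters", "interpreter_groups"):
        for entry in user_lift.get(key, []):
            if entry.get("id") in reserved_interpreter_ids:
                conflicts.append("{} id {!r} is reserved".format(key, entry["id"]))
    for command in user_lift.get("commands", []):
        # Assumption: the default command is the one that has no name.
        if not command.get("name"):
            conflicts.append("a default command must not be defined")
    for binding in user_lift.get("bindings", []):
        if binding.get("name") in reserved_bindings:
            conflicts.append("binding name {!r} is reserved".format(binding["name"]))
    return conflicts


user_lift = {
    "files": [{"name": "extra.txt"}],
    "commands": [{"name": "ansible", "exe": "{scie.bindings.venv}/venv/bin/ansible"}],
}
print(find_conflicts(user_lift, {"pex"}, {"cpython"}, {"configure"}))
```

A merge step would then proceed only when the returned list is empty.
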

Member Author

I guess for ptex and scie_jump I could allow user-specified versions (but no more) IFF those versions were compatible with a lower bound.

Collaborator

I was hacking around tonight, trying to envision how I'd re-build something like pantsible (for example).

One idea was to manipulate the embedded manifest after pex generates it (add the custom bindings and whatnot by piping the file to another tool), but then I realized I don't think I'd want to be able to dynamically modify the manifest of what should be a "sealed" binary, as that would be crazy for supply chain purposes - and I don't want to be able to dynamically alter the commands the executable could call.

In the case of the plugin (which, I wouldn't really use as a reference for anything - as I made it a few years ago to solve an immediate deployment problem on a client project), I think we try to use the optional lift.toml where possible and inject the target names under certain conditions.

For this PR, I don't see any problems with deferring all of those concerns, but I'm of two minds.

  • pex being able to accept a custom manifest template that has to be perfectly structured, with/without certain keys feels a bit hacky
  • Using a separate tool (science), which overlaps with a lot of what pex would provide, feels off too

Would it make sense/be possible for science to defer to pex in some way, for the embedded Python interpreter? I'm trying to envision some sort of cleaner composition between two tools which have similar base functionality - but science allows some added knobs.

[lift]
name = "pantsible"
description = "Ansible with an embedded Python interpreter."
platforms = ... inferred from pex ...

[[lift.interpreters]] -> ... inferred from pex ...

[[lift.files]]
name = "pex"

[[lift.commands]]
name = "ansible"
exe = "{scie.bindings.venv}/venv/bin/ansible"
args = []

...

Although, one immediate problem I see here... I think I'm conflating a pex file with the pex CLI.

Collaborator

Reading through the PR, another thought that popped into my head is allowing for the pex CLI's generated TOML to act as an overlay or merge-manifest with a local one.

Whether that functionality is in science or pex CLI - overlaying/overwriting the user created manifest seems reasonable.

Member Author

@jsirois jsirois Jul 16, 2024

> I was hacking around tonight, trying to envision how I'd re-build something like pantsible (for example).

Well, pantsible uses a feature specific to scies over and above a PEX, namely the BusyBox support. It makes sense to me to just directly support this with --scie-busybox [list of entry points]. If you specify that then Pex emits a manifest with no default command and just named commands for each listed entry point.

> Would it make sense/be possible for science to defer to pex in some way, for the embedded Python interpreter? I'm trying to envision some sort of cleaner composition between two tools which have similar base functionality - but science allows some added knobs.

Well, science is general purpose - Any language; so it doesn't really make sense for it to know about Python let alone Pex. It does have a Provider interface to supply interpreters and that has exactly 1 implementation currently, that provides PBS interpreters. A PEX provider might make sense.

That said, Pex creates PEXes - single file executables. These do not have:

  1. BusyBox support: You need conscript, for example, for that.
  2. Bindings support: I.E.: Pex offers you no way to do pre-launch setup. You just have to write Python code to do 1 time setup in your main if you want that or provide alternate entry points fired off with {PEX_MODULE=foo,PEX_SCRIPT=bar} ./my.pex

As such, I think it makes sense for Pex to offer the ability to take your PEX file and turn it into a scie that behaves exactly the same, with nothing extra except maybe running faster. Everything you'd do in a custom manifest, afaict, would add things the PEX cannot already do. At that point, having to move up a layer and use science yourself with a custom lift manifest to build your app not using Pex directly makes sense. I.E.: what scie-pants has to do. The Pants app is more complex than just what the Pants PEX does / has tight perf overhead concerns; so it makes sense to move up to the higher layer.

> Reading through the PR, another thought that popped into my head is allowing for the pex CLI's generated TOML to act as an overlay or merge-manifest with a local one.

That's exactly what I meant by all this: #2466 (comment) It seems to me you can't just overlay, you must confirm the key mechanisms Pex uses in its lift are not destroyed by the merge before merging.

Collaborator

I'm referring to downstream tools like science, not pex, in this case. As in "once you've created a pex, then ..."

Anyways, the things I have in my mind are probably out of scope of this PR, and if they're important enough, or strongly enough use-cased, I can open a new ticket later.

Member Author

@jsirois jsirois Jul 16, 2024

Gotcha. So I think the PEX interpreter Provider would just use the pex3 scie create ... logic I referenced here: #2466 (comment)

I.E.: not create the scie, but use the ScieConfiguration.from_tags API + a given PEX file to source the tags to implement platform / interpreter selection via the calculated ScieConfiguration's ScieTarget targets which include platform, pbs_release and python_version.

That said, the current science Provider interface only allows providing an interpreter and not a set of platforms; so new API work would need to be done in science anyhow it seems to plug all this in.

Member Author

@jsirois jsirois Jul 16, 2024

I guess the current API does allow enough for a PEX interpreter Provider to error when asked to produce an interpreter distribution via Provider.distribution(platform) for a platform the PEX does not support. That's probably actually enough:

[lift]
name = "example"
platforms = [
    "linux-aarch64",
    "linux-x86_64",
    "macos-aarch64",
]

[[lift.files]]
name = "pex"

[[lift.interpreters]]
id = "cpython"
provider = "PEX"
pex = "{pex}"

Here if I ran science lift --file pex=my-py37.pex build ... the PEX interpreter Provider could fail since CPython 3.7 is not supported and if I ran science lift --file pex=my-py38.pex build ... it could fail fast if, for example, there were no 3.8 linux-aarch64 distributions in the latest PBS release.
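
The fail-fast behavior described here might be sketched as below. Only the `Provider.distribution(platform)` method name comes from the comment above; the class, the exception, and the shape of the availability table are hypothetical and do not reflect the actual science `Provider` interface.

```python
from typing import Dict, Tuple


class UnsupportedPlatformError(Exception):
    """Raised when no interpreter distribution fits the PEX on a platform."""


class PexInterpreterProvider:
    """Hypothetical provider that only yields interpreters the PEX can use."""

    def __init__(self, python_version: str, available: Dict[Tuple[str, str], str]) -> None:
        # python_version would be derived from the PEX; `available` maps
        # (python_version, platform) to a distribution identifier.
        self._python_version = python_version
        self._available = available

    def distribution(self, platform: str) -> str:
        try:
            return self._available[(self._python_version, platform)]
        except KeyError:
            raise UnsupportedPlatformError(
                "No CPython {} distribution is available for {}".format(
                    self._python_version, platform
                )
            )


# Illustrative data only: one distribution for a single version and platform.
provider = PexInterpreterProvider("3.8", {("3.8", "linux-x86_64"): "example-cpython-3.8-linux-x86_64"})
print(provider.distribution("linux-x86_64"))
```

Asking this provider for `linux-aarch64` raises `UnsupportedPlatformError`, which is the point at which a build could stop early.
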

Collaborator

@sureshjoshi sureshjoshi Jul 16, 2024

Yep, there we go - that's the kinda thing I see value in. One less place where head scratching can take place.

Member Author

The restricting use case for --scie-platform I mentioned now has a test in 61f55a4 as does auto platforms detection.

@@ -39,6 +42,7 @@ def register_options(parser):

parser.add_argument(
"--scie",
"--par",
Member Author

@jsirois jsirois Jul 16, 2024

As the issues attached to this PR prove, people in the world know "PAR"; so it seems to make sense to add a --par alias to this one option for discoverability by those people. If they need more than the default --par treatment, then they really must learn about scies and --scie-* advanced options anyhow.
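
How an alias like this behaves in `argparse` can be seen in a small standalone sketch. The parser below is not Pex's option registration: the `dest` name, the plain-string choices and the help text are assumptions for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog="pex-sketch")
parser.add_argument(
    "--scie",
    "--par",  # Alias: both spellings store into the same destination.
    dest="scie_style",
    choices=["eager", "lazy"],
    default=None,
    help="Also produce a native executable next to the PEX file.",
)

options = parser.parse_args(["--par", "lazy"])
print(options.scie_style)  # lazy
```

Because both option strings belong to one argument, `--help` lists them together, which is what makes the alias discoverable.
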

Collaborator

benjyw commented Jul 16, 2024

Will try and make some time to review this in a few hours. But very cool feature!

Member Author

jsirois commented Jul 16, 2024

OK, CI is now down to erroring on the Linux runners having ~/.netrc as a directory and the Python netrc stdlib not dealing with this 🤦 . I can work around this in science - where the error is originating from - but for now I'd like to solve just Pex issues; so I'll work around in CI instead.

... and again the face-palm was mine own. This was an issue in the dtox.sh script used on the Linux runners - now fixed.

jsirois added 4 commits July 16, 2024 11:19
Exercise platform / interpreter auto detection as well as explicit
restriction via `--scie-platform`. Also open up control via `PEX_*`
env vars: there was no need to mask these for proper scie operation
and the end result is a scie that works exactly like the PEX it was
built from.
Member Author

jsirois commented Jul 16, 2024

Alright reviewers, the tests are now complete. Good for a final review.

@sureshjoshi I'm happy to break off a feature request for either or both of the --scie-manifest and --scie-busybox ideas that came up in our thread above, just let me know if either makes sense / are features you will use.

@sureshjoshi
Collaborator

> I'm happy to break off a feature request for either or both of the --scie-manifest and --scie-busybox ideas that came up in our thread above, just let me know if either makes sense / are features you will use.

Yep, after this lands, I can play around with it a bit more and see where it leads to.

In the meantime, I want to confirm that this is the expected behaviour.

# foo.py
import uvicorn
from fastapi import FastAPI

app = FastAPI()


@app.get("/")
async def root():
    return {"message": "Hello World"}

if __name__ == "__main__":
    uvicorn.run(app, host="localhost", port=8000)

% python3.12 -m pex fastapi uvicorn --scie eager --scie-python-version 3.11  -o foo.pex -- foo.py

% SCIE=inspect ./foo
...
 "files": [
    {
      "name": "cpython-3.11.9+20240713-aarch64-apple-darwin-install_only.tar.gz",
      "key": "cpython",
...

% ./foo
Python 3.12.4 (main, Jun  6 2024, 18:26:44) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> import fastapi
>>>

In my example, as the pex was built with python3.12, the pex shebang is /usr/bin/env python3.12 - so even though the scie is bundled with Python 3.11, we are expecting to enter a 3.12 REPL, correct?

Based on the comment in the thread above:

> As such, I think it makes sense for Pex to offer the ability to take your PEX file and turn it into a scie that behaves exactly the same, with nothing extra except maybe running faster.

% python3.11 ./foo.pex 
Python 3.12.4 (main, Jun  6 2024, 18:26:44) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> 

The current behaviour matches what would happen if I just ran the pex with python3.11, so everything seems to line up and I'm just confirming my understanding of the feature.

Collaborator

@sureshjoshi sureshjoshi left a comment

Looks good to me after the back and forth. Would still be good to get eyeballs from someone more familiar with pex itself than me.

My familiarity was more with the scie side of things.

Member Author

jsirois commented Jul 17, 2024

> Looks good to me after the back and forth. Would still be good to get eyeballs from someone more familiar with pex itself than me.

AFAICT that is currently basically no one except me.

@sureshjoshi
Collaborator

> > Looks good to me after the back and forth. Would still be good to get eyeballs from someone more familiar with pex itself than me.
>
> AFAICT that is currently basically no one except me.

😆 Good point

Collaborator

zmanji commented Jul 17, 2024

I don't have time to review this but I do have a question.

If one needs to customize the binaries, they would need to use science to create new binaries right?

Member Author

jsirois commented Jul 17, 2024

> The current behaviour matches what would happen if I just ran the pex with python3.11, so everything seems to line up and I'm just confirming my understanding of the feature.

Well ... you did a super weird thing too though. What do you think you meant by the trailing -- foo.py?! Did you mean to use --exe foo.py? Or were you just stressing buggy use cases? The -- foo.py you used just throws away those extra args, which is probably a bug - you should be warned at least. So you just get a foo.pex (and thus a foo scie) without an entrypoint.

All that weird aside, what actually happened here is this: You built a platform specific PEX for Python 3.12, but instead of letting Pex use that to configure a 3.12 PBS, you overrode that and said 3.11 is fine - which it's not. When the boot binding runs, Pex is smart enough to test the current PBS 3.11 interpreter, find it can't load the PEX, then continue on to try other Pythons on the PATH. It finds a python3.12, which works to load the PEX and then writes out these bindings on my machine:

cat /home/jsirois/.cache/nce/5f4d759f14822688a76e0fd21f7a93897017bba9ba2218635023781d324ee362/locks/configure-bfbf6d1d4ddde46844370bf7672b02dfc07b0e8318fb2ce7b277b8436167a67b
PYTHON=/usr/bin/python3.12
PEX=/home/jsirois/.cache/nce/5f4d759f14822688a76e0fd21f7a93897017bba9ba2218635023781d324ee362/bindings/pex_root/unzipped_pexes/c27c9d03a91f03a2286d5901502f2ab7872918e5/__main__.py

So, as for --scie-platform, the use case for --scie-pbs-release and --scie-python-version is generally narrowing the values that naturally arise from the PEX in question. The PEX here only supports 3.12; but you pushed the version out of bounds.

I guess I probably should blank out PATH in the boot binding to keep things hermetic:

:; git diff pex/scie/science.py
diff --git a/pex/scie/science.py b/pex/scie/science.py
index 61f9f7ac..50935894 100644
--- a/pex/scie/science.py
+++ b/pex/scie/science.py
@@ -114,6 +114,7 @@ def create_manifests(
             {
                 "env": {
                     "default": env_default,
+                    "remove_exact": ["PATH"],
                     "remove_re": ["PEX_.*"],
                     "replace": {
                         "PEX_INTERPRETER": "1",
:; git diff pex/pex_bootstrapper.py
diff --git a/pex/pex_bootstrapper.py b/pex/pex_bootstrapper.py
index a097736f..e3609efc 100644
--- a/pex/pex_bootstrapper.py
+++ b/pex/pex_bootstrapper.py
@@ -314,7 +314,7 @@ def find_compatible_interpreter(interpreter_test=None):
                         path=(
                             os.pathsep.join(ENV.PEX_PYTHON_PATH)
                             if ENV.PEX_PYTHON_PATH
-                            else os.getenv("PATH")
+                            else os.getenv("PATH", "(The PATH is empty!)")
                         )
                     )
                 )

Gives:

:; python3.12 -m pex fastapi uvicorn --scie eager --scie-python-version 3.11 -o foo.pex
:; ./foo
Failed to find compatible interpreter on path (The PATH is empty!).

Examined the following interpreters:
1.) /home/jsirois/.cache/nce/1f91c44febc850376a35ae77e1d45f7c823994b0c80293bbbc17e647eb893853/cpython-3.11.9+20240713-x86_64-unknown-linux-gnu-install_only.tar.gz/python/bin/python3.11 CPython==3.11.9

No interpreter compatible with the requested constraints was found:

  Failed to resolve requirements from PEX environment @ /home/jsirois/.cache/nce/263d2999f5e4edddedbbcb29b3aeaf6f49d373ee26a76de93ad97d16f9959b0d/bindings/pex_root/unzipped_pexes/c27c9d03a91f03a2286d5901502f2ab7872918e5.
  Needed cp311-cp311-manylinux_2_35_x86_64 compatible dependencies for:
   1: pydantic-core==2.20.1
      Required by:
        pydantic 2.8.2
      But this pex had no ProjectName(raw='pydantic-core', validated=False, normalized='pydantic-core') distributions.
   2: MarkupSafe>=2.0
      Required by:
        Jinja2 3.1.4
      But this pex had no ProjectName(raw='MarkupSafe', validated=False, normalized='markupsafe') distributions.
   3: httptools>=0.5.0; extra == "standard"
      Required by:
        uvicorn 0.30.1
      But this pex had no ProjectName(raw='httptools', validated=False, normalized='httptools') distributions.
   4: pyyaml>=5.1; extra == "standard"
      Required by:
        uvicorn 0.30.1
      But this pex had no ProjectName(raw='pyyaml', validated=False, normalized='pyyaml') distributions.
   5: uvloop!=0.15.0,!=0.15.1,>=0.14.0; (sys_platform != "win32" and (sys_platform != "cygwin" and platform_python_implementation != "PyPy")) and extra == "standard"
      Required by:
        uvicorn 0.30.1
      But this pex had no ProjectName(raw='uvloop', validated=False, normalized='uvloop') distributions.
   6: watchfiles>=0.13; extra == "standard"
      Required by:
        uvicorn 0.30.1
      But this pex had no ProjectName(raw='watchfiles', validated=False, normalized='watchfiles') distributions.
   7: websockets>=10.4; extra == "standard"
      Required by:
        uvicorn 0.30.1
      But this pex had no ProjectName(raw='websockets', validated=False, normalized='websockets') distributions.
Error: Failed to establish atomic directory /home/jsirois/.cache/nce/263d2999f5e4edddedbbcb29b3aeaf6f49d373ee26a76de93ad97d16f9959b0d/locks/configure-9150f882feea5a550e5936c10776cf44573934ed3669cd27f5bd99ec8ef75f90. Population of work directory failed: Boot binding command failed: exit status: 1

The ./foo scie contains no alternate boot commands.

What do you think @sureshjoshi? Keep it behaving just like the PEX and bouncing down the PATH to find an interpreter that works (this means we shipped the wrong Python but the target machine had the right one), or keep things hermetic and fail as my experiment above does?
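
The two options in this question differ only in what happens after the bundled interpreter is rejected, as the following sketch shows. It is a simplified model and not the logic in `pex/pex_bootstrapper.py`; the function and parameter names are hypothetical.

```python
from typing import Callable, Iterable, Optional


def choose_interpreter(
    bundled: str,
    path_candidates: Iterable[str],
    can_load_pex: Callable[[str], bool],
    hermetic: bool,
) -> Optional[str]:
    """Pick the interpreter used to boot the PEX, or None if nothing fits."""
    if can_load_pex(bundled):
        return bundled
    if hermetic:
        # Hermetic: never look beyond the interpreter shipped in the scie.
        return None
    # Non-hermetic: behave like a traditional PEX and walk the PATH.
    for candidate in path_candidates:
        if can_load_pex(candidate):
            return candidate
    return None


def only_312(python: str) -> bool:
    # Stand-in compatibility check for the example in this thread.
    return python.endswith("python3.12")


print(choose_interpreter("python3.11", ["/usr/bin/python3.12"], only_312, hermetic=False))
print(choose_interpreter("python3.11", ["/usr/bin/python3.12"], only_312, hermetic=True))
```

In the hermetic case a `None` result corresponds to the boot binding failure shown above, which surfaces the version mismatch on first run instead of hiding it.
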


FWIW, I debugged all this with 2 techniques:

  1. Sanity check what's going on: rm -rf ~/.cache/nce && RUST_LOG=trace PEX_VERBOSE=1 ./foo
  2. Debug the binding step: SCIE=split ./foo dist && _PEX_SCIE_INSTALLED_PEX_DIR=fake SCIE_BINDING_ENV=/dev/fd/0 PEX_VERBOSE=1 python3.11 dist/pex dist/configure-binding.py

Member Author

jsirois commented Jul 17, 2024

> If one needs to customize the binaries, they would need to use science to create new binaries right?

@zmanji in short, probably yes.

You could use science, but you can also just use cat plus a copy of the scie-jump (and a copy of ptex if you want lazy loading). See: https://github.com/a-scie/jump/blob/main/docs/packaging.md for more, but science is just a high level tool that dogfoods itself and these low level tools to provide a native python science binary that makes assembling scies a bit easier.

As per my debug session above of @sureshjoshi's test rig case, you can also just use Pex to build your scie, then split it into its components with SCIE=split ./my-pex-scie /tmp/workbench, then cd to the /tmp/workbench and edit the lift.json, and symlink or copy any extra files you added to the manifest to the directory and then run ./scie-jump to re-assemble the scie. It will plop out in that directory.

Collaborator

zmanji commented Jul 17, 2024

On being hermetic, I will just say that pex's strength is being hermetic out of the box with flags to disable that if needed. I think a pex built with this feature should strip the PATH by default.

Member Author

jsirois commented Jul 17, 2024

> On being hermetic, I will just say that pex's strength is being hermetic out of the box with flags to disable that if needed. I think a pex built with this feature should strip the PATH by default.

I like it! Even though this breaks the "PEX scie works just like the PEX" ~guarantee, it breaks it in exactly the one part of a PEX that a scie fixes, which is sealing in the interpreter. The only reason a PEX needs to bounce around to find a compatible Python, if there even is one, is that 1 glaring bit of non-hermeticity in traditional PEXes.

@sureshjoshi
Collaborator

In my case, I wasn't trying to generate an exe or script - I was just trying to make a packaged repl with fastapi, uvicorn, and my foo.py (which seemed to work, as far as I could tell). I grabbed that example from something I was doing a couple of weeks ago on one of my many weird side-tangents. I'm sure there's a better way, but it worked one time I tried it, and I just ran with it since it's just a scratchpad.

> Keep it behaving just like the PEX and bouncing down the PATH to find an interpreter that works (this means we shipped the wrong Python but the target machine had the right one), or keep things hermetic and fail as my experiment above does?

Alright, yeah, my behavioural expectation test was presuming the goal was: "PEX scie works just like the PEX" - which it does.

BUT, having said that, I think being hermetic is preferable. Building with and bundling different interpreters is an easy blunder to make, and the last place you want to find that error is after deployment.

Member Author

jsirois commented Jul 17, 2024

> In my case, I wasn't trying to generate an exe or script - I was just trying to make a packaged repl with fastapi, uvicorn, and my foo.py (which seemed to work, as far as I could tell).

@sureshjoshi it did not. The foo.py was not included. I think you are confused by how Pex works when you don't specify -o - then, and only then, the -- ... extra args get passed to the ephemeral PEX that is created, run, and thrown away.

@sureshjoshi
Collaborator

> > In my case, I wasn't trying to generate an exe or script - I was just trying to make a packaged repl with fastapi, uvicorn, and my foo.py (which seemed to work, as far as I could tell).
>
> @sureshjoshi it did not. The foo.py was not included.

🤦🏽

It was just loading the local foo.py all along.

Whelp, at least my pain and suffering led to a hermetic scie.

Member Author

jsirois commented Jul 17, 2024

> It was just loading the local foo.py all along.

@sureshjoshi yes. Thanks for that though - as you said, everything is better as a result - except perhaps your sanity. So, people seem to never run zipinfo on their PEXes, but it's really helpful. So helpful, I went through a lot of effort to make it so that you can do that to your PEX scie as well.

Hopefully very (power?) user friendly:

:; file foo
foo: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), static-pie linked, BuildID[sha1]=f1f01ca2ad165fed27f8304d4b2fad02dcacdffe, stripped

:; tail -1 foo | jq '.scie.lift.files[] | select(.key == "cpython")'
{
  "name": "cpython-3.11.9+20240713-x86_64-unknown-linux-gnu-install_only.tar.gz",
  "key": "cpython",
  "size": 29814546,
  "hash": "1f91c44febc850376a35ae77e1d45f7c823994b0c80293bbbc17e647eb893853",
  "type": "tar.gz"
}

:; zipinfo -1 foo | tail
warning [foo]:  31627135 extra bytes at beginning or within zipfile
  (attempting to process anyway)
.deps/websockets-12.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl/websockets-12.0.dist-info/
.deps/websockets-12.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl/websockets-12.0.dist-info/INSTALLER
.deps/websockets-12.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl/websockets-12.0.dist-info/LICENSE
.deps/websockets-12.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl/websockets-12.0.dist-info/METADATA
.deps/websockets-12.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl/websockets-12.0.dist-info/WHEEL
.deps/websockets-12.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl/websockets-12.0.dist-info/top_level.txt
PEX-INFO
__main__.py
__pex__/
__pex__/__init__.py

:; unzip -qc foo PEX-INFO | jq .requirements
warning [foo]:  31627135 extra bytes at beginning or within zipfile
  (attempting to process anyway)
[
  "fastapi",
  "uvicorn"
]
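
The `tail -1 foo | jq ...` inspection above has a straightforward Python counterpart, sketched here under the assumption (taken from that command) that the lift manifest is a single JSON line at the very end of the scie. The function names are hypothetical, and the whole file is read into memory for simplicity.

```python
import json
from typing import Any, Dict, List


def read_lift_manifest(scie_path: str) -> Dict[str, Any]:
    """Parse the JSON document stored on the last line of a scie."""
    with open(scie_path, "rb") as fp:
        data = fp.read()
    # Equivalent of `tail -1`: drop a trailing newline, then take the last line.
    last_line = data.rstrip(b"\n").rsplit(b"\n", 1)[-1]
    return json.loads(last_line)


def files_with_key(manifest: Dict[str, Any], key: str) -> List[Dict[str, Any]]:
    """Equivalent of jq '.scie.lift.files[] | select(.key == KEY)'."""
    return [entry for entry in manifest["scie"]["lift"]["files"] if entry.get("key") == key]
```

Calling `files_with_key(read_lift_manifest("foo"), "cpython")` on the scie from this thread should list the embedded interpreter entry shown in the `jq` output.
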

@sureshjoshi
Collaborator

Yeah, I unzipped and grepped, but I had the file referenced otherwise - so it showed up in my grep, but it was just a filename, not the file itself.

As I said, very weird tangents I was messing around with 🤦🏽

Member Author

jsirois commented Jul 17, 2024

@benjyw I'm headed to the hills for a bit; so I'm going to proceed to merge this and get out a release. I feel good about the current commitments, but I'll circle back if you spot bugs or have questions.

@jsirois jsirois merged commit 19c45fb into pex-tool:main Jul 17, 2024
26 checks passed
@jsirois jsirois deleted the issues/2096 branch July 17, 2024 06:31
Collaborator

benjyw commented Jul 17, 2024

Sounds fine, I'll take a look ASAP - I'm on vacation in Europe so code reviews are backing up.
