Support PEX files as resolve repositories. #1108

Closed
jsirois opened this issue Nov 9, 2020 · 5 comments · Fixed by #1182

Comments

@jsirois
Member

jsirois commented Nov 9, 2020

This would allow using --pex-repository instead of --index / --find-links to attempt to resolve requirements from an already built PEX. This feature would support a Pants use case in pantsbuild/pants#11105 and the underlying implementation is also needed to support #1020 correctly. In that case, each interpreter found on a given host could be tried for compatibility with a PEX by attempting to resolve all its requirements from itself using that interpreter.
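
The interpreter-selection idea can be sketched roughly as follows. This is a minimal illustration, not Pex's implementation: the `PexContents` mapping, the function names and the interpreter paths are all hypothetical, and environment markers are ignored.

```python
from typing import Dict, FrozenSet, List, Optional

# Hypothetical model: project name -> tag sets of each distribution embedded in the PEX.
PexContents = Dict[str, List[FrozenSet[str]]]


def can_resolve(pex: PexContents, supported_tags: FrozenSet[str]) -> bool:
    """Return True if every project in the PEX has a distribution the interpreter supports."""
    return all(
        any(dist_tags & supported_tags for dist_tags in dists)
        for dists in pex.values()
    )


def select_interpreter(
    pex: PexContents, interpreters: Dict[str, FrozenSet[str]]
) -> Optional[str]:
    """Try each interpreter in turn and return the first that can resolve the whole PEX."""
    for path, supported_tags in interpreters.items():
        if can_resolve(pex, supported_tags):
            return path
    return None


# Hypothetical example data.
pex = {
    "requests": [frozenset({"py3-none-any"})],
    "psutil": [frozenset({"cp38-cp38-linux_x86_64"})],
}
interpreters = {
    "/usr/bin/python3.7": frozenset({"cp37-cp37m-linux_x86_64", "py3-none-any"}),
    "/usr/bin/python3.8": frozenset({"cp38-cp38-linux_x86_64", "py3-none-any"}),
}
selected = select_interpreter(pex, interpreters)
```

Here only the Python 3.8 interpreter supports the `psutil` distribution, so it is the one selected.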

@jsirois
Member Author

jsirois commented Nov 9, 2020

A few notes on the resolve logic implementation:

  • At build time, to implement --pex-repository, there are two options:
    1. Implement custom logic that matches requirements against the tags of the wheels in .deps/ for the target interpreters and platforms, recursing on each wheel's Requires-Dist metadata.
    2. Leverage Pip:
      This requires the "wheels" we embed in .deps/ (which are not actually wheels but wheels installed in individual chroots) to be uninstalled wheels. That in turn requires the PEX .bootstrap/ runtime to contain wheel-installing code. Currently it does not, since that code lives in Pip and we do not vendor Pip into the runtime .bootstrap/.
  • At runtime, to implement appropriate interpreter selection:
    Again, since the PEX .bootstrap/ does not currently contain Pip, which is large, we're pushed towards custom logic as sketched above.

Taking Pip as being too large to vendor in the PEX runtime .bootstrap/ as a given, we'll need to implement the custom resolve logic. With that in place we can support --pex-repository, but only when used as the exclusive source of dists. That covers the Pants use case.
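
As a rough illustration of the custom resolve logic described above, the sketch below walks `Requires-Dist`-style dependencies and selects a distribution whose tags intersect the target's supported tags. The `Dist` class, `resolve` function and example data are hypothetical; version specifiers, extras and environment markers are deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Dist:
    """Hypothetical stand-in for an installed wheel chroot found under .deps/."""

    name: str
    tags: FrozenSet[str]
    requires: Tuple[str, ...] = ()  # project names taken from Requires-Dist


class Unresolvable(Exception):
    pass


def resolve(
    roots: Iterable[str], repo: List[Dist], supported: FrozenSet[str]
) -> Dict[str, Dist]:
    """Select one tag-compatible distribution per project, recursing on dependencies."""
    by_name: Dict[str, List[Dist]] = {}
    for dist in repo:
        by_name.setdefault(dist.name, []).append(dist)

    resolved: Dict[str, Dist] = {}
    pending = list(roots)
    while pending:
        name = pending.pop()
        if name in resolved:
            continue
        match = next((d for d in by_name.get(name, []) if d.tags & supported), None)
        if match is None:
            raise Unresolvable(f"no compatible distribution for {name!r}")
        resolved[name] = match
        pending.extend(match.requires)
    return resolved


# Hypothetical repository contents and target.
repo = [
    Dist("requests", frozenset({"py3-none-any"}), ("urllib3", "idna")),
    Dist("urllib3", frozenset({"py3-none-any"})),
    Dist("idna", frozenset({"py3-none-any"})),
    Dist("psutil", frozenset({"cp38-cp38-linux_x86_64"})),
]
supported = frozenset({"cp39-cp39-linux_x86_64", "py3-none-any"})
resolved_names = sorted(resolve(["requests"], repo, supported))
```

Because the repository was already resolved once, the walk never has to choose between versions; it only has to find a compatible distribution for each project it reaches.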

If we did leverage Pip instead to do resolves / convert .deps/ to contain wheels instead of installed wheel chroots, we could support a mix of --pex-repository, --index and --find-links at build time since, in that case, PEX files would contain wheels which could be exposed to Pip directly via --find-links.

@Eric-Arellano
Contributor

> Taking Pip as being too large to vendor in the PEX runtime .bootstrap/ as a given

The concern is bundle size, right? Vendoring pip at build time removed a lot of complexity from Pex, and I suspect that vendoring here would avoid adding lots of complexity/edge cases to runtime.

@jsirois
Member Author

jsirois commented Nov 9, 2020

> The concern is bundle size, right?

Right. For example the Pex PEX itself would go from 2,683,442 bytes to 4,126,954 bytes.
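
For scale, the arithmetic on the two sizes quoted above works out to roughly 1.4 MB of growth, a bit over half again the current size. The variable names are just for illustration:

```python
# Sizes of the Pex PEX quoted in the comment above, in bytes.
size_without_pip = 2_683_442
size_with_pip = 4_126_954

growth_bytes = size_with_pip - size_without_pip
growth_ratio = growth_bytes / size_without_pip

print(f"+{growth_bytes:,} bytes ({growth_ratio:.1%} larger)")
```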

> Vendoring pip at build time removed a lot of complexity from Pex,

That is definitely not the case:

jsirois@gill ~/dev/pantsbuild/pex ((v1.6.12)) $ find pex -type f -name "*.py" ! -wholename "*/vendor/*" | xargs wc -l
   302 pex/package.py
    52 pex/sorter.py
   157 pex/platforms.py
    41 pex/iterator.py
   333 pex/pep425tags.py
    75 pex/fetcher.py
   308 pex/finders.py
   110 pex/requirements.py
    53 pex/archiver.py
   153 pex/installer.py
   240 pex/resolver_options.py
   181 pex/crawler.py
   307 pex/resolvable.py
   292 pex/http.py
   167 pex/translator.py
   586 pex/resolver.py
    93 pex/glibc.py
   135 pex/link.py
   ...
 (3585)
  9507 total

Vs:

jsirois@gill ~/dev/pantsbuild/pex (master) $ find pex -type f -name "*.py" ! -wholename "*/vendor/*" | xargs wc -l
    34 pex/dist_metadata.py
   124 pex/platforms.py
   370 pex/jobs.py
    58 pex/finders.py
    71 pex/requirements.py
   403 pex/pip.py
    55 pex/network_configuration.py
    84 pex/distribution_target.py
  1209 pex/resolver.py
  ...
 (2408)
 10231 total

What it did do, though, is make PEX resolution identical to the resolution people expected, since Pip was, and surely still is, the dominant resolver / de facto standard.

> and I suspect that vendoring here would avoid adding lots of complexity/edge cases to runtime.

This could be true. Keep in mind that all the difficulty of a real resolve is completed by Pip and hidden behind the distributions inside the --pex-repository. No version needs to be chosen when resolving from a --pex-repository; only compatible wheels need be selected (matching tags). Even assuming that vendoring Pip in the runtime .bootstrap/ saves us 10 bugs going forward, it will also undoubtedly be slower at performing the runtime resolve, and this will be overhead on every run of a PEX. The Pip speed can be simulated with a pure --find-links resolve, so I think I'd like to proceed with the custom logic far enough to time it and then compare to the Pip --find-links resolve.

jsirois added a commit that referenced this issue Nov 9, 2020
The motivation is improvements to tag handling across the board and
opening the door for treating PEXes as repositories to resolve from.

Work towards #1108.
@stuhood

stuhood commented Nov 10, 2020

> At buildtime to implement --pex-repository:
>
> 1. Implementing custom logic to compare requirements to `.deps/` wheel tags to interpreters and platforms recursing on wheel `Requires-Dist` metadata requirements.
>
> 2. Leveraging pip:
>    This requires the "wheels" we embed in `.deps/` - which are not actually wheels but wheels installed in individual chroots - to be uninstalled wheels. This in turn requires the ability for the PEX `.bootstrap/` runtime to contain wheel installing code. Currently it does not since this code lives in pip and we do not vendor Pip into the runtime `.bootstrap/`.

@jsirois: My primary question to help decide between these is: "how quickly can pip resolve from a directory/collection of wheels like this?" If it's pretty quick (maybe within 2x of a custom solution), then it seems like that would be the way to go. There is potentially more logic involved in the "graph walk" than just literal lookups in a hashmap, and duplicating that logic in PEX to avoid talking to pip is probably worth avoiding unless the performance difference is significant.

@jsirois
Member Author

jsirois commented Nov 10, 2020

@stuhood your conclusion matches mine above. The cheapest way to do a time comparison is to implement the custom logic first, since the pip side of the timing can just be emulated via a pip download command. Doing the inverse is a much bigger change, since getting pip to resolve from a PEX file requires converting PEX files to contain wheels (today they contain pre-installed wheels) and then installing them at runtime.
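
A timing comparison along these lines could be set up as sketched below. The helper names are hypothetical; the `pip download` flags used (`--no-index`, `--find-links`, `--dest`) are standard Pip options, and the 2x threshold is the rule of thumb suggested earlier in the thread.

```python
import time
from typing import Callable, List, Sequence


def pip_download_command(
    find_links_dir: str, dest_dir: str, requirements: Sequence[str]
) -> List[str]:
    """Build a pip command that resolves purely from a local directory of wheels."""
    return [
        "python", "-m", "pip", "download",
        "--no-index",
        "--find-links", find_links_dir,
        "--dest", dest_dir,
        *requirements,
    ]


def best_time(fn: Callable[[], object], repeat: int = 3) -> float:
    """Return the fastest wall-clock time in seconds over `repeat` calls of `fn`."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def pip_is_competitive(pip_secs: float, custom_secs: float, factor: float = 2.0) -> bool:
    """True if the pip resolve is within `factor` times the custom resolve."""
    return pip_secs <= factor * custom_secs
```

The pip side would then be timed with something like `best_time(lambda: subprocess.run(cmd, check=True))`, and the custom resolve with `best_time` around whatever function implements it.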

jsirois added a commit that referenced this issue Nov 28, 2020
This is a prerequisite for fixing #899, implementing #1108 and the API
will also be useful to Pants which currently can only handle a subset
of requirements, forcing edits of existing requirement files to use
alternative requirement forms.
jsirois added a commit to jsirois/pex that referenced this issue Dec 4, 2020
jsirois added a commit that referenced this issue Dec 5, 2020
This is needed to support #1020 and #1108.
This was referenced Dec 14, 2020
@jsirois jsirois self-assigned this Dec 30, 2020
jsirois added a commit to jsirois/pex that referenced this issue Jan 4, 2021
jsirois added a commit that referenced this issue Jan 5, 2021
jsirois added a commit that referenced this issue Jan 8, 2021
This removes our dependency on pkg_resources Environment / WorkingSet in
favor of performing our own recursive resolve of runtime distributions
to activate using distribution metadata. This fixes an old test bug
noticed by Benjy but, more importantly, sets the stage to fix #899, #1020
and #1108 by equipping PEXEnvironment with the ability to resolve the
appropriate transitive set of distributions from a root set of
requirements instead of the current full set of transitive requirements
stored post-resolve in PexInfo.
jsirois added a commit that referenced this issue Jan 18, 2021
We need this in order to support resolving foreign platforms from PEX
file repositories just like we support resolving foreign platforms from
traditional index servers and find links repositories via Pip.

We use `pip debug` for this even though Pip warns against relying on the
output of that command. Since Pip is vendored, this is only a concern
when we upgrade our vendored Pip. If Pip does yank the debug output we
need or alter its format, we can either implement our own logic using
packaging.tags (~150 LOC) or adapt our parsing logic respectively.

Work towards #1108.
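
To give a feel for what computing the supported tags of a foreign platform involves, here is a heavily simplified sketch. It is not the `packaging.tags` algorithm and not what Pex does: the function name is hypothetical, ABI flag suffixes (such as the `m` in `cp37m`) are ignored, and platform aliases such as the manylinux variants are not expanded.

```python
from typing import Iterator, Tuple

Tag = Tuple[str, str, str]  # (interpreter, abi, platform)


def simplified_cpython_tags(minor: int, platform: str) -> Iterator[Tag]:
    """Yield tags for CPython 3.<minor> from most to least specific (simplified)."""
    interpreter = f"cp3{minor}"
    # Wheels built specifically for this interpreter and platform.
    yield (interpreter, interpreter, platform)
    yield (interpreter, "abi3", platform)
    yield (interpreter, "none", platform)
    # abi3 wheels built against older CPython 3 versions (the stable ABI began with 3.2).
    for older in range(minor - 1, 1, -1):
        yield (f"cp3{older}", "abi3", platform)
    # Pure-Python wheels: platform-specific first, then universal.
    for plat in (platform, "any"):
        yield (f"py3{minor}", "none", plat)
        yield ("py3", "none", plat)
        for older in range(minor - 1, -1, -1):
            yield (f"py3{older}", "none", plat)


tags = list(simplified_cpython_tags(9, "linux_x86_64"))
```

The ordering matters: when several distributions of a project are compatible, the one matching the earliest tag is normally preferred.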
jsirois added a commit that referenced this issue Jan 20, 2021
Instead of requiring a PythonInterpreter to resolve distributions in a
PEXEnvironment, a DistributionTarget is enough. This allows for
resolving distributions from a PEXEnvironment given only a Platform
which we need to support resolving from a `--pex-repository` in #1108.
jsirois added a commit that referenced this issue Jan 20, 2021
Introduce a `--pex-repository` option to the Pex CLI to switch
requirement resolution from using index servers and find-links
repositories to using a local PEX file with pre-resolved requirements.
This can be useful when a number of projects share a consistent resolve
via a shared requirement file. You can resolve the full requirement file
into a requirements PEX and then later resolve just the portions needed
by each individual project from the fully resolved requirements PEX.

Fixes #1108
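
The two-step workflow described in this commit message might look like the following sketch, which only builds the command lines. The file names and helper functions are hypothetical; `-r`, `-o` and `--pex-repository` are Pex CLI options, and per the discussion above `--pex-repository` is assumed to be the exclusive source of distributions (no `--index` or `--find-links` alongside it).

```python
from typing import List, Sequence


def requirements_pex_command(requirements_file: str, output: str) -> List[str]:
    """Step 1: resolve the full shared requirements file into a requirements PEX."""
    return ["pex", "-r", requirements_file, "-o", output]


def project_pex_command(
    repository_pex: str, requirements: Sequence[str], output: str
) -> List[str]:
    """Step 2: resolve only one project's requirements from the requirements PEX."""
    return ["pex", "--pex-repository", repository_pex, *requirements, "-o", output]


# Hypothetical file and requirement names.
step_one = requirements_pex_command("requirements.txt", "requirements.pex")
step_two = project_pex_command("requirements.pex", ["requests"], "project.pex")

if __name__ == "__main__":
    for command in (step_one, step_two):
        print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` on a machine where the `pex` tool is installed.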