Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for --no-pre-install-wheels and --max-install-jobs. #2298

Merged

Conversation

jsirois
Copy link
Member

@jsirois jsirois commented Dec 13, 2023

The --no-pre-install-wheels option causes built PEXes to use raw
.whl files. For --layout zipapp this means a single .whl file is
STORED per dep, and for --layout {packed,loose} this means the loose
.deps/ dir contains raw .whl files. This speeds up all PEX builds by
avoiding pre-installing wheel deps (~unzipping into the PEX_ROOT) and
then, in the case of zipapp and packed layout, re-zipping. For large
dependencies the time savings can be dramatic.

Not pre-installing wheels comes with a PEX boot cold-start performance
tradeoff since installation now needs to be done at runtime. This is
generally a penalty of O(100ms), but that penalty can be erased for some
deployment scenarios with the new --max-install-jobs build option /
PEX_MAX_INSTALL_JOBS runtime env var. By default, runtime installs are
performed serially, but this new option can be set to use multiple
parallel install processes, which can speed up cold boots for large
dependencies.

Fixes #2292

A new `PEX_MAX_INSTALL_JOBS` variable will use this.
This change renames types and variables to take the emphasis off
installation and put it on the more generic result of a resolve (which
will soon include raw `.whl` files as an option in addition to default
installed wheel chroots).

Also factor out the machinery needed to be able to ask `BuildAndInstall`
for `.whl` file results in addition to the current installed wheel
chroot results.
These will be used to select raw wheel dependencies (vs the current
pre-installed wheel chroots) and the parallelism used to install
dependencies at runtime upon the first PEX boot.
This will be used in subsequent changes.
This support will automatically be used as soon as resolves are wired
with an option to return raw `.whl` file results.
This will be needed when building PEXes that use raw `.whl` file dists.
Use the Pip resolver to test the resulting PEX contains .whl files that
are not stored in the PEX with compression.
@jsirois
Copy link
Member Author

jsirois commented Dec 13, 2023

I'll attach some example perf numbers for the torch 2GB case here as soon as CI goes green. Suffice to say, this is both a build time win and run time win for that case.

@jsirois
Copy link
Member Author

jsirois commented Dec 13, 2023

Reviewers: Thanks in advance - this is large. I did break it up into independent commits though, none of which is larger than 550 delta-lines, with most being ~100 delta lines. So instead of ~12 PRs you get 1 with 12 commits. I needed the 1 personally to work out all the titchy details comprehensively and ensure this all worked / vet it was complicated and actually required the 2 new knobs instead of just being the new default.

The break is only under pypy3.{7,8,9,10} and was caused by the
prompt-toolkit 3.0.42 release today.
Working through the perf analysis in pex-tool#2292 brought these to light.
@jsirois
Copy link
Member Author

jsirois commented Dec 13, 2023

Alright, #2292 now has perf analysis for the OP torch case. I'm still working on the 3 mac-specific IT failures, but this is ready for review.

These were solved by switching away from the `temporary_dir`
context manager to pytest's `tmpdir` fixture. Under `temporary_dir` the
`~/.pex/unzipped_pexes/*/.bootstrap` symlink was not working. The
mechanism by which this wasn't working still needs to be root-caused.
Comment on lines +284 to 288
if not self._pex_info.deps_are_wheel_files:
raise ResolveError(
"Cannot resolve .whl files from PEX at {pex}; its dependencies are in the "
"form of pre-installed wheel chroots.".format(pex=self.source_pex)
)
Copy link
Member Author

@jsirois jsirois Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I only like to add orthogonal features that can be combined freely. There are 3 Pex resolver sources: Pip, lock files and PEX repositories, and 2 PEX dependency formats: installed wheel chroots and (now) raw .whl files. In this 6 element matrix, only this case of resolving raw .whl files from a PEX repository containing dependencies in installed wheel chroot form cannot be satisfied currently (this is tested explicitly in the changes to test_pex_repository_resolve.py).

The pep_427.py work that now powers .whl file installs instead of pip install does make it possible to plug this gap in a later release though with some modification of pep_376.py. This will also allow the existing PEX_TOOLS=1 ./a.pex repository extract ... tool to produce fully correct wheels for the installed wheel chroot dependencies in PEXes of that style as well.

Filed: #2299

Copy link
Collaborator

@kaos kaos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not gone through all commits yet. Only been reading the code, trying to grok the change with limited knowledge of the surroundings.

pex/bin/pex.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@huonw huonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The numbers from #2292 are awesome, nice!

group.add_argument(
"--max-install-jobs",
dest="max_install_jobs",
default=1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming this default is to preserve behaviour; if a breaking change was possible (e.g. in 3.0.0), would you consider defaulting to -1?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not. Parallel installs slow down most cases and although -1 tries to guess the divide in the cases, I'm sure it does so poorly. My tests were on a 16 core SSD machine and I'm not confident on the heuristic for my machine let alone other disk / CPU combos.

pex/environment.py Show resolved Hide resolved
Comment on lines +244 to +257
with identify_layout({pex!r}) as layout, atomic_directory(
{spread_dest!r}, source={source!r}
) as spread_chroot:
if not spread_chroot.is_finalized():
with TRACER.timed({extracting_msg!r}):
layout.extract_dist(
dest_dir=spread_chroot.work_dir,
dist_relpath={dist_relpath!r},
is_wheel_file={is_wheel_file!r}
)

symlink_dest = {symlink_dest!r}
safe_mkdir(os.path.dirname(symlink_dest))
os.symlink({symlink_src!r}, symlink_dest)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth exposing the core of this logic as a function that can be used by both both _ensure_distributions_installed_serial directly and by this script?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how you get much better. This code is pretty small and the sticking point is 7 parameters.

pex/layout.py Outdated
pass


AVERAGE_DISTRIBUTION_SIZE_PARALLEL_JOB_THRESHOLD = 1 * 1024 * 1024 # ~1MB
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this number come from experiments?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. As noted elsewhere, on 1 machine and with limited cases. I'm very much not confident in the heuristic.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented.

pex/layout.py Show resolved Hide resolved
pex/resolver.py Show resolved Hide resolved
attr.ib()
) # Mapping[ProjectName, OrderedSet[Requirement]]

def adjust(self, distributions):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this adjusting? (I acknowledge it's just moved code.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name of the extracted class is indicative here, it adjusts a distribution to note the direct top level requirement it satisfies, if any. Most dists usually don't get adjusted since they satisfy only transitive requirements. The need for associating dists to the direct requirements they satisfy goes back to handling local project directory dependencies. These have no project name known except after the resolve process which builds them into a wheel with a name and version.

def test_entry_point_verification_3rdparty(tmpdir):
# type: (Any) -> None
@pytest.mark.parametrize(
"layout", [pytest.param(layout, id=layout.value) for layout in Layout.values()]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally minor, but are you aware that pytest.mark.parametrize supports an ids parameter that can take a function, so these could potentially be something like:

@pytest.mark.parametrize("layout", Layout.values(), ids=lambda layout: layout.value)
...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not aware. I'm not sure I like that though! It's not obvious the two align when you split things up like that. 10 values, 6 ids. Clearly they align in your example, but it takes a second to make sure of that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I read poorly. It's 1 function mapped over all the values. Gotcha.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in addition to the function, it also supports passing a list with exactly the right number of IDs, but I don't like that possibility either.

pex = os.path.join(str(tmpdir), "pex")
exe = os.path.join(str(tmpdir), "exe")
with open(exe, "w") as fp:
fp.write("import colors; print(colors.blue('Moo?'))")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the import colours is meaning to validate that ansicolors is actually installed in a useful way. Should there be an import cowsay too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@jsirois
Copy link
Member Author

jsirois commented Dec 14, 2023

Alright - thanks @huonw and @kaos. I've added the perf analysis for small PEXes to the ticket which backs up what is expected: Builds of .whl file-based PEXes is always at least a little bit faster, Cold 1st runs is always a little bit slower and install parallelization is generally a loss. Although I'd love to not have knobs for these two new features, it seems necessary - there are unavoidable tradeoffs that are hard to justify removing from the user's purview.

@jsirois
Copy link
Member Author

jsirois commented Dec 14, 2023

@kaos, I'm going to submit - I feel pretty confident I've covered the bases here - but I'll follow up with any feedback you have. There is refactoring and simplification to do as at least partly tracked by #2299 that further feedback will fall naturally into.

@jsirois jsirois merged commit b3a7767 into pex-tool:main Dec 14, 2023
24 checks passed
@jsirois jsirois deleted the issues/2292/support-runtime-wheel-install branch December 14, 2023 02:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pex zip-creation takes a very long time for torch>=2
3 participants