Benchmark various defaults for PEX compression for internal PEXes #15062
Comments
Are there internal PEX that aren't |
Yes. A packed layout consists of N + 1 zips. 1 PEX |
Has anyone tried seeing whether PEX files might start up a little faster too if they're uncompressed? |
All PEX files are ~uncompressed (unzipped). Whether |
Oh yes! Thanks! |
To underscore - the benchmark here is about PEX creation time then, not run time. |
With https://github.com/pantsbuild/pex/releases/tag/v2.1.154, potentially we could be considering using
|
(#20347 updates to pex 2.1.155 by default, but doesn't touch |
All changes:

- https://github.com/pantsbuild/pex/releases/tag/v2.1.153
- https://github.com/pantsbuild/pex/releases/tag/v2.1.154
- https://github.com/pantsbuild/pex/releases/tag/v2.1.155

Highlights:

- `--no-pre-install-wheels` (and `--max-install-jobs`) that likely helps with:
  - #15062
  - (the root cause of) #20227
  - _maybe_ arguably #18293, #18965, #19681
- improved shebang selection, helping with #19514, but probably not the full solution (#19925)
- performance improvements
Changelog: https://github.com/pex-tool/pex/releases/tag/v2.3.0

- `sync` of interest for #15704
- error message clarification regarding #15062
- fix for explicit flags as implemented #20598

```
Lockfile diff: 3rdparty/python/user_reqs.lock [python-default]

==                    Upgraded dependencies                     ==

  asgiref                        3.7.2        -->   3.8.1
  cryptography                   42.0.3       -->   42.0.5
  pex                            2.2.1        -->   2.3.0
  pyparsing                      3.1.1        -->   3.1.2
  python-dateutil                2.8.2        -->   2.9.0.post0
  sniffio                        1.3.0        -->   1.3.1
```
…s of internal pexes (#20670)

This has all internal PEXes be built with settings to improve performance:

- with `--no-pre-install-wheels`, to package `.whl` files directly rather than unpack and install them (NB. this requires Pex 2.3.0 to pick up pex-tool/pex#2392)
- with `PEX_MAX_INSTALL_JOBS`, to use more concurrency for install, when available

This is designed to be a performance improvement for any processing where Pants synthesises a PEX internally, like `pants run path/to/script.py` or `pants test ...`. pex-tool/pex#2292 has benchmarks for the PEX tool itself.

For benchmarks, I did some more purposeful ones with tensorflow (PyTorch seems a bit awkward and hard to set up, and Tensorflow is still huge), using https://gist.github.com/huonw/0560f5aaa34630b68bfb7e0995e99285 . I did 3 runs each of two goals, with 2.21.0.dev4 and with `PANTS_SOURCE` pointing to this PR, and pulled the numbers out by finding the relevant log lines:

- `pants --no-local-cache --no-pantsd --named-caches-dir=$(mktemp -d) test example_test.py`. This involves building 4 separate PEXes partially in parallel, partially sequentially: `requirements.pex`, `local_dists.pex`, `pytest.pex`, and then `pytest_runner.pex`. The first and last are the interesting ones for this test.
- `pants --no-local-cache --no-pantsd --named-caches-dir=$(mktemp -d) run script.py`. This just builds the requirements into `script.pex`.

(NB. these are potentially unrealistic in that they're running with all caching turned off or cleared, so are truly a worst case. This means they're downloading tensorflow wheels and all the others, each time, which takes about 30s on my 100Mbit/s connection. Faster connections will thus see a higher ratio of benefit.)
| goal                | period                       | before (s) | after (s) |
|---------------------|------------------------------|-----------:|----------:|
| `run script.py`     | building requirements        |      74-82 |     49-52 |
| `test some_test.py` | building requirements        |      67-71 |     30-36 |
|                     | building pytest runner       |        8-9 |     17-18 |
|                     | total to start running tests |      76-80 |     53-58 |

I also did more ad hoc ones on a real-world work repo of mine, which doesn't use any of the big ML libraries, just running some basic goals once.

| goal                                              | period                                  | before (s) | after (s) |    |
|---------------------------------------------------|-----------------------------------------|-----------:|----------:|----|
| `pants export` on largest resolve                 | building requirements                   |         66 |        35 |    |
|                                                   | total                                   |         82 |        54 |    |
| "random" `pants test path/to/file.py` (1 attempt) | building requirements and pytest runner |          1 |        49 | 38 |

Fixes #15062
Pants uses lots of PEXes "internally", which are never exported for users to consume. Since these are implementation details, we have free rein to choose an internal default PEX compression level, and some values can make a considerable difference in performance.
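As a rough illustration of the trade-off being benchmarked, here is a minimal sketch that times zip creation at a few compression settings using only Python's standard `zipfile` module. It does not invoke Pex or Pants; the payload is synthetic (half repetitive text, half random bytes), and the helper names `make_payload` and `time_zip_creation` are made up for this example. Real numbers would have to come from building actual PEXes.

```python
from __future__ import annotations

import io
import os
import time
import zipfile


def make_payload(files: int = 20, size: int = 50_000) -> dict[str, bytes]:
    """Synthetic stand-in for PEX contents: compressible text plus random blobs."""
    payload: dict[str, bytes] = {}
    line = b"def f():\n    return 1\n"
    for i in range(files):
        if i % 2 == 0:
            # Repetitive source-like text compresses very well.
            payload[f"text_{i}.py"] = (line * (size // len(line) + 1))[:size]
        else:
            # Random bytes approximate already-compressed binaries.
            payload[f"blob_{i}.bin"] = os.urandom(size)
    return payload


def time_zip_creation(
    payload: dict[str, bytes], compression: int, level: int | None = None
) -> tuple[float, int]:
    """Return (elapsed seconds, archive size in bytes) for an in-memory zip."""
    buffer = io.BytesIO()
    start = time.perf_counter()
    with zipfile.ZipFile(
        buffer, "w", compression=compression, compresslevel=level
    ) as zf:
        for name, data in payload.items():
            zf.writestr(name, data)
    elapsed = time.perf_counter() - start
    return elapsed, len(buffer.getvalue())


if __name__ == "__main__":
    data = make_payload()
    settings = [
        ("stored", zipfile.ZIP_STORED, None),
        ("deflate-1", zipfile.ZIP_DEFLATED, 1),
        ("deflate-6", zipfile.ZIP_DEFLATED, 6),
        ("deflate-9", zipfile.ZIP_DEFLATED, 9),
    ]
    for label, compression, level in settings:
        seconds, size = time_zip_creation(data, compression, level)
        print(f"{label:10s} {seconds * 1000:8.2f} ms {size:10d} bytes")
```

Storing entries uncompressed trades archive size for creation time; for internal PEXes that are never shipped to users, size matters less, which is the motivation for measuring this at all.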
To experiment/adjust for "internal only", logic similar to the logic for PEX's `layout` setting could be added, which would set a default compression value for internal use:
pants/src/python/pants/backend/python/util_rules/pex.py
Lines 196 to 198 in 902d3ac
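The idea could be sketched roughly as follows. This is not the Pants implementation: `HypotheticalPexRequest`, its `compress` field, and the helper functions are invented for illustration, and the layout default shown (packed for internal-only PEXes, zipapp otherwise) is an assumption about how the referenced lines behave. The `--compress`/`--no-compress` flag names are assumed to match the Pex CLI, and "uncompressed for internal use" is only a candidate default that benchmarking would need to confirm.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PexLayout(Enum):
    ZIPAPP = "zipapp"
    PACKED = "packed"
    LOOSE = "loose"


@dataclass(frozen=True)
class HypotheticalPexRequest:
    """Illustrative stand-in for a PEX build request; not the real Pants type."""

    internal_only: bool
    layout: Optional[PexLayout] = None  # None means "pick a default"
    compress: Optional[bool] = None  # hypothetical field; None means "pick a default"


def resolve_layout(request: HypotheticalPexRequest) -> PexLayout:
    # Assumed behaviour: an explicit layout wins, otherwise internal-only
    # PEXes use the packed layout and user-facing ones use a zipapp.
    if request.layout is not None:
        return request.layout
    return PexLayout.PACKED if request.internal_only else PexLayout.ZIPAPP


def resolve_compress(request: HypotheticalPexRequest) -> bool:
    # Same pattern applied to compression: an explicit value wins, otherwise
    # internal-only PEXes skip compression (a candidate default to benchmark).
    if request.compress is not None:
        return request.compress
    return not request.internal_only


def pex_args(request: HypotheticalPexRequest) -> list[str]:
    """Build the layout/compression portion of a Pex command line."""
    args = ["--layout", resolve_layout(request).value]
    args.append("--compress" if resolve_compress(request) else "--no-compress")
    return args
```

For example, `pex_args(HypotheticalPexRequest(internal_only=True))` yields `["--layout", "packed", "--no-compress"]`, while a user-facing request keeps compression on unless it is explicitly overridden.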