Introduce pex3 cache {dir,info,purge}. #2513
Pants folks - I added you since you have: pantsbuild/pants#11167. This initial introduction of Pex cache management commands does not have any JSON output option; it's just for humans currently. The linked Pants issue seems to require structured information though. Much like Pants needing to come up with a representation it prefers for dependency graphs (I assume that's still not done), this integration point also needs similar spec'ing. If you have opinions, or better, a spec, I'd be happy to follow up with support for various alternate output formats for the cache usage information.
management = [
    # N.B.: Released on 2017-09-01 and added support for the `process_iter(attrs, ad_value)` API we
    # use in `pex.cache.access`.
    "psutil>=5.3"
N.B.: If Pants wants to use these cache management commands and it still uses the Pex PEX releases, I'll either need to add a pex+management PEX to the release that embeds psutil for all supported platforms or else start releasing Pex PEX scies, which naturally handle platform-specific deps. I favor the scies here since the pex+management approach doesn't scale well once more CLI-specific deps are added, but I wanted to get Pants maintainers' opinions on this.
... there's a longer story on why I went with psutil / access info gathering at delete-time, but suffice it to say, recording usage info in the read-write access lock path (I used sqlite3) added ~2-30ms overhead to every PEX launch and I did not deem that acceptable.
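For illustration, gathering process details only at purge time with psutil might look roughly like the sketch below. The helper name `describe_processes` and the filtering by a command-line substring are assumptions made for this example, not the code in `pex.cache.access`. The sketch also degrades gracefully when psutil is not installed, in the spirit of keeping it an optional extra.

```python
def describe_processes(needle):
    """Return info dicts for processes whose command line mentions `needle`.

    Hypothetical helper: returns None when psutil is not installed.
    """
    try:
        import psutil
    except ImportError:
        return None

    matches = []
    # `attrs` limits the per-process lookups to what we need; `ad_value` is
    # substituted for any attribute we are denied access to.
    for proc in psutil.process_iter(
        attrs=["pid", "username", "create_time", "cmdline"], ad_value=None
    ):
        cmdline = proc.info["cmdline"] or []
        if any(needle in arg for arg in cmdline):
            matches.append(proc.info)
    return matches
```

Because this only runs when something is about to be deleted, the cost of walking the process table is paid by the purge command rather than by every PEX launch.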
Pants currently still uses the Pex PEX releases, but could straightforwardly switch to Pex scies (assuming they are drop-in replacements for each other in terms of the CLI, which I assume would be the case). Thanks!
One thing I wasn't clear on: If not scies, why would the alternative be a separate pex+management PEX rather than embedding psutil for all platforms in the multiplatform pex PEX? I can see why you want psutil in the management extra, to keep the base Pex dist widely installable, but does that have to be mirrored in the release PEX?
Well, two reasons:
1. It means embedding all these wheels in the Pex PEX or the Pex+management PEX - so, in part, size:
   - psutil-5.9.5-cp27-cp27m-macosx_10_9_x86_64.whl
   - psutil-5.9.5-cp27-cp27m-manylinux2010_i686.whl
   - psutil-5.9.5-cp27-cp27m-manylinux2010_x86_64.whl
   - psutil-5.9.5-cp27-cp27mu-manylinux2010_i686.whl
   - psutil-5.9.5-cp27-cp27mu-manylinux2010_x86_64.whl
   - psutil-5.9.5-cp36-abi3-macosx_10_9_x86_64.whl
   - psutil-5.9.5-cp36-abi3-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl
   - psutil-5.9.5-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
   - psutil-5.9.5-cp38-abi3-macosx_11_0_arm64.whl
2. But it also means inventing a bit of new functionality to support a Universal Target to complement the current AbbreviatedPlatform, CompletePlatform and LocalInterpreter Targets - this new Target type would only work when the Pex resolve was against a --lock or a --pex-repository and it would be able to grab all the wheels listed in 1 despite a machine not having all the corresponding LocalInterpreter targets available.
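To make the spread of wheels in 1 concrete, here is a small sketch that groups those filenames by interpreter and ABI tag. It relies only on the standard wheel filename layout (the last three dash-separated fields are the python, abi and platform tags, with "." joining compressed platform tag sets); the function names are made up for this example and are not Pex APIs.

```python
from collections import defaultdict

WHEELS = [
    "psutil-5.9.5-cp27-cp27m-macosx_10_9_x86_64.whl",
    "psutil-5.9.5-cp27-cp27m-manylinux2010_i686.whl",
    "psutil-5.9.5-cp27-cp27m-manylinux2010_x86_64.whl",
    "psutil-5.9.5-cp27-cp27mu-manylinux2010_i686.whl",
    "psutil-5.9.5-cp27-cp27mu-manylinux2010_x86_64.whl",
    "psutil-5.9.5-cp36-abi3-macosx_10_9_x86_64.whl",
    "psutil-5.9.5-cp36-abi3-manylinux_2_12_i686.manylinux2010_i686"
    ".manylinux_2_17_i686.manylinux2014_i686.whl",
    "psutil-5.9.5-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64"
    ".manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "psutil-5.9.5-cp38-abi3-macosx_11_0_arm64.whl",
]


def parse_wheel_tags(filename):
    """Split a wheel filename into (python tag, abi tag, [platform tags])."""
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the python, abi and platform tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    # A compressed platform tag set joins the alternatives with ".".
    return python_tag, abi_tag, platform_tag.split(".")


def wheels_by_interpreter(filenames):
    """Group the platform tag sets of each wheel by (python tag, abi tag)."""
    grouped = defaultdict(list)
    for name in filenames:
        python_tag, abi_tag, platforms = parse_wheel_tags(name)
        grouped[(python_tag, abi_tag)].append(platforms)
    return dict(grouped)


if __name__ == "__main__":
    for (python_tag, abi_tag), platform_sets in wheels_by_interpreter(WHEELS).items():
        print(python_tag, abi_tag, len(platform_sets), "wheel(s)")
```

No single machine is likely to have interpreters matching every one of these groups, which is the gap the new Target type in 2 would need to cover.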
Hrm, I think I need to do most of 2 anyhow to support creating a lock for the Pex scies with something like pex3 lock create --project ".[management]" --pip-version latest --style universal --target-system linux --target-system mac --interpreter-constraint "CPython==3.12.*" -o pex-scie.lock. Alternatively, I could create a manual workflow that generated a complete platform for each of the 4 supported platforms using the scie PBS and generate a strict multi-lock using those 4 --complete-platform targets.
If that's a hassle then an alternative is for Pants to install pex as a dist, with the management extra, when it needs to run management commands?
It's free to do so, although, last I knew, this introduces a whole can of worms it currently leaves to the Pex PEX "binary", which it only needs to know how to download. Now you're in the land of building a venv, caching it somewhere efficiently, etc.
I'll definitely be at least adding Pex scies to this release anyhow; so your choice.
I was thinking of downloading the Pex PEX as before for most uses, but then using that to bootstrap a Pex venv if needed for this one use case. But obviously Pex scies would be better.
Aha, OK. Yeah - that would work. I should have the Pex scies + release PRs up today.
Some examples.
Get cache info sorted by size:
:; python -mpex.cli cache info -HS
Path: /home/jsirois/.cache/pex
Pex Docs: docs/0
Artifacts used in serving Pex docs via `pex --docs` and `pex3 docs`.
0 bytes in 0 subdirectories and 0 files.
Abbreviated Platforms: platforms/0
Information calculated about abbreviated platforms specified via `--platform`.
291 kB in 13 subdirectories and 26 files.
User Code: user_code/0
User code added to PEX files using `-D` / `--sources-directory`, `-P` / `--package` and `-M` / `--module`.
531 kB in 117 subdirectories and 168 files.
Packed Bootstraps: bootstrap_zips/0
PEX runtime bootstrap code, zipped up for `--layout packed` PEXes.
1.42 MB in 2 subdirectories and 4 files.
Unzipped PEXes: unzipped_pexes/0
The unzipped PEX files executed on this machine.
3.12 MB in 483 subdirectories and 1824 files.
Packed Wheels: packed_wheels/0
The same content as 'installed_wheels/0', but zipped up for `--layout packed` PEXes.
6.32 MB in 6 subdirectories and 12 files.
Interpreters: interpreters/0
Information about interpreters found on the system.
14.4 MB in 374 subdirectories and 704 files.
Bootstraps: bootstraps/0
PEX runtime bootstrap code.
19.5 MB in 125 subdirectories and 1657 files.
Pex Tools: tools/0
Caches for the various `PEX_TOOLS=1` / `pex-tools` subcommands.
21.4 MB in 254 subdirectories and 1886 files.
Built Wheels: built_wheels/0
Wheels built by Pex from resolved sdists when creating PEX files.
29.6 MB in 156 subdirectories and 59 files.
Isolated Pex Code: isolated/0
The Pex codebase isolated for internal use in subprocesses.
45.0 MB in 367 subdirectories and 3803 files.
Scie Tools: scies/0
Tools and caches used when building PEX scies via `--scie {eager,lazy}`.
151 MB in 5 subdirectories and 40 files.
Lock Artifact Downloads: downloads/0
Distributions downloaded when resolving from a Pex lock file.
236 MB in 379 subdirectories and 1980 files.
Pip Versions: pip/0
Isolated Pip caches and Pip PEXes Pex uses to resolve distributions.
324 MB in 3349 subdirectories and 17300 files.
Virtual Environments: venvs/0
Virtual environments generated at runtime for `--venv` mode PEXes.
583 MB in 9165 subdirectories and 40569 files.
Pre-installed Wheels: installed_wheels/0
Pre-installed wheel chroots used to both build PEXes and serve as runtime `sys.path` entries.
961 MB in 8975 subdirectories and 51739 files.
Total: 2.40 GB in 23770 subdirectories and 121771 files.
Dry run purge of just installed_wheels cache:
:; python -mpex.cli cache purge --entries installed_wheels -nRH
Would purge requested entries from /home/jsirois/.cache/pex: installed_wheels/0
Would also purge those entries transitive dependents in: unzipped_pexes/0, venvs/0
Would have purged cache Unzipped PEXes from unzipped_pexes/0
3.12 MB in 483 subdirectories and 1824 files.
Would have purged cache Pre-installed Wheels from installed_wheels/0
961 MB in 8975 subdirectories and 51739 files.
Would have purged cache Virtual Environments from venvs/0
583 MB in 9165 subdirectories and 40569 files.
Total: 1.55 GB in 18623 subdirectories and 94132 files.
And go for it (no psutil):
:; python -mpex.cli cache purge --entries installed_wheels -RH
Purging requested entries from /home/jsirois/.cache/pex: installed_wheels/0
Also purging those entries transitive dependents in: unzipped_pexes/0, venvs/0
Failed to import psutil: No module named 'psutil'
Will proceed with basic output.
---
Note: this process will block until all other running Pex processes have exited.
To get information on which processes these are, re-install Pex with the
management extra; e.g.: with requirement pex[management]
Attempting to acquire cache write lock (press CTRL-C to abort) ...
^C
No cache entries purged.
With psutil:
:; python -mpex.cli cache purge --entries installed_wheels -RH
Purging requested entries from /home/jsirois/.cache/pex: installed_wheels/0
Also purging those entries transitive dependents in: unzipped_pexes/0, venvs/0
Waiting on 2 in flight processes (with shared lock on /home/jsirois/.cache/pex/access.lck) to complete before deleting:
---
1. pid 281904 started by jsirois at 2024-09-02 15:45:24
Pex env: {'PEX': '/home/jsirois/dev/pex-tool/pex/empty.pex'}
cmdline: ['/home/jsirois/.pyenv/versions/3.11.9/bin/python3.11', '/home/jsirois/.cache/pex/unzipped_pexes/0/292f879052303680091fdcb445c2a746967b4e0f']
2. pid 282594 started by jsirois at 2024-09-02 15:45:50
Pex env: {'PEX_TOOLS': '1', 'PEX': '/home/jsirois/dev/pex-tool/pex/empty-tools.pex'}
cmdline: ['/home/jsirois/.pyenv/versions/3.11.9/bin/python3.11', '/home/jsirois/.cache/pex/unzipped_pexes/0/2c278c488639385dd1ca190c245fc4e8da7a0f30', 'repository', 'extract', '-f', '/tmp/find-links', '--serve']
Attempting to acquire cache write lock (press CTRL-C to abort) ...
^C
No cache entries purged.
And ending the processes with the shared lock:
:; python -mpex.cli cache purge --entries installed_wheels -RH
Purging requested entries from /home/jsirois/.cache/pex: installed_wheels/0
Also purging those entries transitive dependents in: unzipped_pexes/0, venvs/0
Waiting on 2 in flight processes (with shared lock on /home/jsirois/.cache/pex/access.lck) to complete before deleting:
---
1. pid 281904 started by jsirois at 2024-09-02 15:45:17
Pex env: {'PEX': '/home/jsirois/dev/pex-tool/pex/empty.pex'}
cmdline: ['/home/jsirois/.pyenv/versions/3.11.9/bin/python3.11', '/home/jsirois/.cache/pex/unzipped_pexes/0/292f879052303680091fdcb445c2a746967b4e0f']
2. pid 282594 started by jsirois at 2024-09-02 15:45:43
Pex env: {'PEX_TOOLS': '1', 'PEX': '/home/jsirois/dev/pex-tool/pex/empty-tools.pex'}
cmdline: ['/home/jsirois/.pyenv/versions/3.11.9/bin/python3.11', '/home/jsirois/.cache/pex/unzipped_pexes/0/2c278c488639385dd1ca190c245fc4e8da7a0f30', 'repository', 'extract', '-f', '/tmp/find-links', '--serve']
Attempting to acquire cache write lock (press CTRL-C to abort) ...
Purged cache Unzipped PEXes from unzipped_pexes/0
3.12 MB in 483 subdirectories and 1824 files.
Purged cache Pre-installed Wheels from installed_wheels/0
961 MB in 8975 subdirectories and 51739 files.
Purged cache Virtual Environments from venvs/0
583 MB in 9165 subdirectories and 40569 files.
Total: 1.55 GB in 18623 subdirectories and 94132 files.
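The "transitive dependents" expansion seen in the purge output above can be sketched as a breadth-first walk over a dependents map. The `DEPENDENTS` mapping below is hypothetical and encodes only the relationship visible in that output (purging installed_wheels also purges unzipped_pexes and venvs); the actual dependency model inside Pex may well differ.

```python
from collections import deque

# Hypothetical map from a cache entry to the entries that directly depend on it.
DEPENDENTS = {
    "installed_wheels": ["unzipped_pexes", "venvs"],
}


def transitive_dependents(entries, dependents=DEPENDENTS):
    """Return every entry that directly or indirectly depends on `entries`."""
    seen = set(entries)
    queue = deque(entries)
    found = []
    while queue:
        for dependent in dependents.get(queue.popleft(), ()):
            if dependent not in seen:
                seen.add(dependent)
                found.append(dependent)
                # Dependents can have dependents of their own; keep walking.
                queue.append(dependent)
    return sorted(found)


if __name__ == "__main__":
    print(transitive_dependents(["installed_wheels"]))
```

Entries with no recorded dependents simply yield an empty list, so purging them touches nothing else.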
Really nice.
Re-structure the Pex cache to support versioning and to add access
tracking for shared (normal) use and for exclusive use when portions of
the cache need to be deleted. With this new groundwork, add a new
pex3 cache {dir,info,purge} family of commands for inspecting and
safely trimming the Pex cache.
Closes #1176
Closes #1655
Closes #2201
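As a rough illustration of the shared vs. exclusive access idea described above, the sketch below uses advisory `fcntl.flock` locks on a lock file: normal use takes a shared lock, while deletion takes an exclusive lock that waits for all shared holders to finish. This is an assumption-laden sketch (Unix only, hypothetical `cache_lock` helper, and `flock` chosen for simplicity), not a description of how Pex implements its access lock.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def cache_lock(lock_path, exclusive=False, blocking=True):
    """Hold a shared (default) or exclusive advisory lock on `lock_path`."""
    flags = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    if not blocking:
        # Raise BlockingIOError instead of waiting if the lock is unavailable.
        flags |= fcntl.LOCK_NB
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, flags)
        yield fd
    finally:
        # Closing the descriptor releases any lock held through it.
        os.close(fd)
```

Any number of processes can hold the shared lock at once, so ordinary runs do not block each other; only a purge, which needs the exclusive lock, has to wait.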