-
-
Notifications
You must be signed in to change notification settings - Fork 644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate lambdex for building FaaS packages (pending) #19032
Conversation
This fixes #18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format is just: put everything at the top-level, e.g. the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl` (plus the dynamic PEX initialisation). This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can: - reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up. - reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed). - increase build times. For instance, for one Python 3.9 Lambda in our codebase: | metric | before | after | |-------------------------------|----------|------------------| | init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) | | compressed size | 24.6MB | 23.8MB (-0.8MB) | | uncompressed size | 117.8MB | 115.8MB (-2.0MB) | | PEX-construction build time | ~5s | ~5s | | PEX-postprocessing build time | 0.14s | 4.8s | (The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.) This functionality is driven by adding a new `layout` field. It defaults to `lambdex` (retaining the current code paths), but also supports `zip`, which keys into the functionality above. I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start. This PR's commits can be reviewed independently. It comes in three phases: 1. Add the `pex_venv.py` util rules for running `pex3 venv create ...`. Currently this only supports a limited subset of the functionality there, but can presumably be expanded freely as required. (First commit) 2. Do some minor refactoring. (Commits labelled "refactor: ...") 3. Draw the rest of the owl. (The others.) I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up: - deprecate `layout="lambdex"` (in favour of `layout="zip"` and/or normal `pex_binary`) (#19032) - add a warning about `files` being loaded into these packages, which has been temporarily lost (#19027) - adjust documentation - other improvements like #18195 and #18880 - improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first
1ce058c
to
6255594
Compare
@@ -180,6 +183,15 @@ def handler(event, context): | |||
if sys.platform == "darwin": | |||
assert "`python_awslambda` targets built on macOS may fail to build." in caplog.text | |||
|
|||
global _first_run_of_deprecated_warning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a big code smell, but also we don't typically test deprecation messages, as they are comprehensively tested generically. Manually observing that the right thing happens is sufficient.
@@ -180,6 +183,15 @@ def handler(event, context): | |||
in caplog.text | |||
) | |||
|
|||
global _first_run_of_deprecated_warning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
@@ -1752,6 +1752,20 @@ class GlobalOptions(BootstrapOptions, Subsystem): | |||
default=[], | |||
) | |||
|
|||
use_deprecated_lambdex_layout = BoolOption( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a global option? In fact, why do we need this option at all?
I might be missing something, but I think it would be simpler to do the relevant check in the code that uses the subsystem, and issue the deprecation warning there?
I'm not sure why this is a complicated as it is, so maybe I'm not seeing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to allow users (including my work repo) to opt in to the new layout="zip"
-by-default behaviour early and globally. In particular, I wouldn't want to have to set it on each target when I upgrade to 2.18... and then go through and delete them all in 2.19, once the default flips. I also thought it would be nice to allow users to silence the warning, but that's less critical.
I suppose another option would be: unconditional (but memoized etc.) warning as it is, and users can add a __defaults__
in the root BUILD
to reduces the updates required. This does mean pants won't be able prompt users to remove it when it's not necessary, but that's probably okay, I can have the warning suggest something with a comment.
I'm more than happy to be guided to simpler options, if you think I'm solving problems that don't need to be solved!
(Side note, is 2.19 the right release to remove? Maybe it should be a longer deprecation period?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a very low chance that switching from lambdex to zip will have any negative consequence, so I think 2.19.x is fine!
But I'm sure we can find a much simpler way of deprecating. For one thing, maybe (and I wish I'd thought of this when reviewing that change, sorry) we don't need a per-target layout
at all, and this can be a repo-wide setting? After all, we're about to get rid of it entirely, so the intermediate state of "some lambdas use lambdex and some use zip" seems unnecessary. Esp. since we expect zero user-noticeable difference except better performance (and I guess the name change of the handler). That change has not been in any release yet, so not too late to replace it.
Instead, we can put layout on the lambdex
scope (for want of a better one), and default it to lambdex
. But, if you rely on that default in 2.17.x (I think a cherry-pick is fine for this) you get a warning that you should set it explicitly and that lambdex
is going away. Then we make zip
the default in 2.18.x, with a warning if you set to lambdex
(or if you set explicitly to zip
, since that is now the default, and the option is going away), and then we remove the option entirely in 2.19.x.
Does that make sense? This is often how we do stuff like this - introduce an option to deprecate a feature, then deprecation the option one release later.
Either way, this shouldn't be on the global options, because it is very specific to a backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For one thing, maybe (and I wish I'd thought of this when reviewing that change, sorry) we don't need a per-target layout at all, and this can be a repo-wide setting?
Does that make sense? This is often how we do stuff like this - introduce an option to deprecate a feature, then deprecation the option one release later.
Yeah, that makes a lot of sense.
Simplifying layout to a global option is way better 👍 I had originally been thinking of other layouts that do make sense per-target (e.g. layout="directory"
to not have it zipped, or even layout="layer"
), but that's all a total distraction from the core of "move away from Lambdex" and should be handled independently.
Next steps for me:
- revert Implement layout="zip" for Lambda/GCF, skipping lambdex #19022, because it'll make the next steps cleaner: Revert "Implement layout="zip" for Lambda/GCF, skipping lambdex" #19071
- set-up the global
layout = "lambdex"
setting: Add [lambdex].layout to opt-in to soon-to-be-deprecated FaaS behaviour #19074 - redo Implement layout="zip" for Lambda/GCF, skipping lambdex #19022, getting rid of the
layout
field and using the global setting instead - return to this PR if required
Thanks for taking the time to think through a better approach!
) Reverts #19022. Per #19032 (comment), we're going to change the approach for moving away from Lambdex to a global setting, rather than the per-target `layout` field added in #19022. That change will simplify things, and involve an extra moving part that will be cherry-picked to 2.17, so I think it's better to start from scratch, and then more carefully cherry-pick the parts of #19022 that are relevant.
This fixes #18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format is just: put everything at the top-level. For instance, the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX initialisation/venv creation. This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can: - reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up. - reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed). - increase build times. For instance, for one Python 3.9 Lambda in our codebase: | metric | before | after | |---|---|---| | init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) | | compressed size | 24.6MB | 23.8MB (-0.8MB) | | uncompressed size | 117.8MB | 115.8MB (-2.0MB) | | PEX-construction build time | ~5s | ~5s | | PEX-postprocessing build time | 0.14s | 4.8s | (The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.) --- This functionality is driven by adding a new option to the `[lambdex].layout` option added in #19074. In #19074 (targeted for 2.17), it defaults `lambdex` (retaining the current code paths). This PR flips the default to the new option `zip`, which keys into the functionality above. I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start. This PR's commits can be reviewed independently. I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up: - add a warning about `files` being loaded into these packages, which has been temporarily lost (#19027) - adjust documentation #19067 - other improvements like #18195 and #18880 - improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first This is a re-doing of #19022 with a simpler approach to deprecation, as discussed in #19074 (comment) and #19032 (comment). The phasing will be: | release | supports lambdex? | supports zip? | default layout | deprecation warnings | |---|---|---|---|---| | 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to | | 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` | | 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
This fixes #18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format is just: put everything at the top-level. For instance, the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX initialisation/venv creation. This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can: - reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up. - reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed). - increase build times. For instance, for one Python 3.9 Lambda in our codebase: | metric | before | after | |---|---|---| | init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) | | compressed size | 24.6MB | 23.8MB (-0.8MB) | | uncompressed size | 117.8MB | 115.8MB (-2.0MB) | | PEX-construction build time | ~5s | ~5s | | PEX-postprocessing build time | 0.14s | 4.8s | (The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.) --- This functionality is driven by adding a new option to the `[lambdex].layout` option added in #19074. In #19074 (targeted for 2.17), it defaults `lambdex` (retaining the current code paths). This PR flips the default to the new option `zip`, which keys into the functionality above. I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start. This PR's commits can be reviewed independently. I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up: - add a warning about `files` being loaded into these packages, which has been temporarily lost (#19027) - adjust documentation #19067 - other improvements like #18195 and #18880 - improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first This is a re-doing of #19022 with a simpler approach to deprecation, as discussed in #19074 (comment) and #19032 (comment). The phasing will be: | release | supports lambdex? | supports zip? | default layout | deprecation warnings | |---|---|---|---|---| | 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to | | 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` | | 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
…d#19076) This fixes pantsbuild#18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format is just: put everything at the top-level. For instance, the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX initialisation/venv creation. This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can: - reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up. - reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed). - increase build times. For instance, for one Python 3.9 Lambda in our codebase: | metric | before | after | |---|---|---| | init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) | | compressed size | 24.6MB | 23.8MB (-0.8MB) | | uncompressed size | 117.8MB | 115.8MB (-2.0MB) | | PEX-construction build time | ~5s | ~5s | | PEX-postprocessing build time | 0.14s | 4.8s | (The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.) --- This functionality is driven by adding a new option to the `[lambdex].layout` option added in pantsbuild#19074. In pantsbuild#19074 (targeted for 2.17), it defaults `lambdex` (retaining the current code paths). This PR flips the default to the new option `zip`, which keys into the functionality above. I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start. This PR's commits can be reviewed independently. I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up: - add a warning about `files` being loaded into these packages, which has been temporarily lost (pantsbuild#19027) - adjust documentation pantsbuild#19067 - other improvements like pantsbuild#18195 and pantsbuild#18880 - improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first This is a re-doing of pantsbuild#19022 with a simpler approach to deprecation, as discussed in pantsbuild#19074 (comment) and pantsbuild#19032 (comment). The phasing will be: | release | supports lambdex? | supports zip? | default layout | deprecation warnings | |---|---|---|---|---| | 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to | | 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` | | 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
Replaced by #19122. |
…ck of #19076) (#19120) This fixes #18879 by allowing the `python_awslambda` and `python_google_cloud_function` FaaS artefacts to be generated in "simple" format, using the `pex3 venv create --layout=flat-zipped` functionality recently added in PEX 2.1.135 (https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format is just: put everything at the top-level. For instance, the zip contains `cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX initialisation/venv creation. This shifts the dynamic dependency computation/extraction/layout from run-time to build-time, relying on the FaaS environment to be generally consistent. It shouldn't change what actually happens after initialisation. This can: - reduce cold-starts noticeably: for instance, some of our lambdas spend 1s doing PEX/Lambdex start up. - reduce package size somewhat (the PEX `.bootstrap/` folder seems to be about 2MB uncompressed, ~1MB compressed). - increase build times. For instance, for one Python 3.9 Lambda in our codebase: | metric | before | after | |---|---|---| | init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) | | compressed size | 24.6MB | 23.8MB (-0.8MB) | | uncompressed size | 117.8MB | 115.8MB (-2.0MB) | | PEX-construction build time | ~5s | ~5s | | PEX-postprocessing build time | 0.14s | 4.8s | (The PEX-postprocessing time metric is specifically the time to run the `Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv create`) process, computed by running `pants --keep-sandboxes=always package ...` for each layout, and then `hyperfine -r3 -w1 path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include the time to construct the input PEX, which is the same for both.) --- This functionality is driven by adding a new option to the `[lambdex].layout` option added in #19074. In #19074 (targeted for 2.17), it defaults `lambdex` (retaining the current code paths). This PR flips the default to the new option `zip`, which keys into the functionality above. I've tried to keep the non-lambdex implementation generally separate to the lambdex one, rather than reusing all of the code that happens to be common currently, because it'd make sense to deprecate/remove the lambdex functionality and thus I feel it's best for this new functionality to be mostly a fresh start. This PR's commits can be reviewed independently. I _think_ this is an acceptable MVP for this functionality, but there's various bits of follow-up: - add a warning about `files` being loaded into these packages, which has been temporarily lost (#19027) - adjust documentation #19067 - other improvements like #18195 and #18880 - improve performance, e.g. potentially `pex3 venv create ...` could use the lock file and sources to directly compute the appropriate files, without having to materialise a normal pex first This is a re-doing of #19022 with a simpler approach to deprecation, as discussed in #19074 (comment) and #19032 (comment). The phasing will be: | release | supports lambdex? | supports zip? | default layout | deprecation warnings | |---|---|---|---|---| | 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to | | 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` | | 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
We're changing the approach, so this may not be necessary: #19032 (comment)
This deprecates the
layout="lambdex"
value for FaaS packages (python_awslambda
andpython_google_cloud_function
), now that #19022 has implementedlayout="zip"
, which is the cloud-vendor's recommended layout.This doesn't change the behaviour by default, other than a new warning. This also adds a flag
use_deprecated_lambdex_layout
that can be explicitly set totrue
to preserve behaviour and silence the warning, or set tofalse
to opt in tolayout="zip"
by default.Questions:
[lambdex]
subsystem section inpants.toml
, if someone has that set? I guess for now it's fine to leave it there, without a warning, and once we remove Lambdex support, we tell people that the section is unused/irrelevant?removal_hint
etc.)?