Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate lambdex for building FaaS packages (pending) #19032

Closed
wants to merge 1 commit into from

Conversation

huonw
Copy link
Contributor

@huonw huonw commented May 18, 2023

We're changing the approach, so this may not be necessary: #19032 (comment)


This deprecates the layout="lambdex" value for FaaS packages (python_awslambda and python_google_cloud_function), now that #19022 has implemented layout="zip", which is the cloud-vendor's recommended layout.

This doesn't change the behaviour by default, other than a new warning. This also adds a flag use_deprecated_lambdex_layout that can be explicitly set to true to preserve behaviour and silence the warning, or set to false to opt in to layout="zip" by default.

Questions:

  • I'm not sure how to deprecate/emit warnings about the [lambdex] subsystem section in pants.toml, if someone has that set? I guess for now it's fine to leave it there, without a warning, and once we remove Lambdex support, we tell people that the section is unused/irrelevant?
  • I'm not sure if there's a better way to deprecate a particular value to a field, rather than the whole field itself (removal_hint etc.)?

huonw added a commit that referenced this pull request May 20, 2023
This fixes #18879 by allowing the `python_awslambda` and
`python_google_cloud_function` FaaS artefacts to be generated in
"simple" format, using the `pex3 venv create --layout=flat-zipped`
functionality recently added in PEX 2.1.135
(https://github.com/pantsbuild/pex/releases/tag/v2.1.135). This format
is just: put everything at the top-level, e.g. the zip contains
`cowsay/__init__.py` etc., rather than `.deps/cowsay-....whl` (plus the
dynamic PEX initialisation).

This shifts the dynamic dependency computation/extraction/layout from
run-time to build-time, relying on the FaaS environment to be generally
consistent. It shouldn't change what actually happens after
initialisation. This can:

- reduce cold-starts noticeably: for instance, some of our lambdas spend
1s doing PEX/Lambdex start up.
- reduce package size somewhat (the PEX `.bootstrap/` folder seems to be
about 2MB uncompressed, ~1MB compressed).
- increase build times.
 
For instance, for one Python 3.9 Lambda in our codebase:

| metric                        | before   | after            |
|-------------------------------|----------|------------------|
| init time on cold start       | 2.3-2.5s | 1.3-1.4s (-1s)   |
| compressed size               | 24.6MB   | 23.8MB (-0.8MB)  |
| uncompressed size             | 117.8MB  | 115.8MB (-2.0MB) |
| PEX-construction build time   | ~5s      | ~5s              |
| PEX-postprocessing build time | 0.14s    | 4.8s             |

(The PEX-postprocessing time metric is specifically the time to run the
`Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv
create`) process, computed by running `pants --keep-sandboxes=always
package ...` for each layout, and then `hyperfine -r3 -w1
path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include
the time to construct the input PEX, which is the same for both.)

This functionality is driven by adding a new `layout` field. It defaults
to `lambdex` (retaining the current code paths), but also supports
`zip`, which keys into the functionality above. I've tried to keep the
non-lambdex implementation generally separate to the lambdex one, rather
than reusing all of the code that happens to be common currently,
because it'd make sense to deprecate/remove the lambdex functionality
and thus I feel it's best for this new functionality to be mostly a
fresh start.

This PR's commits can be reviewed independently. It comes in three
phases:

1. Add the `pex_venv.py` util rules for running `pex3 venv create ...`.
Currently this only supports a limited subset of the functionality
there, but can presumably be expanded freely as required. (First commit)
2. Do some minor refactoring. (Commits labelled "refactor: ...")
3. Draw the rest of the owl. (The others.)

I _think_ this is an acceptable MVP for this functionality, but there's
various bits of follow-up:

- deprecate `layout="lambdex"` (in favour of `layout="zip"` and/or
normal `pex_binary`) (#19032)
- add a warning about `files` being loaded into these packages, which
has been temporarily lost (#19027)
- adjust documentation
- other improvements like #18195 and #18880 
- improve performance, e.g. potentially `pex3 venv create ...` could use
the lock file and sources to directly compute the appropriate files,
without having to materialise a normal pex first
@huonw huonw force-pushed the feature/18879-deprecate-lambdex branch from 1ce058c to 6255594 Compare May 20, 2023 02:12
@huonw huonw marked this pull request as ready for review May 20, 2023 02:29
@huonw huonw requested review from kaos, benjyw and thejcannon May 20, 2023 02:30
@@ -180,6 +183,15 @@ def handler(event, context):
if sys.platform == "darwin":
assert "`python_awslambda` targets built on macOS may fail to build." in caplog.text

global _first_run_of_deprecated_warning
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big code smell, but also we don't typically test deprecation messages, as they are comprehensively tested generically. Manually observing that the right thing happens is sufficient.

@@ -180,6 +183,15 @@ def handler(event, context):
in caplog.text
)

global _first_run_of_deprecated_warning
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

@@ -1752,6 +1752,20 @@ class GlobalOptions(BootstrapOptions, Subsystem):
default=[],
)

use_deprecated_lambdex_layout = BoolOption(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a global option? In fact, why do we need this option at all?

I might be missing something, but I think it would be simpler to do the relevant check in the code that uses the subsystem, and issue the deprecation warning there?

I'm not sure why this is a complicated as it is, so maybe I'm not seeing something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to allow users (including my work repo) to opt in to the new layout="zip"-by-default behaviour early and globally. In particular, I wouldn't want to have to set it on each target when I upgrade to 2.18... and then go through and delete them all in 2.19, once the default flips. I also thought it would be nice to allow users to silence the warning, but that's less critical.

I suppose another option would be: unconditional (but memoized etc.) warning as it is, and users can add a __defaults__ in the root BUILD to reduces the updates required. This does mean pants won't be able prompt users to remove it when it's not necessary, but that's probably okay, I can have the warning suggest something with a comment.

I'm more than happy to be guided to simpler options, if you think I'm solving problems that don't need to be solved!

(Side note, is 2.19 the right release to remove? Maybe it should be a longer deprecation period?)

Copy link
Contributor

@benjyw benjyw May 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a very low chance that switching from lambdex to zip will have any negative consequence, so I think 2.19.x is fine!

But I'm sure we can find a much simpler way of deprecating. For one thing, maybe (and I wish I'd thought of this when reviewing that change, sorry) we don't need a per-target layout at all, and this can be a repo-wide setting? After all, we're about to get rid of it entirely, so the intermediate state of "some lambdas use lambdex and some use zip" seems unnecessary. Esp. since we expect zero user-noticeable difference except better performance (and I guess the name change of the handler). That change has not been in any release yet, so not too late to replace it.

Instead, we can put layout on the lambdex scope (for want of a better one), and default it to lambdex. But, if you rely on that default in 2.17.x (I think a cherry-pick is fine for this) you get a warning that you should set it explicitly and that lambdex is going away. Then we make zip the default in 2.18.x, with a warning if you set to lambdex (or if you set explicitly to zip, since that is now the default, and the option is going away), and then we remove the option entirely in 2.19.x.

Does that make sense? This is often how we do stuff like this - introduce an option to deprecate a feature, then deprecation the option one release later.

Either way, this shouldn't be on the global options, because it is very specific to a backend.

Copy link
Contributor Author

@huonw huonw May 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For one thing, maybe (and I wish I'd thought of this when reviewing that change, sorry) we don't need a per-target layout at all, and this can be a repo-wide setting?

Does that make sense? This is often how we do stuff like this - introduce an option to deprecate a feature, then deprecation the option one release later.

Yeah, that makes a lot of sense.

Simplifying layout to a global option is way better 👍 I had originally been thinking of other layouts that do make sense per-target (e.g. layout="directory" to not have it zipped, or even layout="layer"), but that's all a total distraction from the core of "move away from Lambdex" and should be handled independently.

Next steps for me:

  1. revert Implement layout="zip" for Lambda/GCF, skipping lambdex #19022, because it'll make the next steps cleaner: Revert "Implement layout="zip" for Lambda/GCF, skipping lambdex" #19071
  2. set-up the global layout = "lambdex" setting: Add [lambdex].layout to opt-in to soon-to-be-deprecated FaaS behaviour #19074
  3. redo Implement layout="zip" for Lambda/GCF, skipping lambdex #19022, getting rid of the layout field and using the global setting instead
  4. return to this PR if required

Thanks for taking the time to think through a better approach!

huonw added a commit that referenced this pull request May 21, 2023
)

Reverts #19022.

Per
#19032 (comment),
we're going to change the approach for moving away from Lambdex to a
global setting, rather than the per-target `layout` field added in
#19022. That change will simplify things, and involve an extra moving
part that will be cherry-picked to 2.17, so I think it's better to start
from scratch, and then more carefully cherry-pick the parts of #19022
that are relevant.
@huonw huonw marked this pull request as draft May 21, 2023 02:18
@huonw huonw changed the title Deprecate lambdex for building FaaS packages Deprecate lambdex for building FaaS packages (pending) May 21, 2023
@huonw huonw removed request for kaos and thejcannon May 21, 2023 02:19
thejcannon pushed a commit that referenced this pull request May 23, 2023
This fixes #18879 by allowing the `python_awslambda` and
`python_google_cloud_function` FaaS artefacts to be generated in
"simple" format, using the `pex3 venv create --layout=flat-zipped`
functionality recently added in PEX 2.1.135
(https://github.com/pantsbuild/pex/releases/tag/v2.1.135).

This format is just: put everything at the top-level. For instance, the
zip contains `cowsay/__init__.py` etc., rather than
`.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX
initialisation/venv creation.

This shifts the dynamic dependency computation/extraction/layout from
run-time to build-time, relying on the FaaS environment to be generally
consistent. It shouldn't change what actually happens after
initialisation. This can:

- reduce cold-starts noticeably: for instance, some of our lambdas spend
1s doing PEX/Lambdex start up.
- reduce package size somewhat (the PEX `.bootstrap/` folder seems to be
about 2MB uncompressed, ~1MB compressed).
- increase build times.
 
For instance, for one Python 3.9 Lambda in our codebase:

| metric | before | after |
|---|---|---|
| init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) |
| compressed size |  24.6MB | 23.8MB (-0.8MB) |
| uncompressed size | 117.8MB | 115.8MB (-2.0MB) |
| PEX-construction build time | ~5s | ~5s |
| PEX-postprocessing build time | 0.14s | 4.8s |

(The PEX-postprocessing time metric is specifically the time to run the
`Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv
create`) process, computed by running `pants --keep-sandboxes=always
package ...` for each layout, and then `hyperfine -r3 -w1
path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include
the time to construct the input PEX, which is the same for both.)

---

This functionality is driven by adding a new option to the
`[lambdex].layout` option added in #19074. In #19074 (targeted for
2.17), it defaults `lambdex` (retaining the current code paths). This PR
flips the default to the new option `zip`, which keys into the
functionality above. I've tried to keep the non-lambdex implementation
generally separate to the lambdex one, rather than reusing all of the
code that happens to be common currently, because it'd make sense to
deprecate/remove the lambdex functionality and thus I feel it's best for
this new functionality to be mostly a fresh start.

This PR's commits can be reviewed independently. 

I _think_ this is an acceptable MVP for this functionality, but there's
various bits of follow-up:

- add a warning about `files` being loaded into these packages, which
has been temporarily lost (#19027)
- adjust documentation #19067
- other improvements like #18195 and #18880 
- improve performance, e.g. potentially `pex3 venv create ...` could use
the lock file and sources to directly compute the appropriate files,
without having to materialise a normal pex first

This is a re-doing of #19022 with a simpler approach to deprecation, as
discussed in
#19074 (comment)
and
#19032 (comment).
The phasing will be:

| release | supports lambdex? | supports zip? | default layout |
deprecation warnings |
|---|---|---|---|---|
| 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is
implicit, tell people to set it: recommend `zip`, but allow `lambdex` if
they have to |
| 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell
people to remove it and switch to `zip` |
| 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about
removing the `[lambdex]` section entirely) |
WorkerPants pushed a commit that referenced this pull request May 23, 2023
This fixes #18879 by allowing the `python_awslambda` and
`python_google_cloud_function` FaaS artefacts to be generated in
"simple" format, using the `pex3 venv create --layout=flat-zipped`
functionality recently added in PEX 2.1.135
(https://github.com/pantsbuild/pex/releases/tag/v2.1.135).

This format is just: put everything at the top-level. For instance, the
zip contains `cowsay/__init__.py` etc., rather than
`.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX
initialisation/venv creation.

This shifts the dynamic dependency computation/extraction/layout from
run-time to build-time, relying on the FaaS environment to be generally
consistent. It shouldn't change what actually happens after
initialisation. This can:

- reduce cold-starts noticeably: for instance, some of our lambdas spend
1s doing PEX/Lambdex start up.
- reduce package size somewhat (the PEX `.bootstrap/` folder seems to be
about 2MB uncompressed, ~1MB compressed).
- increase build times.
 
For instance, for one Python 3.9 Lambda in our codebase:

| metric | before | after |
|---|---|---|
| init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) |
| compressed size |  24.6MB | 23.8MB (-0.8MB) |
| uncompressed size | 117.8MB | 115.8MB (-2.0MB) |
| PEX-construction build time | ~5s | ~5s |
| PEX-postprocessing build time | 0.14s | 4.8s |

(The PEX-postprocessing time metric is specifically the time to run the
`Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv
create`) process, computed by running `pants --keep-sandboxes=always
package ...` for each layout, and then `hyperfine -r3 -w1
path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include
the time to construct the input PEX, which is the same for both.)

---

This functionality is driven by adding a new option to the
`[lambdex].layout` option added in #19074. In #19074 (targeted for
2.17), it defaults `lambdex` (retaining the current code paths). This PR
flips the default to the new option `zip`, which keys into the
functionality above. I've tried to keep the non-lambdex implementation
generally separate to the lambdex one, rather than reusing all of the
code that happens to be common currently, because it'd make sense to
deprecate/remove the lambdex functionality and thus I feel it's best for
this new functionality to be mostly a fresh start.

This PR's commits can be reviewed independently. 

I _think_ this is an acceptable MVP for this functionality, but there's
various bits of follow-up:

- add a warning about `files` being loaded into these packages, which
has been temporarily lost (#19027)
- adjust documentation #19067
- other improvements like #18195 and #18880 
- improve performance, e.g. potentially `pex3 venv create ...` could use
the lock file and sources to directly compute the appropriate files,
without having to materialise a normal pex first

This is a re-doing of #19022 with a simpler approach to deprecation, as
discussed in
#19074 (comment)
and
#19032 (comment).
The phasing will be:

| release | supports lambdex? | supports zip? | default layout |
deprecation warnings |
|---|---|---|---|---|
| 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is
implicit, tell people to set it: recommend `zip`, but allow `lambdex` if
they have to |
| 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell
people to remove it and switch to `zip` |
| 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about
removing the `[lambdex]` section entirely) |
huonw added a commit to huonw/pants that referenced this pull request May 23, 2023
…d#19076)

This fixes pantsbuild#18879 by allowing the `python_awslambda` and
`python_google_cloud_function` FaaS artefacts to be generated in
"simple" format, using the `pex3 venv create --layout=flat-zipped`
functionality recently added in PEX 2.1.135
(https://github.com/pantsbuild/pex/releases/tag/v2.1.135).

This format is just: put everything at the top-level. For instance, the
zip contains `cowsay/__init__.py` etc., rather than
`.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX
initialisation/venv creation.

This shifts the dynamic dependency computation/extraction/layout from
run-time to build-time, relying on the FaaS environment to be generally
consistent. It shouldn't change what actually happens after
initialisation. This can:

- reduce cold-starts noticeably: for instance, some of our lambdas spend
1s doing PEX/Lambdex start up.
- reduce package size somewhat (the PEX `.bootstrap/` folder seems to be
about 2MB uncompressed, ~1MB compressed).
- increase build times.
 
For instance, for one Python 3.9 Lambda in our codebase:

| metric | before | after |
|---|---|---|
| init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) |
| compressed size |  24.6MB | 23.8MB (-0.8MB) |
| uncompressed size | 117.8MB | 115.8MB (-2.0MB) |
| PEX-construction build time | ~5s | ~5s |
| PEX-postprocessing build time | 0.14s | 4.8s |

(The PEX-postprocessing time metric is specifically the time to run the
`Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv
create`) process, computed by running `pants --keep-sandboxes=always
package ...` for each layout, and then `hyperfine -r3 -w1
path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include
the time to construct the input PEX, which is the same for both.)

---

This functionality is driven by adding a new option to the
`[lambdex].layout` option added in pantsbuild#19074. In pantsbuild#19074 (targeted for
2.17), it defaults `lambdex` (retaining the current code paths). This PR
flips the default to the new option `zip`, which keys into the
functionality above. I've tried to keep the non-lambdex implementation
generally separate to the lambdex one, rather than reusing all of the
code that happens to be common currently, because it'd make sense to
deprecate/remove the lambdex functionality and thus I feel it's best for
this new functionality to be mostly a fresh start.

This PR's commits can be reviewed independently. 

I _think_ this is an acceptable MVP for this functionality, but there's
various bits of follow-up:

- add a warning about `files` being loaded into these packages, which
has been temporarily lost (pantsbuild#19027)
- adjust documentation pantsbuild#19067
- other improvements like pantsbuild#18195 and pantsbuild#18880 
- improve performance, e.g. potentially `pex3 venv create ...` could use
the lock file and sources to directly compute the appropriate files,
without having to materialise a normal pex first

This is a re-doing of pantsbuild#19022 with a simpler approach to deprecation, as
discussed in
pantsbuild#19074 (comment)
and
pantsbuild#19032 (comment).
The phasing will be:

| release | supports lambdex? | supports zip? | default layout |
deprecation warnings |
|---|---|---|---|---|
| 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is
implicit, tell people to set it: recommend `zip`, but allow `lambdex` if
they have to |
| 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell
people to remove it and switch to `zip` |
| 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about
removing the `[lambdex]` section entirely) |
@huonw
Copy link
Contributor Author

huonw commented May 23, 2023

Replaced by #19122.

@huonw huonw closed this May 23, 2023
@huonw huonw deleted the feature/18879-deprecate-lambdex branch May 23, 2023 21:39
huonw added a commit that referenced this pull request May 23, 2023
…ck of #19076) (#19120)

This fixes #18879 by allowing the `python_awslambda` and
`python_google_cloud_function` FaaS artefacts to be generated in
"simple" format, using the `pex3 venv create --layout=flat-zipped`
functionality recently added in PEX 2.1.135
(https://github.com/pantsbuild/pex/releases/tag/v2.1.135).

This format is just: put everything at the top-level. For instance, the
zip contains `cowsay/__init__.py` etc., rather than
`.deps/cowsay-....whl`. This avoids the need to do the dynamic PEX
initialisation/venv creation.

This shifts the dynamic dependency computation/extraction/layout from
run-time to build-time, relying on the FaaS environment to be generally
consistent. It shouldn't change what actually happens after
initialisation. This can:

- reduce cold-starts noticeably: for instance, some of our lambdas spend
1s doing PEX/Lambdex start up.
- reduce package size somewhat (the PEX `.bootstrap/` folder seems to be
about 2MB uncompressed, ~1MB compressed).
- increase build times.
 
For instance, for one Python 3.9 Lambda in our codebase:

| metric | before | after |
|---|---|---|
| init time on cold start | 2.3-2.5s | 1.3-1.4s (-1s) |
| compressed size |  24.6MB | 23.8MB (-0.8MB) |
| uncompressed size | 117.8MB | 115.8MB (-2.0MB) |
| PEX-construction build time | ~5s | ~5s |
| PEX-postprocessing build time | 0.14s | 4.8s |

(The PEX-postprocessing time metric is specifically the time to run the
`Setting up handler` (lambdex) or `Build python_awslambda` (`pex3 venv
create`) process, computed by running `pants --keep-sandboxes=always
package ...` for each layout, and then `hyperfine -r3 -w1
path/to/first/__run.sh path/to/second/__run.sh`. This _doesn't_ include
the time to construct the input PEX, which is the same for both.)

---

This functionality is driven by adding a new option to the
`[lambdex].layout` option added in #19074. In #19074 (targeted for
2.17), it defaults `lambdex` (retaining the current code paths). This PR
flips the default to the new option `zip`, which keys into the
functionality above. I've tried to keep the non-lambdex implementation
generally separate to the lambdex one, rather than reusing all of the
code that happens to be common currently, because it'd make sense to
deprecate/remove the lambdex functionality and thus I feel it's best for
this new functionality to be mostly a fresh start.

This PR's commits can be reviewed independently. 

I _think_ this is an acceptable MVP for this functionality, but there's
various bits of follow-up:

- add a warning about `files` being loaded into these packages, which
has been temporarily lost (#19027)
- adjust documentation #19067
- other improvements like #18195 and #18880 
- improve performance, e.g. potentially `pex3 venv create ...` could use
the lock file and sources to directly compute the appropriate files,
without having to materialise a normal pex first

This is a re-doing of #19022 with a simpler approach to deprecation, as
discussed in
#19074 (comment)
and
#19032 (comment).
The phasing will be:

| release | supports lambdex? | supports zip? | default layout | deprecation warnings |
|---|---|---|---|---|
| 2.17 (this PR) | ✅ | ✅ | lambdex | if `layout = "lambdex"` is implicit, tell people to set it: recommend `zip`, but allow `lambdex` if they have to |
| 2.18 | ✅ | ✅ | zip | if `layout = "lambdex"` is set at all, tell people to remove it and switch to `zip` |
| 2.19 | ❌ | ✅ | zip | none, migration over (or maybe just about removing the `[lambdex]` section entirely) |
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants