Move pip into its own layer #254

Closed
edmorley opened this issue Aug 29, 2024 · 0 comments · Fixed by #258

edmorley commented Aug 29, 2024

Currently pip is installed into the same layer as Python, since it is installed into the system site-packages directory.

This is primarily because the user site-packages directory is used for the app dependencies, leaving us few other options as to where to install pip, given that:

  • we don't want pip in the same layer as the app dependencies (otherwise pip can't be cached, since the app dependencies layer can't be cached when using pip: its installs are non-deterministic because it doesn't sync environments)
  • we can't use PYTHONPATH, since any directories specified via PYTHONPATH are given a higher precedence in Python's sys.path than the Python stdlib (unlike system and user site-packages, which are added to sys.path after the Python stdlib). That can cause hard-to-debug issues if apps use outdated backport libraries (which often happens unintentionally via broken/suboptimal packages in their transitive dependency tree).
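
The PYTHONPATH precedence point can be checked with a small sketch like the one below. It starts a child interpreter with PYTHONPATH set and reports where that directory and the stdlib directory end up in sys.path. The function name is made up for illustration and this is not buildpack code.

```python
import json
import os
import subprocess
import sys
import tempfile


def path_order_with_pythonpath(extra_dir: str) -> dict:
    """Run a child interpreter with PYTHONPATH=extra_dir and return the
    sys.path positions of that directory and of the stdlib directory."""
    child_code = (
        "import json, sys, sysconfig;"
        "print(json.dumps({'path': sys.path,"
        " 'stdlib': sysconfig.get_path('stdlib')}))"
    )
    env = dict(os.environ, PYTHONPATH=extra_dir)
    out = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env, check=True, capture_output=True, text=True,
    ).stdout
    info = json.loads(out)
    # Compare resolved paths so that symlinked directories still match.
    resolved = [os.path.realpath(p) for p in info["path"]]
    return {
        "pythonpath_index": resolved.index(os.path.realpath(extra_dir)),
        "stdlib_index": resolved.index(os.path.realpath(info["stdlib"])),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as extra_dir:
        print(path_order_with_pythonpath(extra_dir))
```

A lower `pythonpath_index` than `stdlib_index` means a module in the PYTHONPATH directory would shadow a stdlib module of the same name, which is the backport problem described above.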

pip being in the same layer as Python means that:

  • we can't exclude pip from the run image (i.e. make it a build-time-only layer), either optionally or by default (see #255, "Exclude pip from the app image")
  • a change in pip version means the Python layer has to be unnecessarily invalidated (albeit this only occurs a few times a year)
  • the Python layer will vary based on the choice of package manager (since we wouldn't need to install pip when using Poetry or uv), which will reduce layer re-use between apps (if the images are stored in an environment where layers can be shared across apps).

However, once we move the app dependencies into a virtual environment in #253, this will free up the user site-packages, meaning we can perform a user install of pip into its own layer.
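
As a rough sketch of why a user install can live in its own layer: the user site-packages location follows the PYTHONUSERBASE environment variable, so pointing that variable at a layer directory moves the install target there. The snippet assumes the layer would be exposed this way. The helper name and the `/layers/example/pip` path are hypothetical.

```python
import os
import subprocess
import sys


def user_site_for(user_base: str) -> str:
    """Ask a child interpreter where its user site-packages directory is
    when PYTHONUSERBASE points at `user_base`."""
    env = dict(os.environ, PYTHONUSERBASE=user_base)
    # `python -m site --user-site` exits non-zero when user site-packages is
    # disabled (for example inside a venv), so the exit code is not checked.
    result = subprocess.run(
        [sys.executable, "-m", "site", "--user-site"],
        env=env, capture_output=True, text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical layer path, used only for illustration.
    print(user_site_for("/layers/example/pip"))
```

The directory does not need to exist for this lookup; the command only reports where a `--user` install would go.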

GUS-W-16616956.

edmorley added the enhancement and semver: major labels Aug 29, 2024
edmorley self-assigned this Aug 29, 2024
edmorley added a commit that referenced this issue Aug 30, 2024
App dependencies are now installed into a virtual environment (aka venv
or virtualenv) instead of into a custom user site-packages location.

This:
1. Avoids user site-packages compatibility issues with some packages
   when using relocated Python (see #253)
2. Improves parity with how dependencies will be installed when using
   Poetry in the future (since Poetry doesn't support `--user`)
3. Unblocks being able to move pip into its own layer (see #254)

This approach is possible since pip 22.3+ supports a new `--python`
/ `PIP_PYTHON` option which can be used to make pip operate against
a different environment to the one in which it is installed. This
allows us to continue keeping pip in a separate layer to the app
dependencies (currently the Python layer, but in a later PR pip will
be moved to its own layer).

Now that app dependencies are installed into a venv, we no longer need
to make the system site-packages directory read-only to protect against
later buildpacks installing into the wrong location.

This has been split out of the Poetry PR for easier review.

See also:
- https://docs.python.org/3/library/venv.html
- https://pip.pypa.io/en/stable/cli/pip/#cmdoption-python

Closes #253.
GUS-W-16616226.
edmorley added a commit that referenced this issue Aug 30, 2024
pip is now installed into its own layer (as a user site-packages
install) instead of into system site-packages in the Python layer.

This is possible now that the user site-packages is no longer being
used for app dependencies, after the switch to venvs in #257.

pip being in its own layer has the following advantages:
1. We can more easily exclude pip from the build/run images when using
   other package managers (such as for the upcoming Poetry support).
2. A change in pip version no longer unnecessarily invalidates the
   Python layer.
3. In the future we could more easily exclude pip from the run image
   entirely, should we wish (see #255).
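
A toy model of advantage 2, assuming a layer is rebuilt whenever its recorded metadata differs from what the current build wants. The function name, metadata shape and version numbers are all illustrative and do not come from the buildpack.

```python
def invalidated_layers(
    cached: dict[str, dict[str, str]],
    wanted: dict[str, dict[str, str]],
) -> list[str]:
    """Return the names of layers whose cached metadata no longer matches."""
    return sorted(name for name, meta in wanted.items() if cached.get(name) != meta)


# pip recorded inside the Python layer: a pip bump invalidates that layer.
combined_before = {"python": {"python": "3.12.5", "pip": "24.1"}}
combined_after = {"python": {"python": "3.12.5", "pip": "24.2"}}

# pip in its own layer: the same bump only invalidates the pip layer.
split_before = {"python": {"python": "3.12.5"}, "pip": {"pip": "24.1"}}
split_after = {"python": {"python": "3.12.5"}, "pip": {"pip": "24.2"}}

if __name__ == "__main__":
    print(invalidated_layers(combined_before, combined_after))
    print(invalidated_layers(split_before, split_after))
```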

This has been split out of the Poetry PR for easier review.

Closes #254.
GUS-W-16616956.
edmorley linked a pull request Aug 30, 2024 that will close this issue
edmorley added a commit that referenced this issue Aug 30, 2024
App dependencies are now installed into a Python virtual environment
(aka venv / virtualenv) instead of into a custom user site-packages
location.

This:
1. Avoids user site-packages compatibility issues with some packages
   when using relocated Python (see #253)
2. Improves parity with how dependencies will be installed when using
   Poetry in the future (since Poetry doesn't support `--user` installs)
3. Unblocks being able to move pip into its own layer (see #254)

This approach is possible since pip 22.3+ supports a new `--python` /
`PIP_PYTHON` option which can be used to make pip operate against a
different environment to the one in which it is installed. This allows
us to continue keeping pip in a separate layer to the app dependencies
(currently the Python layer, but in a later PR pip will be moved to its
own layer).
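
A minimal sketch of what such an invocation could look like, built as an argument list. `--python` is pip's global option and so goes before the subcommand. The helper name and the example venv path are assumptions for illustration, not the buildpack's actual command.

```python
def pip_install_into(target_python: str, requirements: str = "requirements.txt") -> list[str]:
    """Build a pip command that installs into the environment owning
    `target_python`, rather than the environment pip itself lives in."""
    return ["pip", "--python", target_python, "install", "--requirement", requirements]


if __name__ == "__main__":
    # Hypothetical venv location, used only for illustration.
    print(" ".join(pip_install_into("/layers/example/venv/bin/python")))
```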

For a venv to work, it depends upon the `<venv_layer>/bin/python` script
being earlier in `PATH` than the main Python installation. To achieve
that with CNBs, the venv's layer name must be alphabetically after the
Python layer name. In addition, lifecycle 0.20.1+ is required, since
earlier versions didn't implement the spec correctly during the
execution of later buildpacks - see:
buildpacks/lifecycle#1393
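
A simplified model of the ordering constraint above, assuming layers are processed in ascending name order and that each one prepends its `bin` directory to `PATH`. The function name and the `/layers/example` prefix are made up for illustration, and this is not lifecycle code.

```python
def build_path(layer_names: list[str], layers_dir: str = "/layers/example") -> str:
    """Model PATH construction: visit layers in name order, prepending each
    layer's bin directory, so later names end up earlier in PATH."""
    path_entries: list[str] = []
    for name in sorted(layer_names):
        path_entries.insert(0, f"{layers_dir}/{name}/bin")
    return ":".join(path_entries)


if __name__ == "__main__":
    print(build_path(["venv", "python"]))
```

Under this model a layer named `venv` sorts after `python`, so `venv/bin` comes first in `PATH` and its `python` is the one found.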

Now that app dependencies are installed into a venv, we no longer need
to make the system site-packages directory read-only to protect against
later buildpacks installing into the wrong location.

This has been split out of the Poetry PR for easier review.

See also:
- https://docs.python.org/3/library/venv.html
- https://pip.pypa.io/en/stable/cli/pip/#cmdoption-python

Closes #253.
GUS-W-16616226.
edmorley added a commit that referenced this issue Sep 9, 2024
After #254, pip is now installed into its own layer rather than into the
system site-packages directory inside the Python layer.

This means it's now possible to exclude pip from the final app image, by
making the pip layer be a build-only layer.

Excluding pip from the final app image:
- Prevents several classes of user error/confusion/bad app design
  patterns seen in support tickets (see #255 for more details).
- Reduces app image supply chain surface area.
- Reduces app image size by 13 MB and layer count by 1, meaning less
  to have to push to the remote registry.
- Matches the approach used for Poetry, where we don't make Poetry
  available at run-time either.

Users that need pip at run-time for a temporary debugging task can run
`python -m ensurepip --default-pip` in the container at run-time to make
it available again (this command doesn't even have to download anything
- it uses the pip bundled with Python).
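
To see which pip version `ensurepip` would install without actually installing it, something like the following sketch could be run in the container. `ensurepip.version()` is part of the stdlib; the wrapper function is hypothetical.

```python
def bundled_pip_version():
    """Return the version of the pip wheel bundled with this Python,
    or None if ensurepip is unavailable."""
    try:
        import ensurepip
        return ensurepip.version()
    except Exception:
        # Some distro builds remove or disable ensurepip.
        return None


if __name__ == "__main__":
    print(bundled_pip_version())
```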

Or if pip is an actual run-time dependency of the app, then the app can
add `pip` to its `requirements.txt` (which much more clearly conveys the
requirements of the app, and also allows the app to pick what pip
version it needs at run-time).

Should we find that pip's absence causes confusion in the future, we
could always add a wrapper/shim `pip` script in the app image which does
something like:

```
echo "pip isn't installed at run-time, if you need it temporarily run 'python -m ensurepip --default-pip' to install it"
exit 1
```

...to improve discoverability.
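
For comparison, the same idea written as a Python script instead of shell. This is only a sketch of one possible shim (nothing like it exists in the buildpack as far as this issue shows), and it writes to stderr rather than stdout.

```python
import sys

MESSAGE = (
    "pip isn't installed at run-time. If you need it temporarily, run "
    "'python -m ensurepip --default-pip' to install it."
)


def main(stderr=sys.stderr) -> int:
    """Print the hint and return a failing exit code."""
    print(MESSAGE, file=stderr)
    return 1


if __name__ == "__main__":
    sys.exit(main())
```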

We'll also document pip (and Poetry) being available at build-time only
in the docs that will be added by #11.

Closes #255.
GUS-W-16697386.