Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Layer env vars are not processed in layer name order at build time #1393

Closed
edmorley opened this issue Aug 20, 2024 · 1 comment · Fixed by #1394
Closed

Layer env vars are not processed in layer name order at build time #1393

edmorley opened this issue Aug 20, 2024 · 1 comment · Fixed by #1394
Labels
status/ready type/bug Something isn't working

Comments

@edmorley
Copy link
Contributor

edmorley commented Aug 20, 2024

Summary

The Buildpack API spec defines what should happen if multiple layers within the same buildpack contribute the same buildpack provided env var.

For example:

Override

...

  • For environment variable file contents originating from the same buildpack, file contents that are later (when sorted alphabetically ascending by associated layer name) MUST override file contents that are earlier.

Prepend

...

  • Environment variable file contents originating from the same buildpack MUST be sorted alphabetically descending by associated layer name.

See:
https://github.com/buildpacks/spec/blob/main/buildpack.md#environment-variable-modification-rules

lifecycle currently implements the spec correctly at launch time, however, at build time this layer name ordering is not honoured and worse, is actually non-deterministic.

This is a significant issue for buildpacks where it is unavoidable that there are several executables with the same name on PATH. For example, a Python buildpack where there is a main Python install and a virtual environment, where the virtual environment's python needs to be ahead of the main python on PATH.

The cause of this issue is the iteration of a Go map here:

for path, layerMetadataFile := range createdLayers {

...since in Go the iteration order of a map is not guaranteed:

The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next.

(see https://go.dev/ref/spec#For_range)

Compare this to the launch time implementation, which iterates over the slice returned from os.ReadDir (which returns its results sorted by filename):

if err := eachLayer(bpDir, l.doLayerRoot()); err != nil {


Reproduction

Steps
  1. mkdir testcase && cd $_ && touch project.toml
  2. Edit the project.toml to have the contents below
  3. echo "Values during build..." && for i in {1..5}; do pack build testcase --builder heroku/builder:24 --pull-policy if-not-present | grep -E "^(PATH|MYVAR)="; done
  4. echo "Values at launch..." && for i in {1..5}; do docker run --rm testcase -- bash -c 'printenv | grep -E "^(PATH|MYVAR)="'; done
project.toml
[_]
schema-version = "0.2"

[[io.buildpacks.group]]
id = "bp"

  [io.buildpacks.group.script]
  api = "0.10"
  shell = "/bin/bash"
  inline = '''
set -euo pipefail
for layer in {a..c}; do
  layer_dir="${CNB_LAYERS_DIR}/${layer}"
  mkdir -p "${layer_dir}/bin"
  mkdir -p "${layer_dir}/env"
  echo -e "[types]\nbuild = true\nlaunch = true" > "${layer_dir}.toml"
  echo "${layer}" > "${layer_dir}/env/MYVAR.override"
done
'''

[[io.buildpacks.group]]
id = "bp2"

  [io.buildpacks.group.script]
  api = "0.10"
  shell = "/bin/bash"
  inline = '''
set -euo pipefail
printenv | grep -E "^(PATH|MYVAR)="
'''

Current behavior

Non-deterministic values that don't match the corresponding values at launch time (or the spec):

Values during build...
MYVAR=b
PATH=/layers/bp/b/bin:/layers/bp/a/bin:/layers/bp/c/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MYVAR=b
PATH=/layers/bp/b/bin:/layers/bp/a/bin:/layers/bp/c/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MYVAR=c
PATH=/layers/bp/c/bin:/layers/bp/b/bin:/layers/bp/a/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MYVAR=a
PATH=/layers/bp/a/bin:/layers/bp/c/bin:/layers/bp/b/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MYVAR=c
PATH=/layers/bp/c/bin:/layers/bp/b/bin:/layers/bp/a/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Values at launch...
MYVAR=c
PATH=/layers/bp/c/bin:/layers/bp/b/bin:/layers/bp/a/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MYVAR=c
PATH=/layers/bp/c/bin:/layers/bp/b/bin:/layers/bp/a/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MYVAR=c
PATH=/layers/bp/c/bin:/layers/bp/b/bin:/layers/bp/a/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MYVAR=c
PATH=/layers/bp/c/bin:/layers/bp/b/bin:/layers/bp/a/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MYVAR=c
PATH=/layers/bp/c/bin:/layers/bp/b/bin:/layers/bp/a/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

Expected behavior

For the values during the build to:

  • Be deterministic, instead of changing every build
  • To match the values seen at launch
  • To match the spec

Context

lifecycle version

0.20.0

platform version(s)

Pack 0.34.2+git-ce8db3c.build-6005

@edmorley edmorley added status/triage type/bug Something isn't working labels Aug 20, 2024
@natalieparellano
Copy link
Member

Thanks for reporting @edmorley. We'll look into fixing this in the next patch

@natalieparellano natalieparellano added this to the lifecycle 0.20.1 milestone Aug 20, 2024
edmorley added a commit to heroku/buildpacks-python that referenced this issue Aug 29, 2024
Whilst writing the tests for the upcoming Poetry support, I made a few
changes to the overall package manager testing strategy (such as using
a testing buildpack to verify that at build time the tools and env vars
are configured correctly) - which I've split out of the later PRs for
easier review.

In particular, the new testing buildpack added here is what helped me
debug and locate this upstream lifecycle build time env vars bug:
buildpacks/lifecycle#1393

GUS-W-16617242.
edmorley added a commit to heroku/buildpacks-python that referenced this issue Aug 29, 2024
Whilst writing the tests for the upcoming Poetry support, I made a few
changes to the overall package manager testing strategy (such as using
a testing buildpack to verify that at build time the tools and env vars
are configured correctly) - which I've split out of the later PRs for
easier review.

In particular, the new testing buildpack added here is what helped me
debug and locate this upstream lifecycle build time env vars bug (which
would have broken apps when we switch to venvs shortly):
buildpacks/lifecycle#1393

GUS-W-16617242.
edmorley added a commit to heroku/buildpacks-python that referenced this issue Aug 29, 2024
Whilst writing the tests for the upcoming Poetry support, I made a few
changes to the overall package manager testing strategy (such as using
a testing buildpack to verify that at build time the tools and env vars
are configured correctly) - which I've split out of the later PRs for
easier review.

In particular, the new testing buildpack added here is what helped me
debug and locate this upstream lifecycle build time env vars bug (which
would have broken apps when we switch to venvs shortly):
buildpacks/lifecycle#1393

GUS-W-16617242.
edmorley added a commit to heroku/buildpacks-python that referenced this issue Aug 30, 2024
Whilst writing the tests for the upcoming Poetry support, I made a few
changes to the overall package manager testing strategy (such as using
a testing buildpack to verify that at build time the tools and env vars
are configured correctly) - which I've split out of the later PRs for
easier review.

In particular, the new testing buildpack added here is what helped me
debug and locate this upstream lifecycle build time env vars bug (which
would have broken apps when we switch to venvs shortly):
buildpacks/lifecycle#1393

GUS-W-16617242.
edmorley added a commit to heroku/buildpacks-python that referenced this issue Aug 30, 2024
App dependencies are now installed into a Python virtual environment
(aka venv / virtualenv) instead of into a custom user site-packages
location.

This:
1. Avoids user site-packages compatibility issues with some packages
   when using relocated Python (see #253)
2. Improves parity with how dependencies will be installed when using
   Poetry in the future (since Poetry doesn't support `--user` installs)
3. Unblocks being able to move pip into its own layer (see #254)

This approach is possible since pip 22.3+ supports a new `--python` /
`PIP_PYTHON` option which can be used to make pip operate against a
different environment to the one in which it is installed. This allows
us to continuing keeping pip in a separate layer to the app dependencies
(currently the Python layer, but in a later PR pip will be moved to its
own layer).

For a venv to work, it depends upon the `<venv_layer>/bin/python` script
being earlier in `PATH` than the main Python installation. To achieve
that with CNBs, the venv's layer name must be alphabetically after the
Python layer name. In addition, lifecycle 0.20.1+ is required, since
earlier versions didn't implement the spec correctly during the
execution of later buildpacks - see:
buildpacks/lifecycle#1393

Now that app dependencies are installed into a venv, we no longer need
to make the system site-packages directory read-only to protect against
later buildpacks installing into the wrong location.

This has been split out of the Poetry PR for easier review.

See also:
- https://docs.python.org/3/library/venv.html
- https://pip.pypa.io/en/stable/cli/pip/#cmdoption-python

Closes #253.
GUS-W-16616226.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status/ready type/bug Something isn't working
Projects
None yet
2 participants