Switch back from debian to ubuntu in hub and singleuser-sample images, and bump python to 3.10 #2921

Closed
wants to merge 2 commits into from

Conversation

consideRatio
Member

@consideRatio consideRatio commented Oct 28, 2022

We observed (#2917 (comment)) many vulnerabilities reported by container scanners that didn't seem to get fixed in the current base images, python:3.9-slim-bullseye -> debian:bullseye-slim.

With this PR we switch to ubuntu:22.04 as the base image, and by doing that we go back to no reported vulnerabilities.

Results

| Image | Size | Reported vulnerabilities | Build time |
| --- | --- | --- | --- |
| (current) | 413 MB | Total: 625 (... HIGH: 454, CRITICAL: 20) | min 1m40s, max 2m40s |
| Debian slim | 266 MB | Total: 95 (... HIGH: 40, CRITICAL: 4) | - |
| Debian fat | 414 MB | Total: 625 (... HIGH: 454, CRITICAL: 20) | min 2m51s, max 3m53s |
| Ubuntu slim | 399 MB | Total: 0 (... HIGH: 0, CRITICAL: 0) | - |
| Ubuntu fat | 571 MB | Total: 0 (... HIGH: 0, CRITICAL: 0) | min 3m49s, max 6m45s |

Conclusion

  • Using ubuntu means longer build times, larger size, and little to no reported vulnerabilities.
  • Providing an intermediary slim stage seems like pretty much a pure win.

My inclination is that we should go with ubuntu and a slim version.

@consideRatio consideRatio changed the title pr/back to ubuntu Switch back hub/singleuser-sample images to ubuntu Oct 28, 2022
@consideRatio consideRatio marked this pull request as draft October 28, 2022 00:21
@consideRatio consideRatio changed the title Switch back hub/singleuser-sample images to ubuntu Switch back hub/singleuser-sample images from debian to ubuntu Oct 28, 2022
@consideRatio consideRatio marked this pull request as ready for review October 28, 2022 00:51
@consideRatio consideRatio changed the title Switch back hub/singleuser-sample images from debian to ubuntu Switch back from debian to ubuntu in hub and singleuser-sample images, and bump python to 3.10 Oct 28, 2022
@minrk
Member

minrk commented Oct 28, 2022

ubuntu slim is almost twice debian slim? Do you know what accounts for that? Both use micromamba?

Another option that's a smaller change if we still want to opt-in to Python versions is to use the deadsnakes ppa.

Member

@minrk minrk left a comment


Feel free to go ahead with this from me. If you'd like to explore keeping Python from apt via deadsnakes, that works for me, too. I'm also okay with sticking to the LTS Python, since we aren't very sensitive to Python versions here.

We do have one debugging tool (py-spy) in requirements.in that we could shift to the non-slim image. Because it's a debugging tool instead of a running-the-hub tool, I don't think resolving the strict pins via the lockfile is important.

@consideRatio
Member Author

> ubuntu slim is almost twice debian slim? Do you know what accounts for that? Both use micromamba?

The python:3.9 install vs. the micromamba install is the relevant difference, I think.

I'll look into deadsnakes ppa

@consideRatio
Member Author

consideRatio commented Oct 28, 2022

Arrrgh this is soo hard...

  1. I forgot a key reason for switching to python:3.9 images - that we could run the freeze script without needing to install Python there as well
    docker run --rm \
    --env=CUSTOM_COMPILE_COMMAND='Use the "Run workflow" button at https://github.com/jupyterhub/zero-to-jupyterhub-k8s/actions/workflows/watch-dependencies.yaml' \
    --volume="$PWD:/io" \
    --workdir=/io \
    --user=root \
    python:3.9-bullseye \
    sh -c 'pip install pip-tools==6.* && pip-compile --upgrade'
  2. Deadsnakes PPA was a bit messy to work with
    • You need to add the apt repository, and to do that, you want software-properties-common first
    • You need to install pip, which in turn requires distutils
    • You need to handle the python3.X python3 python alias situation
    • I failed to get a clean install, dragging in dependencies to the non-deadsnake-ppa python3 in various ways
    • I ended up with a massive image
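
For readers unfamiliar with the steps listed above, here is a rough sketch of what they tend to look like on an ubuntu:22.04 base. It only prints the commands rather than running them, since they need root and network access. The helper name `print_deadsnakes_steps`, the default version 3.11, and the symlink targets are assumptions made for illustration. They are not taken from this PR, and the sketch has not been tuned for image size.

```shell
#!/bin/sh
# Hypothetical helper: print the commands for installing a given Python
# version from the deadsnakes PPA. Nothing is executed here.
PY_VERSION="${PY_VERSION:-3.11}"  # assumed default, for illustration only

print_deadsnakes_steps() {
    py="python$1"
    cat <<EOF
apt-get update
apt-get install --yes software-properties-common curl ca-certificates
add-apt-repository --yes ppa:deadsnakes/ppa
apt-get install --yes --no-install-recommends ${py} ${py}-distutils
curl -sS https://bootstrap.pypa.io/get-pip.py | ${py}
ln -sf /usr/bin/${py} /usr/local/bin/python3
ln -sf /usr/bin/${py} /usr/local/bin/python
EOF
}

print_deadsnakes_steps "$PY_VERSION"
```

Each printed line maps to one of the bullets: the PPA needs software-properties-common, pip is bootstrapped after distutils is present, and the last two lines are one possible way to handle the python3.X / python3 / python aliases.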

Ideally we would have a python:3.9-jammy and python:3.9-jammy-slim (jammy = ubuntu 22.04). I'm not sure how to proceed.

This PR as it is now isn't going to be very good as it endangers the function of the freeze script: we would freeze in a debian based environment and use the result in an ubuntu based environment. Maybe that is okay though? Can we do that? @minrk what do you think about staying with python:3.X-bullseye as a freeze script image even though we use ubuntu with Python 3.X installed?

@minrk
Member

minrk commented Oct 28, 2022

I don't think pip-compile will produce different results based on the linux distro (as long as it isn't alpine), so a regular debian-based python image should still be able to freeze for ubuntu as long as the Python x.y version matches.
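
If the freeze image and the runtime image are allowed to differ in distro, the one thing worth checking is the Python x.y match mentioned above. A minimal sketch of such a check follows. The function names `python_minor` and `same_minor` are hypothetical, and the image name in the commented usage is a placeholder.

```shell
#!/bin/sh
# Extract "x.y" from a version string like "Python 3.9.15".
python_minor() {
    printf '%s\n' "$1" | sed -E 's/^Python ([0-9]+\.[0-9]+).*/\1/'
}

# Succeed only when both version strings share the same x.y.
same_minor() {
    [ "$(python_minor "$1")" = "$(python_minor "$2")" ]
}

# Possible usage with docker (my-hub-image is a placeholder name):
# freeze_v="$(docker run --rm python:3.9-bullseye python --version)"
# hub_v="$(docker run --rm my-hub-image python3 --version)"
# same_minor "$freeze_v" "$hub_v" || echo "Python x.y versions differ" >&2
```

Patch releases are deliberately ignored, since only the x.y version is said to matter for the frozen requirements.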

@consideRatio
Member Author

@minrk phew okay, then I think we should go for this PR as it is for now. I invested some hours but failed to optimize this further using the deadsnakes ppa, both in terms of complexity and image size.

@consideRatio
Member Author

@minrk I trialed building https://github.com/docker-library/python/blob/master/3.11/slim-bullseye/Dockerfile but replacing the base image with ubuntu. The result was a 128 MB image with python 3.11 and pip ❤️

I'm very excited about such a size reduction, but unsure how to go about it sustainably from an open source perspective, as that project isn't interested in adding such a build.

@manics
Member

manics commented Oct 28, 2022

This seems fine to me.

Does reducing the image size further make much practical difference in the context of a full deployment? It's a potentially large relative reduction, but I don't think we need to expend significant effort on saving 300MB when most people using docker-stacks will be pulling multi-GB images anyway.

Do we need to document the new slim hub image somewhere, e.g. on https://z2jh.jupyter.org/en/stable/administrator/optimization.html or https://z2jh.jupyter.org/en/stable/administrator/security.html ?

@consideRatio
Member Author

Now I'm again conflicted about switching back to Ubuntu, as the difference could be a fault in the container scanners, which may mainly pick up what's considered resolved in ubuntu but not in debian.

See docker-library/python#708 (comment).

Overall, I'm very happy about the python:3.X-slim-bullseye image as it's otherwise ideal for us: it's pre-built, about as slim as it can get, regularly updated, allows us to choose the python version, and comes side by side with a python:3.9-bullseye image based on buildpack-deps:bullseye that has a lot of things of relevance for building wheels in a build-stage...

I'm thinking maybe we should go for #2920 and wait with this. I'm conflicted... If we could choose python:3.X-ubuntu (which is as slim as slim-bullseye), we could have it all. In practice, the Dockerfiles building python:3.X-slim-bullseye can build on ubuntu as well without any modification, so I'm almost inclined to fork and maintain such an image as well...

@consideRatio
Member Author

Another strategy I've considered is to COPY --from=python-image-in-previous-stage and extract the built Python. The problem with that is that it would hardcode several paths.

Another great outcome would be if we could do something like COPY all things added in specific layers of python:3.9-slim-bullseye. This for example includes...

# initial step
usr/local
    bin
        2to3
        2to3-3.11
        idle3
        idle3.11
        pydoc3
        pydoc3.11
        python3
        python3-config
        python3.11
        python3.11-config
    include
        python3.11/
    lib
        libpython3.11.so
        libpython3.11.so.1.0
        libpython3.so
        pkgconfig/
        python3.11/
    share/man/man1
        python3.1
        python3.11.1

# symlink step
usr/local/bin
    idle
    pydoc
    python
    python-config

# pip step
usr/local/bin/
    pip
    pip3
    pip3.11
    wheel
/usr/local/lib
    python3.11

But there are also more things that are a bit messier to extract, making it tricky. I've concluded this using the tool dive.


@consideRatio
Member Author

> In practice does reducing the image size further make much practical difference in the context of a full deployment? It's a potential large relative reduction, but I don't think we need to expend significant effort on saving 300MB when most people using docker-stacks will be pulling multi-GB images anyway.

@manics yeah maybe it's overkill to care about MBs. If you upgrade, and the upgrade recreates the hub container, the image size (MB) and the pull speed (MB/s) combine into added downtime during upgrades, and added downtime while relocating to another node etc.

Good point about documentation, and thanks for concrete suggestions!!
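
To put rough numbers on the downtime point above, here is a back-of-the-envelope sketch. The 50 MB/s pull throughput is an assumed figure, not a measurement, and the helper name `pull_seconds` is hypothetical. The sizes are the image sizes quoted in this thread, which are likely larger than the compressed size actually transferred, so treat the results as upper-bound estimates.

```shell
#!/bin/sh
# Estimated pull time in whole seconds, rounded up.
# $1 = image size in MB, $2 = pull throughput in MB/s (both integers)
pull_seconds() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

pull_seconds 571 50   # Ubuntu fat: prints 12
pull_seconds 266 50   # Debian slim: prints 6
pull_seconds 128 50   # trial ubuntu-based python build: prints 3
```

Under these assumptions the difference is a handful of seconds per hub restart on a node that has not cached the image.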

@consideRatio consideRatio marked this pull request as draft October 28, 2022 23:28
@consideRatio
Member Author

Closing, let's settle for having hub-slim for now


3 participants