Replace vim with nano in hub image to reduce known vulnerabilities #2917

consideRatio
Member

@consideRatio consideRatio commented Oct 25, 2022

I hope we don't need to discuss which editors to use in general, but I agree that nano can do the trick for the occasional debugging needed in the hub pod.

@yuvipanda
Collaborator

I would really really really really really like to not remove vim and also avoid a conversation about editors :)

@yuvipanda
Collaborator

Maybe we can switch to nvim?

@consideRatio
Member Author

consideRatio commented Oct 25, 2022

Installing neovim resulted in this report summary by trivy (from python:3.9-slim-bullseye).

# python:3.9-slim-bullseye after apt-get upgrade
Total: 84 (UNKNOWN: 0, LOW: 12, MEDIUM: 31, HIGH: 37, CRITICAL: 4)

# python:3.9-slim-bullseye after apt-get upgrade and nano installed (no change)
Total: 84 (UNKNOWN: 0, LOW: 12, MEDIUM: 31, HIGH: 37, CRITICAL: 4)

# python:3.9-slim-bullseye after apt-get upgrade and install vim (16 more critical, 300+ more high)
Total: 576 (UNKNOWN: 0, LOW: 16, MEDIUM: 111, HIGH: 429, CRITICAL: 20)

# python:3.9-slim-bullseye after apt-get upgrade and install neovim (18 more critical, 100+ more high)
Total: 258 (UNKNOWN: 0, LOW: 13, MEDIUM: 63, HIGH: 160, CRITICAL: 22)
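
The deltas quoted in the comments above can be double-checked with a few lines of Python. This is only a sketch: `parse_summary` and `delta` are hypothetical helper names, and the regex assumes the one-line summary format pasted above, not any official trivy output schema.

```python
import re

# Summary lines copied verbatim from the trivy reports above.
REPORTS = {
    "base": "Total: 84 (UNKNOWN: 0, LOW: 12, MEDIUM: 31, HIGH: 37, CRITICAL: 4)",
    "nano": "Total: 84 (UNKNOWN: 0, LOW: 12, MEDIUM: 31, HIGH: 37, CRITICAL: 4)",
    "vim": "Total: 576 (UNKNOWN: 0, LOW: 16, MEDIUM: 111, HIGH: 429, CRITICAL: 20)",
    "neovim": "Total: 258 (UNKNOWN: 0, LOW: 13, MEDIUM: 63, HIGH: 160, CRITICAL: 22)",
}


def parse_summary(line):
    """Turn a summary line into a dict such as {"TOTAL": 84, "LOW": 12, ...}."""
    return {key.upper(): int(value) for key, value in re.findall(r"(\w+): (\d+)", line)}


def delta(report, baseline):
    """Per-severity difference between a report and the baseline image."""
    return {key: report[key] - baseline[key] for key in baseline}


base = parse_summary(REPORTS["base"])
for name in ("nano", "vim", "neovim"):
    # Prints how many findings each editor adds on top of the base image.
    print(name, delta(parse_summary(REPORTS[name]), base))
```

Note that this only compares counts; it says nothing about whether any individual CVE is actually reachable in the hub pod.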

I'll now investigate whether switching to ubuntu resolves the trouble around this choice, see related #2918.

Conclusion: both vim and neovim, installed as apt packages with ubuntu:22.04 as the base image, led to 0 reported vulnerabilities in trivy.

@yuvipanda
Collaborator

I do a lot of debugging work in the hub pod and removing vim completely in there would have a serious effect on that. We also don't really have conda in there so I can't really just install it either. I also personally believe these CVEs won't really apply to us in this context (which is one of the reasons I hate simple CVE scans). But I understand they cause noise which might drown out useful signals. I'm not sure what is the right thing to do :(

@consideRatio
Member Author

But I understand they cause noise which might drown out useful signals.

That you voiced this part makes me feel heard, this is the key reason for me to look into this atm!

@yuvipanda
Collaborator

Thank you so much @consideRatio for working on this!

@manics
Member

manics commented Oct 25, 2022

One option to consider is publishing two Hub images:

  • one minimal (no editors, no debugging packages, nothing, this is usually considered "best practice". I've seen some images remove even the most basic CLI commands but I think that goes too far).
  • a second image built on top of the minimal one with basic editors, and anything else that's generally useful like ps, lsof (or whatever you use for looking for leaking file handles), basic network tools, etc

We can default to the second one in Values.yaml, but provide the minimal one for anyone who wants to get past a vulnerability scan.

@minrk
Member

minrk commented Oct 26, 2022

Question about @manics' idea: with our chartpress setup, is it feasible to inherit from one image to another? I think it might not be easy because the tag of one image isn't known or accessible in the build of another.

I was thinking the simplest version of this would be to have the bigger image FROM the smaller one, with a couple extra install lines, but I don't know how I'd do that with chartpress. Lockfiles and such aren't kind to layering either, so maybe the two having the same Dockerfile only differing by apt.txt/requirements.txt would be the easiest in practice.

@consideRatio
Member Author

consideRatio commented Oct 26, 2022

Opinion about multiple hub images

I currently hold a fairly strong opinion that it's not worth the complexity of having two different kinds of images, or even a common base image. I'd argue that wherever what we provide by default isn't a good compromise, users would have to adjust.

As I see it, multiple images or additional base images add significantly to the complexity of maintaining the source code, CI system, tooling, and related documentation.

More info needed for a decision

I'd like to postpone opining on a choice or action point until we have a better understanding about the security reports for the ubuntu and debian images. If the ubuntu image is far better patched, then I'd prefer using it over using the debian image.

  • Use python based on debian and vim from apt (current)
  • Use python based on debian and nano from apt
  • Use python based on debian and install mamba that users can use to install a text editor from conda-forge
  • Use ubuntu and vim from apt
  • Use ubuntu and nano from apt
  • Use ubuntu and install mamba that users can use to install a text editor from conda-forge

Overall, I want few unpatched CVEs by default in all images we maintain, for many reasons. The key action point for us as maintainers, as I see it, is to keep our fresh releases from showing up with an F rating on artifacthub.io.

CVE detections in ubuntu vs debian

# ubuntu:22.04 after apt-get upgrade
Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)

# ubuntu:22.04 after apt-get upgrade and less, nano, vim, neovim installed
Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)

# debian:bullseye after apt-get upgrade
Total: 74 (UNKNOWN: 0, LOW: 12, MEDIUM: 26, HIGH: 32, CRITICAL: 4)

# debian:bullseye after apt-get upgrade and less, nano, vim, neovim installed
Total: 627 (UNKNOWN: 0, LOW: 16, MEDIUM: 123, HIGH: 454, CRITICAL: 34)

The vim apt package in ubuntu (2:8.2.3995-1ubuntu2.1) was patched Sept 15, while the one in debian (2:8.2.2434-3+deb11u1) was last patched Dec 18, 2021.
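
To make the gap between the two package versions concrete, here is a rough sketch that extracts the upstream vim version from each string. `upstream_version` is a hypothetical helper, and the parsing is deliberately simplified: it assumes an `epoch:upstream-revision` layout with a purely numeric dotted upstream part, and does not implement the full dpkg version ordering.

```python
def upstream_version(deb_version):
    """Return the upstream part of a Debian-style version as a tuple of ints.

    Simplified assumption: drop the optional "epoch:" prefix and the
    "-revision" suffix, then split the remainder on dots.
    """
    without_epoch = deb_version.split(":", 1)[-1]
    upstream = without_epoch.rsplit("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))


ubuntu_vim = upstream_version("2:8.2.3995-1ubuntu2.1")
debian_vim = upstream_version("2:8.2.2434-3+deb11u1")

# The third component is vim's patch level, so the difference is the number
# of upstream patch levels between the two packaged versions.
print(ubuntu_vim, debian_vim, ubuntu_vim[2] - debian_vim[2])
```

The upstream version alone doesn't tell which CVE fixes were backported into either package, so this is only a hint about why the scan results differ.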

Current thinking

  • We should go with ubuntu as a base image
  • If needed in a multi stage build, we start out with buildpack-deps:22.04 in the build stage
  • As part of switching to ubuntu, we embrace the complexity of installing Python ourselves using micromamba, which is very helpful for doing that from an environment without Python.
  • We stick with vim as an apt package once we are on ubuntu again

@manics
Member

manics commented Oct 26, 2022

It's not supported in Chartpress at the moment, but I think it's a reasonable addition if there's demand for a more secure minimal image.

The approach I'd take is to have a single multi-stage Dockerfile, and tag an intermediate stage

This ensures the multiple images are always in sync and rebuilt/tagged together, since they originate from the same Dockerfile.

@minrk
Member

minrk commented Oct 27, 2022

The approach I'd take is to have a single multi-stage Dockerfile, and tag an intermediate stage

That sounds cool! I've not done that before.

I think it's reasonable to ask what's the role of these debugging tools in the default image:

  1. who uses them? (speaks to reasonable expectations)
  2. when do folks decide to use them? (i.e. does switching to a non-default image defeat the purpose because critical context is lost when the container is restarted)
  3. what permissions are needed to be installed, if they aren't in the image? (i.e. can they be installed in a running container)

For instance, when I'm kubectl exec debugging, I regularly do some pip-installing, and that works fine because I don't need root, so there's no real benefit I see to those tools being in the image if they are pip-installable. But I do need root for apt-get install, and, unlike docker exec, kubectl exec doesn't allow specifying users (I'm not sure how robust tools like kubectl-execuser are).

If we lost these tools in the default image, what would be involved for a deployment that wanted an additional apt package added to the default image? I guess this is the general 'custom hub image' case. Maybe the value of having them in the default image is because it's too much of a pain to maintain an extended image, and that's the problem we should work on?

For the more short-term choice, I think switching to ubuntu base makes perfect sense, based on @consideRatio's findings. I like the simpler, more straightforward installation of Python from apt instead of conda for simple environments like ours, but that doesn't allow easy selection of the Python version itself, so if we want to get e.g. Python 3.11 in a relatively short time frame, micromamba makes sense. We can do micromamba for just Python itself, and keep dependabot-compatible pip-compile tooling, or we could switch to conda-lock, which doesn't need to run in a container, and wouldn't have issues with e.g. pycurl compilation dependencies, because there would never be a compile step.

@consideRatio
Member Author

We can do micromamba for just Python itself, and keep dependabot-compatible pip-compile tooling, or we could switch to conda-lock, which doesn't need to run in a container, and wouldn't have issues with e.g. pycurl compilation dependencies, because there would never be a compile step.

A big 👍 for just using it to install Python.

I proposed it over apt so we don't depend on what's shipped with ubuntu, which during the latest update went from 3.7 to 3.10. I think our system with requirements.[in|txt] and pip-compile works well, and I don't want to switch to a conda-forge based installation of Python packages, based on an experience of added complexity and practical issues without sufficient benefits.

Strategy on providing hub-slim

If tagging a layer is easy, I'm +1 for doing it. If it means adding notable complexity in tooling or similar I'm hesitant.

I've not tried this, but if it works, it would be simple enough in my book to be worth supporting.

  images:
    # hub, the container where JupyterHub, KubeSpawner, and the configured
    # Authenticator are running.
    hub:
      valuesPath: hub.image

And make it...

  images: 
    # hub, the container where JupyterHub, KubeSpawner, and the configured 
    # Authenticator are running. 
    hub: 
      valuesPath: hub.image
    # hub-slim, an alternative hub image that doesn't include some
    # basic utilities for k8s admins
    hub-slim:
      contextPath: "images/hub"
      extraBuildCommandOptions:
        - --target=slim-stage

@consideRatio
Member Author

I'm closing this as...

  1. We can provide a jupyterhub/k8s-hub-slim version alongside a jupyterhub/k8s-hub image with ease.
    Add a jupyterhub/k8s-hub-slim image alongside jupyterhub/k8s-hub #2920
  2. We can avoid all known vulnerabilities by switching to ubuntu:22.04, which seems to patch things far faster.
    Switch back from debian to ubuntu in hub and singleuser-sample images, and bump python to 3.10 #2921
