CASM-4908 Runtime container image signature validation #3703

Draft
wants to merge 1 commit into
base: release/1.6

@mtupitsyn mtupitsyn commented Oct 10, 2024

Summary and Scope

During initial testing of image signature validation, we discovered that Kyverno tries to contact https://artifactory.algol60.net/ for image verification. This blocks deployments in air-gapped environments, even in Audit mode (CASMTRIAGE-7283). Kyverno needs to contact the local registry instead, for both images and their respective signatures. This will allow us to turn on signature validation at runtime (during initial deployments, during upgrades, and in the background on running clusters).

The proposed solution involves these key steps:

  • Deploy a new Kyverno cluster policy, prepend-registry, which automatically adds registry.local/ to the beginning of the image spec of any new pod (unless it already starts with registry.local/).
  • Add a mirroring rule to the containerd configuration, so that images whose names start with registry.local/ are looked up in https://pit.nmn first and in https://registry.local/ second. This rule is needed to support the switch from PIT Nexus to Cloud Nexus during initial install. It is similar to the existing rule for image names starting with artifactory.algol60.net, which now becomes obsolete.
  • Move the deployment of Kyverno and its policies into a separate manifest, and deploy it early in the install/upgrade pipeline, so that image name mutation and signature validation apply to every deployment that follows Kyverno.
  • For the duration of a fresh install, while images are downloaded from PIT Nexus, put a temporary hosts record override into the CoreDNS ConfigMap. This override points to PIT Nexus instead of Cloud Nexus. The Kyverno admission controller needs it to look for images and their signatures in the right location during a fresh install (when Cloud Nexus is not yet deployed).
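The first two steps can be illustrated with a minimal Python sketch. It is not the Kyverno policy or the containerd configuration from this PR; it only models the intended behavior under stated assumptions. The function names, the MIRRORS table, and the URL layout are hypothetical (a real pull goes through the registry HTTP API rather than a plain path join).

```python
REGISTRY_PREFIX = "registry.local/"

# Assumed mirror table modelled on the rule described above: names starting
# with registry.local/ are tried on PIT Nexus first, then on registry.local.
MIRRORS = {"registry.local": ["https://pit.nmn", "https://registry.local"]}


def prepend_registry(image: str) -> str:
    """Return the image reference with registry.local/ in front, unless already present."""
    if image.startswith(REGISTRY_PREFIX):
        return image
    return REGISTRY_PREFIX + image


def pull_candidates(image: str) -> list:
    """List the locations to try, in order, for an image reference."""
    host, _, remainder = image.partition("/")
    # Hosts without a mirror entry are contacted directly.
    endpoints = MIRRORS.get(host, ["https://" + host])
    return [endpoint + "/" + remainder for endpoint in endpoints]


if __name__ == "__main__":
    # The image path below is made up for illustration.
    mutated = prepend_registry("artifactory.algol60.net/example/image:1.0")
    print(mutated)
    for candidate in pull_candidates(mutated):
        print(candidate)
```

In this model the mutation is idempotent, so a pod spec that is admitted twice (for example on update) does not accumulate repeated registry.local/ prefixes.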

This change consists of the following PRs:

Issues and Related PRs

Testing

Tested on:

  • Virtual Shasta

Test description:

  • Created custom builds of CSM and docs-csm with changes outlined above
  • Performed multiple automated deployments on vShasta in different combinations: fresh install and upgrade, with validationFailureAction set to Audit and Enforce.

Risks and Mitigations

None known at this time.

Pull Request Checklist

  • Version number(s) incremented, if applicable
  • Copyrights updated
  • License file intact
  • Target branch correct
  • Testing is appropriate and complete, if applicable
  • HPC Product Announcement prepared, if applicable

Please ignore all changes to Jenkinsfile.github. They are needed to produce a temporary testing artifact and will be rolled back before merge.

Please ignore all changes to hack/embedded-repo.sh. They are needed to produce a temporary testing artifact and will be rolled back before merge.

Changes to vendor/github.com/Cray-HPE/shasta-cfg/customizations.yaml are made in the csm repo temporarily. They will be moved to https://github.com/Cray-HPE/shasta-cfg/blob/release/1.6/customizations.yaml, and the vendor reference in csm will be updated before merge.

type: repo
location: https://artifactory.algol60.net/artifactory/csm-helm-charts/
charts:
- name: cray-drydock

Drydock must be deployed first, because it creates namespaces for other deployments. We need to separate the sonar deployment from cray-drydock; otherwise it is deployed before Kyverno and bypasses signature validation.
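The ordering constraint in this comment can be expressed as a small check. This is a hypothetical sketch, not part of the install pipeline: the function name and the chart names other than cray-drydock are placeholders chosen for illustration.

```python
def charts_bypassing_validation(order, kyverno="kyverno", allowed=("cray-drydock",)):
    """Return charts deployed before Kyverno that are not explicitly allowed to be.

    Anything deployed before Kyverno is never seen by its admission controller,
    so it skips image name mutation and signature validation.
    """
    kyverno_index = order.index(kyverno)
    return [name for name in order[:kyverno_index] if name not in allowed]


if __name__ == "__main__":
    # Placeholder deployment order: sonar bundled with drydock runs too early.
    print(charts_bypassing_validation(["cray-drydock", "sonar", "kyverno", "other-chart"]))
    # After separating sonar and moving it behind Kyverno, nothing is reported.
    print(charts_bypassing_validation(["cray-drydock", "kyverno", "sonar", "other-chart"]))
```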

@mtupitsyn force-pushed the feature/prepend-registry branch 4 times, most recently from ecf9aa0 to 813eaab, October 17, 2024 21:41
* Use NCN images which support HTTPS on PIT Nexus and
  "registry.local/*" > ["pit.nmn/*", "registry.local/*"] mirroring rule
* Deploy Kyverno prepend-registry policy
* Manually prepend registry.local/ to chart images, missed by prepend-registry
  Kyverno policy
* Move Kyverno charts to separate manifest, deploy before any other chart
* Temporarily override record for `registry.local` in CoreDNS configmap
  during fresh install, restore right after.
* Stop supporting images such as "alpine:latest" - we need to know exactly
  how to mirror them, as docker.io/alpine or docker.io/library/alpine.
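The last item can be illustrated with a short check that rejects image references without an explicit registry host. This is a sketch under assumptions, not code from this PR: the function names are hypothetical, and the host test follows the common Docker-style convention that the first path component is a registry only if it contains "." or ":" or equals "localhost".

```python
from typing import Optional


def registry_host(image: str) -> Optional[str]:
    """Return the registry host of an image reference, or None if it has none."""
    first, sep, _ = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return None


def require_explicit_registry(image: str) -> str:
    """Raise ValueError for short names such as alpine:latest, which cannot be mirrored unambiguously."""
    if registry_host(image) is None:
        raise ValueError("image %r has no registry host; cannot decide how to mirror it" % image)
    return image


if __name__ == "__main__":
    for ref in ("docker.io/library/alpine:latest", "docker.io/alpine:latest", "alpine:latest"):
        try:
            print("ok:", require_explicit_registry(ref))
        except ValueError as err:
            print("rejected:", err)
```

Both docker.io/alpine and docker.io/library/alpine pass the check, since each names its registry explicitly; only the short form is rejected.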