KEP-3000: Image Promotion and Distribution Policy #3079

Merged: 17 commits, May 11, 2022

Member

@hh hh commented Dec 7, 2021

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 7, 2021
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. labels Dec 7, 2021
Member Author

hh commented Dec 7, 2021

/assign @justaugustus @dims


### Non-Goals

Anything related to creation of artifacts, bom, digital signatures.
Member

@ameukam ameukam Dec 7, 2021

> artifacts

The title of the KEP implies that artifacts are in scope. We need to clarify which types of artifacts are non-goals.

```feature
Then the promotion process occurs
```

#### Cloud Customer - Installing K8s via kubeadm
Member

Is it possible to consider non-cloud users?

Member

The largest share of the cost comes from cloud customers, so we will focus on them for this KEP.

Member

@justaugustus justaugustus left a comment

(Mentioned on Slack first)

@hh -- I've left a few tweaks (which will also fix the presubmits) here: https://github.com/justaugustus/enhancements/tree/MST-3000

@k8s-ci-robot k8s-ci-robot added the sig/release Categorizes an issue or PR as relevant to SIG Release. label Dec 7, 2021
Member Author

hh commented Dec 7, 2021

/assign @spiffxp

@eddiezane
Member

/cc

Member

@thockin thockin left a comment

I know there are more questions I am forgetting to ask.


### Goals

A policy and procedure for use by SIG Release to promote container images and release binaries to multiple registries and mirrors.
Member

I think we should break this into 2 major phases:

  1. Container images
  2. Other artifacts

We may even want to break it into 2 KEPs so we can "finish" one.

Member

Good thought! This KEP just focuses on container images now.


## Design Details

### Artifact Promotion
Member

Let's document and consider two main approaches:

  1. Push. Each mirror provider gives us a mechanism and credentials to push container images. As we promote images from staging to prod, we push to all mirrors. We need to consider credential security and rotation.

  2. Pull. We publish a log (git repo?) of image changes and mirrors are expected to sync changes in a reasonable period of time (99p @ 10 mins?).
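
As a rough illustration of the pull option, the sketch below checks whether observed mirror sync lags meet a "99th percentile within 10 minutes" target. The function names, the nearest-rank percentile method, and the sample lag values are assumptions made for illustration; the comment above only floats the 99p @ 10 mins figure as a question.

```python
def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer in 1..100."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def meets_sync_target(lags_seconds, pct=99, target_seconds=600):
    """True if the pct-th percentile sync lag is within the target (assumed 10 minutes)."""
    return percentile(lags_seconds, pct) <= target_seconds


# Hypothetical lags (seconds) between an image change being logged
# and a mirror serving it.
mostly_fast = [30] * 99 + [900]
too_many_slow = [30] * 98 + [900] * 2

print(meets_sync_target(mostly_fast))
print(meets_sync_target(too_many_slow))
```

With 100 samples, one slow sync still satisfies the target, while two slow syncs push the 99th percentile past it.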

Member

We opted for a push-based mechanism, where sig-k8s-infra manages the content of the buckets.


### Artifact Distribution

#### Policy
Member

We need to detail how to on-board a mirror. E.g.:

  • provide some guarantees of service (e.g. a contract with CNCF) and point of contact
  • provide an emergency contact in case of outage
  • provide a mapping of client IPs to mirror (maybe through a git repo)

Then we can add the mirror and have the front-end server start redirecting traffic.
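
A minimal sketch of the "mapping of client IPs to mirror" idea, assuming the mapping is a table of CIDR ranges to mirror base URLs and that the most specific matching range wins. The CIDRs are documentation ranges, and the hostnames, `pick_mirror`, and `redirect_url` are hypothetical; this is not how any existing redirector is known to work.

```python
import ipaddress

# Hypothetical mapping, e.g. loaded from a file in a git repo.
MIRROR_BY_CIDR = {
    "192.0.2.0/24": "https://mirror-a.example.com",
    "198.51.100.0/24": "https://mirror-b.example.com",
}
DEFAULT_UPSTREAM = "https://upstream.example.com"


def pick_mirror(client_ip, mapping=MIRROR_BY_CIDR, default=DEFAULT_UPSTREAM):
    """Return the mirror for the most specific CIDR containing client_ip."""
    addr = ipaddress.ip_address(client_ip)
    best = None  # (prefix length, mirror URL)
    for cidr, mirror in mapping.items():
        net = ipaddress.ip_network(cidr)
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0]:
                best = (net.prefixlen, mirror)
    return best[1] if best else default


def redirect_url(client_ip, path):
    """Build the redirect target for a request path."""
    return pick_mirror(client_ip) + path


print(redirect_url("198.51.100.7", "/v2/pause/manifests/3.6"))
```

Clients outside every listed range fall back to the default upstream, so adding a mirror never strands existing users.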

We will want to periodically health-check each mirror (e.g. pull a random blob, measure latency). If a health check fails, remove the mirror until it passes N times. We need a site or something similar indicating which mirrors are healthy, maybe with stats.
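
The "remove until it passes N times" rule could look roughly like the sketch below. It assumes the N passes must be consecutive and that a single failure removes a mirror from rotation; both are interpretations, and the `MirrorHealth` class is hypothetical. The probe itself (pulling a blob, timing it) is left out.

```python
class MirrorHealth:
    """Tracks whether one mirror should receive redirected traffic."""

    def __init__(self, required_passes=3):
        self.required_passes = required_passes
        self.healthy = True
        self.consecutive_passes = 0

    def record(self, passed):
        """Record one probe result and return the current health state."""
        if not passed:
            # Any failure removes the mirror and resets its recovery streak.
            self.healthy = False
            self.consecutive_passes = 0
        elif not self.healthy:
            self.consecutive_passes += 1
            if self.consecutive_passes >= self.required_passes:
                self.healthy = True
                self.consecutive_passes = 0
        return self.healthy


mirror = MirrorHealth(required_passes=2)
for result in [True, False, True, True]:
    print(result, mirror.record(result))
```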

We will want to log all redirects and set up a PII-anonymizing process so we can publish some aggregated information about how much traffic is going to each mirror, top images globally, etc.
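
One simple way to publish aggregates without PII is to discard client identifiers entirely and keep only counts. The sketch below assumes each redirect log record is a dict with `client_ip`, `mirror`, and `image` keys; that record shape and the `aggregate_redirects` function are hypothetical, and real anonymization may need more than this (for example, suppressing very small groups).

```python
from collections import Counter


def aggregate_redirects(records, top_n=10):
    """Summarize redirect logs by mirror and image, dropping client IPs."""
    per_mirror = Counter()
    per_image = Counter()
    for rec in records:
        # Only the mirror and image are read; client_ip never reaches the output.
        per_mirror[rec["mirror"]] += 1
        per_image[rec["image"]] += 1
    return {
        "by_mirror": dict(per_mirror),
        "top_images": per_image.most_common(top_n),
    }


# Hypothetical log records using documentation IP ranges.
logs = [
    {"client_ip": "192.0.2.1", "mirror": "a", "image": "pause"},
    {"client_ip": "192.0.2.2", "mirror": "a", "image": "etcd"},
    {"client_ip": "198.51.100.3", "mirror": "b", "image": "pause"},
]
print(aggregate_redirects(logs, top_n=1))
```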

Member

@ameukam ameukam Dec 23, 2021

> provide some guarantees of service (e.g. a contract with CNCF) and point of contact

Most managed container registry services have SLAs. We need to agree on a minimum level of SLA. We now plan to use object storage services.

Member

Going off of what @ameukam says here, we are working closely with the providers who consume the most to bring up infra that we manage.


#### Policy

#### Process
Member

We will need to detail how we turn up the new DNS name and redirector, and how we plan to migrate users of the old GCR name to the new name.

Member

Members of sig-k8s-infra have made PRs against projects like kops and Kubernetes to change the defaults.
There's also e2e testing through changing the domain for various jobs running in Prow.

@thockin thockin self-assigned this Dec 9, 2021

Anything related to creation of artifacts, bom, digital signatures.

## Proposal
Member

We should also document that we're explicitly OK with a model where the management of the mirror is opaque to us as long as the other criteria are met.

Comment on lines 15 to 18
approvers:
- "@ameukam"
- "@justaugustus"
Member

Suggested change

Before:
approvers:
- "@ameukam"
- "@justaugustus"

After:
approvers:
- "@ameukam"
- "@dims"
- "@justaugustus"
- "@saschagrunert"

Member

updated in 6bfecc8

#### Cloud Customer - Installing K8s via kubeadm

```feature
As a CLOUD end-user
```
Member

We should not tie the user story to a cloud environment. We don't want to break the existing way of consuming the container images produced.

Member

Since most of the spend comes from cloud users, it should be fine to focus on them for this KEP.

Member Author

Pulled out User Stories for this merge


Given some compute resources at CLOUD
When I use kubeadm to deploy Kubernetes
Then I will be redirected to a local CLOUD registry
Member

We should be clearer about the meaning of "local":

Suggested change

Before:
Then I will be redirected to a local CLOUD registry

After:
Then I will be redirected to the closest network endpoint of the registry

Member

resolved in f960d6f


### How much is this going to save us?

![Cost of K8s Artifact hosting - Data Studio Graphs](https://i.imgur.com/LAn4UIE.png)
Member

We should provide a link to the full report and not just a screenshot for better transparency.

Member

Updated in 7030358 to show AWS involvement, with more context, without exposing company-specific usage patterns.


## Infrastructure Needed

It would be good to request some donations for some larger providers, including one in China, via [Cloud Native Credits program](https://www.cncf.io/credits/).
Member

We should provide a list of exactly what infrastructure is needed, to ensure this KEP can move to implementable.

source: Ben's doc (kubernetes-sigs/oci-proxy/cmd/archeio/docs/request-handling.md)

linux-foundation-easycla bot commented May 11, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 11, 2022
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels May 11, 2022
Member Author

hh commented May 11, 2022

> (Mentioned on Slack first)
>
> @hh -- I've left a few tweaks (which will also fix the presubmits) here: https://github.com/justaugustus/enhancements/tree/MST-3000

Thanks!

Member

dims commented May 11, 2022

@justaugustus @saschagrunert this is ready (all conversations resolved). Please take a look and approve if appropriate.

Member

@saschagrunert saschagrunert left a comment

/approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 11, 2022
Member

dims commented May 11, 2022

Let's merge and iterate. Thanks, @BobyMCbobs!

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 11, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, hh, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/release Categorizes an issue or PR as relevant to SIG Release. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.