1 parent 5fc53d6, commit 98a29fd
Showing 1 changed file with 123 additions and 0 deletions.

---
title: Quota Management and Enforcement on Quay
authors:
- "@fmissi"
- "@sleesinc"
- "@sdadi"
reviewers:
- "@fmissi"
- "@sleesinc"
- "@sdadi"
approvers:
- "@fmissi"
- "@sleesinc"
- "@sdadi"
creation-date: 2021-12-10
last-updated: 2021-12-10
status: implementable
---

# Quay as a cache proxy for upstream registries

Container development has become widely popular. Customers today rely on container images from upstream registries (such as Docker and GCP) to get desired services up and running.
These registries now enforce rate limits and throttling on the number of times users can pull from them.
This proposal enables Quay to act as a pull-through cache: once an image has been pulled, it is fetched from upstream again only when the upstream image has been updated.

## Release Signoff Checklist

- [x] Enhancement is `implementable`
- [x] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA

## Open Questions

> 1. Will the check against the upstream registry count against an upstream rate limit?

## Summary

Dependence on container images has increased tremendously with the adoption of container-driven development. With the introduction of rate limits
on popular container registries, Quay will act as a proxy cache to circumvent the pull rate limits of upstream registries. Adding a cache will also
accelerate pulls, as images are served from the cache rather than from upstream registries. Cached images are updated only when the upstream
image digest differs from the digest of the cached image.

### Goals

* A Quay user can define and configure (credentials, staleness period), via the config file or the app, a repository in Quay that acts as a cache for a specific upstream registry.
* A Quay superuser can leverage the storage quota of an organization to limit cache size. When the cache reaches its quota limit,
  images are evicted from the cache on a least-recently-used (LRU) basis.
* A proxy cache organization will transparently cache and stream images to the client. Images in the proxy cache organization should
  support the same default behaviour (security scanner, time machine, etc.) as other images on Quay.
* Given the sensitive nature of accessing a potentially untrusted upstream registry, all cache pulls need to be logged (audit log).
* A Quay user can flush the cache to eliminate excess storage consumption.
* Robot accounts can be created in the cache organization as usual; their RBAC can be managed for all cached repositories that exist at a given point in time.
* Provide metrics to track cache activity and efficiency (hit rate, size, evictions).

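The first goal mentions per-cache settings (upstream registry, credentials, staleness period). As a rough illustration only, these could be grouped as below. The class name, field names, the default value, and the use of seconds as the unit are all assumptions made for this sketch; they are not Quay's actual configuration schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProxyCacheConfig:
    """Hypothetical settings for one proxy cache (not Quay's real schema)."""

    upstream_registry: str                   # e.g. "docker.io"
    upstream_username: Optional[str] = None  # None means anonymous pulls
    upstream_password: Optional[str] = None
    staleness_period_s: int = 86400          # staleness period; seconds is an assumed unit

    def validate(self) -> None:
        """Raise ValueError if the settings are inconsistent."""
        if not self.upstream_registry:
            raise ValueError("upstream_registry must not be empty")
        if self.staleness_period_s < 0:
            raise ValueError("staleness_period_s must be non-negative")
        # Credentials only make sense as a pair.
        if (self.upstream_username is None) != (self.upstream_password is None):
            raise ValueError("username and password must be set together")
```

Keeping the credentials optional lets the same structure describe both anonymous and authenticated upstream access.
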
### Non-Goals

* In the first phase, the target is configuring a cache proxy organization, caching upstream images, and quota management on cached repositories.
  Other goals will be implemented subsequently, building on the work of this proposal.
* Cached images are read-only.

## Design Details

The expected pull flow is depicted below:
![](https://user-images.githubusercontent.com/11522230/145866763-58f44c94-839b-4edb-a95b-b9c3648cf187.png)
Design credits: @fmissi

A user initiates a pull of an image (say, a `postgres:14` image) from an upstream repository on Quay. The repository is checked to see whether the image is present.
1. If the image does not exist in the cache, a fresh pull is initiated.
   * The user is authenticated against the upstream registry and all the layers of `postgres:14` are pulled.
   * The pulled layers are saved to the cache and served to the user in parallel.
   * This is depicted below:
     ![](https://user-images.githubusercontent.com/11522230/145871778-da01585a-7b1b-4c98-903f-809c214578da.png)
     Design credits: @fmissi

2. If the image exists in the cache:
   * A user can rely on Quay's cache to stay coherent with the upstream source, so that newer images are transparently served from the cache
     when tags have been overwritten in the upstream registry, either immediately or after a certain period of time.
   * If the upstream image and the cached version of the image are the same:
     * No layers are pulled from the upstream repository and the cached image is served to the user.

   * If the upstream image and the cached version of the image are different:
     * The user is authenticated against the upstream registry and only the changed layers of `postgres:14` are pulled.
     * The new layers are updated in the cache and served to the user in parallel.
     * This is depicted below:
       ![](https://user-images.githubusercontent.com/11522230/145872216-31350e08-6746-4e34-aebf-e59a7bf6b372.png)
       Design credits: @fmissi

3. If a user initiates a pull while the upstream registry is down:
   * If the pull happens within the configured staleness period, the image stored in the cache is served.
   * If the pull happens after the configured staleness period, the error is propagated to the user.
   * This is depicted below:
     ![](https://user-images.githubusercontent.com/11522230/145878373-c23d094b-709d-4859-b875-013ea33e34f7.png)
     Design credits: @fmissi

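The three cases above can be summarised as a single decision function. This is a minimal sketch, not the proposed implementation: the names `PullAction` and `decide_pull` are hypothetical, an unreachable upstream is modelled as `upstream_digest=None`, and `cache_age_s` is assumed to mean the time since the cached image was last confirmed against upstream.

```python
from enum import Enum
from typing import Optional


class PullAction(Enum):
    FETCH_ALL = "fetch all layers from upstream and cache them"
    SERVE_CACHED = "serve the cached image"
    FETCH_CHANGED = "fetch changed layers from upstream and update the cache"
    ERROR = "propagate the upstream error to the user"


def decide_pull(
    cached_digest: Optional[str],
    upstream_digest: Optional[str],
    cache_age_s: float,
    staleness_period_s: float,
) -> PullAction:
    """Pick an action for one pull request (illustrative only).

    cached_digest is None when the image is not in the cache;
    upstream_digest is None when the upstream registry is unreachable.
    """
    if upstream_digest is None:
        # Case 3: upstream is down, fall back to the cache if it is fresh enough.
        if cached_digest is not None and cache_age_s <= staleness_period_s:
            return PullAction.SERVE_CACHED
        return PullAction.ERROR
    if cached_digest is None:
        # Case 1: nothing cached yet, so do a fresh pull.
        return PullAction.FETCH_ALL
    if cached_digest == upstream_digest:
        # Case 2a: cache and upstream agree.
        return PullAction.SERVE_CACHED
    # Case 2b: the tag was overwritten upstream.
    return PullAction.FETCH_CHANGED
```

For example, `decide_pull("sha256:aaa", None, 30, 60)` selects `PullAction.SERVE_CACHED`, because the upstream is unreachable but the cached copy is still within the staleness period.
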
A Quay admin can leverage the quota of an organization to limit cache size, so that backend storage consumption remains predictable:
images are discarded from the cache in least-recently-used order.
This is depicted below:
![](https://user-images.githubusercontent.com/11522230/145884935-df19297f-96b5-4c1c-9cdc-e199e04df176.png)

When a user initiates a pull of an image (say, a `postgres:14` image) from an upstream repository on Quay and the storage consumption of the organization
is beyond the quota limit, images in the namespace are removed in least-recently-used order to make space for `postgres:14` to be cached.

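The eviction behaviour described above can be sketched with an ordered mapping from image name to size. This is an illustration under simplifying assumptions: `LruImageCache` and its methods are hypothetical names, sizes are tracked per image rather than per shared layer, and an image that is larger than the whole quota is still cached after everything else has been evicted.

```python
from collections import OrderedDict
from typing import List


class LruImageCache:
    """Toy model of a quota-bound image cache with LRU eviction."""

    def __init__(self, quota_bytes: int) -> None:
        self.quota_bytes = quota_bytes
        # Maps image name -> size in bytes, least recently used first.
        self._images: "OrderedDict[str, int]" = OrderedDict()

    @property
    def used_bytes(self) -> int:
        return sum(self._images.values())

    def touch(self, name: str) -> None:
        """Mark a cached image as most recently used (e.g. on a cache hit)."""
        self._images.move_to_end(name)

    def add(self, name: str, size_bytes: int) -> List[str]:
        """Cache an image, evicting LRU images first; return the evicted names."""
        self._images.pop(name, None)  # re-adding replaces the old entry
        evicted: List[str] = []
        while self._images and self.used_bytes + size_bytes > self.quota_bytes:
            victim, _ = self._images.popitem(last=False)  # oldest entry
            evicted.append(victim)
        # The image is stored even if it alone exceeds the quota.
        self._images[name] = size_bytes
        return evicted
```

With a quota of 500 units, adding a 700-unit image evicts every other image and still leaves the cache over quota.
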
### Constraints

* If quota management is enabled for a proxy cache organization, the image a user wants to pull can be larger than the quota itself: say the organization has a maximum quota of 500 MB and the image is 700 MB.
  In such a case, the pulled image will still be cached and will overflow the quota limit.

### Risks and Mitigations

* The cached images should have all the properties that images in a regular Quay repository have.
* When an upstream image already exists in the cache, only the updated layers are to be pulled.

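Because image layers are identified by their content digest, "only the updated layers" can be read as the upstream layer digests that are not yet in the cache. The helper below is a minimal sketch of that comparison; the function name and the way the layer lists are obtained are assumptions, not part of this proposal.

```python
from typing import List, Set


def layers_to_fetch(upstream_layers: List[str], cached_layers: Set[str]) -> List[str]:
    """Return the upstream layer digests missing from the cache, in manifest order."""
    return [digest for digest in upstream_layers if digest not in cached_layers]
```

Keeping the manifest order in the result means the missing layers can be streamed to the client in the order the manifest lists them.
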
### Test Plan

* Implement image pull and save layers to the cache.
* Implement image pull of an outdated image and save only the changed layers to the cache.
* Implement configuration of a proxy cache organization.
* Implement quota management.

### Implementation History

* 2021-12-13 Initial proposal