Docker Hub pull rate limits #3099

Open
6 of 12 tasks
rnorth opened this issue Aug 14, 2020 · 37 comments
Labels
resolution/acknowledged resolution/waiting-for-info Waiting for more information of the issue author or another 3rd party.

Comments

@rnorth
Member

rnorth commented Aug 14, 2020

N.B. This issue description will be updated as new information becomes available.

As of 2020-08-13, Docker have updated their terms of service and pricing page, indicating that:

  • unauthenticated pulls will be rate limited to 100 per 6h
  • authenticated pulls will be rate limited to 200 per 6h

The page on rate limits clarifies that this applies to layers being pulled, not images. Since most images comprise multiple layers, the effective image pull rate limit will be very low.
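To make that arithmetic concrete, here is a rough sketch in Java. The figure of five layers per image is purely an illustrative assumption, not a measured value; the 100 and 200 limits are the ones quoted above, and `EffectivePullLimit` is a hypothetical name.

```java
// Back-of-the-envelope estimate of how many whole images fit into a
// layer-based rate limit. LAYERS_PER_IMAGE is a made-up figure.
final class EffectivePullLimit {
    static final int LAYERS_PER_IMAGE = 5; // hypothetical average

    // Number of complete images that can be pulled within one window
    // if every layer pull counts against the limit.
    static int imagesPerWindow(int layerLimit, int layersPerImage) {
        if (layersPerImage <= 0) {
            throw new IllegalArgumentException("layersPerImage must be positive");
        }
        return layerLimit / layersPerImage;
    }

    public static void main(String[] args) {
        // 100 / 5 = 20 and 200 / 5 = 40 images per 6h window
        System.out.println("unauthenticated: " + imagesPerWindow(100, LAYERS_PER_IMAGE));
        System.out.println("authenticated: " + imagesPerWindow(200, LAYERS_PER_IMAGE));
    }
}
```

Under that assumption, a CI environment that discards its image cache would exhaust the unauthenticated allowance after roughly 20 image pulls per window.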

The rate limit page is inconsistent about the number, but also states that these rate limits are being introduced gradually.

Based on a recent build it does not appear that these rate limits are being enforced yet. However, it seems as though they will be in effect by 2020-11-01, so we need to understand the implications for various groups.

Implications

  • For environments where images are persisted between builds, the rate limit may not be hit. For ephemeral CI environments where images are discarded between builds, it seems more likely that the rate limit will be hit.

  • In general, all user groups may need to either:

    • use an authenticated Docker Hub account for builds. This page describes how this can be set up.
      • testcontainers-java has good support for authenticated image pulls. Other language forks will need work done.
      • Docker Hub's Free/Pro/Team plans will allow 300/unlimited/unlimited layer pulls per 6h.
    • OR, users may wish to set up their own registry to mirror/copy images which they require. This is already possible, and work is underway within Testcontainers to continue to make this easier.
  • user groups with low image pulling requirements (e.g. few images or a low rate of builds) will presumably not be impacted.

For open source projects using Testcontainers within their builds (including Testcontainers itself)

For companies using Testcontainers within their builds

  • If Hub's rate limiting is based on IP address, and if developers share an IP address, it seems possible that rate limits could apply to developers as well as CI infrastructure.
  • For projects that set up authenticated pulls in their CI infrastructure, it is also not clear what approach should be followed for repository forks. Using build secrets to hold Docker Hub auth tokens means they will not be available to forks, although in a company environment this may be more manageable.
  • Companies may be more likely to have private docker registries (e.g. ECR, GCR) which they can copy required images to. Testcontainers will implement an image substitution mechanism to allow private registry image names to be used as substitutes for public Docker Hub images (see below).

Actions for users

  • Be mindful that rate limits may apply soon, and some preparation will be needed. Consider the above options and implement.
  • Please help clarify open questions if you have additional information.

Actions for Testcontainers team

  • Set up authenticated pulls for Testcontainers' own builds (N.B. this refers to adding docker login to our CI scripts. Authenticated pull support is already implemented in testcontainers-java)

    • GitHub Actions - skip, not needed
    • CircleCI, Azure Pipelines for Linux, Travis - switch to master branch only and use secrets
    • Azure Pipelines for Windows (triggered by maintainer PR comments) - use secrets
  • Implement transparent 'image substitution' (whereby Testcontainers will consult a user-provided piece of code to obtain an alias for an image, which may reside on a private registry). PR: Image substitution #3102

  • Investigate the possibility of providing an implementation of image substitution that treats GitHub Packages as a cache.

  • Retrofit to language forks: authenticated image pulls and image substitution

  • Improve documentation around authenticated image pulls and options for mirroring registries / copying images to non-Hub registries

Open questions

  • When will rate limits noticeably kick in? Unclear: nominally 1st November, but some Docker/Testcontainers users have reported rate limiting already

    • Docker have said that IPs pulling the highest volume will be experiencing the rate limit earlier. All other signs point to 1st November being the start.
  • Are CI platforms going to attempt to mitigate this?

    • GitHub Actions: YES
    • Others, no information yet
  • Is there a separate Free plan for Open Source projects that allows unlimited pulls? Yes, see https://www.docker.com/blog/scaling-docker-to-serve-millions-more-developers-network-egress/

  • How should repo forks manage the need to provide Hub credentials?

    • Aside from GitHub Actions providing a different approach to mitigation, it appears there is no solution to this problem for other CI platforms.
    • teams using their own CI executors may find other mitigations, such as configuring a local docker registry server to act as an authenticated mirror.
@rnorth rnorth pinned this issue Aug 14, 2020
@rnorth rnorth added resolution/waiting-for-info Waiting for more information of the issue author or another 3rd party. resolution/acknowledged labels Aug 14, 2020
@rnorth
Member Author

rnorth commented Aug 14, 2020

One idea: for fork builds on GitHub Actions, we could look at using GitHub Packages as a cache for required images. Images pushed to GitHub Packages should be readable on forks, so the main (automatable) work would be:

  • to ensure that an OSS project's docker dependencies are published to GitHub Packages
  • to ensure that image substitution is performed to rewrite references to the original image name with a corresponding GitHub Package URI.

Both could be accomplished using the pluggable image substitution mechanism that we have in mind.

I'll add an action above for us to investigate.

@vlsi

vlsi commented Aug 14, 2020

AFAIK, GitHub registry requires API token even for image pulls from public repositories, so I'm not sure images would be readable on forks.

@rnorth
Member Author

rnorth commented Aug 15, 2020

@vlsi you might be right, so I'd like to do some investigation before putting a lot of effort in. My understanding of the docs was that forks can access a GITHUB_API_TOKEN secret which is a different, read-only, token. It's quite possible that I'm wrong.

@WtfJoke

WtfJoke commented Aug 15, 2020

@vlsi, @rnorth is correct: the GITHUB_TOKEN provided by GitHub Actions has read-only access on forked repos.

See https://docs.github.com/en/actions/configuring-and-managing-workflows/authenticating-with-the-github_token#permissions-for-the-github_token

@fongie

fongie commented Aug 25, 2020

Just FYI, this started happening for us today in our AWS CodeBuild pipeline: tests fail because the Docker Hub limit is exceeded, so some sort of rate limiting is being enforced already (not waiting until November).

@rnorth
Member Author

rnorth commented Aug 25, 2020

@fongie that's alarming and disappointing - my understanding was that this was not being applied yet.

Based on a test branch (#3098) I get the impression that GitHub Actions and CircleCI might not be getting rate limited yet.

The advice would remain the same - to use an authenticated account for pulling or to use a mirror/copy in another registry. I'm saddened that we've not had time to release our image substitution yet, though.

@rnorth
Member Author

rnorth commented Aug 25, 2020

Updated above with link to https://www.docker.com/blog/scaling-docker-to-serve-millions-more-developers-network-egress/ which clarifies:

  • that rate limits will be based on image pulls, not layers.
  • an open source project plan will be available and Docker have an application form here: https://forms.gle/vvKURDTYwok7Pc4r5

From my perspective, two major topics remain unresolved:

  • for open source projects, how can Docker Hub auth credentials be shared with forks of repos for PR builds? Will CI platform vendors step in with a solution?
  • we want to ship transparent image substitution in Testcontainers. If we have until November I'd be confident in this; if rate limiting is being applied much sooner then we do not have time.

@fongie I've reached out to a contact at Docker to see if we can get clarification on when the rate limiting is supposed to be taking effect.

@fongie

fongie commented Aug 26, 2020

Great! Sorry for not being more specific last post, I was busy trying to get our release out hehe

This is the error message we got:
Caused by: com.github.dockerjava.api.exception.DockerClientException: Could not pull image: error pulling image configuration: toomanyrequests: Too Many Requests. Please see https://docs.docker.com/docker-hub/download-rate-limit/

in AWS CodeBuild, trying to pull 'postgres:10.7'

It worked on the third attempt simply by retrying. I understand that there is the option to set up an authenticated account; I just don't have the time for that right now. I would guess this is happening to more people and you might get more questions about this issue, so it might be nice to prepare thorough instructions for how to do it. If I log in to another Docker registry in my CodeBuild environment, will Testcontainers automatically use that to pull from when I do new PostgreSQLContainer("postgres:10.7"), or do I need to instruct Testcontainers on which registry to use?
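Since a plain retry got this build through, a bounded retry with backoff is one possible stopgap. The sketch below is plain Java and uses no Testcontainers API. `PullRetry`, `withRetry` and `isRateLimited` are hypothetical names, and it assumes the failure surfaces as a RuntimeException whose message (or a cause's message) contains the `toomanyrequests` text shown in the error above.

```java
import java.util.function.Supplier;

// Illustrative retry helper; not part of Testcontainers or docker-java.
final class PullRetry {

    // True if the failure, or any of its causes, mentions the rate-limit marker.
    static boolean isRateLimited(Throwable failure) {
        for (Throwable t = failure; t != null; t = t.getCause()) {
            String message = t.getMessage();
            if (message != null && message.contains("toomanyrequests")) {
                return true;
            }
        }
        return false;
    }

    // Runs the action up to maxAttempts times, doubling the delay after each
    // rate-limited failure. Any other failure is rethrown immediately.
    static <T> T withRetry(Supplier<T> action, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts || !isRateLimited(e)) {
                    throw e;
                }
            }
            sleep(delay);
            delay *= 2;
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated pull that is rate limited twice and then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("toomanyrequests: Too Many Requests.");
            }
            return "pulled";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts"); // pulled after 3 attempts
    }
}
```

Retrying only hides the symptom and uses up more of the allowance, so it is a short-term measure at best.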

@rnorth
Member Author

rnorth commented Sep 1, 2020

@fongie sorry for the slow response. If you log in to Docker Hub then Testcontainers will be able to use that hub account for pulling.

If you log in to another docker registry (e.g. ECR) then that does not help directly, but it could be made to work. ECR does not function as a pull-through cache (AFAIK), so you'd have to copy the required image into ECR and use new PostgreSQLContainer("your.ecr.registry.amazonaws.com/somepath/postgres:10.7") to create the container. This is a bit of a pain, which is why I'm working on #3102 to at least help with the second aspect.

Logging in to Docker Hub requires fewer changes.

@rnorth
Member Author

rnorth commented Sep 1, 2020

For GitHub Actions users, this issue looks worth following: actions/runner-images#1445

@rnorth
Member Author

rnorth commented Oct 24, 2020

Really good news for GitHub Actions users: actions/runner-images#1445 (comment)

@aaronjwhiteside
Contributor

Would it not make sense to configure a registry mirror the same way native docker does?

We have Nexus set up to mirror the official Docker Hub registry, and we have configured our Jenkins slaves to point towards this mirror.

# docker info
....
 Registry Mirrors:
  https://<internal_company_mirror_here>:5000/
...

Although it's nice that there will be a piece of callback code that we can hook into to modify the image name before it's fetched, would it not be easier to allow a registry mirror to be configured in ~/.testcontainers.properties that would act as the default host to fetch images from that do not explicitly specify a host in the image name?

This seems like what 99% of people would want, and it would stop everyone implementing the same code to do it.
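A minimal sketch of that default-host rule in plain Java. It assumes the usual Docker naming convention, where the first path component is a registry host only if it contains a dot or a colon or is `localhost`. `RegistryPrefixer` and the host `registry.example.com:5000` are hypothetical, and this is not Testcontainers' implementation.

```java
// Illustrative only: applies a default registry to image names that do not
// already name one.
final class RegistryPrefixer {

    // Docker treats the first path component as a registry host only when it
    // contains '.' or ':' or is exactly "localhost".
    static boolean hasExplicitRegistry(String image) {
        int slash = image.indexOf('/');
        if (slash < 0) {
            return false; // e.g. "postgres:10.7"
        }
        String first = image.substring(0, slash);
        return first.contains(".") || first.contains(":") || first.equals("localhost");
    }

    // Prepends the mirror prefix to images that would otherwise be pulled
    // from Docker Hub; images with an explicit registry are left untouched.
    static String withDefaultRegistry(String image, String prefix) {
        if (hasExplicitRegistry(image)) {
            return image;
        }
        String normalized = prefix.endsWith("/") ? prefix : prefix + "/";
        return normalized + image;
    }

    public static void main(String[] args) {
        String mirror = "registry.example.com:5000"; // hypothetical mirror host
        System.out.println(withDefaultRegistry("postgres:10.7", mirror));
        System.out.println(withDefaultRegistry("testcontainers/ryuk", mirror));
        System.out.println(withDefaultRegistry("quay.io/some/image:1.0", mirror));
    }
}
```

With this rule, `postgres:10.7` and `testcontainers/ryuk` gain the mirror prefix, while `quay.io/some/image:1.0` is returned unchanged because it already names a registry.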

@bsideup
Member

bsideup commented Nov 17, 2020

@aaronjwhiteside registry mirror is the easiest way to "fix" the rate limits issue, yes. There is no need to implement any code for it and it can be set in Docker's settings already today.

Also, there is #3413 that implements a default, prefixing substitutor, for those who don't have access to Docker's settings.

@aaronjwhiteside
Contributor

@bsideup Interesting, we have a registry mirror set but I still see rate limit errors while fetching images, I'll have to dig into our build system and try and figure out what is going on.

#3413 looks promising!

@brianwyka

brianwyka commented Nov 19, 2020

I'm seeing the same behavior as @aaronjwhiteside on our internal network with registry mirrors set up.

@rnorth, @bsideup we are prefixing our containers, such as kafka with our internal registry, which Testcontainers is picking up, however we still get the rate limit errors. Something else we need to do?

@bsideup
Member

bsideup commented Nov 19, 2020

@brianwyka

@bsideup, That circularly brought me back here 😆 . We are on testcontainers 1.14.x. Maybe we need to update to 1.15.0 ??

@bsideup
Member

bsideup commented Nov 19, 2020

@brianwyka oh, yes, definitely. 1.14.x is from pre-ratepocalypse era :D

@aaronjwhiteside
Contributor

@brianwyka @bsideup
I think what is happening is that when Docker gets an error while pulling from the configured registry mirror, it falls back to going directly to Docker Hub. I checked our Nexus logs and found it was receiving the rate limit error too.

I'm not sure if this is documented behaviour, I haven't checked, though a little unexpected it kinda makes sense.
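The fallback described above can be modelled roughly as follows. This is a sketch of the observed behaviour in plain Java, not Docker's actual code; `MirrorFallback`, `pullWithFallback` and the endpoint labels are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Illustrative model of "try the mirror, then fall back to the hub".
final class MirrorFallback {

    // Tries each endpoint in order and returns the first successful result.
    // If every endpoint fails, the last failure is rethrown.
    static <T> T pullWithFallback(List<String> endpoints, Function<String, T> pull) {
        if (endpoints.isEmpty()) {
            throw new IllegalArgumentException("at least one endpoint is required");
        }
        RuntimeException last = null;
        for (String endpoint : endpoints) {
            try {
                return pull.apply(endpoint);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try the next endpoint
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        List<String> endpoints = Arrays.asList("mirror", "hub");
        try {
            // Both the mirror (rate limited upstream) and the hub refuse the pull.
            pullWithFallback(endpoints, endpoint -> {
                throw new IllegalStateException("toomanyrequests from " + endpoint);
            });
        } catch (IllegalStateException e) {
            System.out.println("pull failed: " + e.getMessage()); // pull failed: toomanyrequests from hub
        }
    }
}
```

In this model a mirror that is itself rate limited upstream offers no protection: the client moves on to the hub and reports the hub's error.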

@poznas

poznas commented Nov 23, 2020

We solved the issue by forcing the docker-java library (which is used under the hood) to point to the right registry:

src/test/resources/docker-java.properties

registry.url=your.artifactory.domain/some-path/

@rnorth
Member Author

rnorth commented Nov 24, 2020

That's really interesting. I wonder if we could/should obtain the daemon's registry URL from the info endpoint and use that automatically. That may not be universally applicable though...

At the very least we should document this.

@brianwyka

@poznas, we're using that docker-java.properties configuration in addition to testcontainers.properties. I ran into the rate limit problem with just the docker-java.properties configuration present...

@poznas

poznas commented Nov 30, 2020

@brianwyka, for one of our repos this solution also did not work.
Tired of trying to track down the cause, I just threw testcontainers away 😁
I replaced it with a slightly more manual approach: gradle-docker-plugin

@dargiri

dargiri commented Dec 14, 2020

@poznas IMHO moving away from Testcontainers to the Gradle and Maven Docker plugins is two steps back. It significantly degrades the development experience.

@Fraserhardy

Are there any plans to publish the Testcontainers images to the ECR public gallery? See https://aws.amazon.com/about-aws/whats-new/2020/12/announcing-amazon-ecr-public-and-amazon-ecr-public-gallery/

This could be a good solution for many who have their CI on AWS environments as it provides free data transfer if you're on AWS.

@bsideup
Member

bsideup commented Dec 16, 2020

@Fraserhardy testcontainers/* is exempt from rate limits on Docker Hub already.

@jvegarag

@bsideup now that the Testcontainers images are whitelisted in Docker Hub, can this issue be considered closed? Do you still recommend updating to 1.15.1 and using the "image name prefix" solution to point to an internal registry? Thanks

@gubbaraviteja

@bsideup We are getting rate limit issues in AWS CodeBuild. If testcontainers/* are exempt from the rate limit, I don’t understand why we are getting this error. Below are the logs.

we are using testcontainers 1.15.1

2021-01-19T09:57:18,596 546 [main] WARN o.t.u.TestcontainersConfiguration - Attempted to read Testcontainers configuration file at file:/root/.testcontainers.properties but the file was not found. Exception message: FileNotFoundException: /root/.testcontainers.properties (No such file or directory) 
2021-01-19T09:57:18,634 584 [main] INFO o.t.d.DockerMachineClientProviderStrategy - docker-machine executable was not found on PATH ([/root/.goenv/shims, /root/.goenv/bin, /go/bin, /root/.phpenv/shims, /root/.phpenv/bin, /root/.pyenv/shims, /root/.pyenv/bin, /root/.rbenv/shims, /usr/local/rbenv/bin, /usr/local/rbenv/shims, /root/.dotnet/, /root/.dotnet/tools/, /usr/local/sbin, /usr/local/bin, /usr/sbin, /usr/bin, /sbin, /bin, /opt/tools, /usr/local/android-sdk-linux/tools, /usr/local/android-sdk-linux/tools/bin, /usr/local/android-sdk-linux/platform-tools, /codebuild/user/bin]) 
2021-01-19T09:57:19,445 1395 [main] INFO o.t.d.DockerClientProviderStrategy - Found Docker environment with local Unix socket (unix:///var/run/docker.sock) 
2021-01-19T09:57:19,492 1442 [main] INFO o.t.utility.ImageNameSubstitutor - Image name substitution will be performed by: DefaultImageNameSubstitutor (composite of 'ConfigurationFileImageNameSubstitutor' and 'PrefixingImageNameSubstitutor') 
2021-01-19T09:57:20,429 2379 [docker-java-stream-1239536715] ERROR c.g.d.a.a.ResultCallbackTemplate - Error during callback 
com.github.dockerjava.api.exception.InternalServerErrorException: Status 500: {"message":"toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit"} 
at org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:247) 
at org.testcontainers.shaded.com.github.dockerjava.core.DefaultInvocationBuilder.lambda$executeAndStream$1(DefaultInvocationBuilder.java:269) 
at java.base/java.lang.Thread.run(Thread.java:834)

should we copy the testcontainer images to our internal registry and use the 'image name prefix' solution?

@adambriny

@Fraserhardy testcontainers/* is exempt from rate limits on Docker Hub already.

@bsideup That's good news, but many internally used images are not from that repo, like alpine or localstack.

I think a single property in testcontainers.properties would be a handy way to set a registry for all the internal images listed in org.testcontainers.utility.TestcontainersConfiguration.

Because now I have to do this one-by-one in testcontainers.properties, like:

ambassador.container.image=my-registry.com/richnorth/ambassador
socat.container.image=my-registry.com/alpine/socat
vncrecorder.container.image=my-registry.com/testcontainers/vnc-recorder
...etc...

which is cumbersome and fragile for changes.

Something similar to the docker-java solution, but I don't want to directly depend on testcontainers' third-parties either.

we solved the issue by forcing docker-java (which is used under the hood) library to point to the right registry

src/test/resources/docker-java.properties

registry.url=your.artifactory.domain/some-path/

@adambriny

adambriny commented Jan 19, 2021

Thank you @bsideup!
It's my fault, I was looking for the rate limit keywords only.. 🤦‍♂️
Maybe it would be easier to find if it was mentioned on https://www.testcontainers.org/supported_docker_environment/image_registry_rate_limiting/ as a possible solution...

@bsideup
Member

bsideup commented Jan 19, 2021

@adambriny good idea!

P.S. contributions are more than welcome 😊

@bademux

bademux commented Mar 25, 2021

@perrin4869

I just tried to use ECR as the Testcontainers prefix (TESTCONTAINERS_HUB_IMAGE_NAME_PREFIX=public.ecr.aws/), but unfortunately the images aren't publicly available there. I found some random public repositories in ECR, such as TESTCONTAINERS_HUB_IMAGE_NAME_PREFIX=public.ecr.aws/bigeye/, which host a mirror of the Testcontainers images, but since those aren't official, I can't use them at work... Is there any chance that Testcontainers could set up a public repo in ECR as well in the future?

@rnorth
Member Author

rnorth commented Feb 22, 2022

Hi @perrin4869
The testcontainers org is in Docker's open source program, so is supposed to be exempt from rate limits... Are you seeing rate limiting occurring?

@rnorth
Member Author

rnorth commented Feb 22, 2022

(Just to clarify, this means that images like testcontainers/ryuk should be exempt. Other images may still have rate limits)

@PSanetra

@rnorth As far as I can see, the ryuk image is not part of the open source program. The open source program label is missing on the docker hub page:

Comparing:
https://hub.docker.com/r/testcontainers/ryuk is missing that label
https://hub.docker.com/r/fluent/fluent-bit has that label
