ci: docs publishing fails due to docker api limit #4924

Closed
conorsch opened this issue Nov 15, 2024 · 2 comments
Labels
A-CI/CD Relates to continuous integration & deployment of Penumbra

Comments

@conorsch
Contributor

The CI job for publishing docs changes is failing:

docker-api-failure

We've encountered this on other jobs before, when interacting with Docker Hub directly, and the solution was to make authenticated requests instead. Will modify the docs job to do the same, which should resolve the issue. Filing this ticket just so it's linkable if and when the problem occurs again.

@conorsch conorsch self-assigned this Nov 15, 2024
@github-actions github-actions bot added the needs-refinement unclear, incomplete, or stub issue that needs work label Nov 15, 2024
@conorsch conorsch added A-CI/CD Relates to continuous integration & deployment of Penumbra and removed needs-refinement unclear, incomplete, or stub issue that needs work labels Nov 15, 2024
@conorsch
Contributor Author

> the solution was to make authenticated requests instead.

Not so simple this time: the pull attempt is happening on the container that the firebase-action helper uses, and GitHub CI pulls all containers in a job before running the first step of that job, which is where the rate limit is triggered. Therefore even if we add a docker login action, it won't run early enough to affect the preparatory image pulls.

Notably the official GHA runners automatically receive a docker login token with much higher rate limits. Since we use BuildJet runners, however, we don't enjoy the same increased rate limits.
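
For anyone debugging this from a runner later, here is a minimal sketch of how to check which Docker Hub pull limit applies. It assumes the `ratelimitpreview/test` check procedure from Docker's rate-limit documentation and the `ratelimit-limit` / `ratelimit-remaining` response headers described there. The function names `parse_ratelimit` and `check_limits` are hypothetical, and nothing like this is part of the Penumbra CI.

```python
import base64
import json
import urllib.request

# Endpoints taken from Docker's documented rate-limit check (assumed unchanged).
TOKEN_URL = (
    "https://auth.docker.io/token"
    "?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull"
)
MANIFEST_URL = "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest"


def parse_ratelimit(value):
    """Parse a header value such as '100;w=21600' into (count, window_seconds)."""
    count, _, rest = value.partition(";")
    window = None
    for param in rest.split(";"):
        key, _, val = param.strip().partition("=")
        if key == "w" and val.isdigit():
            window = int(val)
    return int(count), window


def check_limits(username=None, password=None):
    """Return the pull limit and remaining pulls as seen from this machine.

    Without credentials the request is anonymous, so the limit is tied to
    the source IP. With credentials it is tied to the Docker Hub account.
    """
    token_req = urllib.request.Request(TOKEN_URL)
    if username and password:
        creds = base64.b64encode(f"{username}:{password}".encode()).decode()
        token_req.add_header("Authorization", f"Basic {creds}")
    with urllib.request.urlopen(token_req, timeout=10) as resp:
        token = json.load(resp)["token"]

    # A HEAD request is used so that the check itself is not a real pull.
    head = urllib.request.Request(MANIFEST_URL, method="HEAD")
    head.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(head, timeout=10) as resp:
        limit = resp.headers.get("ratelimit-limit")
        remaining = resp.headers.get("ratelimit-remaining")

    # The headers may be absent, in which case None is reported.
    return {
        "limit": parse_ratelimit(limit) if limit else None,
        "remaining": parse_ratelimit(remaining) if remaining else None,
    }
```

Calling `check_limits()` on a BuildJet runner and again with Docker Hub credentials would show whether the anonymous, per-IP quota is the one being exhausted by the preparatory image pulls.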

Intriguingly, there are reports that Docker Hub only recently (i.e. within the past few days) started enforcing IPv6 rate limits, which could explain the sudden change.

I've rerun the job in question and it passed fine. I suspect we'll see these failures periodically, but I'm not taking further action right now, since a lot of folks are already reporting it and debugging accordingly. Will keep an eye on the actions list and report results in here.

@conorsch conorsch changed the title ci: docs publishings due to docker api limit ci: docs publishing fails due to docker api limit Nov 15, 2024
@conorsch
Contributor Author

This error hasn't reoccurred over the last week, so I'm closing. One can view the historical docs jobs here when checking up in the future. I did notice one unrelated error:

The self-hosted runner: buildjet.com_a2719040-1fb1-4e01-a1cc-a0605f243efd lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.

which looks more like a BuildJet flake than a recurring issue.
