Docker in Docker stopped working #31889
Comments
/sig testing |
IIUC, to fix this we can either:
in the |
CC @BenTheElder |
Same here: all our Cluster API Provider Azure jobs started failing with the same issue above. See kubernetes-sigs/cluster-api-provider-azure#4553. |
/priority critical-urgent |
This also changed when we picked up a new Docker version, right? (I haven't had time to look myself.) We probably also want to revert Docker to 24.x overall in the meantime; we've had other 25.x breakage like kubernetes-sigs/kind#3487 |
Exactly, it started failing with Docker 25.0.0. Could this get prioritized? It completely blocks submission of any pull requests in k/k as well. |
The images and job configs are in this repo, PRs welcome? |
The quickest option is to roll back the images in the job config. Ideally, whoever rolled them forward would do that. |
…images and k8s-staging-test-infra AR images" This reverts commit 2bf070c. See: kubernetes#31889
We'll also need to actually deal with the docker-in-docker changes, but this should've been rolled back first, then we can roll forward with a fix. |
If you see something like this in the future, please ping #testing-ops to raise visibility faster than the issue tracker. |
#sig-testing slack discussing roll forward |
Here's how to test the kubekins image:
When you get dropped into the command line, you can try starting the docker daemon, printing the version, etc.
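The exact command from this comment is not preserved here, so the following is only a minimal sketch of one way to get a shell in such an image. The helper name `debug_shell`, the `DOCKER` override variable, the `--privileged` flag, and overriding the entrypoint with `/bin/bash` are all assumptions, not the commenter's original invocation. The image reference is deliberately left as an argument.

```shell
#!/bin/sh
# Hypothetical helper: open an interactive shell in a CI image for debugging.
# DOCKER can be overridden (for example DOCKER=echo) to print the arguments
# instead of running them.
debug_shell() {
  image=$1
  if [ -z "$image" ]; then
    echo "usage: debug_shell IMAGE" >&2
    return 2
  fi
  # --privileged is assumed here because a nested docker daemon usually needs it.
  # --entrypoint /bin/bash assumes the image ships bash.
  ${DOCKER:-docker} run --rm -it --privileged --entrypoint /bin/bash "$image"
}

# Inside the container, the checks mentioned above would be along the lines of:
#   service docker start
#   docker version
```

Calling `DOCKER=echo debug_shell some/image:tag` prints the arguments without starting a container, which is handy for checking the flags first.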
|
Looks like we need to fix the currently it looks like:
We need to switch from |
I'm having a nasty issue in kubernetes-sigs/jobset#400 and I wonder if it's related. I'm seeing a segmentation fault in controller tools, but it's not clear to me whether this has been fixed for my jobs yet. |
I wouldn't expect a unit test job to have docker in docker enabled, is it even using docker? That looks like a nil deref error in the code under test? |
DinD should be fixed now, I believe @dims's roll-forward fix is now out? |
Yeah, we could move things around, but we use docker to generate some Python SDK and run some unit tests in Python. Either way, I also see the same failure in the e2e tests where I am using kind: https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_jobset/400/pull-jobset-test-e2e-main-1-29/1756049401039556608 So I don't think the unit test code is the issue. We are seeing this on a few of our PRs, so it's not related to any code in the PRs. |
Maybe it's just a coincidence. I'll look into this failure on my end. |
Circling back to this: @alculquicondor pointed out to me that the images we use for these jobs use golang 1.22, and it seems to have broken the controller-gen/controller-tools version I was using. I updated my controller-tools to 0.14.0 and the CI for jobset seems happy again. kubernetes-sigs/jobset#403 |
That sounds right, see: FWIW, with Go 1.21+ you can now use either the go.mod toolchain directive or the GOTOOLCHAIN env var to control the Go version. I'm going to close this as fixed for now; we can follow up on another issue if we should revert to 1.22 |
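A minimal sketch of the two mechanisms mentioned above, assuming a placeholder version `go1.21.7` and a placeholder module path; the helper names `write_go_mod` and `with_pinned_go` are hypothetical. One caveat worth keeping in mind: under the default `GOTOOLCHAIN=auto`, the `toolchain` line in go.mod only acts as a minimum, so a newer local Go is still used as-is. Forcing an older toolchain therefore needs the environment variable.

```shell
#!/bin/sh
# Placeholder version; substitute whichever release your tools support.
PINNED="go1.21.7"

# Write a minimal go.mod carrying a toolchain directive into directory $1.
# With GOTOOLCHAIN=auto this only raises the toolchain when the local Go is older.
write_go_mod() {
  dir=$1
  cat > "$dir/go.mod" <<EOF
module example.com/hypothetical

go 1.21.0

toolchain $PINNED
EOF
}

# Run any command with the toolchain forced through the environment.
# The go command then selects (and downloads if needed) exactly $PINNED.
with_pinned_go() {
  GOTOOLCHAIN="$PINNED" "$@"
}
```

For example, `with_pinned_go go version` should report the pinned release even when the image ships a newer Go, provided the go command can download that toolchain.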
What happened:
Docker in Docker doesn't work, e.g. in https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-kubemark-500-gce/1755589711310622720 we have the following error on DinD enablement:
What you expected to happen:
DinD works.
How to reproduce it (as minimally and precisely as possible):
Execute any Prow job with an image based on the bootstrap image and DOCKER_IN_DOCKER_ENABLED set to true:
test-infra/images/bootstrap/runner.sh
Lines 63 to 86 in a6299e8
service docker start simply fails with /etc/init.d/docker: 62: ulimit: error setting limit (Invalid argument).
Please provide links to example occurrences, if any:
https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-kubemark-500-gce/1755589711310622720
Anything else we need to know?:
I see this is an issue in the newest Docker: docker/cli#4807.
Some workaround is mentioned in https://forums.docker.com/t/etc-init-d-docker-62-ulimit-error-setting-limit-invalid-argument-problem/139424.
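For context on the error message itself, here is a small sketch of one general way a shell's ulimit builtin ends up reporting "Invalid argument". The kernel rejects a hard limit that is below the current soft limit, while lowering both limits together is accepted. Whether this is exactly what line 62 of the init script runs into is an assumption here, not a confirmed diagnosis, and the function names are hypothetical.

```shell
#!/bin/sh
# Bodies written as "( ... )" run in a subshell, so the caller's limits never change.

# Print a value just below the current soft limit on open files.
below_soft() {
  soft=$(ulimit -Sn)
  case "$soft" in
    unlimited) echo 1024 ;;
    *) echo $((soft - 1)) ;;
  esac
}

# Returns 0 when lowering ONLY the hard limit below the soft limit is rejected.
hard_only_is_rejected() (
  ! ulimit -Hn "$(below_soft)" 2>/dev/null
)

# Returns 0 when lowering the soft and hard limits together is accepted.
both_limits_accepted() (
  ulimit -n "$(below_soft)"
)
```

Running `hard_only_is_rejected && echo rejected` and `both_limits_accepted && echo accepted` in the failing container is a quick way to see whether the limits there behave this way.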