NVIDIA Container Toolkit Daemonset Fails with ImagePullBackOff #3782
Comments
@kacole2: This issue is currently awaiting triage. If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
In the
We have an E2E test running with the current changes for GPU instances in CAPA clusters; if there were an issue, that test would fail.
/triage needs-information
We use the NVIDIA registry in our tests, which works fine.
Maybe it is a problem with the environment?
CAPA has e2e tests that cover GPU functionality, and there is no evidence this issue was related to CAPA. /triage unresolved
/close
@dlipovetsky: Closing this issue in response to the /close command above.
/kind bug
What steps did you take and what happened:
I am attempting to install the NVIDIA GPU Components following `kubectl apply -f clusterpolicy-crd.yaml` and `kubectl apply -f gpu-operator-components.yaml`. The `nvidia-container-toolkit-daemonset` pod is failing with `Init:ImagePullBackOff`. That will then make the `nvidia-driver-daemonset` fail with `CrashLoopBackOff`. Does this have anything to do with NVIDIA/gpu-operator#388?
What did you expect to happen:
The container images should pull successfully.
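For anyone hitting the same symptom, here is a minimal sketch of how the stuck pods and their pull errors might be narrowed down. The helper names `stuck_pods` and `pod_events` are hypothetical, and the `gpu-operator-resources` namespace in the usage comments is an assumption; substitute whatever namespace the applied manifests actually create. Both helpers only filter text on stdin, and `stuck_pods` assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: keep only pods whose STATUS column (3rd field) shows an
# image-pull or crash-loop state. The header row is skipped. Statuses such as
# "Init:ImagePullBackOff" match too, because the regex match is unanchored.
stuck_pods() {
  awk 'NR > 1 && $3 ~ /(ImagePullBackOff|ErrImagePull|CrashLoopBackOff)/ { print $1 "\t" $3 }'
}

# Hypothetical helper: print everything from the "Events:" line onward in
# `kubectl describe pod` output.
pod_events() {
  awk '/^Events:/ { show = 1 } show { print }'
}

# Usage (namespace is an assumption, <pod-name> is a placeholder):
#   kubectl get pods -n gpu-operator-resources | stuck_pods
#   kubectl describe pod <pod-name> -n gpu-operator-resources | pod_events
```

The Events section normally shows the image reference the kubelet tried to pull and the error it got back, which should help tell a missing image tag apart from a registry or network problem.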
Anything else you would like to add:
Clean environment using Tanzu Kubernetes Grid.
Environment:
Machine type: g4dn.8xlarge
Kubernetes version (`kubectl version`): Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.3", GitCommit:"434bfd82814af038ad94d62ebe59b133fcb50506", GitTreeState:"clean", BuildDate:"2022-10-12T10:47:25Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"darwin/amd64"}
OS (`/etc/os-release`): Ubuntu 20.04