[Failing Test] [sig-apps] ReplicaSet should serve a basic image on each replica with a private image, ReplicationController should serve a basic image on each replica with a private image #97002
Comments
@thejoycekung: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
It seems this is the problem.
Question asked by @aojea here: https://kubernetes.slack.com/archives/C09QZ4DQB/p1606896985218000
Now also affecting:
https://kubernetes.slack.com/archives/C09QZ4DQB/p1606896985218000
The project hosting the GCR repo was swept up by a security audit because it hadn't been properly accounted for. That change has been reverted. Now waiting to see the affected jobs go back to green. We should create a community-owned equivalent project; I'll open a follow-up issue for that.
/assign @BenTheElder @spiffxp @kubernetes/ci-signal
I feel like this should be assigned to someone from CI Signal to track the jobs going green; what's your policy for that?
@spiffxp -- CI Signal should continue to monitor. I think since @krzyzacy restored the project, y'all are in the clear for the time being. 🙃
/assign @justaugustus @hasheddan
@spiffxp -- Opened one here: kubernetes/k8s.io#1458
Hrm, I'm still seeing this fail in downstream repo tests. Are the tests injecting a secret (the only hardcoded GCR secret I see is in …)? Who is able to access that repo? If it was previously "all projects" then I think that wasn't restored correctly. https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/24887/pull-ci-openshift-origin-master-e2e-gcp/1334318300561149952 is a 1.19 codebase trying to run in the openshift-gce-devel-ci GCP project, but is getting access denied.
EDIT: This looks like it has started passing again at midnight EST? Maybe some sort of weird perms propagation issue. DISREGARD
I'll leave it to Ben or Aaron to report on what permissions are configured on the repo (as they now have access to it).
As best as I can tell, we aren't using any imagePullSecrets for the Pods (kubernetes/test/e2e/apps/replica_set.go, line 98 at cea1d4e). So that likely means that the service account we are mounting has the proper credentials attached.
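For context, here is a minimal sketch of the kind of ReplicaSet the test creates — this is my own illustration, not the actual code at replica_set.go line 98, and the image path and function name are placeholders. The point is that the pod template references a private GCR image but sets no ImagePullSecrets, so the pull can only succeed through credentials the node/service account already carries, or through anonymous access to the registry.

```go
// Sketch only: placeholder names, not the exact e2e code referenced above.
package e2esketch

import (
	appsv1 "k8s.io/api/apps/v1"
	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newPrivateImageReplicaSet builds a ReplicaSet whose pods pull a private
// image without any ImagePullSecrets; the kubelet must authenticate using
// the node's own credentials (e.g. the GCE service account) or the registry
// must allow anonymous reads.
func newPrivateImageReplicaSet(name string, replicas int32) *appsv1.ReplicaSet {
	labels := map[string]string{"name": name}
	return &appsv1.ReplicaSet{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: appsv1.ReplicaSetSpec{
			Replicas: &replicas,
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: v1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: v1.PodSpec{
					// Note: no ImagePullSecrets set on the PodSpec.
					Containers: []v1.Container{{
						Name:  name,
						Image: "gcr.io/example-private-project/agnhost:2.21", // placeholder, not the real test image
					}},
				},
			},
		},
	}
}
```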
@spiffxp I think we likely just need to add permissions to …
I confirm that these two tests are flaking a lot and blocking presubmits.
Now that I have access to the project, I'm working on restoring permissions. I had hoped this would be a 10min fix, but it's taking longer than I expected. I can currently list the backing bucket, but cannot list images. |
In the event that I can't get this done expediently for release, I think the release team should consider some alternatives, in order of my personal preference:
OK, thanks to some help from @amwat, I am now able to list images with my Google account.
There are no specific permissions on the bucket for a service account.
Without docker configured for auth: …
Configuring docker to use my personal gcloud account for auth (which has no explicit permissions to this registry): …
So this may be enough to unblock tests.
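As a rough way to double-check that claim, here is a small sketch (my own, not from the thread; the repository path is a placeholder) that lists tags with explicitly anonymous credentials using go-containerregistry. If this succeeds, the registry allows unauthenticated reads, which should be enough for the kubelet to pull the image without any docker auth configured.

```go
// Sketch only: the repository path is a placeholder, not the real test registry.
package e2esketch

import (
	"fmt"

	"github.com/google/go-containerregistry/pkg/authn"
	"github.com/google/go-containerregistry/pkg/name"
	"github.com/google/go-containerregistry/pkg/v1/remote"
)

// checkAnonymousRead lists tags using explicitly anonymous credentials.
// Success implies the repository allows unauthenticated pulls.
func checkAnonymousRead(repository string) error {
	repo, err := name.NewRepository(repository) // e.g. "gcr.io/example-private-project/agnhost"
	if err != nil {
		return err
	}
	tags, err := remote.List(repo, remote.WithAuth(authn.Anonymous))
	if err != nil {
		return fmt.Errorf("anonymous read failed: %w", err)
	}
	fmt.Printf("%s is anonymously readable (%d tags)\n", repository, len(tags))
	return nil
}
```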
I'm starting to see passes for presubmits that were affected by this, looking at https://prow.k8s.io/?repo=kubernetes%2Fkubernetes&job=pull-kubernetes-e2e-gce-ubuntu-containerd
I'm questioning whether we even want to keep these tests around (ref #97026 (comment)), but I don't think that needs to happen for v1.20.0.
Closing this one now since Testgrid has been green for the past week or so.
@thejoycekung: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Which jobs are failing:
ci-kubernetes-e2e-gci-gce
ci-kubernetes-e2e-gce-cos-k8sbeta-default
Which test(s) are failing:
[sig-apps] ReplicaSet should serve a basic image on each replica with a private image
[sig-apps] ReplicationController should serve a basic image on each replica with a private image
Since when has it been failing:
Started failing between 2:04 and 2:40 PM PST on Dec 1
Testgrid link:
https://k8s-testgrid.appspot.com/sig-release-master-blocking#gce-cos-master-default
https://k8s-testgrid.appspot.com/sig-release-1.20-blocking#gce-cos-k8sbeta-default
Reason for failure:
"pod never run"? Looks like both are timing out waiting for containers to be ready:
ReplicaSet should serve a basic image on each replica with a private image
ReplicationController should serve a basic image on each replica with a private image
Anything else we need to know:
Example Spyglass links:
Having trouble finding a good Triage link - will drop one if I can find one.
Wondering whether this has anything to do with the "Pod pending timeout" errors happening on some of the jobs on the 1.20 boards now?
/sig apps
/cc @kubernetes/ci-signal @kubernetes/sig-apps-test-failures