Error 404 updating ArtifactRegistryRepository #339

Closed
currankaushik opened this issue Dec 9, 2020 · 5 comments
Labels
bug Something isn't working

Comments

@currankaushik

Describe the bug
When "transferring" the management of an Artifact Registry Repository from one namespace to another (i.e. abandoning it from the first namespace then creating a new ArtifactRegistryRepository in the second namespace), the controller seems to encounter a 404 error. (Full details in the steps to reproduce below.)

ConfigConnector Version
Run the following command to get the current ConfigConnector version

$ kubectl get ns cnrm-system -o jsonpath='{.metadata.annotations.cnrm\.cloud\.google\.com/version}' 
1.27.1

To Reproduce
Steps to reproduce the behavior:
I created an ArtifactRegistryRepository in one namespace, but later decided to migrate its management to a different namespace. To do this, I added the cnrm.cloud.google.com/deletion-policy: abandon annotation to my ArtifactRegistryRepository, then deleted it using kubectl delete. I then used kubectl apply to recreate the ArtifactRegistryRepository in the new namespace, expecting the new instance to pick up the abandoned Artifact Registry Repository that already exists. Instead, I see this error:

$ kubectl -n cnrm-system logs -f cnrm-controller-manager-0 -c manager
...
{"severity":"info","logger":"artifactregistryrepository-controller","msg":"starting reconcile","resource":{"namespace":"newns","name":"reponame"}}
{"severity":"info","logger":"artifactregistryrepository-controller","msg":"creating/updating underlying resource","resource":{"namespace":"newns","name":"reponame"}}
{"severity":"error","logger":"controller-runtime.controller","msg":"Reconciler error","controller":"artifactregistryrepository-controller","request":"newns/reponame","error":"Update call failed: error applying desired state: summary: Error updating Repository \"projects/myproject/locations/us-east1/repositories/reponame\": googleapi: got HTTP response code 404 with body: <!DOCTYPE html>\n<html lang=en>\n  <meta charset=utf-8>\n  <meta name=viewport content=\"initial-scale=1, minimum-scale=1, width=device-width\">\n  <title>Error 404 (Not Found)!!1</title>\n  <style>\n    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7%!a(MISSING)uto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100%!p(MISSING)x no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0%/100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100%}}#logo{display:inline-block;height:54px;width:150px}\n  </style>\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\n  <p><b>404.</b> <ins>That’s an error.</ins>\n  <p>The requested URL <code>/v1beta1/projects/myproject/locations/us-east1/repositories/reponame?alt=json</code> was not found on this server.  
<ins>That’s all we know.</ins>\n, detail: "}
...

Other potentially relevant information:

  • I set the same cnrm.cloud.google.com/project-id: "myproject" annotation on both Namespaces.
  • In the original namespace, the repo name and namespace name were the same. In the new namespace, they are no longer the same.

YAML snippets:

Original ArtifactRegistryRepository:

apiVersion: artifactregistry.cnrm.cloud.google.com/v1beta1
kind: ArtifactRegistryRepository
metadata:
  name: reponame
  namespace: oldns
  annotations:
    cnrm.cloud.google.com/deletion-policy: abandon
spec:
  format: DOCKER
  location: us-east1

New ArtifactRegistryRepository:

apiVersion: artifactregistry.cnrm.cloud.google.com/v1beta1
kind: ArtifactRegistryRepository
metadata:
  name: reponame
  namespace: newns
spec:
  format: DOCKER
  location: us-east1

Thanks!

currankaushik added the bug label Dec 9, 2020
@caieo
Contributor

caieo commented Dec 10, 2020

Hi @currankaushik, sorry you ran into this. I tried reproducing your issue and noticed that when applying the new ArtifactRegistryRepository, I received an "error obtaining lease" message, which I think might be the reason for the 404 error you saw. Basically, when namespace2 tries to acquire the existing resource, it can't find it because the resource is still technically leased under namespace1. You can read more about this in our Managing Conflicts concepts page.

In the meantime, to prevent this, I'd recommend adding this annotation to the first instance of your resource:

annotations:
  cnrm.cloud.google.com/management-conflict-prevention-policy: "none"

so that you can immediately acquire it on the second one.

Also, the lease expires after 40 minutes, so there's a good chance you can acquire the resource now without any issues (if it's been deleted from namespace1 for a while).
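For illustration, here is a minimal sketch of what the first (oldns) resource could look like with that annotation added alongside the existing abandon policy. The manifest fields are copied from the YAML in the original report; the file name repo-oldns.yaml is a hypothetical choice, and the kubectl apply step is left as a comment because it needs access to the cluster.

```shell
# Write the original (oldns) manifest with leasing disabled.
# repo-oldns.yaml is a hypothetical file name used only for this sketch.
cat > repo-oldns.yaml <<'EOF'
apiVersion: artifactregistry.cnrm.cloud.google.com/v1beta1
kind: ArtifactRegistryRepository
metadata:
  name: reponame
  namespace: oldns
  annotations:
    cnrm.cloud.google.com/deletion-policy: abandon
    cnrm.cloud.google.com/management-conflict-prevention-policy: "none"
spec:
  format: DOCKER
  location: us-east1
EOF

# Then apply it against the cluster:
#   kubectl apply -f repo-oldns.yaml
```

The value "none" is quoted, as in the snippet above, so that it is passed as a string.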

@currankaushik
Author

Thanks @caieo for this context and the link to the documentation. A few follow-up questions:

  1. The cnrm-controller-manager keeps encountering this 404. Is this something the control loop should eventually heal on its own (perhaps when the lease expires), or do I need to take some action, e.g. removing the labels from the actual Artifact Registry Repository via the console? I'm not sure I can add the annotation you provided to the first resource instance, since I already deleted it.
  2. The cnrm-lease-expiration label has a value of 1605916937, which seems pretty far in the past. This raises two questions: a) per your guidance, shouldn't the new namespace have been able to claim the lease by now, hopefully resolving the 404 in the control loop? and b) is this a symptom that leases aren't being properly renewed?
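To make the second question concrete, the label value can be converted to a readable date. This is a small sketch that assumes the value is a Unix timestamp in seconds (the thread does not confirm the label's format) and that GNU date is available; on BSD/macOS the equivalent flag is -r instead of -d.

```shell
# Assumption: cnrm-lease-expiration holds a Unix timestamp in seconds.
lease_expiration=1605916937

# GNU date syntax; on BSD/macOS use: date -u -r "$lease_expiration" ...
expires_at=$(date -u -d "@${lease_expiration}" +%Y-%m-%dT%H:%M:%SZ)
echo "lease expiration: ${expires_at}"   # prints: lease expiration: 2020-11-21T00:02:17Z

# Compare against the current time to see whether the lease has lapsed.
now=$(date -u +%s)
if [ "$lease_expiration" -lt "$now" ]; then
  status="expired"
else
  status="active"
fi
echo "lease status: ${status}"
```

Under that assumption, the lease lapsed on 21 November 2020, more than two weeks before this issue was opened.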

@caieo
Contributor

caieo commented Dec 16, 2020

@currankaushik, another user reported an error with ArtifactRegistryRepository (#348), where I posted a workaround for their situation. We will look into where and why the 404 error is occurring; please monitor that issue for updates.

For your scenario in particular, you mentioned you weren't sure if you could add the annotation on the CRD you deleted. It might be best to reapply/reacquire the resource (in namespace1) so that you can update the annotation. Once that annotation is updated, you can delete it safely and reacquire it with your other namespace.

@currankaushik
Author

currankaushik commented Dec 16, 2020

Ok, I tried following a combination of the guidance from here and #348. I'll recap my actions here in case it's relevant for #348.

I reapplied the ArtifactRegistryRepository in namespace1. This resulted in the same 404 error from earlier in this thread and #348. I proceeded to modify the annotation. Nothing seemed to change. At this point, I tried following the guidance to delete the k8s object for the ArtifactRegistryRepository. I did this in namespace2, since I wanted to try deleting/recreating to "reclaim" the actual repository. This ended up deleting my repository and its contents 😢. At this point, I basically had a clean slate anyway, so I just deleted all my ArtifactRegistryRepository objects across both namespaces to start from scratch.

I'll go ahead and close this issue and follow #348: maybe the 404 issue was unrelated to transferring namespaces, and stems more so from a situation where the k8s object overlaps with an actual GCP resource that already exists but that cnrm is unaware of?

@jcanseco
Member

Hi @currankaushik, sorry for the late response.

"This ended up deleting my repository and its contents"

Firstly, I want to apologize that this issue has caused such a problem for you. This is definitely regrettable, and we're sorry.

Secondly, we've identified an issue with ArtifactRegistryRepository's update logic since our last investigation into this issue. We found that updating the resource at all causes a 404. This includes user updates, drift corrections, and periodic updates done by our leasing mechanism.

We've identified the root cause and have put out a fix for review. It should be out by the next release (or the one after). For now, to keep your ArtifactRegistryRepository resources from triggering the error, our recommended workaround is to avoid updates to your resources and to disable the leasing mechanism as described here. This is the same suggestion as above, with the added clarification that you must abandon (not delete) your existing resources and then re-acquire them for the workaround to work.
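As a rough sketch only, the abandon-then-re-acquire sequence might look like the script below. It is not an official procedure: the helper names run and transfer_repo, the DRY_RUN switch, and the file name repo-newns.yaml are hypothetical, and the manifest is assumed to already carry the cnrm.cloud.google.com/management-conflict-prevention-policy: "none" annotation. Whether annotating an existing object itself counts as an update that triggers the 404 is not established in this thread. By default the script only prints the kubectl commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# transfer_repo OLD_NS NEW_NS NAME NEW_MANIFEST (hypothetical helper)
transfer_repo() {
  old_ns="$1"; new_ns="$2"; name="$3"; new_manifest="$4"

  # 1. Make sure the object is marked "abandon" before it is deleted.
  run kubectl -n "$old_ns" annotate artifactregistryrepository "$name" \
    cnrm.cloud.google.com/deletion-policy=abandon --overwrite || return 1

  # 2. Delete only the Kubernetes object. Without the abandon policy from
  #    step 1, this would delete the underlying repository as well.
  run kubectl -n "$old_ns" delete artifactregistryrepository "$name" || return 1

  # 3. Re-acquire the repository from the new namespace. The manifest is
  #    assumed to set management-conflict-prevention-policy: "none".
  run kubectl -n "$new_ns" apply -f "$new_manifest"
}

plan=$(transfer_repo oldns newns reponame repo-newns.yaml)
echo "$plan"
```

Reviewing the printed plan before setting DRY_RUN=0 is deliberate: as reported earlier in this thread, deleting an object that lacks the abandon policy removed the repository and its contents.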

Thanks for working with us through this issue. Please feel free to refer to Issue #348 for tracking the fix for this bug.
