The Story of my Morning's F*** Up: Where is the "letsencrypt-prod" ClusterIssuer? #908
Comments
I need to know if I can postpone this work to later. Let's check whether any of the certificates inside the Secret resources managed by cert-manager are about to expire:

$ kubectl get secrets -A -o json \
    | jq .items \
    | jq '[.[] | select(.metadata.annotations["certmanager.k8s.io/issuer-name"])]' \
    | jq -r '.[] | select(.data["tls.crt"]) | "\(.metadata.name)\t\(.data["tls.crt"])"' \
    | while IFS=$'\t' read -r name crt; do \
        printf '%s\t%s\n' "$name" "$(echo "$crt" | base64 -d | openssl x509 -noout -enddate 2>/dev/null)"; \
    done | column -t

Which showed that the next X.509 certificate to expire will do so in 1 month (Oct 20, 2023). So no problem for now!
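For a yes/no answer instead of reading dates by eye, the same check can be sketched with `openssl x509 -checkend`. This is only a sketch: the helper name `expires_within` is made up for illustration, and it treats unreadable input as "expiring" to stay on the safe side.

```shell
#!/usr/bin/env bash
# expires_within SECONDS
# Reads one PEM certificate on stdin. Succeeds (exit 0) if the certificate
# expires within SECONDS from now. Hypothetical helper, not part of cert-manager.
expires_within() {
  local seconds="$1"
  # `openssl x509 -checkend N` exits 0 when the certificate is still valid
  # N seconds from now and non-zero otherwise, so the result is inverted here.
  # Input that openssl cannot parse also counts as "expiring".
  ! openssl x509 -noout -checkend "$seconds" >/dev/null 2>&1
}
```

Usage would look something like `kubectl get secret prow-tls -n NAMESPACE -o jsonpath='{.data.tls\.crt}' | base64 -d | expires_within $((30*24*3600)) && echo "renew soon"`, where NAMESPACE is a placeholder for wherever the Secret lives.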
Ohh I just found out about @ChaosInTheCRD's
Sorry to see that you've been through such a disaster, these things happen to the best of us unfortunately 😢 It's made my day that you found my project! It needs a bit of work, I definitely should assign some time to it. Nonetheless thanks for giving it a shoutout 😄
offtopic: |
I have removed the leftover |
tl;dr: I mistakenly removed the ClusterIssuer letsencrypt-prod from the github-build-infra cluster that runs https://prow.build-infra.jetstack.net/. There is no outage yet, but the 4 certificates won't be able to be renewed until I re-apply the manifest.

This morning, I made a huge mistake! I was reviewing cert-manager/openshift-routes#32 so I installed openshift-routes as well as cert-manager 1.13.0... Except I didn't install it to my kind cluster... I installed it to the github-build-infra cluster where all the Prow services are running!

The commands I intended for my kind cluster were:
It said that the cert-manager-webhook's selectors could not be changed. So I
figured I would delete the cert-manager deployment I had installed a while back
on my kind cluster (or so I thought).

It was at this point that I realized that I had nuked the cert-manager
installation that was on github-build-infra.
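One way to make this kind of mix-up harder is to refuse to run throwaway commands unless the current kubectl context is a kind cluster. The sketch below relies on kind naming its contexts `kind-<cluster name>`. The function names `is_kind_context` and `kind_only` are hypothetical, and this is an illustration rather than what was actually in place here.

```shell
#!/usr/bin/env bash
# is_kind_context NAME
# Succeeds only if NAME looks like a context created by kind ("kind-<name>").
is_kind_context() {
  case "$1" in
    kind-?*) return 0 ;;
    *) return 1 ;;
  esac
}

# kind_only CMD [ARGS...]
# Hypothetical wrapper: runs CMD only when the current kubectl context is a
# kind cluster, and refuses with a message on stderr otherwise.
kind_only() {
  local ctx
  ctx="$(kubectl config current-context)" || return 1
  if ! is_kind_context "$ctx"; then
    echo "refusing to run: current context is '$ctx', not a kind cluster" >&2
    return 1
  fi
  "$@"
}
```

With this sourced in the shell, something like `kind_only kubectl delete deployment -n cert-manager cert-manager` would bail out when pointed at github-build-infra instead of deleting anything.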
I thus carefully backtracked by deleting what I had installed:
And then reinstalled cert-manager:
Then I looked at the logs, and it seems like four certificates existed before my mistake:
prometheus-prod-thanos (created from the Ingress resource prometheus/prometheus-prod-thanos)
prometheus-prod-server (created from the Ingress resource prometheus/prometheus-prod-server)
prow-tls
triageparty-tls
Thanks to the fact that there are no owner references set on Secret resources
created by cert-manager, there was no outage at this point!
But all four Certificates are now targeting a missing ClusterIssuer: letsencrypt-prod in the cert-manager namespace.

I haven't found the manifest for this ClusterIssuer. I ran:
cd prow/
gcloud container clusters get-credentials github-build-infra --project jetstack-build-infra --zone europe-west1-b
make deploy-prow
Unrelated: I also had to comment out the PersistentVolumeClaim in cluster/tot_deployment.yaml because it was failing to create it:

Where is the ClusterIssuer manifest?