Change etcd liveness probes to the new livez and readyz endpoints #3039
Comments
cc @serathius @ahrtr |
Q: is there a reason etcd released this feature under a patch release? perhaps it was under a bugfix. kubeadm 1.30 (to be released) is using etcd 3.5.12, so for kubeadm 1.31 we can switch to the new probe method. generally more opinions on this topic are appreciated. |
do we expect any behavior changes if we add a readiness probe for etcd in kubeadm? |
Since we backport etcd minor version to previous releases as well, it's quite safe to change the probe in v1.31. |
BTW, do we plan to add the ReadinessProbe?
|
+1 for applying the new livez/readyz probe in v1.31
That seems safe because we have already checked the etcd cluster health by client and have adopted a retry mechanism. |
I raised the same question/comment when I was reviewing the livez/readyz design doc. Actually I wasn't a big fan of backporting it to stable releases, but I was not strongly against it, for these reasons:
I believe there isn't any behaviour change in etcd's 3.5 and 3.4 patch releases in terms of the new
We may extend both
It's totally up to the kubeadm / cluster-lifecycle maintainers / leads to decide when to use the new /livez and /readyz endpoints. But please do not expect that we can get etcd 3.6.0 released soon.
I think they are orthogonal changes.
We can reuse the |
ok, seems like this can be added in kubeadm 1.31 once code freeze for 1.30 is over. |
@siyuanfoundation @ahrtr any notable regressions in >= 3.5.11 that may keep users on < 3.5.11? i have a WIP PR for this here, but it does not do version checking, so it will cause an error for kubeadm users that are deploying a custom etcd version < 3.5.11 on kubeadm 1.31. i guess etcd will just fail to start. to be on the safe side the PR can also include some version checking, but that will not work for custom image strings, as some users pass a SHA for the etcd image. it would only work for users that are using a semver in the etcd image. i think my preference would be to just have this mentioned in the release note, and also make kubeadm users stuck on older custom etcd versions finally upgrade! |
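For illustration only, the version checking discussed above could look roughly like the sketch below. It is not the code from the WIP PR. The function names `parseEtcdImageVersion` and `supportsLivez` are hypothetical, the image strings are examples, and the only threshold it knows is the 3.5.11 mentioned in this issue, so it treats every 3.4.x tag as unsupported. It also shows the limitation mentioned above: an image referenced by SHA digest carries no version to compare.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minLivezVersion is the etcd 3.5 patch release named in this issue as the
// first one with /livez and /readyz. Backports to other minors are ignored
// here (a simplification of this sketch).
var minLivezVersion = [3]int{3, 5, 11}

// parseEtcdImageVersion (hypothetical helper) extracts major.minor.patch
// from the tag of an image string. It returns false when the image is
// referenced by digest, has no tag, or the tag is not a plain semver.
func parseEtcdImageVersion(image string) ([3]int, bool) {
	var v [3]int
	if strings.Contains(image, "@") {
		return v, false // digest reference such as repo/etcd@sha256:...
	}
	slash := strings.LastIndex(image, "/")
	colon := strings.LastIndex(image, ":")
	if colon <= slash {
		return v, false // no tag; a colon before the last slash is a registry port
	}
	tag := strings.TrimPrefix(image[colon+1:], "v")
	// Drop a build or pre-release suffix such as "-0".
	if i := strings.IndexAny(tag, "-+"); i >= 0 {
		tag = tag[:i]
	}
	parts := strings.Split(tag, ".")
	if len(parts) != 3 {
		return v, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, false
		}
		v[i] = n
	}
	return v, true
}

// supportsLivez reports whether v is at least minLivezVersion.
func supportsLivez(v [3]int) bool {
	for i := range v {
		if v[i] != minLivezVersion[i] {
			return v[i] > minLivezVersion[i]
		}
	}
	return true
}

func main() {
	for _, image := range []string{
		"registry.k8s.io/etcd:3.5.12-0",
		"registry.k8s.io/etcd:3.5.9-0",
		"example.com/etcd@sha256:0123abcd", // digest: nothing to compare
	} {
		v, ok := parseEtcdImageVersion(image)
		if !ok {
			fmt.Printf("%s: version unknown, cannot check\n", image)
			continue
		}
		fmt.Printf("%s: supports /livez and /readyz: %t\n", image, supportsLivez(v))
	}
}
```

For the "version unknown" case a caller would have to either skip the check or fall back to a release note, which matches the preference stated in the comment.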
@neolit123 Thanks for working on this! |
No regression in >= 3.5.11. Yes, it's recommended to bump etcd to the latest patch. |
This is related to etcd-io/etcd#16007
Since v3.5.11, etcd has added new livez/readyz HTTP endpoints
The design for the new probes is documented in etcd livez and readyz probes
The new probes are Kubernetes API compliant, have more fine-grained metrics, and are better suited for liveness and readiness checks than /health?exclude=NOSPACE&serializable=true and /health?serializable=false.
/health?exclude=NOSPACE&serializable=true -> /livez
/livez does not check for leaders, because restarting the local server does not mitigate the problem of the cluster missing a leader, while a crash loop would worsen the problem.
/health?serializable=false -> /readyz
/readyz ignores the NOSPACE alarm, because if the local server is out of quota, the server is still able to take read/delete requests, and is still able to forward write requests to the leader to be written successfully in the cluster.
We should switch the current etcd LivenessProbe and StartupProbe to the new endpoints, which are Kubernetes API compliant.
Is this a BUG REPORT or FEATURE REQUEST?
FEATURE REQUEST