-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes pre-deploy step having incorrect desired labels and adds a temp fix for race condition #835
Conversation
/retest |
1 similar comment
/retest |
/test pull-etcd-druid-e2e-kind-nondistroless-etcd |
3 similar comments
/test pull-etcd-druid-e2e-kind-nondistroless-etcd |
/test pull-etcd-druid-e2e-kind-nondistroless-etcd |
/test pull-etcd-druid-e2e-kind-nondistroless-etcd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
How to categorize this PR?
/area control-plane
/kind bug
What this PR does / why we need it:
In hotfix branch
PreDeploy
step was introduced to ensure that etcd-druid is backward compatible with the changes that are made in master w.r.t pod labels and statefulset label-selector. An issue was discovered when running e2e g/g tests.Desired labels passed to utils.ContainsAllDesiredLabels contains the additional labels which are fetched from
etcd.Spec.Labels
. When g/g e2e tests are run then in case we upgrade existing etcd cluster from 1 to 3, g/g will add additional networking labels on the etcd resource. These labels will never be present on the pod in thePreDeploy
step. This will result in the reconciliation getting stuck at this stage.Deploy
step is never called, so the new labels that are added by g/g on the etcd resource never make it to the pods.Additionally we now add one more safeguard in the custodian controller to skip reconciliation if the reconcile annotation is still added on the etcd resource and retry after 10s. This is done in addition to the already present predicate which also does something similar. However the predicate does not prevent requeue of requests due to error in the custodian which can still interfere with the etcd reconciler.
Which issue(s) this PR fixes:
Fixes #836
Special notes for your reviewer:
Release note: