Avoid unnecessary rolling updates when replacing custom CA #10377

Merged

Conversation


@scholzj scholzj commented Jul 23, 2024

Type of change

  • Bugfix

Description

This PR updates how we handle the renewal and replacement of a custom Cluster CA. The delayed renewal of the Cluster Operator certificate was causing the replacement event to be detected twice:

  • In the first run, it sees that all Secrets are one generation behind, rolls out trust in the new CA, and rolls out new certificates to all components except for the Cluster Operator's own certificate
  • In the second run, it detects another CA replacement because the Cluster Operator certificate still uses the old generation, and repeats everything again, this time including the Cluster Operator certificate renewal

This PR changes the behavior so that the Cluster Operator certificate generation is ignored when detecting a CA replacement. That avoids the unnecessary second run (the Cluster Operator certificate is still renewed, but without triggering the whole CA replacement process).
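
As a rough illustration of the double detection described above, here is a simplified model, not the actual Strimzi implementation. It assumes each Secret records the CA generation that issued its certificate, and that a replacement is detected whenever a considered Secret is behind the current CA generation. The class, the method, and the Secret names are hypothetical and exist only for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CaGenerationCheck {
    // Hypothetical Secret name used only for this illustration
    static final String CO_SECRET = "my-cluster-cluster-operator-certs";

    /**
     * Returns true when at least one considered Secret carries a generation
     * older than the current CA generation. When ignoreCoSecret is true,
     * the Cluster Operator Secret is skipped (the behavior this PR describes).
     */
    static boolean replacementDetected(Map<String, Integer> secretGenerations,
                                       int caGeneration,
                                       boolean ignoreCoSecret) {
        for (Map.Entry<String, Integer> entry : secretGenerations.entrySet()) {
            if (ignoreCoSecret && entry.getKey().equals(CO_SECRET)) {
                continue; // the CO certificate is renewed in a separate cycle
            }
            if (entry.getValue() < caGeneration) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed state after the first run: every component Secret was
        // re-issued by generation 1, only the CO Secret is still at generation 0.
        Map<String, Integer> afterFirstRun = new LinkedHashMap<>();
        afterFirstRun.put("my-cluster-kafka-brokers", 1);
        afterFirstRun.put("my-cluster-entity-topic-operator-certs", 1);
        afterFirstRun.put(CO_SECRET, 0);

        // Old behavior: the stale CO Secret triggers a second replacement run
        System.out.println(replacementDetected(afterFirstRun, 1, false)); // prints "true"
        // New behavior: the CO Secret is ignored, so no second run is triggered
        System.out.println(replacementDetected(afterFirstRun, 1, true));  // prints "false"
    }
}
```

In this model, the only difference between the two calls is whether the Cluster Operator Secret takes part in the comparison, which is the point of the change.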

This change should affect only custom CAs.

This should resolve #10364.

Checklist

  • Make sure all tests pass
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging

@scholzj scholzj added this to the 0.43.0 milestone Jul 23, 2024
@scholzj scholzj requested a review from ppatierno July 23, 2024 23:40

scholzj commented Jul 24, 2024

/azp run regression


Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Jakub Scholz <[email protected]>

scholzj commented Jul 24, 2024

/azp run regression


Azure Pipelines successfully started running 1 pipeline(s).

@@ -251,6 +251,8 @@ Future<Void> reconcileCas(Clock clock) {
    } else if (secretName.equals(KafkaResources.kafkaSecretName(reconciliation.name()))) {
        clusterCaSecrets.add(secret);
        clientsCaSecrets.add(secret);
    } else if (secretName.equals(KafkaResources.clusterOperatorCertsSecretName(reconciliation.name()))) {
        // The CO certificate is excluded as it is renewed in a separate cycle
    } else {
Contributor


Has this behaviour happened due to the change from this refactor? #10023 What secret are we now catching in the else here? Could we not remove the else clause instead so we explicitly add the Secrets we care about?

Member Author


> Has this behaviour happened due to the change from this refactor? #10023

No, it is not related to #10023. From my point of view, it originates from #6180 (that said, IIRC it did not work at all before #6180, so it is not as if #6180 broke this).

> What secret are we now catching in the else here? Could we not remove the else clause instead so we explicitly add the Secrets we care about?

Everything not listed above it: Kafka Exporter (KE), Cruise Control (CC), Entity Topic Operator (ETO), Entity User Operator (EUO), and ZooKeeper (ZOO). I do not think we want to have a dedicated check for each of them.
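
For readers following the thread, the branch structure under discussion can be sketched roughly as below. This is a simplified stand-in that works on Secret names as plain strings. The helper methods, the name suffixes, and in particular the body of the final `else` (which the diff hunk does not show) are assumptions made for illustration, not the actual Strimzi code.

```java
import java.util.ArrayList;
import java.util.List;

final class SecretBuckets {
    final List<String> clusterCaSecrets = new ArrayList<>();
    final List<String> clientsCaSecrets = new ArrayList<>();

    // Hypothetical stand-ins for the KafkaResources naming helpers
    static String kafkaSecretName(String cluster) {
        return cluster + "-kafka-brokers";
    }

    static String clusterOperatorCertsSecretName(String cluster) {
        return cluster + "-cluster-operator-certs";
    }

    void classify(String cluster, String secretName) {
        if (secretName.equals(kafkaSecretName(cluster))) {
            // The Kafka Secret is tracked against both CAs
            clusterCaSecrets.add(secretName);
            clientsCaSecrets.add(secretName);
        } else if (secretName.equals(clusterOperatorCertsSecretName(cluster))) {
            // The CO certificate is excluded as it is renewed in a separate cycle
        } else {
            // Assumption: every other component Secret is tracked against the
            // Cluster CA only, so no per-component check is needed
            clusterCaSecrets.add(secretName);
        }
    }

    public static void main(String[] args) {
        SecretBuckets buckets = new SecretBuckets();
        buckets.classify("my-cluster", "my-cluster-kafka-brokers");
        buckets.classify("my-cluster", "my-cluster-cluster-operator-certs");
        buckets.classify("my-cluster", "my-cluster-cruise-control-certs");

        System.out.println(buckets.clusterCaSecrets); // prints "[my-cluster-kafka-brokers, my-cluster-cruise-control-certs]"
        System.out.println(buckets.clientsCaSecrets); // prints "[my-cluster-kafka-brokers]"
    }
}
```

The catch-all keeps the list of handled components open-ended, at the cost of needing an explicit empty branch for anything that must be left out, such as the Cluster Operator Secret.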

Contributor


Thanks, makes sense

@scholzj scholzj merged commit 5d56e50 into strimzi:main Jul 25, 2024
21 checks passed
Successfully merging this pull request may close these issues.

Unnecessary CA replacement run with custom CA