Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Completed Event/Condition should only trigger during updates #2103

Closed
jessesuen opened this issue Jun 16, 2022 · 1 comment · Fixed by #2203
Closed

Completed Event/Condition should only trigger during updates #2103

jessesuen opened this issue Jun 16, 2022 · 1 comment · Fixed by #2203
Assignees
Labels
bug Something isn't working
Milestone

Comments

@jessesuen
Copy link
Member

jessesuen commented Jun 16, 2022

Summary

Currently, we equate the Completed condition to a Healthy rollout. However, a Rollout becomes Progressing for other reasons than an update (e.g. pod restarts).

Looking at the intent of the condition, it is intended to only emit when updates complete. See:

	// RolloutCompletedReason is added in a rollout when it is completed.
	RolloutCompletedReason = "RolloutCompleted"
	// RolloutCompletedMessage is added when the rollout is completed
	RolloutCompletedMessage = "Rollout completed update to revision %d (%s): %s"

The Completed condition should only be set to False in the middle of an update and set to True at the end of the update. A good way to detect this is simply comparing current and stable pod hashes:

status:
  currentPodHash: 549676655f
  stableRS: 549676655f

Diagnostics

What version of Argo Rollouts are you running? v1.2.0


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.

@jessesuen jessesuen added the bug Something isn't working label Jun 16, 2022
@jessesuen
Copy link
Member Author

To fix this, I think the existing RolloutComplete() helper should be renamed to RolloutHealthy() and we should introduce a new RolloutComplete() helper that ignores pod availability and only considers stable/current pod hash.

// RolloutComplete considers a rollout to be complete once all of its desired replicas
// are updated, available, and receiving traffic from the active service, and no old pods are running.
func RolloutComplete(rollout *v1alpha1.Rollout, newStatus *v1alpha1.RolloutStatus) bool {
completedStrategy := true
replicas := defaults.GetReplicasOrDefault(rollout.Spec.Replicas)
if rollout.Spec.Strategy.BlueGreen != nil {
activeSelectorComplete := newStatus.BlueGreen.ActiveSelector == newStatus.CurrentPodHash
previewSelectorComplete := true
if rollout.Spec.Strategy.BlueGreen.PreviewService != "" {
previewSelectorComplete = newStatus.BlueGreen.PreviewSelector == newStatus.CurrentPodHash
}
completedStrategy = activeSelectorComplete && previewSelectorComplete
}
if rollout.Spec.Strategy.Canary != nil {
stepCount := len(rollout.Spec.Strategy.Canary.Steps)
executedAllSteps := true
if stepCount > 0 && newStatus.CurrentStepIndex != nil {
executedAllSteps = int32(stepCount) == *newStatus.CurrentStepIndex
}
currentRSIsStable := newStatus.StableRS != "" && newStatus.StableRS == newStatus.CurrentPodHash
scaleDownOldReplicas := newStatus.Replicas == replicas
completedStrategy = executedAllSteps && currentRSIsStable && scaleDownOldReplicas
}
return newStatus.UpdatedReplicas == replicas &&
newStatus.AvailableReplicas == replicas &&
rollout.Status.ObservedGeneration == strconv.Itoa(int(rollout.Generation)) &&
completedStrategy
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment