Restart task status.Succeeded fixes #440

Merged

Miles-Garnsey
Member

What this PR does:

Ensure that the number of pods successfully restarted is tracked via the UpdatedReplicas status field.

Which issue(s) this PR fixes:
Fixes #431

Checklist

  • Changes manually tested
  • Automated Tests added/updated
  • Documentation added/updated
  • CHANGELOG.md updated (not required for documentation PRs)
  • CLA Signed: DataStax CLA

@Miles-Garnsey Miles-Garnsey requested a review from a team as a code owner November 7, 2022 04:15
@Miles-Garnsey
Member Author

Miles-Garnsey commented Nov 7, 2022

One note: in our original discussion, @burmanm suggested that we need to ensure the revision has been updated before setting restartedPods = status.UpdatedReplicas.

I think this is actually already done here, so I haven't done anything further with that.

I also wonder if there is a chance that restartedPods += int(status.UpdatedReplicas) will overcount the number of pods restarting if that code runs twice while the restart is taking place (there is a requeue here).

We could feasibly use a map of sts->UpdatedReplicas to keep better track. I might experiment with implementing that and see if I can make it work.
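
To make the overcounting concern concrete, here is a minimal sketch comparing the two approaches. It assumes a simplified, hypothetical `stsStatus` type standing in for the real StatefulSet status; `accumulate` and `recordLatest` are illustrative names and are not functions from cass-operator.

```go
package main

import "fmt"

// stsStatus is a hypothetical stand-in for the StatefulSet fields discussed above.
type stsStatus struct {
	Name            string
	UpdatedReplicas int32
}

// accumulate mimics the += approach: each reconcile pass adds the current
// UpdatedReplicas again, so a requeue during the restart inflates the total.
func accumulate(total int, statuses []stsStatus) int {
	for _, s := range statuses {
		total += int(s.UpdatedReplicas)
	}
	return total
}

// recordLatest overwrites one entry per StatefulSet and sums the map,
// so running it twice with the same statuses yields the same total.
func recordLatest(restarted map[string]int, statuses []stsStatus) int {
	for _, s := range statuses {
		restarted[s.Name] = int(s.UpdatedReplicas)
	}
	total := 0
	for _, v := range restarted {
		total += v
	}
	return total
}

func main() {
	statuses := []stsStatus{{"rack1", 2}, {"rack2", 1}}

	// Two passes over unchanged statuses, as would happen after a requeue.
	sum := accumulate(accumulate(0, statuses), statuses)

	restarted := map[string]int{}
	recordLatest(restarted, statuses)
	total := recordLatest(restarted, statuses)

	fmt.Println(sum, total) // prints "6 3"
}
```

The map only stays correct if it persists across requeues or is rebuilt from every StatefulSet on each pass; where that state would live in the real controller is left open here.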

@Miles-Garnsey Miles-Garnsey changed the title Restart task completion fixes. Restart task status.Succeeded fixes Nov 7, 2022
controllers/control/jobs.go

restartedPods[st.Name] = int(status.UpdatedReplicas)
totalRestarted := 0
for _, v := range restartedPods {
Contributor

This code is never called with finished StSes as they jump out of the loop before this is reached.

Member Author

I know, but we want unfinished ones to be tracked too, don't we?

Contributor

We do, but now we won't track the finished ones at all. We should also update TaskConfig.Completed inside that part of the loop, or use a pointer and update that one in TaskConfig.
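
A rough sketch of the shape being asked for, in which both the finished and the in-progress branch contribute to the completed count. The `stsInfo` and `taskConfig` types, the `Finished` flag, and `countRestarted` are all hypothetical simplifications for illustration; this is not the code that was merged.

```go
package main

import "fmt"

// stsInfo is a hypothetical, simplified view of one StatefulSet during a restart.
type stsInfo struct {
	Name            string
	Replicas        int
	UpdatedReplicas int
	Finished        bool // assumed flag: this StatefulSet has already been restarted
}

// taskConfig is a hypothetical stand-in that holds only the Completed counter.
type taskConfig struct {
	Completed int
}

// countRestarted writes cfg.Completed through the pointer, counting finished
// StatefulSets as well as those still rolling, and reports whether all are done.
func countRestarted(cfg *taskConfig, sets []stsInfo) bool {
	completed := 0
	done := true
	for _, s := range sets {
		if s.Finished {
			// The early exit from the loop body still counts these pods.
			completed += s.Replicas
			continue
		}
		completed += s.UpdatedReplicas
		done = false
	}
	cfg.Completed = completed
	return done
}

func main() {
	cfg := &taskConfig{}
	sets := []stsInfo{
		{Name: "rack1", Replicas: 3, UpdatedReplicas: 3, Finished: true},
		{Name: "rack2", Replicas: 3, UpdatedReplicas: 1},
	}
	done := countRestarted(cfg, sets)
	fmt.Println(cfg.Completed, done) // prints "4 false"
}
```

Because the count is recomputed from scratch on every pass, a plain int is enough and repeated reconciles do not inflate it.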

Member Author

Can you check my latest commit and confirm that's what you're after?

@burmanm
Contributor

burmanm commented Nov 14, 2022

.. just add missing changelog ;)

@burmanm burmanm merged commit 7332492 into k8ssandra:master Nov 14, 2022
emerkle826 pushed a commit to emerkle826/cass-operator that referenced this pull request Dec 1, 2022
* Restart task completion fixes.

* Add test for success tracking.

* Juggle types in test so that they match.

* More precise logic for tracking number of pods restarted for all STSs.

* Bring back simple int for restarted pods calculation.

* Ensure taskConfig.Completed is updated when the restart is completed.

* Changelog.
Successfully merging this pull request may close these issues.

K8SSAND-1837 ⁃ Succeeded field in CassandraTask status does not get updated for restart task
2 participants