Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CassandraTask updates #383

Merged
merged 10 commits into from
Aug 15, 2022
Merged

CassandraTask updates #383

merged 10 commits into from
Aug 15, 2022

Conversation

burmanm
Copy link
Contributor

@burmanm burmanm commented Jul 27, 2022

What this PR does:

  • Adds 3 new client.go endpoints for async operations (scrub, upgradesstables, compaction)
  • Adds 3 new client.go endpoints for sync operations (scrub, compaction, upgradesstables)
  • Refactor cassandratask_controller to separate jobs, reduce boilerplate and add a new validation function
  • Add new CassandraTask for upgradeSSTables

Which issue(s) this PR fixes:
Fixes #387
Fixes #384
Fixes #327

Checklist

  • Changes manually tested
  • Automated Tests added/updated
  • Documentation added/updated
  • CHANGELOG.md updated (not required for documentation PRs)
  • CLA Signed: DataStax CLA

burmanm added 3 commits July 26, 2022 13:43
…on function to check for necessary parameters, add keyspace_name to rebuild and upgradesstables, add new command upgradesstables
@burmanm burmanm requested a review from a team as a code owner July 27, 2022 15:47
Copy link
Contributor

@adutra adutra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a very superficial review for now, I will continue tomorrow and try to dig deeper.

controllers/control/cassandratask_controller.go Outdated Show resolved Hide resolved
controllers/control/cassandratask_controller.go Outdated Show resolved Hide resolved
controllers/control/cassandratask_controller.go Outdated Show resolved Hide resolved
apis/control/v1alpha1/cassandratask_types.go Show resolved Hide resolved
controllers/control/cassandratask_controller.go Outdated Show resolved Hide resolved
controllers/control/jobs.go Show resolved Hide resolved
controllers/control/jobs.go Outdated Show resolved Hide resolved
controllers/control/jobs.go Show resolved Hide resolved
controllers/control/jobs.go Outdated Show resolved Hide resolved
controllers/control/jobs.go Outdated Show resolved Hide resolved
@@ -156,6 +160,7 @@ func waitForTaskCompletion(taskKey types.NamespacedName) *api.CassandraTask {

var _ = Describe("Execute jobs against all pods", func() {
jobRunningRequeue = time.Duration(1 * time.Millisecond)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: these casts are redundant.

FormatOutput(json)
WithLabel(fmt.Sprintf("cassandra.datastax.com/datacenter=%s", dcName))

if completed {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: when completed is false, I would have expected the control.k8ssandra.io/status to NOT be equal to completed. Here instead, we retrieve all tasks, both complete and incomplete.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the intention. The idea was to fetch all possible tasks to ensure no extra ones are created.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename to onlyComplete for clarity?

@burmanm
Copy link
Contributor Author

burmanm commented Aug 11, 2022

The test failure looks to be failure in test design or potentially a bug in the cass-operator (but not from this PR really). It seems the node is replaced correctly, but cass-operator puts itself in the "quiet period" and does not change the status of replace to complete. Maybe I'm replacing the same node twice too fast in the test and causing this issue.

@burmanm
Copy link
Contributor Author

burmanm commented Aug 11, 2022

Some relevant parts from the log:

Start of the process:

1.660197767282e+09	INFO	controller.cassandratask	calling Management API drain node - POST /api/v0/ops/node/drain	{"reconciler group": "control.k8ssandra.io", "reconciler kind": "CassandraTask", "name": "replace-node", "namespace": "test-node-replace", "pod": "cluster1-dc1-r3-sts-0"}

Then cass-operator notices it pretty quickly that pod has been deleted:

1.6601977791980631e+09	INFO	controllers.CassandraDatacenter	======== handler::Reconcile has been called	{"cassandradatacenter": "test-node-replace/dc1", "requestNamespace": "test-node-replace", "requestName": "dc1", "loopID": "62fce0a3-2cd0-4825-b148-f320a50476fd"}

Eventually we'll notice:

1.6601979557445126e+09	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	Finished replacing pod	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "test-node-replace", "namespace": "test-node-replace", "datacenterName": "dc1", "clusterName": "cluster1", "pod": "cluster1-dc1-r3-sts-0"}
1.6601979557446992e+09	INFO	controllers.CassandraDatacenter.cassandradatacenter_controller.controller.cassandradatacenter-controller	Finished replacing pod cluster1-dc1-r3-sts-0	{"reconciler group": "cassandra.datastax.com", "reconciler kind": "CassandraDatacenter", "name": "dc1", "namespace": "test-node-replace", "reason": "FinishedReplaceNode", "eventType": "Normal"}

That should update the status in cass-operator, but for some reason there's no error or update.

@burmanm
Copy link
Contributor Author

burmanm commented Aug 11, 2022

That said, I think there's a bug. It comes in the form of "PreProcess" being run too many times and that could cause this to happen also.

@burmanm burmanm merged commit 0bb3086 into k8ssandra:master Aug 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants