dm-operator/: support scaling a dm cluster with dm-masters and dm-workers #3186
Conversation
…o supportStartDMCluster
Codecov Report
@@ Coverage Diff @@
## master #3186 +/- ##
==========================================
- Coverage 40.90% 40.43% -0.48%
==========================================
Files 168 170 +2
Lines 18507 18818 +311
==========================================
+ Hits 7571 7609 +38
- Misses 10283 10553 +270
- Partials 653 656 +3
Flags with carried forward coverage won't be shown.
@@ -31,5 +31,5 @@ type TiKVGroupManager interface {

type DMManager interface {
	// Sync implements the logic for syncing dmcluster.
	Sync(*v1alpha1.DMCluster) error
Why add DM here? I think it's not necessary, as it's a method of DMManager, which already includes the DM.
Because reclaimPolicyManager uses SyncDM as the interface method, and I wanted to stay consistent with that. I think using Sync is also okay.
	return nil
}

// We need remove member from cluster before reducing statefulset replicas
where is the member removed?
It will be removed when syncing the worker status. For dm-worker, we can't remove its registration info from dm-master while it is still alive, so we delete it later, after its keepalive lease is outdated or revoked. Deferring the deletion of the dm-worker registration info is safe because dm-master dispatches replication work to workers based on their keepalive info.
I mean the comment here does not match the code logic.
ordinal, err := util.GetOrdinalFromPodName(worker.Name)
if err != nil {
	klog.Errorf("invalid worker name %s, can't offline this worker automatically, err: %s", worker.Name, err)
} else if ordinal >= dc.WorkerStsDesiredReplicas() {
Please refer to https://github.com/pingcap/tidb-operator/blob/master/pkg/manager/member/tikv_failover.go#L38-L45 for how to check whether an ordinal is desired, in case advanced StatefulSet is enabled.
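The point of the linked check can be sketched like this. It is a simplified model under an assumption: with advanced StatefulSet, delete-slots can punch holes in the ordinal range, so a plain `ordinal >= replicas` comparison is not sufficient; the function name and the way the desired set is built here are illustrative, not the operator's exact algorithm.

```go
package main

import "fmt"

// isOrdinalDesired reports whether a pod ordinal should survive. With a plain
// StatefulSet the desired ordinals are simply [0, replicas); with advanced
// StatefulSet, delete-slots removes specific ordinals, and the remaining pods
// occupy the next ordinals above them, so membership must be checked against
// an explicit set.
func isOrdinalDesired(ordinal int32, replicas int32, deleteSlots map[int32]bool) bool {
	desired := map[int32]bool{}
	// Keep scanning ordinals upward, skipping deleted slots, until we have
	// collected `replicas` desired ordinals.
	for i, kept := int32(0), int32(0); kept < replicas; i++ {
		if !deleteSlots[i] {
			desired[i] = true
			kept++
		}
	}
	return desired[ordinal]
}

func main() {
	// replicas=3 with slot 1 deleted: desired ordinals are {0, 2, 3}.
	del := map[int32]bool{1: true}
	fmt.Println(isOrdinalDesired(1, 3, del)) // false: slot 1 is deleted
	fmt.Println(isOrdinalDesired(3, 3, del)) // true, even though 3 >= replicas
}
```

The second case is exactly where `ordinal >= dc.WorkerStsDesiredReplicas()` would give the wrong answer: ordinal 3 is desired despite being at or above the replica count.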
	return nil
}

// We need remove member from cluster before reducing statefulset replicas
I mean the comment here does not match the code logic.
@@ -549,3 +546,13 @@ func getWorkerConfigMap(dc *v1alpha1.DMCluster) (*corev1.ConfigMap, error) {
	}
	return cm, nil
}

func isWorkerPodDesired(dc *v1alpha1.DMCluster, podName string) bool {
	ordinals := dc.WorkerStsDesiredOrdinals(true)
Should be false here?
…tidb-operator into supportScaleDMCluster
LGTM
What problem does this PR solve?
#2868
Support scale in/out dmcluster.
What is changed and how does it work?
Check PVCs and delete the defer-deleting PVCs before scaling out dm-master and dm-worker.
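That scale-out precondition can be sketched as follows. The annotation key matches the `tidb.pingcap.com/pvc-defer-deleting` annotation tidb-operator uses to mark PVCs for deferred deletion; the `pvc` type and the function around it are simplified stand-ins, not the operator's actual code.

```go
package main

import "fmt"

// Annotation tidb-operator places on PVCs during scale-in so that they are
// cleaned up later instead of immediately.
const annPVCDeferDeleting = "tidb.pingcap.com/pvc-defer-deleting"

// pvc is a simplified stand-in for corev1.PersistentVolumeClaim.
type pvc struct {
	Name        string
	Annotations map[string]string
}

// pvcsToDeleteBeforeScaleOut returns leftover defer-deleting PVCs that must be
// removed before replicas are increased, so a returning ordinal starts with
// fresh storage instead of stale data from before the scale-in.
func pvcsToDeleteBeforeScaleOut(pvcs []pvc) []string {
	var stale []string
	for _, p := range pvcs {
		if _, ok := p.Annotations[annPVCDeferDeleting]; ok {
			stale = append(stale, p.Name)
		}
	}
	return stale
}

func main() {
	pvcs := []pvc{
		{Name: "dm-master-demo-dm-master-0"},
		{Name: "dm-master-demo-dm-master-2", Annotations: map[string]string{annPVCDeferDeleting: "deleting"}},
	}
	fmt.Println(pvcsToDeleteBeforeScaleOut(pvcs)) // only the annotated PVC
}
```

This is why, in the tests below, scaling dm-master back out works: the stale PVC left by the earlier scale-in is deleted first, so the new pod does not start from old data.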
Check List
Tests
Tested scaling in and then scaling out dm-master: dm-master starts correctly and its PVCs are flushed, while dm-worker does not flush its PVCs.
Does this PR introduce a user-facing change?: