enable cloneset deleting pvc when pod hanging #1113
Welcome @willise! It looks like this is your first PR to openkruise/kruise 🎉
Codecov Report
@@ Coverage Diff @@
## master #1113 +/- ##
==========================================
- Coverage 49.73% 49.33% -0.40%
==========================================
Files 137 138 +1
Lines 19331 19589 +258
==========================================
+ Hits 9614 9664 +50
- Misses 8667 8868 +201
- Partials 1050 1057 +7
How about taking the PVCs related to Completed Pods(
@willise If the Pod is in the Terminating state, deleting the PVC directly is still risky. The difficulty with this approach is that we cannot reliably tell whether the Pod is terminating normally or hanging. I agree with @FillZpp's idea of adding AlwaysCreatePVC to spec.scaleStrategy to decide whether the instanceId of an existing free PVC should be reused when creating a new Pod.
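To make the proposal concrete, here is a minimal sketch of how such a flag could steer instance ID selection during scale-out. It is not the Kruise implementation: `scaleStrategy`, `pickInstanceIDs` and `counterIDs` are hypothetical names, and only the standard library is used.

```go
package main

import (
	"fmt"
	"sort"
)

// scaleStrategy is a hypothetical, simplified stand-in for the CloneSet
// scale strategy; only the flag discussed in this thread is modelled.
type scaleStrategy struct {
	AlwaysCreatePVC bool
}

// pickInstanceIDs returns the instance IDs to use for `need` new Pods.
// freeIDs are the IDs of existing PVCs that no active Pod uses, and newID
// is a hypothetical generator of fresh IDs.
func pickInstanceIDs(s scaleStrategy, freeIDs []string, need int, newID func() string) []string {
	ids := make([]string, 0, need)
	if !s.AlwaysCreatePVC {
		// Reuse free IDs in a stable order so the result is deterministic.
		sorted := append([]string(nil), freeIDs...)
		sort.Strings(sorted)
		for _, id := range sorted {
			if len(ids) == need {
				break
			}
			ids = append(ids, id)
		}
	}
	// Fill the rest (or everything, when reuse is disabled) with fresh IDs.
	for len(ids) < need {
		ids = append(ids, newID())
	}
	return ids
}

// counterIDs returns a deterministic ID generator for the example.
func counterIDs(prefix string) func() string {
	n := 0
	return func() string {
		n++
		return fmt.Sprintf("%s%d", prefix, n)
	}
}

func main() {
	free := []string{"b", "a"}
	fmt.Println(pickInstanceIDs(scaleStrategy{}, free, 3, counterIDs("new")))
	fmt.Println(pickInstanceIDs(scaleStrategy{AlwaysCreatePVC: true}, free, 3, counterIDs("new")))
}
```

With the flag unset, the free IDs are consumed first and fresh IDs only fill the gap. With the flag set, the free PVCs are never picked up again, which is what makes it safe to clean them up separately.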
Thank you for your reply. I am not sure if I understand correctly. Now the
I see what you mean and I think it is good advice. But if
Your current cleanup logic works, except that it needs one additional check so that cleanup happens only for Pods that really no longer exist.
a447a1a to 0b57d3f
I also noticed that in the main scale-in logic, the PVC is deleted directly along with the Pod. Maybe I can change the owner reference of the PVC from the CloneSet to the Pod, so that once the Pod is deleted, the PVC is removed by background cascading deletion after the Pod finally disappears, just as https://github.com/kubernetes/kubernetes/blob/4086b45af3761d59cb82af6ee427d2d6557c1cbc/pkg/controller/statefulset/stateful_set_utils.go#L231 does.
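The idea of moving ownership can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `ownerRef` and `object` are simplified, hypothetical stand-ins for `metav1.OwnerReference` and object metadata, and `removeOwner`, `setOwner` and `moveOwner` are made-up helper names.

```go
package main

import "fmt"

// ownerRef and object are simplified, hypothetical stand-ins for
// metav1.OwnerReference and an object's metadata.
type ownerRef struct {
	Kind       string
	Name       string
	UID        string
	Controller bool
}

type object struct {
	Kind   string
	Name   string
	UID    string
	Owners []ownerRef
}

// removeOwner drops every reference to owner (matched by UID) and reports
// whether target was changed.
func removeOwner(target, owner *object) bool {
	kept := target.Owners[:0]
	changed := false
	for _, ref := range target.Owners {
		if ref.UID == owner.UID {
			changed = true
			continue
		}
		kept = append(kept, ref)
	}
	target.Owners = kept
	return changed
}

// setOwner adds a controller reference to owner unless one already exists,
// and reports whether target was changed.
func setOwner(target, owner *object) bool {
	for _, ref := range target.Owners {
		if ref.UID == owner.UID {
			return false
		}
	}
	target.Owners = append(target.Owners, ownerRef{
		Kind: owner.Kind, Name: owner.Name, UID: owner.UID, Controller: true,
	})
	return true
}

// moveOwner re-parents pvc from one owner to another. The boolean tells the
// caller whether an update request is needed at all.
func moveOwner(pvc, from, to *object) bool {
	removed := removeOwner(pvc, from)
	added := setOwner(pvc, to)
	return removed || added
}

func main() {
	cs := &object{Kind: "CloneSet", Name: "demo", UID: "cs-uid"}
	pod := &object{Kind: "Pod", Name: "demo-abc", UID: "pod-uid"}
	pvc := &object{Kind: "PersistentVolumeClaim", Name: "data-demo-abc",
		Owners: []ownerRef{{Kind: "CloneSet", Name: "demo", UID: "cs-uid", Controller: true}}}

	fmt.Println(moveOwner(pvc, cs, pod), pvc.Owners)
	fmt.Println(moveOwner(pvc, cs, pod)) // second call is a no-op
}
```

Once the Pod is the only owner, the claim's lifetime follows the Pod's, so the controller does not have to guess whether a Terminating Pod is hanging before touching the PVC.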
Good idea, and please add unit tests.
0b57d3f to 516cda9
4965c17 to 897bde2
d3f36cc to 854616b
// the existing pvc first. Then it looks like the pod
// is deleted by controller, new pod can be created.
if updateCS.Spec.ScaleStrategy.DisablePVCReuse {
	uselessPVCs := getUselessPVCs(pods, pvcs)
Should uselessPVCs be removed from pvcs? Then pvcs would contain only the PVCs that are in use.
@zmberg If there are uselessPVCs, cleanupPVCs should always set modified to true and then return, so maybe there is no need to update pvcs?
}

for _, pvc := range uselessPVCs {
	if clonesetutils.IsTerminating(pvc) {
I don't think it is necessary to call getInactivePods up front. Just look up the Pod here: update the owner reference if the Pod exists, and delete the PVC if it does not.

func updateClaimOwnerRefToPod(pvc *v1.PersistentVolumeClaim, cs *appsv1alpha1.CloneSet, pod *v1.Pod) bool {
	needsUpdate := false
	updateMeta := func(tm *metav1.TypeMeta) {
Please use metav1.NewControllerRef(), and implement this as a common function in utils, as in https://github.com/openkruise/rollouts/blob/master/pkg/util/condition.go.
852a53c to 74de5e1
74de5e1 to df1f121
activeIDs := getInstanceIDsFromPods(pods)

useless := map[string]*v1.PersistentVolumeClaim{}
for _, pvc := range pvcs {
Writing the loop this way may cause problems with pointers. I suggest the following:
for i := range pvcs {
	pvc := pvcs[i]
	id := clonesetutils.GetInstanceID(pvc)
	if id != "" && !activeIDs.Has(id) {
		useless[id] = pvc
	}
}
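For readers following along, the filtering step in this suggestion can be tried out in isolation. The sketch below assumes a simplified, hypothetical `claim` type in place of `v1.PersistentVolumeClaim` and a plain map in place of the string set; `uselessClaims` is an illustrative name, not a function from the PR.

```go
package main

import "fmt"

// claim is a hypothetical stand-in for a PVC; InstanceID models the value
// that clonesetutils.GetInstanceID would return for it.
type claim struct {
	Name       string
	InstanceID string
}

// uselessClaims returns, keyed by instance ID, the claims whose instance ID
// is non-empty and not used by any active Pod.
func uselessClaims(activeIDs map[string]struct{}, claims []*claim) map[string]*claim {
	useless := map[string]*claim{}
	for i := range claims {
		c := claims[i]
		if c.InstanceID == "" {
			continue // claims without an instance ID are left alone
		}
		if _, active := activeIDs[c.InstanceID]; !active {
			useless[c.InstanceID] = c
		}
	}
	return useless
}

func main() {
	active := map[string]struct{}{"aaa": {}}
	claims := []*claim{
		{Name: "data-demo-aaa", InstanceID: "aaa"},
		{Name: "data-demo-bbb", InstanceID: "bbb"},
		{Name: "unlabelled"},
	}
	for id, c := range uselessClaims(active, claims) {
		fmt.Println(id, c.Name)
	}
}
```

As in the suggestion, the map keeps one claim per instance ID, so a later claim with the same ID replaces an earlier one.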
// GetAllPods returns all pods in this namespace.
func GetAllPods(reader client.Reader, opts *client.ListOptions) ([]*v1.Pod, error) {
	podList := &v1.PodList{}
	if err := reader.List(context.TODO(), podList, opts); err != nil {
if err := reader.List(context.TODO(), podList, opts, utilclient.DisableDeepCopy); err != nil {
pkg/util/ownerref.go
Outdated
}
ownerRefs := append(
	target.GetOwnerReferences(),
	metav1.OwnerReference{
*metav1.NewControllerRef(owner, ownerType.GroupVersionKind()))
pkg/util/pods.go
Outdated
@@ -343,3 +343,12 @@ func SetPodReadyCondition(pod *v1.Pod) {

	SetPodCondition(pod, newPodReady)
}

func UpdatePodMeta(tm *metav1.TypeMeta) {
I think you can remove the function.
needsUpdate = util.RemoveOwnerRef(pvc, cs)
podMeta := &pod.TypeMeta
util.UpdatePodMeta(podMeta)
return util.SetOwnerRef(pvc, pod, podMeta) || needsUpdate
return util.SetOwnerRef(pvc, pod, metav1.TypeMeta{Kind: "Pod", APIVersion: "v1"}) || needsUpdate
pkg/util/ownerref.go
Outdated
	return true
}

func SetOwnerRef(target, owner metav1.Object, ownerType *metav1.TypeMeta) bool {
func SetOwnerRef(target, owner metav1.Object, ownerType metav1.TypeMeta) bool {
instanceID := clonesetutils.GetInstanceID(pod)
if pvc, ok := uselessPVCs[instanceID]; ok {
	if updateClaimOwnerRefToPod(pvc, cs, pod) {
		if modified, err := r.updatePVC(cs, pvc); err != nil && !errors.IsNotFound(err) {
if modified, err = r.updatePVC(cs, pvc); err != nil && !errors.IsNotFound(err) {
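One likely reason for asking for `=` instead of `:=` is variable shadowing: inside an `if` statement, `:=` declares new `modified` and `err` variables that hide the outer ones. The sketch below shows the effect with named results. It assumes nothing about the real `updatePVC`; `update`, `shadowed` and `assigned` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

var errUpdate = errors.New("update failed")

// update is a hypothetical stand-in for a call such as r.updatePVC.
func update(fail bool) (bool, error) {
	if fail {
		return false, errUpdate
	}
	return true, nil
}

// shadowed uses := in the if statement, so the inner modified and err are
// new variables. The named results keep their zero values.
func shadowed() (modified bool, err error) {
	if modified, err := update(false); err != nil {
		return modified, err
	}
	return modified, err
}

// assigned uses =, so the call writes into the named results directly.
func assigned() (modified bool, err error) {
	if modified, err = update(false); err != nil {
		return modified, err
	}
	return modified, err
}

func main() {
	m1, e1 := shadowed()
	m2, e2 := assigned()
	fmt.Println("shadowed:", m1, e1)
	fmt.Println("assigned:", m2, e2)
}
```

Here `shadowed` reports `false` even though the update succeeded, while `assigned` reports `true`, which is the behavior a caller relying on `modified` would expect.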
// If useless pvc owner pod does not exist, the pvc can be deleted directly,
// else update pvc's ownerreference to pod.
for _, pod := range pods {
	instanceID := clonesetutils.GetInstanceID(pod)
if kubecontroller.IsPodActive(pod) {
	continue
}

for _, pvc := range uselessPVCs {
	// It's safe to delete pvc that has no pod found.
	if modified, err := r.deletePVC(cs, pvc); err != nil && !errors.IsNotFound(err) {
if modified, err = r.deletePVC(cs, pvc); err != nil && !errors.IsNotFound(err) {
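Taken together, the two loops reviewed above amount to a two-phase cleanup: re-own the PVCs whose inactive Pod still exists, then delete the PVCs that have no Pod at all. The following is a rough, self-contained sketch of that decision logic only. The `pod` and `action` types and `planCleanup` are hypothetical, and the real controller issues API calls rather than building a plan.

```go
package main

import (
	"fmt"
	"sort"
)

// pod is a hypothetical stand-in for a Pod with its instance ID; Active
// models the result of kubecontroller.IsPodActive.
type pod struct {
	Name       string
	InstanceID string
	Active     bool
}

// action describes what to do with one useless claim.
type action struct {
	Op    string // "reown" or "delete"
	Claim string
	Owner string // Pod name for "reown", empty for "delete"
}

// planCleanup takes the useless claims keyed by instance ID (value is the
// claim name) and decides, per claim, whether to hand it to its inactive
// Pod or to delete it because no Pod was found.
func planCleanup(pods []pod, useless map[string]string) []action {
	// Work on a copy so the caller's map is not modified.
	remaining := make(map[string]string, len(useless))
	for id, name := range useless {
		remaining[id] = name
	}

	var plan []action
	for _, p := range pods {
		if p.Active {
			continue // active Pods keep their claims untouched
		}
		if name, ok := remaining[p.InstanceID]; ok {
			plan = append(plan, action{Op: "reown", Claim: name, Owner: p.Name})
			delete(remaining, p.InstanceID)
		}
	}

	// Whatever is left has no Pod; sort the IDs for a stable order.
	ids := make([]string, 0, len(remaining))
	for id := range remaining {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	for _, id := range ids {
		plan = append(plan, action{Op: "delete", Claim: remaining[id]})
	}
	return plan
}

func main() {
	pods := []pod{
		{Name: "demo-aaa", InstanceID: "aaa", Active: true},
		{Name: "demo-bbb", InstanceID: "bbb", Active: false},
	}
	useless := map[string]string{"bbb": "data-demo-bbb", "ccc": "data-demo-ccc"}
	for _, a := range planCleanup(pods, useless) {
		fmt.Printf("%s %s %s\n", a.Op, a.Claim, a.Owner)
	}
}
```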
@willise In addition, can you add an e2e test for this?
apis/apps/v1alpha1/cloneset_types.go
Outdated
@@ -93,6 +93,10 @@ type CloneSetScaleStrategy struct {
	// The scale will fail if the number of unavailable pods were greater than this MaxUnavailable at scaling up.
	// MaxUnavailable works only when scaling up.
	MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`

	// Indicate if cloneset will reuse aleady existed pvc to
aleady -> already
	return nil, err
}

// Ignore inactive pods
var activePods []*v1.Pod
for i, pod := range podList.Items {
for i, pod := range podList {
I'm very worried that there will be problems with pointers, so I suggest:
for i := range podList {
	pod := podList[i]
	// Consider all rebuild pod as active pod, should not recreate
	if kubecontroller.IsPodActive(pod) {
		activePods = append(activePods, pod)
	}
}
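The pointer concern matters most when the slice holds values rather than pointers. Before Go 1.22 the loop variable in `for _, p := range items` was a single variable reused across iterations, so storing `&p` made every stored pointer refer to the same variable. Indexing sidesteps the question in every Go version. A small sketch, with a hypothetical `pod` type and `activePointers` helper:

```go
package main

import "fmt"

// pod is a hypothetical value type standing in for an element of a list
// such as PodList.Items.
type pod struct {
	Name   string
	Active bool
}

// activePointers returns pointers to the active elements of items. Taking
// &items[i] yields the address of the slice element itself, never the
// address of a loop variable copy.
func activePointers(items []pod) []*pod {
	var active []*pod
	for i := range items {
		p := &items[i]
		if p.Active {
			active = append(active, p)
		}
	}
	return active
}

func main() {
	items := []pod{{Name: "a", Active: true}, {Name: "b"}, {Name: "c", Active: true}}
	for _, p := range activePointers(items) {
		fmt.Println(p.Name)
	}
}
```

Because the pointers refer to the slice elements, a change made through them is visible in `items`, and each pointer stays tied to its own element.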
pkg/util/ownerref_test.go
Outdated
@@ -0,0 +1,154 @@
/*
Copyright 2020 The Kruise Authors.
Copyright 2022 The Kruise Authors.
pkg/util/ownerref.go
Outdated
@@ -0,0 +1,66 @@
/*
Copyright 2021 The Kruise Authors.
Copyright 2022 The Kruise Authors.
df1f121 to 6871ed3
Signed-off-by: willise <[email protected]>
6871ed3 to ec8d173
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: furykerry
Ⅰ. Describe what this PR does
fixes #1099
Ⅱ. Does this pull request fix one issue?
fixes #1099
Ⅲ. Describe how to verify it
Ⅳ. Special notes for reviews