[VPA] check OwnerRef against TargetRef to confirm VPA/Pod association #6460
/assign @sophieliu15
@mwielgus This function also shows that even if the pod matches in terms of labels, it is not considered part of the deployment when the ownerReference chain (Pod => ReplicaSet => Deployment) is not correct.
@@ -37,7 +40,22 @@ func parseLabelSelector(selector string) labels.Selector {
 }

 func TestGetMatchingVpa(t *testing.T) {
Is there a test case where the VPA does not select a pod when its selector matches but the target ref does not?
The test was missing. I have added one where the selector matches but the ownerRef does not: a5cae3b
@@ -52,7 +70,7 @@ func TestGetMatchingVpa(t *testing.T) {
 name: "matching selector",
 pod: podBuilder.Get(),
 vpas: []*vpa_types.VerticalPodAutoscaler{
- vpaBuilder.WithUpdateMode(vpa_types.UpdateModeAuto).WithName("auto-vpa").Get(),
+ vpaBuilder.WithUpdateMode(vpa_types.UpdateModeAuto).WithName("auto-vpa").WithTargetRef(targetRef).Get(),
Please correct me if I am wrong. I thought the target ref of a VPA is usually of kind "Deployment", e.g. https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler#example-vpa-configuration
Are we assigning the ReplicationController kind here?
Once the code is submitted, does it still work for VPAs using a Deployment as target reference?
The test uses a controllerfetcher.FakeControllerFetcher to retrieve the owner reference. The implementation of this mock returns the same reference that it receives (here). Not reflecting the real ownerRef chain is not so important, because here we are not testing the code of the embedded controllerfetcher.
So for this test, the only important thing is that the reference set in the pod is the same as the one set in the VPA. If we stick to deployments we are left with 2 options, neither of which really reflects reality:
Option 1: vpa.targetRef => Deployment and pod.ownerRef => Deployment (which is not possible in reality)
Option 2: vpa.targetRef => ReplicaSet and pod.ownerRef => ReplicaSet (which does not happen in reality)
In fact there is a 3rd option that we can use in the test and that would be closer to reality: we use a StatefulSet. Here is a commit that makes that change:
3c47994
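To illustrate why the fake makes the ownerRef chain irrelevant in this test, here is a minimal sketch of an identity fetcher. The names `ControllerKey`, `Fetcher`, `FindTopMostOwner` and `identityFetcher` are hypothetical and simplified; they are not the actual controllerfetcher API.

```go
package main

import "fmt"

// ControllerKey is a hypothetical, simplified reference to a controller.
type ControllerKey struct {
	Kind      string
	Namespace string
	Name      string
}

// Fetcher is a hypothetical interface that resolves a controller
// to its top-most owner.
type Fetcher interface {
	FindTopMostOwner(key ControllerKey) (ControllerKey, error)
}

// identityFetcher mimics the behavior described for the fake:
// it returns exactly the reference it receives.
type identityFetcher struct{}

func (identityFetcher) FindTopMostOwner(key ControllerKey) (ControllerKey, error) {
	return key, nil
}

func main() {
	var f Fetcher = identityFetcher{}

	// With an identity fetcher, the pod's direct owner must already equal
	// the VPA targetRef. A StatefulSet owns its pods directly, so this
	// setup is closer to reality than using a Deployment.
	podOwner := ControllerKey{Kind: "StatefulSet", Namespace: "default", Name: "db"}
	vpaTarget := ControllerKey{Kind: "StatefulSet", Namespace: "default", Name: "db"}

	top, err := f.FindTopMostOwner(podOwner)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("matches target:", top == vpaTarget)
}
```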
Just to clarify, the functional code does traverse ownerRefs (pod -> replica-set -> deployment)?
If this test is really not meant to test the owner chain resolution (and if we really have another test making sure that still happens correctly) I'm okay with a surreal test case given it's explained in a comment.
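For readers following the thread, the owner chain resolution being asked about can be sketched roughly as below. This is an illustration under assumptions, not the functional code: `Ref`, `topMostOwner` and the in-memory `owners` map are hypothetical stand-ins for the API lookups.

```go
package main

import "fmt"

// Ref is a hypothetical, simplified object reference.
type Ref struct {
	Kind string
	Name string
}

// topMostOwner follows controller links upward, starting at start, and
// returns the last object that has no further owner. The owners map stands
// in for API lookups. The seen set guards against reference cycles.
func topMostOwner(start Ref, owners map[Ref]Ref) Ref {
	current := start
	seen := map[Ref]bool{current: true}
	for {
		next, ok := owners[current]
		if !ok || seen[next] {
			return current
		}
		seen[next] = true
		current = next
	}
}

func main() {
	pod := Ref{Kind: "Pod", Name: "web-123-abc"}
	rs := Ref{Kind: "ReplicaSet", Name: "web-123"}
	deploy := Ref{Kind: "Deployment", Name: "web"}

	// pod -> replica-set -> deployment
	owners := map[Ref]Ref{pod: rs, rs: deploy}

	top := topMostOwner(pod, owners)
	fmt.Println("top-most owner is the deployment:", top == deploy)
}
```

An orphan pod has no entry in the map, so in this sketch the walk returns the pod itself, which can never equal a VPA targetRef of kind Deployment.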
58d117c to a5cae3b
/lgtm
cc: @kwiesmueller for final code review and approval.
Overall this looks good to me.
I'd prefer that we clarify with API review if this change is okay. I think we'll need to cut this as a behavior change.
@sophieliu15 can you look into the API question? Feel free to loop me in if help is needed.
@kwiesmueller, following your comment, I am adding a comment to the test:
/lgtm
This PR looks good to me but I don't have approval rights. Please contact other reviewers on the list for review and approval. Thank you!
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dbenque, kwiesmueller
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Complete with fix:
What type of PR is this?
/kind bug
What this PR does / why we need it:
The function GetControllingVPAForPod tries to match pods with VPAs. The problem is that it does so based on labels only. If a pod has labels that match the labelSelector computed for the VPA, the program considers the pod to be controlled by that VPA. This is not always true.
For example, an orphan pod (no controller) can match the labelSelector, but it should not be associated with any VPA. We had that particular case in our infrastructure. Another, theoretical, case is that two deployments with the same (or overlapping) set of labels could result in a wrong pod<=>VPA assignment.
To fix this kind of issue, this PR proposes that the function GetControllingVPAForPod validate that the pod matches not only in terms of labels but also in terms of ownerRef<=>targetRef.
The first commit is just a code move to expose the ControllerFetcher to other packages.
The second commit adds the missing bits for the ownerRef<=>targetRef validation.
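As a rough illustration of the intended check, the sketch below matches a pod against a VPA only when both the labels and the owner agree. It is a simplified model under assumptions, not the code in this PR: `ControllerRef`, `Pod`, `VPA`, `selectorMatches` and `controllingVPA` are hypothetical stand-ins, selectors are reduced to exact key/value maps, and the first matching VPA wins.

```go
package main

import "fmt"

// ControllerRef is a hypothetical, simplified stand-in for an ownerRef or targetRef.
type ControllerRef struct {
	Kind string
	Name string
}

// Pod is a simplified stand-in, not the real Kubernetes API type.
type Pod struct {
	Name   string
	Labels map[string]string
	// TopOwner is the top-most controller of the pod; nil for an orphan pod.
	TopOwner *ControllerRef
}

// VPA is a simplified stand-in, not the real VPA API type.
type VPA struct {
	Name      string
	Selector  map[string]string
	TargetRef ControllerRef
}

// selectorMatches reports whether every key/value in selector is present in labels.
func selectorMatches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if got, ok := labels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

// controllingVPA returns the first VPA whose selector matches the pod's labels
// and whose targetRef equals the pod's top-most owner, or nil if there is none.
func controllingVPA(pod Pod, vpas []VPA) *VPA {
	for i := range vpas {
		v := &vpas[i]
		if !selectorMatches(v.Selector, pod.Labels) {
			continue
		}
		if pod.TopOwner == nil || *pod.TopOwner != v.TargetRef {
			continue
		}
		return v
	}
	return nil
}

func main() {
	vpas := []VPA{{
		Name:      "web-vpa",
		Selector:  map[string]string{"app": "web"},
		TargetRef: ControllerRef{Kind: "Deployment", Name: "web"},
	}}
	owned := Pod{
		Name:     "web-1",
		Labels:   map[string]string{"app": "web"},
		TopOwner: &ControllerRef{Kind: "Deployment", Name: "web"},
	}
	orphan := Pod{Name: "orphan", Labels: map[string]string{"app": "web"}}

	fmt.Println("owned pod has a VPA:", controllingVPA(owned, vpas) != nil)
	fmt.Println("orphan pod has a VPA:", controllingVPA(orphan, vpas) != nil)
}
```

With a label-only check, both pods in `main` would be associated with `web-vpa`; with the additional owner comparison, only the pod owned by the `web` deployment is.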
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?
NONE
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: