topology-updater:compute pod set fingerprint #1049

Merged
merged 3 commits into from
Feb 22, 2023

Conversation

Contributor

@jlojosnegros jlojosnegros commented Feb 3, 2023

Add an option to compute the fingerprint of the current pod set on each node.
Report this new fingerprint using an attribute in the newly released v1alpha2 NRT object.

needs: #1053
address: #1048
see: this issue for more info about new NRT api version
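
To illustrate what a pod set fingerprint is meant to provide, here is a minimal sketch. It is not the algorithm used by the podfingerprint package referenced in this PR; `PodID` and `fingerprint` are hypothetical names, and SHA-256 over sorted `namespace/name` keys is an assumption chosen only to show the key property: the same set of pods yields the same value regardless of listing order.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// PodID identifies a pod by namespace and name (hypothetical type for this sketch).
type PodID struct {
	Namespace string
	Name      string
}

// fingerprint returns an order-independent digest of the given pod set.
// Assumption: this is a stand-in, NOT the podfingerprint package's algorithm.
func fingerprint(pods []PodID) string {
	keys := make([]string, 0, len(pods))
	for _, p := range pods {
		keys = append(keys, p.Namespace+"/"+p.Name)
	}
	// Sorting makes the result independent of the order pods were listed in.
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0}) // separator between keys
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := []PodID{{"default", "web-0"}, {"kube-system", "dns-1"}}
	b := []PodID{{"kube-system", "dns-1"}, {"default", "web-0"}}
	// The two slices hold the same set in a different order.
	fmt.Println(fingerprint(a) == fingerprint(b))
}
```

A consumer that computes the same digest over the pods it believes are running on the node can compare it with the reported attribute to detect a stale view.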

@linux-foundation-easycla

linux-foundation-easycla bot commented Feb 3, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: jlojosnegros / name: Jose Luis Ojosnegros (dd36774)

@netlify

netlify bot commented Feb 3, 2023

Deploy Preview for kubernetes-sigs-nfd ready!

Name Link
🔨 Latest commit b650150
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-nfd/deploys/63f5def97b61b300082c122a
😎 Deploy Preview https://deploy-preview-1049--kubernetes-sigs-nfd.netlify.app

@k8s-ci-robot
Contributor

Welcome @jlojosnegros!

It looks like this is your first PR to kubernetes-sigs/node-feature-discovery 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/node-feature-discovery has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 3, 2023
@k8s-ci-robot
Contributor

Hi @jlojosnegros. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Feb 3, 2023
@ffromani
Contributor

ffromani commented Feb 3, 2023

/cc

@k8s-ci-robot k8s-ci-robot requested a review from ffromani February 3, 2023 14:32
@ffromani
Contributor

ffromani commented Feb 3, 2023

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 3, 2023
Contributor

@ffromani ffromani left a comment

thanks for the PR! initial review.
Do we have plans to add integration/e2e tests?

pkg/nfd-topology-updater/nfd-topology-updater.go Outdated Show resolved Hide resolved
@@ -172,7 +172,7 @@ func (w *nfdTopologyUpdater) Stop() {
}
}

-func (w *nfdTopologyUpdater) updateNodeResourceTopology(zoneInfo v1alpha1.ZoneList) error {
+func (w *nfdTopologyUpdater) updateNodeResourceTopology(zoneInfo v1alpha1.ZoneList, annotations map[string]string) error {
Contributor

we should evaluate wrapping the args in a struct here

Contributor Author

I have changed it to use ScanResponse.
I could have created a new struct including all the params (right now only ZoneList), but as this is a "private" method that is called only once, it does not seem worth it.
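
A rough sketch of the idea behind passing a single response struct around follows. The `ScanResponse` name comes from this thread; the field names, the `Zone` and `AttributeInfo` stand-in types, and the `scan` and `update` functions are assumptions made for illustration, not the code in this PR.

```go
package main

import "fmt"

// Zone stands in for the NRT API zone type; only the name is modeled here.
type Zone struct{ Name string }

// AttributeInfo mirrors the Name/Value pair seen in the review snippets.
type AttributeInfo struct{ Name, Value string }

// ScanResponse groups everything a scan produces.
// Assumption: the field names are illustrative.
type ScanResponse struct {
	Zones      []Zone
	Attributes []AttributeInfo
}

// scan is a hypothetical scanner that returns one struct instead of several values.
func scan() ScanResponse {
	return ScanResponse{
		Zones:      []Zone{{Name: "node-0"}},
		Attributes: []AttributeInfo{{Name: "example/attr", Value: "v"}},
	}
}

// update receives the whole response, so adding a field later
// does not require changing this signature.
func update(resp ScanResponse) string {
	return fmt.Sprintf("%d zones, %d attributes", len(resp.Zones), len(resp.Attributes))
}

func main() {
	fmt.Println(update(scan()))
}
```

The trade-off is the one discussed above: a struct keeps signatures stable as data is added, at the cost of one more type for a method that is called from a single place.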

}

// NewPodResourcesScanner creates a new ResourcesScanner instance
-func NewPodResourcesScanner(namespace string, podResourceClient podresourcesapi.PodResourcesListerClient, kubeApihelper apihelper.APIHelpers) (ResourcesScanner, error) {
+func NewPodResourcesScanner(namespace string, podResourceClient podresourcesapi.PodResourcesListerClient, kubeApihelper apihelper.APIHelpers, podFingerprint bool) (ResourcesScanner, error) {
Contributor

also here we should consider moving the args into a struct, but this can be deferred to a later PR

pkg/resourcemonitor/podresourcesscanner.go Outdated Show resolved Hide resolved
pkg/resourcemonitor/podresourcesscanner_test.go Outdated Show resolved Hide resolved
@jlojosnegros jlojosnegros force-pushed the node-signature branch 3 times, most recently from 28a6141 to abab034 Compare February 7, 2023 15:57
go.mod Outdated Show resolved Hide resolved
pkg/resourcemonitor/podresourcesscanner_test.go Outdated Show resolved Hide resolved
@jlojosnegros jlojosnegros force-pushed the node-signature branch 3 times, most recently from fcb05bb to f941cd7 Compare February 8, 2023 11:49
@PiotrProkop
Contributor

I think we should create a separate PR that upgrades the NRT API to v1alpha2 and not bundle it in this one (you are missing changes to the kustomize and helm deployments to install the new CRDs).

@ffromani
Contributor

ffromani commented Feb 9, 2023

I think we should create a separate PR that upgrades the NRT API to v1alpha2 and not bundle it in this one (you are missing changes to the kustomize and helm deployments to install the new CRDs).

good point! I'm fine with both approaches (and yes your suggestion is cleaner indeed)

@PiotrProkop
Contributor

I think we should create a separate PR that upgrades the NRT API to v1alpha2 and not bundle it in this one (you are missing changes to the kustomize and helm deployments to install the new CRDs).

good point! I'm fine with both approaches (and yes your suggestion is cleaner indeed)

I just feel like we can easily miss important things we need to do with an upgrade like advertising Policies and Scope with Attributes 😄

@ffromani
Contributor

ffromani commented Feb 9, 2023

I think we should create a separate PR that upgrades the NRT API to v1alpha2 and not bundle it in this one (you are missing changes to the kustomize and helm deployments to install the new CRDs).

good point! I'm fine with both approaches (and yes your suggestion is cleaner indeed)

I just feel like we can easily miss important things we need to do with an upgrade like advertising Policies and Scope with Attributes 😄

Yes, thinking about it you're right. Let's split the update to its own PR

@jlojosnegros
Contributor Author

Let's split this PR and create a new one only for the NRT API update to v1alpha2
/hold

var status podfingerprint.Status
podFingerprintSign, err := computePodFingerprint(respPodResources, &status)
if err != nil {
	klog.Errorf("podFingerprint: Unable to compute fingerprint %v", err)
Contributor

Should we in this case remove the existing pod fingerprint attribute?

Contributor

@ffromani ffromani Feb 20, 2023

Should we in this case remove the existing pod fingerprint attribute?

Personally, I'm fine either way. From the scheduler plugin's perspective, clients consuming this attribute must tolerate both the attribute disappearing (a previous update added it, a later update removes it) and obsolete values.
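
A minimal sketch of what tolerating a missing attribute can look like on the consumer side follows. The `AttributeInfo` type is a local stand-in with the same Name/Value shape used in this PR's snippets; `findAttribute` and the attribute name are hypothetical. Detecting obsolete values is outside the scope of this sketch.

```go
package main

import "fmt"

// AttributeInfo is a local stand-in for the NRT attribute type (Name/Value pair).
type AttributeInfo struct{ Name, Value string }

// findAttribute returns the value for name and whether it was present,
// so callers can tell "absent" apart from "present but empty".
func findAttribute(attrs []AttributeInfo, name string) (string, bool) {
	for _, a := range attrs {
		if a.Name == name {
			return a.Value, true
		}
	}
	return "", false
}

func main() {
	// Hypothetical attribute name, used only for this example.
	const fingerprintAttr = "example.io/pod-fingerprint"

	attrs := []AttributeInfo{{Name: "other", Value: "x"}}
	if fp, ok := findAttribute(attrs, fingerprintAttr); ok {
		fmt.Println("fingerprint:", fp)
	} else {
		// The attribute may vanish between updates; fall back instead of failing.
		fmt.Println("no fingerprint reported; falling back")
	}
}
```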

Contributor Author

Here we are computing the ScanResponse. The podfingerprint attribute will not be added to the ScanResponse in case of an error.

Removing the existing attribute from the NRT should be done in the nfd-topology-updater's updateNodeResourceTopology function, when updating the attribute list.
That can be done, but it would go against this (#1049 (comment)) and would require some changes in the updateAttribute function.
My understanding is that those changes will wait for the new utility functions in @ffromani's PR.
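
For reference, the "set on success, remove on failure" behaviour being discussed could be sketched roughly as below. This is not the updateAttribute function from this PR nor the utility functions mentioned above; `setOrRemove` and the local `AttributeInfo` type are hypothetical, written only to show the shape of the change.

```go
package main

import "fmt"

// AttributeInfo is a local stand-in for the NRT attribute type (Name/Value pair).
type AttributeInfo struct{ Name, Value string }

// setOrRemove returns a copy of attrs in which name is set to value when
// present is true, and removed when present is false.
// Assumption: hypothetical helper, not part of the PR.
func setOrRemove(attrs []AttributeInfo, name, value string, present bool) []AttributeInfo {
	out := make([]AttributeInfo, 0, len(attrs)+1)
	for _, a := range attrs {
		if a.Name != name {
			out = append(out, a) // keep every unrelated attribute
		}
	}
	if present {
		out = append(out, AttributeInfo{Name: name, Value: value})
	}
	return out
}

func main() {
	attrs := []AttributeInfo{{Name: "a", Value: "1"}, {Name: "fp", Value: "old"}}

	// Fingerprint computed successfully: the old value is replaced.
	fmt.Println(setOrRemove(attrs, "fp", "new", true))

	// Fingerprint computation failed: the stale value is dropped.
	fmt.Println(setOrRemove(attrs, "fp", "", false))
}
```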

pkg/nfd-topology-updater/nfd-topology-updater.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 21, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 21, 2023
Comment on lines 280 to 284
for k, v := range check {
	ret = append(ret, v1alpha2.AttributeInfo{Name: k, Value: v})
}
Contributor

In general I'd have squashed the fixing commit here. Codewise looks reasonable.
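
One detail worth keeping in mind with loops like the one commented on above: Go does not guarantee map iteration order, so a list built by ranging over a map can come out in a different order on each run. Where a stable order matters (for example when comparing lists in tests), a sorted variant along these lines can help. `toAttributes` and the local `AttributeInfo` type are hypothetical and not taken from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// AttributeInfo is a local stand-in for the NRT attribute type (Name/Value pair).
type AttributeInfo struct{ Name, Value string }

// toAttributes converts a map into an attribute list sorted by name,
// so the result does not depend on Go's unspecified map iteration order.
func toAttributes(m map[string]string) []AttributeInfo {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	out := make([]AttributeInfo, 0, len(keys))
	for _, k := range keys {
		out = append(out, AttributeInfo{Name: k, Value: m[k]})
	}
	return out
}

func main() {
	fmt.Println(toAttributes(map[string]string{"b": "2", "a": "1"}))
}
```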

pkg/nfd-topology-updater/nfd-topology-updater_test.go Outdated Show resolved Hide resolved
pkg/nfd-topology-updater/nfd-topology-updater_test.go Outdated Show resolved Hide resolved
Contributor

@ffromani ffromani left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 21, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: f46014c9128494f91dde8889772ada0f4f27f3c4

Contributor

@marquiz marquiz left a comment

Thanks for working on this @jlojosnegros. I think the code is basically ready to be merged (I had one nit about the cmdline flag description).

However, I probably forgot to ask this before, but we need to update the documentation (docs/reference/topology-updater-commandline-reference.md) to describe the new command line flag.

Also, could you add support for the new flag in the Helm chart:

  • add something like topologyUpdater.podSetFingerprint into deployment/helm/node-feature-discovery/values.yaml
  • use it in deployment/helm/node-feature-discovery/templates/topologyupdater.yaml
  • document the new helm parameter in docs/deployment/helm.md
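
For context on what such a Helm value ends up toggling, a boolean command line flag of this kind can be sketched with the standard library `flag` package as below. The flag name `pod-set-fingerprint`, the `Args` struct and `parseArgs` are assumptions for illustration only; they are not claimed to match what cmd/nfd-topology-updater/main.go defines.

```go
package main

import (
	"flag"
	"fmt"
)

// Args holds the parsed options. Assumption: the field name is illustrative.
type Args struct {
	PodSetFingerprint bool
}

// parseArgs parses argv into Args using its own FlagSet,
// which keeps the function easy to test.
func parseArgs(argv []string) (Args, error) {
	var args Args
	fs := flag.NewFlagSet("nfd-topology-updater", flag.ContinueOnError)
	// "pod-set-fingerprint" is a hypothetical flag name for this sketch.
	fs.BoolVar(&args.PodSetFingerprint, "pod-set-fingerprint", false,
		"Compute and report the fingerprint of the pod set on the node.")
	err := fs.Parse(argv)
	return args, err
}

func main() {
	args, err := parseArgs([]string{"-pod-set-fingerprint"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("pod set fingerprint enabled:", args.PodSetFingerprint)
}
```

A chart template would then only need to append the flag to the container arguments when the corresponding value is true.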

cmd/nfd-topology-updater/main.go Outdated Show resolved Hide resolved
@jlojosnegros
Contributor Author

/test pull-node-feature-discovery-build-image-cross-generic

We are going to add new data to the Scan response, so introduce a new
ScanResponse struct as the Scan return value to make that easier.
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 22, 2023
Add an option to compute the fingerprint of the current pod set on each
node.

Report this new fingerprint using an attribute in NRT object.
Contributor

@ffromani ffromani left a comment

/lgtm

updates to docs and helm chart seem good

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 22, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 3a4c5d93b06792657da6f0b572eb7aa16ffb34ce

Contributor

@marquiz marquiz left a comment

Thanks @jlojosnegros for the quick update. Looks good to me now 👍 If something's missing let's fix it in subsequent PRs

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ArangoGutierrez, jlojosnegros, marquiz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 22, 2023
@k8s-ci-robot k8s-ci-robot merged commit 163a6dc into kubernetes-sigs:master Feb 22, 2023
@marquiz marquiz mentioned this pull request Apr 12, 2023
24 tasks
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
7 participants