-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add hugepage info to v1 node structure #2304
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with What to do if you already signed the CLAIndividual signers
Corporate signers
ℹ️ Googlers: Go here for more info. |
Hi @ohsewon. Thanks for your PR. I'm waiting for a google or kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
@bg-chun: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@googlebot I signed it! |
CLAs look good, thanks! ℹ️ Googlers: Go here for more info. |
92b6dfa
to
d474239
Compare
This PR is related to the below issue and KEP. And also mentioned in the below slack thread. |
/assign dashpole |
/ok-to-test |
Let me know once the KEP is approved, and i'll take a look at this. |
65ea93f
to
d144da2
Compare
@ohsewon
|
d144da2
to
07b678e
Compare
Signed-off-by: sewon.oh <[email protected]>
07b678e
to
d138b59
Compare
/test pull-cadvisor-e2e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some small comments, but the overall implementation looks good to me.
Nice work. 👍
A few tests doesn't hurt, but that may be hard when we read kernel files inside the same functions as the logic tho.
machine/machine.go
Outdated
} | ||
|
||
hugePagesInfo = append(hugePagesInfo, info.HugePagesInfo{ | ||
PageSize: pageSize / 1024, // Convert to kB. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this size already in kB
?
A test case would be suitable I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the normal hugepage variant they use the pageSize directly, and since it is parsed from hugepages-<xyz>kB
that should be correct.
Line 90 in f5e7ddf
PageSize: pageSize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems we should be able to reuse the common logic for how we parse /sys/mm/kernel/hugepages with what we parse here so we dont do it differently. can we consolidate on a single approach and reuse what can be reused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree that we should reuse as much as possible. This is handled together in the Kernel, so imo. it makes perfect sense to just implement this once in cadvisor. The only thing that differ between them is the filepath, and maybe a bit different error handling, but the implementation should be similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@odinuge
In the normal hugepage variant they use the pageSize directly, and since it is parsed from hugepages-<xyz>kB that should be correct.
=> It makes sense!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@derekwaynecarr @odinuge
it seems we should be able to reuse the common logic
can we consolidate on a single approach
Yes, I agree that we should reuse as much as possible.
=> I totally agree with it. :)
IMO, I think the below strategy is the best option for us.
- Move
GetHugePagesInfo()
frominfo.go
tomachine.go
.
(It means that we will use existing way to parse hugepage capacity, not a regex.) - Then make it(
machine.GetHugePagesInfo()
) to take filepath and to parse capacity from a given file forreusability
.
(it will bemachine.GetHugePagesInfo(path string)
) - Now, we can reuse it(
machine.GetHugePagesInfo(path string)
) in both ofinfo.go
andmachine.go
. - In
info.go
,machine.GetHugePagesInfo(path string)
will be used to parse capacity at the machine level. - in 'machine.go',
machine.GetHugePagesInfo(path string)
will be used to parse capacity at the NUMA node level.
How about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed duplicated logic.
machine/machine.go
Outdated
return nil, nil | ||
} | ||
|
||
for _, file := range files { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we cannot reuse some of the code from
Line 66 in f5e7ddf
for _, st := range files { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we consolidate how we parse capacity in GetHugePagesInfo
in info.go with how we would parse here? I am fine with the regex approach, I just want us to have a common approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left the comment for this below.
machine/machine.go
Outdated
@@ -191,6 +194,45 @@ func getNodeIdFromCpuBus(cpuBusPath string, threadId int) (int, error) { | |||
return nodeId, nil | |||
} | |||
|
|||
/* Look for per-node hugepages info using node id */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you switch to normal //
comments here to keep the consistency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is intended for the consistency.
Author(of this PR) and I referred below functions in this file.
Those functions have similar usage(parsing info from the given path) and they have comment style like below.
/* Look for sysfs cpu path containing core_id */
/* Such as: sys/bus/cpu/devices/cpu0/topology/core_id */
Take look below functions.
Line 135 in f5e7ddf
func getCoreIdFromCpuBus(cpuBusPath string, threadId int) (int, error) { |
Line 159 in f5e7ddf
func getNodeIdFromCpuBus(cpuBusPath string, threadId int) (int, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes total sense to get in cAdvisor, I would like us to just avoid duplicating parsing logic for machine versus node info. Can you update so we have common logic used in both cases and address the serialization token used to map what is used for machine?
info/v1/machine.go
Outdated
Cores []Core `json:"cores"` | ||
Caches []Cache `json:"caches"` | ||
Memory uint64 `json:"memory"` | ||
HugePages []HugePagesInfo `json:"huge_pages"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be hugepages
so its consistent with serialization in MachineInfo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed it.
machine/machine.go
Outdated
@@ -191,6 +194,45 @@ func getNodeIdFromCpuBus(cpuBusPath string, threadId int) (int, error) { | |||
return nodeId, nil | |||
} | |||
|
|||
/* Look for per-node hugepages info using node id */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
machine/machine.go
Outdated
return nil, nil | ||
} | ||
|
||
for _, file := range files { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we consolidate how we parse capacity in GetHugePagesInfo
in info.go with how we would parse here? I am fine with the regex approach, I just want us to have a common approach.
machine/machine.go
Outdated
} | ||
|
||
hugePagesInfo = append(hugePagesInfo, info.HugePagesInfo{ | ||
PageSize: pageSize / 1024, // Convert to kB. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems we should be able to reuse the common logic for how we parse /sys/mm/kernel/hugepages with what we parse here so we dont do it differently. can we consolidate on a single approach and reuse what can be reused?
64be0d1
to
ae53af9
Compare
Signed-off-by: sewon.oh <[email protected]>
ae53af9
to
cb3a2be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating. Some nit, but otherwise lgtm. It is possible to write some tests (and that would be nice), but i'll defer the decision to the google folks on whether they require it or not.
/lgtm
@@ -191,6 +193,45 @@ func getNodeIdFromCpuBus(cpuBusPath string, threadId int) (int, error) { | |||
return nodeId, nil | |||
} | |||
|
|||
// GetHugePagesInfo returns information about pre-allocated huge pages | |||
func GetHugePagesInfo(hugepagesDirectory string) ([]info.HugePagesInfo, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment about what directories this function wants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This makes it easy to write some simple tests.
There are already some tests her: https://github.com/google/cadvisor/blob/master/machine/topology_test.go. The coverage of unit tests is low, but some tests never hurt I guess. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your review. I updated.
Co-Authored-By: Odin Ugedal <[email protected]>
31199d8
to
455eaa2
Compare
/retest |
455eaa2
to
3814c69
Compare
Signed-off-by: sewon.oh <[email protected]>
3814c69
to
255729d
Compare
Thank you for making the update and adding tests! /lgtm |
@dashpole are you able to help merge this? |
Done. |
Purpose of this PR: Add the new field which describes the number of pre-allocate hugepages per NUMA node to v1 node structure.
The node structure is used to describe the resources of the NUMA node, inside the MachineInfo structure.
The new field will describe the number of pre-allocate hugepages per NUMA node.
The purpose of the new filed is to put additional information for future usage.
The future usage means guaranteeing alignment of node resources such as CPU, GPU, NIC, and hugepages by Kubelet component like Topology Manager.
Signed-off-by: sewon.oh [email protected]