More documentation on CPU monitoring #118

Merged Sep 9, 2024 (3 commits)


@wbjin (Collaborator) commented Sep 8, 2024

  • Check whether gpu_indices is empty before syncing execution, so that synchronization is skipped when only the CPU is being measured
  • Add documentation on the availability of RAPL, how to specify cpu_index, and how to start a Docker container for monitoring the CPU
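
The first item can be illustrated with a minimal sketch. The names `sync_execution_if_needed` and `sync_fn` are hypothetical and are not taken from the Zeus codebase; the sketch only shows the idea of skipping synchronization when no GPU indices are being monitored.

```python
from typing import Callable, Sequence


def sync_execution_if_needed(
    gpu_indices: Sequence[int],
    sync_fn: Callable[[Sequence[int]], None],
) -> bool:
    """Call sync_fn only when at least one GPU is being monitored.

    Returns True if synchronization ran and False if it was skipped.
    Both this function and sync_fn are hypothetical stand-ins.
    """
    if not gpu_indices:
        # CPU-only measurement: there is no GPU work to wait for.
        return False
    sync_fn(gpu_indices)
    return True
```

With an empty `gpu_indices`, the callback is never invoked, which is the behavior the change is meant to guarantee for CPU-only measurement.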

@wbjin requested a review from @jaywonchung on September 8, 2024 18:20
@jaywonchung (Member)

Thanks for your work! In general it looks good and I can do a little nitpicking myself, but there is one thing that doesn't quite align with what I experienced on one of our Optane nodes:

$ docker run --rm -it -v /sys/class/powercap:/zeus_sys/class/powercap ubuntu:latest bash
root@03fdfd11b520:/# cat /zeus_sys/class/powercap/intel-rapl:0/energy_uj
cat: '/zeus_sys/class/powercap/intel-rapl:0/energy_uj': No such file or directory

$ docker run --rm -it -v /sys/devices/virtual/powercap:/zeus_sys/devices/virtual/powercap -v /sys/class/powercap:/zeus_sys/class/powercap ubuntu:latest bash
root@4d662a3c13e3:/# cat /zeus_sys/class/powercap/intel-rapl:0/energy_uj
23737296043

The above is what I just ran and copy-pasted. So mounting something from /sys/devices also seemed necessary. Could you look into this?

@jaywonchung (Member)

Offline discussion -- mounting /sys/class/powercap/intel-rapl is enough as Docker will resolve symlinks.
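
As a quick way to check a mount from inside the container, the counter read by the `cat` commands above can also be read from Python. This is a minimal sketch: `read_rapl_energy_uj` is a hypothetical helper, not part of Zeus, and the `/zeus_sys/class/powercap` root is simply the mount point used in the commands above.

```python
from pathlib import Path
from typing import Optional


def read_rapl_energy_uj(
    powercap_root: str, domain: str = "intel-rapl:0"
) -> Optional[int]:
    """Return the RAPL energy counter in microjoules, or None if unreadable.

    Hypothetical helper: powercap_root is wherever the host's powercap
    directory is visible, e.g. "/zeus_sys/class/powercap" in the container.
    """
    counter = Path(powercap_root) / domain / "energy_uj"
    try:
        return int(counter.read_text().strip())
    except OSError:
        # Covers a missing file, a dangling symlink, or missing permissions.
        return None
```

A `None` result corresponds to the "No such file or directory" case shown earlier, where the mounted path does not lead to a readable `energy_uj` file.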

@jaywonchung (Member)

@wbjin Pushed some changes -- could you take a look?

@wbjin (Collaborator, Author) commented Sep 9, 2024

LGTM!

@jaywonchung merged commit 78106db into master on Sep 9, 2024
2 checks passed
@jaywonchung deleted the cpu-documentation branch on September 9, 2024 21:37