-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for kernel lock analysis #114
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks very good. I left minor suggestions inline.
-
Please update README.md in the project's root directory to include the 'lock' command (similar to other commands).
-
Currently, only the txt report is readable. Please add an HTML renderer for equivalent readability.
We can consider other formats and more report format improvements in the future. Ideally, we will add Insights to help interpret the large amount of data presented to the user. -
The Lock Contention field isn't working for me. Likely due to the build/version of perf being used by PerfSpect. See tools/Makefile.
Lock Contention: Error: switch `b' is not available because no BUILD_BPF_SKEL=1
Usage: perf lock contention [<options>]
-b, --use-bpf use BPF program to collect lock contention stats
(not built-in because no BUILD_BPF_SKEL=1)
@harp-intel very appreciated for your review! I will resolve all your suggestions in the coming days and add test cases accordingly. Regarding to the perf lock contention, you are right, the -b option might not be supported. I should check the capability and by pass the perf lock contention if the feature missing on perf. Thanks. |
You are definitely right on the report part. I'd like to further improve it based on the feedbacks and also my personal experiences. For example, it's possible to see the lock cycle overall percentage and the lock cycle distribution with call stacks. The off-cpu analysis is also very useful when combined with lock contention information. As you pointed out, it could be further improvement on the reporting part, right now we just begin with a very rough feature that useable but not quite convenient. |
@TianyouLi Well done. I appreciate the updates. PerfSpect builds and packages a specific version of perf. This perf doesn't support the bpf option. So, let's try to make PerfSpect build and use perf that has the required bpf support. |
It should work with BUILD_BPF_SKEL=1 in tools/perf makefile tools/Makefile (make LDFLAGS="-static --static" BUILD_BPF_SKEL=1). The problem is I can not test it in my environment, I've never get the full build environment work correctly. I checked with the source, the current version of perf has already contains the support on BPF. But it relies on the build environment a lot to enable it. Since this feature could survive without perf lock contention -bv, would it be possible to have this feature merged and then deal with the perf tool issue? |
@harp-intel, on my dev box without any perfspect related , i tried the perf build with make LDFLAGS="-static --static" BUILD_BPF_SKEL=1, the following error appears. I don't think my build environment is aligned with yours. |
https://github.com/intel/PerfSpect/actions/runs/12229325293/job/34108928079?pr=116 It's not a hard dependencies. Would it be possible to merge this feature first? |
cf9ddcd
to
1e0ffe7
Compare
As system contains more and more cores, it would be usuful to be able to analyze the kernel lock overhead that impact the overall system scalability. Here added the perfspect lock comment to collect system wide hotspot, c2c and lock contention information, that will be helpful experienced performance engineers to have a chance to look into lock related problems.
Example Command:
./perfspect lock
collect hotspot, c2c and lock contention at the whole system in 10 seconds, with sample frequency as 11
./perfspect lock --duration 30 --frequency 99
collect hotspot, c2c and lock contention at the whole system in 30 seconds, with sample frequency as 99
Example Output:
The output will be the perf command output with text format.