i#6486 kernel tracing: Include BPF JIT code in kcore dump #6619
Fixes missing instruction encodings for some kernel code execution captured using Intel-PT. The root-cause seemed to be that JIT code executed by the kernel, eBPF code in this case, does not have entries in /proc/kallsyms, so our kcore dump logic did not include them. This fix looks for BPF related symbols in /proc/kallsyms and includes them in the copied regions from /proc/kcore.

Note that BPF JIT symbols are not included in /proc/kallsyms by default. One needs to set /proc/sys/net/core/bpf_jit_harden and /proc/sys/net/core/bpf_jit_kallsyms appropriately (see https://docs.kernel.org/admin-guide/sysctl/net.html#proc-sys-net-core-network-core-options for more details). Added this suggestion to documentation.

Tested PT tracing related tests locally on a machine that supports Intel-PT:

$ ctest -R 'drpttracer|drcacheoff.kernel'
...
Start 213: code_api|client.drpttracer_SUDO-test
[sudo] password for sharmaabhinav:
1/5 Test #213: code_api|client.drpttracer_SUDO-test ..................... Passed 4.29 sec
Start 412: code_api|tool.drcacheoff.kernel.simple_SUDO
2/5 Test #412: code_api|tool.drcacheoff.kernel.simple_SUDO .............. Passed 4.66 sec
Start 413: code_api|tool.drcacheoff.kernel.opcode-mix_SUDO
3/5 Test #413: code_api|tool.drcacheoff.kernel.opcode-mix_SUDO .......... Passed 4.71 sec
Start 414: code_api|tool.drcacheoff.kernel.syscall-mix_SUDO
4/5 Test #414: code_api|tool.drcacheoff.kernel.syscall-mix_SUDO ......... Passed 4.59 sec
Start 415: code_api|tool.drcacheoff.kernel.invariant-checker_SUDO
5/5 Test #415: code_api|tool.drcacheoff.kernel.invariant-checker_SUDO ... Passed 5.75 sec
100% tests passed, 0 tests failed out of 5

Issue: #6486
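To make "set appropriately" concrete, here is a minimal sketch of the condition under which the kernel exports BPF JIT symbols, based on the linked sysctl documentation: `bpf_jit_kallsyms` must be 1 and `bpf_jit_harden` must be 0, since hardening disables the export. The function names and the `read_sysctl` helper are hypothetical and not part of this PR.

```python
from pathlib import Path

# Directory named in the PR description.
SYSCTL_DIR = "/proc/sys/net/core"


def jit_symbols_visible(bpf_jit_kallsyms: int, bpf_jit_harden: int) -> bool:
    """Return True if BPF JIT symbols should appear in /proc/kallsyms.

    Per the kernel sysctl docs, export needs bpf_jit_kallsyms == 1 and is
    turned off whenever bpf_jit_harden is non-zero.
    """
    return bpf_jit_kallsyms == 1 and bpf_jit_harden == 0


def read_sysctl(name: str, base: str = SYSCTL_DIR) -> int:
    """Hypothetical helper: read one integer sysctl value from procfs."""
    return int(Path(base, name).read_text().strip())


def host_exports_jit_symbols(base: str = SYSCTL_DIR) -> bool:
    """Check the running machine; may need privileges to read the files."""
    return jit_symbols_visible(
        read_sysctl("bpf_jit_kallsyms", base),
        read_sysctl("bpf_jit_harden", base),
    )
```

Calling `host_exports_jit_symbols()` before starting a kernel trace would tell you whether the BPF regions can be found at all; even when it returns True, `/proc/kallsyms` shows real addresses only to privileged readers.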
I don't understand the address set to module region code.
On doing more stress testing: unfortunately the decode errors do not go away completely even after this fix, but they have |
Fixes drmemtrace kernel trace libipt post-processing failures caused by missing instruction encodings for some kernel code execution captured using Intel-PT. The root cause seems to be that JIT code executed by the kernel, BPF code in this case, does not have entries in `/proc/modules`, so our kcore dump logic did not include it. This fix looks for BPF-related symbols in `/proc/kallsyms` and includes them in the regions copied from `/proc/kcore`.

Note that BPF JIT symbols are not included in `/proc/kallsyms` by default. One needs to set `/proc/sys/net/core/bpf_jit_harden` and `/proc/sys/net/core/bpf_jit_kallsyms` appropriately (see https://docs.kernel.org/admin-guide/sysctl/net.html#proc-sys-net-core-network-core-options for more details). Added this suggestion to the documentation. It seems better not to make this possibly-too-intrusive change to the user's machine automatically in cmake. Leaving it as a manual step is probably fine because the issue is not widespread (not reproduced on public Linux distributions).

Tested PT tracing related tests locally on a machine that supports Intel-PT:

```
$ ctest -R 'drpttracer|drcacheoff.kernel'
...
Start 213: code_api|client.drpttracer_SUDO-test
[sudo] password for sharmaabhinav:
1/5 Test #213: code_api|client.drpttracer_SUDO-test ..................... Passed 4.29 sec
Start 412: code_api|tool.drcacheoff.kernel.simple_SUDO
2/5 Test #412: code_api|tool.drcacheoff.kernel.simple_SUDO .............. Passed 4.66 sec
Start 413: code_api|tool.drcacheoff.kernel.opcode-mix_SUDO
3/5 Test #413: code_api|tool.drcacheoff.kernel.opcode-mix_SUDO .......... Passed 4.71 sec
Start 414: code_api|tool.drcacheoff.kernel.syscall-mix_SUDO
4/5 Test #414: code_api|tool.drcacheoff.kernel.syscall-mix_SUDO ......... Passed 4.59 sec
Start 415: code_api|tool.drcacheoff.kernel.invariant-checker_SUDO
5/5 Test #415: code_api|tool.drcacheoff.kernel.invariant-checker_SUDO ... Passed 5.75 sec
100% tests passed, 0 tests failed out of 5
```

Unfortunately the decode errors do not go away completely even after this fix, but they have become much less frequent: tool.kernel.simple in a release build failed only after 40 successful runs with this fix, whereas it failed on every run before.

Issue: #6486
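As a rough illustration of the idea (not the actual DynamoRIO implementation, which lives in the project's C++ kcore copy code), the sketch below scans `/proc/kallsyms`-style text for BPF symbols and turns them into address regions to copy. Two things here are assumptions on my part: that BPF JIT symbols can be recognised by a `[bpf]` module tag in the fourth column, and that covering the page containing each symbol is enough, since kallsyms lists no symbol sizes. The function name and the 4 KiB page size are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def bpf_symbol_regions(kallsyms_text, page_size=PAGE_SIZE):
    """Return merged (start, end) page ranges covering BPF symbols.

    Each kallsyms line is expected to look like:
        <hex address> <type> <name> [<module>]
    """
    pages = set()
    for line in kallsyms_text.splitlines():
        fields = line.split()
        # Assumption: BPF JIT symbols carry the "[bpf]" module tag.
        if len(fields) < 4 or fields[3] != "[bpf]":
            continue
        addr = int(fields[0], 16)
        if addr == 0:
            # Addresses read as zero when hidden from unprivileged readers.
            continue
        pages.add(addr - addr % page_size)

    # Merge pages that touch so each region is one contiguous copy.
    regions = []
    for start in sorted(pages):
        if regions and regions[-1][1] == start:
            regions[-1] = (regions[-1][0], start + page_size)
        else:
            regions.append((start, start + page_size))
    return regions
```

A page-granular region can miss part of a JIT image that spans more pages than its symbols touch, which is one possible reason a heuristic like this would reduce decode errors without eliminating them.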