Fix unwinder loading on 6.8 kernels #2667
LGTM 🚀
Wow, interesting.
Happy to merge when CI is green!
Is the failure of the native unwinder integration tests on arm64 expected? That should have worked, but the rest could fail.
No, it's not expected. I need to look into that, thanks.
See PR parca-dev#2667 for details about why this is needed, as well as the mailing list thread https://lore.kernel.org/bpf/[email protected]/ . The approach tried in that PR caused a regression on older kernels; hopefully, this one will work while also being less confusing.
See comments, and mailing list thread linked therein, for details. Before this change, recent kernels (6.8 and later) cannot verify the native unwinder: improvements in the precision of register bounds tracking cause an explosion in state space, which exceeds the limit of 1 million instructions processed per load. This commit works around that issue, as the comment in the code explains.
This lets us inline the function without letting clang/llvm optimize out the double xor.
OK, this finally works. @kakkoyun and @brancz, can you please have another look? We can't use function calls; everything has to be inlined in order to work on old kernels. But if we let clang/llvm inline the function, it will optimize the double xor away... I solved this problem by using inline volatile asm, which the compiler won't optimize. It's an ugly solution, but it seems to work.
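For anyone reading along who hasn't seen this kind of trick, here is a rough host-side sketch of the general idea, not the code from this PR. The PR writes the xor itself in inline volatile asm for the BPF target; to stay compilable on any architecture, this sketch instead does the xors in C and puts an empty volatile asm statement between them as a compiler barrier. The function name `opaque_double_xor` and the key parameter are made up for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the PR): xor a value with a key twice.
 * Mathematically this is a no-op, so an optimizer would normally fold
 * it away once the function is inlined.
 */
static inline __attribute__((always_inline))
uint64_t opaque_double_xor(uint64_t value, uint64_t key)
{
    value ^= key;

    /*
     * Empty volatile asm with a read-write register operand. It emits
     * no instructions, but the compiler must assume `value` may have
     * been modified here, so it cannot cancel the two xors.
     */
    __asm__ volatile("" : "+r"(value));

    value ^= key;
    return value;
}
```

The result is always equal to the input, e.g. `opaque_double_xor(42, 0xdeadbeef)` gives back 42, while both xor operations survive in the generated code even though the function is forced inline. This relies on GCC/Clang extended asm syntax.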