Musl support #2605
pkg/objectfile/object_file.go
Outdated
@@ -30,7 +30,7 @@ import (
// ObjectFile represents an executable or library file.
// It handles the lifetime of the underlying file descriptor.
type ObjectFile struct {
- p *Pool
+ Pool *Pool
Why do we export this?
Ignore, WIP temporary hack.
pkg/stack/unwind/unwind_table.go
Outdated
@@ -200,37 +203,84 @@ func (ptb *UnwindTableBuilder) PrintTable(writer io.Writer, file *objectfile.Obj
}

func ReadFDEs(file *objectfile.ObjectFile) (frame.FrameDescriptionEntries, elf.Machine, error) {
// Try the normal path, if it fails try the alternatives and go with them if they
It's worth checking our debuginfo package, especially Find. I believe we can reuse some code pieces.
Thanks for the tip! I especially want to nail the CRC checking before landing this.
@umanwizard can you review this? Thanks!
bpf/unwinders/native.bpf.c
Outdated
FIND_UNWIND_CHUNK_NOT_FOUND = 5,
FIND_UNWIND_CHUNK_NOT_FOUND_FOR_PC = 6,
Can you leave a comment explaining what the difference is between these?
done
LOG("[error] could not find chunk for adjusted ip=0x%llx, mapping idx %d, mapping exe id 0x%llx", adjusted_pc, index,
proc_info->mappings[index].executable_id);
Please make sure this still verifies on x86 5.4. IIRC this was exactly the logging I had to remove in order to get it to work.
Will do
It seems to work on ubuntu 20.04 LTS (5.14 kernel). Are you sure you weren't just fixing printfs with more than 3 args?
@@ -1402,7 +1411,7 @@ int entrypoint(struct bpf_perf_event_data *ctx) {
if (!is_debug_enabled_for_thread(per_process_id)) {
bump_unwind_total_filter_misses();
BUMP_UNWIND_FAILED_COUNT(per_process_id, missed_filter);
- LOG("[debug] pid %u didn't match filter, ignoring.", per_process_id);
+ // LOG("[debug] pid %u didn't match filter, ignoring.", per_process_id);
Why is this commented out now?
Too verbose
// We use pc=0 as a sentinel so we can't have that, however some debug
// files have FDE's with offset 0, usually empty or tiny but not always.
// There's probably a better way to filter these but this works, we
// panic over in maps.setUnwindTableForMapping if any start at 0.
// An example is the alpine "ld-musl-x86_64.so.1.debug".
Can you say more about this, give an example, or post some relevant output of objdump -WF and nm? What function in musl does that FDE correspond to? Wouldn't that mean the process is expecting to be able to map things at the zero page?
I tried to repro it and couldn't; I wonder if it came from parsing .debug_frame incorrectly. We'll see what more testing shows.
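For reference, the filtering described in the code comment above could look roughly like the sketch below. The `fde` type and `dropZeroStart` are hypothetical, simplified stand-ins (the PR works with `frame.FrameDescriptionEntries`), so this only illustrates the idea of skipping entries that begin at pc 0, not the code in this change.

```go
package main

import "fmt"

// fde is a hypothetical, simplified stand-in for a frame description entry.
// The PR itself uses frame.FrameDescriptionEntries.
type fde struct {
	begin uint64 // first pc covered by this entry
	end   uint64 // one past the last pc covered by this entry
}

// dropZeroStart returns only the entries that do not begin at pc 0.
// pc=0 is reserved as a sentinel in the unwind table, so such entries
// cannot be represented and are skipped here.
func dropZeroStart(fdes []fde) []fde {
	kept := make([]fde, 0, len(fdes))
	for _, f := range fdes {
		if f.begin == 0 {
			continue // empty or bogus entry starting at the sentinel pc
		}
		kept = append(kept, f)
	}
	return kept
}

func main() {
	in := []fde{{0, 0}, {0, 16}, {0x1000, 0x1040}}
	fmt.Println(len(dropZeroStart(in)))
}
```

Filtering before the table is built keeps the panic in maps.setUnwindTableForMapping as a backstop rather than the first line of defence.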
// and external debuginfo's.
var debugLinkFDEs frame.FrameDescriptionEntries
if uc.finder != nil {
debugPath, err := uc.finder.Find(context.TODO(), root, file)
why context.TODO
We should have a legit context here but we don't; this is a breadcrumb to remember to wire in a real context.
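A minimal sketch of what wiring in a real context might look like, assuming the caller has a long-lived context to pass down. `rootCtx`, `find`, `findWithTimeout`, and the example path are all hypothetical; `find` only mimics the shape of a context-aware lookup and is not the debuginfo finder's actual behaviour.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// rootCtx is a hypothetical stand-in for the agent's long-lived context.
var rootCtx = context.Background()

// find is a hypothetical, simplified stand-in for a context-aware lookup:
// it gives up as soon as the context is cancelled or past its deadline.
func find(ctx context.Context, path string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", fmt.Errorf("find %s: %w", path, err)
	}
	return path + ".debug", nil
}

// findWithTimeout derives a bounded child context from the caller's context
// instead of using context.TODO(), so cancellation propagates to the lookup.
func findWithTimeout(parent context.Context, path string, timeout time.Duration) (string, error) {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	return find(ctx, path)
}

func main() {
	p, err := findWithTimeout(rootCtx, "/usr/lib/libfoo.so", time.Second)
	fmt.Println(p, err)

	// A zero timeout yields an already-expired context, so the lookup fails.
	_, err = findWithTimeout(rootCtx, "/usr/lib/libfoo.so", 0)
	fmt.Println(err)
}
```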
pkg/stack/unwind/unwind_table.go
Outdated
if err != nil {
level.Debug(uc.logger).Log("msg", "no .debug file found for ", "exe", exe, "err", err)
} else if debugPath != "" {
Can we check specifically that this is an expected type of error, so we can avoid swallowing possibly real errors?
Sure
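One way to separate expected from unexpected errors is to match against known sentinels with errors.Is, as sketched below. The body of `isUnexpected` here is an assumption (the PR does not show it in this thread), and `errSectionNotFound`, `errWrappedMissing`, and `errCorrupt` are hypothetical values used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// errSectionNotFound is a hypothetical sentinel for "the ELF section is absent".
var errSectionNotFound = errors.New("section not found")

// Example errors, hypothetical and used only to exercise isUnexpected.
var (
	errWrappedMissing = fmt.Errorf("find debuglink: %w", fs.ErrNotExist)
	errCorrupt        = errors.New("corrupt ELF header")
)

// isUnexpected reports whether err is something other than the benign
// "nothing to read here" cases, which callers may log at debug level.
func isUnexpected(err error) bool {
	if err == nil {
		return false
	}
	if errors.Is(err, fs.ErrNotExist) || errors.Is(err, errSectionNotFound) {
		return false
	}
	return true
}

func main() {
	fmt.Println(isUnexpected(errWrappedMissing)) // missing file: expected
	fmt.Println(isUnexpected(errCorrupt))        // anything else: surface it
}
```

Because errors.Is unwraps, the check still works when the finder adds context with %w.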
pkg/stack/unwind/unwind_table.go
Outdated
secName = ".debug_frame"
if err != nil {
if isUnexpected(err) {
level.Warn(uc.logger).Log("msg", "error reading FDEs from .ebug_frame section in debuglink", "err", err, "exe", exe, "debuglink", debugPath)
- level.Warn(uc.logger).Log("msg", "error reading FDEs from .ebug_frame section in debuglink", "err", err, "exe", exe, "debuglink", debugPath)
+ level.Warn(uc.logger).Log("msg", "error reading FDEs from .debug_frame section in debuglink", "err", err, "exe", exe, "debuglink", debugPath)
Nice catch!
I still don’t have approval rights, but it looks good to me overall, so merge at your discretion!
Add support for pulling symbols from debug link that is available locally, if debug link can't be found unwinding may fail. Future work may have the agent to pull debuginfo from parca server and allow dwarf unwinding w/o debug link files available locally.
If the debug link file doesn't match the CRC report an error and don't use it.
Fix finder so that it doesn't return debuglink matches pointing back at the exe itself, this just leads to spurious crc failures.
Don't use -v so we only get logs on test failures
report a successful sample when we miss on dwarf symbol
Fix arm64/musl gilstate addressing
Skip alpine 3.3 on arm64
better integration test helper
Add support for pulling unwind info from a debug link file that is available
locally. If the debug link can't be found, DWARF unwinding will fail. Future
work may have the agent pull debuginfo from the Parca server, allowing
DWARF unwinding without debug link files available locally.
If the debug link file doesn't match the CRC, report an error and don't
use it.
Fix the finder so that it doesn't return debuglink matches pointing back at
the exe itself; this just leads to spurious CRC failures.
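The two checks described in this commit message can be sketched as follows. This assumes the .gnu_debuglink checksum is the standard CRC-32 (IEEE polynomial) over the debug file's full contents, which is what Go's hash/crc32 computes; `crcMatches` and `sameFile` are hypothetical helper names, not functions from this PR.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"os"
)

// crcMatches reports whether the CRC-32 (IEEE) of the debug file contents
// equals the checksum recorded in the executable's .gnu_debuglink section.
func crcMatches(contents []byte, want uint32) bool {
	return crc32.ChecksumIEEE(contents) == want
}

// sameFile reports whether two paths refer to the same underlying file,
// which lets a finder reject a debuglink candidate that is the exe itself.
func sameFile(a, b string) (bool, error) {
	ai, err := os.Stat(a)
	if err != nil {
		return false, err
	}
	bi, err := os.Stat(b)
	if err != nil {
		return false, err
	}
	return os.SameFile(ai, bi), nil
}

func main() {
	// 0xCBF43926 is the well-known CRC-32 check value for "123456789".
	fmt.Println(crcMatches([]byte("123456789"), 0xCBF43926))

	self, err := sameFile(".", ".")
	fmt.Println(self, err)
}
```

A real implementation would likely stream the file through crc32.NewIEEE() rather than hold it in memory; the byte slice keeps the sketch short.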