Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Musl support #2605

Merged
merged 7 commits into from
May 31, 2024
Merged

Musl support #2605

merged 7 commits into from
May 31, 2024

Conversation

gnurizen
Copy link
Contributor

@gnurizen gnurizen commented Mar 7, 2024

Add support for pulling unwind info from debug link that is available
locally, if debug link can't be found dwarf unwinding will fail. Future work
may have the agent to pull debuginfo from parca server and allow
dwarf unwinding w/o debug link files available locally.

If the debug link file doesn't match the CRC report an error and don't
use it.

Fix finder so that it doesn't return debuglink matches pointing back at
the exe itself, this just leads to spurious crc failures.

@gnurizen gnurizen requested a review from a team as a code owner March 7, 2024 22:50
@gnurizen gnurizen force-pushed the musl-debug-test branch 2 times, most recently from d2caf14 to 36030d4 Compare March 7, 2024 23:00
@gnurizen gnurizen marked this pull request as draft March 7, 2024 23:00
@gnurizen gnurizen force-pushed the musl-debug-test branch 3 times, most recently from ac62dcd to 355909b Compare March 8, 2024 18:07
@@ -30,7 +30,7 @@ import (
// ObjectFile represents an executable or library file.
// It handles the lifetime of the underlying file descriptor.
type ObjectFile struct {
p *Pool
Pool *Pool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we export this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore, WIP temporary hack.

@@ -200,37 +203,84 @@ func (ptb *UnwindTableBuilder) PrintTable(writer io.Writer, file *objectfile.Obj
}

func ReadFDEs(file *objectfile.ObjectFile) (frame.FrameDescriptionEntries, elf.Machine, error) {
// Try the normal path, if it fails try the alternatives and go with them if they
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It worth checking our debuginfo package. Especially the Find. I believe we can reuse some code pieces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tip! I especially want to nail the CRC checking before landing this.

@gnurizen gnurizen force-pushed the musl-debug-test branch 4 times, most recently from cb1f1bb to 996430a Compare March 13, 2024 06:19
@gnurizen gnurizen changed the title [WIP] Musl support Musl support Mar 29, 2024
@gnurizen gnurizen force-pushed the musl-debug-test branch 5 times, most recently from df30637 to 48c6449 Compare March 30, 2024 11:36
@gnurizen gnurizen force-pushed the musl-debug-test branch 3 times, most recently from 82cccd3 to 2748eb0 Compare April 18, 2024 11:18
@gnurizen gnurizen marked this pull request as ready for review April 18, 2024 11:32
@gnurizen gnurizen force-pushed the musl-debug-test branch 2 times, most recently from fbc4815 to 4b9ea4f Compare April 22, 2024 14:30
@gnurizen gnurizen force-pushed the musl-debug-test branch 4 times, most recently from 159894e to a298bcc Compare May 14, 2024 17:34
@gnurizen
Copy link
Contributor Author

@umanwizard can you review this? Thanks!

@gnurizen gnurizen force-pushed the musl-debug-test branch 5 times, most recently from 9802b36 to bf59f6f Compare May 20, 2024 20:37
@gnurizen gnurizen force-pushed the musl-debug-test branch 3 times, most recently from 7c639ec to 06fbe08 Compare May 23, 2024 20:25
@gnurizen gnurizen force-pushed the musl-debug-test branch 5 times, most recently from 01b606a to 175fe98 Compare May 30, 2024 17:22
@gnurizen gnurizen marked this pull request as draft May 30, 2024 17:35
@gnurizen gnurizen marked this pull request as ready for review May 30, 2024 17:43
Comment on lines 115 to 118
FIND_UNWIND_CHUNK_NOT_FOUND = 5,
FIND_UNWIND_CHUNK_NOT_FOUND_FOR_PC = 6,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you leave a comment explaining what the difference is between these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +694 to +697
LOG("[error] could not find chunk for adjusted ip=0x%llx, mapping idx %d, mapping exe id 0x%llx", adjusted_pc, index,
proc_info->mappings[index].executable_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure this still verifies on x86 5.4. IIRC this was exactly the logging I had to remove in order to get it to work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to work on ubuntu 20.04 LTS (5.14 kernel). Are you sure you weren't just fixing printfs with more than 3 args?

@@ -1402,7 +1411,7 @@ int entrypoint(struct bpf_perf_event_data *ctx) {
if (!is_debug_enabled_for_thread(per_process_id)) {
bump_unwind_total_filter_misses();
BUMP_UNWIND_FAILED_COUNT(per_process_id, missed_filter);
LOG("[debug] pid %u didn't match filter, ignoring.", per_process_id);
// LOG("[debug] pid %u didn't match filter, ignoring.", per_process_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this commented out now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too verbose

Comment on lines 155 to 159
// We use pc=0 as a sentinel so we can't have that, however some debug
// files have FDE's with offset 0, usually empty or tiny but not always.
// There's probably a better way to filter these but this works, we
// panic over in maps.setUnwindTableForMapping if any start at 0.
// An example is the alpine "ld-musl-x86_64.so.1.debug".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you say more about this / give an example / post some relevant output of objdump -WF and nm ? What function in musl does that fde correspond to? Wouldn't that mean the process is expecting to be able to map things a the zero page?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to repro it and couldn't, I wonder if it was from parsing .debug_frame wrong. We'll see what more testing shows.

// and external debuginfo's.
var debugLinkFDEs frame.FrameDescriptionEntries
if uc.finder != nil {
debugPath, err := uc.finder.Find(context.TODO(), root, file)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why context.TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well we should have a legit context here but we don't, this is a bread crumb to remember to wire in a real context.

Comment on lines 274 to 280
if err != nil {
level.Debug(uc.logger).Log("msg", "no .debug file found for ", "exe", exe, "err", err)
} else if debugPath != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we check specifically that this is an expected type of error, so we can avoid swallowing possibly real errors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

secName = ".debug_frame"
if err != nil {
if isUnexpected(err) {
level.Warn(uc.logger).Log("msg", "error reading FDEs from .ebug_frame section in debuglink", "err", err, "exe", exe, "debuglink", debugPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
level.Warn(uc.logger).Log("msg", "error reading FDEs from .ebug_frame section in debuglink", "err", err, "exe", exe, "debuglink", debugPath)
level.Warn(uc.logger).Log("msg", "error reading FDEs from .debug_frame section in debuglink", "err", err, "exe", exe, "debuglink", debugPath)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

@umanwizard
Copy link
Contributor

I still don’t have approval rights, but it looks good to me overall, so merge at your discretion!

Add support for pulling symbols from debug link that is available
locally, if debug link can't be found unwinding may fail. Future work
may have the agent to pull debuginfo from parca server and allow
dwarf unwinding w/o debug link files available locally.

If the debug link file doesn't match the CRC report an error and don't
use it.

Fix finder so that it doesn't return debuglink matches pointing back at
the exe itself, this just leads to spurious crc failures.

Don't use -v so we only get logs on test failures

report a successful sample when we miss on dwarf symbol

Fix arm64/musl gilstate addressing

Skip alpine 3.3 on arm64

better integration test helper
@gnurizen gnurizen enabled auto-merge (squash) May 31, 2024 11:37
@gnurizen gnurizen merged commit 63f7cc9 into parca-dev:main May 31, 2024
11 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants