Allow running without syscall tracepoints #2990

Merged: 2 commits into parca-dev:main on Oct 9, 2024

@umanwizard umanwizard commented Oct 8, 2024

Currently we fail to run if tracepoints can't be set on various sub-events of /sys/kernel/debug/tracing/events/syscalls/. These tracepoints are used to detect various offsets when BTF is not available.

Since some kernel configurations don't have syscall tracepoints, but do have BTF info, things might still work if we just keep running here.

I have confirmed that with this patch, the agent starts up and seems to work normally on a kernel without CONFIG_FTRACE_SYSCALLS, where startup fails without this patch.

Currently we fail to run if tracepoints can't be set on various
sub-events of /sys/kernel/debug/tracing/events/syscalls/. These
tracepoints are only used to detect process exit, so with
--disable-tracepoints we will not detect that as quickly/precisely,
leading to bloat in the BPF maps. Without tracepoints, we will only
detect exited processes every pidCleanupInterval (by default 5
minutes), through a scan of /proc. Thus setting this flag is not
recommended unless necessary.

It might be necessary because certain obscure distributions apparently
run with a configuration that doesn't expose these syscall tracepoint
events.

I have confirmed that with this patch (and setting the flag), the
agent starts up and seems to work normally on a kernel without
CONFIG_FTRACE_SYSCALLS, which fails without this patch.

umanwizard commented Oct 8, 2024

Actually, I don't think we need to stop the sched tracepoint at all, only the syscalls one. And AFAICT in Parca we gain nothing from that (i.e., we weren't even using it, so probing is pointless). So let's do this in a different way.

@umanwizard umanwizard changed the title from "Allow disabling tracepoints" to "Allow running without syscall tracepoints" Oct 8, 2024

1. Get rid of the flag; unconditionally continue attempting to run
   even if ProbeTracepoint fails.

2. Don't gate attaching the scheduling monitor; this is available even
   when syscall probes aren't.
@umanwizard

The OTel agent has exactly the same issue: open-telemetry/opentelemetry-ebpf-profiler#173

We don't necessarily need to do the same thing as them, since this is entirely controlled within our own main function. But let's wait for the outcome of that discussion before merging this, in case I missed something.

@umanwizard

There were no strong objections on the OTel issue, so I think this can be merged.

@umanwizard umanwizard merged commit 952267b into parca-dev:main Oct 9, 2024
7 checks passed