Add ability to read trace id from goroutine labels of Go processes #2574
Conversation
LGTM!
bpf/unwinders/go_traceid.h
Outdated
typedef __s32 s32;
typedef __u32 u32;
typedef __s64 s64;
typedef __u64 u64;
Why do we repeat all these things defined in basic_types.h?
Whoops, yes! A leftover from my standalone development.
}
u64 bucket_count = 1 << log_2_bucket_count;
void *label_buckets;
res = bpf_probe_read(&label_buckets, sizeof(label_buckets), labels_map_ptr + 16);
Should we declare a map struct and collapse these 3 reads into one? It would probably turn an 8-byte read, followed by a 1-byte read, followed by an 8-byte read into a single memcpy, I'm guessing. Probably a wash performance-wise, but maybe smaller code? Feel free to file under "things for Tom to investigate".
It might be possible to optimize, but the BPF stack is only 512 bytes, which already forced me to move one of these structures into a per-CPU BPF map instead.
bpf/unwinders/native.bpf.c
Outdated
unwind_state->stack_key.trace_id[12] = 0;
unwind_state->stack_key.trace_id[13] = 0;
unwind_state->stack_key.trace_id[14] = 0;
unwind_state->stack_key.trace_id[15] = 0;
I would think a bzero or memset call would be more efficient here, but I haven't investigated it.
Why?
It's already possible to attach trace IDs to profiling data with instrumented profilers, but it would be great if we could also create the connection between distributed tracing and profiling data collected via eBPF. This is exactly that.
With the trace ID added to profiled stacks, it's possible to view all profiling data of a request in one flamegraph/icicle graph, which makes it much easier to identify a request's bottlenecks than viewing them per process.
What?
Attempt to read the otel.traceid goroutine label from Go processes, plus a flag to enable it. By default this is disabled, as it significantly increases the amount of data produced.
How?
Go processes store the current goroutine in thread-local storage. From there this reads the g (aka goroutine) struct, then the m (the actual operating system thread) of that goroutine, and finally curg (current goroutine). This chain is necessary because getg().m.curg points to the current user g assigned to the thread (curg == getg() when not on the system stack). curg may be nil if there is no user g, such as when running in the scheduler.
Test Plan
Tested with a test binary.