-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use struct with pid and Go routine addr for Go BPF maps #1182
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1182 +/- ##
===========================================
+ Coverage 65.50% 81.05% +15.54%
===========================================
Files 135 136 +1
Lines 11492 11501 +9
===========================================
+ Hits 7528 9322 +1794
+ Misses 3295 1646 -1649
+ Partials 669 533 -136
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
697984a
to
5237475
Compare
I'm missing something. Shouldn't be each goroutine address unique in the system? I would have expected that the goroutine pointer contains a unique virtual memory address |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great Marc! There are few other maps we need to fix, but please let me know if you don't want to use this PR and you want to follow-up with another instead and I'll flip this to Approve.
@@ -32,60 +32,80 @@ char __license[] SEC("license") = "Dual MIT/GPL"; | |||
// Then it is retrieved in the return uprobes and used to know the HTTP call duration as well as its | |||
// attributes (method, path, and status code). | |||
|
|||
typedef struct goroutine_key { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to rename this data structure to something like go_addr_key. Because we'll be using it for more than goroutines. I'll show some examples below. Essentially, we need to replace all maps in the Go support where we use a Go address (or value) straight, to be keyed by the PID too.
@@ -56,21 +56,21 @@ struct { | |||
|
|||
struct { | |||
__uint(type, BPF_MAP_TYPE_LRU_HASH); | |||
__type(key, void *); // key: goroutine | |||
__type(value, void *); // the transport * | |||
__type(key, goroutine_key_t); // key: goroutine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to do the same for ongoing_grpc_transports
and ongoing_streams
.
__type(value, grpc_client_func_invocation_t); | ||
__uint(max_entries, MAX_CONCURRENT_REQUESTS); | ||
} ongoing_grpc_client_requests SEC(".maps"); | ||
|
||
struct { | ||
__uint(type, BPF_MAP_TYPE_LRU_HASH); | ||
__type(key, void *); // key: pointer to the request goroutine | ||
__type(key, goroutine_key_t); // key: pointer to the request goroutine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ongoing_grpc_header_writes
needs to be fixed up too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for grpc_framer_invocation_map
too.
@@ -36,8 +36,8 @@ struct { | |||
|
|||
struct { | |||
__uint(type, BPF_MAP_TYPE_LRU_HASH); | |||
__type(key, void *); // goroutine | |||
__type(value, topic_t); // topic info | |||
__type(key, goroutine_key_t); // goroutine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to wrap the key for produce_traceparents
and ongoing_produce_messages
.
bpf/go_redis.h
Outdated
__type(key, void *); // key: goroutine id | ||
__type(value, void *); // the *Conn | ||
__type(key, goroutine_key_t); // key: goroutine id | ||
__type(value, void *); // the *Conn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to do the same for http2_req_map
and framer_invocation_map
.
@@ -25,8 +25,8 @@ struct { | |||
|
|||
struct { | |||
__uint(type, BPF_MAP_TYPE_LRU_HASH); | |||
__type(key, void *); // key: goroutine id | |||
__type(value, u32); // correlation id | |||
__type(key, goroutine_key_t); // key: goroutine id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to do the same to kafka_requests
.
I see, thanks for the feedback. Yeah, I think it'd be better in a different PR |
You are right, but Go can grown and shrink the allocated memory, so theoretically, can't we have stale data from arena of the virtual address space that belonged to process A and then that same memory now is part of process B? Like one process reduces the available heap and frees that page, then another ends up taking it. Since we have a single tracer now, we can potentially access stale data, since a lot of our maps are LRU. I mean it's very unlikely, but it might happen. Same goes for process ending, and new one reclaiming the memory or part of it, e.g restart. We must do it for the streamID keyed variables, this is a bug anyway even today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Please don't merge this yet, it seems to be tracking/comitting binary .o files. |
2f76f1c
to
627cff4
Compare
After merging #1158, we change behaviour for Go tracer to be shared among different processes.
Currently we are using the goroutine address as key to store data in BPF maps.
This PR changes the key to use the goroutine address and PID of the process in order to avoid
colissions when storing data to maps.
Fixes #1159