-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change the PID filter implementation to use bloom filter #851
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #851 +/- ##
==========================================
- Coverage 79.63% 76.16% -3.48%
==========================================
Files 117 117
Lines 8050 8044 -6
==========================================
- Hits 6411 6127 -284
- Misses 1231 1496 +265
- Partials 408 421 +13
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing
@@ -6,15 +6,16 @@ | |||
#include "bpf_core_read.h" | |||
#include "pid_types.h" | |||
|
|||
#define MAX_CONCURRENT_PIDS 3000 // estimate: 1000 concurrent processes (including children) * 3 namespaces per pid | |||
#define MAX_CONCURRENT_PIDS 3001 // estimate: 1000 concurrent processes (including children) * 3 namespaces per pid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In another PR, I'd make this value configurable by the user.
Here is an example about how we do this for the Network flows:
Lines 39 to 45 in a22db74
// Key: the flow identifier. Value: the flow metrics for that identifier. | |
// The userspace will aggregate them into a single flow. | |
struct { | |
__uint(type, BPF_MAP_TYPE_LRU_PERCPU_HASH); | |
__type(key, flow_id); | |
__type(value, flow_metrics); | |
} aggregated_flows SEC(".maps"); |
beyla/pkg/internal/netolly/ebpf/tracer.go
Lines 85 to 86 in a22db74
// Resize aggregated flows map according to user-provided configuration | |
spec.Maps[aggregatedFlowsMap].MaxEntries = uint32(cacheMaxSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool!
Our existing PID filtering for kprobes used an LRU hashmap, which can easily overflow with large number of PIDs. This PR changes the underlying implementation to use a bloom filter, which will favour false positives over negative collisions in the map.
Essentially, we set a bit on a bit array for a given namespace,pid tuple. If this PID matches, we create an event. The userspace code in Go will consult the internal map if this event is possibly a false positive and filter it out.
Adding and removing processes to be tracked is more expensive now, since we rebuild the PID array from scratch each time and we update the BPF array.