NETOBSERV-559: use LookupAndDelete to read maps #283
Conversation
New image: It will expire after two weeks. To deploy this build, run from the operator repo, assuming the operator is running: USER=netobserv VERSION=c5b2f78 make set-agent-image
My perf numbers look insanely awesome; I need to check with NDH.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #283      +/-   ##
==========================================
- Coverage   36.74%   36.32%   -0.42%
==========================================
  Files          42       43       +1
  Lines        3813     3879      +66
==========================================
+ Hits         1401     1409       +8
- Misses       2334     2391      +57
- Partials       78       79       +1

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
@msherif1234 we really really really need to look at this. The perfs are insane, or I should say, our current version is really wrong, much more than I thought, with LookupAndDelete.
@jotak: This pull request references NETOBSERV-559 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Another set of tests still shows much improved performance: https://docs.google.com/spreadsheets/d/1qakBaK1dk_rERO30k1cSR4W-Nn0SXW4A3lqQ1sZC4rE/edit#gid=807334756
pkg/ebpf/tracer.go
Outdated
		ids = append(ids, id)
	}

	// Run the atomic Lookup+Delete; if new ids have been inserted in the meantime, they'll be fetched next time
	for _, id = range ids {
		if err := packetMap.LookupAndDelete(&id, &packet); err != nil {
			log.WithError(err).WithField("packetID", id).Warnf("couldn't delete entry")
Why not use BatchLookupAndDelete directly here?
I guess because of:
cilium/ebpf#1078
cilium/ebpf#1080
Yes, there's a follow-up for the batches: https://issues.redhat.com/browse/NETOBSERV-1550
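For reference, a rough sketch of what that batched follow-up could look like once the linked cilium/ebpf issues are resolved. This is not code from this PR: it assumes the cursor-based batch API of recent cilium/ebpf releases, and the key/value types and chunk size are placeholders.

```go
package tracer

import (
	"errors"

	"github.com/cilium/ebpf"
)

// Placeholder types standing in for the generated flow key/metrics structs.
type flowId struct{ SrcPort, DstPort uint16 }
type flowMetrics struct{ Packets, Bytes uint64 }

// drainBatched drains a hash map in chunks using BatchLookupAndDelete.
// Sketch only: assumes the cursor-based batch API from recent cilium/ebpf
// versions and a kernel that supports batched map operations.
func drainBatched(flowMap *ebpf.Map) (map[flowId]flowMetrics, error) {
	flows := map[flowId]flowMetrics{}
	keys := make([]flowId, 128) // arbitrary chunk size
	values := make([]flowMetrics, 128)
	var cursor ebpf.MapBatchCursor
	for {
		n, err := flowMap.BatchLookupAndDelete(&cursor, keys, values, nil)
		for i := 0; i < n; i++ {
			flows[keys[i]] = values[i]
		}
		if errors.Is(err, ebpf.ErrKeyNotExist) {
			return flows, nil // cursor reached the end of the map
		}
		if err != nil {
			return flows, err
		}
	}
}
```

Note that the snippets in this PR append `metrics...` as a slice, which suggests a per-CPU map; batch reads of per-CPU maps need the values laid out per CPU, which this sketch glosses over.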
	// Run the atomic Lookup+Delete; if new ids have been inserted in the meantime, they'll be fetched next time
	for _, id = range ids {
		if err := flowMap.LookupAndDelete(&id, &metrics); err != nil {
			log.WithError(err).WithField("flowId", id).Warnf("couldn't delete flow entry")
So if you do it like this, is it bad?
for iterator.Next(&id, &metrics) {
	if err := flowMap.LookupAndDelete(&id, &metrics); err != nil {
		log.WithError(err).WithField("flowId", id).Warnf("couldn't delete flow entry")
	}
	flows[id] = append(flows[id], metrics...)
}
I haven't tried that one in particular, but I think the same issue will come up, as there's still a delete within the iteration. In the previous code, doing the Delete within the iteration resulted in screwed-up keys, ending up iterating over the full 100K map entries.
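To make the difference concrete, here is a minimal, self-contained sketch of the two-phase pattern used above: iterate once only to collect the keys, then call LookupAndDelete per key, so nothing is deleted while the iterator is live. Types here are simplified placeholders, not the agent's generated structs from pkg/ebpf/tracer.go.

```go
package tracer

import (
	"log"

	"github.com/cilium/ebpf"
)

// Placeholder types standing in for the generated flow key/metrics structs.
type flowId struct{ SrcPort, DstPort uint16 }
type flowMetrics struct{ Packets, Bytes uint64 }

// drainFlows reads and empties a hash map in two phases, so that no delete
// happens while the map iterator is still walking the buckets.
func drainFlows(flowMap *ebpf.Map) map[flowId]flowMetrics {
	flows := map[flowId]flowMetrics{}

	var (
		id      flowId
		metrics flowMetrics
		ids     []flowId
	)

	// Phase 1: iterate only to collect the keys currently present.
	iterator := flowMap.Iterate()
	for iterator.Next(&id, &metrics) {
		ids = append(ids, id)
	}

	// Phase 2: atomic Lookup+Delete per key; keys inserted in the meantime
	// are simply picked up on the next drain cycle.
	for _, id := range ids {
		if err := flowMap.LookupAndDelete(&id, &metrics); err != nil {
			log.Printf("couldn't delete flow entry %v: %v", id, err)
			continue
		}
		flows[id] = metrics
	}
	return flows
}
```

Deleting inside the iteration, by contrast, mutates the hash buckets the iterator is walking, which is what produced the duplicated or skipped keys described above.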
Thanks @jotak, the changes look good to me. Please run with a large file with a sampling of
/LGTM
@msherif1234 I addressed the feedback:
Also, I wanted to make sure that there were no duplicate ids anymore in the map when iterating with LookupAndDelete; this is confirmed with the new metrics, where
/lgtm
/approve
/ok-to-test
New image: It will expire after two weeks. To deploy this build, run from the operator repo, assuming the operator is running: USER=netobserv VERSION=94cfb8c make set-agent-image
Keep legacy code for old kernels. Do not base support detection on the kernel version; instead, just try the new call and fall back to legacy when relevant.
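A minimal sketch of what such runtime detection could look like, probing a throwaway map instead of parsing the kernel version; the actual detection logic in this PR may be structured differently.

```go
package tracer

import (
	"errors"

	"github.com/cilium/ebpf"
)

// lookupAndDeleteSupported probes whether the running kernel accepts the
// LookupAndDelete operation on a hash map by simply trying it once on a tiny
// throwaway map. Sketch only: the PR's real detection may differ.
func lookupAndDeleteSupported() bool {
	probe, err := ebpf.NewMap(&ebpf.MapSpec{
		Type:       ebpf.Hash,
		KeySize:    4,
		ValueSize:  8,
		MaxEntries: 1,
	})
	if err != nil {
		return false
	}
	defer probe.Close()

	var key uint32
	var value uint64
	err = probe.LookupAndDelete(&key, &value)
	// On supporting kernels the call is valid and merely reports the missing
	// key; on older kernels it fails with a "not supported" style error, in
	// which case the caller falls back to the legacy iterate-then-delete path.
	return err == nil || errors.Is(err, ebpf.ErrKeyNotExist)
}
```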
New changes are detected. LGTM label has been removed.
(rebased)
New image: It will expire after two weeks. To deploy this build, run from the operator repo, assuming the operator is running: USER=netobserv VERSION=a0238fe make set-agent-image
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jotak. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Note: this was merged despite regression test failures, without QE approval.
Description
Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.