Various fixes for corrupt gRPC URLs and wrong request directions #930

Merged
grcevski merged 23 commits into grafana:main from corrupt_grpc_bufs
Jun 17, 2024

Conversation

grcevski
Contributor

This PR fixes a few edge cases in how we were handling gRPC requests:

  1. When parsing the HTTP2/gRPC headers, we passed in the full buffer of the eBPF array without considering the actual data length. We could read past the end of the valid data, pick up stale data from the ring buffer, and report wrong URLs.
  2. We relied on the assumption that normal server ports are not in the ephemeral port range. However, some gRPC services use port 50053, which caused us to wrongly report client calls as server calls when the ephemeral port was lower than 50053. I developed two solutions:
    a. For kprobes, we keep connections sorted, but I now track the original destination port when we parse the connection information. Before we send the traces to userspace, we validate whether this original port matches what we expect for client/server, and if not, we swap the connection values.
    b. For Go, since we already know which side is the server and which is the client, we don't sort the connection info up front; we sort it only before we store it in the trace_map (which we share with kprobes for black-box context propagation). We swap the original connection tuples before we send the Go server traces (HTTP, HTTP2 or gRPC) to userspace, because userspace expects the destination port to be the server port.
  3. I added support for tracking multiple segments when reading the msghdr pointer. We assumed that the data is always in the first segment, but I found an Elixir example that had one empty iovec.
  4. I fixed a small Redis issue in the buffer reading, where some of the data was misread when the line started with \n (seen with the HMSET Redis command).
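
To illustrate item 1, here is a minimal sketch of bounding a fixed-size event buffer by its reported length before parsing. It is not the code from this PR: `boundedPayload`, `slot`, and the sample paths are hypothetical names chosen for the example, and the real event layout is not shown here.

```go
package main

import (
	"bytes"
	"fmt"
)

// boundedPayload returns only the bytes reported as valid for this event.
// buf stands in for a fixed-size array copied out of an eBPF event and n for
// the length reported alongside it (both names are assumptions).
func boundedPayload(buf []byte, n int) []byte {
	if n < 0 {
		n = 0
	}
	if n > len(buf) {
		n = len(buf)
	}
	return buf[:n]
}

func main() {
	// A reused 32-byte slot that still holds bytes from an earlier event.
	slot := make([]byte, 32)
	copy(slot, "/old.Service/StaleMethodName")

	// The new event overwrites only the first n bytes of the slot.
	n := copy(slot, "/svc/Ping")

	// Parsing the whole slot mixes the new path with stale trailing bytes.
	fmt.Printf("full slot: %q\n", bytes.TrimRight(slot, "\x00"))

	// Parsing only the first n bytes yields just the new path.
	fmt.Printf("bounded:   %q\n", boundedPayload(slot, n))
}
```

The point of the clamp is that the parser never sees bytes beyond what the current event wrote, regardless of what the slot held before.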

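The direction problem in item 2 can be sketched the same way. This is a simplified model, not the PR's implementation: `connInfo`, `sortByPort`, and `ensureServerIsDst` are hypothetical, and the sort order (lower port treated as the server and placed in the destination) is an assumption made for the example.

```go
package main

import "fmt"

// connInfo is a hypothetical, simplified connection tuple (ports only).
type connInfo struct {
	SrcPort, DstPort uint16
}

// sortByPort models a port-ordering heuristic: the lower port is assumed to
// be the server and is placed in DstPort. The exact ordering is an assumption.
func sortByPort(c connInfo) connInfo {
	if c.SrcPort < c.DstPort {
		c.SrcPort, c.DstPort = c.DstPort, c.SrcPort
	}
	return c
}

// ensureServerIsDst swaps the tuple when the port known to belong to the
// server (for example, the recorded original destination port of a client
// call) ended up in SrcPort.
func ensureServerIsDst(c connInfo, serverPort uint16) connInfo {
	if c.DstPort != serverPort && c.SrcPort == serverPort {
		c.SrcPort, c.DstPort = c.DstPort, c.SrcPort
	}
	return c
}

func main() {
	// A client on ephemeral port 40000 calls a gRPC server on 50053.
	raw := connInfo{SrcPort: 40000, DstPort: 50053}

	sorted := sortByPort(raw)
	fmt.Printf("after sort: %+v\n", sorted) // heuristic puts 40000 in DstPort

	fixed := ensureServerIsDst(sorted, 50053)
	fmt.Printf("after fix:  %+v\n", fixed) // server port is back in DstPort
}
```

When the server port is below the ephemeral port, the sort already gives the right answer and the second step is a no-op, so the check only matters for cases like 50053.
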
@grcevski grcevski requested review from mariomac and marctc as code owners June 13, 2024 21:47
@codecov-commenter

codecov-commenter commented Jun 13, 2024

Codecov Report

Attention: Patch coverage is 78.78788% with 7 lines in your changes missing coverage. Please review.

Project coverage is 79.34%. Comparing base (5c20f6f) to head (e3e5c46).
Report is 1 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| pkg/internal/ebpf/common/tcp_detect_transform.go | 62.50% | 1 Missing and 2 partials ⚠️ |
| pkg/internal/ebpf/bhpack/hpack.go | 33.33% | 1 Missing and 1 partial ⚠️ |
| pkg/internal/ebpf/common/http2grpc_transform.go | 71.42% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #930      +/-   ##
==========================================
+ Coverage   72.28%   79.34%   +7.06%     
==========================================
  Files         130      131       +1     
  Lines       10026    10125      +99     
==========================================
+ Hits         7247     8034     +787     
+ Misses       2166     1582     -584     
+ Partials      613      509     -104     

| Flag | Coverage Δ |
|---|---|
| integration-test | 55.06% <42.42%> (-0.03%) ⬇️ |
| k8s-integration-test | 59.25% <12.12%> (-0.05%) ⬇️ |
| oats-test | 35.53% <54.54%> (-0.21%) ⬇️ |
| unittests | 47.31% <48.48%> (?) |

Contributor

@mariomac mariomac left a comment

Wow! so many changes 🚀🚀🚀

Contributor

@mariomac mariomac left a comment

LGTM

@grcevski grcevski merged commit c04a1e7 into grafana:main Jun 17, 2024
6 checks passed
@grcevski grcevski deleted the corrupt_grpc_bufs branch June 17, 2024 13:31