[NPM] Add Enhanced TLS Tags #31464

Merged
merged 60 commits into from
Dec 23, 2024

@akarpz akarpz commented Nov 26, 2024

What does this PR do?

Adds eBPF-side logic that parses TLS hello packet payloads and extracts the:

  • client supported versions
  • server chosen version
  • server chosen cipher suite
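
As a rough illustration of the three fields listed above, here is a minimal userspace sketch in Python. It is not the eBPF code from this PR. It assumes the standard TLS wire layout (a 5-byte record header, a 4-byte handshake header, and the `supported_versions` extension, type 43), and every function name in it is hypothetical. Falling back to the legacy version field when the extension is absent is also an assumption.

```python
import struct

HANDSHAKE = 0x16                  # TLS record content type for handshake messages
CLIENT_HELLO, SERVER_HELLO = 1, 2
EXT_SUPPORTED_VERSIONS = 0x002B   # extension type 43


def _extensions(buf, off):
    """Yield (type, data) pairs from the extensions block whose 2-byte length is at off."""
    if off + 2 > len(buf):
        return
    (total,) = struct.unpack_from("!H", buf, off)
    off += 2
    end = min(off + total, len(buf))
    while off + 4 <= end:
        ext_type, ext_len = struct.unpack_from("!HH", buf, off)
        off += 4
        yield ext_type, buf[off:off + ext_len]
        off += ext_len


def parse_hello(record):
    """Return the TLS tags found in one hello record, or None if it is not a hello."""
    if len(record) < 44 or record[0] != HANDSHAKE:
        return None
    msg_type = record[5]
    off = 9                                   # record header (5) + handshake header (4)
    (legacy_version,) = struct.unpack_from("!H", record, off)
    off += 2 + 32                             # skip legacy_version and random
    off += 1 + record[off]                    # skip session_id
    if msg_type == CLIENT_HELLO:
        if off + 2 > len(record):
            return None
        (suites_len,) = struct.unpack_from("!H", record, off)
        off += 2 + suites_len                 # skip the offered cipher suites
        if off + 1 > len(record):
            return None
        off += 1 + record[off]                # skip compression methods
        versions = []
        for ext_type, data in _extensions(record, off):
            if ext_type == EXT_SUPPORTED_VERSIONS and data:
                body = data[1:1 + data[0]]    # 1-byte list length, then 2-byte versions
                versions = [struct.unpack_from("!H", body, i)[0]
                            for i in range(0, len(body) - 1, 2)]
        return {"client_versions": versions or [legacy_version]}
    if msg_type == SERVER_HELLO:
        if off + 3 > len(record):
            return None
        (cipher,) = struct.unpack_from("!H", record, off)
        off += 3                              # cipher suite (2) + compression method (1)
        version = legacy_version
        for ext_type, data in _extensions(record, off):
            if ext_type == EXT_SUPPORTED_VERSIONS and len(data) >= 2:
                (version,) = struct.unpack_from("!H", data, 0)
        return {"server_version": version, "cipher_suite": cipher}
    return None


def build_record(msg_type, body):
    """Wrap a hello body in handshake and record headers (helper for experimenting)."""
    handshake = bytes([msg_type]) + len(body).to_bytes(3, "big") + body
    return bytes([HANDSHAKE, 0x03, 0x03]) + len(handshake).to_bytes(2, "big") + handshake
```

`build_record` exists only to make it easy to feed hand-built hellos to `parse_hello`. An in-kernel parser additionally has to satisfy the verifier's bounds checks and instruction limits, which this sketch does not model.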

To do this, the existing protocol classification tail call routing had to be modified to:

  • add two new tail calls
  • reorder the layers (API, Application, Encryption) so that Encryption comes first. This lets the program exit early, before reaching the other layers, when encryption is classified.
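
The effect of the reordering can be sketched in plain Python. This is only a model of the control flow, not the eBPF tail call machinery. The classifier functions, the relative order of the Application and API layers, and the byte patterns used to detect each protocol are all assumptions made for illustration.

```python
from enum import Enum


class Layer(Enum):
    ENCRYPTION = "encryption"
    APPLICATION = "application"
    API = "api"


# Encryption is checked first; the order of the remaining two layers is assumed.
LAYER_ORDER = [Layer.ENCRYPTION, Layer.APPLICATION, Layer.API]


def classify_encryption(payload):
    """Hypothetical check: a TLS record starts with a content type and major version 3."""
    if len(payload) >= 3 and payload[0] in (0x14, 0x15, 0x16, 0x17) and payload[1] == 0x03:
        return "tls"
    return None


def classify_application(payload):
    """Hypothetical check: recognise a few plaintext HTTP prefixes."""
    if payload.startswith((b"GET ", b"POST ", b"HTTP/")):
        return "http"
    return None


def classify_api(payload):
    """Placeholder: this sketch does not model any API-layer protocol."""
    return None


CLASSIFIERS = {
    Layer.ENCRYPTION: classify_encryption,
    Layer.APPLICATION: classify_application,
    Layer.API: classify_api,
}


def classify(payload):
    """Walk the layers in order and stop as soon as encryption is detected."""
    result, visited = {}, []
    for layer in LAYER_ORDER:
        visited.append(layer)
        protocol = CLASSIFIERS[layer](payload)
        if protocol is not None:
            result[layer] = protocol
            if layer is Layer.ENCRYPTION:
                break  # early exit: the remaining layers are skipped
    return result, visited
```

For a TLS payload, `visited` contains only the Encryption layer, whereas a plaintext HTTP payload walks all three layers.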

A few other reviewer notes:

  • We emit the cipher suite IDs to the backend without hydrating them to full strings; the backend will do this, which keeps agent memory usage low. It is also easier for the backend to add new mappings and to mark certain ciphers as "weak" so users can filter on those connections. Ticket for that work: https://datadoghq.atlassian.net/browse/NPM-3750
  • There is no Windows-side implementation right now, but it should be added shortly and is much lower complexity than this change.
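
To make the hydration step concrete, the following is a small sketch of what a backend-side lookup could look like. It is not the backend's actual implementation. The cipher suite names are taken from the IANA TLS Cipher Suites registry, while the function name, the contents of the "weak" set, and the output shape are assumptions.

```python
# A few entries from the IANA TLS Cipher Suites registry; a real table would be larger.
CIPHER_NAMES = {
    0x0005: "TLS_RSA_WITH_RC4_128_SHA",
    0x000A: "TLS_RSA_WITH_3DES_EDE_CBC_SHA",
    0x1301: "TLS_AES_128_GCM_SHA256",
    0x1302: "TLS_AES_256_GCM_SHA384",
    0xC02F: "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
}

# Assumption: which suites count as "weak" is a policy decision made by the backend.
WEAK_CIPHERS = {0x0005, 0x000A}


def hydrate_cipher(cipher_id):
    """Turn a 16-bit cipher suite ID into a name plus a weak flag (hypothetical helper)."""
    name = CIPHER_NAMES.get(cipher_id, f"UNKNOWN_0x{cipher_id:04X}")
    return {"id": cipher_id, "name": name, "weak": cipher_id in WEAK_CIPHERS}
```

Because the agent sends only the 16-bit ID, adding a mapping or reclassifying a suite as weak becomes a change to these tables rather than an agent release.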

eBPF Complexity Changes

Details are below, but on average no program's complexity rose by more than 3.5%. In the worst case, on amazonlinux 5.4, one program's complexity rose by ~9%.

Motivation

https://datadoghq.atlassian.net/browse/NPM-3678

Describe how to test/QA your changes

Possible Drawbacks / Trade-offs

Additional Notes

Load Test Results

Available Here
CPU and memory changes are very small, within the margin of error we are accustomed to with the TCP load test. Most importantly, the new map tls_enhanced_tags stays small even after many hours.

Staging dogfooding (oddish-c)

Test was run over a day, link here

NPM protocol tags for this cluster during the same period indicate no loss in resolution.

@github-actions github-actions bot added the component/system-probe, team/networks, and long review labels Nov 26, 2024

agent-platform-auto-pr bot commented Dec 20, 2024

Uncompressed package size comparison

Comparison with ancestor b97c90616b68239053e33f46f4db6900f2c59f4a

Diff per package
| package | diff | status | size | ancestor | threshold |
|---|---|---|---|---|---|
| datadog-dogstatsd-x86_64-rpm | 2.42MB | ⚠️ | 78.64MB | 76.22MB | 10.00MB |
| datadog-dogstatsd-x86_64-suse | 2.42MB | ⚠️ | 78.64MB | 76.22MB | 10.00MB |
| datadog-dogstatsd-amd64-deb | 2.42MB | ⚠️ | 78.57MB | 76.15MB | 10.00MB |
| datadog-dogstatsd-arm64-deb | 1.92MB | ⚠️ | 55.77MB | 53.85MB | 10.00MB |
| datadog-iot-agent-x86_64-rpm | 0.43MB | ⚠️ | 113.40MB | 112.97MB | 10.00MB |
| datadog-iot-agent-x86_64-suse | 0.43MB | ⚠️ | 113.40MB | 112.97MB | 10.00MB |
| datadog-iot-agent-amd64-deb | 0.43MB | ⚠️ | 113.33MB | 112.90MB | 10.00MB |
| datadog-iot-agent-aarch64-rpm | 0.26MB | ⚠️ | 108.87MB | 108.61MB | 10.00MB |
| datadog-iot-agent-arm64-deb | 0.26MB | ⚠️ | 108.80MB | 108.54MB | 10.00MB |
| datadog-heroku-agent-amd64-deb | -1.68MB | | 504.88MB | 506.56MB | 70.00MB |
| datadog-agent-aarch64-rpm | -62.69MB | | 944.14MB | 1006.83MB | 140.00MB |
| datadog-agent-arm64-deb | -62.76MB | | 934.90MB | 997.66MB | 140.00MB |
| datadog-agent-x86_64-rpm | -84.39MB | | 1198.20MB | 1282.59MB | 140.00MB |
| datadog-agent-x86_64-suse | -84.39MB | | 1198.20MB | 1282.59MB | 140.00MB |
| datadog-agent-amd64-deb | -84.46MB | | 1188.94MB | 1273.40MB | 140.00MB |

Decision

⚠️ Warning

@akarpz akarpz requested a review from guyarb December 22, 2024 22:35

akarpz commented Dec 23, 2024

/merge

dd-devflow bot commented Dec 23, 2024

Devflow running: /merge

View all feedbacks in Devflow UI.


2024-12-23 20:16:58 UTC ℹ️ MergeQueue: pull request added to the queue

The median merge time in main is 35m.


2024-12-23 20:53:08 UTC ℹ️ MergeQueue: This merge request was merged

@dd-mergequeue dd-mergequeue bot merged commit d3df21f into main Dec 23, 2024
301 checks passed
@dd-mergequeue dd-mergequeue bot deleted the akarpowich/tls_tail_calls branch December 23, 2024 20:53
@github-actions github-actions bot added this to the 7.62.0 milestone Dec 23, 2024
louis-cqrl pushed a commit that referenced this pull request Dec 25, 2024
Labels: component/system-probe, long review, qa/done, team/networks
4 participants