AWS payload report as tags #4131

Closed
wants to merge 6 commits into from

Conversation

jbertran
Contributor

@jbertran jbertran commented Mar 4, 2024

What does this PR do?

This PR introduces AWS payload reporting as tags.

Configuration

We introduce 3 new environment variables:

  • DD_TRACE_CLOUD_REQUEST_PAYLOAD_TAGGING activates the feature for requests. Its value is either "all" (no additional redaction) or a comma-separated list of JSONPath queries identifying payload paths to be replaced with the value "redacted".
  • DD_TRACE_CLOUD_RESPONSE_PAYLOAD_TAGGING does the same for responses.
  • DD_TRACE_CLOUD_PAYLOAD_TAGGING_MAX_DEPTH sets the depth after which we stop creating tags from a payload
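
As an illustration of how these values could be interpreted, here is a minimal sketch. The helper names (parsePayloadTaggingRules, parseMaxDepth), the shape of the returned object, and the fallback depth of 10 are assumptions made for this example, not the PR's actual code. Splitting on commas is also a simplification: a JSONPath query that itself contains a comma (for example a union) would need smarter parsing.

```javascript
'use strict'

// Hypothetical helper: interprets the raw value of one of the
// *_PAYLOAD_TAGGING environment variables.
function parsePayloadTaggingRules (raw) {
  // Unset or empty: the feature stays off for this direction.
  if (raw === undefined || raw.trim() === '') {
    return { enabled: false, redactionPaths: [] }
  }
  // "all": tag the payload with no additional user-supplied redaction.
  if (raw.trim() === 'all') {
    return { enabled: true, redactionPaths: [] }
  }
  // Otherwise: a comma-separated list of JSONPath queries to redact.
  const redactionPaths = raw
    .split(',')
    .map(query => query.trim())
    .filter(query => query.length > 0)
  return { enabled: true, redactionPaths }
}

// Hypothetical helper: falls back to a default when the value is
// missing or not a positive integer.
function parseMaxDepth (raw, fallback) {
  const depth = Number.parseInt(raw, 10)
  return Number.isInteger(depth) && depth > 0 ? depth : fallback
}

// Example values; in a real process these would come from process.env.
const exampleConfig = {
  request: parsePayloadTaggingRules('$.Message, $.MessageAttributes.*.StringValue'),
  response: parsePayloadTaggingRules('all'),
  maxDepth: parseMaxDepth(undefined, 10) // 10 is an assumed default
}
```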

Behaviour

With the feature activated, aws-sdk calls to the enabled plugins will create additional tags representing the payload, with the following modifications:

  1. Paths known to be PII/sensitive are hard-coded to be redacted (service by service)
  2. Paths known to be user-input data likely to contain JSON are expanded
  3. Paths matching the JSONPath queries passed by the environment variables or corresponding runtime tracer configuration are redacted
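
The three modifications above can be sketched as a single flattening pass. This is a simplified illustration, not the PR's implementation: payloadToTags is a hypothetical name, redaction is matched on plain dot-separated paths rather than real JSONPath queries, and the choice to drop anything nested deeper than maxDepth is an assumption about what "stop creating tags" means. The PhoneNumber redaction is only an example of a sensitive path. The aws.request.body prefix follows the tag name visible in the failing test output further down this conversation.

```javascript
'use strict'

const REDACTED = 'redacted'

// Hypothetical helper: flattens `payload` into `${prefix}.path.to.leaf` tags.
// `redactedPaths` and `expandedPaths` are Sets of dot-separated paths.
function payloadToTags (payload, prefix, { maxDepth, redactedPaths, expandedPaths }) {
  const tags = {}
  const tagName = key => (key === '' ? prefix : `${prefix}.${key}`)

  function visit (value, path, depth) {
    const key = path.join('.')

    // Steps 1 and 3: sensitive or user-configured paths are replaced wholesale.
    if (redactedPaths.has(key)) {
      tags[tagName(key)] = REDACTED
      return
    }

    // Step 2: string fields known to carry JSON are parsed and expanded.
    if (typeof value === 'string' && expandedPaths.has(key)) {
      try {
        value = JSON.parse(value)
      } catch (err) {
        // Not valid JSON after all: keep the raw string.
      }
    }

    if (value !== null && typeof value === 'object') {
      // Assumption: objects at or beyond the depth limit produce no tags.
      if (depth >= maxDepth) return
      for (const [childKey, childValue] of Object.entries(value)) {
        visit(childValue, path.concat(childKey), depth + 1)
      }
      return
    }

    tags[tagName(key)] = String(value)
  }

  visit(payload, [], 0)
  return tags
}

const exampleTags = payloadToTags(
  {
    TopicArn: 'arn:aws:sns:us-east-1:000000000000:example',
    PhoneNumber: '+15550100',
    Message: '{"order":{"id":42}}'
  },
  'aws.request.body',
  {
    maxDepth: 10,
    redactedPaths: new Set(['PhoneNumber']),
    expandedPaths: new Set(['Message'])
  }
)
```

With these inputs, exampleTags holds the topic ARN under aws.request.body.TopicArn, the string "redacted" under aws.request.body.PhoneNumber, and the expanded message under aws.request.body.Message.order.id.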

This PR only provides the feature for SNS as a first service, but the framework introduced here only requires slight adaptations of a given AWS service plugin to make it available, as well as the addition of the static PII fields configuration.

New dependencies

Adding jsonpath seems safe given the constraints it imposes on its scripts, even if I don't expect scripts to be used. Using rfdc is more questionable - we need a deep clone because JSONPath apply can only do side-effects, and we must not modify the payload, but maybe something simpler works.
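
To make the cloning concern concrete, here is a dependency-free sketch. The redactInPlace function is a hypothetical stand-in for any side-effecting redaction step, and cloneJson is an assumed minimal clone that only handles plain objects, arrays, and primitives. It would not preserve Buffers, Dates, or class instances, which is part of why a dedicated library might still be preferable.

```javascript
'use strict'

// Minimal deep clone for plain JSON-like values only.
function cloneJson (value) {
  if (Array.isArray(value)) return value.map(cloneJson)
  if (value !== null && typeof value === 'object') {
    const copy = {}
    for (const [key, child] of Object.entries(value)) {
      copy[key] = cloneJson(child)
    }
    return copy
  }
  return value
}

// Hypothetical side-effecting redaction: mutates the object it receives.
function redactInPlace (obj, key) {
  if (key in obj) obj[key] = 'redacted'
  return obj
}

const original = { TopicArn: 'arn:example', Message: 'hello' }

// Redacting a clone leaves the payload sent to AWS untouched.
const redactedCopy = redactInPlace(cloneJson(original), 'Message')
```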

Remaining work

In some cases, JSONPath filter expressions are not sufficient to do what we want.

For example, setting attributes for entities (like SNS topics) requires setting an AttributeName and an AttributeValue at the top level of the JSON payload. Ideally, we should be able to redact the AttributeValue only when the AttributeName matches a disallowed value (for example KMSMasterKeyId). JSONPath syntax does not allow such a complex query, so we also need to specify custom logic hooks that redact data without going through JSONPath.
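
A custom hook of this kind could look like the following sketch. The function name, the Set of disallowed names, and the choice to return a copy rather than mutate are assumptions for illustration; only AttributeName, AttributeValue, and KMSMasterKeyId come from the description above.

```javascript
'use strict'

// Attribute names whose values must never be reported (illustrative list).
const DISALLOWED_ATTRIBUTE_NAMES = new Set(['KMSMasterKeyId'])

// Hypothetical hook: redacts AttributeValue only when the sibling
// AttributeName is disallowed. Returns a shallow copy so the payload
// passed in is left unchanged.
function redactDisallowedAttribute (payload) {
  if (payload === null || typeof payload !== 'object') return payload
  if (!DISALLOWED_ATTRIBUTE_NAMES.has(payload.AttributeName)) return payload
  return { ...payload, AttributeValue: 'redacted' }
}

const sensitive = redactDisallowedAttribute({
  TopicArn: 'arn:example',
  AttributeName: 'KMSMasterKeyId',
  AttributeValue: 'alias/my-key'
})

const harmless = redactDisallowedAttribute({
  TopicArn: 'arn:example',
  AttributeName: 'DisplayName',
  AttributeValue: 'My topic'
})
```

Here sensitive.AttributeValue becomes "redacted", while harmless.AttributeValue keeps its original value.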

Motivation

This comes from:

  1. the desire to have real-world data correlated with traces
  2. the fact that AWS upstream API is well-defined and well-documented, helping us avoid PII/sensitive data pitfalls
  3. the existence of such a mechanism in datadog-lambda-js, but only scoped to lambda function input and output. This provides the same level of information, with additional redaction granularity, for AWS plugins.

Plugin Checklist

Additional Notes

Security

Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.

Unsure? Have a question? Request a review!


github-actions bot commented Mar 4, 2024

Overall package size

Self size: 6.21 MB
Deduped: 66.13 MB
No deduping: 67.08 MB

Dependency sizes

name version self size total size
@datadog/native-iast-taint-tracking 1.7.0 16.71 MB 16.72 MB
@datadog/native-appsec 7.1.0 14.37 MB 14.38 MB
@datadog/pprof 5.1.0 8.83 MB 9.68 MB
protobufjs 7.2.6 2.77 MB 7.07 MB
jsonpath 1.1.1 403.29 kB 2.8 MB
@opentelemetry/core 1.22.0 876.57 kB 2.53 MB
@datadog/native-iast-rewriter 2.2.3 2.19 MB 2.34 MB
@datadog/native-metrics 2.0.0 898.77 kB 1.3 MB
@opentelemetry/api 1.8.0 1.21 MB 1.21 MB
import-in-the-middle 1.7.3 67.62 kB 768.58 kB
pprof-format 2.0.7 588.12 kB 588.12 kB
msgpack-lite 0.1.26 201.16 kB 272.06 kB
opentracing 0.14.7 194.81 kB 194.81 kB
lru-cache 7.18.3 133.92 kB 133.92 kB
semver 7.6.0 94.24 kB 124.64 kB
@datadog/sketches-js 2.1.0 109.9 kB 109.9 kB
lodash.sortby 4.7.0 75.76 kB 75.76 kB
ipaddr.js 2.1.0 60.23 kB 60.23 kB
ignore 5.3.1 51.46 kB 51.46 kB
int64-buffer 0.1.10 49.18 kB 49.18 kB
shell-quote 1.8.1 44.96 kB 44.96 kB
istanbul-lib-coverage 3.2.0 29.34 kB 29.34 kB
rfdc 1.3.1 25.21 kB 25.21 kB
tlhunter-sorted-set 0.1.0 24.94 kB 24.94 kB
limiter 1.1.5 23.17 kB 23.17 kB
dc-polyfill 0.1.4 23.1 kB 23.1 kB
retry 0.13.1 18.85 kB 18.85 kB
node-abort-controller 3.1.1 16.89 kB 16.89 kB
jest-docblock 29.7.0 8.99 kB 12.76 kB
crypto-randomuuid 1.0.0 11.18 kB 11.18 kB
path-to-regexp 0.1.7 6.78 kB 6.78 kB
koalas 1.0.2 6.47 kB 6.47 kB
methods 1.1.2 5.29 kB 5.29 kB
module-details-from-path 1.0.3 4.47 kB 4.47 kB

🤖 This report was automatically generated by heaviest-objects-in-the-universe


codecov bot commented Mar 4, 2024

Codecov Report

Attention: Patch coverage is 96.42857%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 85.23%. Comparing base (b496eae) to head (2f1d675).
Report is 99 commits behind head on master.

Files Patch % Lines
packages/dd-trace/src/payload-tagging/index.js 86.36% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4131      +/-   ##
==========================================
- Coverage   85.25%   85.23%   -0.02%     
==========================================
  Files         247      250       +3     
  Lines       10848    10962     +114     
  Branches       33       33              
==========================================
+ Hits         9248     9344      +96     
- Misses       1600     1618      +18     


@jbertran jbertran force-pushed the jbertran/aws-payload-tagging branch from ec01bc4 to 41f2cec Compare March 5, 2024 10:10
@jbertran jbertran force-pushed the jbertran/aws-payload-tagging branch 4 times, most recently from 017814d to ea5cec7 Compare March 5, 2024 13:28
@jbertran jbertran force-pushed the jbertran/aws-payload-tagging branch 2 times, most recently from a9bf916 to b9d4462 Compare March 6, 2024 11:07
@jbertran jbertran force-pushed the jbertran/aws-payload-tagging branch from b9d4462 to 2f1d675 Compare March 6, 2024 11:09
@tlhunter tlhunter self-assigned this May 9, 2024
@tlhunter tlhunter mentioned this pull request May 15, 2024
@tlhunter
Member

  386 passing (2m)
  19 pending
  20 failing

  1) Sns
       with aws-sdk >=2.3.0 (2.3.0)
         with payload tagging
           adds request and response payloads as flattened tags:
     AssertionError: expected { …(19) } to have property 'aws.request.body.TopicArn'
      at /home/runner/work/dd-trace-js/dd-trace-js/packages/datadog-plugin-aws-sdk/test/sns.spec.js:120:32
      at handler (packages/dd-trace/test/plugins/agent.js:201:16)
      at /home/runner/work/dd-trace-js/dd-trace-js/packages/dd-trace/test/plugins/agent.js:134:7

@tlhunter
Member

Using rfdc is more questionable - we need a deep clone because JSONPath apply can only do side-effects, and we must not modify the payload, but maybe something simpler works.

Node.js v18.0+ has structuredClone() but v4.x of the tracer still supports v16.0+.
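
One way to bridge that version gap, sketched under assumptions: use structuredClone() when the runtime provides it and fall back to something else otherwise. The fallback shown here is a JSON round trip, chosen only to keep the example free of dependencies. It drops undefined values and does not preserve Buffers or Dates, so it is not a drop-in replacement for rfdc.

```javascript
'use strict'

// Assumed fallback for runtimes without structuredClone().
// Only safe for plain JSON-compatible data.
function jsonRoundTripClone (value) {
  return JSON.parse(JSON.stringify(value))
}

// Pick the native implementation when it exists.
const deepClone = typeof globalThis.structuredClone === 'function'
  ? value => globalThis.structuredClone(value)
  : jsonRoundTripClone

const source = { TopicArn: 'arn:example', Attributes: { Tags: ['a', 'b'] } }
const copy = deepClone(source)
```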

@tlhunter
Member

I'm going to close this PR as work is continuing in #4309.

@tlhunter tlhunter closed this May 31, 2024