Extract AWS payload tags #7811

Merged
merged 66 commits into from
Nov 4, 2024

@ygree (Contributor) commented Oct 18, 2024

What Does This Do

AWS Payload Tag Extraction

  • Support for AWS SDK Java 2 only
  • Disabled by default
  • Once enabled, extracts tags only for AWS ApiGateway, EventBridge, Sqs, Sns, S3, and Kinesis, unless other services are explicitly enabled
  • Extracts tags for all JSON data it encounters
  • Provides some general and service-specific redaction rules
  • Support for user-defined redaction rules

Adds the ability to capture AWS JSON request/response payloads and convert them to span tags while applying default and user-defined redaction rules. It also tries to expand any embedded JSON-like strings and binary data.
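To illustrate the idea of turning a payload into span tags with redaction, here is a minimal sketch. It is not the code from this PR: the class name `PayloadTagSketch`, the `aws.request.body` prefix, the dot-separated path syntax, and the `redacted` placeholder are all assumptions chosen for illustration, and it works on an already-parsed `Map`/`List` tree rather than on SDK objects.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; names and tag format are assumptions, not the PR's API.
public class PayloadTagSketch {
    // Placeholder written instead of a redacted value (assumed, not taken from the PR).
    static final String REDACTED = "redacted";

    /** Flattens a parsed payload into dot-separated tags, redacting the listed paths. */
    public static Map<String, String> toTags(String prefix, Object payload, Set<String> redactedPaths) {
        Map<String, String> tags = new LinkedHashMap<>();
        walk(prefix, "", payload, redactedPaths, tags);
        return tags;
    }

    private static void walk(String prefix, String path, Object node,
                             Set<String> redactedPaths, Map<String, String> tags) {
        if (redactedPaths.contains(path)) {
            // Redact the whole subtree so nested values never reach a tag.
            tags.put(tagName(prefix, path), REDACTED);
        } else if (node instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                walk(prefix, join(path, String.valueOf(e.getKey())), e.getValue(), redactedPaths, tags);
            }
        } else if (node instanceof List) {
            List<?> items = (List<?>) node;
            for (int i = 0; i < items.size(); i++) {
                walk(prefix, join(path, String.valueOf(i)), items.get(i), redactedPaths, tags);
            }
        } else {
            // Scalar leaf: store its string form as the tag value.
            tags.put(tagName(prefix, path), String.valueOf(node));
        }
    }

    private static String join(String path, String segment) {
        return path.isEmpty() ? segment : path + "." + segment;
    }

    private static String tagName(String prefix, String path) {
        return path.isEmpty() ? prefix : prefix + "." + path;
    }
}
```

With a redaction set containing `Attributes.KmsMasterKeyId`, that field is replaced by the placeholder while sibling fields such as `Attributes.DelaySeconds` are tagged with their values.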

Unlike the original PR, which captured the raw request/response body and tried to parse it as JSON, this PR uses SdkPojo for top-level field traversal, which allows us to:

  • Reduce resource requirements by avoiding response buffering and top-level request/response body parsing altogether
  • Improve coverage of AWS services by supporting protocols other than JSON

Motivation

Gives the ability to see the data passed in an HTTPS payload from one service to another.
Helps customers (especially those using a serverless architecture) reproduce and resolve bugs in their serverless compute code or configuration.

Additional Notes

Supersedes #7312

Jira ticket: AIDM-174

NodeJS: DataDog/dd-trace-js#4309
Python: DataDog/dd-trace-py#10642

Example 1: S3

TODO

Example 2: Sso

(manually enabled with custom redaction rules)
TODO

Contributor Checklist

Jira ticket: [PROJ-IDENT]

Remove all extra dependencies for JsonPath logic.
Avoid Json materialization, traverse using event-based Moshi JsonReader.
…nsumption. Add support for reading array of bytes.
@ygree ygree marked this pull request as ready for review November 1, 2024 16:21
@ygree ygree requested review from a team as code owners November 1, 2024 16:21
@ygree ygree requested a review from dougqh November 1, 2024 16:21
@ygree ygree removed the tag: do not merge Do not merge changes label Nov 1, 2024
@ygree ygree added this to the 1.42.0 milestone Nov 1, 2024
@ygree ygree requested a review from amarziali November 1, 2024 16:52
@ygree ygree enabled auto-merge (squash) November 1, 2024 19:46
@ygree ygree changed the title AWS Payload Tagging AWS Payload Tag Extraction Nov 1, 2024
@ygree ygree disabled auto-merge November 1, 2024 20:16
@ygree ygree enabled auto-merge (squash) November 1, 2024 20:16
@ygree ygree disabled auto-merge November 1, 2024 20:52
@ygree ygree enabled auto-merge (squash) November 1, 2024 20:52
requestSensitivePaths.removeAll(commonSensitivePaths);
responseSensitivePaths.removeAll(commonSensitivePaths);

System.out.println("\nCommon sensitive paths:\n" + String.join("\n", commonSensitivePaths));
@amarziali (Collaborator), Nov 4, 2024

Could these System.out calls be removed or converted to proper logging? Edit: perhaps this is on purpose, since it's in an internal utility.

@ygree (Contributor, Author)

The tool is internal and is for extracting redaction rules from AWS JSON schemas. Kept this mostly for reference and in case we want to automate this in the future.
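For readers unfamiliar with the snippet under review, the set arithmetic it relies on can be sketched as follows. This is an assumed reconstruction, not the tool's actual code: only the variable names `requestSensitivePaths`, `responseSensitivePaths`, and `commonSensitivePaths` come from the diff, while the class name, the `retainAll` intersection step, and the sample paths are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; only the three set names are taken from the reviewed snippet.
public class CommonPathsSketch {
    /**
     * Returns the paths present in both sets and removes them from each input,
     * leaving only request-only and response-only paths behind.
     */
    public static Set<String> extractCommon(Set<String> requestSensitivePaths,
                                            Set<String> responseSensitivePaths) {
        // Assumed step: the common set is the intersection of the two inputs.
        Set<String> commonSensitivePaths = new TreeSet<>(requestSensitivePaths);
        commonSensitivePaths.retainAll(responseSensitivePaths);

        // These two calls mirror the lines shown in the diff.
        requestSensitivePaths.removeAll(commonSensitivePaths);
        responseSensitivePaths.removeAll(commonSensitivePaths);
        return commonSensitivePaths;
    }

    public static void main(String[] args) {
        // Made-up sample paths, for illustration only.
        Set<String> request = new TreeSet<>(Arrays.asList("$..Password", "$..Token", "$..RequestOnly"));
        Set<String> response = new TreeSet<>(Arrays.asList("$..Token", "$..ResponseOnly", "$..Password"));
        Set<String> common = extractCommon(request, response);
        System.out.println("\nCommon sensitive paths:\n" + String.join("\n", common));
    }
}
```

Splitting the rules this way means a path shared by requests and responses is listed once rather than duplicated in both rule sets.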

@ygree ygree merged commit 90483bb into master Nov 4, 2024
101 checks passed
@ygree ygree deleted the ygree/aws-payload-tagging-2 branch November 4, 2024 07:47
@PerfectSlayer PerfectSlayer changed the title AWS Payload Tag Extraction Extract AWS payload tags Nov 4, 2024
Labels
inst: aws sdk AWS SDK instrumentation type: enhancement