Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(node) Sanitize URLs in Span descriptions and breadcrumbs (PII) #7206

Closed
wants to merge 7 commits into from

Conversation

aldenquimby
Copy link
Contributor

@aldenquimby aldenquimby commented Feb 16, 2023

Fixes #6389

My understanding of getsentry/develop#773 is that scrubbing will happen server side, but getsentry/sentry-python#1876 seems to do the scrubbing client side? #5347 could be related. Even if we do want client-side scrubbing, this PR could be a good first step.

First time contributing

Before submitting a pull request, please take a look at our Contributing guidelines and verify:

  • If you've added code that should be tested, please add tests.
  • Ensure your code lints and the test suite passes (yarn lint) & (yarn test).

@github-actions
Copy link
Contributor

Replay SDK metrics 🚀

    Plain +Sentry +Replay
Revision Value Value Diff Ratio Value Diff Ratio
LCP This PR b66559f 86.09 ms 117.91 ms +31.82 ms +36.96 % 139.80 ms +53.71 ms +62.39 %
Previous 1cf8988 70.81 ms 94.27 ms +23.46 ms +33.13 % 124.62 ms +53.81 ms +75.99 %
CLS This PR b66559f 0.06 ms 0.06 ms +0.00 ms +0.06 % 0.06 ms +0.00 ms +0.16 %
Previous 1cf8988 0.06 ms 0.06 ms -0.00 ms -0.39 % 0.06 ms -0.00 ms -0.46 %
CPU This PR b66559f 17.83 % 19.17 % +1.33 pp +7.48 % 25.67 % +7.84 pp +43.95 %
Previous 1cf8988 12.30 % 12.01 % -0.29 pp -2.37 % 17.10 % +4.80 pp +39.02 %
JS heap avg This PR b66559f 1.94 MB 1.99 MB +47.23 kB +2.43 % 2.86 MB +921.01 kB +47.45 %
Previous 1cf8988 1.94 MB 1.99 MB +48.93 kB +2.52 % 2.87 MB +929.88 kB +47.85 %
JS heap max This PR b66559f 2.3 MB 2.56 MB +256.16 kB +11.13 % 3.35 MB +1.05 MB +45.70 %
Previous 1cf8988 2.3 MB 2.57 MB +267.17 kB +11.60 % 3.35 MB +1.05 MB +45.56 %
netTx This PR b66559f 0 B 0 B 0 B n/a 2.27 kB +2.27 kB n/a
Previous 1cf8988 0 B 0 B 0 B n/a 2.22 kB +2.22 kB n/a
netRx This PR b66559f 0 B 0 B 0 B n/a 41 B +41 B n/a
Previous 1cf8988 0 B 0 B 0 B n/a 41 B +41 B n/a
netCount This PR b66559f 0 0 0 n/a 1 +1 n/a
Previous 1cf8988 0 0 0 n/a 1 +1 n/a
netTime This PR b66559f 0.00 ms 0.00 ms 0.00 ms n/a 107.25 ms +107.25 ms n/a
Previous 1cf8988 0.00 ms 0.00 ms 0.00 ms n/a 91.07 ms +91.07 ms n/a

Previous results on branch: develop

RevisionLCPCLSCPUJS heap avgJS heap maxnetTxnetRxnetCountnetTime
1cf8988+53.81 ms-0.00 ms+4.80 pp+929.88 kB+1.05 MB+2.22 kB+41 B+1+91.07 ms
68655e3+72.60 ms+0.00 ms+7.90 pp+922.72 kB+1.04 MB+2.22 kB+41 B+1+109.40 ms
a8449de+58.27 ms-0.00 ms+7.12 pp+927.42 kB+1.05 MB+2.2 kB+41 B+1+98.31 ms
79babe9+58.69 ms-0.00 ms+4.40 pp+927.46 kB+1.06 MB+2.23 kB+41 B+1+103.20 ms
5359ba9+55.62 ms-0.00 ms+4.29 pp+935.26 kB+1.05 MB+2.2 kB+41 B+1+79.05 ms

*) pp - percentage points - an absolute difference between two percentages.
Last updated: Thu, 16 Feb 2023 20:05:52 GMT

Copy link
Member

@antonpirker antonpirker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the one question I asked in a comment it looks good.

packages/node/test/integrations/http.test.ts Outdated Show resolved Hide resolved
@cleptric
Copy link
Member

My understanding of getsentry/develop#773 is that scrubbing will happen server side, but getsentry/sentry-python#1876 seems to do the scrubbing client side? #5347 could be related. Even if we do want client-side scrubbing, this PR could be a good first step.

It's a bit of both. One of the goals of the RFC was to enable better server-side scrubbing by moving the data into a structured format. And it will also allow us to apply better client-side scrubbing in the future.

@aldenquimby aldenquimby requested review from cleptric and antonpirker and removed request for cleptric and antonpirker February 17, 2023 19:41
@joshacheson
Copy link

Just wanted to note that my org is really looking forward to this landing as well! Thanks @aldenquimby and Sentry folks for putting this work together.

@aldenquimby
Copy link
Contributor Author

@antonpirker @cleptric is there anything else you'd like to see here to get this over the finish line? Appreciate the help, we're excited to remove sensitive info from our perf traces

@cleptric cleptric requested review from a team, lforst and Lms24 and removed request for a team March 5, 2023 20:01
@cleptric
Copy link
Member

cleptric commented Mar 5, 2023

@aldenquimby Assigned the FE team, which takes care of the JS SDK.

@Lms24 Lms24 self-assigned this Mar 6, 2023
Copy link
Member

@Lms24 Lms24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @aldenquimby, thanks for opening this PR, really appreciate it! Your changes LGTM. I left a few comments regarding your questions and a small remark. Once they are addressed we'll merge this and include it in the next release.

packages/node/src/integrations/http.ts Outdated Show resolved Hide resolved
packages/node/src/integrations/http.ts Outdated Show resolved Hide resolved
packages/node/src/integrations/utils/http.ts Show resolved Hide resolved
@Lms24 Lms24 changed the title feat(pii) Sanitize URLs in Span descriptions and breadcrumbs feat(node) Sanitize URLs in Span descriptions and breadcrumbs (PII) Mar 8, 2023
@aldenquimby
Copy link
Contributor Author

@Lms24 all changes made! You could argue we shouldn't deprecate rawUrl at all, and that user-controlled span/attachment filters should have full access to unsanitized URLs. If you'd prefer that I can do another small refactor to remove the deprecation and reuse a little more of the code. As it stands I tried to make rawUrl easy to delete in a future version

@Lms24
Copy link
Member

Lms24 commented Mar 20, 2023

If you'd prefer that I can do another small refactor to remove the deprecation and reuse a little more of the code. As it stands I tried to make rawUrl easy to delete in a future version.

Hey @aldenquimby you're right about the deprecation being unnecessary. We discussed this internally and we think passing the raw url to the filter function is fine, also in the future. So yes, please if you have time, go ahead and make that change :)

Sorry for the back and forth and thanks for your patience! :)

gggritso added a commit to getsentry/sentry that referenced this pull request Mar 22, 2023
As of getsentry/sentry-javascript#7206 the
Node.js SDK will not append search parameters to the span's description
or the URL. Instead, the search parameters will be in the `http.query`
key.
@Lms24
Copy link
Member

Lms24 commented Mar 30, 2023

Closing in favour of #7667 where I just added some comments and removed the deprecation.
Thanks again for opening this PR @aldenquimby. We'll make sure, you're mentioned as a contributor in the changelog :)

@Lms24 Lms24 closed this Mar 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sanitize sensitive data from URLs sent to Sentry
6 participants