[AspNetCore Instrumentation] Scrubbing of sensitive data in URLs #1747
Comments
Related to #1791. |
Which instrumentation is recording this by default? It's explicit opt-in, like below: |
Oh, my bad. Apologies for the false alarm. Deleting the comment to avoid confusion from the post. |
@cijothomas I created a new issue with a related feature ask #1794 |
@cijothomas I would like to take this issue. Can you assign it to me? |
Assigned. Could you create sub-issues and propose the exact changes? The behavior of the instrumentations is based on the semantic conventions (which are still experimental), so we need to stay in sync with semantic convention changes. |
Some thoughts I have on this...
The issue with this is we don't always have the route. The route data is added after ASP.NET or ASP.NET Core runs its routing logic. That isn't always guaranteed to run (static files, middleware which completes before routing) and it isn't always guaranteed to match a route (404s, etc). In my own code, I add logic in the
I like the idea of adding an option to not add path/url but if we don't have a route (see above), those spans would be pretty useless. Maybe the option should be a callback user can specify to control the route resolution and tagging logic completely (if they so desire)? |
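To make the callback idea concrete, here is a minimal, language-agnostic sketch in Python. It is not an existing option in the instrumentation library: the resolver signature and the names `route_only`, `first_segment_fallback`, and `collect_tags` are hypothetical, chosen only to show how a user-supplied callback could decide what gets tagged when no route was matched.

```python
from typing import Callable, Dict, Optional

# Hypothetical callback shape: receives the matched route template (None when
# routing did not run or did not match) and the raw request path, and returns
# the tags to record on the span.
TagResolver = Callable[[Optional[str], str], Dict[str, str]]


def route_only(route: Optional[str], path: str) -> Dict[str, str]:
    """Record the route template when available; never record the raw path."""
    return {"http.route": route} if route else {}


def first_segment_fallback(route: Optional[str], path: str) -> Dict[str, str]:
    """Like route_only, but keep the first path segment when there is no route."""
    if route:
        return {"http.route": route}
    first = path.strip("/").split("/", 1)[0]
    return {"http.path": "/" + first + "/*"} if first else {"http.path": "/"}


def collect_tags(route: Optional[str], path: str,
                 resolver: TagResolver = route_only) -> Dict[str, str]:
    """Delegate the tagging decision entirely to the supplied resolver."""
    return resolver(route, path)
```

With `first_segment_fallback`, a static-file request such as `/static/css/site.css` would still produce a low-cardinality tag instead of an empty span, while the default `route_only` never records the path at all.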
We found an alternate solution that doesn't require changes in the instrumentation library and can still achieve minimal performance impact which goes like this
Note: Regexes aren't used for any of the processing, to avoid any perf overhead. |
@cijothomas Sorry about the wall of text 😅 Redacting sensitive data from
|
Not recommended.
Processor is an SDK concept, and instrumentations should only have an API dependency. Left a couple of quick comments above. I am not actively working on this, so tagging @vishweshbankwar to help further. |
Related: #2191 @kaspertygesen Thanks for your interest in this. We are currently working on multiple changes on the instrumentation libraries. We have not evaluated this particular issue in detail. Once we do, I will reach out to you. |
Hi, is there any update on this? Thanks |
Have you folks considered leveraging Microsoft.Extensions.Compliance.Redaction for this work? It would be nice to use that abstraction rather than create something custom just for OTel. As a bonus, it also promotes the standard redaction library and potentially pushes for more features there (if something needed for OTel is missing, for instance). |
Interesting but I am not sure a dependency like that will be accepted. |
If you know the data is sensitive, why would you expose it in the URL in the first place? |
The recommended way to redact is using the OpenTelemetry Collector. We now have redaction of the query string parameters. Further, putting sensitive data in the path is not recommended anyway, since the path ends up in web server access logs. I don't think there's anything more to do on this issue. |
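For readers unfamiliar with what query string redaction means here, the general idea can be sketched as below. This is an illustration only, not the instrumentation library's implementation; the function name `redact_query` and the placeholder text are assumptions made for the example.

```python
def redact_query(query: str, placeholder: str = "Redacted") -> str:
    """Replace every query-string value with a placeholder, keeping the keys.

    Uses plain string splitting rather than regular expressions.
    """
    if not query:
        return query
    redacted = []
    for pair in query.split("&"):
        key, sep, _value = pair.partition("=")
        # A bare key (no "=") has no value to hide, so it is kept unchanged.
        redacted.append(key + sep + placeholder if sep else key)
    return "&".join(redacted)
```

Keys stay visible so the span still shows which parameters were sent, while the values, which are where secrets usually sit, are dropped.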
Is there a way to opt out? |
|
thanks, @CodeBlanch! I asked and then found the corresponding environment variables myself. I asked a question about them here: open-telemetry/semantic-conventions#860 (comment) |
Feature Request
Make the instrumentation library emit URL information in a more flexible manner, so that consumers can write a processor that scrubs sensitive information with minimal impact on performance.
Is your feature request related to a problem?
The AspNetCore instrumentation library emits http.path and http.url tags containing the URL. These URLs can contain sensitive information, e.g. something like https://www.contoso.com/users/[email protected]/chat/chat-12345
In this example, the URL contains the user's id, which is sensitive information that could lead to privacy issues. Currently the only way to scrub this information is to write a processor that runs complex regexes on these URLs to detect and scrub sensitive information, which causes a significant performance hit.
Note: A similar problem exists for other instrumentation libraries, and we would like to have similar support for those too.
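As an illustration of the kind of scrubbing discussed above, the sketch below redacts path segments by position using plain string splitting instead of regexes. It is language-agnostic Python, not an OpenTelemetry API: the rule set `SENSITIVE_AFTER`, the placeholder, and the function names are hypothetical, and the user name in the usage note is made up.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical rule set: the path segment that follows each of these names
# is treated as sensitive. Not part of any OpenTelemetry API.
SENSITIVE_AFTER = {"users", "chat"}
PLACEHOLDER = "REDACTED"


def scrub_path(path: str) -> str:
    """Redact the segment following any name in SENSITIVE_AFTER, without regex."""
    segments = path.split("/")
    scrubbed = [
        PLACEHOLDER
        if i > 0 and seg and segments[i - 1].lower() in SENSITIVE_AFTER
        else seg
        for i, seg in enumerate(segments)
    ]
    return "/".join(scrubbed)


def scrub_url(url: str) -> str:
    """Apply scrub_path to the path component and leave the rest of the URL as-is."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=scrub_path(parts.path)))
```

For example, `scrub_url("https://www.contoso.com/users/alice/chat/chat-12345")` yields `https://www.contoso.com/users/REDACTED/chat/REDACTED`. This only works when the sensitive positions are known up front, which is why having the route template available matters.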
Describe the solution you'd like:
I would like to propose the following
Another point worth discussing here: in this solution, consumers of the library write their own processors, which can be customized with their own redaction library and redaction mechanism. It would still be good to see whether the redaction mechanism can be standardized somehow.
Describe alternatives you've considered.
Additional Context
Not leaking sensitive data is critical for our enterprise services so not scrubbing isn't an option.