Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ioredis): only serialize non sensitive arguments in db statement attribute #1052

Merged
merged 6 commits into from
Jun 15, 2022

Conversation

aptomaKetil
Copy link
Contributor

@aptomaKetil aptomaKetil commented Jun 8, 2022

Which problem is this PR solving?

The ioredis instrumentation is currently serializing the whole redis command, including potentially sensitive (or simply large) data.

Closes #1030.

Short description of the changes

Only serialize non-sensative command argument in db.statement attribute.

Checklist

  • Ran npm run test-all-versions for the edited package(s) on the latest commit if applicable.

@aptomaKetil aptomaKetil requested a review from a team June 8, 2022 14:26
@github-actions github-actions bot requested review from blumamir and naseemkullah June 8, 2022 14:26
@aptomaKetil aptomaKetil force-pushed the ioredis-update-serializer branch from 6f1528e to 9f099e6 Compare June 8, 2022 14:35
@codecov
Copy link

codecov bot commented Jun 8, 2022

Codecov Report

Merging #1052 (a2b8ffe) into main (dc0e954) will decrease coverage by 0.18%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #1052      +/-   ##
==========================================
- Coverage   95.91%   95.72%   -0.19%     
==========================================
  Files          13       16       +3     
  Lines         856     1005     +149     
  Branches      178      200      +22     
==========================================
+ Hits          821      962     +141     
- Misses         35       43       +8     
Impacted Files Coverage Δ
...opentelemetry-instrumentation-ioredis/src/utils.ts 100.00% <100.00%> (ø)
...tapackages/auto-instrumentations-node/src/utils.ts 97.95% <0.00%> (ø)
...try-instrumentation-ioredis/src/instrumentation.ts 91.35% <0.00%> (ø)

@aptomaKetil
Copy link
Contributor Author

I went through the full list at https://redis.io/commands, and not only is there a lot of commands where we can just serialize everything, but the remainder should very often serialize more than just the first argument (as @blumamir pointed out). I've created a map of command names to number of serializable arguments so we can deal with these on a case-by-case basis.

It's not perfect, and several commands will still be missing "safe" data that could be serialized, such as SET, which has a ton of optional arguments that come after the value. I've added [x other arguments] to the end of the statement string for this purpose, so users can get some idea of the full command.

@blumamir
Copy link
Member

blumamir commented Jun 9, 2022

Awsome @aptomaKetil , that is great work, thank you so much for doing it. I like your idea of adding [x other arguments] for the dropped values

I will look at the exact details later.

If you want (optionally), it could be great to extract this logic into another new package maybe @opentelemetry/instrumentation-redis-utils under packages directory and then share the defaultDbStatementSerializer also with redis and redis-4 instrumentations. They currently implement a serializer that only captures the command name:

const defaultDbStatementSerializer: DbStatementSerializer = cmdName => cmdName;

This PR is a great improvement on its own, so if you don't have the time or will to do this, let me know and I'll open a relevant issue

@aptomaKetil
Copy link
Contributor Author

Better create an issue for it, I won't have time to look at that right now.

Copy link
Member

@blumamir blumamir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Thanks again for fixing this
Left few optional nits

@aptomaKetil
Copy link
Contributor Author

Updated based on feedback from @blumamir and @vmarchaud.

Fallback to only serialize first argument, for added safety. Change to a whitelist approach for commands that should serialize a different subset of commands. Use regexes for subset groups to keep things short.

Copy link
Member

@vmarchaud vmarchaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@vmarchaud vmarchaud requested a review from blumamir June 10, 2022 12:03
Copy link
Member

@blumamir blumamir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aptomaKetil @vmarchaud , What do you think about changing the default behavior to capture no arguments instead of one for all commands not in the white list? I think it will be safer

@vmarchaud
Copy link
Member

What do you think about changing the default behavior to capture no arguments instead of one for all commands not in the white list? I think it will be safer

I don't think there is problem adding the command name since its already on the span name right ?

@blumamir
Copy link
Member

What do you think about changing the default behavior to capture no arguments instead of one for all commands not in the white list? I think it will be safer

I don't think there is problem adding the command name since its already on the span name right ?

If I am not mistaken, the current implementation will always record the command name, and for commands not in the white list it will additionally record the first command argument.
I think it's safest to record just the command name in this case, as we cannot tell if the first argument is a secret / large payload or not

@aptomaKetil
Copy link
Contributor Author

Updated to serialize no arguments by default.

@blumamir
Copy link
Member

Updated to serialize no arguments by default.

Amazing, thank you for addressing everything. This is great work 🥇
I'll keep this open for another day and merge it tomorrow if there are no new comments.

@blumamir blumamir changed the title feat(ioredis): only serialize first argument in redis command feat(ioredis): only serialize non sensitive arguments in db statement attribute Jun 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

improv: ioredis instrumentation are leaking data by default
4 participants