Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing db.statement / db.operation sensitive data sanitization option #1543

Open
3 of 7 tasks
nemoshlag opened this issue Dec 27, 2022 · 8 comments
Open
3 of 7 tasks

Comments

@nemoshlag
Copy link
Member

nemoshlag commented Dec 27, 2022

According to semantic-conventions, a DB instrumentation should have the option to sanitize sensitive data from either db.statement or db.operation attributes. Consider using instrumentation.dbapi to reduce instrumentation-specific logic & tests. This issue seems relevant for the following instrumentations:

13042023 Edit: Updating this issue with database specifications to add db.statement attribution only if it's sanitized.
image
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/database.md#call-level-attributes

@avzis
Copy link
Contributor

avzis commented Jan 10, 2023

I think that the default behavior should be to sanitize the information,
and give an option to display the information if needed.

What do you think?
for example: #1512

@BuffaloWill
Copy link

Should psycopg2 be included in this list? db.statement already looks sanitized (maybe from the parameterized query?). For example:

{
  "value": "INSERT INTO \"db_model\" (\"title\") VALUES (%s) RETURNING \"db_model\".\"id\"",
  "key": "db.statement"
}

In my case I would like to see the raw db.statement instead of the sanitized one.

@nemoshlag
Copy link
Member Author

Since psycopg2 uses dbapi I think you can achieve raw db.statement by setting capture_parameters=True

@BuffaloWill
Copy link

BuffaloWill commented Mar 29, 2023

@nemoshlag
Copy link
Member Author

Yes, it needs to be added on psycopg2 side, similar to the way enable_commenter does. Here's how it looks from dbapi side.

@BuffaloWill
Copy link

Should I create a separate issue for this?

This is how I expected it to work in my code:

Psycopg2Instrumentor().instrument(capture_parameters=True)

@nemoshlag
Copy link
Member Author

Pls do, I'll add it to the aggregated list

@tammy-baylis-swi
Copy link
Contributor

Hi, I've got 2 questions:

  1. If we add db.statement only if it's sanitized, does it make sense if we always sanitize and always set db.statement?
  2. For the opt-in of adding original values, should the values be the full, unsanitized statement or a list of params passed in? Would it depend on the instrumentor?

For example, let's say a database client uses psycopg2 connection cursor to make (a) a parameterized query and (b) a hard-coded query:

# (a)
statement_a = "SELECT * FROM city WHERE id = %s"
city_id = "1818"
cursor.execute(statement_a, [city_id])

# (b)
statement_b = "SELECT * FROM city WHERE id = 1818"
cursor.execute(statement_b)

For both queries, I would expect db.statement to always be set and always sanitized so that it would be "SELECT * FROM city WHERE id = %s" for both (a) and (b). Or maybe a different substitution like "SELECT * FROM city WHERE id = 0".

If the user has opted into retaining the original values, what would they be? Would it be an extraction from (b) so that both (a) and (b) would set "original" parameters ["1818"]? Or would it be an interpolation for (a) so that both (a) and (b) set "original" statement "SELECT * FROM city WHERE id = 1818"? Or something else?

Looking at some existing code: In Python DBAPI instrumentor it captures db.statement.parameters as args (in this example it'd be ["1818"] and only for (a)). In the Python mongodb instrumentor we're capturing the full statement -- I don't know mongo very well so maybe there's a reason for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants