Revise and extend semantic conventions for databases #575
Changes from 17 commits
ccd4a07
31efbd9
3207ed5
b2d335e
f999d31
b86ec24
050039d
ccefcfc
4b6e149
096a7a6
685bc4e
dd22c79
26ceb4f
3c22ce2
15d7c76
6f517d4
163b20c
72c6a19
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,24 +1,190 @@ | ||
# Semantic conventions for database client calls | ||
|
||
For database client call the `SpanKind` MUST be `Client`. | ||
<!-- Re-generate TOC with `markdown-toc --no-first-h1 -i` --> | ||
|
||
Span `name` should be set to low cardinality value representing the statement | ||
executed on the database. It may be stored procedure name (without argument), sql | ||
statement without variable arguments, etc. When it's impossible to get any | ||
meaningful representation of the span `name`, it can be populated using the same | ||
value as `db.instance`. | ||
<!-- toc --> | ||
|
||
Note, Redis, Cassandra, HBase and other storage systems may reuse the same | ||
attribute names. | ||
- [Connection-level attributes](#connection-level-attributes) | ||
* [Notes and well-known identifiers for `db.system`](#notes-and-well-known-identifiers-for-dbsystem) | ||
* [Connection-level attributes for specific technologies](#connection-level-attributes-for-specific-technologies) | ||
- [Call-level attributes](#call-level-attributes) | ||
* [Call-level attributes for specific technologies](#call-level-attributes-for-specific-technologies) | ||
- [Examples](#examples) | ||
* [MySQL](#mysql) | ||
* [Redis](#redis) | ||
* [MongoDB](#mongodb) | ||
|
||
<!-- tocstop --> | ||
|
||
**Span kind:** MUST always be `CLIENT`. | ||
|
||
The **span name** SHOULD be set to a low cardinality value representing the statement executed on the database. | ||
It may be a stored procedure name (without arguments), SQL statement without variable arguments, operation name, etc. | ||
When it's otherwise impossible to get any meaningful span name, `db.name` or the tech-specific database name MAY be used. | ||
|
||
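The span-name guidance above can be sketched as a small helper. This is only an illustration and not part of the proposed specification text: the function name `choose_span_name` and its parameters are hypothetical, it assumes the instrumentation already holds a low-cardinality form of the statement (for example, with values replaced by `?`), and the preference of a statement over an operation name is an assumption, since the text only fixes `db.name` as the last resort.

```python
from typing import Optional


def choose_span_name(
    low_cardinality_statement: Optional[str] = None,
    operation: Optional[str] = None,
    db_name: Optional[str] = None,
) -> Optional[str]:
    """Pick a span name for a database client span.

    All names here are hypothetical. The order statement -> operation is an
    assumption; only the final fallback to the database name comes from the
    convention text.
    """
    # Try the most descriptive low-cardinality value first and fall back
    # to the database name only when nothing else is available.
    for candidate in (low_cardinality_statement, operation, db_name):
        if candidate:
            return candidate
    # Nothing meaningful is known; the caller has to decide what to do.
    return None
```

For instance, a MongoDB instrumentation that has no statement would pass only `operation` and `db_name` and get the operation name back.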
## Connection-level attributes | ||
|
||
These attributes will usually be the same for all operations performed over the same database connection. | ||
Some database systems may allow a connection to switch to a different `db.user`, for example, and other database systems may not even have the concept of a connection at all. | ||
|
||
| Attribute name | Notes and examples | Required? | | ||
| :------------- | :----------------------------------------------------------- | --------- | | ||
| `db.type` | Database type. For any SQL database, `"sql"`. For others, the lower-case database category, e.g. `"cassandra"`, `"hbase"`, or `"redis"`. | Yes | | ||
| `db.instance` | Database instance name. E.g., In java, if the jdbc.url=`"jdbc:mysql://db.example.com:3306/customers"`, the instance name is `"customers"`. | Yes | | ||
| `db.statement` | A database statement for the given database type. Note, that the value may be sanitized to exclude sensitive information. E.g., for `db.type="sql"`, `"SELECT * FROM wuser_table"`; for `db.type="redis"`, `"SET mykey 'WuValue'"`. | Yes | | ||
| `db.user` | Username for accessing database. E.g., `"readonly_user"` or `"reporting_user"` | No | | ||
| `db.url` | JDBC substring like `"mysql://db.example.com:3306"` | Yes | | ||
| `db.system` | An identifier for the database management system (DBMS) product being used. See below for a [list of well-known identifiers](#notes-and-well-known-identifiers-for-dbsystem). | Yes | | ||
| `db.connection_string` | The connection string used to connect to the database. It is recommended to remove embedded credentials. | No | | ||
| `db.user` | Username for accessing the database, e.g., `"readonly_user"` or `"reporting_user"` | No | | ||
| `net.peer.name` | Defined in the general [network attributes][]. | See below | | ||
| `net.peer.ip` | Defined in the general [network attributes][]. | See below | | ||
| `net.peer.port` | Defined in the general [network attributes][]. | See below | | ||
| `net.transport` | Defined in the general [network attributes][]. | See below | | ||
|
||
Additionally at least one of `net.peer.name` or `net.peer.ip` from the [network attributes][] is required and `net.peer.port` is recommended. | ||
At least one of `net.peer.name` or `net.peer.ip` from the [network attributes][] is required and `net.peer.port` is recommended. | ||
If using a port other than the default port for this DBMS, `net.peer.port` is required. | ||
Review comment: Instrumentation may not know anything about ports.
Review comment: @yurishkuro see the response to your comment above #575 (comment) |
||
Furthermore, it is strongly recommended to add the [`net.transport`][] attribute and follow its guidelines. | ||
For in-process databases, `net.transport` MUST be set to `"inproc"`. | ||
Review comment: Knowledge of in-process vs. external DB may be abstracted away from instrumentation.
Review comment: @yurishkuro see the response to your comment above #575 (comment) |
||
|
||
[network attributes]: span-general.md#general-network-connection-attributes | ||
[`net.transport`]: span-general.md#nettransport-attribute | ||
|
||
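The network-attribute rules in this section could be applied by an instrumentation roughly as follows. This is a minimal sketch under stated assumptions: the function `network_attributes` and its parameters are hypothetical, and leaving out the peer attributes for in-process databases is an assumption, since the text only mandates `net.transport="inproc"` for that case.

```python
from typing import Dict, Optional, Union

AttributeValue = Union[str, int]


def network_attributes(
    peer_name: Optional[str] = None,
    peer_ip: Optional[str] = None,
    peer_port: Optional[int] = None,
    in_process: bool = False,
) -> Dict[str, AttributeValue]:
    """Build the network-related attributes for a database client span."""
    if in_process:
        # In-process databases MUST report the "inproc" transport.
        # Assumption: peer attributes are not meaningful in this case.
        return {"net.transport": "inproc"}

    if peer_name is None and peer_ip is None:
        raise ValueError("at least one of net.peer.name or net.peer.ip is required")

    attrs: Dict[str, AttributeValue] = {}
    if peer_name is not None:
        attrs["net.peer.name"] = peer_name
    if peer_ip is not None:
        attrs["net.peer.ip"] = peer_ip
    if peer_port is not None:
        # Recommended in general, required when the port is not the DBMS default.
        attrs["net.peer.port"] = peer_port
    return attrs
```

Raising an error when neither peer attribute is known is one possible design; an instrumentation that cannot fail might log and omit the attributes instead.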
### Notes and well-known identifiers for `db.system` | ||
|
||
This is a non-exhaustive list of well-known identifiers to be specified for `db.system`. | ||
|
||
If a value defined in this list applies to the DBMS to which the request is sent, this value MUST be used. | ||
If no value defined in this list is suitable, a custom value MUST be provided. | ||
This custom value MUST be the name of the DBMS in lowercase and without a version number to stay consistent with existing identifiers. | ||
|
||
|
||
It is encouraged to open a PR towards this specification to add missing values to the list, especially when instrumentations for those missing databases are written. | ||
This keeps multiple instrumentations for the same database aligned and makes analysis easier for back ends. | ||
|
||
The value `other_sql` is intended as a fallback and MUST only be used if the DBMS is known to be SQL-compliant but the concrete product is not known to the instrumentation. | ||
If the concrete DBMS is known to the instrumentation, its specific identifier MUST be used. | ||
|
||
| Value for `db.system` | Product name | Note | | ||
|
||
| :-------------------- | :------------------------ | :----------------------------- | | ||
| `"db2"` | IBM Db2 | | | ||
| `"derby"` | Apache Derby | | | ||
| `"hive"` | Apache Hive | | | ||
| `"mariadb"` | MariaDB | | | ||
| `"mssql"` | Microsoft SQL Server | | | ||
| `"mysql"` | MySQL | | | ||
| `"oracle"` | Oracle Database | | | ||
| `"postgresql"` | PostgreSQL | | | ||
| `"sqlite"` | SQLite | | | ||
| `"teradata"` | Teradata | | | ||
| `"other_sql"` | Some other SQL Database | Fallback only. See note above. | | ||
| `"cassandra"` | Cassandra | | | ||
| `"cosmosdb"` | Microsoft Azure Cosmos DB | | | ||
| `"couchbase"` | Couchbase | | | ||
| `"couchdb"` | CouchDB | | | ||
| `"dynamodb"` | Amazon DynamoDB | | | ||
| `"hbase"` | HBase | | | ||
| `"mongodb"` | MongoDB | | | ||
| `"neo4j"` | Neo4j | | | ||
| `"redis"` | Redis | | | ||
|
||
Back ends could, for example, use the provided identifier to determine the appropriate SQL dialect for parsing the `db.statement`. | ||
|
||
When additional attributes are added that only apply to a specific DBMS, its identifier SHOULD be used as a namespace in the attribute key as for the attributes in the sections below. | ||
|
||
### Connection-level attributes for specific technologies | ||
|
||
| Technology | Attribute name | Notes and examples | Required? | | ||
| ---------- | :------------- | :----------------------------------------------------------- | --------- | | ||
| Microsoft SQL Server | `db.mssql.instance_name` | The [instance name][] being connected to. This name is used to determine the port of a named instance. | See below. | | ||
|
||
| JDBC Clients | `db.jdbc.driver_classname` | The fully-qualified class name of the [Java Database Connectivity (JDBC)][jdbc] driver used to connect, e.g., `"org.postgresql.Driver"` or `"com.microsoft.sqlserver.jdbc.SQLServerDriver"`. | No | | ||
|
||
[instance name]: https://docs.microsoft.com/en-us/sql/connect/jdbc/building-the-connection-url?view=sql-server-ver15 | ||
[jdbc]: https://docs.oracle.com/javase/8/docs/technotes/guides/jdbc/ | ||
|
||
- Microsoft SQL Server: | ||
|
||
- If setting a `db.mssql.instance_name`, `net.peer.port` is no longer required (but still recommended if non-standard). | ||
|
||
## Call-level attributes | ||
|
||
These attributes may be different for each operation performed, even if the same connection is used for multiple operations. | ||
Usually only one `db.name` will be used per connection though. | ||
|
||
| Attribute name | Notes and examples | Required? | | ||
| :------------- | :----------------------------------------------------------- | --------- | | ||
| `db.name` | If no [tech-specific attribute](#call-level-attributes-for-specific-technologies) is defined in the list below, this attribute is used to report the name of the database being accessed. For commands that switch the database, this should be set to the target database (even if the command fails). | Yes (if applicable and no more specific attribute is defined) | | ||
| `db.statement` | The database statement being executed. Note that the value may be sanitized to exclude sensitive information. E.g., for `db.system="other_sql"`, `"SELECT * FROM wuser_table"`; for `db.system="redis"`, `"SET mykey 'WuValue'"`. | Yes (if applicable) | | ||
| `db.operation` | The name of the operation being executed, e.g. the [MongoDB command name][] such as `findAndModify`. While it would semantically make sense to set this, e.g., to an SQL keyword like `SELECT` or `INSERT`, it is *not* recommended to attempt any client-side parsing of `db.statement` just to get this property (the back end can do that if required). | Yes, if `db.statement` is not applicable. | | ||
|
||
|
||
[MongoDB command name]: https://docs.mongodb.com/manual/reference/command/#database-operations | ||
|
||
In some **SQL** databases, the database name to be used for `db.name` is called "schema name". | ||
|
||
For **Redis**, the value provided for `db.statement` SHOULD correspond to the syntax of the Redis CLI. | ||
If, for example, the [`HMSET` command][] is invoked, `"HMSET myhash field1 'Hello' field2 'World'"` would be a suitable value for `db.statement`. | ||
|
||
[`HMSET` command]: https://redis.io/commands/hmset | ||
|
||
In **CouchDB**, `db.operation` should be set to the HTTP method + the target REST route according to the API reference documentation. | ||
For example, when retrieving a document, `db.operation` would be set to (literally, i.e., without replacing the placeholders with concrete values): [`GET /{db}/{docid}`][CouchDB get doc]. | ||
|
||
[CouchDB get doc]: http://docs.couchdb.org/en/stable/api/document/common.html#get--db-docid | ||
|
||
### Call-level attributes for specific technologies | ||
|
||
| Technology | Attribute name | Notes and examples | Required? | | ||
| ---------- | :------------------------ | :----------------------------------------------------------- | --------- | | ||
| Cassandra | `db.cassandra.keyspace` | The name of the keyspace being accessed. To be used instead of the generic `db.name` attribute. | Yes | | ||
| HBase | `db.hbase.namespace` | The [HBase namespace][] being accessed. To be used instead of the generic `db.name` attribute. | Yes | | ||
| Redis | `db.redis.database_index` | The index of the database being accessed as used in the [`SELECT` command], provided as an integer. To be used instead of the generic `db.name` attribute. | Yes, if other than the default database (`0`) | | ||
| MongoDB | `db.mongodb.collection` | The collection being accessed within the database stated in `db.name`. | Yes | | ||
|
||
[HBase namespace]: https://hbase.apache.org/book.html#_namespace | ||
[`SELECT` command]: https://redis.io/commands/select | ||
|
||
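How the technology-specific attributes replace the generic `db.name` can be illustrated with a small lookup. This is a sketch only: the helper `database_name_attribute` is hypothetical, and omitting the attribute for the default Redis database (index `0`) is one possible reading of "Yes, if other than the default database"; setting it to `0` would also be permitted.

```python
from typing import Dict, Union

# Technologies for which the table above defines a replacement for `db.name`.
TECH_SPECIFIC_NAME_ATTRIBUTES = {
    "cassandra": "db.cassandra.keyspace",
    "hbase": "db.hbase.namespace",
    "redis": "db.redis.database_index",
}


def database_name_attribute(
    db_system: str, database: Union[str, int]
) -> Dict[str, Union[str, int]]:
    """Return the attribute that names the database being accessed."""
    key = TECH_SPECIFIC_NAME_ATTRIBUTES.get(db_system, "db.name")
    if db_system == "redis":
        # The Redis database index is reported as an integer.
        index = int(database)
        # Assumption: leave the attribute out for the default database.
        return {} if index == 0 else {key: index}
    return {key: database}
```

MongoDB is deliberately absent from the lookup, because `db.mongodb.collection` is reported in addition to `db.name`, not instead of it.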
## Examples | ||
|
||
### MySQL | ||
|
||
| Key | Value | | ||
| :---------------------- | :----------------------------------------------------------- | | ||
| Span name | `"SELECT * FROM orders WHERE order_id = ?"` | | ||
| `db.system` | `"mysql"` | | ||
| `db.connection_string` | `"Server=shopdb.example.com;Database=ShopDb;Uid=billing_user;TableCache=true;UseCompression=True;MinimumPoolSize=10;MaximumPoolSize=50;"` | | ||
| `db.user` | `"billing_user"` | | ||
| `net.peer.name` | `"shopdb.example.com"` | | ||
| `net.peer.ip` | `"192.0.2.12"` | | ||
| `net.peer.port` | `3306` | | ||
| `net.transport` | `"IP.TCP"` | | ||
| `db.name` | `"ShopDb"` | | ||
| `db.statement` | `"SELECT * FROM orders WHERE order_id = 'o4711'"` | | ||
| `db.operation` | not set | | ||
|
||
### Redis | ||
|
||
In this example, Redis is connected using a Unix domain socket and therefore the connection string and `net.peer.ip` are left out. | ||
Furthermore, `db.name` is not specified as there is no database name in Redis. | ||
If a database other than the default database (index 0) is used, its index number would be specified as `db.redis.database_index`. | ||
|
||
| Key | Value | | ||
| :---------------------- | :----------------------------------------------------------- | | ||
| Span name | `"HMSET myhash"` | | ||
| `db.system` | `"redis"` | | ||
| `db.connection_string` | not set | | ||
| `db.user` | not set | | ||
| `net.peer.name` | `"/tmp/redis.sock"` | | ||
| `net.transport` | `"Unix"` | | ||
| `db.name` | not set | | ||
| `db.statement` | `"HMSET myhash field1 'Hello' field2 'World'"` | | ||
| `db.operation` | not set | | ||
|
||
### MongoDB | ||
|
||
| Key | Value | | ||
| :---------------------- | :----------------------------------------------------------- | | ||
| Span name | `"products.findAndModify"` | | ||
| `db.system` | `"mongodb"` | | ||
| `db.connection_string` | not set | | ||
| `db.user` | `"the_user"` | | ||
| `net.peer.name` | `"mongodb0.example.com"` | | ||
| `net.peer.ip` | `"192.0.2.14"` | | ||
| `net.peer.port` | `27017` | | ||
| `net.transport` | `"IP.TCP"` | | ||
| `db.name` | `"shopDb"` | | ||
| `db.statement` | not set | | ||
| `db.operation` | `"findAndModify"` | | ||
| `db.mongodb.collection` | `"products"` | |
Review comment: I don't think any of these can be required. Instrumentation may not know any of that.

Review comment: This requirement was already there before, I just added the attributes to the table so they're not overlooked that easily.
An instrumentation should be able to capture it, otherwise there will only be little value if the database is unknown. In which cases would it not know about that, given that we expect instrumentation to be built-in into the database client library?

Review comment: @yurishkuro ping ^^^

Review comment: After #653, maybe this can be updated in a follow-up (or as part of #653). As of now, it seems there is not much point in bikeshedding over required vs recommended.