tracing-instrumentation-db.md


Database and Datastore spans

We capture spans for various types of database/datastore operations, such as SQL queries, Elasticsearch queries, Redis commands, etc. Database and datastore spans must not have child spans that have a different type or subtype within the same transaction (see span-spec).

The following fields are relevant for database and datastore spans. Where possible, agents should provide information for as many of these fields as possible. The semantics of and concrete values for these fields may vary between different technologies. See the sections below for details on specific technologies.

Field Description Mandatory
name The name of the exit database span. The span name must have a low cardinality as it is used as a dimension for derived metrics! Therefore, for SQL operations we perform a limited parsing of the statement, and extract the operation name and outer-most table involved. Other databases and storages may have different strategies for the span name (see specific databases and stores in the sections below).
type For database spans, the type should be db.
subtype For database spans, the subtype should be the database vendor name. See details below for specific databases.
action The database action, e.g. query



context.db.instance Database instance name, e.g. "customers". For DynamoDB, this is the region.
context.db.statement Statement/query, e.g. SELECT * FROM foo WHERE .... The full database statement should be stored in db.statement, which may be useful for debugging performance issues. We store up to 10000 Unicode characters per database statement. For non-SQL datastores, see details below.
context.db.type Database type/category, which should be "sql" for SQL databases, and the lower-cased database name otherwise.
context.db.user Username used for database access, e.g. readonly_user
context.db.link Some SQL databases (e.g. Oracle) provide a feature for linking multiple databases to form a single logical database. The DB link differentiates single DBs of a logical database. See #107 for more details.
context.db.rows_affected The number of rows / entities affected by the corresponding db statement / query.



context.destination.address The hostname / address of the database.
context.destination.port The port under which the database is accessible.
context.destination.service.resource DEPRECATED, replaced by service.target.{type,name} fields ✅ mandatory when APM server ignores service.target.{type,name}, optional otherwise
context.destination.service.type DEPRECATED, replaced by service.target.{type,name} fields
context.destination.service.name DEPRECATED, replaced by service.target.{type,name} fields
context.destination.cloud.region The cloud region in case the datastore is hosted in a public cloud or is a managed datastore / database. E.g. AWS regions, such as us-east-1



service.target.type Defines the destination service type. The (type,name) pair replaces the deprecated context.destination.service.resource.
service.target.name Defines the destination service name. The (type,name) pair replaces the deprecated context.destination.service.resource. For relational databases, the value is equal to context.db.instance.
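
For agents that still need to populate the deprecated context.destination.service.resource field, the per-database examples in this document (dynamodb/us-east-1, s3/my-bucket, cassandra/customers) suggest a type/name composition. The following Python sketch assumes that convention. The function name legacy_resource is hypothetical and not part of this specification.

```python
from typing import Optional


def legacy_resource(target_type: str, target_name: Optional[str] = None) -> str:
    """Compose a value for the deprecated destination.service.resource field.

    Assumption: the resource is "<type>/<name>" when a name is known and
    "<type>" alone otherwise, as the per-database examples suggest.
    """
    if target_name:
        return f"{target_type}/{target_name}"
    return target_type
```

For example, legacy_resource("cassandra", "customers") yields cassandra/customers, while legacy_resource("redis") yields redis.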

Specific Databases

AWS DynamoDB

Field Value / Examples Comments
name e.g. DynamoDB UpdateItem my_table The span name should capture the operation name (as used by AWS for the action name) and the table name, if available. The format should be DynamoDB <ActionName> <TableName>. TableName MAY be omitted from the name for operations (batchWriteItem, batchGetItem, PartiQL-related methods like executeStatement etc.) that are acting on more than a single table. If TableName is not available, agents SHOULD also check the TableArn or SourceTableArn query params for a table name and extract the table name from the AWS ARN value.
type db
subtype dynamodb
action query
context.db._

_.instance e.g. us-east-1 The AWS region where the table is.
_.statement e.g. ForumName = :name and Subject = :sub For a DynamoDB Query operation, capture the KeyConditionExpression in this field. In order to avoid a high cardinality of collected values, agents SHOULD NOT include the full SQL statement for PartiQL-related methods like executeStatement.
_.type dynamodb
_.user
_.link
_.rows_affected
context.destination._

_.address e.g. dynamodb.us-west-2.amazonaws.com
_.port e.g. 443
_.service.name dynamodb DEPRECATED
_.service.type db DEPRECATED
_.service.resource dynamodb, dynamodb/us-east-1 DEPRECATED
_.cloud.region e.g. us-east-1 The AWS region where the table is, if available.
service.target._

_.type dynamodb
_.name e.g. us-east-1 Use same value as context.db.instance
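
The span naming rule above, including the fallback to TableArn or SourceTableArn, could be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the helper names are hypothetical, the request parameters are modeled as a plain dict, and the ARN is assumed to have the usual arn:partition:service:region:account:table/<name> layout.

```python
from typing import Mapping, Optional


def table_name_from_arn(arn: str) -> Optional[str]:
    """Extract the table name from a DynamoDB table ARN, if present."""
    # arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        return None
    segments = parts[5].split("/")
    # The resource part is expected to look like "table/<name>[/...]".
    if len(segments) >= 2 and segments[0] == "table" and segments[1]:
        return segments[1]
    return None


def dynamodb_span_name(action: str, params: Mapping[str, str]) -> str:
    """Build "DynamoDB <ActionName> <TableName>", omitting an unknown table."""
    table = params.get("TableName")
    if not table:
        for key in ("TableArn", "SourceTableArn"):
            arn = params.get(key)
            if arn:
                table = table_name_from_arn(arn)
                if table:
                    break
    return f"DynamoDB {action} {table}" if table else f"DynamoDB {action}"
```

Operations that span several tables, such as BatchWriteItem, simply carry no TableName in this sketch and therefore produce a name without a table.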

AWS S3

Field Value / Examples Comments
name e.g. S3 GetObject my-bucket The span name should follow this pattern: S3 <OperationName> <bucket-name>. Note that the operation name is in PascalCase.
type storage
subtype s3
action e.g. GetObject The operation name in PascalCase.
context.destination._

_.address e.g. s3.amazonaws.com Not available in some cases. Only set if the actual connection is available.
_.port e.g. 443 Not available in some cases. Only set if the actual connection is available.
_.service.name s3 DEPRECATED, use service.target.{type,name}
_.service.type storage DEPRECATED, use service.target.{type,name}
_.service.resource e.g. s3/my-bucket, s3/accesspoint/myendpointslashes, or s3/accesspoint:myendpointcolons DEPRECATED, use service.target.{type,name}
_.cloud.region e.g. us-east-1 The AWS region where the bucket is.
service.target._

_.type s3
_.name e.g. my-bucket, accesspoint/myendpointslashes, or accesspoint:myendpointcolons The bucket name, if available. The s3 API allows either the bucket name or an Access Point to be provided when referring to a bucket. Access Points can use either slashes or colons. When an Access Point is provided, the access point name preceded by accesspoint/ or accesspoint: should be extracted. For example, given an Access Point such as arn:aws:s3:us-west-2:123456789012:accesspoint/myendpointslashes, the agent extracts accesspoint/myendpointslashes. Given an Access Point such as arn:aws:s3:us-west-2:123456789012:accesspoint:myendpointcolons, the agent extracts accesspoint:myendpointcolons.
otel.attributes._

_["aws.s3.bucket"] my-bucket The bucket name, if available. See OTel Semantic Conventions. Note: this must be a single dotted string key in the otel.attributes mapping -- for example {"otel": {"attributes": {"aws.s3.bucket": "my-bucket"}}} -- and not a nested object.
_["aws.s3.key"] my/key/path The S3 object key, if applicable. See OTel Semantic Conventions. Note: this must be a single dotted string key in the otel.attributes mapping and not a nested object.
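
The access point extraction described for service.target.name can be sketched as below. The function names are hypothetical, and the sketch assumes that any value that is not an access point ARN is a plain bucket name and is used unchanged. It also shows the flat, dotted-key form of the otel.attributes mapping.

```python
from typing import Dict, Optional


def s3_target_name(bucket: Optional[str]) -> Optional[str]:
    """Return the value for service.target.name from a bucket reference."""
    if not bucket:
        return None
    if bucket.startswith("arn:"):
        # arn:partition:service:region:account-id:resource
        parts = bucket.split(":", 5)
        if len(parts) == 6 and parts[5].startswith(("accesspoint/", "accesspoint:")):
            return parts[5]
    # Assumption: anything else is already a plain bucket name.
    return bucket


def s3_otel_attributes(bucket: Optional[str], key: Optional[str]) -> Dict[str, str]:
    """Build otel.attributes with single dotted string keys, not nested objects."""
    attributes = {}
    if bucket:
        attributes["aws.s3.bucket"] = bucket
    if key:
        attributes["aws.s3.key"] = key
    return attributes
```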

Cassandra

Field Value / Examples Comments
type db
subtype cassandra
context.db.instance e.g. customers Keyspace name
context.destination.service.resource cassandra, cassandra/customers DEPRECATED
service.target.type cassandra
service.target.name e.g. customers Keyspace name

Elasticsearch

Span Field Value / Examples Comments
name e.g. Elasticsearch: GET /index/_search The span name should be Elasticsearch: <method> <path>
type db
subtype elasticsearch
action request
context.db._

_.instance e.g. my-cluster Cluster name, if available.
_.statement e.g.
{"query": {"match": {"user.id": "kimchy"}}}
For Elasticsearch search-type queries, the request body SHOULD be recorded according to the elasticsearch_capture_body_urls option (see below). If the body is gzip-encoded, the body MUST be decoded first.
_.type elasticsearch
_.user
_.link
_.rows_affected
context.http._

_.status_code 200
_.method GET
_.url https://localhost:9200/index/_search?q=user.id:kimchy
context.destination._

_.address e.g. localhost
_.port e.g. 9200
_.service.name elasticsearch DEPRECATED, use service.target.{type,name}
_.service.type db DEPRECATED, use service.target.{type,name}
_.service.resource elasticsearch, elasticsearch/my-cluster DEPRECATED, use service.target.{type,name}
context.service.target._

_.type elasticsearch
_.name e.g. my-cluster Cluster name, if available.

In addition to the usual error capture specification, the following apply to errors captured for Elasticsearch client auto-instrumentation.

Error Field Value / Examples Comments
exception.type $exceptionType or ${exceptionType} (${esResponseBody.error.type}), e.g. ResponseError (index_not_found_exception) Some Elasticsearch client errors include an error response body from Elasticsearch with an error.type. For these errors, the APM agent MAY capture that type as in the given example. This helps with error grouping in the APM app.
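
A possible way to build the exception.type value is sketched below in Python. The function name is hypothetical, and the response body is assumed to be already parsed into a dict (or to be None when the client error carries no body).

```python
from typing import Any, Optional


def es_exception_type(exception_type: str, response_body: Optional[Any]) -> str:
    """Append the Elasticsearch error.type to the client exception type, if any."""
    error = response_body.get("error") if isinstance(response_body, dict) else None
    # Guard against an "error" value that is not an object with a "type".
    es_type = error.get("type") if isinstance(error, dict) else None
    if isinstance(es_type, str) and es_type:
        return f"{exception_type} ({es_type})"
    return exception_type
```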

Cluster name

The Elasticsearch cluster name is not available in Elasticsearch clients. When using an Elastic Cloud deployment, the name of the Elasticsearch cluster is provided by the x-found-handling-cluster HTTP response header.
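
As a small illustration, the header lookup could look like the following Python sketch. Modeling the response headers as a plain dict and the function name are assumptions made for this example. Header names are compared case-insensitively, since HTTP header names are not case-sensitive.

```python
from typing import Mapping, Optional

CLUSTER_HEADER = "x-found-handling-cluster"


def cluster_name_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the Elastic Cloud cluster name from response headers, if present."""
    for name, value in headers.items():
        if name.lower() == CLUSTER_HEADER and value:
            return value
    return None
```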

elasticsearch_capture_body_urls configuration

The URL patterns for which the agent captures the request body for Elasticsearch clients.

Agents MAY offer this configuration option. If they don't, they MUST use a hard-coded list of URLs that correspond with the default value of this option.

Type List<WildcardMatcher>
Default */_search, */_search/template, */_msearch, */_msearch/template, */_async_search, */_count, */_sql, */_eql/search
Central config false
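
The following Python sketch shows how the default patterns might be applied. It rests on simplifying assumptions that are not taken from this document: a pattern's * matches any sequence of characters, matching is case-insensitive, and patterns are tested against the URL path only (without the query string). The function names are hypothetical.

```python
import re
from typing import Iterable
from urllib.parse import urlsplit

DEFAULT_CAPTURE_BODY_URLS = [
    "*/_search", "*/_search/template", "*/_msearch", "*/_msearch/template",
    "*/_async_search", "*/_count", "*/_sql", "*/_eql/search",
]


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simple wildcard pattern into a compiled regex."""
    # Escape everything except "*", which becomes ".*".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(regex, re.IGNORECASE)


def should_capture_body(url: str, patterns: Iterable[str] = DEFAULT_CAPTURE_BODY_URLS) -> bool:
    """Decide whether the request body for this URL should go into db.statement."""
    path = urlsplit(url).path
    return any(wildcard_to_regex(p).fullmatch(path) for p in patterns)
```

With these assumptions, the URL from the table above, https://localhost:9200/index/_search?q=user.id:kimchy, matches */_search, whereas a document request such as /index/_doc/1 matches no default pattern.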

MongoDB

Field Value / Examples Comments
name e.g. users.find The name for MongoDB spans should be the command name in the context of its collection/database.
type db
subtype mongodb
action e.g. find, insert, etc. The MongoDB command executed with this action.
context.db._

_.instance e.g. customers Database name, if available
_.statement e.g.
find({status: {$in: ["A","D"]}})
The MongoDB command encoded as MongoDB Extended JSON, if the command name matches mongodb_capture_statement_commands.
_.type mongodb
_.user
_.link
_.rows_affected
context.destination._

_.address e.g. localhost
_.port e.g. 27017
_.service.name mongodb DEPRECATED, use service.target.{type,name}
_.service.type db DEPRECATED, use service.target.{type,name}
_.service.resource mongodb/customers DEPRECATED, use service.target.{type,name}
service.target._

_.type mongodb
_.name e.g. customers Database name, same as db.instance if available

mongodb_capture_statement_commands configuration

This config var specifies the MongoDB command names for which the agent will capture the statement. Agents that support capturing MongoDB statements MUST implement this option.

Type List<WildcardMatcher>
Default "find,aggregate,count,distinct,mapReduce"
Central config false

Examples:

  • capture for all commands: * (This should be discouraged, because it can lead to capturing sensitive information in insert and update commands.)
  • capture common read commands: find,aggregate,count,distinct,mapReduce
  • capture no statements: "" (empty)
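
The three example values above can be exercised with the following Python sketch. It assumes the option arrives as a comma-separated string, that * matches any sequence of characters, and that matching is case-insensitive. These details and the function names are assumptions for illustration only.

```python
import re
from typing import List

DEFAULT_COMMANDS = "find,aggregate,count,distinct,mapReduce"


def parse_command_patterns(config_value: str) -> "List[re.Pattern[str]]":
    """Split the comma-separated option value into compiled wildcard patterns."""
    patterns = []
    for item in config_value.split(","):
        item = item.strip()
        if not item:
            continue  # an empty value yields no patterns, so nothing is captured
        regex = ".*".join(re.escape(part) for part in item.split("*"))
        patterns.append(re.compile(regex, re.IGNORECASE))
    return patterns


def should_capture_statement(command_name: str, config_value: str = DEFAULT_COMMANDS) -> bool:
    """Return True if the statement for this MongoDB command should be captured."""
    return any(p.fullmatch(command_name) for p in parse_command_patterns(config_value))
```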

Redis

Field Value / Examples Comments
name e.g. GET or LRANGE The name for Redis spans can simply be set to the command name.
type db
subtype redis
action query
context.db._

_.instance
_.statement
_.type redis
_.user
_.link
_.rows_affected
context.destination._

_.address e.g. localhost
_.port e.g. 6379
_.service.name redis DEPRECATED, use service.target.{type,name}
_.service.type db DEPRECATED, use service.target.{type,name}
_.service.resource redis DEPRECATED, use service.target.{type,name}
service.target._

_.type redis
_.name

SQL Databases

Field Common values / patterns for all SQL DBs Comments
name e.g. SELECT FROM products For SQL operations we perform limited parsing of the statement and extract the operation name and outer-most table involved (if any). See more details here.
type db
subtype e.g. oracle, mysql see below
action e.g. query, connect, ping, prepare, exec
context.db._

_.instance e.g. instance-name see below
_.statement e.g. SELECT * FROM products WHERE ... The full SQL statement. We store up to 10000 Unicode characters per database statement.
_.type sql
_.user e.g. readonly_user
_.rows_affected e.g. 123
context.destination._

_.address e.g. localhost
_.port e.g. 5432
_.service.type db DEPRECATED, use service.target.{type,name}
_.service.resource DEPRECATED, use service.target.{type,name}
service.target._

_.type e.g. mysql Same value as subtype
_.name Database name, same as db.instance if available
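
To illustrate the 10000-character limit and how the fields in this table relate to each other, here is a minimal Python sketch. The result is a flat dict keyed by the dotted field names used in the table, which is a convenience for this example and not the wire format. The function names are hypothetical, and the span name is passed in because the statement parsing rules are defined elsewhere.

```python
from typing import Dict, Optional

MAX_STATEMENT_CHARS = 10_000  # limit stated in this document


def truncate_statement(statement: str, limit: int = MAX_STATEMENT_CHARS) -> str:
    """Keep at most `limit` Unicode characters of the statement."""
    # Python str slicing counts Unicode code points, not bytes.
    return statement[:limit]


def sql_span_fields(name: str, subtype: str, statement: str,
                    instance: Optional[str] = None) -> Dict[str, str]:
    """Assemble the common SQL span fields as a flat, illustrative dict."""
    fields = {
        "name": name,
        "type": "db",
        "subtype": subtype,
        "action": "query",
        "context.db.type": "sql",
        "context.db.statement": truncate_statement(statement),
        "service.target.type": subtype,  # same value as subtype
    }
    if instance:
        # service.target.name mirrors db.instance when it is available.
        fields["context.db.instance"] = instance
        fields["service.target.name"] = instance
    return fields
```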

Database subtype

Database name subtype
MySQL mysql
MariaDB mariadb
PostgreSQL postgresql
Microsoft SQL Server mssql
Oracle oracle
IBM Db2 db2

Database instance

For most relational databases, the value of db.instance should map to the concept of "current database". When no database is selected, for example when creating a database, this field should be omitted.

While the semantics may vary across vendors, the goal here is to have a single string that can be used for correlation. It is thus important that all agents produce the same value.

There are multiple ways to capture it. Agents SHOULD attempt them in the following order of priority:

  1. Parsing the database connection string: parsing can be complex, but it has no runtime impact.
  2. Querying connection metadata at runtime: acceptable as a fallback, but it might trigger extra SQL queries and requires caching to minimize overhead.

For most databases, the database parameter of the connection string should be available. For those that implement the INFORMATION_SCHEMA standard, it should be included in the values returned by SELECT schema_name FROM information_schema.schemata;
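
As an illustration of the connection string approach, the following Python sketch extracts the database name from a URL-style connection string. It only covers the scheme://host:port/database form (optionally with a jdbc: prefix). Vendor-specific formats such as key=value strings are out of scope, and the function name is hypothetical.

```python
from typing import Optional
from urllib.parse import unquote, urlsplit


def db_instance_from_url(connection_url: str) -> Optional[str]:
    """Return the database name from a URL-style connection string, if any."""
    # JDBC URLs carry an extra "jdbc:" prefix in front of the vendor scheme.
    if connection_url.startswith("jdbc:"):
        connection_url = connection_url[len("jdbc:"):]
    path = urlsplit(connection_url).path
    name = unquote(path.lstrip("/"))
    # No database selected: the field should be omitted, so return None.
    return name or None
```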

Oracle : Use instance as defined in Oracle DB instances, the instance name should be the same as retrieved through SELECT sys_context('USERENV','INSTANCE_NAME') AS Instance. When multiple identifiers are available, the following priority should be applied (first available wins): INSTANCE_NAME, SERVICE_NAME, SID.
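
The "first available wins" rule for Oracle can be sketched as follows. The identifiers are assumed to have been collected into a dict beforehand (for example from the connection string), and the function name is hypothetical.

```python
from typing import Mapping, Optional

ORACLE_IDENTIFIER_PRIORITY = ("INSTANCE_NAME", "SERVICE_NAME", "SID")


def oracle_db_instance(identifiers: Mapping[str, Optional[str]]) -> Optional[str]:
    """Pick the db.instance value using the priority order given above."""
    for key in ORACLE_IDENTIFIER_PRIORITY:
        value = identifiers.get(key)
        if value:
            return value
    return None
```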

MS SQL : Use instance as defined in MS SQL instances