From 32b75a8d465ddea8af396666cd4020c15f4859e1 Mon Sep 17 00:00:00 2001 From: Liudmila Molkova Date: Fri, 4 Oct 2024 09:23:41 -0700 Subject: [PATCH 1/4] Introduce per-message structured GenAI events instead of prompt/completion span events (#980) --- .chloggen/980.yaml | 4 + docs/attributes-registry/gen-ai.md | 60 ++- docs/gen-ai/README.md | 1 + docs/gen-ai/gen-ai-events.md | 383 ++++++++++++++++++ docs/gen-ai/gen-ai-spans.md | 66 +-- .../deprecated/registry-deprecated.yaml | 12 + model/gen-ai/events.yaml | 48 +++ model/gen-ai/registry.yaml | 12 - model/gen-ai/spans.yaml | 29 -- 9 files changed, 481 insertions(+), 134 deletions(-) create mode 100644 .chloggen/980.yaml create mode 100644 docs/gen-ai/gen-ai-events.md create mode 100644 model/gen-ai/events.yaml diff --git a/.chloggen/980.yaml b/.chloggen/980.yaml new file mode 100644 index 0000000000..e8bc588dfe --- /dev/null +++ b/.chloggen/980.yaml @@ -0,0 +1,4 @@ +change_type: breaking +component: gen_ai +note: Deprecate `gen_ai.prompt` and `gen_ai.completion` attributes, introduce log-based events for GenAI inputs and outputs. +issues: [834, 980] diff --git a/docs/attributes-registry/gen-ai.md b/docs/attributes-registry/gen-ai.md index 0dc935e462..12c5283980 100644 --- a/docs/attributes-registry/gen-ai.md +++ b/docs/attributes-registry/gen-ai.md @@ -14,34 +14,28 @@ This document defines the attributes used to describe telemetry in the context of Generative Artificial Intelligence (GenAI) Models requests and responses. -| Attribute | Type | Description | Examples | Stability | -| ---------------------------------- | -------- | ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------- | ---------------------------------------------------------------- | -| `gen_ai.completion` | string | The full response received from the GenAI model. [1] | `[{'role': 'assistant', 'content': 'The capital of France is Paris.'}]` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.operation.name` | string | The name of the operation being performed. [2] | `chat`; `text_completion` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.prompt` | string | The full prompt sent to the GenAI model. [3] | `[{'role': 'user', 'content': 'What is the capital of France?'}]` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.request.frequency_penalty` | double | The frequency penalty setting for the GenAI request. | `0.1` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.request.max_tokens` | int | The maximum number of tokens the model generates for a request. | `100` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.request.model` | string | The name of the GenAI model a request is being made to. | `gpt-4` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.request.presence_penalty` | double | The presence penalty setting for the GenAI request. | `0.1` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.request.stop_sequences` | string[] | List of sequences that the model will use to stop generating further tokens. | `["forest", "lived"]` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.request.temperature` | double | The temperature setting for the GenAI request. | `0.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.request.top_k` | double | The top_k sampling setting for the GenAI request. | `1.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.request.top_p` | double | The top_p sampling setting for the GenAI request. | `1.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.response.finish_reasons` | string[] | Array of reasons the model stopped generating tokens, corresponding to each generation received. | `["stop"]`; `["stop", "length"]` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.response.id` | string | The unique identifier for the completion. | `chatcmpl-123` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.response.model` | string | The name of the model that generated the response. | `gpt-4-0613` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.system` | string | The Generative AI product as identified by the client or server instrumentation. [4] | `openai` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.token.type` | string | The type of token being counted. | `input`; `output` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.usage.input_tokens` | int | The number of tokens used in the GenAI input (prompt). | `100` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `gen_ai.usage.output_tokens` | int | The number of tokens used in the GenAI response (completion). | `180` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | - -**[1]:** It's RECOMMENDED to format completions as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) - -**[2]:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value. - -**[3]:** It's RECOMMENDED to format prompts as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) - -**[4]:** The `gen_ai.system` describes a family of GenAI models with specific model identified +| Attribute | Type | Description | Examples | Stability | +| ---------------------------------- | -------- | ------------------------------------------------------------------------------------------------ | -------------------------------- | ---------------------------------------------------------------- | +| `gen_ai.operation.name` | string | The name of the operation being performed. [1] | `chat`; `text_completion` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.request.frequency_penalty` | double | The frequency penalty setting for the GenAI request. | `0.1` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.request.max_tokens` | int | The maximum number of tokens the model generates for a request. | `100` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.request.model` | string | The name of the GenAI model a request is being made to. | `gpt-4` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.request.presence_penalty` | double | The presence penalty setting for the GenAI request. | `0.1` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.request.stop_sequences` | string[] | List of sequences that the model will use to stop generating further tokens. | `["forest", "lived"]` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.request.temperature` | double | The temperature setting for the GenAI request. | `0.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.request.top_k` | double | The top_k sampling setting for the GenAI request. | `1.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.request.top_p` | double | The top_p sampling setting for the GenAI request. | `1.0` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.response.finish_reasons` | string[] | Array of reasons the model stopped generating tokens, corresponding to each generation received. | `["stop"]`; `["stop", "length"]` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.response.id` | string | The unique identifier for the completion. | `chatcmpl-123` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.response.model` | string | The name of the model that generated the response. | `gpt-4-0613` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.system` | string | The Generative AI product as identified by the client or server instrumentation. [2] | `openai` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.token.type` | string | The type of token being counted. | `input`; `output` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.usage.input_tokens` | int | The number of tokens used in the GenAI input (prompt). | `100` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `gen_ai.usage.output_tokens` | int | The number of tokens used in the GenAI response (completion). | `180` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +**[1]:** If one of the predefined values applies, but specific system uses a different name it's RECOMMENDED to document it in the semantic conventions for specific GenAI system and use system-specific name in the instrumentation. If a different name is not documented, instrumentation libraries SHOULD use applicable predefined value. + +**[2]:** The `gen_ai.system` describes a family of GenAI models with specific model identified by `gen_ai.request.model` and `gen_ai.response.model` attributes. The actual GenAI product may differ from the one identified by the client. @@ -104,7 +98,9 @@ Thie group defines attributes for OpenAI. Describes deprecated `gen_ai` attributes. -| Attribute | Type | Description | Examples | Stability | -| -------------------------------- | ---- | ----------------------------------------------------- | -------- | ------------------------------------------------------------------------------------------------------------------ | -| `gen_ai.usage.completion_tokens` | int | Deprecated, use `gen_ai.usage.output_tokens` instead. | `42` | ![Deprecated](https://img.shields.io/badge/-deprecated-red)
Replaced by `gen_ai.usage.output_tokens` attribute. | -| `gen_ai.usage.prompt_tokens` | int | Deprecated, use `gen_ai.usage.input_tokens` instead. | `42` | ![Deprecated](https://img.shields.io/badge/-deprecated-red)
Replaced by `gen_ai.usage.input_tokens` attribute. | +| Attribute | Type | Description | Examples | Stability | +| -------------------------------- | ------ | --------------------------------------------------------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ | +| `gen_ai.completion` | string | Deprecated, use Event API to report completions contents. | `[{'role': 'assistant', 'content': 'The capital of France is Paris.'}]` | ![Deprecated](https://img.shields.io/badge/-deprecated-red)
Removed, no replacement at this time. | +| `gen_ai.prompt` | string | Deprecated, use Event API to report prompt contents. | `[{'role': 'user', 'content': 'What is the capital of France?'}]` | ![Deprecated](https://img.shields.io/badge/-deprecated-red)
Removed, no replacement at this time. | +| `gen_ai.usage.completion_tokens` | int | Deprecated, use `gen_ai.usage.output_tokens` instead. | `42` | ![Deprecated](https://img.shields.io/badge/-deprecated-red)
Replaced by `gen_ai.usage.output_tokens` attribute. | +| `gen_ai.usage.prompt_tokens` | int | Deprecated, use `gen_ai.usage.input_tokens` instead. | `42` | ![Deprecated](https://img.shields.io/badge/-deprecated-red)
Replaced by `gen_ai.usage.input_tokens` attribute. | diff --git a/docs/gen-ai/README.md b/docs/gen-ai/README.md index 086a6d327f..020fd0a4ca 100644 --- a/docs/gen-ai/README.md +++ b/docs/gen-ai/README.md @@ -16,6 +16,7 @@ use the conventions in limited non-critical workloads and share the feedback Semantic conventions for Generative AI operations are defined for the following signals: +* [Events](gen-ai-events.md): Semantic Conventions for Generative AI inputs and outputs - *events*. * [Metrics](gen-ai-metrics.md): Semantic Conventions for Generative AI operations - *metrics*. * [Spans](gen-ai-spans.md): Semantic Conventions for Generative AI requests - *spans*. diff --git a/docs/gen-ai/gen-ai-events.md b/docs/gen-ai/gen-ai-events.md new file mode 100644 index 0000000000..ea394a7c3c --- /dev/null +++ b/docs/gen-ai/gen-ai-events.md @@ -0,0 +1,383 @@ + + +# Semantic Conventions for GenAI events + +**Status**: [Experimental][DocumentStatus] + + + + + +- [Common attributes](#common-attributes) +- [System event](#system-event) +- [User event](#user-event) +- [Assistant event](#assistant-event) + - [`ToolCall` object](#toolcall-object) + - [`Function` object](#function-object) +- [Tool event](#tool-event) +- [Choice event](#choice-event) + - [`Message` object](#message-object) +- [Custom events](#custom-events) +- [Examples](#examples) + - [Chat completion](#chat-completion) + - [Tools](#tools) + - [Chat completion with multiple choices](#chat-completion-with-multiple-choices) + + + +GenAI instrumentations MAY capture user inputs sent to the model and responses received from it as [events](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.33.0/specification/logs/event-api.md). + +> Note: +> Event API is experimental and not yet available in some languages. Check [spec-compliance matrix](https://github.com/open-telemetry/opentelemetry-specification/blob/main/spec-compliance-matrix.md#events) to see the implementation status in corresponding language. + +Instrumentations MAY capture inputs and outputs if and only if application has enabled the collection of this data. +This is for three primary reasons: + +1. Data privacy concerns. End users of GenAI applications may input sensitive information or personally identifiable information (PII) that they do not wish to be sent to a telemetry backend. +2. Data size concerns. Although there is no specified limit to sizes, there are practical limitations in programming languages and telemetry systems. Some GenAI systems allow for extremely large context windows that end users may take full advantage of. +3. Performance concerns. Sending large amounts of data to a telemetry backend may cause performance issues for the application. + +Body fields that contain user input, model output, or other potentially sensitive and verbose data +SHOULD NOT be captured by default. + +Semantic conventions for individual systems which extend content events SHOULD document all additional body fields and specify whether they +should be captured by default or need application to opt into capturing them. + +Telemetry consumers SHOULD expect to receive unknown body fields. + +Instrumentations SHOULD NOT capture undocumented body fields and MUST follow the documented defaults for known fields. +Instrumentations MAY offer configuration options allowing to disable events or allowing to capture all fields. + +## Common attributes + +The following attributes apply to all GenAI events. + + + + + + + + +| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | +|---|---|---|---|---|---| +| [`gen_ai.system`](/docs/attributes-registry/gen-ai.md) | string | The Generative AI product as identified by the client or server instrumentation. [1] | `openai` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + +**[1]:** The `gen_ai.system` describes a family of GenAI models with specific model identified +by `gen_ai.request.model` and `gen_ai.response.model` attributes. + +The actual GenAI product may differ from the one identified by the client. +For example, when using OpenAI client libraries to communicate with Mistral, the `gen_ai.system` +is set to `openai` based on the instrumentation's best knowledge. + +For custom model, a custom friendly name SHOULD be used. +If none of these options apply, the `gen_ai.system` SHOULD be set to `_OTHER`. + + + +`gen_ai.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. + +| Value | Description | Stability | +|---|---|---| +| `anthropic` | Anthropic | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `cohere` | Cohere | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `openai` | OpenAI | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `vertex_ai` | Vertex AI | ![Experimental](https://img.shields.io/badge/-experimental-blue) | + + + + + + + + +## System event + +This event describes the instructions passed to the GenAI model. + +The event name MUST be `gen_ai.system.message`. + +| Body Field | Type | Description | Examples | Requirement Level | +|---|---|---|---|---| +| `role` | string | The actual role of the message author as passed in the message. | `"system"`, `"instructions"` | `Conditionally Required`: if available and not equal to `system` | +| `content` | `AnyValue` | The contents of the system message. | `"You're a friendly bot that answers questions about OpenTelemetry."` | `Opt-In` | + +## User event + +This event describes the prompt message specified by the user. + +The event name MUST be `gen_ai.user.message`. + +| Body Field | Type | Description | Examples | Requirement Level | +|---|---|---|---|---| +| `role` | string | The actual role of the message author as passed in the message. | `"user"`, `"customer"` | `Conditionally Required`: if available and if not equal to `user` | +| `content` | `AnyValue` | The contents of the user message. | `What telemetry is reported by OpenAI instrumentations?` | `Opt-In` | + +## Assistant event + +This event describes the assistant message. + +The event name MUST be `gen_ai.assistant.message`. + +| Body Field | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | +|--------------|--------------------------------|----------------------------------------|-------------------------------------------------|-------------------| +| `role` | string | The actual role of the message author as passed in the message. | `"assistant"`, `"bot"` | `Conditionally Required`: if available and if not equal to `assistant` | +| `content` | `AnyValue` | The contents of the assistant message. | `Spans, events, metrics defined by the GenAI semantic conventions.` | `Opt-In` | +| `tool_calls` | [ToolCall](#toolcall-object)[] | The tool calls generated by the model, such as function calls. | `[{"id":"call_mszuSIzqtI65i1wAUOE8w5H4", "function":{"name":"get_link_to_otel_semconv", "arguments":{"semconv":"gen_ai"}}, "type":"function"}]` | `Conditionally Required`: if available | + +### `ToolCall` object + +| Body Field | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | +|------------|-----------------------------|------------------------------------|-------------------------------------------------|-------------------| +| `id` | string | The id of the tool call | `call_mszuSIzqtI65i1wAUOE8w5H4` | `Required` | +| `type` | string | The type of the tool | `function` | `Required` | +| `function` | [Function](#function-object)| The function that the model called | `{"name":"get_link_to_otel_semconv", "arguments":{"semconv":"gen_ai"}}` | `Required` | + +### `Function` object + +| Body Field | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | +|-------------|------------|----------------------------------------|----------------------------|-------------------| +| `name` | string | The name of the function to call | `get_link_to_otel_semconv` | `Required` | +| `arguments` | `AnyValue` | The arguments to pass the the function | `{"semconv": "gen_ai"}` | `Opt-In` | + +## Tool event + +This event describes the output of the tool or function submitted back to the model. + +The event name MUST be `gen_ai.tool.message`. + +| Body Field | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | +|----------------|--------|-----------------------------------------------|---------------------------------|-------------------| +| `role` | string | The actual role of the message author as passed in the message. | `"tool"`, `"function"` | `Conditionally Required`: if available and if not equal to `tool` | +| `content` | AnyValue | The contents of the tool message. | `opentelemetry.io` | `Opt-In` | +| `id` | string | Tool call that this message is responding to. | `call_mszuSIzqtI65i1wAUOE8w5H4` | `Required` | + +## Choice event + +This event describes model-generated individual chat response (choice). +If GenAI model returns multiple choices, each choice SHOULD be recorded as an individual event. + +When response is streamed, instrumentations that report response events MUST reconstruct and report the full message and MUST NOT report individual chunks as events. +If the request to GenAI model fails with an error before content is received, instrumentation SHOULD report an event with truncated content (if enabled). If `finish_reason` was not received, it MUST be set to `error`. + +The event name MUST be `gen_ai.choice`. + +Choice event body has the following fields: + +| Body Field | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | +|-----------------|----------------------------|-------------------------------------------------|----------------------------------------|-------------------| +| `finish_reason` | string | The reason the model stopped generating tokens. | `stop`, `tool_calls`, `content_filter` | `Required` | +| `index` | int | The index of the choice in the list of choices. | `1` | `Required` | +| `message` | [Message](#message-object) | GenAI response message | `{"content":"The OpenAI semantic conventions are available at opentelemetry.io"}` | `Recommended` | + +### `Message` object + +| Body Field | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | +|----------------|--------------------------------|-----------------------------------------------|---------------------------------|-------------------| +| `role` | string | The actual role of the message author as passed in the message. | `"assistant"`, `"bot"` | `Conditionally Required`: if available and if not equal to `assistant` | +| `content` | `AnyValue` | The contents of the assistant message. | `Spans, events, metrics defined by the GenAI semantic conventions.` | `Opt-In` | +| `tool_calls` | [ToolCall](#toolcall-object)[] | The tool calls generated by the model, such as function calls. | `[{"id":"call_mszuSIzqtI65i1wAUOE8w5H4", "function":{"name":"get_link_to_otel_semconv", "arguments":"{\"semconv\":\"gen_ai\"}"}, "type":"function"}]` | `Conditionally Required`: if available | + +## Custom events + +System-specific events that are not covered in this document SHOULD be documented in corresponding Semantic Conventions extensions and +SHOULD follow `gen_ai.{gen_ai.system}.*` naming pattern for system-specific events. + +## Examples + +### Chat completion + +This example covers the following scenario: + +- user requests chat completion from OpenAI GPT-4 model for the following prompt: + - System message: `You're a friendly bot that answers questions about OpenTelemetry.` + - User message: `How to instrument GenAI library with OTel?` + +- The model responds with `"Follow GenAI semantic conventions available at opentelemetry.io."` message + +Span: + +| Attribute name | Value | +|---------------------------------|--------------------------------------------| +| Span name | `"chat gpt-4"` | +| `gen_ai.system` | `"openai"` | +| `gen_ai.request.model` | `"gpt-4"` | +| `gen_ai.request.max_tokens` | `200` | +| `gen_ai.request.top_p` | `1.0` | +| `gen_ai.response.id` | `"chatcmpl-9J3uIL87gldCFtiIbyaOvTeYBRA3l"` | +| `gen_ai.response.model` | `"gpt-4-0613"` | +| `gen_ai.usage.output_tokens` | `47` | +| `gen_ai.usage.input_tokens` | `52` | +| `gen_ai.response.finish_reasons`| `["stop"]` | + +Events: + +1. `gen_ai.system.message`. + + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body | `{"content": "You're a friendly bot that answers questions about OpenTelemetry."}` | + +2. `gen_ai.user.message` + + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body | `{"content":"How to instrument GenAI library with OTel?"}` | + +3. `gen_ai.choice` + + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body (with content enabled) | `{"index":0,"finish_reason":"stop","message":{"content":"Follow GenAI semantic conventions available at opentelemetry.io."}}` | + | Event body (without content) | `{"index":0,"finish_reason":"stop","message":{}}` | + +### Tools + +This example covers the following scenario: + +1. Application requests chat completion from OpenAI GPT-4 model and provides a function definition. + + - Application provides the following prompt: + - User message: `How to instrument GenAI library with OTel?` + - Application defines a tool (a function) names `get_link_to_otel_semconv` with single string argument named `semconv` + +2. The model responds with a tool call request which application executes +3. The application requests chat completion again now with the tool execution result + +Here's the telemetry generated for each step in this scenario: + +1. Chat completion resulting in a tool call. + + | Attribute name | Value | + |---------------------|-------------------------------------------------------| + | Span name | `"chat gpt-4"` | + | `gen_ai.system` | `"openai"` | + | `gen_ai.request.model`| `"gpt-4"` | + | `gen_ai.request.max_tokens`| `200` | + | `gen_ai.request.top_p`| `1.0` | + | `gen_ai.response.id`| `"chatcmpl-9J3uIL87gldCFtiIbyaOvTeYBRA3l"` | + | `gen_ai.response.model`| `"gpt-4-0613"` | + | `gen_ai.usage.output_tokens`| `17` | + | `gen_ai.usage.input_tokens`| `47` | + | `gen_ai.response.finish_reasons`| `["tool_calls"]` | + + Events parented to this span: + + - `gen_ai.user.message` (not reported when capturing content is disabled) + + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body | `{"content":"How to instrument GenAI library with OTel?"}` | + + - `gen_ai.choice` + + | Property | Value | + |---------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body (with content) | `{"index":0,"finish_reason":"tool_calls","message":{"tool_calls":[{"id":"call_VSPygqKTWdrhaFErNvMV18Yl","function":{"name":"get_link_to_otel_semconv","arguments":"{\"semconv\":\"GenAI\"}"},"type":"function"}]}` | + | Event body (without content) | `{"index":0,"finish_reason":"tool_calls","message":{"tool_calls":[{"id":"call_VSPygqKTWdrhaFErNvMV18Yl","function":{"name":"get_link_to_otel_semconv"},"type":"function"}]}` | + +2. Application executes the tool call. Application may create span which is not covered by this semantic convention. +3. Final chat completion call + + | Attribute name | Value | + |---------------------------------|-------------------------------------------------------| + | Span name | `"chat gpt-4"` | + | `gen_ai.system` | `"openai"` | + | `gen_ai.request.model` | `"gpt-4"` | + | `gen_ai.request.max_tokens` | `200` | + | `gen_ai.request.top_p` | `1.0` | + | `gen_ai.response.id` | `"chatcmpl-call_VSPygqKTWdrhaFErNvMV18Yl"` | + | `gen_ai.response.model` | `"gpt-4-0613"` | + | `gen_ai.usage.output_tokens` | `52` | + | `gen_ai.usage.input_tokens` | `47` | + | `gen_ai.response.finish_reasons`| `["stop"]` | + + Events parented to this span: + (in this example, the event content matches the original messages, but applications may also drop messages or change their content) + + - `gen_ai.user.message` (not reported when capturing content is not enabled) + + | Property | Value | + |----------------------------------|------------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body | `{"content":"How to instrument GenAI library with OTel?"}` | + + - `gen_ai.assistant.message` + + | Property | Value | + |----------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body (content enabled) | `{"tool_calls":[{"id":"call_VSPygqKTWdrhaFErNvMV18Yl","function":{"name":"get_link_to_otel_semconv","arguments":"{\"semconv\":\"GenAI\"}"},"type":"function"}]}` | + | Event body (content not enabled) | `{"tool_calls":[{"id":"call_VSPygqKTWdrhaFErNvMV18Yl","function":{"name":"get_link_to_otel_semconv"},"type":"function"}]}` | + + - `gen_ai.tool.message` + + | Property | Value | + |----------------------------------|------------------------------------------------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body (content enabled) | `{"content":"opentelemetry.io/semconv/gen-ai","id":"call_VSPygqKTWdrhaFErNvMV18Yl"}` | + | Event body (content not enabled) | `{"id":"call_VSPygqKTWdrhaFErNvMV18Yl"}` | + + - `gen_ai.choice` + + | Property | Value | + |----------------------------------|-------------------------------------------------------------------------------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body (content enabled) | `{"index":0,"finish_reason":"stop","message":{"content":"Follow OTel semconv available at opentelemetry.io/semconv/gen-ai"}}` | + | Event body (content not enabled) | `{"index":0,"finish_reason":"stop","message":{}}` | + +### Chat completion with multiple choices + +This example covers the following scenario: + +- user requests 2 chat completion from OpenAI GPT-4 model for the following prompt: + + - System message: `You're a friendly bot that answers questions about OpenTelemetry.` + - User message: `How to instrument GenAI library with OTel?` + +- The model responds with two choices + + - `"Follow GenAI semantic conventions available at opentelemetry.io."` message + - `"Use OpenAI instrumentation library."` message + +Span: + +| Attribute name | Value | +|---------------------|--------------------------------------------| +| Span name | `"chat gpt-4"` | +| `gen_ai.system` | `"openai"` | +| `gen_ai.request.model`| `"gpt-4"` | +| `gen_ai.request.max_tokens`| `200` | +| `gen_ai.request.top_p`| `1.0` | +| `gen_ai.response.id`| `"chatcmpl-9J3uIL87gldCFtiIbyaOvTeYBRA3l"` | +| `gen_ai.response.model`| `"gpt-4-0613"` | +| `gen_ai.usage.output_tokens`| `77` | +| `gen_ai.usage.input_tokens`| `52` | +| `gen_ai.response.finish_reasons`| `["stop"]` | + +Events: + +1. `gen_ai.system.message`: the same as in the [Chat Completion](#chat-completion) example +2. `gen_ai.user.message`: the same as in the previous example +3. `gen_ai.choice` + + | Property | Value | + |------------------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body (content enabled) | `{"index":0,"finish_reason":"stop","message":{"content":"Follow GenAI semantic conventions available at opentelemetry.io."}}` | + +4. `gen_ai.choice` + + | Property | Value | + |------------------------------|-------------------------------------------------------| + | `gen_ai.system` | `"openai"` | + | Event body (content enabled) | `{"index":1,"finish_reason":"stop","message":{"content":"Use OpenAI instrumentation library."}}` | + +[DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status diff --git a/docs/gen-ai/gen-ai-spans.md b/docs/gen-ai/gen-ai-spans.md index 0a3eec44b4..ed63699ae3 100644 --- a/docs/gen-ai/gen-ai-spans.md +++ b/docs/gen-ai/gen-ai-spans.md @@ -2,7 +2,7 @@ linkTitle: Generative AI traces ---> -# Semantic Conventions for GenAI operations +# Semantic Conventions for GenAI spans **Status**: [Experimental][DocumentStatus] @@ -11,9 +11,8 @@ linkTitle: Generative AI traces - [Name](#name) -- [Configuration](#configuration) - [GenAI attributes](#genai-attributes) -- [Events](#events) +- [Capturing inputs and outputs](#capturing-inputs-and-outputs) @@ -27,15 +26,6 @@ GenAI spans MUST follow the overall [guidelines for span names](https://github.c The **span name** SHOULD be `{gen_ai.operation.name} {gen_ai.request.model}`. Semantic conventions for individual GenAI systems and frameworks MAY specify different span name format. -## Configuration - -Instrumentations for Generative AI clients MAY capture prompts and completions. -Instrumentations that support it, MUST offer the ability to turn off capture of prompts and completions. This is for three primary reasons: - -1. Data privacy concerns. End users of GenAI applications may input sensitive information or personally identifiable information (PII) that they do not wish to be sent to a telemetry backend. -2. Data size concerns. Although there is no specified limit to sizes, there are practical limitations in programming languages and telemetry systems. Some GenAI systems allow for extremely large context windows that end users may take full advantage of. -3. Performance concerns. Sending large amounts of data to a telemetry backend may cause performance issues for the application. - ## GenAI attributes These attributes track input data and metadata for a request to an GenAI model. Each attribute represents a concept that is common to most Generative AI clients. @@ -125,54 +115,8 @@ Instrumentations SHOULD document the list of errors they report. -## Events - -In the lifetime of a GenAI span, an event for prompts sent and completions received MAY be created, depending on the configuration of the instrumentation. - - - - - - - - -The event name MUST be `gen_ai.content.prompt`. +## Capturing inputs and outputs -| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | -|---|---|---|---|---|---| -| [`gen_ai.prompt`](/docs/attributes-registry/gen-ai.md) | string | The full prompt sent to the GenAI model. [1] | `[{'role': 'user', 'content': 'What is the capital of France?'}]` | `Conditionally Required` if and only if corresponding event is enabled | ![Experimental](https://img.shields.io/badge/-experimental-blue) | - -**[1]:** It's RECOMMENDED to format prompts as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) - - - - - - - - - - - - - - - - -The event name MUST be `gen_ai.content.completion`. - -| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | -|---|---|---|---|---|---| -| [`gen_ai.completion`](/docs/attributes-registry/gen-ai.md) | string | The full response received from the GenAI model. [1] | `[{'role': 'assistant', 'content': 'The capital of France is Paris.'}]` | `Conditionally Required` if and only if corresponding event is enabled | ![Experimental](https://img.shields.io/badge/-experimental-blue) | - -**[1]:** It's RECOMMENDED to format completions as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) - - - - - - - - +User inputs and model responses may be recorded as events parented to GenAI operation span. See [Semantic Conventions for GenAI events](./gen-ai-events.md) for the details. -[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.22.0/specification/document-status.md +[DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status diff --git a/model/gen-ai/deprecated/registry-deprecated.yaml b/model/gen-ai/deprecated/registry-deprecated.yaml index 04a2968a74..2115482f36 100644 --- a/model/gen-ai/deprecated/registry-deprecated.yaml +++ b/model/gen-ai/deprecated/registry-deprecated.yaml @@ -16,3 +16,15 @@ groups: deprecated: Replaced by `gen_ai.usage.output_tokens` attribute. brief: "Deprecated, use `gen_ai.usage.output_tokens` instead." examples: [42] + - id: gen_ai.prompt + type: string + stability: experimental + deprecated: "Removed, no replacement at this time." + brief: "Deprecated, use Event API to report prompt contents." + examples: ["[{'role': 'user', 'content': 'What is the capital of France?'}]"] + - id: gen_ai.completion + type: string + stability: experimental + deprecated: "Removed, no replacement at this time." + brief: "Deprecated, use Event API to report completions contents." + examples: ["[{'role': 'assistant', 'content': 'The capital of France is Paris.'}]"] diff --git a/model/gen-ai/events.yaml b/model/gen-ai/events.yaml new file mode 100644 index 0000000000..72a4e692f0 --- /dev/null +++ b/model/gen-ai/events.yaml @@ -0,0 +1,48 @@ +groups: + - id: gen_ai.common.event.attributes + type: attribute_group + stability: experimental + brief: > + Describes common Gen AI event attributes. + attributes: + - ref: gen_ai.system + + - id: gen_ai.system.message + name: gen_ai.system.message + type: event + stability: experimental + brief: > + This event describes the instructions passed to the GenAI system inside the prompt. + extends: gen_ai.common.event.attributes + + - id: gen_ai.user.message + name: gen_ai.user.message + type: event + stability: experimental + brief: > + This event describes the prompt message specified by the user. + extends: gen_ai.common.event.attributes + + - id: gen_ai.assistant.message + name: gen_ai.assistant.message + type: event + stability: experimental + brief: > + This event describes the assistant message passed to GenAI system or received from it. + extends: gen_ai.common.event.attributes + + - id: gen_ai.tool.message + name: gen_ai.tool.message + type: event + stability: experimental + brief: > + This event describes the tool or function response message. + extends: gen_ai.common.event.attributes + + - id: gen_ai.choice + name: gen_ai.choice + type: event + stability: experimental + brief: > + This event describes the Gen AI response message. + extends: gen_ai.common.event.attributes diff --git a/model/gen-ai/registry.yaml b/model/gen-ai/registry.yaml index 5b3d1cff79..816f457093 100644 --- a/model/gen-ai/registry.yaml +++ b/model/gen-ai/registry.yaml @@ -119,18 +119,6 @@ groups: brief: 'Output tokens (completion, response, etc.)' brief: The type of token being counted. examples: ['input', 'output'] - - id: gen_ai.prompt - stability: experimental - type: string - brief: The full prompt sent to the GenAI model. - note: It's RECOMMENDED to format prompts as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) - examples: ["[{'role': 'user', 'content': 'What is the capital of France?'}]"] - - id: gen_ai.completion - stability: experimental - type: string - brief: The full response received from the GenAI model. - note: It's RECOMMENDED to format completions as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) - examples: ["[{'role': 'assistant', 'content': 'The capital of France is Paris.'}]"] - id: gen_ai.operation.name stability: experimental type: diff --git a/model/gen-ai/spans.yaml b/model/gen-ai/spans.yaml index d634d94473..86ddfc4a82 100644 --- a/model/gen-ai/spans.yaml +++ b/model/gen-ai/spans.yaml @@ -54,35 +54,6 @@ groups: The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library, the canonical name of exception that occurred, or another low-cardinality error identifier. Instrumentations SHOULD document the list of errors they report. - events: - - gen_ai.content.prompt - - gen_ai.content.completion - - - id: gen_ai.content.prompt - name: gen_ai.content.prompt - type: event - brief: > - In the lifetime of an GenAI span, events for prompts sent and completions received - may be created, depending on the configuration of the instrumentation. - attributes: - - ref: gen_ai.prompt - requirement_level: - conditionally_required: if and only if corresponding event is enabled - note: > - It's RECOMMENDED to format prompts as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) - - - id: gen_ai.content.completion - name: gen_ai.content.completion - type: event - brief: > - In the lifetime of an GenAI span, events for prompts sent and completions received - may be created, depending on the configuration of the instrumentation. - attributes: - - ref: gen_ai.completion - requirement_level: - conditionally_required: if and only if corresponding event is enabled - note: > - It's RECOMMENDED to format completions as JSON string matching [OpenAI messages format](https://platform.openai.com/docs/guides/text-generation) - id: trace.gen_ai.client extends: trace.gen_ai.client.common From 6e77ed5f6c39c2b26e96a85bb3b030c1ee2d20dc Mon Sep 17 00:00:00 2001 From: Joao Grassi <5938087+joaopgrassi@users.noreply.github.com> Date: Mon, 7 Oct 2024 11:36:02 +0200 Subject: [PATCH 2/4] Mark *.size messaging attributes as opt-in (#1442) --- .chloggen/size-attributes-opt-in.yaml | 4 ++++ docs/messaging/kafka.md | 10 +++++----- docs/messaging/messaging-spans.md | 24 ++++++++++++------------ docs/messaging/rabbitmq.md | 14 +++++++------- docs/messaging/rocketmq.md | 10 +++++----- model/messaging/spans.yaml | 8 ++++++-- 6 files changed, 39 insertions(+), 31 deletions(-) create mode 100755 .chloggen/size-attributes-opt-in.yaml diff --git a/.chloggen/size-attributes-opt-in.yaml b/.chloggen/size-attributes-opt-in.yaml new file mode 100755 index 0000000000..2062d885e1 --- /dev/null +++ b/.chloggen/size-attributes-opt-in.yaml @@ -0,0 +1,4 @@ +change_type: breaking +component: messaging +note: Mark *.size messaging attributes as Opt-In +issues: [474] diff --git a/docs/messaging/kafka.md b/docs/messaging/kafka.md index 5890dcac3f..b86f3a6f50 100644 --- a/docs/messaging/kafka.md +++ b/docs/messaging/kafka.md @@ -43,9 +43,9 @@ For Apache Kafka, the following additional attributes are defined: | [`messaging.destination.partition.id`](/docs/attributes-registry/messaging.md) | string | String representation of the partition id the message (or batch) is sent to or received from. | `1` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.kafka.message.key`](/docs/attributes-registry/messaging.md) | string | Message keys in Kafka are used for grouping alike messages to ensure they're processed on the same partition. They differ from `messaging.message.id` in that they're not unique. If the key is `null`, the attribute MUST NOT be set. [9] | `myKey` | `Recommended` If span describes operation on a single message. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.kafka.offset`](/docs/attributes-registry/messaging.md) | int | The offset of a record in the corresponding Kafka partition. | `42` | `Recommended` If span describes operation on a single message. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`messaging.message.body.size`](/docs/attributes-registry/messaging.md) | int | The size of the message body in bytes. [10] | `1439` | `Recommended` If span describes operation on a single message. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.message.id`](/docs/attributes-registry/messaging.md) | string | A value used by the messaging system as an identifier for the message, represented as a string. | `452a7c7c7c7048c2f887f61572b18fc2` | `Recommended` If span describes operation on a single message. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [11] | `80`; `8080`; `443` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [10] | `80`; `8080`; `443` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`messaging.message.body.size`](/docs/attributes-registry/messaging.md) | int | The size of the message body in bytes. Only applicable for spans describing single message operations. [11] | `1439` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** The `error.type` SHOULD be predictable, and SHOULD have low cardinality. @@ -84,10 +84,10 @@ the broker doesn't have such notion, the destination name SHOULD uniquely identi **[9]:** If the key type is not string, it's string representation has to be supplied for the attribute. If the key has no unambiguous, canonical string form, don't include its value. -**[10]:** This can refer to both the compressed or uncompressed body size. If both sizes are known, the uncompressed -body size should be used. +**[10]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. -**[11]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +**[11]:** This can refer to both the compressed or uncompressed body size. If both sizes are known, the uncompressed +body size should be used. diff --git a/docs/messaging/messaging-spans.md b/docs/messaging/messaging-spans.md index 5ffc36c97d..1c6ac90035 100644 --- a/docs/messaging/messaging-spans.md +++ b/docs/messaging/messaging-spans.md @@ -346,13 +346,13 @@ Messaging system-specific attributes MUST be defined in the corresponding `messa | [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [14] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Conditionally Required` If available. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`messaging.client.id`](/docs/attributes-registry/messaging.md) | string | A unique identifier for the client that consumes or produces a message. | `client-5`; `myhost@8742@s8083jm` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.destination.partition.id`](/docs/attributes-registry/messaging.md) | string | The identifier of the partition messages are sent to or received from, unique within the `messaging.destination.name`. | `1` | `Recommended` When applicable. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`messaging.message.body.size`](/docs/attributes-registry/messaging.md) | int | The size of the message body in bytes. [15] | `1439` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.message.conversation_id`](/docs/attributes-registry/messaging.md) | string | The conversation ID identifying the conversation to which the message belongs, represented as a string. Sometimes called "Correlation ID". | `MyConversationId` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`messaging.message.envelope.size`](/docs/attributes-registry/messaging.md) | int | The size of the message body and metadata in bytes. [16] | `2738` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.message.id`](/docs/attributes-registry/messaging.md) | string | A value used by the messaging system as an identifier for the message, represented as a string. | `452a7c7c7c7048c2f887f61572b18fc2` | `Recommended` If span describes operation on a single message. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the messaging intermediary node where the operation was performed. [17] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` If applicable for this messaging system. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the messaging intermediary node where the operation was performed. [15] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` If applicable for this messaging system. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port of the messaging intermediary node where the operation was performed. | `65123` | `Recommended` if and only if `network.peer.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [18] | `80`; `8080`; `443` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [16] | `80`; `8080`; `443` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`messaging.message.body.size`](/docs/attributes-registry/messaging.md) | int | The size of the message body in bytes. [17] | `1439` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`messaging.message.envelope.size`](/docs/attributes-registry/messaging.md) | int | The size of the message body and metadata in bytes. [18] | `2738` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** The actual messaging system may differ from the one known by the client. For example, when using Kafka client libraries to communicate with Azure Event Hubs, the `messaging.system` is set to `kafka` based on the instrumentation's best knowledge. @@ -401,17 +401,17 @@ the broker doesn't have such notion, the destination name SHOULD uniquely identi **[14]:** Server domain name of the broker if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. -**[15]:** This can refer to both the compressed or uncompressed body size. If both sizes are known, the uncompressed -body size should be used. - -**[16]:** This can refer to both the compressed or uncompressed size. If both sizes are known, the uncompressed -size should be used. - -**[17]:** Semantic conventions for individual messaging systems SHOULD document whether `network.peer.*` attributes are applicable. +**[15]:** Semantic conventions for individual messaging systems SHOULD document whether `network.peer.*` attributes are applicable. Network peer address and port are important when the application interacts with individual intermediary nodes directly, If a messaging operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. -**[18]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +**[16]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. + +**[17]:** This can refer to both the compressed or uncompressed body size. If both sizes are known, the uncompressed +body size should be used. + +**[18]:** This can refer to both the compressed or uncompressed size. If both sizes are known, the uncompressed +size should be used. diff --git a/docs/messaging/rabbitmq.md b/docs/messaging/rabbitmq.md index 20ad57d91c..1089f87a1c 100644 --- a/docs/messaging/rabbitmq.md +++ b/docs/messaging/rabbitmq.md @@ -31,12 +31,12 @@ In RabbitMQ, the destination is defined by an *exchange* and a *routing key*. | [`messaging.rabbitmq.destination.routing_key`](/docs/attributes-registry/messaging.md) | string | RabbitMQ message routing key. | `myKey` | `Conditionally Required` If not empty. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.rabbitmq.message.delivery_tag`](/docs/attributes-registry/messaging.md) | int | RabbitMQ message delivery tag | `123` | `Conditionally Required` When available. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [5] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Conditionally Required` If available. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`messaging.message.body.size`](/docs/attributes-registry/messaging.md) | int | The size of the message body in bytes. [6] | `1439` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.message.conversation_id`](/docs/attributes-registry/messaging.md) | string | Message [correlation Id](https://www.rabbitmq.com/tutorials/tutorial-six-java#correlation-id) property. | `MyConversationId` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.message.id`](/docs/attributes-registry/messaging.md) | string | A value used by the messaging system as an identifier for the message, represented as a string. | `452a7c7c7c7048c2f887f61572b18fc2` | `Recommended` If span describes operation on a single message. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the network connection - IP address or Unix domain socket name. [7] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the network connection - IP address or Unix domain socket name. [6] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [7] | `80`; `8080`; `443` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`messaging.message.body.size`](/docs/attributes-registry/messaging.md) | int | The size of the message body in bytes. [8] | `1439` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** The `error.type` SHOULD be predictable, and SHOULD have low cardinality. @@ -67,12 +67,12 @@ the broker doesn't have such notion, the destination name SHOULD uniquely identi **[5]:** Server domain name of the broker if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. -**[6]:** This can refer to both the compressed or uncompressed body size. If both sizes are known, the uncompressed -body size should be used. +**[6]:** If an operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. -**[7]:** If an operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. +**[7]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. -**[8]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +**[8]:** This can refer to both the compressed or uncompressed body size. If both sizes are known, the uncompressed +body size should be used. diff --git a/docs/messaging/rocketmq.md b/docs/messaging/rocketmq.md index 48741ebd08..326d8dee63 100644 --- a/docs/messaging/rocketmq.md +++ b/docs/messaging/rocketmq.md @@ -35,13 +35,13 @@ Specific attributes for Apache RocketMQ are defined below. | [`messaging.rocketmq.message.group`](/docs/attributes-registry/messaging.md) | string | It is essential for FIFO message. Messages that belong to the same message group are always processed one by one within the same consumer group. | `myMessageGroup` | `Conditionally Required` If the message type is FIFO. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`server.address`](/docs/attributes-registry/server.md) | string | Server domain name if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. [9] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Conditionally Required` If available. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`messaging.client.id`](/docs/attributes-registry/messaging.md) | string | A unique identifier for the client that consumes or produces a message. | `client-5`; `myhost@8742@s8083jm` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`messaging.message.body.size`](/docs/attributes-registry/messaging.md) | int | The size of the message body in bytes. [10] | `1439` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.message.id`](/docs/attributes-registry/messaging.md) | string | A value used by the messaging system as an identifier for the message, represented as a string. | `452a7c7c7c7048c2f887f61572b18fc2` | `Recommended` If span describes operation on a single message. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.rocketmq.consumption_model`](/docs/attributes-registry/messaging.md) | string | Model of message consumption. This only applies to consumer spans. | `clustering`; `broadcasting` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.rocketmq.message.keys`](/docs/attributes-registry/messaging.md) | string[] | Key(s) of message, another way to mark message besides message id. | `["keyA", "keyB"]` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.rocketmq.message.tag`](/docs/attributes-registry/messaging.md) | string | The secondary classifier of message besides topic. | `tagA` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`messaging.rocketmq.message.type`](/docs/attributes-registry/messaging.md) | string | Type of message. | `normal`; `fifo`; `delay` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [11] | `80`; `8080`; `443` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [10] | `80`; `8080`; `443` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`messaging.message.body.size`](/docs/attributes-registry/messaging.md) | int | The size of the message body in bytes. [11] | `1439` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** The `error.type` SHOULD be predictable, and SHOULD have low cardinality. @@ -80,10 +80,10 @@ the broker doesn't have such notion, the destination name SHOULD uniquely identi **[9]:** Server domain name of the broker if available without reverse DNS lookup; otherwise, IP address or Unix domain socket name. -**[10]:** This can refer to both the compressed or uncompressed body size. If both sizes are known, the uncompressed -body size should be used. +**[10]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. -**[11]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +**[11]:** This can refer to both the compressed or uncompressed body size. If both sizes are known, the uncompressed +body size should be used. diff --git a/model/messaging/spans.yaml b/model/messaging/spans.yaml index cc10819f8a..c955d4feb2 100644 --- a/model/messaging/spans.yaml +++ b/model/messaging/spans.yaml @@ -64,7 +64,9 @@ groups: sampling_relevant: true - ref: messaging.message.conversation_id - ref: messaging.message.envelope.size + requirement_level: opt_in - ref: messaging.message.body.size + requirement_level: opt_in - ref: messaging.batch.message_count requirement_level: conditionally_required: If the span describes an operation on a batch of messages. @@ -111,6 +113,7 @@ groups: brief: > Message [correlation Id](https://www.rabbitmq.com/tutorials/tutorial-six-java#correlation-id) property. - ref: messaging.message.body.size + requirement_level: opt_in - id: messaging.kafka type: attribute_group @@ -141,8 +144,8 @@ groups: conditionally_required: If the span describes an operation on a batch of messages. - ref: messaging.client.id - ref: messaging.message.body.size - requirement_level: - recommended: If span describes operation on a single message. + requirement_level: opt_in + brief: The size of the message body in bytes. Only applicable for spans describing single message operations. - id: messaging.rocketmq type: attribute_group @@ -172,6 +175,7 @@ groups: - ref: messaging.rocketmq.consumption_model - ref: messaging.client.id - ref: messaging.message.body.size + requirement_level: opt_in - ref: messaging.batch.message_count requirement_level: conditionally_required: If the span describes an operation on a batch of messages. From d4f7bcd22415d517d4a7f78e42376d7a2e114e5c Mon Sep 17 00:00:00 2001 From: Joao Grassi <5938087+joaopgrassi@users.noreply.github.com> Date: Mon, 7 Oct 2024 18:09:11 +0200 Subject: [PATCH 3/4] [chore] Fix GH label generation script (#1455) --- .github/workflows/generate-registry-area-labels.yml | 11 ++++------- .../tools}/scripts/generate-registry-area-labels.sh | 0 2 files changed, 4 insertions(+), 7 deletions(-) rename {.github/workflows => internal/tools}/scripts/generate-registry-area-labels.sh (100%) mode change 100644 => 100755 diff --git a/.github/workflows/generate-registry-area-labels.yml b/.github/workflows/generate-registry-area-labels.yml index 1b8ea3a59d..e59c0c6e8b 100644 --- a/.github/workflows/generate-registry-area-labels.yml +++ b/.github/workflows/generate-registry-area-labels.yml @@ -5,7 +5,7 @@ on: paths: - model/** - ./.github/workflows/generate-registry-area-labels.yml - - ./.github/workflows/scripts/generate-registry-area-labels.sh + - ./internal/tools/scripts/generate-registry-area-labels.sh workflow_dispatch: @@ -15,14 +15,11 @@ jobs: if: ${{ github.repository_owner == 'open-telemetry' }} steps: - uses: actions/checkout@v4 + + # areas.txt is generated by the Make target generate-gh-issue-templates - name: Generate registry area labels run: | make generate-gh-issue-templates - - - name: Run update permissions - run: chmod +x ./.github/workflows/scripts/generate-registry-area-labels.sh - - - name: Generate registry area labels - run: ./.github/workflows/scripts/generate-registry-area-labels.sh + ./internal/tools/scripts/generate-registry-area-labels.sh ./internal/tools/bin/areas.txt env: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} diff --git a/.github/workflows/scripts/generate-registry-area-labels.sh b/internal/tools/scripts/generate-registry-area-labels.sh old mode 100644 new mode 100755 similarity index 100% rename from .github/workflows/scripts/generate-registry-area-labels.sh rename to internal/tools/scripts/generate-registry-area-labels.sh From fbecf791d1b94abd4102b5df2db15cac03d92130 Mon Sep 17 00:00:00 2001 From: Trask Stalnaker Date: Mon, 7 Oct 2024 10:02:15 -0700 Subject: [PATCH 4/4] Add `db.operation.batch.size` to markdown (#1448) Co-authored-by: Joao Grassi <5938087+joaopgrassi@users.noreply.github.com> --- docs/database/cassandra.md | 21 ++++++++++++--------- docs/database/cosmosdb.md | 21 ++++++++++++--------- docs/database/couchdb.md | 7 +++++-- docs/database/database-spans.md | 21 ++++++++++++--------- docs/database/elasticsearch.md | 13 ++++++++----- docs/database/hbase.md | 7 +++++-- docs/database/mariadb.md | 17 ++++++++++------- docs/database/mongodb.md | 7 +++++-- docs/database/mssql.md | 17 ++++++++++------- docs/database/mysql.md | 17 ++++++++++------- docs/database/postgresql.md | 17 ++++++++++------- docs/database/redis.md | 21 ++++++++++++--------- docs/database/sql.md | 17 ++++++++++------- model/database/spans.yaml | 1 + 14 files changed, 122 insertions(+), 82 deletions(-) diff --git a/docs/database/cassandra.md b/docs/database/cassandra.md index acabd5d171..cb23fea868 100644 --- a/docs/database/cassandra.md +++ b/docs/database/cassandra.md @@ -33,11 +33,12 @@ The Semantic Conventions for [Cassandra](https://cassandra.apache.org/) extend a | [`db.cassandra.idempotence`](/docs/attributes-registry/db.md) | boolean | Whether or not the query is idempotent. | | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.cassandra.page_size`](/docs/attributes-registry/db.md) | int | The fetch size used for paging, i.e. how many rows will be returned at once. | `5000` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.cassandra.speculative_execution_count`](/docs/attributes-registry/db.md) | int | The number of times a query was speculatively executed. Not set or `0` if the query was not executed speculatively. | `0`; `2` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [13] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [11] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [12] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [13] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [14] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | `Recommended` if and only if `network.peer.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [14] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [15] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [16] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. @@ -68,17 +69,19 @@ Instrumentations SHOULD document how `error.type` is populated. **[10]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[11]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[12]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. -**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[13]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). -**[13]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. +**[14]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. -**[14]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[15]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[15]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[16]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. diff --git a/docs/database/cosmosdb.md b/docs/database/cosmosdb.md index c663a9ad95..d33c420ac4 100644 --- a/docs/database/cosmosdb.md +++ b/docs/database/cosmosdb.md @@ -37,10 +37,11 @@ Cosmos DB instrumentation includes call-level (public API) surface spans and net | [`az.namespace`](/docs/attributes-registry/azure.md) | string | [Azure Resource Provider Namespace](https://learn.microsoft.com/azure/azure-resource-manager/management/azure-services-resource-providers) as recognized by the client. [8] | `Microsoft.DocumentDB` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.cosmosdb.client_id`](/docs/attributes-registry/db.md) | string | Unique Cosmos client instance id. | `3ba4827d-4422-483f-b59f-85b74211c11d` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.cosmosdb.request_content_length`](/docs/attributes-registry/db.md) | int | Request payload size in bytes | | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [9] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [10] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [11] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`user_agent.original`](/docs/attributes-registry/user-agent.md) | string | Full user-agent string is generated by Cosmos DB SDK [12] | `cosmos-netstandard-sdk/3.23.0\|3.23.1\|1\|X64\|Linux 5.4.0-1098-azure 104 18\|.NET Core 3.1.32\|S\|` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [13] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [9] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [10] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`user_agent.original`](/docs/attributes-registry/user-agent.md) | string | Full user-agent string is generated by Cosmos DB SDK [13] | `cosmos-netstandard-sdk/3.23.0\|3.23.1\|1\|X64\|Linux 5.4.0-1098-azure 104 18\|.NET Core 3.1.32\|S\|` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. @@ -183,19 +184,21 @@ Instrumentations SHOULD document how `error.type` is populated. **[8]:** When `az.namespace` attribute is populated, it MUST be set to `Microsoft.DocumentDB` for all operations performed by Cosmos DB client. -**[9]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[9]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[10]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. -**[10]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[11]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). -**[11]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[12]:** The user-agent value is generated by SDK which is a combination of
`sdk_version` : Current version of SDK. e.g. 'cosmos-netstandard-sdk/3.23.0'
`direct_pkg_version` : Direct package version used by Cosmos DB SDK. e.g. '3.23.1'
`number_of_client_instances` : Number of cosmos client instances created by the application. e.g. '1'
`type_of_machine_architecture` : Machine architecture. e.g. 'X64'
`operating_system` : Operating System. e.g. 'Linux 5.4.0-1098-azure 104 18'
`runtime_framework` : Runtime Framework. e.g. '.NET Core 3.1.32'
`failover_information` : Generated key to determine if region failover enabled. +**[13]:** The user-agent value is generated by SDK which is a combination of
`sdk_version` : Current version of SDK. e.g. 'cosmos-netstandard-sdk/3.23.0'
`direct_pkg_version` : Direct package version used by Cosmos DB SDK. e.g. '3.23.1'
`number_of_client_instances` : Number of cosmos client instances created by the application. e.g. '1'
`type_of_machine_architecture` : Machine architecture. e.g. 'X64'
`operating_system` : Operating System. e.g. 'Linux 5.4.0-1098-azure 104 18'
`runtime_framework` : Runtime Framework. e.g. '.NET Core 3.1.32'
`failover_information` : Generated key to determine if region failover enabled. Format Reg-{D (Disabled discovery)}-S(application region)|L(List of preferred regions)|N(None, user did not configure it). Default value is "NS". -**[13]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. diff --git a/docs/database/couchdb.md b/docs/database/couchdb.md index 4544c2946d..447047d519 100644 --- a/docs/database/couchdb.md +++ b/docs/database/couchdb.md @@ -26,7 +26,8 @@ The Semantic Conventions for [CouchDB](https://couchdb.apache.org/) extend and o | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | The HTTP response code returned by the Couch DB. [3] | `200`; `201`; `429` | `Conditionally Required` [4] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [5] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [6] | `80`; `8080`; `443` | `Conditionally Required` [7] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [8] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [8] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [9] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | **[1]:** In **CouchDB**, `db.operation.name` should be set to the HTTP method + the target REST route according to the API reference documentation. For example, when retrieving a document, `db.operation.name` would be set to (literally, i.e., without replacing the placeholders with concrete values): [`GET /{db}/{docid}`](https://docs.couchdb.org/en/stable/api/document/common.html#get--db-docid). @@ -45,7 +46,9 @@ Instrumentations SHOULD document how `error.type` is populated. **[7]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[8]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[8]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[9]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. diff --git a/docs/database/database-spans.md b/docs/database/database-spans.md index 5496681ee5..96b810f026 100644 --- a/docs/database/database-spans.md +++ b/docs/database/database-spans.md @@ -97,11 +97,12 @@ These attributes will usually be the same for all operations performed over the | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | Database response status code. [7] | `102`; `ORA-17002`; `08P01`; `404` | `Conditionally Required` [8] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [9] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [10] | `80`; `8080`; `443` | `Conditionally Required` [11] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [12] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [13] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [14] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` If applicable for this database system. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [12] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [13] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [14] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [15] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` If applicable for this database system. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | `Recommended` if and only if `network.peer.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [16] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [16] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [17] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** The actual DBMS may differ from the one identified by the client. For example, when using PostgreSQL client libraries to connect to a CockroachDB, the `db.system` is set to `postgresql` based on the instrumentation's best knowledge. @@ -134,18 +135,20 @@ Instrumentations SHOULD document how `error.type` is populated. **[11]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[12]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[12]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[13]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. -**[13]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[14]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). -**[14]:** Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. Network peer address and port are useful when the application interacts with individual database nodes directly. +**[15]:** Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. Network peer address and port are useful when the application interacts with individual database nodes directly. If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. -**[15]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[16]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[16]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[17]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. diff --git a/docs/database/elasticsearch.md b/docs/database/elasticsearch.md index ab338db007..bcf734f6ba 100644 --- a/docs/database/elasticsearch.md +++ b/docs/database/elasticsearch.md @@ -35,8 +35,9 @@ The **span name** follows the [general database span name guidelines](database-s | [`db.collection.name`](/docs/attributes-registry/db.md) | string | The index or data stream against which the query is executed. [9] | `my_index`; `index1, index2` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.elasticsearch.node.name`](/docs/attributes-registry/db.md) | string | Represents the human-readable identifier of the node/instance to which a request was routed. [10] | `instance-0000000001` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.namespace`](/docs/attributes-registry/db.md) | string | The name of the Elasticsearch cluster which the client connects to. [11] | `customers`; `test.users` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The request body for a [search-type query](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html), as a json string. [12] | `"{\"query\":{\"term\":{\"user.id\":\"kimchy\"}}}"` | `Recommended` [13] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [14] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [12] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The request body for a [search-type query](https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html), as a json string. [13] | `"{\"query\":{\"term\":{\"user.id\":\"kimchy\"}}}"` | `Recommended` [14] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | **[1]:** The `db.operation.name` SHOULD match the endpoint identifier provided in the request (see the [Elasticsearch schema](https://raw.githubusercontent.com/elastic/elasticsearch-specification/main/output/schema/schema.json)). @@ -78,13 +79,15 @@ Instrumentations SHOULD document how `error.type` is populated. **[11]:** When communicating with an Elastic Cloud deployment, this should be collected from the "X-Found-Handling-Cluster" HTTP response header. -**[12]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[12]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[13]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. -**[13]:** Should be collected by default for search-type queries and only if there is sanitization that excludes sensitive information. +**[14]:** Should be collected by default for search-type queries and only if there is sanitization that excludes sensitive information. -**[14]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[15]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. diff --git a/docs/database/hbase.md b/docs/database/hbase.md index b4319c4a60..047bdad37a 100644 --- a/docs/database/hbase.md +++ b/docs/database/hbase.md @@ -27,7 +27,8 @@ The Semantic Conventions for [HBase](https://hbase.apache.org/) extend and overr | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | Protocol-specific response code recorded as string. [5] | `200`; `409`; `14` | `Conditionally Required` If response was received. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [6] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [7] | `80`; `8080`; `443` | `Conditionally Required` [8] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [9] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [9] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [10] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | **[1]:** If table name includes the namespace, the `db.collection.name` SHOULD be set to the full table name. @@ -52,7 +53,9 @@ Instrumentations SHOULD document how `error.type` is populated. **[8]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[9]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[9]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[10]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. diff --git a/docs/database/mariadb.md b/docs/database/mariadb.md index ee68fe9a30..b927338527 100644 --- a/docs/database/mariadb.md +++ b/docs/database/mariadb.md @@ -27,9 +27,10 @@ The Semantic Conventions for *MariaDB* extend and override the [Database Semanti | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | [Maria DB error code](https://mariadb.com/kb/en/mariadb-error-code-reference/) represented as a string. [6] | `1008`; `3058` | `Conditionally Required` If response has ended with warning or an error. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [10] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [13] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [10] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. @@ -96,15 +97,17 @@ Instrumentations SHOULD document how `error.type` is populated. **[9]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[10]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[10]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. -**[11]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). -**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[13]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. diff --git a/docs/database/mongodb.md b/docs/database/mongodb.md index 6bb7814742..db243d41c1 100644 --- a/docs/database/mongodb.md +++ b/docs/database/mongodb.md @@ -27,7 +27,8 @@ The Semantic Conventions for [MongoDB](https://www.mongodb.com/) extend and over | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | [MongoDB error code](https://www.mongodb.com/docs/manual/reference/error-codes/) represented as a string. [4] | `36`; `11602` | `Conditionally Required` [5] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [6] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [7] | `80`; `8080`; `443` | `Conditionally Required` [8] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [9] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [9] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [10] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. @@ -50,7 +51,9 @@ Instrumentations SHOULD document how `error.type` is populated. **[8]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[9]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[9]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[10]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. diff --git a/docs/database/mssql.md b/docs/database/mssql.md index 3b44bc09ab..f0275de77f 100644 --- a/docs/database/mssql.md +++ b/docs/database/mssql.md @@ -27,9 +27,10 @@ The Semantic Conventions for the *Microsoft SQL Server* extend and override the | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | [Microsoft SQL Server error](https://learn.microsoft.com/sql/relational-databases/errors-events/database-engine-events-and-errors) number represented as a string. [6] | `102`; `40020` | `Conditionally Required` If response has ended with warning or an error. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [10] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [13] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [10] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. @@ -66,15 +67,17 @@ Instrumentations SHOULD document how `error.type` is populated. **[9]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[10]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[10]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. -**[11]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). -**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[13]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. diff --git a/docs/database/mysql.md b/docs/database/mysql.md index 2a11f73098..700ba09762 100644 --- a/docs/database/mysql.md +++ b/docs/database/mysql.md @@ -27,9 +27,10 @@ The Semantic Conventions for *MySQL* extend and override the [Database Semantic | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | [MySQL error number](https://dev.mysql.com/doc/mysql-errors/9.0/en/error-reference-introduction.html). [6] | `1005`; `MY-010016` | `Conditionally Required` If response has ended with warning or an error. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [10] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [13] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [10] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. @@ -96,15 +97,17 @@ Instrumentations SHOULD document how `error.type` is populated. **[9]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[10]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[10]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. -**[11]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). -**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[13]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. diff --git a/docs/database/postgresql.md b/docs/database/postgresql.md index c92ed52fb4..2ff4a35b04 100644 --- a/docs/database/postgresql.md +++ b/docs/database/postgresql.md @@ -27,9 +27,10 @@ The Semantic Conventions for *PostgreSQL* extend and override the [Database Sema | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | [PostgreSQL error code](https://www.postgresql.org/docs/current/errcodes-appendix.html). [6] | `08000`; `08P01` | `Conditionally Required` If response has ended with warning or an error. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [10] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [13] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [10] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. @@ -103,15 +104,17 @@ Instrumentations SHOULD document how `error.type` is populated. **[9]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[10]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[10]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. -**[11]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). -**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[13]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. diff --git a/docs/database/redis.md b/docs/database/redis.md index 186d12efa9..77cea2473b 100644 --- a/docs/database/redis.md +++ b/docs/database/redis.md @@ -26,11 +26,12 @@ The Semantic Conventions for [Redis](https://redis.com/) extend and override the | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | The Redis [simple error](https://redis.io/docs/latest/develop/reference/protocol-spec/#simple-errors) prefix. [4] | `ERR`; `WRONGTYPE`; `CLUSTERDOWN` | `Conditionally Required` [5] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [6] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [7] | `80`; `8080`; `443` | `Conditionally Required` [8] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The full syntax of the Redis CLI command. [9] | `HMSET myhash field1 'Hello' field2 'World'` | `Recommended` [10] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [11] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [9] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The full syntax of the Redis CLI command. [10] | `HMSET myhash field1 'Hello' field2 'World'` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [12] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | `Recommended` if and only if `network.peer.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [13] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** A connection's currently associated database index may change during its lifetime, e.g. from executing `SELECT `. @@ -59,16 +60,18 @@ Instrumentations SHOULD document how `error.type` is populated. **[8]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[9]:** For **Redis**, the value provided for `db.query.text` SHOULD correspond to the syntax of the Redis CLI. If, for example, the [`HMSET` command](https://redis.io/commands/hmset) is invoked, `"HMSET myhash field1 'Hello' field2 'World'"` would be a suitable value for `db.query.text`. +**[9]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. -**[10]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. +**[10]:** For **Redis**, the value provided for `db.query.text` SHOULD correspond to the syntax of the Redis CLI. If, for example, the [`HMSET` command](https://redis.io/commands/hmset) is invoked, `"HMSET myhash field1 'Hello' field2 'World'"` would be a suitable value for `db.query.text`. + +**[11]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). -**[11]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. +**[12]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. -**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[13]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. diff --git a/docs/database/sql.md b/docs/database/sql.md index 8dae35e9d7..5ff03c6597 100644 --- a/docs/database/sql.md +++ b/docs/database/sql.md @@ -51,9 +51,10 @@ Instrumentations applied to generic SQL drivers SHOULD adhere to SQL semantic co | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | Database response code recorded as string. [6] | `ORA-17027`; `1052`; `2201B` | `Conditionally Required` If response has ended with warning or an error. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [10] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [13] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [10] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. @@ -128,15 +129,17 @@ Instrumentations SHOULD document how `error.type` is populated. **[9]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[10]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[10]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. + +**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. -**[11]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). -**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[13]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. diff --git a/model/database/spans.yaml b/model/database/spans.yaml index 6f0dcbd91d..cdac8e1059 100644 --- a/model/database/spans.yaml +++ b/model/database/spans.yaml @@ -10,6 +10,7 @@ groups: # sampling_relevant: true - ref: db.operation.name sampling_relevant: true + - ref: db.operation.batch.size - ref: server.address sampling_relevant: true - ref: server.port