Capture transaction execution statistics as a new event #108284
Labels
A-observability-inf
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Comments
kevin-v-ngo
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-cluster-observability
labels
Aug 7, 2023
xinhaoz
added a commit
to xinhaoz/cockroach
that referenced
this issue
Nov 3, 2023
Previously, with the json logging format, it was not possible to emit boolean fields that were false. Even if a boolean field was marked as `includeempty`, it was emitted only when true (as `field: true`). Always omitting false boolean fields is more space efficient, but for certain fields it is helpful to emit the field in all instances and state its `false` value explicitly. This patch makes it possible for `fieldName: false` to be emitted in the json logging format when the field is marked as `includeempty`.

Epic: none
Part of: cockroachdb#108284
Release note: None
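The emission rule described in this commit message can be sketched in a few lines of Go. This is an illustration only and not the generated encoder in `util/log`: the `boolField` type, the `appendBoolJSON` and `encodeEvent` helpers, and the field names are all hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// boolField is a hypothetical stand-in for one boolean field of a log event.
// includeEmpty mirrors the `includeempty` marking described above.
type boolField struct {
	name         string
	value        bool
	includeEmpty bool
}

// appendBoolJSON writes `"name":value` when the field should be emitted and
// reports whether it wrote anything. A false value is written only when
// includeEmpty is set; otherwise the field is omitted to save space.
func appendBoolJSON(b *strings.Builder, f boolField, needComma bool) bool {
	if !f.value && !f.includeEmpty {
		return false
	}
	if needComma {
		b.WriteByte(',')
	}
	b.WriteString(strconv.Quote(f.name))
	b.WriteByte(':')
	b.WriteString(strconv.FormatBool(f.value))
	return true
}

// encodeEvent renders the given fields as a single JSON object.
func encodeEvent(fields []boolField) string {
	var b strings.Builder
	b.WriteByte('{')
	wrote := false
	for _, f := range fields {
		if appendBoolJSON(&b, f, wrote) {
			wrote = true
		}
	}
	b.WriteByte('}')
	return b.String()
}

func main() {
	fmt.Println(encodeEvent([]boolField{
		{name: "AlwaysShown", value: false, includeEmpty: true},
		{name: "OmittedWhenFalse", value: false},
		{name: "SetToTrue", value: true},
	}))
}
```

With these inputs, only the field marked `includeEmpty` appears with an explicit `false`; the unmarked false field is dropped.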
craig bot
pushed a commit
that referenced
this issue
Nov 9, 2023
113757: util/log: allow bool fields to be emitted as false in json format r=xinhaoz a=xinhaoz

Previously, with the json logging format, it was not possible to emit boolean fields that were false. Even if a boolean field was marked as `includeempty`, it was emitted only when true (as `field: true`). Always omitting false boolean fields is more space efficient, but for certain fields it is helpful to emit the field in all instances and state its `false` value explicitly. This patch makes it possible for `fieldName: false` to be emitted in the json logging format when the field is marked as `includeempty`.

Epic: none
Part of: #108284
Release note: None

113992: sql: add disable_changefeed_replication session variable r=miretskiy,yuzefovich a=andyyang890

This patch adds a `disable_changefeed_replication` session variable that can be used to disable changefeed replication for changes that occur within a session. Right now, the session variable has no effect, but in later commits it will be plumbed to the KV layer.

Fixes #114071
Release note: None

Co-authored-by: Xin Hao Zhang
Co-authored-by: Andy Yang
xinhaoz
added a commit
to xinhaoz/cockroach
that referenced
this issue
Nov 10, 2023
This commit adds the messages listed below to `telemetry.proto` in preparation for sending transaction executions to the telemetry channel. The transaction event that is eventually sent should contain all execution information currently being tracked for transaction fingerprints.

- `SampledTransaction`: contains fields equivalent to the execution information stored by `CollectedTransactionStatistics` from app_stats.proto, but represents a single txn execution instead of aggregated executions of a transaction fingerprint.
- `SampledExecStats`: used as a field in `SampledTransaction`, it contains execution stats that are sampled. This message is the equivalent of `ExecStats` from app_stats.proto but for a single execution.
- `MVCCIteratorStats`: used in `SampledExecStats` above, the equivalent of `MVCCIteratorStats` from app_stats.proto but for a single execution.

In addition, to support the above fields, a couple of code templates have been added for generating json log encoding:

- The `array_of_uint64` type is now handled for json logs.
- `nestedMessage` has been added as a custom type in `gen.go`. Object field types can be assigned to this type in order to generate them as nested objects.

Part of: cockroachdb#108284
Release note: None
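To make the two encoding additions concrete, here is a hand-written Go sketch of what a nested-object field and an array-of-uint64 field look like when appended to a JSON buffer. It is not the code that `gen.go` generates: the struct types, the `appendJSON` methods, and every field name below are hypothetical and heavily trimmed.

```go
package main

import (
	"fmt"
	"strconv"
)

// mvccStats, execStats and sampledTxn are hypothetical, trimmed-down
// stand-ins for the nested messages; the field names are illustrative only.
type mvccStats struct {
	StepCount uint64
	SeekCount uint64
}

type execStats struct {
	NetworkBytes uint64
	MVCC         mvccStats
}

type sampledTxn struct {
	StatementFingerprintIDs []uint64
	ExecStats               execStats
}

// appendUintField appends `"name":v` to b.
func appendUintField(b []byte, name string, v uint64) []byte {
	b = strconv.AppendQuote(b, name)
	b = append(b, ':')
	return strconv.AppendUint(b, v, 10)
}

// appendJSON encodes the innermost message as a JSON object.
func (m mvccStats) appendJSON(b []byte) []byte {
	b = append(b, '{')
	b = appendUintField(b, "StepCount", m.StepCount)
	b = append(b, ',')
	b = appendUintField(b, "SeekCount", m.SeekCount)
	return append(b, '}')
}

// appendJSON encodes execStats, delegating to the nested message so that it
// appears as a nested object rather than as flattened fields.
func (e execStats) appendJSON(b []byte) []byte {
	b = append(b, '{')
	b = appendUintField(b, "NetworkBytes", e.NetworkBytes)
	b = append(b, `,"MVCC":`...)
	b = e.MVCC.appendJSON(b)
	return append(b, '}')
}

// appendJSON encodes the top-level event: a uint64 array followed by a
// nested object.
func (t sampledTxn) appendJSON(b []byte) []byte {
	b = append(b, `{"StatementFingerprintIDs":[`...)
	for i, id := range t.StatementFingerprintIDs {
		if i > 0 {
			b = append(b, ',')
		}
		b = strconv.AppendUint(b, id, 10)
	}
	b = append(b, `],"ExecStats":`...)
	b = t.ExecStats.appendJSON(b)
	return append(b, '}')
}

func main() {
	txn := sampledTxn{
		StatementFingerprintIDs: []uint64{1, 2},
		ExecStats: execStats{
			NetworkBytes: 10,
			MVCC:         mvccStats{StepCount: 3, SeekCount: 4},
		},
	}
	fmt.Println(string(txn.appendJSON(nil)))
}
```

Each message type encodes itself and delegates to its children, which is one straightforward way to get nested objects in the output.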
craig bot
pushed a commit
that referenced
this issue
Dec 4, 2023
113952: log: add protobuf messages for telemetry txn events r=xinhaoz a=xinhaoz

This commit adds the messages listed below to `telemetry.proto` in preparation for sending transaction executions to the telemetry channel. The transaction event that is eventually sent should contain all execution information currently being tracked for transaction fingerprints.

- `SampledTransaction`: contains fields equivalent to the execution information stored by `CollectedTransactionStatistics` from app_stats.proto, but represents a single txn execution instead of aggregated executions of a transaction fingerprint.
- `SampledExecStats`: used as a field in `SampledTransaction`, it contains execution stats that are sampled. This message is the equivalent of `ExecStats` from app_stats.proto but for a single execution.
- `MVCCIteratorStats`: used in `SampledExecStats` above, the equivalent of `MVCCIteratorStats` from app_stats.proto but for a single execution.

In addition, to support the above fields, a couple of code templates have been added for generating json log encoding:

- The `array_of_uint64` type is now handled for json logs.
- `nestedMessage` has been added as a custom type in `gen.go`. Object field types can be assigned to this type in order to generate them as nested objects.

Part of: #108284
Release note: None

114666: opt: reduce planning time for queries with many joins r=mgartner a=mgartner

Prior to this commit, some queries with many joins would perform a large number of allocations calculating the selectivity of null-rejecting join filters. This was due to `statisticsBuilder.selectivityFromNullsRemoved` allocating a single-column set for each not-null column, and allocating column statistics for each set. Many of those allocations, and many unnecessary computations to traverse the expression tree, are now avoided. This is made possible by the realization that the selectivity of a null-rejecting filter is always 1 if the column was already not-null in the input.

Epic: None
Release note: None

115509: span: Re-initialize iterator when forwarding r=miretskiy a=miretskiy

Re-initialize the iterator when forwarding the span frontier timestamp. The underlying btree may be mutated (by a merge operation), invalidating a previously constructed iterator. The btree implementation is also hardened against misuse when mutating the span frontier while iterating.

Fixes #115411
Fixes #115528
Fixes #115512
Fixes #115490
Fixes #115488
Fixes #115487
Fixes #115483
Release notes: None

Co-authored-by: Xin Hao Zhang
Co-authored-by: Marcus Gartner
Co-authored-by: Yevgeniy Miretskiy
xinhaoz
added a commit
to xinhaoz/cockroach
that referenced
this issue
Dec 5, 2023
…on mode

Previously, we dropped logging of BEGIN statements in telemetry transaction sampling mode because BEGIN did not have an associated transaction execution id at the time of logging. For transaction sampling, we were using the transaction execution id to track the transaction through its execution in order to log all of its statements. Since BEGIN statements did not have an id, we could not start tracking with BEGIN.

This commit enables us to include BEGIN statements by using a combination of the session id and session txn counter as the tracking id for telemetry, instead of the execution id. This allows us to start tracking transactions at BEGIN instead of at the statement after it.

Part of: cockroachdb#108284
Release note (sql change): Telemetry logging - "transaction" sampling mode will now log BEGIN statements when they are present in a sampled transaction.
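The idea of a tracking id that is already available when BEGIN runs can be sketched as a composite key. The following Go snippet is a simplified illustration under stated assumptions: the `trackingID` and `session` types and their methods are hypothetical and do not reflect the actual conn executor code.

```go
package main

import "fmt"

// trackingID is a hypothetical composite key. Both parts are known as soon as
// BEGIN runs, unlike an execution id that is assigned later.
type trackingID struct {
	sessionID  string
	txnCounter int
}

// session is a hypothetical, simplified session holding a per-session
// transaction counter.
type session struct {
	id         string
	txnCounter int
}

// startTxn is called when BEGIN executes; it advances the counter.
func (s *session) startTxn() { s.txnCounter++ }

// currentTrackingID returns the key for the transaction in progress.
func (s *session) currentTrackingID() trackingID {
	return trackingID{sessionID: s.id, txnCounter: s.txnCounter}
}

func main() {
	s := &session{id: "session-a"}

	s.startTxn()
	atBegin := s.currentTrackingID() // available for the BEGIN statement itself
	atStmt := s.currentTrackingID()  // same key for later statements in the txn
	fmt.Println(atBegin == atStmt)

	s.startTxn() // the next transaction in the session gets a different key
	fmt.Println(atBegin == s.currentTrackingID())
}
```

Because the key is a plain comparable struct, BEGIN and every later statement of the same transaction resolve to the same value, while the next transaction in the session does not.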
xinhaoz
added a commit
to xinhaoz/cockroach
that referenced
this issue
Jan 10, 2024
This change modifies the telemetry transaction sampling process to be simpler.

Previously for telemetry transaction sampling the telemetry logging struct managed the tracking of sampled transactions via a map of execution ids. If a transaction was determined to be sampled, we would input an entry in this map and for each statement, we would check the map to see if the statement belongs to a tracked transaction.

Instead of using a map, we can mark the transaction as being sampled by telemetry in the conn executor. This removes the need for concurrent data struct access when the transaction is marked as being sampled, since each statement no longer needs to read an entry from the shared map. This also removes the need to track the number of transactions currently being sampled for telemetry, as this was introduced to manage the memory used by the map.

We will now determine if the transaction should be logged to telemetry at the start of transaction execution or at the start of a transaction restart. The transaction will be marked for telemetry logging if enough time has elapsed since the last transaction was sampled or if session tracing is on.

Transaction statements will be logged according to the following settings:

- sql.telemetry.transaction_sampling.frequency controls the frequency at which we sample a transaction. If a transaction is marked to be sampled by telemetry, this means we will log all of its statement execution events to telemetry, up to a maximum of `sql.telemetry.transaction_sampling.statement_events_per_transaction.max` statements.

Part of: cockroachdb#108284

Release note (ops change): New cluster settings:

- sql.telemetry.transaction_sampling.statement_events_per_transaction.max: controls the maximum number of statement events to emit per sampled transaction for TELEMETRY
- sql.telemetry.transaction_sampling.frequency: controls the maximum frequency at which we sample transactions for telemetry
xinhaoz
added a commit
to xinhaoz/cockroach
that referenced
this issue
Jan 10, 2024
This commit sends SampledTransaction events to the telemetry channel. In "transaction" sampling mode, if a transaction is marked to be logged to telemetry, we will emit a SampledTransaction event on transaction end containing transaction level stats. It is expected that if a transaction event exists in telemetry, its statement events will also have been logged (with a maximum number according to the setting sql.telemetry.transaction_sampling.statement_events_per_transaction.max).

Closes: cockroachdb#108284

Release note (ops change): Transactions sampled for the telemetry logging channel will now emit a SampledTransaction event. To sample transactions, set the cluster setting `sql.telemetry.query_sampling.mode = 'transaction'` and enable telemetry logging via `sql.telemetry.query_sampling.enabled = true`.
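As a small illustration of the release note, the two settings it names can be rendered as `SET CLUSTER SETTING` statements and sent through any SQL client. The Go helper below only builds the statement strings; the `samplingSettings` variable and the `setClusterSettingStatements` function are hypothetical names introduced for this sketch.

```go
package main

import "fmt"

// samplingSettings lists the cluster settings named in the release note,
// paired with the values it suggests.
var samplingSettings = []struct{ name, value string }{
	{"sql.telemetry.query_sampling.enabled", "true"},
	{"sql.telemetry.query_sampling.mode", "'transaction'"},
}

// setClusterSettingStatements renders one SET CLUSTER SETTING statement per
// entry, ready to be executed by a SQL client of your choice.
func setClusterSettingStatements() []string {
	stmts := make([]string, 0, len(samplingSettings))
	for _, s := range samplingSettings {
		stmts = append(stmts, fmt.Sprintf("SET CLUSTER SETTING %s = %s;", s.name, s.value))
	}
	return stmts
}

func main() {
	for _, stmt := range setClusterSettingStatements() {
		fmt.Println(stmt)
	}
}
```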
xinhaoz
added a commit
to xinhaoz/cockroach
that referenced
this issue
Jan 17, 2024
This change modifies the telemetry transaction sampling process to be simpler.

Previously for telemetry transaction sampling the telemetry logging struct managed the tracking of sampled transactions via a map of execution ids. If a transaction was determined to be sampled, we would input an entry in this map and for each statement, we would check the map to see if the statement belongs to a tracked transaction.

Instead of using a map, we can mark the transaction as being sampled by telemetry in the conn executor. This removes the need for concurrent data struct access when the transaction is marked as being sampled, since each statement no longer needs to read an entry from the shared map. This also removes the need to track the number of transactions currently being sampled for telemetry, as this was introduced to manage the memory used by the map.

We will now determine if the transaction should be logged to telemetry at the start of transaction execution or at the start of a transaction restart. The transaction will be marked for telemetry logging if enough time has elapsed since the last transaction was sampled or if session tracing is on.

Transaction statements will be logged according to the following settings:

- sql.telemetry.transaction_sampling.frequency controls the frequency at which we sample a transaction. If a transaction is marked to be sampled by telemetry, this means we will log all of its statement execution events to telemetry, up to a maximum of `sql.telemetry.transaction_sampling.statement_events_per_transaction.max` statements.

Additional items in this commit:

- Renames `TelemetryLoggingMetrics` to `telemetryLoggingMetrics` since it is not used in other packages.
- Renames `lastEmittedTime` -> `lastSampledTime` in the telemetryLogging struct, as the old name is no longer representative of what this timestamp is. With the introduction of transaction sampling, the emitted event time is not necessarily the time at which we decide to sample an event.
- Creates a datadriven test handler for telemetry logging. Datadriven telemetry logging tests should be created in the dir pkg/sql/testdata/telemetry_logging.
  - `telemetry_logging/logging` contains tests on verifying emitted logs.
  - `telemetry_logging/logging_decision` contains unit tests for the functions `shouldEmitTransactionLog` and `shouldEmitStatementLog`.

Epic: none
Release note: None
Part of: cockroachdb#108284

Release note (ops change): New cluster settings:

- sql.telemetry.transaction_sampling.statement_events_per_transaction.max: controls the maximum number of statement events to emit per sampled transaction for TELEMETRY
- sql.telemetry.transaction_sampling.frequency: controls the maximum frequency at which we sample transactions for telemetry
craig bot
pushed a commit
that referenced
this issue
Jan 18, 2024
115733: telemetry: track telemetry transactions through conn executor r=xinhaoz a=xinhaoz

This change modifies the telemetry transaction sampling process to be simpler.

Previously for telemetry transaction sampling the telemetry logging struct managed the tracking of sampled transactions via a map of execution ids. If a transaction was determined to be sampled, we would input an entry in this map and for each statement, we would check the map to see if the statement belongs to a tracked transaction.

Instead of using a map, we can mark the transaction as being sampled by telemetry in the conn executor. This removes the need for concurrent data struct access when the transaction is marked as being sampled, since each statement no longer needs to read an entry from the shared map. This also removes the need to track the number of transactions currently being sampled for telemetry, as this was introduced to manage the memory used by the map.

We will now determine if the transaction should be logged to telemetry at the start of transaction execution or at the start of a transaction restart. The transaction will be marked for telemetry logging if enough time has elapsed since the last transaction was sampled or if session tracing is on.

Transaction statements will be logged according to the following settings:

- sql.telemetry.transaction_sampling.frequency controls the frequency at which we sample a transaction. If a transaction is marked to be sampled by telemetry, this means we will log all of its statement execution events to telemetry, up to a maximum of `sql.telemetry.transaction_sampling.statement_events_per_transaction.max` statements.

Additional items in this commit:

- Renames `TelemetryLoggingMetrics` to `telemetryLoggingMetrics` since it is not used in other packages.
- Renames `lastEmittedTime` -> `lastSampledTime` in the telemetryLogging struct, as the old name is no longer representative of what this timestamp is. With the introduction of transaction sampling, the emitted event time is not necessarily the time at which we decide to sample an event.
- Creates a datadriven test handler for telemetry logging. Datadriven telemetry logging tests should be created in the dir pkg/sql/testdata/telemetry_logging.
  - `telemetry_logging/logging` contains tests on verifying emitted logs.
  - `telemetry_logging/logging_decision` contains unit tests for the functions `shouldEmitTransactionLog` and `shouldEmitStatementLog`.

Epic: none
Release note: None
Part of: #108284

Release note (ops change): New cluster settings:

- sql.telemetry.transaction_sampling.statement_events_per_transaction.max: controls the maximum number of statement events to emit per sampled transaction for TELEMETRY
- sql.telemetry.transaction_sampling.frequency: controls the maximum frequency at which we sample transactions for telemetry

Co-authored-by: Xin Hao Zhang
craig bot
pushed a commit
that referenced
this issue
Jan 26, 2024
115722: telemetry: log transaction exec events to TELEMETRY r=xinhaoz a=xinhaoz

### 1. [sql/telemetry: add SkippedTransactions to SampledTransaction proto](a356e63)

This commit adds the field SkippedTransactions to the SampledTransaction protobuf to count the number of transactions that were not sampled while telemetry transaction logging is enabled. The corresponding field is added to the telemetryLogging struct and will be used in the following commit to track skipped transactions.

Some whitespace in the SampledTransaction proto definition is adjusted.

Epic: none
Release note (sql change): New field `SkippedTransactions` in the SampledTransaction event, which is emitted to the TELEMETRY logging channel when telemetry logging is enabled and set to "transaction" mode.

### 2. [eventpb: make MVCCIteratorStats in SampledExecStats non-nullable](6610467)

This field should always exist in SampledExecStats. Since SampledTransaction is the only user of this message right now and is yet to be used, we can safely change the proto definition.

Epic: none
Release note: None

### 3. [telemetry: log transaction exec events to TELEMETRY](cc0d1bb)

This commit sends SampledTransaction events to the telemetry channel. In "transaction" sampling mode, if a transaction is marked to be logged to telemetry, we will emit a SampledTransaction event on transaction end containing transaction level stats. It is expected that if a transaction event exists in telemetry, its statement events will also have been logged (with a maximum number according to the setting sql.telemetry.transaction_sampling.statement_events_per_transaction.max).

Closes: #108284

Release note (ops change): Transactions sampled for the telemetry logging channel will now emit a SampledTransaction event. To sample transactions, set the cluster setting `sql.telemetry.query_sampling.mode = 'transaction'` and enable telemetry logging via `sql.telemetry.query_sampling.enabled = true`.

118325: datapathutils: update comment for `DebuggableTempDir` r=rail a=rickystewart

Improve some of the wording here. Also I accidentally wrote "temp" instead of "test" which is confusing.

Epic: none
Release note: None

Co-authored-by: Xin Hao Zhang
Co-authored-by: Ricky Stewart
xinhaoz
added a commit
to xinhaoz/cockroach
that referenced
this issue
Jan 29, 2024
This commit sends SampledTransaction events to the telemetry channel. In "transaction" sampling mode, if a transaction is marked to be logged to telemetry, we will emit a SampledTransaction event on transaction end containing transaction level stats. It is expected that if a transaction event exists in telemetry, its statement events will also have been logged (with a maximum number according to the setting sql.telemetry.transaction_sampling.statement_events_per_transaction.max).

Transaction recording for telemetry is decided at the start of transaction execution (including on restarts), and will not be refreshed for the remainder of transaction execution.

Closes: cockroachdb#108284

Release note (ops change): Transactions sampled for the telemetry logging channel will now emit a SampledTransaction event. To sample transactions, set the cluster setting `sql.telemetry.query_sampling.mode = 'transaction'` and enable telemetry logging via `sql.telemetry.query_sampling.enabled = true`.
It's difficult to monitor and troubleshoot CockroachDB without SQL statistics in downstream APM and observability tools.
This issue tracks emitting a new transaction-level event with statistics similar to what we have for the statement-level event, to allow users to identify and correlate issues with specific statements and up to their application. Note the following behavior:
Jira issue: CRDB-30411
Epic CRDB-25399