server, sql: collect additional session details
Partially addresses cockroachdb#74257.

Previously, the sql server did not provide additional session details
such as total number of transactions executed, transaction fingerprint
IDs, and total active time. This change adds the aforementioned session
details to the `serverpb.Session` struct.

To track transaction fingerprint IDs, a new circular buffer
`TxnFingerprintIDBuffer` is introduced with its corresponding cluster
setting `TxnFingerprintIDBufferCapacity` which controls the capacity.
The total number of transactions executed is accumulated using the `txnCounter`
from the `extraTxnState` in `connExecutor`. The total active time is measured
by a `timeutil.StopWatch` that is started when a transaction starts, stopped and
started again when it restarts, and stopped when it finishes.
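
The circular buffer idea above can be illustrated with a minimal Go sketch. This is not the `TxnFingerprintIDBuffer` from this commit (its implementation is not shown in this part of the diff); `ringBuffer`, `fingerprintID`, and the evict-the-oldest policy are assumptions made for illustration, and the memory accounting implied by the `RootMemoryMonitor` argument is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// fingerprintID is a hypothetical stand-in for
// roachpb.TransactionFingerprintID, which is a uint64.
type fingerprintID uint64

// ringBuffer is a hypothetical fixed-capacity circular buffer.
// Assumption: once full, a new entry overwrites the oldest one.
type ringBuffer struct {
	mu    sync.Mutex
	data  []fingerprintID
	start int // index of the oldest element
	size  int // number of valid elements
}

func newRingBuffer(capacity int) *ringBuffer {
	return &ringBuffer{data: make([]fingerprintID, capacity)}
}

// enqueue appends id, evicting the oldest entry when the buffer is full.
func (b *ringBuffer) enqueue(id fingerprintID) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.data) == 0 {
		return // a zero-capacity buffer stores nothing
	}
	if b.size < len(b.data) {
		b.data[(b.start+b.size)%len(b.data)] = id
		b.size++
		return
	}
	b.data[b.start] = id
	b.start = (b.start + 1) % len(b.data)
}

// all returns the stored IDs from oldest to newest.
func (b *ringBuffer) all() []fingerprintID {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := make([]fingerprintID, 0, b.size)
	for i := 0; i < b.size; i++ {
		out = append(out, b.data[(b.start+i)%len(b.data)])
	}
	return out
}

func main() {
	buf := newRingBuffer(3)
	for id := fingerprintID(1); id <= 5; id++ {
		buf.enqueue(id)
	}
	fmt.Println(buf.all()) // prints [3 4 5]
}
```

With a capacity of 3 and five enqueued IDs, only the three most recent survive, which is the role a capacity setting plays for a bounded buffer of this kind.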

Release note (api change): the `serverpb.Session` struct now has three
new fields: number of transactions executed, transaction fingerprint
IDs, and total active time.
Gerardo Torres committed May 4, 2022
1 parent 3831e27 commit d9304ce
Showing 16 changed files with 485 additions and 11 deletions.
6 changes: 6 additions & 0 deletions docs/generated/http/full.md
@@ -2108,9 +2108,12 @@ Session represents one SQL session.
| alloc_bytes | [int64](#cockroach.server.serverpb.ListSessionsResponse-int64) | | Number of currently allocated bytes in the session memory monitor. | [reserved](#support-status) |
| max_alloc_bytes | [int64](#cockroach.server.serverpb.ListSessionsResponse-int64) | | High water mark of allocated bytes in the session memory monitor. | [reserved](#support-status) |
| active_txn | [TxnInfo](#cockroach.server.serverpb.ListSessionsResponse-cockroach.server.serverpb.TxnInfo) | | Information about the txn in progress on this session. Nil if the session doesn't currently have a transaction. | [reserved](#support-status) |
| num_txns_executed | [int32](#cockroach.server.serverpb.ListSessionsResponse-int32) | | Number of transactions that were executed so far on this session. | [reserved](#support-status) |
| txnFingerprintIDs | [uint64](#cockroach.server.serverpb.ListSessionsResponse-uint64) | repeated | List of transaction fingerprint IDs in this session. | [reserved](#support-status) |
| last_active_query_no_constants | [string](#cockroach.server.serverpb.ListSessionsResponse-string) | | The SQL statement fingerprint of the last query executed on this session, compatible with StatementStatisticsKey. | [reserved](#support-status) |
| status | [Session.Status](#cockroach.server.serverpb.ListSessionsResponse-cockroach.server.serverpb.Session.Status) | | The session's status. | [reserved](#support-status) |
| end | [google.protobuf.Timestamp](#cockroach.server.serverpb.ListSessionsResponse-google.protobuf.Timestamp) | | Timestamp of session's end. | [reserved](#support-status) |
| totalActiveTime | [google.protobuf.Duration](#cockroach.server.serverpb.ListSessionsResponse-google.protobuf.Duration) | | The session's total active time. | [reserved](#support-status) |



@@ -2242,9 +2245,12 @@ Session represents one SQL session.
| alloc_bytes | [int64](#cockroach.server.serverpb.ListSessionsResponse-int64) | | Number of currently allocated bytes in the session memory monitor. | [reserved](#support-status) |
| max_alloc_bytes | [int64](#cockroach.server.serverpb.ListSessionsResponse-int64) | | High water mark of allocated bytes in the session memory monitor. | [reserved](#support-status) |
| active_txn | [TxnInfo](#cockroach.server.serverpb.ListSessionsResponse-cockroach.server.serverpb.TxnInfo) | | Information about the txn in progress on this session. Nil if the session doesn't currently have a transaction. | [reserved](#support-status) |
| num_txns_executed | [int32](#cockroach.server.serverpb.ListSessionsResponse-int32) | | Number of transactions that were executed so far on this session. | [reserved](#support-status) |
| txnFingerprintIDs | [uint64](#cockroach.server.serverpb.ListSessionsResponse-uint64) | repeated | List of transaction fingerprint IDs in this session. | [reserved](#support-status) |
| last_active_query_no_constants | [string](#cockroach.server.serverpb.ListSessionsResponse-string) | | The SQL statement fingerprint of the last query executed on this session, compatible with StatementStatisticsKey. | [reserved](#support-status) |
| status | [Session.Status](#cockroach.server.serverpb.ListSessionsResponse-cockroach.server.serverpb.Session.Status) | | The session's status. | [reserved](#support-status) |
| end | [google.protobuf.Timestamp](#cockroach.server.serverpb.ListSessionsResponse-google.protobuf.Timestamp) | | Timestamp of session's end. | [reserved](#support-status) |
| totalActiveTime | [google.protobuf.Duration](#cockroach.server.serverpb.ListSessionsResponse-google.protobuf.Duration) | | The session's total active time. | [reserved](#support-status) |



1 change: 1 addition & 0 deletions docs/generated/settings/settings-for-tenants.txt
@@ -186,6 +186,7 @@ sql.ttl.default_range_concurrency integer 1 default amount of ranges to process
sql.ttl.default_select_batch_size integer 500 default amount of rows to select in a single query during a TTL job
sql.ttl.job.enabled boolean true whether the TTL job is enabled
sql.ttl.range_batch_size integer 100 amount of ranges to fetch at a time for a table during the TTL job
sql.txn_fingerprint_id_buffer.capacity integer 100 the maximum number of txn fingerprint IDs stored
timeseries.storage.enabled boolean true if set, periodic timeseries data is stored within the cluster; disabling is not recommended unless you are storing the data elsewhere
timeseries.storage.resolution_10s.ttl duration 240h0m0s the maximum age of time series data stored at the 10 second resolution. Data older than this is subject to rollup and deletion.
timeseries.storage.resolution_30m.ttl duration 2160h0m0s the maximum age of time series data stored at the 30 minute resolution. Data older than this is subject to deletion.
1 change: 1 addition & 0 deletions docs/generated/settings/settings.html
@@ -202,6 +202,7 @@
<tr><td><code>sql.ttl.default_select_batch_size</code></td><td>integer</td><td><code>500</code></td><td>default amount of rows to select in a single query during a TTL job</td></tr>
<tr><td><code>sql.ttl.job.enabled</code></td><td>boolean</td><td><code>true</code></td><td>whether the TTL job is enabled</td></tr>
<tr><td><code>sql.ttl.range_batch_size</code></td><td>integer</td><td><code>100</code></td><td>amount of ranges to fetch at a time for a table during the TTL job</td></tr>
<tr><td><code>sql.txn_fingerprint_id_buffer.capacity</code></td><td>integer</td><td><code>100</code></td><td>the maximum number of txn fingerprint IDs stored</td></tr>
<tr><td><code>timeseries.storage.enabled</code></td><td>boolean</td><td><code>true</code></td><td>if set, periodic timeseries data is stored within the cluster; disabling is not recommended unless you are storing the data elsewhere</td></tr>
<tr><td><code>timeseries.storage.resolution_10s.ttl</code></td><td>duration</td><td><code>240h0m0s</code></td><td>the maximum age of time series data stored at the 10 second resolution. Data older than this is subject to rollup and deletion.</td></tr>
<tr><td><code>timeseries.storage.resolution_30m.ttl</code></td><td>duration</td><td><code>2160h0m0s</code></td><td>the maximum age of time series data stored at the 30 minute resolution. Data older than this is subject to deletion.</td></tr>
29 changes: 29 additions & 0 deletions docs/generated/swagger/spec.json
@@ -768,6 +768,12 @@
},
"x-go-package": "github.com/cockroachdb/cockroach/pkg/server/serverpb"
},
"Duration": {
"description": "A Duration represents the elapsed time between two instants\nas an int64 nanosecond count. The representation limits the\nlargest representable duration to approximately 290 years.",
"type": "integer",
"format": "int64",
"x-go-package": "time"
},
"EventsResponse": {
"description": "EventsResponse contains a set of event log entries. This is always limited\nto the latest N entries (N is enforced in the associated endpoint).",
"type": "object",
@@ -1220,6 +1226,12 @@
"node_id": {
"$ref": "#/definitions/NodeID"
},
"num_txns_executed": {
"description": "num_txns_executed is the number of transactions that were executed so\nfar on this session.",
"type": "integer",
"format": "int32",
"x-go-name": "NumTxnsExecuted"
},
"start": {
"description": "Timestamp of session's start.",
"type": "string",
@@ -1229,6 +1241,17 @@
"status": {
"$ref": "#/definitions/Session_Status"
},
"totalActiveTime": {
"$ref": "#/definitions/Duration"
},
"txnFingerprintIDs": {
"description": "List of transaction fingerprint IDs in this session.",
"type": "array",
"items": {
"$ref": "#/definitions/TransactionFingerprintID"
},
"x-go-name": "TxnFingerprintIDs"
},
"username": {
"description": "Username of the user for this session.",
"type": "string",
@@ -1493,6 +1516,12 @@
},
"x-go-package": "github.com/cockroachdb/cockroach/pkg/util/hlc"
},
"TransactionFingerprintID": {
"description": "TransactionFingerprintID is the hashed string constructed using the\nindividual statement fingerprint IDs that comprise the transaction.",
"type": "integer",
"format": "uint64",
"x-go-package": "github.com/cockroachdb/cockroach/pkg/roachpb"
},
"TxnInfo": {
"type": "object",
"title": "TxnInfo represents an in flight user transaction on some Session.",
8 changes: 4 additions & 4 deletions pkg/ccl/logictestccl/testdata/logic_test/crdb_internal_tenant
@@ -243,15 +243,15 @@ SELECT * FROM crdb_internal.cluster_transactions WHERE node_id < 0
----
id node_id session_id start txn_string application_name num_stmts num_retries num_auto_retries

-query ITTTTTTTTTTTTT colnames
+query ITTTTTTTTTTTTTT colnames
SELECT * FROM crdb_internal.node_sessions WHERE node_id < 0
----
-node_id session_id user_name client_address application_name active_queries last_active_query session_start oldest_query_start kv_txn alloc_bytes max_alloc_bytes status session_end
+node_id session_id user_name client_address application_name active_queries last_active_query num_txns_executed session_start oldest_query_start kv_txn alloc_bytes max_alloc_bytes status session_end

-query ITTTTTTTTTTTTT colnames
+query ITTTTTTTTTTTTTT colnames
SELECT * FROM crdb_internal.cluster_sessions WHERE node_id < 0
----
-node_id session_id user_name client_address application_name active_queries last_active_query session_start oldest_query_start kv_txn alloc_bytes max_alloc_bytes status session_end
+node_id session_id user_name client_address application_name active_queries last_active_query num_txns_executed session_start oldest_query_start kv_txn alloc_bytes max_alloc_bytes status session_end

query IIITTTI colnames
SELECT * FROM crdb_internal.node_contention_events WHERE table_id < 0
9 changes: 9 additions & 0 deletions pkg/server/serverpb/status.proto
@@ -944,6 +944,12 @@ message Session {
// Information about the txn in progress on this session. Nil if the
// session doesn't currently have a transaction.
TxnInfo active_txn = 12;
// Number of transactions that were executed so far on this session.
int32 num_txns_executed = 16;
// List of transaction fingerprint IDs in this session.
repeated uint64 txnFingerprintIDs = 17 [(gogoproto.customname) = "TxnFingerprintIDs",
(gogoproto.customtype) = "github.com/cockroachdb/cockroach/pkg/roachpb.TransactionFingerprintID",
(gogoproto.nullable) = false];
// The SQL statement fingerprint of the last query executed on this session,
// compatible with StatementStatisticsKey.
string last_active_query_no_constants = 13;
@@ -958,6 +964,9 @@ message Session {
// Timestamp of session's end.
google.protobuf.Timestamp end = 15
[ (gogoproto.nullable) = true, (gogoproto.stdtime) = true ];
// The session's total active time.
google.protobuf.Duration totalActiveTime = 18 [(gogoproto.nullable) = false,
(gogoproto.stdduration) = true];
}

// An error wrapper object for ListSessionsResponse.
21 changes: 21 additions & 0 deletions pkg/sql/conn_executor.go
@@ -883,6 +883,8 @@ func (s *Server) newConnExecutor(
stmtDiagnosticsRecorder: s.cfg.StmtDiagnosticsRecorder,
indexUsageStats: s.indexUsageStats,
txnIDCacheWriter: s.txnIDCache,
totalActiveTimeStopWatch: timeutil.NewStopWatch(),
TxnFingerprintIDBuffer: NewTxnFingerprintIDBuffer(s.cfg.Settings, s.cfg.RootMemoryMonitor),
}

ex.state.txnAbortCount = ex.metrics.EngineMetrics.TxnAbortCount
@@ -1497,6 +1499,17 @@ type connExecutor struct {
// txnIDCacheWriter is used to write txnidcache.ResolvedTxnID to the
// Transaction ID Cache.
txnIDCacheWriter txnidcache.Writer

// totalTxnsExecuted tracks the total number of transactions executed during
// the lifetime of the session.
totalTxnsExecuted int

// TxnFingerprintIDBuffer is a circular buffer keeping track of the
// txnFingerprintIDs in this session.
TxnFingerprintIDBuffer *TxnFingerprintIDBuffer

// totalActiveTimeStopWatch tracks the total active time of the session.
totalActiveTimeStopWatch *timeutil.StopWatch
}

// ctxHolder contains a connection's context and, while session tracing is
@@ -1661,6 +1674,9 @@ func (ex *connExecutor) resetExtraTxnState(ctx context.Context, ev txnEvent) err

ex.extraTxnState.createdSequences = make(map[descpb.ID]struct{})

// Increment the totalTxnsExecuted count.
ex.totalTxnsExecuted += ex.extraTxnState.txnCounter

switch ev.eventType {
case txnCommit, txnRollback:
for name, p := range ex.extraTxnState.prepStmtsNamespaceAtTxnRewindPos.portals {
@@ -3115,19 +3131,24 @@ func (ex *connExecutor) serialize() serverpb.Session {
remoteStr = sd.RemoteAddr.String()
}

txnFingerprintIDs := ex.TxnFingerprintIDBuffer.GetAllTxnFingerprintIDs()

return serverpb.Session{
Username: sd.SessionUser().Normalized(),
ClientAddress: remoteStr,
ApplicationName: ex.applicationName.Load().(string),
Start: ex.phaseTimes.GetSessionPhaseTime(sessionphase.SessionInit).UTC(),
ActiveQueries: activeQueries,
ActiveTxn: activeTxnInfo,
NumTxnsExecuted: int32(ex.totalTxnsExecuted),
TxnFingerprintIDs: txnFingerprintIDs,
LastActiveQuery: lastActiveQuery,
ID: ex.sessionID.GetBytes(),
AllocBytes: ex.mon.AllocBytes(),
MaxAllocBytes: ex.mon.MaximumBytes(),
LastActiveQueryNoConstants: lastActiveQueryNoConstants,
Status: status,
TotalActiveTime: ex.totalActiveTimeStopWatch.Elapsed(),
}
}

13 changes: 12 additions & 1 deletion pkg/sql/conn_executor_exec.go
@@ -2101,6 +2101,12 @@ func (ex *connExecutor) onTxnFinish(ctx context.Context, ev txnEvent) {
ex.phaseTimes.SetSessionPhaseTime(sessionphase.SessionEndExecTransaction, timeutil.Now())
transactionFingerprintID :=
roachpb.TransactionFingerprintID(ex.extraTxnState.transactionStatementsHash.Sum())
err := ex.TxnFingerprintIDBuffer.Enqueue(transactionFingerprintID)
if err != nil {
if log.V(1) {
log.Warningf(ctx, "failed to enqueue transactionFingerprintID = %d: %s", transactionFingerprintID, err)
}
}
if !implicit {
ex.statsCollector.EndExplicitTransaction(
ctx,
@@ -2114,7 +2120,7 @@ func (ex *connExecutor) onTxnFinish(ctx context.Context, ev txnEvent) {
transactionFingerprintID,
)
}
-err := ex.recordTransactionFinish(ctx, transactionFingerprintID, ev, implicit, txnStart)
+err = ex.recordTransactionFinish(ctx, transactionFingerprintID, ev, implicit, txnStart)
if err != nil {
if log.V(1) {
log.Warningf(ctx, "failed to record transaction stats: %s", err)
@@ -2126,6 +2132,8 @@ func (ex *connExecutor) onTxnFinish(ctx context.Context, ev txnEvent) {

func (ex *connExecutor) onTxnRestart(ctx context.Context) {
if ex.extraTxnState.shouldExecuteOnTxnRestart {
ex.totalActiveTimeStopWatch.Stop()
defer ex.totalActiveTimeStopWatch.Start()
ex.phaseTimes.SetSessionPhaseTime(sessionphase.SessionMostRecentStartExecTransaction, timeutil.Now())
ex.extraTxnState.transactionStatementFingerprintIDs = nil
ex.extraTxnState.transactionStatementsHash = util.MakeFNV64()
@@ -2157,6 +2165,8 @@ func (ex *connExecutor) recordTransactionStart(txnID uuid.UUID) {
ex.state.mu.RUnlock()
implicit := ex.implicitTxn()

ex.totalActiveTimeStopWatch.Start()

// Transaction received time is the time at which the statement that prompted
// the creation of this transaction was received.
ex.phaseTimes.SetSessionPhaseTime(sessionphase.SessionTransactionReceived,
@@ -2210,6 +2220,7 @@ func (ex *connExecutor) recordTransactionFinish(

txnEnd := timeutil.Now()
txnTime := txnEnd.Sub(txnStart)
ex.totalActiveTimeStopWatch.Stop()
if ex.executorType != executorTypeInternal {
ex.metrics.EngineMetrics.SQLTxnsOpen.Dec(1)
}
5 changes: 4 additions & 1 deletion pkg/sql/crdb_internal.go
@@ -1615,7 +1615,7 @@ CREATE TABLE crdb_internal.%s (
node_id INT, -- the ID of the node running the transaction
session_id STRING, -- the ID of the session
start TIMESTAMP, -- the start time of the transaction
-txn_string STRING, -- the string representation of the transcation
+txn_string STRING, -- the string representation of the transaction
application_name STRING, -- the name of the application as per SET application_name
num_stmts INT, -- the number of statements executed so far
num_retries INT, -- the number of times the transaction was restarted
@@ -1894,6 +1894,7 @@ CREATE TABLE crdb_internal.%s (
application_name STRING, -- the name of the application as per SET application_name
active_queries STRING, -- the currently running queries as SQL
last_active_query STRING, -- the query that finished last on this session as SQL
num_txns_executed INT, -- the number of transactions that were executed so far on this session
session_start TIMESTAMP, -- the time when the session was opened
oldest_query_start TIMESTAMP, -- the time when the oldest query in the session was started
kv_txn STRING, -- the ID of the current KV transaction
@@ -1995,6 +1996,7 @@ func populateSessionsTable(
tree.NewDString(session.ApplicationName),
tree.NewDString(activeQueries.String()),
tree.NewDString(session.LastActiveQuery),
tree.NewDInt(tree.DInt(session.NumTxnsExecuted)),
startTSDatum,
oldestStartDatum,
kvTxnIDDatum,
@@ -2020,6 +2022,7 @@ func populateSessionsTable(
tree.DNull, // application name
tree.NewDString("-- "+rpcErr.Message), // active queries
tree.DNull, // last active query
tree.DNull, // num txns executed
tree.DNull, // session start
tree.DNull, // oldest_query_start
tree.DNull, // kv_txn
2 changes: 1 addition & 1 deletion pkg/sql/delegate/show_sessions.go
@@ -16,7 +16,7 @@ import (
)

func (d *delegator) delegateShowSessions(n *tree.ShowSessions) (tree.Statement, error) {
-const query = `SELECT node_id, session_id, status, user_name, client_address, application_name, active_queries, last_active_query, session_start, oldest_query_start FROM crdb_internal.`
+const query = `SELECT node_id, session_id, status, user_name, client_address, application_name, active_queries, last_active_query, num_txns_executed, session_start, oldest_query_start FROM crdb_internal.`
table := `node_sessions`
if n.Cluster {
table = `cluster_sessions`
8 changes: 4 additions & 4 deletions pkg/sql/logictest/testdata/logic_test/crdb_internal
@@ -364,15 +364,15 @@ SELECT * FROM crdb_internal.cluster_transactions WHERE node_id < 0
----
id node_id session_id start txn_string application_name num_stmts num_retries num_auto_retries

-query ITTTTTTTTTTTTT colnames
+query ITTTTTTTTTTTTTT colnames
SELECT * FROM crdb_internal.node_sessions WHERE node_id < 0
----
-node_id session_id user_name client_address application_name active_queries last_active_query session_start oldest_query_start kv_txn alloc_bytes max_alloc_bytes status session_end
+node_id session_id user_name client_address application_name active_queries last_active_query num_txns_executed session_start oldest_query_start kv_txn alloc_bytes max_alloc_bytes status session_end

-query ITTTTTTTTTTTTT colnames
+query ITTTTTTTTTTTTTT colnames
SELECT * FROM crdb_internal.cluster_sessions WHERE node_id < 0
----
-node_id session_id user_name client_address application_name active_queries last_active_query session_start oldest_query_start kv_txn alloc_bytes max_alloc_bytes status session_end
+node_id session_id user_name client_address application_name active_queries last_active_query num_txns_executed session_start oldest_query_start kv_txn alloc_bytes max_alloc_bytes status session_end

query IIITTTI colnames
SELECT * FROM crdb_internal.node_contention_events WHERE table_id < 0