-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server, sql: surface session txnCount, recent txn fingerprints, active time #82352
Conversation
fadce86
to
60160c8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @xinhaoz)
pkg/sql/logictest/testdata/logic_test/crdb_internal
line 371 at r1 (raw file):
SELECT * FROM crdb_internal.node_sessions WHERE node_id < 0 ---- node_id session_id user_name client_address application_name active_queries last_active_query num_txns_executed session_start oldest_query_start kv_txn alloc_bytes max_alloc_bytes status session_end
i feel like this might be less disruptive if the new column were added to the end, so the column ordering doesn't change as much. (would affect any users who are doing select *
queries)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mod a few questions and @rafiss's suggestion.
Reviewed 6 of 19 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @matthewtodd, @rafiss, and @xinhaoz)
pkg/sql/conn_executor.go
line 2904 at r2 (raw file):
ex.extraTxnState.txnCounter++ // Session is considred active when executing a transaction.
nit: typo in "considred"
pkg/sql/txn_fingerprint_id_cache.go
line 77 at r2 (raw file):
} // Add adds a TxnFingerprintID to the cache, truncating the cache to the
nit: looks like this sentence got truncated?
pkg/sql/txn_fingerprint_id_cache.go
line 92 at r2 (raw file):
// GetAllTxnFingerprintIDs returns a slice of all TxnFingerprintIDs in the cache. // The cache may be truncated if the capacity was updated to a smaller size.
nit: I wonder if it would be sufficient to just let adding to the cache handle eviction, WDYT?
pkg/util/timeutil/stopwatch.go
line 90 at r2 (raw file):
w.mu.Lock() defer w.mu.Unlock() return w.mu.started, w.mu.startedAt
nit: Would it feel more idiomatic to reverse the order of these return parameters, like the v, ok
when getting from a map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @matthewtodd and @rafiss)
pkg/sql/conn_executor.go
line 2904 at r2 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
nit: typo in "considred"
Fixed.
pkg/sql/txn_fingerprint_id_cache.go
line 77 at r2 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
nit: looks like this sentence got truncated?
Fixed.
pkg/sql/txn_fingerprint_id_cache.go
line 92 at r2 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
nit: I wonder if it would be sufficient to just let adding to the cache handle eviction, WDYT?
Yeah, the rate at which txn fingerprints are added in a real workload might be sufficient for ensuring the cache eventually gets resized appropriately. But in the off chance where none are inserted between a change in capacity and a retrieval of fingerprint ids, are we okay with the number of returned ids being greater than the cache's newly set capacity? I'm also not sure how often sessions become idle for extended periods of time in practice.
pkg/util/timeutil/stopwatch.go
line 90 at r2 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
nit: Would it feel more idiomatic to reverse the order of these return parameters, like the
v, ok
when getting from a map?
Ah good suggestions! Done. I also changed the name of this function to more accurately reflect what it returns.
pkg/sql/logictest/testdata/logic_test/crdb_internal
line 371 at r1 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
i feel like this might be less disruptive if the new column were added to the end, so the column ordering doesn't change as much. (would affect any users who are doing
select *
queries)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not much to add on top of what others said aside from a lingering merge conflict artifact. otherwise!
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @matthewtodd, @rafiss, and @xinhaoz)
docs/generated/settings/settings-for-tenants.txt
line 276 at r3 (raw file):
sql.ttl.default_select_batch_size integer 500 default amount of rows to select in a single query during a TTL job sql.ttl.job.enabled boolean true whether the TTL job is enabled <<<<<<< HEAD
nit: Lingering artifacts from merge conflict?
Code quote:
<<<<<<< HEAD
=======
sql.ttl.range_batch_size integer 100 amount of ranges to fetch at a time for a table during the TTL job
sql.txn_fingerprint_id_cache.capacity integer 100 the maximum number of txn fingerprint IDs stored
>>>>>>> 4dd1f55533 (server, sql: surface session txnCount, txn fingerprints, active time)
Partially addresses cockroachdb#74257. Previously, the status server did not provide session details such as total number of transactions executed, transaction fingerprint IDs, and total active time. This change adds the aforementioned session details to the `serverpb.Session` struct. To track recently executed transaction fingerprint IDs, a FIFO cache `TxnFingerprintIDCache` is introduced with its corresponding cluster setting `TxnFingerprintIDBufferCapacity` to control the capacity. The default capacity is set at 100 fingerprints. The total number of transactions executed is filled using the existing `txnCounter` from the `extraTxnState` in `connExecutor`. The total active time is calculated by introducing a `timeutil.StopWatch` to the connection executor, which is started and stopped when a transaction is started and finished respectively. Release note (api change): the `serverpb.Session` struct now has three new fields: number of transactions executed, transaction fingerprint IDs, and total active time.
Thanks everyone for the reviews! |
Build failed (retrying...): |
Build failed (retrying...): |
Build failed (retrying...): |
Build succeeded: |
Finishing up Gerardo's PR, original review here: #80717
Partially addresses #74257.
Previously, the status server did not provide session details such as
total number of transactions executed, transaction fingerprint
IDs, and total active time. This change adds the aforementioned session
details to the
serverpb.Session
struct.To track recently executed transaction fingerprint IDs, a FIFO cache
TxnFingerprintIDCache
is introduced with its corresponding clustersetting
TxnFingerprintIDBufferCapacity
to control the capacity. Thedefault capacity is set at 100 fingerprints.
The total number of transactions executed is filled using the existing
txnCounter
from theextraTxnState
inconnExecutor
. The total activetime is calculated by introducing a
timeutil.StopWatch
to the connectionexecutor, which is started and stopped when a transaction is started and
finished respectively.
Release note (api change): the
serverpb.Session
struct now has threenew fields: number of transactions executed, transaction fingerprint
IDs, and total active time.