Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server, sql: surface session txnCount, recent txn fingerprints, active time #82352

Merged
merged 1 commit into from
Jun 28, 2022

Conversation

xinhaoz
Copy link
Member

@xinhaoz xinhaoz commented Jun 2, 2022

Finishing up Gerardo's PR, original review here: #80717


Partially addresses #74257.

Previously, the status server did not provide session details such as
total number of transactions executed, transaction fingerprint
IDs, and total active time. This change adds the aforementioned session
details to the serverpb.Session struct.

To track recently executed transaction fingerprint IDs, a FIFO cache
TxnFingerprintIDCache is introduced with its corresponding cluster
setting TxnFingerprintIDBufferCapacity to control the capacity. The
default capacity is set at 100 fingerprints.

The total number of transactions executed is filled using the existing
txnCounter from the extraTxnState in connExecutor. The total active
time is calculated by introducing a timeutil.StopWatch to the connection
executor, which is started and stopped when a transaction is started and
finished respectively.

Release note (api change): the serverpb.Session struct now has three
new fields: number of transactions executed, transaction fingerprint
IDs, and total active time.

@xinhaoz xinhaoz requested a review from a team June 2, 2022 14:35
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@xinhaoz xinhaoz marked this pull request as ready for review June 2, 2022 14:36
@xinhaoz xinhaoz requested review from a team as code owners June 2, 2022 14:36
@xinhaoz xinhaoz requested a review from a team June 2, 2022 14:36
@xinhaoz xinhaoz force-pushed the session-details branch 2 times, most recently from fadce86 to 60160c8 Compare June 2, 2022 20:47
Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @xinhaoz)


pkg/sql/logictest/testdata/logic_test/crdb_internal line 371 at r1 (raw file):

SELECT * FROM crdb_internal.node_sessions WHERE node_id < 0
----
node_id  session_id  user_name  client_address  application_name  active_queries  last_active_query  num_txns_executed  session_start  oldest_query_start  kv_txn  alloc_bytes  max_alloc_bytes status session_end

i feel like this might be less disruptive if the new column were added to the end, so the column ordering doesn't change as much. (would affect any users who are doing select * queries)

@xinhaoz xinhaoz force-pushed the session-details branch from 60160c8 to 4dd1f55 Compare June 3, 2022 14:59
Copy link
Contributor

@matthewtodd matthewtodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: mod a few questions and @rafiss's suggestion.

Reviewed 6 of 19 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @matthewtodd, @rafiss, and @xinhaoz)


pkg/sql/conn_executor.go line 2904 at r2 (raw file):

		ex.extraTxnState.txnCounter++

		// Session is considred active when executing a transaction.

nit: typo in "considred"


pkg/sql/txn_fingerprint_id_cache.go line 77 at r2 (raw file):

}

// Add adds a TxnFingerprintID to the cache, truncating the cache to the

nit: looks like this sentence got truncated?


pkg/sql/txn_fingerprint_id_cache.go line 92 at r2 (raw file):

// GetAllTxnFingerprintIDs returns a slice of all TxnFingerprintIDs in the cache.
// The cache may be truncated if the capacity was updated to a smaller size.

nit: I wonder if it would be sufficient to just let adding to the cache handle eviction, WDYT?


pkg/util/timeutil/stopwatch.go line 90 at r2 (raw file):

	w.mu.Lock()
	defer w.mu.Unlock()
	return w.mu.started, w.mu.startedAt

nit: Would it feel more idiomatic to reverse the order of these return parameters, like the v, ok when getting from a map?

@xinhaoz xinhaoz requested a review from a team June 14, 2022 17:59
Copy link
Member Author

@xinhaoz xinhaoz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @matthewtodd and @rafiss)


pkg/sql/conn_executor.go line 2904 at r2 (raw file):

Previously, matthewtodd (Matthew Todd) wrote…

nit: typo in "considred"

Fixed.


pkg/sql/txn_fingerprint_id_cache.go line 77 at r2 (raw file):

Previously, matthewtodd (Matthew Todd) wrote…

nit: looks like this sentence got truncated?

Fixed.


pkg/sql/txn_fingerprint_id_cache.go line 92 at r2 (raw file):

Previously, matthewtodd (Matthew Todd) wrote…

nit: I wonder if it would be sufficient to just let adding to the cache handle eviction, WDYT?

Yeah, the rate at which txn fingerprints are added in a real workload might be sufficient for ensuring the cache eventually gets resized appropriately. But in the off chance where none are inserted between a change in capacity and a retrieval of fingerprint ids, are we okay with the number of returned ids being greater than the cache's newly set capacity? I'm also not sure how often sessions become idle for extended periods of time in practice.


pkg/util/timeutil/stopwatch.go line 90 at r2 (raw file):

Previously, matthewtodd (Matthew Todd) wrote…

nit: Would it feel more idiomatic to reverse the order of these return parameters, like the v, ok when getting from a map?

Ah good suggestions! Done. I also changed the name of this function to more accurately reflect what it returns.


pkg/sql/logictest/testdata/logic_test/crdb_internal line 371 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

i feel like this might be less disruptive if the new column were added to the end, so the column ordering doesn't change as much. (would affect any users who are doing select * queries)

Done.

Copy link
Contributor

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not much to add on top of what others said aside from a lingering merge conflict artifact. :lgtm: otherwise!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @abarganier, @matthewtodd, @rafiss, and @xinhaoz)


docs/generated/settings/settings-for-tenants.txt line 276 at r3 (raw file):

sql.ttl.default_select_batch_size	integer	500	default amount of rows to select in a single query during a TTL job
sql.ttl.job.enabled	boolean	true	whether the TTL job is enabled
<<<<<<< HEAD

nit: Lingering artifacts from merge conflict?

Code quote:

<<<<<<< HEAD
=======
sql.ttl.range_batch_size	integer	100	amount of ranges to fetch at a time for a table during the TTL job
sql.txn_fingerprint_id_cache.capacity	integer	100	the maximum number of txn fingerprint IDs stored
>>>>>>> 4dd1f55533 (server, sql: surface session txnCount, txn fingerprints, active time)

Partially addresses cockroachdb#74257.

Previously, the status server did not provide session details such as
total number of transactions executed, transaction fingerprint
IDs, and total active time. This change adds the aforementioned session
details to the `serverpb.Session` struct.

To track recently executed transaction fingerprint IDs, a FIFO cache
`TxnFingerprintIDCache` is introduced with its corresponding cluster
setting `TxnFingerprintIDBufferCapacity` to control the capacity. The
default capacity is set at 100 fingerprints.

The total number of transactions executed is filled using the existing
`txnCounter` from the `extraTxnState` in `connExecutor`. The total active
time is calculated by introducing a `timeutil.StopWatch` to the connection
executor, which is started and stopped when a transaction is started and
finished respectively.

Release note (api change): the `serverpb.Session` struct now has three
new fields: number of transactions executed, transaction fingerprint
IDs, and total active time.
@xinhaoz
Copy link
Member Author

xinhaoz commented Jun 28, 2022

Thanks everyone for the reviews!
bors r+

@craig
Copy link
Contributor

craig bot commented Jun 28, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jun 28, 2022

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jun 28, 2022

Build failed (retrying...):

@craig craig bot merged commit 5541cf8 into cockroachdb:master Jun 28, 2022
@craig
Copy link
Contributor

craig bot commented Jun 28, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants