Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: introduce crdb_internal.cluster_locks virtual table #77876

Merged
merged 2 commits into from
Apr 23, 2022

Conversation

AlexTalks
Copy link
Contributor

@AlexTalks AlexTalks commented Mar 16, 2022

This change introduces the new virtual table
crdb_internal.cluster_locks, as proposed in the KV Lock Observability
RFC (#75541), to expose lock contention. This virtual table displays
the locks currently tracked in the lock tables in ranges across a
cluster, utilizing the KV QueryLocksRequest API to gather information
on the lock holders as well as the operations waiting on the locks
before converting to a user-friendly SQL view that incorporates
information about the database, table, schema, and index.

For example,

root@localhost:26261/defaultdb> select range_id, table_name, lock_key_pretty, txn_id, ts, lock_strength, durability, granted, contended, duration from crdb_internal.cluster_locks;
  range_id | table_name |     lock_key_pretty      |                txn_id                |             ts             | lock_strength | durability | granted | contended |    duration
-----------+------------+--------------------------+--------------------------------------+----------------------------+---------------+------------+---------+-----------+------------------
       235 | kv         | /Table/115/1/"alex"/0    | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.916745 | Exclusive     | Replicated |  true   |   true    | 00:01:50.607386
       235 | kv         | /Table/115/1/"alex"/0    | d71ea8eb-21b9-48a0-98b6-82eed84ed76f | 2022-03-26 00:24:02.137125 | None          | Replicated |  false  |   true    | 00:01:50.607376
       235 | kv         | /Table/115/1/"bob"/0     | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.916745 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607384
       235 | kv         | /Table/115/1/"chris"/0   | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.916745 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607383
       255 | kv         | /Table/115/1/"lauren"/0  | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   true    | 00:01:50.607757
       255 | kv         | /Table/115/1/"lauren"/0  | d71ea8eb-21b9-48a0-98b6-82eed84ed76f | 2022-03-26 00:24:02.137125 | None          | Replicated |  false  |   true    | 00:01:50.60773
       255 | kv         | /Table/115/1/"marilyn"/0 | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607755
       255 | kv         | /Table/115/1/"mike"/0    | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607754
       255 | kv         | /Table/115/1/"nancy"/0   | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607745
       255 | kv         | /Table/115/1/"noah"/0    | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607744
       255 | kv         | /Table/115/1/"paul"/0    | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607744
       255 | kv         | /Table/115/1/"peter"/0   | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607743
       256 | kv         | /Table/115/1/"sam"/0     | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.740235 | Exclusive     | Replicated |  true   |   true    | 00:01:50.607582
       256 | kv         | /Table/115/1/"sam"/0     | d71ea8eb-21b9-48a0-98b6-82eed84ed76f | 2022-03-26 00:24:02.137125 | None          | Replicated |  false  |   true    | 00:01:50.607573
       256 | kv         | /Table/115/1/"thomas"/0  | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.740235 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607581
       256 | kv         | /Table/115/1/"zebra"/0   | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.740235 | Exclusive     | Replicated |  true   |   false   | 00:01:50.60758
(16 rows)

This internal table is usable for system tenants as well as secondary
tenants. It is also useful in conjunction with other
crdb_internal.cluster_* tables used for observability, and also can be
joined with itself to visualize blocked operations, and the transactions
that are blocking them. This feature is gated on v22.1, to ensure that
older nodes will not receive kv.Batch requests with
QueryLocksRequests in them.

Closes #74834

Release justification: Low risk, high benefit change.

Release note (sql change): introducing the crdb_internal.cluster_locks
virtual table, which exposes the current state of locks on keys tracked
by concurrency control. The virtual table displays metadata on locks
currently held by transactions, as well as operations waiting to obtain
the locks, and as such can be used to visualize active contention.
The VIEWACTIVITY or VIEWACTIVITYREDACTED role option, or the
admin role, is required to access the virtual table; however, if the
user only has the VIEWACTIVITYREDACTED role option, the key on which
a lock is held will be redacted.

@AlexTalks AlexTalks requested review from a team March 16, 2022 02:49
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@kevin-v-ngo
Copy link

FYI - @xinhaoz

Copy link
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/crdb_internal.go, line 5751 at r1 (raw file):

			// TODO(sarkesian): set key limits correctly, currently 5 used for testing.
			b.Header.MaxSpanRequestKeys = 5

@yuzefovich FYI - this is the logic I was talking about. If it's possible to incorporate a limit hint into a request to a virtual table, definitely let me know!

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @yuzefovich)


pkg/sql/crdb_internal.go, line 5751 at r1 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

@yuzefovich FYI - this is the logic I was talking about. If it's possible to incorporate a limit hint into a request to a virtual table, definitely let me know!

I see. Unfortunately, without a lot of plumbing we probably cannot push any limits / hints from the optimizer into this scope, so I'd just use the defaults we use elsewhere (I think 100k keys limit and 10MiB bytes limit).

Copy link
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/crdb_internal.go, line 5751 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I see. Unfortunately, without a lot of plumbing we probably cannot push any limits / hints from the optimizer into this scope, so I'd just use the defaults we use elsewhere (I think 100k keys limit and 10MiB bytes limit).

OK - got it. I've just used the static constant values here.

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed during team meeting, there's one crdb_ too many in the name.

But the one thing that worries me a bit more is that the code currently does not paginate: all the results are accumulated in RAM before the generator function is returned.

Usually, a vtable generator is so written that the underlying results are pulled incrementally as the rows get generated. This ensures we don't get more data in RAM than is consumed from the vtable, and also ensures the memory accounting for the result rows is more-or-less aligned with that of the data source.
Here, the current implementation model breaks that. so at the very least the code should do its own memory accounting. And I'd encourage you to see how to make this code more "streaming-like".

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)

@ajwerner
Copy link
Contributor

This thing would really benefit from #74806. I'd go so far as to say we should try to get that issue scheduled if only to benefit features like this.

Copy link
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knz I've updated this a bit more so that a) it is more clear how the pagination works, and only one page of locks returned from QueryLocksRequest should be kept in memory at one time, and b) we query the correct spans, rather than simply TableDataMin-TableDataMax as before (which wouldn't support tenants).

Unless I'm unclear about how the generator works, I believe this should only perform a request when necessary - let me know if this is not correct.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)

@AlexTalks AlexTalks changed the title sql: introduce crdb_internal.crdb_locks virtual table sql: introduce crdb_internal.cluster_locks virtual table Mar 24, 2022
Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you have successfully addressed my concern. I'd welcome some additional explanatory comments (also for our future selves) though.

Reviewed 1 of 2 files at r1, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks)


-- commits, line 10 at r3:
nit: be sure to expand this commit message a bit before this merges.


pkg/sql/crdb_internal.go, line 3110 at r3 (raw file):

	map[uint32]uint32,
) {
	// TODO(knz): maybe this could use internalLookupCtx.

Can you talk with folk on the SQL schema team to provide suggestions.


pkg/sql/crdb_internal.go, line 5688 at r3 (raw file):

		expensive operation since it creates a cluster-wide RPC-fanout.`,
	schema: `
CREATE TABLE crdb_internal.crdb_locks (

are you still sure about this name?


pkg/sql/crdb_internal.go, line 5740 at r3 (raw file):

		var resumeSpan *roachpb.Span

		fetchLocks := func(key, endKey roachpb.Key) error {

I would welcome a few explanatory comments through this function to explain what is going on and how the code ensures the results are paginated.

Copy link
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @knz)


pkg/sql/crdb_internal.go, line 3110 at r3 (raw file):

Previously, knz (kena) wrote…

Can you talk with folk on the SQL schema team to provide suggestions.

Will do!


pkg/sql/crdb_internal.go, line 5688 at r3 (raw file):

Previously, knz (kena) wrote…

are you still sure about this name?

Very strange, I thought I had pushed this with my updated commit that changes the name to cluster_locks and also supports the secondary tenant use case. Will double check.


pkg/sql/crdb_internal.go, line 5740 at r3 (raw file):

Previously, knz (kena) wrote…

I would welcome a few explanatory comments through this function to explain what is going on and how the code ensures the results are paginated.

I believe I've added some in my updated commit, hopefully that will help explain a bit.

@AlexTalks AlexTalks force-pushed the crdb_locks_vtable branch 2 times, most recently from f1fb660 to 3c5cd30 Compare March 25, 2022 00:23
@yuzefovich
Copy link
Member

I just want to point out that it seems that we want to get this into 22.1 release, but this PR bumps the cockroach version. Once we cut the first 22.1.0-beta, I think bumps are no longer allowed, so we either have to slowdown the release train or this feature will have to wait for 22.2. The latter outcome seems rather unfortunate, so it'd be good to communicate with the release team, probably opening an issue with release-blocker (not GA-blocker) label would be good.

@RichardJCai
Copy link
Contributor

I just want to point out that it seems that we want to get this into 22.1 release, but this PR bumps the cockroach version. Once we cut the first 22.1.0-beta, I think bumps are no longer allowed, so we either have to slowdown the release train or this feature will have to wait for 22.2. The latter outcome seems rather unfortunate, so it'd be good to communicate with the release team, probably opening an issue with release-blocker (not GA-blocker) label would be good.

Alternatively we could probably get away with not having the new version / version gate here. It seems like we're adding the version gate out of caution since it is an internal table.

@AlexTalks
Copy link
Contributor Author

I just want to point out that it seems that we want to get this into 22.1 release, but this PR bumps the cockroach version. Once we cut the first 22.1.0-beta, I think bumps are no longer allowed, so we either have to slowdown the release train or this feature will have to wait for 22.2. The latter outcome seems rather unfortunate, so it'd be good to communicate with the release team, probably opening an issue with release-blocker (not GA-blocker) label would be good.

Alternatively we could probably get away with not having the new version / version gate here. It seems like we're adding the version gate out of caution since it is an internal table.

The version gate is primarily to avoid sending QueryLocksRequest RPCs to nodes that don't support it (<22.1). If this is not necessary, that's definitely fine and I can remove it!

@ajwerner
Copy link
Contributor

The version gate is primarily to avoid sending QueryLocksRequest RPCs to nodes that don't support it (<22.1). If this is not necessary, that's definitely fine and I can remove it!

It sounds necessary to me.

@yuzefovich
Copy link
Member

The version gate sounds necessary to me too.

@AlexTalks AlexTalks requested a review from a team as a code owner March 26, 2022 00:33
Copy link
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @knz, and @yuzefovich)


pkg/clusterversion/cockroach_versions.go, line 550 at r6 (raw file):

	},
	{
		Key:     ClusterLocksVirtualTable,

@rafiss @RichardJCai please let me know if this bit is correct or should be removed!

Copy link
Contributor

@Azhng Azhng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r5, 4 of 14 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @Azhng, @knz, and @yuzefovich)


pkg/sql/crdb_internal.go, line 3108 at r7 (raw file):

	map[uint32]map[uint32]string,
	map[uint32]uint32,
	map[uint32]uint32,

small nit: do you mind name the return variables of this function and the one below?


pkg/sql/crdb_internal.go, line 5718 at r7 (raw file):

			return nil, nil, err
		}
		if !hasViewActivityOrViewActivityRedacted {

When user have ViewActivityRedacted permission, we should ensure that they cannot view any PII information. That means we probably should omit lock_key and lock_key_pretty column.


pkg/sql/crdb_internal.go, line 5783 at r7 (raw file):

			err := p.txn.Run(ctx, &b)
			if err == nil {

I feel we usually invert this conditions, so the error handling is closer to the call site e.g.

if err != nil {
  return

resp = b.RawResponse().Responses[0].GetQueryLocks()
locks = resp.Locks
resumeSpan = resp.ResumeSpan

pkg/sql/crdb_internal.go, line 5813 at r7 (raw file):

					lockIdx = 0
				}
			}

preference nit: perhaps refactor the for loop into something like maybeFetchLockInfo() 🤔


pkg/sql/crdb_internal.go, line 5832 at r7 (raw file):

		// each waiter, prior to moving onto the next lock (or fetching additional
		// results as necessary).
		return func() (tree.Datums, error) {

I think the patterns wrap the returning virtualTableGenerator using setupGenerator(). This would allow the generator to run in a separate goroutine and allows for graceful query cancellation.


pkg/sql/crdb_internal.go, line 5838 at r7 (raw file):

			}

			// If we couldn't get any more locks from getNextLock(), we are finished

nit: s/are finished/have finished/

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @Azhng, @knz, @rafiss, @RichardJCai, and @yuzefovich)


pkg/clusterversion/cockroach_versions.go, line 550 at r6 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

@rafiss @RichardJCai please let me know if this bit is correct or should be removed!

this part lgtm!

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's cool to see this! Have you thought about how you'd like to test it? The behavior should be stable enough to exercise in a logic test.

Reviewed 1 of 4 files at r5, 14 of 14 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @Azhng, @knz, and @RichardJCai)


-- commits, line 60 at r7:
Reminder to fill in this release note.


pkg/sql/crdb_internal.go, line 3108 at r7 (raw file):

Previously, Azhng (Archer Zhang) wrote…

small nit: do you mind name the return variables of this function and the one below?

+1, and could you add a comment to this function describing what it does?


pkg/sql/crdb_internal.go, line 5704 at r7 (raw file):

);`,
	indexes: nil,
	generator: func(ctx context.Context, p *planner, db catalog.DatabaseDescriptor, stopper *stop.Stopper) (virtualTableGenerator, cleanupFunc, error) {

Do you think it would be worthwhile to enumerate the set of virtualIndexs we plan to define on this table?

Also, consider writing this to minimize the amount of code that will need to move when we do add virtualIndexs. The flow of 1) get descriptors, 2) get spans, 3) get locks for spans is good, but we may want to extract parts of this out into reusable helpers.


pkg/sql/crdb_internal.go, line 5779 at r7 (raw file):

			b.AddRawRequest(queryLocksRequest)

			b.Header.MaxSpanRequestKeys = int64(rowinfra.ProductionKVBatchSize)

Should we be using a memory monitor here like we do in crdbInternalClusterStmtStatsTable?

@AlexTalks AlexTalks force-pushed the crdb_locks_vtable branch 3 times, most recently from 23ed76c to 96be08b Compare April 13, 2022 00:53
While previously when evaluating a `QueryLocksRequest` in the lock
table, the default was to look at the unreplicated lock holder first, we
should instead prioritize selecting the replicated lock holder as it is
of stronger durability.

Release note: None
@AlexTalks AlexTalks force-pushed the crdb_locks_vtable branch 2 times, most recently from 36f3d0f to 0c8083d Compare April 21, 2022 00:02
Copy link
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dismissed @Azhng from 3 discussions.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @Azhng, @knz, @nvanbenschoten, @rafiss, and @yuzefovich)


pkg/sql/crdb_internal.go, line 3110 at r3 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

Will do!

Done.


pkg/sql/crdb_internal.go, line 5688 at r3 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

Very strange, I thought I had pushed this with my updated commit that changes the name to cluster_locks and also supports the secondary tenant use case. Will double check.

Done.


pkg/sql/crdb_internal.go, line 5740 at r3 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

I believe I've added some in my updated commit, hopefully that will help explain a bit.

Done.


pkg/sql/crdb_internal.go, line 5779 at r7 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

I actually don't know the answer to this - do we have any guideline here?

Looking into this a bit more, I think it wouldn't hurt to have but I think we'll need to refactor this to create a worker and use setupGenerator. If possible, I'd like to do this in a follow-up PR.


pkg/sql/crdb_internal.go, line 5832 at r7 (raw file):

Previously, Azhng (Archer Zhang) wrote…

I think the patterns wrap the returning virtualTableGenerator using setupGenerator(). This would allow the generator to run in a separate goroutine and allows for graceful query cancellation.

OK. Planning to do this in a follow-up PR.

Copy link
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @Azhng, @knz, @nvanbenschoten, @rafiss, and @yuzefovich)


pkg/sql/crdb_internal.go, line 5779 at r7 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

Looking into this a bit more, I think it wouldn't hurt to have but I think we'll need to refactor this to create a worker and use setupGenerator. If possible, I'd like to do this in a follow-up PR.

Incorporated into #80216.


pkg/sql/crdb_internal.go, line 5832 at r7 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

OK. Planning to do this in a follow-up PR.

Incorporated into #80216.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: nice job on this @AlexTalks. Let's get someone from SQL to sign off as well, then we can go ahead and merge.

Reviewed 1 of 18 files at r10, 10 of 12 files at r14, 3 of 3 files at r15, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @Azhng, @knz, @nvanbenschoten, @rafiss, and @yuzefovich)


pkg/sql/crdb_internal.go, line 6019 at r15 (raw file):

			waiterIdx++

			strippedKey, _, _ := keys.DecodeTenantPrefix(curLock.Key)

nit: do we even want to run this line and the next when shouldRedactKeys?


pkg/ccl/logictestccl/testdata/logic_test/cluster_locks_tenant, line 77 at r15 (raw file):


# looking at each transaction separately, validate the expected results in the lock table
query TTTTTTBB colnames,retry

It's cool that we can write tests like these!

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 18 files at r10, 11 of 11 files at r13, 10 of 12 files at r14, 3 of 3 files at r15, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @Azhng, @knz, @nvanbenschoten, @rafiss, and @yuzefovich)


pkg/sql/crdb_internal.go, line 5840 at r15 (raw file):

	indexes: nil,
	generator: func(ctx context.Context, p *planner, db catalog.DatabaseDescriptor, stopper *stop.Stopper) (virtualTableGenerator, cleanupFunc, error) {
		if !p.ExecCfg().Settings.Version.IsActive(ctx, clusterversion.ClusterLocksVirtualTable) {

nit: add a TODO to remove this gate in 22.2 time frame.


pkg/sql/crdb_internal.go, line 5929 at r15 (raw file):

			}

			resp = b.RawResponse().Responses[0].GetQueryLocks()

nit: should we be defensive and check the length of b.RawResponse() returning an error if it's not equal to 1?


pkg/sql/crdb_internal.go, line 5983 at r15 (raw file):

			// If we couldn't get any more locks from getNextLock(), we have finished
			// generating result rows.
			if curLock == nil {

nit: it seems cleaner to explicitly add || fErr != nil check to the conditional.


pkg/sql/crdb_internal.go, line 5995 at r15 (raw file):

			if waiterIdx < 0 {
				if curLock.LockHolder != nil {
					txnIDDatum = tree.NewDUuid(tree.DUuid{UUID: curLock.LockHolder.ID})

nit: might be good to use DatumAlloc for the new datums. I guess we don't do this for other internal virtual tables, so feel free to ignore this comment.


pkg/sql/logictest/testdata/logic_test/cluster_locks, line 1 at r15 (raw file):

# LogicTest: local 5node

I'm curious about the choice of these modes. IIUC this file is separate from cluster_locks_tenant because we manually split ranges. However, we then don't move those ranges to other nodes (if we're in 5node config). My guess is that you want to verify the behavior in a distributed setting, and I don't think this file accomplishes that.

Copy link
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @AlexTalks, @Azhng, @knz, @nvanbenschoten, @rafiss, and @yuzefovich)


pkg/sql/crdb_internal.go, line 5840 at r15 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: add a TODO to remove this gate in 22.2 time frame.

OK


pkg/sql/crdb_internal.go, line 5929 at r15 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: should we be defensive and check the length of b.RawResponse() returning an error if it's not equal to 1?

OK - can add an assertion, though strangely I notice there's some other places in crdb_internal.go where we don't do this ¯_(ツ)_/¯


pkg/sql/crdb_internal.go, line 5983 at r15 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: it seems cleaner to explicitly add || fErr != nil check to the conditional.

This way fErr simply falls through, though I can add it as it doesn't hurt at all. Had always learned/thought that Go semantics sort of requires you to return (nil, non-nil err) when your return params are (*value, error).


pkg/sql/crdb_internal.go, line 6019 at r15 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: do we even want to run this line and the next when shouldRedactKeys?

OK - I suppose it's not necessary at all so if it seems cleaner that's cool.


pkg/sql/logictest/testdata/logic_test/cluster_locks, line 1 at r15 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I'm curious about the choice of these modes. IIUC this file is separate from cluster_locks_tenant because we manually split ranges. However, we then don't move those ranges to other nodes (if we're in 5node config). My guess is that you want to verify the behavior in a distributed setting, and I don't think this file accomplishes that.

Yes - I would have tried to put it in one file, but skipif only applies to statements and queries, which means I can't skipif config 3node-tenant the lines where I have let $r1; select range_id.... Also, local and 5node seemed to make sense because the test doesn't play well with the fakedist configs.

Your thought is that we should manually move these ranges around with ALTER TABLE t EXPERIMENTAL_RELOCATE...? Likely in a skipif config local statement?

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r16, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @AlexTalks, @Azhng, @knz, @nvanbenschoten, @rafiss, and @yuzefovich)


pkg/sql/crdb_internal.go, line 5983 at r15 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

This way fErr simply falls through, though I can add it as it doesn't hurt at all. Had always learned/thought that Go semantics sort of requires you to return (nil, non-nil err) when your return params are (*value, error).

Hm, I guess it's a best practice to return (nil, non-nil err), but I think I've also come across cases of returning (x, non-nil err) where x is undefined.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! :lgtm:

Note that there appear to be some more tests in need of adjustment.

Reviewed 2 of 2 files at r17, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @AlexTalks, @Azhng, @knz, @nvanbenschoten, @rafiss, and @yuzefovich)

This change introduces the new virtual table
`crdb_internal.cluster_locks`, as proposed in the KV Lock Observability
RFC (cockroachdb#75541), to expose lock contention.  This virtual table displays
the locks currently tracked in the lock tables in ranges across a
cluster, utilizing the KV `QueryLocksRequest` API to gather information
on the lock holders as well as the operations waiting on the locks
before converting to a user-friendly SQL view that incorporates
information about the database, table, schema, and index.

For example,
```
root@localhost:26261/defaultdb> select range_id, table_name, lock_key_pretty, txn_id, ts, lock_strength, durability, granted, contended, duration from crdb_internal.cluster_locks;
  range_id | table_name |     lock_key_pretty      |                txn_id                |             ts             | lock_strength | durability | granted | contended |    duration
-----------+------------+--------------------------+--------------------------------------+----------------------------+---------------+------------+---------+-----------+------------------
       235 | kv         | /Table/115/1/"alex"/0    | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.916745 | Exclusive     | Replicated |  true   |   true    | 00:01:50.607386
       235 | kv         | /Table/115/1/"alex"/0    | d71ea8eb-21b9-48a0-98b6-82eed84ed76f | 2022-03-26 00:24:02.137125 | None          | Replicated |  false  |   true    | 00:01:50.607376
       235 | kv         | /Table/115/1/"bob"/0     | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.916745 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607384
       235 | kv         | /Table/115/1/"chris"/0   | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.916745 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607383
       255 | kv         | /Table/115/1/"lauren"/0  | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   true    | 00:01:50.607757
       255 | kv         | /Table/115/1/"lauren"/0  | d71ea8eb-21b9-48a0-98b6-82eed84ed76f | 2022-03-26 00:24:02.137125 | None          | Replicated |  false  |   true    | 00:01:50.60773
       255 | kv         | /Table/115/1/"marilyn"/0 | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607755
       255 | kv         | /Table/115/1/"mike"/0    | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607754
       255 | kv         | /Table/115/1/"nancy"/0   | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607745
       255 | kv         | /Table/115/1/"noah"/0    | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607744
       255 | kv         | /Table/115/1/"paul"/0    | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607744
       255 | kv         | /Table/115/1/"peter"/0   | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.819704 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607743
       256 | kv         | /Table/115/1/"sam"/0     | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.740235 | Exclusive     | Replicated |  true   |   true    | 00:01:50.607582
       256 | kv         | /Table/115/1/"sam"/0     | d71ea8eb-21b9-48a0-98b6-82eed84ed76f | 2022-03-26 00:24:02.137125 | None          | Replicated |  false  |   true    | 00:01:50.607573
       256 | kv         | /Table/115/1/"thomas"/0  | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.740235 | Exclusive     | Replicated |  true   |   false   | 00:01:50.607581
       256 | kv         | /Table/115/1/"zebra"/0   | 7471e67b-bbc3-4dd9-a42f-43ec97949b14 | 2022-03-26 00:23:45.740235 | Exclusive     | Replicated |  true   |   false   | 00:01:50.60758
(16 rows)
```

This internal table is usable for system tenants as well as secondary
tenants.  It is also useful in conjunction with other
`crdb_internal.cluster_*` tables used for observability, and also can be
joined with itself to visualize blocked operations, and the transactions
that are blocking them.  This feature is gated on v22.1, to ensure that
older nodes will not receive `kv.Batch` requests with
`QueryLocksRequest`s in them.

Closes cockroachdb#74834

Release justification: Low risk, high benefit change.

Release note (sql change): introducing the `crdb_internal.cluster_locks`
virtual table, which exposes the current state of locks on keys tracked
by concurrency control. The virtual table displays metadata on locks
currently held by transactions, as well as operations waiting to obtain
the locks, and as such can be used to visualize active contention.
The `VIEWACTIVITY` or `VIEWACTIVITYREDACTED` role option, or the
`admin` role, is required to access the virtual table; however, if the
user only has the `VIEWACTIVITYREDACTED` role option, the key on which
a lock is held will be redacted.
@AlexTalks
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Apr 23, 2022

Build succeeded:

@craig craig bot merged commit fc11beb into cockroachdb:master Apr 23, 2022
@blathers-crl
Copy link

blathers-crl bot commented Apr 23, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from c80047e to blathers/backport-release-22.1-77876: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@AlexTalks AlexTalks deleted the crdb_locks_vtable branch April 26, 2022 00:22
craig bot pushed a commit that referenced this pull request May 14, 2022
79623: sql: add virtual indexes to crdb_internal.cluster_locks virtual table  r=AlexTalks a=AlexTalks

This change adds virtual indexes on the `table_id`, `database_name`, and
`table_name` columns of the `crdb_internal.cluster_locks` virtual table,
so that when queried, the `kv.Batch`es with `QueryLocksRequest`s can be
constrained to query only specific tables or databases. This allows the
requests to be much more limited, rather than needing to request all of
the ranges that comprise the key spans of all tables accessible by the
user.

Release note (sql change): Improved query performance for
`crdb_internal.cluster_locks` when issued with constraints in the WHERE
clause on `table_id`, `database_name`, or `table_name` columns.

Depends on #77876, #80422

80832: kvserver: IsCompleteTransaction might panic with certain batch sequences r=shralex a=shralex

It's unclear how this panic happened. One possibility is that EntTxn had a negative sequence number. Another hypothesis is that ba.Requests was concurrently mutated due to a data race. This happened once, so for now adding more info to the panic.

Release note: None

Jira issue: https://cockroachlabs.atlassian.net/browse/CRDB-14627

81190: roachtest: update activerecord adapter to v6.1.10 r=rafiss a=ecwall

refs #67893
refs #80777

This version correctly disables supports_expression_index to
prevent `ON CONFLICT expression` from appearing in generated
SQL statements.

Release note: None

81193: storage: upgrade to pebblev2 table format r=jbowens,nicktrav a=erikgrinaker

Resubmit of #76780, which was partially reverted for 22.1.

---

The `Pebblev2` SSTable format adds support for range keys. Add two new
cluster versions to provide the upgrade path - the first version for
bumping the store, the second for use as a feature gate.

Release note: None

81207: ttljob: fix a range edge case r=rafiss a=otan

See individual commits for details.

Refs: #81208

Co-authored-by: Alex Sarkesian <[email protected]>
Co-authored-by: shralex <[email protected]>
Co-authored-by: Evan Wall <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Oliver Tan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kv, sql: visualize current lock holders and waiters via sql
10 participants