sql,kv: add compact_engine_span for internal use #57709

sumeerbhola · 2020-12-08T18:55:30Z

This sql builtin can be used to compact the key span for a
running node, given the (nodeID, storeID, start_key, end_key).

Informs #26068

Release note: None

cockroach-teamcity · 2020-12-08T18:55:37Z

This change is

tbg

Reviewed 12 of 12 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)

pkg/kv/kvserver/api.proto, line 83 at r1 (raw file):

message CompactEngineSpanRequest {
  StoreRequestHeader header = 1 [(gogoproto.nullable) = false, (gogoproto.embed) = true];
  roachpb.Span span = 2 [(gogoproto.nullable) = false, (gogoproto.embed) = true];

I don't think this should be embedded, as it makes the methods of Span available on CompactEngineSpanRequest. We typically only embed the headers in these requests.

pkg/sql/planner.go, line 725 at r1 (raw file):

	ctx context.Context, nodeID int32, storeID int32, rangeID int64,
) error {
	conn, err := p.execCfg.DistSender.NodeDialer().Dial(ctx, roachpb.NodeID(nodeID), rpc.DefaultClass)

	if !p.ExecCfg().Codec.ForSystemTenant() {
		return errorutil.UnsupportedWithMultiTenancy(errorutil.FeatureNotAvailableToNonSystemTenantsIssue)
	}

The RPC would fail, but this returns a better error.

pkg/sql/planner.go, line 736 at r1 (raw file):

		},
		Span: roachpb.Span{
			// TODO: ask about conversion from RangeID to the span. We need

I think this is really a question from someone from #sql-execution. You (or something you call) has to perform a meta2 scan to pick out the right range descriptor but this is a builtin; I don't know if it gets to do any of that stuff (we have a distSender so we could do it all manually, but that seems wrong). Ask them about what your options are, but my intuition is that the right thing to do is to take a start and end key in compact_engine_span and to call it via

select crdb_internal.compact_engine_span(1 /* nodeID */, 2 /* storeID */, start_key::string, end_key::string) from crdb_internal.ranges_no_leases where range_id = 37;

You could similary target all nodes by deriving the nodeID and storeID from crdb_internal.kv_store_status.{node_id,store_id} to, for example, target all stores in the cluster.

It would be nice to wrap the above expression into a short-hand again, but I don't know if we have the facilities for that. Again, a good question for #sql-execution, but even if we don't, the above is good enough to be pasted into a run book.

pkg/sql/sem/builtins/builtins.go, line 4545 at r1 (raw file):

				return tree.DBoolTrue, nil
			},
			Info:       "This function is used only by CockroachDB's developers for restoring engine health.",

Still give this a real description. After all, internal users like documentation, too.

sumeerbhola

TFTR!
I've addressed the comments.

Regarding tests:

I've added sql logictests.
I've registered the server in client_test.go. I couldn't find a test for CollectChecksum, so didn't have a pattern to follow. Should I be using multiTestConfig.nodeDialerlike TestStoreRangeWaitForApplication does, or is the logictest sufficient since it exercises the end-to-end path?

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @tbg)

pkg/kv/kvserver/api.proto, line 83 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I don't think this should be embedded, as it makes the methods of Span available on CompactEngineSpanRequest. We typically only embed the headers in these requests.

Done

pkg/sql/planner.go, line 725 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

	if !p.ExecCfg().Codec.ForSystemTenant() {
		return errorutil.UnsupportedWithMultiTenancy(errorutil.FeatureNotAvailableToNonSystemTenantsIssue)
	}

The RPC would fail, but this returns a better error.

Done. I noticed there are two ExecutorConfig objects I can get to: planner.execCfg and planner.extendedEvalCtx.ExecCfg -- the latter is returned by planner.ExecCfg(). Do you know the difference?

pkg/sql/planner.go, line 736 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I think this is really a question from someone from #sql-execution. You (or something you call) has to perform a meta2 scan to pick out the right range descriptor but this is a builtin; I don't know if it gets to do any of that stuff (we have a distSender so we could do it all manually, but that seems wrong). Ask them about what your options are, but my intuition is that the right thing to do is to take a start and end key in compact_engine_span and to call it via
select crdb_internal.compact_engine_span(1 /* nodeID */, 2 /* storeID */, start_key::string, end_key::string) from crdb_internal.ranges_no_leases where range_id = 37;
You could similary target all nodes by deriving the nodeID and storeID from crdb_internal.kv_store_status.{node_id,store_id} to, for example, target all stores in the cluster.

It would be nice to wrap the above expression into a short-hand again, but I don't know if we have the facilities for that. Again, a good question for #sql-execution, but even if we don't, the above is good enough to be pasted into a run book.

Using crdb_internal.ranges_no_leases is perfect.

pkg/sql/sem/builtins/builtins.go, line 4545 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Still give this a real description. After all, internal users like documentation, too.

Done.

tbg

I think the logic tests do a good enough job at verifying that the plumbing works (that's pretty much all they do).
We don't want to write new tests using multiTestContext (it's a relic that's unfortunately proven hard to get rid of). Really the only additional thing that I could see you test is whether calling into the server with a compaction request actually does trigger the compaction. This seems hard to do above the pebble level unless we're fine with a mostly tautological test - you could add a testing knob that intercepts the inputs to CompactEngineSpan before calling CompactEngine and it could at least check that the key spans don't get mangled (when called from SQL). I don't know if that's worth it, once we get rid of the hex encoding.

Reviewed 11 of 11 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @sumeerbhola, and @tbg)

pkg/sql/planner.go, line 725 at r1 (raw file):

Previously, sumeerbhola wrote…

Done. I noticed there are two ExecutorConfig objects I can get to: planner.execCfg and planner.extendedEvalCtx.ExecCfg -- the latter is returned by planner.ExecCfg(). Do you know the difference?

I'm not sure, you'd have to ask someone from SQL. I think using p.ExecCfg() is likely to give you the right one but it can't hurt to ask for a clarifying comment on the fields if there aren't any yet.

pkg/sql/logictest/testdata/logic_test/crdb_internal, line 647 at r2 (raw file):

query TTB colnames
SELECT start_pretty, end_pretty,
 crdb_internal.compact_engine_span(1, 1, to_hex(start_key), to_hex(end_key))

would need to change if the hex stuff went away.

pkg/sql/sem/builtins/builtins.go, line 4542 at r2 (raw file):

				nodeID := int32(tree.MustBeDInt(args[0]))
				storeID := int32(tree.MustBeDInt(args[1]))
				startKey, err := hex.DecodeString(string(tree.MustBeDString(args[2])))

Rather than baking the hex encoding in here, the caller could use the decode builtin if they wished to specify the inputs in hex, or even just use the \x encoding (https://www.postgresql.org/docs/9.1/datatype-binary.html).
I don't have any particular rule to point to to justify this, but I think the method should take the data in the representation that is most natural to it, and a "string that will be assumed to be hex encoded" doesn't seem to qualify, especially since we assume that the keys are mostly provided by crdb_internal.ranges_no_leases.

pkg/sql/sem/builtins/builtins.go, line 4560 at r2 (raw file):

				"end keys are hex encoded. To get the start and end keys for a particular rangeID, one " +
				"can do: SELECT to_hex(start_key), to_hex(end_key) FROM crdb_internal.ranges_no_leases " +
				"WHERE range_id=<value>. The compaction is run synchronously, so this function may take " +

Would need an update if hex stuff were to go away.

pkg/sql/sem/tree/eval.go, line 3070 at r2 (raw file):

	// CompactEngineSpan is used to compact an engine key span at the given
	// (nodeID, storeID). The parameter list here will be replaced by a struct
	// union when we add more overloads to the compact_span builtin.

Do you think the list will realistically be replaced? I don't think it needs to (but there may be a gap in my understanding).

sumeerbhola

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @sumeerbhola, and @tbg)

pkg/sql/planner.go, line 725 at r1 (raw file):

I'm not sure, you'd have to ask someone from SQL.

Yahor confirmed that they are the same.

pkg/sql/sem/builtins/builtins.go, line 4542 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Rather than baking the hex encoding in here, the caller could use the decode builtin if they wished to specify the inputs in hex, or even just use the \x encoding (https://www.postgresql.org/docs/9.1/datatype-binary.html).
I don't have any particular rule to point to to justify this, but I think the method should take the data in the representation that is most natural to it, and a "string that will be assumed to be hex encoded" doesn't seem to qualify, especially since we assume that the keys are mostly provided by crdb_internal.ranges_no_leases.

Thanks for the link -- I'd been hunting for a spec for the encoding we emit bytes in, and this clarifies things.

When I run SELECT start_key, end_key from crdb_internal.ranges_no_leases; the keys are printed in the bytea escape format which is discouraged in https://www.postgresql.org/docs/current/datatype-binary.html#id-1.5.7.12.10.
And doing encode(start_key, 'hex') is the same as to_hex(start_key) -- I can change the comment here to say the former to be more Postgres compliant, though since this is an internal facing function the conciseness of to_hex may be preferable. And lex.DecodeRawBytesToByteArray is calling hex.DecodeString just as the code here does. If you prefer I can use lex.DecodeRawBytesToByteArray so that there is no decoding code divergence over time.
So I am still leaning towards hex. If this function were to take either hex or escape format it would also need to take a format string, which seems unnecessary flexibility.
Thoughts?

pkg/sql/sem/tree/eval.go, line 3070 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Do you think the list will realistically be replaced? I don't think it needs to (but there may be a gap in my understanding).

I assumed we would eventually want something that iterated over all nodes. I've adjusted the comment.
Punching out of the builtin environment in this way is a hack, so all I wanted to state here was that we shouldn't start adding more methods here.

tbg

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @sumeerbhola, and @tbg)

pkg/sql/sem/builtins/builtins.go, line 4542 at r2 (raw file):

Previously, sumeerbhola wrote…

Thanks for the link -- I'd been hunting for a spec for the encoding we emit bytes in, and this clarifies things.

When I run SELECT start_key, end_key from crdb_internal.ranges_no_leases; the keys are printed in the bytea escape format which is discouraged in https://www.postgresql.org/docs/current/datatype-binary.html#id-1.5.7.12.10.
And doing encode(start_key, 'hex') is the same as to_hex(start_key) -- I can change the comment here to say the former to be more Postgres compliant, though since this is an internal facing function the conciseness of to_hex may be preferable. And lex.DecodeRawBytesToByteArray is calling hex.DecodeString just as the code here does. If you prefer I can use lex.DecodeRawBytesToByteArray so that there is no decoding code divergence over time.
So I am still leaning towards hex. If this function were to take either hex or escape format it would also need to take a format string, which seems unnecessary flexibility.
Thoughts?

I'm not sure we're on the same page yet. My point was that it strikes me as unidiomatic to have a builtin that takes hex-encoded strings instead of just raw bytes. If you dropped all of the hex encoding stuff and just used select crdb_internal.compact_engine_span(1, 1, start_key, end_key) from crdb_internal.ranges_no_leases, does that not work?

If you had a hex key and wanted to pass that to the method (say you poked at the engine manually and so all you have is a copied hex string) you could use

select crdb_internal.compact_engine_span(1, 1, decode('deadbeef', 'hex'), decode('feadbeef', 'hex'))

so I don't understand why you want to force the hex encoding into the builtin.

I also don't understand why it matters how the keys are printed when you select from crdb_internal.ranges_no_leases. They are raw byte strings, surely they won't pretty-print nicely, but why is that an issue? Feels like I'm missing something.

sumeerbhola

I don't know if that's worth it, once we get rid of the hex encoding.

I didn't bother adding more tests.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @tbg)

pkg/sql/logictest/testdata/logic_test/crdb_internal, line 647 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

would need to change if the hex stuff went away.

Done.

pkg/sql/sem/builtins/builtins.go, line 4542 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I'm not sure we're on the same page yet. My point was that it strikes me as unidiomatic to have a builtin that takes hex-encoded strings instead of just raw bytes. If you dropped all of the hex encoding stuff and just used select crdb_internal.compact_engine_span(1, 1, start_key, end_key) from crdb_internal.ranges_no_leases, does that not work?

If you had a hex key and wanted to pass that to the method (say you poked at the engine manually and so all you have is a copied hex string) you could use
select crdb_internal.compact_engine_span(1, 1, decode('deadbeef', 'hex'), decode('feadbeef', 'hex'))
so I don't understand why you want to force the hex encoding into the builtin.

I also don't understand why it matters how the keys are printed when you select from crdb_internal.ranges_no_leases. They are raw byte strings, surely they won't pretty-print nicely, but why is that an issue? Feels like I'm missing something.

My thinking was fuzzy -- I mixed up the printing aspect and jumped to an assumption that a user would need to pass in a string and not bytea. Didn't realize that decode bridges that gap.
Done.

pkg/sql/sem/builtins/builtins.go, line 4560 at r2 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Would need an update if hex stuff were to go away.

Done.

tbg

Looking forward to seeing this solve problems in the wild!

Reviewed 6 of 6 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal, @sumeerbhola, and @tbg)

pkg/sql/logictest/testdata/logic_test/crdb_internal, line 649 at r3 (raw file):

SELECT start_pretty, end_pretty, to_hex(start_key), to_hex(end_key),
 crdb_internal.compact_engine_span(1, 1, start_key, end_key)
FROM crdb_internal.ranges_no_leases WHERE table_name = 'foo' LIMIT 1

This seems unnecessary prone to needing to change when the assigned tableID changes. I like how this test documents the ultimate intended usage (rather than just calling with \xff and \xff\xff or something like that), but I'd prefer thinning this down so that the ID doesn't leak into the file:

SELECT crdb_internal.compact_engine_span(1, 1, start_key, end_key)
FROM crdb_internal.ranges_no_leases WHERE table_name = 'foo' LIMIT 1

pkg/sql/logictest/testdata/logic_test/crdb_internal, line 654 at r3 (raw file):

/Table/56/1/1  /Table/56/1/2  c08989  c0898a  true

# Failed compaction due to unknown node.

How about one with unknown store as well?

sumeerbhola

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @tbg)

pkg/sql/logictest/testdata/logic_test/crdb_internal, line 649 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

This seems unnecessary prone to needing to change when the assigned tableID changes. I like how this test documents the ultimate intended usage (rather than just calling with \xff and \xff\xff or something like that), but I'd prefer thinning this down so that the ID doesn't leak into the file:
SELECT crdb_internal.compact_engine_span(1, 1, start_key, end_key)
FROM crdb_internal.ranges_no_leases WHERE table_name = 'foo' LIMIT 1

The test uses the same key again in the decode-based call below, rather than make up a fictional one, so if we want to keep that consistent with this, we would need to leak the tableID.

pkg/sql/logictest/testdata/logic_test/crdb_internal, line 654 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

How about one with unknown store as well?

Done

tbg

Reviewed 1 of 1 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)

pkg/sql/logictest/testdata/logic_test/crdb_internal, line 649 at r3 (raw file):

Previously, sumeerbhola wrote…

The test uses the same key again in the decode-based call below, rather than make up a fictional one, so if we want to keep that consistent with this, we would need to leak the tableID.

I don't feel strongly either way (the next person shuffling the tableIDs around might) but there's no reason imo why the below example needs to line up with the above. In fact you could just hard-code a and b because all you care about is getting a nonempty span.

Speaking of which, that might be something to test, what happens if you pass end <= start?

sumeerbhola

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @tbg)

pkg/sql/logictest/testdata/logic_test/crdb_internal, line 649 at r3 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I don't feel strongly either way (the next person shuffling the tableIDs around might) but there's no reason imo why the below example needs to line up with the above. In fact you could just hard-code a and b because all you care about is getting a nonempty span.

Speaking of which, that might be something to test, what happens if you pass end <= start?

I changed it to not output the keys, and added a test with an invalid range. There was no error being generated by Pebble -- a quick look at the code suggests that it just turning into a trivially empty range. So I added some checking in CompactEngineSpan.

tbg

Reviewed 2 of 2 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)

pkg/sql/planner.go, line 736 at r5 (raw file):

	end := roachpb.Key(endKey)
	if start.Compare(end) >= 0 {
		return errors.Errorf("start %s is not less than end %s", start, end)

Wouldn't it be more idiomatic to check this when it's handed to pebble? After all, there can be multiple code paths calling into pebble compactions (in theory, maybe even practice?!) and they all benefit from this check.

tbg

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)

pkg/sql/planner.go, line 736 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Wouldn't it be more idiomatic to check this when it's handed to pebble? After all, there can be multiple code paths calling into pebble compactions (in theory, maybe even practice?!) and they all benefit from this check.

(i.e. inside of pebble, I mean)

We use the user keys to construct InternalKeys that are (start, InternalKeySeqNumMax, InternalKeyKindMax) and (end, 0, 0) so it is not meaningful to have start >= end since it represents an empty interval. The current behavior did actual work because of how the memtable overlap code is written. This came up in the code review for cockroachdb/cockroach#57709

sumeerbhola

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @tbg)

pkg/sql/planner.go, line 736 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

(i.e. inside of pebble, I mean)

Done

tbg

Reviewed 5 of 5 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal)

This sql builtin can be used to compact the key span for a running node, given the (nodeID, storeID, start_key, end_key). Informs cockroachdb#26068 Release note: None

sumeerbhola · 2020-12-23T23:53:35Z

bors r+

craig · 2020-12-24T01:31:26Z

Build succeeded:

GitHub CI (Cockroach)

sumeerbhola requested a review from tbg December 8, 2020 18:55

sumeerbhola requested a review from a team as a code owner December 8, 2020 18:55

sumeerbhola force-pushed the compact_range branch 2 times, most recently from 5b46f22 to c8a2110 Compare December 8, 2020 20:08

tbg reviewed Dec 9, 2020

View reviewed changes

sumeerbhola force-pushed the compact_range branch from c8a2110 to 23c8307 Compare December 11, 2020 16:15

sumeerbhola changed the title ~~[WIP] sql,kv: add compact_engine_span for internal use~~ sql,kv: add compact_engine_span for internal use Dec 11, 2020

sumeerbhola force-pushed the compact_range branch from 23c8307 to cca8405 Compare December 11, 2020 16:17

sumeerbhola requested a review from itsbilal December 11, 2020 16:19

sumeerbhola commented Dec 11, 2020

View reviewed changes

tbg reviewed Dec 15, 2020

View reviewed changes

tbg self-requested a review December 15, 2020 14:16

sumeerbhola force-pushed the compact_range branch from cca8405 to 6cae819 Compare December 15, 2020 21:06

sumeerbhola commented Dec 15, 2020

View reviewed changes

tbg reviewed Dec 16, 2020

View reviewed changes

tbg self-requested a review December 16, 2020 08:51

sumeerbhola force-pushed the compact_range branch from 6cae819 to 3dfe108 Compare December 17, 2020 01:17

sumeerbhola commented Dec 17, 2020

View reviewed changes

tbg approved these changes Dec 17, 2020

View reviewed changes

tbg self-requested a review December 17, 2020 09:19

sumeerbhola force-pushed the compact_range branch from 3dfe108 to 6d2db62 Compare December 17, 2020 15:44

sumeerbhola commented Dec 17, 2020

View reviewed changes

tbg reviewed Dec 17, 2020

View reviewed changes

sumeerbhola force-pushed the compact_range branch from 6d2db62 to d92373b Compare December 17, 2020 16:56

sumeerbhola commented Dec 17, 2020

View reviewed changes

tbg approved these changes Dec 18, 2020

View reviewed changes

tbg reviewed Dec 18, 2020

View reviewed changes

sumeerbhola mentioned this pull request Dec 18, 2020

db: return error if manual compaction start >= end cockroachdb/pebble#1026

Merged

sumeerbhola force-pushed the compact_range branch from d92373b to 3755a95 Compare December 23, 2020 21:10

sumeerbhola commented Dec 23, 2020

View reviewed changes

tbg approved these changes Dec 23, 2020

View reviewed changes

sql,kv: add compact_engine_span for internal use

40029d1

This sql builtin can be used to compact the key span for a running node, given the (nodeID, storeID, start_key, end_key). Informs cockroachdb#26068 Release note: None

sumeerbhola force-pushed the compact_range branch from 3755a95 to 40029d1 Compare December 23, 2020 22:21

craig bot merged commit 7e604ec into cockroachdb:master Dec 24, 2020

jbowens mentioned this pull request Apr 12, 2022

sql: add crdb_internal.encode_key variant that doesn't require a row #79863

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql,kv: add compact_engine_span for internal use #57709

sql,kv: add compact_engine_span for internal use #57709

sumeerbhola commented Dec 8, 2020 •

edited

Loading

cockroach-teamcity commented Dec 8, 2020

tbg left a comment

sumeerbhola left a comment

tbg left a comment

sumeerbhola left a comment

tbg left a comment

sumeerbhola left a comment

tbg left a comment

sumeerbhola left a comment

tbg left a comment

sumeerbhola left a comment

tbg left a comment

tbg left a comment

sumeerbhola left a comment

tbg left a comment

sumeerbhola commented Dec 23, 2020

craig bot commented Dec 24, 2020

sql,kv: add compact_engine_span for internal use #57709

sql,kv: add compact_engine_span for internal use #57709

Conversation

sumeerbhola commented Dec 8, 2020 • edited Loading

cockroach-teamcity commented Dec 8, 2020

tbg left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

sumeerbhola left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

sumeerbhola commented Dec 23, 2020

craig bot commented Dec 24, 2020

sumeerbhola commented Dec 8, 2020 •

edited

Loading