-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spanconfig: modify existing RPCs to work with system span configurations #76219
spanconfig: modify existing RPCs to work with system span configurations #76219
Conversation
b8e6b1c
to
6b089a7
Compare
Rebased on top of master now that #76213 has landed. @nvanbenschoten @ajwerner, this was the PR I called out in yesterday's pod meeting -- the second commit, specifically around the encoding/decoding logic + the key span we're reserving for system span configurations was something I was hoping to get your eyes on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving some comments on the first commit, generally LGTM. I'm going to come back to the second commit in a little bit.
pkg/roachpb/span_config.proto
Outdated
TenantID SourceTenantID = 1 [(gogoproto.nullable) = false]; | ||
|
||
// TargetTenantID is the ID of the tenant that the associated system span | ||
// configuration applies to. If the host tenant is the source then the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can we reword the sentence "If the host tenant is the source then the system span config applies over all ranges in the system (including those belonging to secondary tenants)." to mention that both SourceTenantID and TargetTenantID have to be set to the host tenant ID.
pkg/roachpb/span_config.proto
Outdated
|
||
// Targets to request configurations for. The targets listed here are not | ||
// allowed to be duplicated/overlap with one another. | ||
repeated SpanConfigTarget targets = 3 [(gogoproto.nullable) = false]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this should be field with ID 2?
pkg/roachpb/span_config.proto
Outdated
// | ||
// Any system span configurations set by the tenant are also returned if | ||
// requested. | ||
repeated SpanConfigEntry span_config_entries = 2 [(gogoproto.nullable) = false]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just for my understanding, why do we need to bump the ID on this field? The wire type is still the same right?
pkg/roachpb/span_config.proto
Outdated
// ToUpsert captures the spans we want to upsert and the configs we want to | ||
// upsert with. | ||
repeated SpanConfigEntry to_upsert = 2 [(gogoproto.nullable) = false]; | ||
// To upsert captures the targets we want to upsert and the configs we want |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/To upsert/ToUpsert/g
repeated SpanConfigEntry to_upsert = 2 [(gogoproto.nullable) = false]; | ||
// To upsert captures the targets we want to upsert and the configs we want | ||
// to upsert with. | ||
repeated SpanConfigEntry to_upsert = 4 [(gogoproto.nullable) = false]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto question as above, do we need to change this ID?
pkg/spanconfig/target.go
Outdated
span roachpb.Span | ||
} | ||
|
||
func MakeTarget(t roachpb.SpanConfigTarget) Target { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: lint will probably need a comment
pkg/spanconfig/target.go
Outdated
} | ||
|
||
// DecodeTarget takes a raw span and decodes it into a Target given its | ||
// encoding. It is the inverse of Encode. | ||
func DecodeTarget(sp roachpb.Span) Target { | ||
return Target(sp) | ||
return Target{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: i'd collapse this to one line, here and throughout but its a purely aesthetic comment so feel free to ignore.
pkg/rpc/auth_tenant.go
Outdated
case *roachpb.SpanConfigTarget_SystemSpanConfigTarget: | ||
return validateSystemTarget(*spanConfigTarget.GetSystemSpanConfigTarget()) | ||
default: | ||
panic("unknown span config target type") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we return an error instead of panicking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't make it all the way through, publishing what I had
pkg/spanconfig/systemtarget.go
Outdated
type SystemTarget struct { | ||
SourceTenantID roachpb.TenantID | ||
TargetTenantID roachpb.TenantID | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: commentary
// decodeSystemTarget converts the given span into a SystemTarget. An error is | ||
// returned if the supplied span does not conform to a system target's | ||
// encoding. | ||
func decodeSystemTarget(span roachpb.Span) (SystemTarget, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be defensive about validating EndKey
?
pkg/spanconfig/systemtarget.go
Outdated
case bytes.HasPrefix(span.Key, keys.SystemSpanConfigEntireKeyspace): | ||
return MakeSystemTarget(roachpb.SystemTenantID, roachpb.SystemTenantID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be HasPrefix
or should it be an equality check?
6b089a7
to
8666936
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed all the comments so far, RFAL!
Dismissed @adityamaru and @ajwerner from 3 discussions.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @irfansharif, and @nvanbenschoten)
pkg/roachpb/span_config.proto, line 178 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
nit: Can we reword the sentence "If the host tenant is the source then the system span config applies over all ranges in the system (including those belonging to secondary tenants)." to mention that both SourceTenantID and TargetTenantID have to be set to the host tenant ID.
Good call, done.
pkg/roachpb/span_config.proto, line 216 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
nit: this should be field with ID 2?
Yup, I had some other things here before but now they're gone.
pkg/roachpb/span_config.proto, line 230 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
just for my understanding, why do we need to bump the ID on this field? The wire type is still the same right?
The type is still the same, but the underlying type (SpanConfigEntry
) changed. I think I'm being overly cautious in bumping things here (and in some other places with the bumping), given the already existing migration in conjunction with these being RPC types that aren't persisted. Do sanity check my claim here though.
pkg/roachpb/span_config.proto, line 261 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
ditto question as above, do we need to change this ID?
See above.
pkg/rpc/auth_tenant.go, line 340 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
should we return an error instead of panicking?
Done.
pkg/spanconfig/systemtarget.go, line 82 at r2 (raw file):
Previously, ajwerner wrote…
Should we be defensive about validating
EndKey
?
Good call, done. Added a test that tries to decode invalid spans as well.
pkg/spanconfig/systemtarget.go, line 85 at r2 (raw file):
Previously, ajwerner wrote…
should this be
HasPrefix
or should it be an equality check?
Fixed.
pkg/spanconfig/target.go, line 92 at r1 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
nit: i'd collapse this to one line, here and throughout but its a purely aesthetic comment so feel free to ignore.
Done.
8666936
to
79867f8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Second commit also LGTM, the panics everywhere make me a little nervous so once we get rid of the ones which won't actually be panicking once this is built out further, we should either plumb proper error handling or ReportOrPanic instead?
pkg/spanconfig/target.go
Outdated
// MakeTargetFromSystemTarget returns a Target which wraps a system target. | ||
func MakeTargetFromSystemTarget(systemTarget SystemTarget) Target { | ||
return Target{ | ||
systemTarget: systemTarget, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: collpase to oneline
pkg/keys/doc.go
Outdated
SystemMax, | ||
|
||
// 3. System tenant SQL keys: This is where we store all system-tenant | ||
// 3. System tenant SQL keys: This is where we store all syst em-tenant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: spacing
pkg/spanconfig/systemtarget.go
Outdated
// spanStartKeyConformsToSystemTargetEncoding returns true if the given span's | ||
// start key conforms to the key encoding of system span configurations. | ||
func spanStartKeyConformsToSystemTargetEncoding(span roachpb.Span) bool { | ||
if bytes.Equal(span.Key, keys.SystemSpanConfigEntireKeyspace) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we just return condA || condB || condC
} | ||
|
||
const numOps = 20 | ||
for i := 0; i < numOps; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is numOps for? Is it just to get 20 shuffles?
I did some of error plumbing in the next PR #76414 for the case where we do need to. I'll just move them to the second commit here. |
79867f8
to
2252c53
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adityamaru added a third commit on here to clean up the panics stuff if you want to give that a quick look as well.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @irfansharif, and @nvanbenschoten)
pkg/spanconfig/target_test.go, line 144 at r4 (raw file):
Previously, adityamaru (Aditya Maru) wrote…
what is numOps for? Is it just to get 20 shuffles?
Yup, just runs the thing 20 times. This is slightly overkill, but we'll rely on the sort order in the KVAccessor, so whatever :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Third commit LGTM, thanks for doing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly nits. Nice job on this.
Reviewed 3 of 9 files at r1, 39 of 39 files at r3, 31 of 34 files at r4, 3 of 3 files at r5, 7 of 7 files at r6, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @arulajmani, and @irfansharif)
pkg/roachpb/span_config.proto, line 173 at r3 (raw file):
option (gogoproto.equal) = true; // SourceTenantID is the ID of the tenant that specified the system span
nit: is the spacing off here? Reviewable doesn't seem to like it.
pkg/roachpb/span_config.proto, line 175 at r3 (raw file):
// SourceTenantID is the ID of the tenant that specified the system span // configuration. TenantID SourceTenantID = 1 [(gogoproto.nullable) = false];
Proto field names should be snake case, with an optional gogoproto.customname
.
pkg/roachpb/span_config.proto, line 180 at r3 (raw file):
// configuration applies to. // // If the host tenant is both the source and the target then the system span
Does this mean that the system tenant can't target only itself? Is that desirable? Even if it's not undesirable, is it an intuitive special case?
Could we instead say that if the field is empty, the configuration applies to all tenants?
pkg/rpc/auth_tenant.go, line 314 at r3 (raw file):
} if target.SourceTenantID != tenID || target.SourceTenantID != target.TargetTenantID {
nit: Should these conditions be separated and return different errors? SourceTenantID != tenID
means something different from TargetTenantID != tenID
.
pkg/spanconfig/target.go, line 89 at r6 (raw file):
log.Fatalf(ctx, "unknown type of system target %v", t) } return roachpb.Span{}
nit: you can change this to panic("unreachable")
to avoid the confusing return value.
pkg/spanconfig/spanconfigtestutils/recorder.go, line 101 at r3 (raw file):
return mi.batchIdx < mj.batchIdx } if !mi.update.Target.Equal(mj.update.Target) { // sort by target ordere.
This comment looks off. // sort by target order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
, though not the deepest review I've ever given.
Reviewed 1 of 9 files at r1, 19 of 34 files at r4, 1 of 3 files at r5, 3 of 7 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @arulajmani, and @irfansharif)
pkg/spanconfig/target.go, line 125 at r6 (raw file):
} // We're dealing with 2 span targets; compare their start keys.
In the case that two spans have the same key, would we want to provide a total ordering over them based on the EndKey
?
Code quote:
// We're dealing with 2 span targets; compare their start keys.
return t.GetSpan().Key.Compare(o.GetSpan().Key) < 0
2252c53
to
a713d87
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs!
Will merge on green.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @ajwerner, @irfansharif, and @nvanbenschoten)
pkg/roachpb/span_config.proto, line 180 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Does this mean that the system tenant can't target only itself? Is that desirable? Even if it's not undesirable, is it an intuitive special case?
Could we instead say that if the field is empty, the configuration applies to all tenants?
The special casing here was intended to mirror full cluster backup, which when run by the host tenant, backs-up all ranges (including tenant ranges). That being said, I do agree that this isn't an intuitive special case, especially at the level of these RPCs.
Changed to make this field nullable + added auth validation to ensure only the host tenant is allowed to leave this field unset.
pkg/rpc/auth_tenant.go, line 314 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: Should these conditions be separated and return different errors?
SourceTenantID != tenID
means something different fromTargetTenantID != tenID
.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 34 files at r4, 1 of 7 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @ajwerner, @arulajmani, @irfansharif, and @nvanbenschoten)
pkg/ccl/kvccl/kvtenantccl/connector.go, line 465 at r6 (raw file):
if err := c.withClient(ctx, func(ctx context.Context, c *client) error { resp, err := c.GetSpanConfigs(ctx, &roachpb.GetSpanConfigsRequest{ Targets: spanconfig.TargetsToTargetProtos(ctx, targets),
Do we need to plumb down a context for a type-conversion
pkg/spanconfig/target.go, line 36 at r6 (raw file):
// taught and tested the KVAccessor to work with system targets. default: return Target{}, errors.AssertionFailedf("unknown type of system target %v", t)
Getting rid of this error and opting instead for a panic would help simplify code paths (no handling of unhandle-able errors needed) that bottom out at what is affectively a type cast.
pkg/spanconfig/target.go, line 89 at r6 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: you can change this to
panic("unreachable")
to avoid the confusing return value.
+1. Would also remove the need to pass in a ctx for what should otherwise be a simple type cast.
pkg/spanconfig/target.go, line 102 at r6 (raw file):
// ID. if t.IsSystemTarget() && o.IsSystemTarget() { if t.GetSystemTarget().SourceTenantID == roachpb.SystemTenantID &&
Would this be simpler if we defined a .Less on the SystemTarget type itself? Ditto below for the equality check, where we delegate to span.Equal cause it's available. It seems generally useful to define these on the base type instead of inlining it in methods on the union type. That would for e.g. let us define sorts on the base types instead of roundtripping through the union type.
pkg/spanconfig/target.go, line 155 at r6 (raw file):
// isEmpty returns true if the receiver is an empty target. func (t Target) isEmpty() bool { if t.systemTarget.isEmpty() && t.span.Equal(roachpb.Span{}) {
[nit] This could be simplified to "return ".
pkg/spanconfig/target.go, line 163 at r6 (raw file):
// SpanConfigTargetProto returns a roachpb.SpanConfigTarget equivalent to the // receiver. func (t Target) SpanConfigTargetProto(ctx context.Context) roachpb.SpanConfigTarget {
Is it fair to say every spanconfig.Target is going to be represented as a roachpb.SpanConfigTarget over the wire? Ditto for []spanconfig.Targets represented as []roachpb.SpanConfigTarget? If so, [nit] how about the following method signatures?
func (t Target) ToProto() roachpb.SpanConfigTarget { .. } // also dropping ctx and panic-ing instead
func TargetsToProtos([]Target) []roachpb.SpanConfigTarget { ...}
func TargetsFromProtos([]roachpb.SpanConfigTarget) []Target { ...}
pkg/spanconfig/target.go, line 216 at r6 (raw file):
// RecordsToSpanConfigEntries converts a list of records to a list // roachpb.SpanConfigEntry protos suitable for sending over the wire. func RecordsToSpanConfigEntries(ctx context.Context, records []Record) []roachpb.SpanConfigEntry {
[nit] s/RecordsToSpanConfigEntries/RecordsToEntries for symmetry with EntriesToRecords below. Also nice to be a little less wordy.
pkg/spanconfig/spanconfigreconciler/reconciler.go, line 287 at r6 (raw file):
// reconciler we shouldn't be requesting these configs directly, instead, // they should be targeted through SystemSpanConfigTargets instead. targets = append(targets, spanconfig.MakeTargetFromSpan(roachpb.Span{
Does this smell to anyone? The server's internal encoding is leaking through to the client if it's having to be careful about not asking for portions of the keyspace. Ideally the server side code just filters these spans out at some level if we only want it to be targeted through explicit SystemSpanConfigTargets. Is this something we're planning to do anything about?
pkg/rpc/auth_test.go, line 153 at r6 (raw file):
return string(append(tenPrefix, []byte(key)...)) } makeSpan := func(key string, endKey ...string) *roachpb.Span {
Can we avoid "pointer to a span" here as well? Ditto below for makeSystemTarget. For no other reasons than it being sharper to use, for unclear benefits. If we're doing union types, staring at makeSpanConfigTarget, we could instead have two constructors for each type of base type in the union. IIUC, one or the other argument has to be nil, right? Could even repurpose makeSystemTarget and makeSpan for it.
pkg/rpc/auth_test.go, line 653 at r6 (raw file):
}, { // Ensure tenant 10 (the tenant we test all these with) can't pretend
Is this the same test case as the last one? In fact, I'm seeing multiple duplicated sub-tests here. Was this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @ajwerner, @arulajmani, @irfansharif, and @nvanbenschoten)
pkg/roachpb/span_config.proto, line 230 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
The type is still the same, but the underlying type (
SpanConfigEntry
) changed. I think I'm being overly cautious in bumping things here (and in some other places with the bumping), given the already existing migration in conjunction with these being RPC types that aren't persisted. Do sanity check my claim here though.
Bumping this isn't necessary, this is safe to revert (makes for a more confusing git commit history if I'm being pedantic).
pkg/roachpb/span_config.proto, line 180 at r3 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
The special casing here was intended to mirror full cluster backup, which when run by the host tenant, backs-up all ranges (including tenant ranges). That being said, I do agree that this isn't an intuitive special case, especially at the level of these RPCs.
Changed to make this field nullable + added auth validation to ensure only the host tenant is allowed to leave this field unset.
I think the following semantics are sound/easier to get wrong:
- Only the host tenant is allowed to set a target_tenant_id.
- If the target_tenant_id is unset and the source_tenant_id == host, applies to all tenants.
- If a target_tenant_id is set, it applies to the that tenant only (could be the host tenant itself)
This is in contrast to making it necessary for secondary tenants to target themselves. In the alternative above, the whole concept of "targeting" a tenant doesn't extend to secondary tenants, which is what we want.
pkg/roachpb/span_config.proto, line 175 at r9 (raw file):
// SourceTenantID is the ID of the tenant that specified the system span // configuration. TenantID source_tenant_id = 1 [(gogoproto.nullable) = false, (gogoproto.customname)= "SourceTenantID"];
[nit] Spacing around '=' for gogoproto.customname. Ditto below, where the spacing is inconsistent in another way.
pkg/roachpb/span_config.proto, line 182 at r9 (raw file):
// If the host tenant is the source and the target is unspecified then the // associated system span configuration applies over all ranges in the system // (including those belonging to secondary tennats).
Typo: "tenants"
pkg/roachpb/span_config.proto, line 205 at r9 (raw file):
reserved 1; // Target specifies the target (keyspan(s)) the config applies over.
[nit] Drop "keyspan(s)". Callers can jump to the target definition itself to figure out what the internals represent.
pkg/roachpb/span_config.proto, line 239 at r9 (raw file):
// UpdateSpanConfigsRequest is used to update the span configurations and system // span configurations over the given spans.
[nit] Talk about it in terms of targets and span configs, instead of spans. IIUC, we don't have a "system span configuration" here, but we do have this special target concept instead.
pkg/roachpb/span_config.proto, line 254 at r9 (raw file):
// are, however, allowed to overlap across lists. This is necessary to support // the delete+upsert semantics described above. // Targets of configurations being added must not overlap with any existing
I'm not sure I follow this last sentence. Is it necessary given your changes above?
This patch modifies roachpb.SpanConfigEntry to tie a roachpb.SpanConfigTarget (instead of roachpb.Span) with a roachpb.SpanConfig. A roachpb.SpanConfigTarget is a union proto over roachpb.Span and roachpb.SystemSpanConfigTarget. Given a roachpb.SpanConfigEntry, we can use the roachpb.SpanConfigTarget to disambiguate between regular span configurations and system span configurations. We use the modified roachpb.SpanConfigEntry in all our RPCs. We also replace usages of roachpb.Spans with roachpb.SpanConfigTargets in the RPCs as well. Release note: None
This patch introduces a new spanconfig.SystemTarget type. When used in conjunction with a span configuration, tenants (and the system tenant) can use this type to target multiple keyspans with the associated span configuration. We use this type to expand the a spanconfig.Target to optionally embedd a SystemTarget (in additon to a span target like it did before). We also add special encoding/decoding methods for system targets, which we will make use of when persisting them in `system.span_configurations`. We do so by reserving a special SystemSpanConfigSpan at the end of the System range. This allows us to then assign special meaning to key prefixes carved out of this range and the corresponding spans stored in `system.span_configurations`. We must take special care that the host tenant doesn't install span configurations to these keyspans directly, and as such, we make exceptions in the translator and reconciler. Release note: None
Changed to return an error to the caller when dealing with proto to target conversions for protos received over the wire. There isn't likely to be an error in this codepath, but if there is, it makes sense to return it up the RPC chain to the tenant as opposed to letting it bring a KV node down. Release note: None
a713d87
to
c7b7c26
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Will merge on green fr this time.
Dismissed @adityamaru and @irfansharif from 11 discussions.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @adityamaru, @ajwerner, @irfansharif, and @nvanbenschoten)
pkg/ccl/kvccl/kvtenantccl/connector.go, line 465 at r6 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Do we need to plumb down a context for a type-conversion
Removed.
pkg/roachpb/span_config.proto, line 230 at r1 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Bumping this isn't necessary, this is safe to revert (makes for a more confusing git commit history if I'm being pedantic).
Done.
pkg/roachpb/span_config.proto, line 180 at r3 (raw file):
Previously, irfansharif (irfan sharif) wrote…
I think the following semantics are sound/easier to get wrong:
- Only the host tenant is allowed to set a target_tenant_id.
- If the target_tenant_id is unset and the source_tenant_id == host, applies to all tenants.
- If a target_tenant_id is set, it applies to the that tenant only (could be the host tenant itself)
This is in contrast to making it necessary for secondary tenants to target themselves. In the alternative above, the whole concept of "targeting" a tenant doesn't extend to secondary tenants, which is what we want.
I have a mild preference to what we have here as opposed to what you're suggesting. Mostly because in your suggestion, the target_tenant_id
being left unset means different things depending on who the source is -- if the source is the host tenant then the target represents the physical cluster, but if the source is a secondary tenant, then the target represents its logical cluster. Expressing these in "absolute" terms feels more intuitive to me.
pkg/roachpb/span_config.proto, line 175 at r9 (raw file):
Previously, irfansharif (irfan sharif) wrote…
[nit] Spacing around '=' for gogoproto.customname. Ditto below, where the spacing is inconsistent in another way.
Done.
pkg/roachpb/span_config.proto, line 205 at r9 (raw file):
Previously, irfansharif (irfan sharif) wrote…
[nit] Drop "keyspan(s)". Callers can jump to the target definition itself to figure out what the internals represent.
Done.
pkg/roachpb/span_config.proto, line 239 at r9 (raw file):
Previously, irfansharif (irfan sharif) wrote…
[nit] Talk about it in terms of targets and span configs, instead of spans. IIUC, we don't have a "system span configuration" here, but we do have this special target concept instead.
Done.
pkg/roachpb/span_config.proto, line 254 at r9 (raw file):
Previously, irfansharif (irfan sharif) wrote…
I'm not sure I follow this last sentence. Is it necessary given your changes above?
Yeah, I think the previous paragraph captures what I was trying to say here. Removed.
pkg/spanconfig/target.go, line 36 at r6 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Getting rid of this error and opting instead for a panic would help simplify code paths (no handling of unhandle-able errors needed) that bottom out at what is affectively a type cast.
For this one in particular, I think it makes sense to return the error up the RPC chain, back to the tenant, instead of bringing down the KV node. For everything else, I cleaned things up.
pkg/spanconfig/target.go, line 89 at r6 (raw file):
Previously, irfansharif (irfan sharif) wrote…
+1. Would also remove the need to pass in a ctx for what should otherwise be a simple type cast.
Removed the context passing and switched to panic-ing instead.
pkg/spanconfig/target.go, line 102 at r6 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Would this be simpler if we defined a .Less on the SystemTarget type itself? Ditto below for the equality check, where we delegate to span.Equal cause it's available. It seems generally useful to define these on the base type instead of inlining it in methods on the union type. That would for e.g. let us define sorts on the base types instead of roundtripping through the union type.
Good point, done for both.
pkg/spanconfig/target.go, line 155 at r6 (raw file):
Previously, irfansharif (irfan sharif) wrote…
[nit] This could be simplified to "return ".
Done.
pkg/spanconfig/target.go, line 163 at r6 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Is it fair to say every spanconfig.Target is going to be represented as a roachpb.SpanConfigTarget over the wire? Ditto for []spanconfig.Targets represented as []roachpb.SpanConfigTarget? If so, [nit] how about the following method signatures?
func (t Target) ToProto() roachpb.SpanConfigTarget { .. } // also dropping ctx and panic-ing instead func TargetsToProtos([]Target) []roachpb.SpanConfigTarget { ...} func TargetsFromProtos([]roachpb.SpanConfigTarget) []Target { ...}
Done.
pkg/spanconfig/target.go, line 216 at r6 (raw file):
Previously, irfansharif (irfan sharif) wrote…
[nit] s/RecordsToSpanConfigEntries/RecordsToEntries for symmetry with EntriesToRecords below. Also nice to be a little less wordy.
Done.
pkg/spanconfig/spanconfigreconciler/reconciler.go, line 287 at r6 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Does this smell to anyone? The server's internal encoding is leaking through to the client if it's having to be careful about not asking for portions of the keyspace. Ideally the server side code just filters these spans out at some level if we only want it to be targeted through explicit SystemSpanConfigTargets. Is this something we're planning to do anything about?
I see a fair bit of this changing once we introduce a method on the KVAccessor
to get all system span configurations installed by the host tenant. I agree, it isn't great that this leaks here, but I'm not sure silently swallowing this span is better given the targeted nature of our get request. We can see how it looks once we make changes to the reconciler in earnest.
pkg/rpc/auth_test.go, line 153 at r6 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Can we avoid "pointer to a span" here as well? Ditto below for makeSystemTarget. For no other reasons than it being sharper to use, for unclear benefits. If we're doing union types, staring at makeSpanConfigTarget, we could instead have two constructors for each type of base type in the union. IIUC, one or the other argument has to be nil, right? Could even repurpose makeSystemTarget and makeSpan for it.
Good call, done.
pkg/rpc/auth_test.go, line 653 at r6 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Is this the same test case as the last one? In fact, I'm seeing multiple duplicated sub-tests here. Was this intentional?
They're different, in that we test updates/deletes/gets separately. It's what we had before my change, I just added system span config targets to it.
bors r+ |
Build succeeded: |
…76591 75809: [CRDB-12226] server, ui: display circuit breakers in problem ranges and range status r=Santamaura a=Santamaura This PR adds changes to the reports/problemranges and reports/range pages. Ranges with replicas that have a circuit breaker will show up as problem ranges and the circuit breaker error will show up as a row on the status page. Release note (ui change): display circuit breakers in problems ranges and range status Problem Ranges page: ![Screen Shot 2022-02-08 at 4 57 51 PM](https://user-images.githubusercontent.com/17861665/153082648-6c03d195-e395-456a-be00-55ad24863752.png) Range status page: ![Screen Shot 2022-02-08 at 4 57 34 PM](https://user-images.githubusercontent.com/17861665/153082705-cbbe5507-e81d-49d7-a3f7-21b4c84226c2.png) 76278: Add cluster version as feature gate for block properties r=dt,erikgrinaker,jbowens a=nicktrav Two commits here - the first adds a new cluster version, the second makes use of the cluster version as a feature gate (and update various call sites all over the place). --- pkg/clusterversion: add new version as feature gate for block properties Prior to this change, the block properties SSTable-level feature was enabled in a single cluster version. This introduced a subtle race in that for the period in which each node is being updated to `PebbleFormatBlockPropertyCollector`, there is a brief period where not all nodes are at the same cluster version, and thus store versions. If nodes at the newer version write SSTables that make use of block properties, and these tables are consumed by nodes that are yet to be updated, the older nodes could panic when attempting to read the tables with a format they do not yet understand. While this race is academic, given that a) there are now subsequent cluster versions that act as barriers during the migration, and b) block properties were disabled in 1a8fb57, this patch addresses the race by adding a second cluster version. `EnablePebbleFormatVersionBlockProperties` acts as a barrier and a feature gate. A guarantee of the migration framework is that any node at this newer version is part of a cluster that has already run the necessary migrations for the older version, and thus ratcheted the format major version in the store, and thus enabled the block properties feature, across all nodes. Add additional documentation in `pebble.go` that details how to make use of the two-version pattern for future table-level version changes. --- pkg/storage: make MakeIngestionWriterOptions version aware With the introduction of block properties, and soon, range keys, which introduce backward-incompatible changes at the SSTable level, all nodes in a cluster must all have a sufficient store version in order to avoid runtime incompatibilities. Update `storage.MakeIngestionWriterOptions` to add a `context.Context` and `cluster.Settings` as parameters, which allows for determining whether a given cluster version is active (via `(clusterversion.Handle).IsActive()`). This allows gating the enabling / disabling of block properties (and soon, range keys), on all nodes being at a sufficient cluster version. Update various call-sites to pass in the `context.Context` and `cluster.Settings`. --- 76348: ui: downsample SQL transaction metrics using MAX r=maryliag a=dhartunian Previously, we were using the default downsampling behavior of the timeseries query engine for "Open SQL Transactions" and "Active SQL Statements" on the metrics page in DB console. This led to confusion when zooming in on transaction spikes since the spike would get larger as the zoom got tighter. This PR changes the aggregation function to use MAX to prevent this confusion. Resolves: #71827 Release note (ui change): Open SQL Transactions and Active SQL Transactions are downsampled using MAX instead of AVG and will more accurately reflect narrow spikes in transaction counts when looking and downsampled data. 76414: spanconfig: teach the KVAccessor about system span configurations r=arulajmani a=arulajmani First 3 commits are from #76219, this one's quite small -- mostly just tests. ---- This patch teaches the KVAccessor to update and get system span configurations. Release note: None 76538: ui: Use liveness info to populate decommissioned node lists r=zachlite a=zachlite Previously, the decommissioned node lists considered node status entries to determine decommissioning and decommissioned status. This changed in #56529, resulting in empty lists. Now, the node's liveness entry is considered and these lists are correctly populated. Release note (bug fix): The list of recently decommissioned nodes and the historical list of decommissioned nodes correctly display decommissioned nodes. 76544: builtins: add rehome_row to DistSQLBlocklist r=mgartner,otan a=rafiss fixes #76153 This builtin always needs to run on the gateway node. Release note: None 76546: build: display pebble git SHA in GitHub messages r=rickystewart a=nicktrav Use the short from of the Git SHA from the go.mod-style version in DEPS.bzl as the Pebble commit. This ensures that failure messages created by Team City link to a GitHub page that renders correctly. Release note: None 76550: gen/genbzl: general improvements r=ajwerner a=ajwerner This change does a few things: * It reworks the queries in terms of eachother in-memory. This is better than the previous iteration whereby it'd generate the results and then rely on the output of that query. Instead, we just build up bigger query expressions and pass them to bazel using the --query_file flag. * It avoids exploring the pkg/ui directory (and the pkg/gen directory) because those can cause problems. The pkg/ui directory ends up bringing in npm, which hurts. * It stops rewriting the files before executing the queries. It no longer needs to rewrite them up front because they aren't referenced by later queries. * It removes the excluded target which was problematic because those files weren't properly visible. Fixes #76521 Fixes #76503 Release note: None 76562: ccl/sqlproxyccl: update PeekMsg to return message size instead of body size r=JeffSwenson a=jaylim-crl Informs #76000. Follow-up to #76006. Previously, PeekMsg was returning the body size (excluding header size), which is a bit awkward from an API point of view because most callers of PeekMsg immediately adds the header size to the returned size previously. This commit cleans the API design up by making PeekMsg return the message size instead, i.e. header inclusive. At the same time, returning the message size makes it consistent with the ReadMsg API since that returns the entire message. Release note: None 76591: bazel: update shebang line in `sql-gen.sh` r=rail a=rickystewart Release note: None Co-authored-by: Santamaura <[email protected]> Co-authored-by: Nick Travers <[email protected]> Co-authored-by: David Hartunian <[email protected]> Co-authored-by: arulajmani <[email protected]> Co-authored-by: Zach Lite <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Andrew Werner <[email protected]> Co-authored-by: Jay <[email protected]> Co-authored-by: Ricky Stewart <[email protected]>
See individual commits for details.