Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multitenant: prevent tenant from overriding next tenant's span config #96014

Merged
merged 1 commit into from
Feb 7, 2023
Merged

multitenant: prevent tenant from overriding next tenant's span config #96014

merged 1 commit into from
Feb 7, 2023

Conversation

ecwall
Copy link
Contributor

@ecwall ecwall commented Jan 26, 2023

Fixes #95882

We want to ensure we have a split at the non-inclusive end of the tenant's
keyspace, which also happens to be the inclusive start of the next
tenant's (with ID=ours+1). If we didn't do anything here, and we were the
tenant with the highest ID thus far, our last range would extend to /Max.
If a tenant with a higher ID was created, when installing its initial span
config record, it would carve itself off (and our last range would only
extend to that next tenant's boundary), but this is a bit awkward for code
that wants to rely on the invariant that ranges for a tenant only extend
to the tenant's well-defined end key.

So how do we ensure this split at this tenant's non-inclusive end key?
Hard splits are controlled by the start keys on span config records1,
so we'll try insert one accordingly. But we cannot do this blindly. We
cannot assume that tenants are created in ID order, so it's possible that
the tenant with the next ID is already present + running. If so, it may
already have span config records that start at the key at which we want
to write a span config record for. Over-writing it blindly would be a
mistake -- there's no telling what config that next tenant has associated
for that span. So we'll do something simple -- we'll check transactionally
whether there's anything written already, and if so, do nothing. We
already have the split we need.

Release note: None

Footnotes

  1. See ComputeSplitKey in spanconfig.StoreReader.

@ecwall ecwall requested a review from a team as a code owner January 26, 2023 17:48
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor Author

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @irfansharif, and @knz)


pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/multitenant/tenant_end_key_split_regression line 119 at r2 (raw file):

...
/Tenant/10{-\x00}                          database system (tenant)
/Tenant/11{-/Table/4}                      database system (tenant)

This is now no longer overridden (compare vs 1st commit in PR).


pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/multitenant/tenant_end_key_split_regression line 127 at r2 (raw file):

# And tenant=11, since it assumes its the only writer, simply does not observe
# this mutation.
mutations tenant=11

Is this expected to have different output now also?

Copy link
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left suggestions on commentary; do use the commentary to improve the commit message too perhaps. The code changes look right to me.

@@ -390,19 +390,33 @@ func CreateTenantRecord(
return roachpb.TenantID{}, err
}

// This adds a split at the end of the tenant keyspace. This split would
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace this comment with something better that tries to explain exactly what we're doing, which is something approximately "ensuring that we have a split at the end of the tenant's keyspace, which we achieve by guaranteeing a split at the start of the next tenant's since splits are done based on start keys". And then walk through why we're checking for existing records, to not overwrite them. The comments below, to me at least, were bit difficult to parse. Summarizing the exact hazard instead of referring to the soon-to-be-closed issue may be friendlier to future readers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the comments in this section.

Copy link
Contributor

@irfansharif irfansharif Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current tenant's endRecordTarget is the same as the next tenant's startRecordTarget.

We're glossing over the fact that once the next tenant starts reconciling, it deletes the "startRecordTarget" it finds and replaces it with another. In the test above, this is what we see in the following snippet. I'm taking startRecordTarget to mean one with the single-key span: [tenantPrefix, tenantPrefix.Next()).

 # Just another view of what the tenant's reconciler actually did. It got rid of
 # the original, single-key /Tenant/11{-\x00} record, and replaced it with
 # /Tenant/11{-/Table/4}, just like the comment above said. The remaining upserts
 # are for its SQL state.
 mutations tenant=11 limit=2
 ----
 delete /Tenant/11{-\x00}
 upsert /Tenant/11{-/Table/4}               database system (tenant)

I tried rewriting the comments + invariants in this section, and in writing it, I think I see another bug in this implementation. First, the comment:

// We want to ensure we have a split at the non-inclusive end of the tenant's
// keyspace, which also happens to be the inclusive start of the next tenant's
// (with ID=ours+1). If we didn't do anything here and we were the tenant with the
// highest ID thus far, our last range would extend to /Max. If a tenant with a
// higher ID was created, when installing its initial span config record, it would
// carve itself off (and our last range would only extend to that next tenant's
// boundary), but this is a bit awkward for code that want to rely on the
// invariant that ranges for a tenant only extend to the tenant's well-defined end
// key.
//
// So how do we ensure this split at this tenant's non-inclusive end key? Hard
// splits are controlled by the start keys found span config records[^1], so we'll
// try insert one accordingly. But we cannot do this blindly. We cannot assume
// that tenants are created in ID order, so it's possible that the tenant with the
// next ID is already present + running. If so, it may already have span config
// records that start at the key at which we want to write a span config record
// for. Over-writing it blindly would be a mistake -- there's no telling what
// config that next tenant has associated for that span. So we'll do something
// simple -- we'll check transactionally whether there's anything written already,
// and if so, do nothing. We already have the split we need.
//
// [^1]: See ComputeSplitKey in spanconfig.StoreReader.

Now, the new bug. Consider the following sequence:

  • tenant=11 gets deleted, and so doing, gets rid of all its span configs, including the one at its start key. This gets rid of that hard split point.
  • If tenant=11 was the tenant with the highest tenant ID, and there was a tenant=10 that was relying on tenant=11's first span config record, now tenant=10's last range could now go back to extending all the way to /Max.

We're back to where we were with #92072. With any tenant deletions, we might end up creating tenant end ranges that straddle until /Max or the next non-deleted tenant's start key.

}), tenantSpanConfig)
})

records, err := spanConfigs.GetSpanConfigRecords(ctx, []spanconfig.Target{endRecordTarget})
Copy link
Contributor

@irfansharif irfansharif Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're making some assumptions here for code that currently meets the assumptions but at a distance. Specifically we're assuming that the SQLTranslator's first emitted record, once spanconfig.Reconciler kicks in, starts at keys.MakeTenantPrefix(tenantID). So it's queryable here. That record must always stay around -- the spanconfig.{Reconciler,Translator} can't get rid of the record with start key == keys.MakeTenantPrefix(tenantID). If it did, then this read txn would not find anything and insert its own. Documenting the specifics will help, and we should, but it's worth adding an assertion in the spanconfig package to make sure we don't regress on this bug accidentally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you point me to where to add the assertion? I'm not very familiar with this code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you point me to where to add the assertion? I'm not very familiar with this code.

@arulajmani is your resident expert (especially if this assertion is better suited within spanconfig.SQLTranslator), but one place to do it is here:

return storeWithLatestSpanConfigs, readTimestamp, nil

If we're running that code as a tenant, the storeWithLatestSpanConfigs can be queried/iterated over for the first span config, to sanity check that it's the one we expect -- starting at the tenant prefix. (For the system tenant the first span config will be over /{Min-System/NodeLiveness}, hence the check that we're not the system tenant).

pkg/sql/tenant_creation.go Outdated Show resolved Hide resolved
Copy link
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @ecwall, and @knz)


pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/multitenant/tenant_end_key_split_regression line 127 at r2 (raw file):

Previously, ecwall (Evan Wall) wrote…

Is this expected to have different output now also?

No, this looks correct.

@blathers-crl blathers-crl bot requested a review from irfansharif January 26, 2023 20:53
/Tenant/11/Table/{4-5} database system (tenant)
...

# And tenant=11, since it assumes its the only writer, simply does not observe
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block can be removed since there's no mutation to speak of after the bug fix.

}), tenantSpanConfig)
})

records, err := spanConfigs.GetSpanConfigRecords(ctx, []spanconfig.Target{endRecordTarget})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you point me to where to add the assertion? I'm not very familiar with this code.

@arulajmani is your resident expert (especially if this assertion is better suited within spanconfig.SQLTranslator), but one place to do it is here:

return storeWithLatestSpanConfigs, readTimestamp, nil

If we're running that code as a tenant, the storeWithLatestSpanConfigs can be queried/iterated over for the first span config, to sanity check that it's the one we expect -- starting at the tenant prefix. (For the system tenant the first span config will be over /{Min-System/NodeLiveness}, hence the check that we're not the system tenant).

@@ -390,19 +390,33 @@ func CreateTenantRecord(
return roachpb.TenantID{}, err
}

// This adds a split at the end of the tenant keyspace. This split would
Copy link
Contributor

@irfansharif irfansharif Jan 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current tenant's endRecordTarget is the same as the next tenant's startRecordTarget.

We're glossing over the fact that once the next tenant starts reconciling, it deletes the "startRecordTarget" it finds and replaces it with another. In the test above, this is what we see in the following snippet. I'm taking startRecordTarget to mean one with the single-key span: [tenantPrefix, tenantPrefix.Next()).

 # Just another view of what the tenant's reconciler actually did. It got rid of
 # the original, single-key /Tenant/11{-\x00} record, and replaced it with
 # /Tenant/11{-/Table/4}, just like the comment above said. The remaining upserts
 # are for its SQL state.
 mutations tenant=11 limit=2
 ----
 delete /Tenant/11{-\x00}
 upsert /Tenant/11{-/Table/4}               database system (tenant)

I tried rewriting the comments + invariants in this section, and in writing it, I think I see another bug in this implementation. First, the comment:

// We want to ensure we have a split at the non-inclusive end of the tenant's
// keyspace, which also happens to be the inclusive start of the next tenant's
// (with ID=ours+1). If we didn't do anything here and we were the tenant with the
// highest ID thus far, our last range would extend to /Max. If a tenant with a
// higher ID was created, when installing its initial span config record, it would
// carve itself off (and our last range would only extend to that next tenant's
// boundary), but this is a bit awkward for code that want to rely on the
// invariant that ranges for a tenant only extend to the tenant's well-defined end
// key.
//
// So how do we ensure this split at this tenant's non-inclusive end key? Hard
// splits are controlled by the start keys found span config records[^1], so we'll
// try insert one accordingly. But we cannot do this blindly. We cannot assume
// that tenants are created in ID order, so it's possible that the tenant with the
// next ID is already present + running. If so, it may already have span config
// records that start at the key at which we want to write a span config record
// for. Over-writing it blindly would be a mistake -- there's no telling what
// config that next tenant has associated for that span. So we'll do something
// simple -- we'll check transactionally whether there's anything written already,
// and if so, do nothing. We already have the split we need.
//
// [^1]: See ComputeSplitKey in spanconfig.StoreReader.

Now, the new bug. Consider the following sequence:

  • tenant=11 gets deleted, and so doing, gets rid of all its span configs, including the one at its start key. This gets rid of that hard split point.
  • If tenant=11 was the tenant with the highest tenant ID, and there was a tenant=10 that was relying on tenant=11's first span config record, now tenant=10's last range could now go back to extending all the way to /Max.

We're back to where we were with #92072. With any tenant deletions, we might end up creating tenant end ranges that straddle until /Max or the next non-deleted tenant's start key.

Copy link
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's file a separate issue about the bug Irfan pointed out as well? That seems legit -- I think it's a bit more general than the last tenant however. What we're really looking for is no tenant's last range should straddle tenant boundaries. If tenant n is dropped, and it deletes the split at the start of its keyspace, we'll be back to pre #92072 days for tenant n - 1.

The easiest fix there would be to leave the single-key span config record at the start of tenant n's keyspace around when it gets dropped. @irfansharif, thoughts?

Reviewed 1 of 1 files at r1, 1 of 2 files at r2, 1 of 2 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ecwall, @irfansharif, and @knz)


-- commits line 4 at r1:
Let's not use imperative phrasing here. See https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/73072807/Git+Commit+Messages#GitCommitMessages-Committitle.

Separately, consider pulling both these commits into one -- especially given you're making changes to the test from the first commit to the second, I don't see a good reason to have 2 separate commits here.


-- commits line 14 at r3:
Same comment as above, let's not use the imperative form here.


pkg/sql/tenant_creation.go line 399 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Could you point me to where to add the assertion? I'm not very familiar with this code.

@arulajmani is your resident expert (especially if this assertion is better suited within spanconfig.SQLTranslator), but one place to do it is here:

return storeWithLatestSpanConfigs, readTimestamp, nil

If we're running that code as a tenant, the storeWithLatestSpanConfigs can be queried/iterated over for the first span config, to sanity check that it's the one we expect -- starting at the tenant prefix. (For the system tenant the first span config will be over /{Min-System/NodeLiveness}, hence the check that we're not the system tenant).

I have no strong feelings about putting this in the Translator vs. the Reconciler. The line Irfan linked feels like a good place to put the assertion.


pkg/sql/tenant_creation.go line 395 at r3 (raw file):

	toUpsert := []spanconfig.Record{startRecord}

	// Ensure we have a split at the current tenant's endKey in one of two ways:

nit: maybe this commentary will be clearer if you talk about tenants as tenant n, tenant n + 1?


pkg/sql/tenant_creation.go line 399 at r3 (raw file):

	//    already exists and created this split at its own startKey when it was
	//    created.
	// 2) The next tenant does not exist so split now at the endKey.

Let's add some words about what happens when the next tenant is created in the future. Something about the new tenant overwriting this record during its creation, which will end up being a no-op, and continue to provide a split point at the end of the tenant's keyspace for this tenant.


pkg/sql/tenant_creation.go line 402 at r3 (raw file):

	//
	// If this endKey split is not created via (1) or (2), the current tenant's
	// last range will have an endKey of /Max which is outside of it's keyspace.

That's only true if this is the last tenant, by ID, that exists. Let's drop the /Max phrasing -- instead, let's say it'll be outside of its keyspace.

nit: s/it's/its


pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/multitenant/tenant_end_key_split_regression line 1 at r1 (raw file):

# BUG: Show how creation of a tenant can end up overwriting span config state

nit: "s/BUG/This test serves as a regression for "

@knz knz added the A-multitenancy Related to multi-tenancy label Jan 30, 2023
@blathers-crl blathers-crl bot requested a review from irfansharif January 30, 2023 23:09
@irfansharif irfansharif removed their request for review January 30, 2023 23:11
@irfansharif irfansharif dismissed their stale review January 30, 2023 23:12

Arul's reviewing this PR.

Copy link
Contributor Author

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #96247

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @irfansharif, and @knz)


-- commits line 4 at r1:

Previously, arulajmani (Arul Ajmani) wrote…

Let's not use imperative phrasing here. See https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/73072807/Git+Commit+Messages#GitCommitMessages-Committitle.

Separately, consider pulling both these commits into one -- especially given you're making changes to the test from the first commit to the second, I don't see a good reason to have 2 separate commits here.

Removed this comment and message.


-- commits line 14 at r3:

Previously, arulajmani (Arul Ajmani) wrote…

Same comment as above, let's not use the imperative form here.

Fixed.


pkg/sql/tenant_creation.go line 399 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

I have no strong feelings about putting this in the Translator vs. the Reconciler. The line Irfan linked feels like a good place to put the assertion.

Added a check here.


pkg/sql/tenant_creation.go line 395 at r3 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: maybe this commentary will be clearer if you talk about tenants as tenant n, tenant n + 1?

Updated.

Code quote:

// Ensure we have a split at the current tenant's endKey in o

pkg/sql/tenant_creation.go line 399 at r3 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Let's add some words about what happens when the next tenant is created in the future. Something about the new tenant overwriting this record during its creation, which will end up being a no-op, and continue to provide a split point at the end of the tenant's keyspace for this tenant.

Updated.


pkg/sql/tenant_creation.go line 402 at r3 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

That's only true if this is the last tenant, by ID, that exists. Let's drop the /Max phrasing -- instead, let's say it'll be outside of its keyspace.

nit: s/it's/its

Updated.


pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/multitenant/tenant_end_key_split line 117 at r3 (raw file):

Previously, irfansharif (irfan sharif) wrote…

This block can be removed since there's no mutation to speak of after the bug fix.

Updated.

Copy link
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The easiest fix there would be to leave the single-key span config record at the start of tenant n's keyspace around when it gets dropped. @irfansharif, thoughts?

If many tenants get deleted over time, over time we'll end up fragmenting the keyspace with empty ranges. I recommend solving this fully by ensuring that the "boundary" span configs are only left for the tenants that need it. i.e. If we're deleting tenant=11 and tenant=10 is around, leave the config at the start of tenant=11's keyspace. If tenant=12 then gets deleted, since there's no tenant=11 around to rely on that boundary, we don't need to keep it around.

Reviewed 1 of 3 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @knz)


-- commits line 14 at r3:

Previously, ecwall (Evan Wall) wrote…

Fixed.

Thanks for updating the code comment elsewhere. Do use that here in this commit message too since its an improvement and clearer to readers.

Copy link
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Thanks for taking the time to clean up the comments in sql.CreateTenantRecord. Like these subtle bugs have uncovered, this stuff can get tricky and the code here makes various assumptions. I have no doubt that taking the time to document this stuff will go a long way for future readers who come across this code!

If many tenants get deleted over time, over time we'll end up fragmenting the keyspace with empty ranges. I recommend solving this fully by ensuring that the "boundary" span configs are only left for the tenants that need it. i.e. If we're deleting tenant=11 and tenant=10 is around, leave the config at the start of tenant=11's keyspace. If tenant=12 then gets deleted, since there's no tenant=11 around to rely on that boundary, we don't need to keep it around.

That makes sense. I'd personally be happy solving the problem at hand without worrying about fragmenting the keyspace with empty ranges to start off with. We can always write some sort of de-fragmenting builtin after the fact, or whatever else we may want to do here.

Reviewed 1 of 3 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ecwall, @irfansharif, and @knz)


-- commits line 6 at r4:
"This patch ensures ... "


pkg/spanconfig/spanconfigreconciler/reconciler.go line 304 at r4 (raw file):

	if !f.codec.ForSystemTenant() {
		tenantPrefixKey := f.codec.TenantPrefix()
		// For secondary tenants, we assume that the SQLTranslator's first emitted

nit: Consider rewording this as follows:

We want to ensure tenant ranges do not straddle tenant boundaries. As such, a full reconciliation should always include a SpanConfig where the start key is keys.MakeTenantPrefix(tenantID). This ensures there is a split point right at the start of the tenant's keyspace, so that the last range of the previous tenant doesn't straddle across into this tenant. Also, sql.CreateTenantRecord relies on such a SpanConfigs existence to ensure the same thing for newly created tenants. 

pkg/sql/tenant_creation.go line 401 at r4 (raw file):

	// higher ID was created, when installing its initial span config record, it would
	// carve itself off (and our last range would only extend to that next tenant's
	// boundary), but this is a bit awkward for code that want to rely on the

"for code that wants"


pkg/sql/tenant_creation.go line 406 at r4 (raw file):

	//
	// So how do we ensure this split at this tenant's non-inclusive end key? Hard
	// splits are controlled by the start keys found span config records[^1], so we'll

"are controlled by the start keys on span config records"

@blathers-crl blathers-crl bot requested a review from irfansharif February 7, 2023 13:29
Copy link
Contributor Author

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @arulajmani, @irfansharif, and @knz)


-- commits line 14 at r3:

Previously, irfansharif (irfan sharif) wrote…

Thanks for updating the code comment elsewhere. Do use that here in this commit message too since its an improvement and clearer to readers.

Moved here also


-- commits line 6 at r4:

Previously, arulajmani (Arul Ajmani) wrote…

"This patch ensures ... "

Fixed


pkg/spanconfig/spanconfigreconciler/reconciler.go line 304 at r4 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: Consider rewording this as follows:

We want to ensure tenant ranges do not straddle tenant boundaries. As such, a full reconciliation should always include a SpanConfig where the start key is keys.MakeTenantPrefix(tenantID). This ensures there is a split point right at the start of the tenant's keyspace, so that the last range of the previous tenant doesn't straddle across into this tenant. Also, sql.CreateTenantRecord relies on such a SpanConfigs existence to ensure the same thing for newly created tenants. 

Updated.


pkg/sql/tenant_creation.go line 401 at r4 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

"for code that wants"

Fixed.


pkg/sql/tenant_creation.go line 406 at r4 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

"are controlled by the start keys on span config records"

Fixed

Fixes #95882

We want to ensure we have a split at the non-inclusive end of the tenant's
keyspace, which also happens to be the inclusive start of the next
tenant's (with ID=ours+1). If we didn't do anything here, and we were the
tenant with the highest ID thus far, our last range would extend to /Max.
If a tenant with a higher ID was created, when installing its initial span
config record, it would carve itself off (and our last range would only
extend to that next tenant's boundary), but this is a bit awkward for code
that wants to rely on the invariant that ranges for a tenant only extend
to the tenant's well-defined end key.

So how do we ensure this split at this tenant's non-inclusive end key?
Hard splits are controlled by the start keys on span config records[^1],
so we'll try insert one accordingly. But we cannot do this blindly. We
cannot assume that tenants are created in ID order, so it's possible that
the tenant with the next ID is already present + running. If so, it may
already have span config records that start at the key at which we want
to write a span config record for. Over-writing it blindly would be a
mistake -- there's no telling what config that next tenant has associated
for that span. So we'll do something simple -- we'll check transactionally
whether there's anything written already, and if so, do nothing. We
already have the split we need.

[^1]: See ComputeSplitKey in spanconfig.StoreReader.

Release note: None
@ecwall
Copy link
Contributor Author

ecwall commented Feb 7, 2023

bors r=arulajmani

@craig craig bot merged commit f640acb into cockroachdb:master Feb 7, 2023
@craig
Copy link
Contributor

craig bot commented Feb 7, 2023

Build succeeded:

arulajmani added a commit to arulajmani/cockroach that referenced this pull request Feb 8, 2023
We added an assertion in cockroachdb#96014 to ensure that full reconciliation
always preserves a split point at the start of the tenant's keyspace.
The comment that introduces the assertion explains such a span config
record should be present, however, makes the assertion that such a
record is the first span config record for the tenant.

There's no real reason to make the "first record" assertion here.
In fact, that assumption isn't correct when system span configurations
are in play, as they sort before regular span configurations. I
noticed this broke `TestBackupRestoreTenantSettings`, causing a panic in
the reconciler, while workong on an unrelated patch.

This patch relaxes the "first record" assertion. We're happy as long as
it's present somewhere.

Release note: None
craig bot pushed a commit that referenced this pull request Feb 8, 2023
96767: spanconfig: fix assertion that ensures split at tenant boundary r=ecwall a=arulajmani

We added an assertion in #96014 to ensure that full reconciliation always preserves a split point at the start of the tenant's keyspace. The comment that introduces the assertion explains such a span config record should be present, however, makes the assertion that such a record is the first span config record for the tenant.

There's no real reason to make the "first record" assertion here. In fact, that assumption isn't correct when system span configurations are in play, as they sort before regular span configurations. I noticed this broke `TestBackupRestoreTenantSettings`, causing a panic in the reconciler, while workong on an unrelated patch.

This patch relaxes the "first record" assertion. We're happy as long as it's present somewhere.

Release note: None
Epic: None

Co-authored-by: Arul Ajmani <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multitenancy Related to multi-tenancy
Projects
None yet
Development

Successfully merging this pull request may close these issues.

multitenant: tenant creation can unintentionally overwrite another tenant's span configs
5 participants