Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: fix system config for tenant #76504

Merged

Conversation

ajwerner
Copy link
Contributor

Follow-on from #76279.

Fixes #75864.

This commit adds a mechanism to combine the system config data of the tenant
with the data provided over the GossipSubscription from the system tenant.
It then plumbs a version into the zone config methods. In the mixed version
state, the tenant uses the existing override from the system tenant. After
the span config infrastructure has been activated, the tenant uses the
overrides they've set. This affects, realistically, just the GC job, and,
to a lesser extent, the optimizer.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ajwerner ajwerner force-pushed the ajwerner/fix-system-config-for-tenant branch from e0736fe to c3bff52 Compare February 14, 2022 21:22
@ajwerner ajwerner force-pushed the ajwerner/fix-system-config-for-tenant branch from c3bff52 to faea93b Compare March 27, 2022 19:38
@ajwerner ajwerner force-pushed the ajwerner/fix-system-config-for-tenant branch from faea93b to 6efa71d Compare March 27, 2022 21:34
@ajwerner ajwerner marked this pull request as ready for review March 28, 2022 01:59
@ajwerner ajwerner requested review from a team as code owners March 28, 2022 01:59
@ajwerner ajwerner requested review from a team and adityamaru and removed request for a team March 28, 2022 01:59
@ajwerner
Copy link
Contributor Author

This is more or less ready for a look. There are some questions about the mixed version tenant behavior and about the spanconfigcomparedccl tests. @arulajmani let's discuss Monday.

@arulajmani arulajmani self-requested a review March 29, 2022 23:01
Copy link
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems pretty close. Should we unskip some of the locality optimized search logic tests that were blocked on #75864 as part of this patch as well?

I still haven't looked at spanconfigcomparedccl diffs, but publishing comments for everything else.

Reviewed 31 of 33 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @ajwerner)


pkg/config/system.go, line 418 at r1 (raw file):

	// configuration.
	//
	// TODO(XXX): If the reconciliation protocol is not active, and this is a

We should probably remove the "XXX" and keep this comment around before we merge. Do you think it's worth being a bit verbose and adding some words about how the host changing RANGE TENANTS fits into this ugliness?


pkg/config/system.go, line 428 at r1 (raw file):

	if !codec.ForSystemTenant() &&
		(id == 0 || !version.IsActive(
			clusterversion.EnsureSpanConfigReconciliation,

nit: In GetConfReader, we gate on EnableSpanConfigStore -- should we use that here as well?


pkg/sql/zone_config.go, line 201 at r1 (raw file):

		return nil, nil, false, err
	}
	if err = completeZoneConfig(zone, keys.SystemSQLCodec, zoneID, getKey); err != nil {

Should this change to codec as well?


pkg/sql/gcjob/refresh_statuses.go, line 344 at r1 (raw file):

	tenantTTLSeconds := execCfg.DefaultZoneConfig.GC.TTLSeconds
	v := execCfg.Settings.Version.ActiveVersionOrEmpty(ctx)
	zoneCfg, err := cfg.GetZoneConfigForObject(keys.MakeSQLCodec(roachpb.MakeTenantID(tenID)), v, 0)

Should we instead be passing in execCfg.SystemSQLCodec here instead? Then, we won't have to special case for this 0 value in GetZoneConfigForObject.

Separately, should we instead be using TenantsRangesID instead of 0 here? Unless I'm misremembering, a draft version of this used TenantsRangesID here; has our thinking changed since?


pkg/ccl/kvccl/kvtenantccl/tenant_upgrade_test.go, line 40 at r1 (raw file):

// TestTenantUpgrade exercises the case where a system tenant is in a
// non-finalized version state and creates a tenant. The test ensures
// that newly created tenant begins in that same version.

nit: "that the"

/Table/106 num_replicas=7 num_voters=5
/Table/107 num_replicas=7
/Table/50 database system (host)
/Table/106 ignore_strict_gc=true num_replicas=7 num_voters=5 rangefeed_enabled=true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking only at these tests, what this diff indicates is that according to the system config span, the new two tables have configs with ignore_strict_gc=true rangefeed_enabled=true. Is that expected? Shouldn't be, right? Since they're not system tables.

@ajwerner ajwerner force-pushed the ajwerner/fix-system-config-for-tenant branch from 6efa71d to ff63efc Compare April 11, 2022 17:28
@blathers-crl blathers-crl bot requested a review from irfansharif April 11, 2022 17:28
Copy link
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFAL

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @arulajmani, and @irfansharif)


pkg/config/system.go, line 418 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

We should probably remove the "XXX" and keep this comment around before we merge. Do you think it's worth being a bit verbose and adding some words about how the host changing RANGE TENANTS fits into this ugliness?

Done.


pkg/config/system.go, line 428 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: In GetConfReader, we gate on EnableSpanConfigStore -- should we use that here as well?

Done.


pkg/sql/zone_config.go, line 201 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Should this change to codec as well?

Done.


pkg/sql/gcjob/refresh_statuses.go, line 344 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Should we instead be passing in execCfg.SystemSQLCodec here instead? Then, we won't have to special case for this 0 value in GetZoneConfigForObject.

Separately, should we instead be using TenantsRangesID instead of 0 here? Unless I'm misremembering, a draft version of this used TenantsRangesID here; has our thinking changed since?

Done.


pkg/ccl/kvccl/kvtenantccl/tenant_upgrade_test.go, line 40 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: "that the"

Done.


pkg/ccl/spanconfigccl/spanconfigcomparedccl/testdata/basic, line 61 at r1 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Looking only at these tests, what this diff indicates is that according to the system config span, the new two tables have configs with ignore_strict_gc=true rangefeed_enabled=true. Is that expected? Shouldn't be, right? Since they're not system tables.

I think I fixed it with Arul's help

@irfansharif irfansharif requested a review from arulajmani April 11, 2022 17:30
Copy link
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: modulo comment

Reviewed 9 of 10 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @arulajmani, and @irfansharif)


pkg/config/system.go, line 403 at r2 (raw file):

// GetZoneConfigForObject returns the combined zone config for the given object
// identifier and SQL codec. Note that id=0 with a secondary tenant codec

Now that you've changed the GC code to use RANGE TENANTS id, can we get rid of this special casing?

Follow-on from cockroachdb#76279.

Fixes cockroachdb#75864.

This commit adds a mechanism to combine the system config data of the tenant
with the data provided over the GossipSubscription from the system tenant.
It then plumbs a version into the zone config methods. In the mixed version
state, the tenant uses the existing override from the system tenant. After
the span config infrastructure has been activated, the tenant uses the
overrides they've set. This affects, realistically, just the GC job, and,
to a lesser extent, the optimizer.

Release note: None
@ajwerner ajwerner force-pushed the ajwerner/fix-system-config-for-tenant branch from ff63efc to 69e48ba Compare April 11, 2022 20:42
Copy link
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @arulajmani, and @irfansharif)


pkg/config/system.go, line 403 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Now that you've changed the GC code to use RANGE TENANTS id, can we get rid of this special casing?

Done.

@ajwerner
Copy link
Contributor Author

TFTR!

bors r+

@craig craig bot merged commit 6de7a18 into cockroachdb:master Apr 11, 2022
@craig
Copy link
Contributor

craig bot commented Apr 11, 2022

Build succeeded:

@blathers-crl
Copy link

blathers-crl bot commented Apr 11, 2022

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 69e48ba to blathers/backport-release-22.1-76504: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql,config: tenant zone config lookup is broken
4 participants