Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-21.2: server: get rid of end-user mechanism to enable span configs #70341

Merged
merged 5 commits into from
Sep 20, 2021

Conversation

irfansharif
Copy link
Contributor

@irfansharif irfansharif commented Sep 16, 2021

Backport 2/2 commits from #69658, 1/1 commits from #70232, 1/1 commits from
#70436, + another partially reverting #69658 to disallow end-users from
enabling this infrastructure (though leaving the feature gates intact for
tests).

/cc @cockroachdb/release


It was previously possible for single-tenant clusters to enable the span
configs infrastructure by setting COCKROACH_EXPERIMENTAL_SPAN_CONFIGS. For
secondary tenants, we actually enabled it by default. We've since decided that
the serverless offering intending to use this infrastructure will not be run
using the 21.2 release, instead favoring a 22.1 alpha. There's no reason then
to leave infrastructure wired up to anything in 21.2.

We'll still leave intact the testing knobs enabling this infrastructure, but
they're just that -- testing knobs. A more invasive variant of this PR could've
deleted all span configs code altogether, but that seems unnecessary (and
probably prone to bugs).

Release justification: non-production code changes
Release note: Non-production code

Cluster settings are too easy a switch to reach for to enable the new
span configs machinery. Let's gate it behind a necessary envvar as
well and use cluster settings to selectively toggle individual
components.

This commit also fixes a mighty silly bug introduced in cockroachdb#69047; for the
two methods we intended to gate use
`spanconfig.experimental_kvaccessor.enabled`, we were checking the
opposite condition or not checking it at all. Oops.

Release note: None
Release justification: non-production code changes
We'll repurpose the COCKROACH_EXPERIMENTAL_SPAN_CONFIGS envvar
introduced in the previous commit to prevent initializing the span
config manager for the host tenant (now truly switching off all span
config infrastructure unless explicitly opted into).

For both host and secondary tenants, we also want to disable the
creation of the span config reconciliation job by default. We introduce
`spanconfig.experimental_reconciliation_job.enabled` to that end.

---

We re-work some of the controls around when we start the reconciliation
job. It was previously possible for a tenant pod to start with a
conservative view of it's cluster version setting, decide not to create
the reconciliation job, find out about the up-to-date cluster version
(introducing the span config job), but only create it when the next
`check_interval` timer fired. This was undesirable.

We also want to "bounce" the job immediately when the `check_interval`
setting is changed. The earlier code didn't quite provide that property.

Release justification: non-production code changes
Release note: None
@irfansharif irfansharif requested review from a team as code owners September 16, 2021 22:31
@irfansharif irfansharif requested review from otan and removed request for a team September 16, 2021 22:31
@blathers-crl
Copy link

blathers-crl bot commented Sep 16, 2021

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
  • Patches must not add, edit, or otherwise modify cluster versions; or add version gates.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@irfansharif irfansharif requested review from ajwerner and arulajmani and removed request for otan September 16, 2021 22:31
@irfansharif
Copy link
Contributor Author

(Bump.)

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 11 of 11 files at r1, 18 of 18 files at r2, 6 of 6 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @arulajmani, and @irfansharif)


pkg/spanconfig/spanconfigmanager/manager.go, line 157 at r2 (raw file):

		select {
		case <-jobCheckCh:
			timer.Read = true

Is this correct? Why set .Read in this case?


pkg/sql/logictest/logic.go, line 463 at r2 (raw file):

	// cluster. Connections on behalf of the logic test will go to that tenant.
	useTenant bool
	// If true, XXX:

Did you mean to add a comment here?

Accidentally left unfinished in cockroachdb#69658.

Release note: None
The contract for timeutil.Timer indicates that we should only be
setting .Read when reading from the timer channel, not unconditionally
before a call to .Reset().

Release note: None
It was previously possible for single-tenant clusters to enable the span
configs infrastructure by setting COCKROACH_EXPERIMENTAL_SPAN_CONFIGS.
For secondary tenants, we actually enabled it by default. We've since
decided that the serverless offering intending to use this
infrastructure will not be run using the 21.2 release, instead favoring
a 22.1 alpha. There's no reason then to leave infrastructure wired up to
anything in 21.2.

We'll still leave intact the testing knobs enabling this infrastructure,
but they're just that -- testing knobs. A more invasive variant of this
PR could've deleted all span configs code altogether, but that seems
unnecessary (and probably prone to bugs).

Release justification: non-production code changes
Release note: Non-production code
Copy link
Contributor Author

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @arulajmani, and @nvanbenschoten)


pkg/spanconfig/spanconfigmanager/manager.go, line 157 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this correct? Why set .Read in this case?

Hm, looks like I'd misunderstood the timer interface. Sent #70436 and backported here for posterity. All that aside -- this is all non-executable code as of the last commit in this PR.


pkg/sql/logictest/logic.go, line 463 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Did you mean to add a comment here?

Yup, tacked on #70232 here.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 7 files at r4, 6 of 6 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @arulajmani)

@irfansharif irfansharif merged commit 5529383 into cockroachdb:release-21.2 Sep 20, 2021
@irfansharif irfansharif deleted the backport21.2-69658 branch September 20, 2021 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants