
jobs,server: graceful shutdown for secondary tenant servers #99958

Merged (10 commits) on Apr 6, 2023

Conversation

@knz (Contributor) commented Mar 29, 2023

Epic: CRDB-23559
Fixes #92523.

All commits but the last are from #100436.

This change ensures that tenant servers managed by the server
controller receive a graceful drain request as part of the graceful
drain process of the surrounding KV node.

This change, in turn, ensures that SQL clients connected to these
secondary tenant servers benefit from the same guarantees (and
graceful periods) as clients to the system tenant.

@knz knz requested a review from stevendanna March 29, 2023 17:54
@knz knz requested review from a team as code owners March 29, 2023 17:54
blathers-crl bot commented Mar 29, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity (Member) commented:

This change is Reviewable

@knz knz requested a review from jeffswenson March 29, 2023 17:55
@stevendanna (Collaborator) left a comment


Overall, this looks reasonable to me. startControlledServer is a bit of a beast, but that is a fish to fry for another day.

@@ -1066,6 +1071,9 @@ func (r *Registry) Start(ctx context.Context, stopper *stop.Stopper) error {
})

if err := stopper.RunAsyncTask(ctx, "jobs/cancel", func(ctx context.Context) {
r.startedControllerTasksWG.Add(1)
@stevendanna (Collaborator) commented:

We have examples of both patterns in the database, but I almost always put the Add() before launching the goroutine/async task. You don't have a guarantee that the scheduler has started the goroutine before you end up calling Wait().
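A minimal sketch of the Add-before-launch pattern being recommended here (using a bare goroutine rather than the stopper's async-task API; names are illustrative):

```go
package main

import "sync"

func main() {
	var wg sync.WaitGroup

	// Correct: Add(1) runs on the launching goroutine, before the task
	// starts, so a concurrent Wait() always observes the count.
	wg.Add(1)
	go func() {
		defer wg.Done()
		// ... task body ...
	}()

	// If Add(1) lived inside the goroutine instead, Wait() could run
	// before the scheduler ever started the goroutine, see a zero
	// count, and return immediately, missing the task entirely.
	wg.Wait()
}
```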


drainCtx := logtags.WithTags(context.Background(), logtags.FromContext(ctx))

for ; ; prevRemaining = remaining {
@stevendanna (Collaborator) commented:

[nit] I have a feeling this would become a little clearer to read if `remaining` became local to the scope of the block, and we set `prevRemaining = remaining` at the end of the loop, and then used `for {` here.
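To illustrate the suggested shape (a hypothetical stand-in, not the actual drain loop): `remaining` is scoped to the loop body, and the carry-over assignment moves from the `for` header to the bottom of the loop:

```go
package main

import (
	"context"
	"log"
)

// drainLoop is a hypothetical stand-in for the loop above: remaining is
// local to the body, and prevRemaining is updated at the end of each
// iteration instead of in the for header.
func drainLoop(ctx context.Context, drainOnce func(context.Context) (uint64, error)) error {
	var prevRemaining uint64
	for {
		remaining, err := drainOnce(ctx)
		if err != nil {
			return err
		}
		if remaining == 0 {
			return nil // drain complete
		}
		if prevRemaining != 0 && remaining >= prevRemaining {
			log.Printf("drain not making progress: %d remaining", remaining)
		}
		prevRemaining = remaining // moved here from the for header
	}
}
```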

Comment on lines 223 to 229
drainServer := func() interface {
gracefulDrain(ctx context.Context, verbose bool) (uint64, redact.RedactableString, error)
} {
shutdownInterface.Lock()
defer shutdownInterface.Unlock()
return shutdownInterface.drainableServer
}()
@stevendanna (Collaborator) commented:

[nit] Perhaps I'm missing something, but this is a lot of readability overhead. I know we like defers for locks, but in this case

shutdownInterface.Lock()
drainServer := shutdownInterface.drainableServer
shutdownInterface.Unlock()

feels pretty attractive.

Overall, I think we would benefit from refactoring this a bit, so that we had a set of methods on serverEntry (or some new type) that wrapped this state management up in functions, rather than having it all inline in this function.

@stevendanna (Collaborator) commented:

As a quick follow-up: the refactor I mentioned is strictly future-looking, not required for this PR.

@knz knz force-pushed the 20230329-tenant-server-drain branch from 75a863c to 8326c94 on March 31, 2023 15:23
@knz knz requested a review from a team March 31, 2023 15:23
@knz knz requested a review from a team as a code owner March 31, 2023 15:23
@knz knz requested a review from mgartner March 31, 2023 15:23
@knz (Contributor, Author) left a comment

Previously, stevendanna (Steven Danna) wrote…

startControlledServer is a bit of a beast, but that is a fish to fry for another day.

I reduced the quantity of changes to the function. PTAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson and @stevendanna)


pkg/jobs/registry.go line 1074 at r2 (raw file):

Previously, stevendanna (Steven Danna) wrote…

We have examples of both patterns in the database, but I almost always put the Add() before launching the goroutine/async task. You don't have a guarantee that the scheduler has started the goroutine before you end up calling Wait().

Done.


pkg/server/server_controller_orchestration.go line 229 at r2 (raw file):

Previously, stevendanna (Steven Danna) wrote…

As a quick follow up, the refactor I mentioned is strictly future looking. Not required for this PR.

Done.


pkg/server/server_controller_orchestration.go line 561 at r2 (raw file):

Previously, stevendanna (Steven Danna) wrote…

[nit] I have a feeling this would become a little clearer to read if `remaining` became local to the scope of the block, and we set `prevRemaining = remaining` at the end of the loop, and then used `for {` here.

Done.

I also made this code shared with pkg/cli/start.go, as suggested earlier.

@knz knz force-pushed the 20230329-tenant-server-drain branch from 8326c94 to 83edc7f on March 31, 2023 15:36
@stevendanna (Collaborator) left a comment

Left a few nits and one minor race/leak. I'll leave the latter to your judgement.

// by the tenant server any more.
drainCtx, cancel := c.stopper.WithCancelOnQuiesce(tenantCtx)
defer cancel()
shutdownInterface.maybeCallDrain(drainCtx)
@stevendanna (Collaborator) commented:

I think we probably have other problems of this nature so perhaps it isn't worth fixing, but I think the current structure will fail to drain the server if the following sequence happens:

  1. The propagate-close task starts and is running.
  2. The managed-tenant-server task starts and reaches line 349 but has not run line 350.
  3. createServerEntryLocked returns the server entry, the entry is added to the servers slice, and the servers slice mutex is unlocked.
  4. requestAllStopped() is called.
  5. propagate-close sees the stop request and runs maybeCallDrain(), but we haven't set the server yet, so it reports nothing to drain, even though our server already has drainable components started.

I think this is minor because the server won't be serving any user connections yet (since the orchestrator doesn't think it is started). So it just means any jobs or other internal components might not get drained.

@@ -364,14 +375,34 @@ func (r *Refresher) getTableDescriptor(
return desc
}

// WaitForJobShutdown(ctx context.Context) {
@stevendanna (Collaborator) commented:

Stray comment

Comment on lines 1957 to 1958
// NB: Check the implementation of drain before adding code that would
// make this block.
@stevendanna (Collaborator) commented:

[nit] We can probably remove this comment now.

statsProvider.Flush(ctx)
statsProvider.Stop(ctx)

// Inform the async tasks for table stats that the ode is draining
@stevendanna (Collaborator) commented:

[nit] ode is draining

@knz knz added the backport-23.1.x Flags PRs that need to be backported to 23.1 label Mar 31, 2023
@knz (Contributor, Author) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson, @mgartner, and @stevendanna)


pkg/jobs/registry.go line 1958 at r6 (raw file):

Previously, stevendanna (Steven Danna) wrote…

[nit] We can probably remove this comment now.

Done.


pkg/server/drain.go line 405 at r6 (raw file):

Previously, stevendanna (Steven Danna) wrote…

[nit] ode is draining

Done.


pkg/server/server_controller_orchestration.go line 257 at r6 (raw file):

Previously, stevendanna (Steven Danna) wrote…

I think we probably have other problems of this nature so perhaps it isn't worth fixing, but I think the current structure will fail to drain the server if the following sequence happens:

  1. The propagate-close task starts and is running.
  2. The managed-tenant-server task starts and reaches line 349 but has not run line 350.
  3. createServerEntryLocked returns the server entry, the entry is added to the servers slice, and the servers slice mutex is unlocked.
  4. requestAllStopped() is called.
  5. propagate-close sees the stop request and runs maybeCallDrain(), but we haven't set the server yet, so it reports nothing to drain, even though our server already has drainable components started.

I think this is minor because the server won't be serving any user connections yet (since the orchestrator doesn't think it is started). So it just means any jobs or other internal components might not get drained.

I'm impressed and curious how you went about analyzing this condition 💯

The solution here is to split the instantiation of a new server and the starting of the new server into two separate steps (i.e. split (s *Server) startTenantServerInternal). We can then have the shutdown interface ready after instantiation and before start. This incidentally mirrors the dance done in cli/start.go, for exactly the same reason.

I believe this is needed anyway to solve #97661 / #98868. I'll mull it over.
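Roughly, the split would look like this (hypothetical names and signatures, just to show the ordering that closes the race window):

```go
package main

import "context"

type tenantServer struct{ /* elided */ }

func (s *tenantServer) start(ctx context.Context) error { return nil /* elided */ }

// startControlled sketches the new/start split: the drain hook is
// published between instantiation and start, so a drain request that
// arrives during startup can still find the server.
func startControlled(ctx context.Context, publishDrainable func(*tenantServer)) error {
	srv := &tenantServer{} // step 1: instantiate only; nothing running yet
	publishDrainable(srv)  // step 2: the shutdown interface is now ready
	return srv.start(ctx)  // step 3: start; a concurrent drain sees srv
}
```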


pkg/sql/stats/automatic_stats.go line 378 at r6 (raw file):

Previously, stevendanna (Steven Danna) wrote…

Stray comment

Done.

@knz (Contributor, Author) commented Apr 1, 2023

TODO for self: add a warning in the logs when a tenant server is shut down without a graceful drain. (Done.)

@knz knz force-pushed the 20230329-tenant-server-drain branch from 83edc7f to 8d04053 on April 2, 2023 10:46
@knz knz force-pushed the 20230329-tenant-server-drain branch from 8d04053 to d0e1081 on April 2, 2023 19:18
@knz knz requested a review from a team April 2, 2023 19:18
@knz knz requested a review from a team as a code owner April 2, 2023 19:18
@knz (Contributor, Author) left a comment

Rebased on top of #100436.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson, @mgartner, and @stevendanna)


pkg/server/server_controller_orchestration.go line 257 at r6 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

I'm impressed and curious how you went about analyzing this condition 💯

The solution here is to split the instantiation of a new server and the starting of the new server into two separate steps (i.e. split (s *Server) startTenantServerInternal). We can then have the shutdown interface ready after instantiation and before start. This incidentally mirrors the dance done in cli/start.go, for exactly the same reason.

I believe this is needed anyway to solve #97661 / #98868. I'll mull it over.

Done.

craig bot pushed a commit that referenced this pull request Apr 3, 2023
98459: server,autoconfig: automatic configuration via config tasks r=adityamaru a=knz

Epic: CRDB-23559
Informs #98431.
All commits but the last are from #98993.

This change introduces "auto config tasks", a mechanism through which
configuration payloads ("tasks") can be injected into a running SQL
service.

This is driven via the "auto config runner" job that was introduced in
the previous commit. The job listens for the arrival of new task
definitions via a `Provider` interface. When new tasks are known, and
previous tasks have completed, the runner creates a job for the first
next task.

Release note: None
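As a loose illustration of the runner flow described in that commit message (all types and names here are hypothetical, not the actual job code):

```go
package main

import "context"

type Task struct{ ID int64 }

// Provider is a hypothetical version of the interface mentioned above:
// it blocks until new task definitions arrive.
type Provider interface {
	NextTasks(ctx context.Context) ([]Task, error)
}

// runAutoConfig sketches the runner loop: wait for tasks, then create a
// job for each one in turn, only after the previous one has completed.
func runAutoConfig(ctx context.Context, p Provider, runAsJob func(context.Context, Task) error) error {
	for {
		tasks, err := p.NextTasks(ctx)
		if err != nil {
			return err // includes cancellation at shutdown
		}
		for _, t := range tasks {
			// runAsJob blocks until the task's job completes, which
			// enforces the "previous tasks have completed" ordering.
			if err := runAsJob(ctx, t); err != nil {
				return err
			}
		}
	}
}
```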

100476: server/drain: shut down SQL subsystems gracefully before releasing table leases r=JeffSwenson,rytaft a=knz

Needed for #99941 and #99958.
Epic: CRDB-23559

See individual commits for details.

100511: sqlccl: deflake TestGCTenantJobWaitsForProtectedTimestamps r=adityamaru,arulajmani a=knz

Fixes #94808

The tenant server must be shut down before the tenant record is removed; otherwise the tenant's SQL server will self-terminate by calling Stop() on its stopper, which in this case was shared with the parent cluster.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
@mgartner mgartner removed their request for review April 3, 2023 23:28
@knz knz changed the title jobs,server: graceful shutdown for jobs + secondary tenant servers jobs,server: graceful shutdown for secondary tenant servers Apr 4, 2023
@knz knz force-pushed the 20230329-tenant-server-drain branch from d0e1081 to a82e2c7 on April 4, 2023 11:34
craig bot pushed a commit that referenced this pull request Apr 4, 2023
100583: server,util: avoid retry forever during startup upon premature shutdown r=aliher1911 a=knz

Needed for #99958.
Probably helps with #100578.

We need to have the `RunIdempotentWithRetry` calls abort if the surrounding server shuts down prematurely. This happens most frequently in tests that fail for another reason; and also in multitenancy tests with multiple tenant server side-by-side.

Release note: None
Epic: None


Co-authored-by: Raphael 'kena' Poss <[email protected]>
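The shape of the `RunIdempotentWithRetry` change quoted above, sketched with CockroachDB's stop and retry utilities (the wrapper function itself is hypothetical, not the actual implementation):

```go
package main

import (
	"context"

	"github.com/cockroachdb/cockroach/pkg/util/retry"
	"github.com/cockroachdb/cockroach/pkg/util/stop"
)

// runIdempotentWithRetry ties the retry loop's context to the stopper,
// so pending retries abort when the surrounding server quiesces instead
// of retrying forever.
func runIdempotentWithRetry(
	ctx context.Context, stopper *stop.Stopper, fn func(context.Context) error,
) error {
	ctx, cancel := stopper.WithCancelOnQuiesce(ctx)
	defer cancel()
	for r := retry.StartWithCtx(ctx, retry.Options{}); r.Next(); {
		if err := fn(ctx); err == nil {
			return nil
		}
	}
	// r.Next() returns false once ctx is canceled, i.e. the server is
	// shutting down mid-retry; surface that error to the caller.
	return ctx.Err()
}
```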
@knz knz force-pushed the 20230329-tenant-server-drain branch 2 times, most recently from 6950c76 to 19c93cd on April 5, 2023 22:11
@knz knz requested a review from a team as a code owner April 5, 2023 22:11
@knz (Contributor, Author) commented Apr 5, 2023

Note to self: add a unit test using the new test knob. (Done.)

@knz (Contributor, Author) commented Apr 6, 2023

Note to self: abandon the drain when the stopper quiesces. (Done.)

@knz knz force-pushed the 20230329-tenant-server-drain branch from 19c93cd to 6009cae on April 6, 2023 14:27
knz added 10 commits April 6, 2023 16:50
ahead of splitting "new" vs "start" during construction.

Release note: None
This peels the call to "start" from the `newTenantServer`
interface and pulls it into the orchestration retry loop.

This change also incidentally reveals an earlier misdesign: we are
calling `newTenantServer` _then_ `start` in the same retry loop. If
`new` succeeds but `start` fails, the next retry will call
`newTenantServer` again *with the same stopper*, which will leak
closers from the previous call to `new`.

Release note: None
Prior to this patch, if an error occurred during the initialization or
startup of a secondary tenant server, the initialization would leak
state into the stopper defined for that tenant. Generally, reusing
a stopper across server startup failures is not safe (and an API
violation).

This patch fixes it by decoupling the intermediate stopper used for
orchestration from the one used per tenant server.

Release note: None
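A simplified sketch (hypothetical signatures) of the decoupling the two commits above describe: each attempt builds the server with its own stopper, so a failed attempt's closers are released rather than leaking into the next retry.

```go
package main

import (
	"context"

	"github.com/cockroachdb/cockroach/pkg/util/stop"
)

type tenantServer struct{ /* elided */ }

func (s *tenantServer) start(ctx context.Context) error { return nil /* elided */ }

// orchestrate sketches the retry loop after the fix: a fresh stopper
// per attempt, stopped on failure to release registered closers.
func orchestrate(
	ctx context.Context,
	newTenantServer func(context.Context, *stop.Stopper) (*tenantServer, error),
) error {
	for {
		attemptStopper := stop.NewStopper() // fresh stopper per attempt
		srv, err := newTenantServer(ctx, attemptStopper)
		if err == nil {
			err = srv.start(ctx)
		}
		if err == nil {
			return nil // running; the stopper now belongs to this server
		}
		attemptStopper.Stop(ctx) // release the failed attempt's closers
		if ctx.Err() != nil {
			return ctx.Err() // stop retrying if we're shutting down
		}
	}
}
```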
Prior to this patch, the test was not cleaning up its server stopper
reliably at the end of each sub-test. This patch fixes it.

Release note: None
This change ensures that tenant servers managed by the server
controller receive a graceful drain request as part of the graceful
drain process of the surrounding KV node.

This change, in turn, ensures that SQL clients connected to these
secondary tenant servers benefit from the same guarantees (and
graceful periods) as clients to the system tenant.

Release note: None
@knz knz force-pushed the 20230329-tenant-server-drain branch from 6009cae to 4d4c111 on April 6, 2023 14:51
@knz (Contributor, Author) commented Apr 6, 2023

bors r=stevendanna

craig bot (Contributor) commented Apr 6, 2023

Build succeeded.

@craig craig bot merged commit 7d21940 into cockroachdb:master Apr 6, 2023
blathers-crl bot commented Apr 6, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 8b7093b to blathers/backport-release-23.1-99958: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

You may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
