jobs,server: graceful shutdown for secondary tenant servers #99958
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Overall, this looks reasonable to me. startControlledServer is a bit of a beast, but that is a fish to fry for another day.
pkg/jobs/registry.go
Outdated
@@ -1066,6 +1071,9 @@ func (r *Registry) Start(ctx context.Context, stopper *stop.Stopper) error {
	})

	if err := stopper.RunAsyncTask(ctx, "jobs/cancel", func(ctx context.Context) {
		r.startedControllerTasksWG.Add(1)
We have examples of both patterns in the database, but I almost always put the Add() before launching the goroutine/async task. You don't have a guarantee that the scheduler has started the goroutine before you end up calling Wait().
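A minimal sketch of the ordering the reviewer describes, using a plain goroutine in place of the stopper's async task (the `startTask`/`runDemo` names are hypothetical, for illustration only):

```go
package main

import (
	"fmt"
	"sync"
)

// startTask launches a worker and records it in wg. The Add(1) happens
// before the goroutine is launched, so a concurrent Wait() cannot return
// before the worker has been accounted for. If Add(1) were the first
// statement inside the goroutine instead, the scheduler might not have
// run it yet when Wait() is called, and Wait() could return early.
func startTask(wg *sync.WaitGroup, done chan<- string) {
	wg.Add(1) // increment first: the goroutine may be scheduled late
	go func() {
		defer wg.Done()
		done <- "worker finished"
	}()
}

func runDemo() string {
	var wg sync.WaitGroup
	done := make(chan string, 1)
	startTask(&wg, done)
	wg.Wait() // guaranteed to see the worker, even if it ran late
	return <-done
}

func main() {
	fmt.Println(runDemo())
}
```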
drainCtx := logtags.WithTags(context.Background(), logtags.FromContext(ctx))

for ; ; prevRemaining = remaining {
[nit] I have a feeling this would become a little clearer to read if remaining became local to the scope of the block, and we set prevRemaining = remaining at the end of the loop, and then used for { here.
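A hedged sketch of the suggested loop shape, with `drainOnce`/`waitForDrain` as hypothetical stand-ins for the real drain reporting: `remaining` is local to each iteration and `prevRemaining` is updated at the bottom of the loop, so the `for` header stays bare.

```go
package main

import "fmt"

// drainOnce simulates one drain pass that makes progress on one item;
// a hypothetical stand-in for a server reporting how much work remains.
func drainOnce(remaining int) int {
	if remaining > 0 {
		return remaining - 1
	}
	return 0
}

// waitForDrain loops until a pass reports nothing remaining, returning
// the number of passes. remaining is scoped to the loop body, and
// prevRemaining is assigned at the end of each iteration, as suggested.
func waitForDrain(start int) int {
	prevRemaining := start
	passes := 0
	for {
		remaining := drainOnce(prevRemaining)
		passes++
		if remaining == 0 {
			return passes
		}
		prevRemaining = remaining
	}
}

func main() {
	fmt.Println(waitForDrain(3)) // three passes to fully drain
}
```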
drainServer := func() interface {
	gracefulDrain(ctx context.Context, verbose bool) (uint64, redact.RedactableString, error)
} {
	shutdownInterface.Lock()
	defer shutdownInterface.Unlock()
	return shutdownInterface.drainableServer
}()
[nit] Perhaps I'm missing something, but this is a lot of readability overhead. I know we like defers for locks, but in this case

	shutdownInterface.Lock()
	drainServer := shutdownInterface.drainableServer
	shutdownInterface.Unlock()

feels pretty attractive.
Overall, I think we would benefit from refactoring this a bit so that we had a set of methods on serverEntry, or some new type, that wrapped this state management up in functions, so it didn't all have to be inline in this function.
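A minimal sketch of the lock-copy-unlock pattern the reviewer prefers; `shutdownState`, `drainable`, and `tenantServer` here are hypothetical stand-ins for the PR's shutdownInterface and server types.

```go
package main

import (
	"fmt"
	"sync"
)

// drainable is a hypothetical stand-in for the interface the drain
// code needs from a server.
type drainable interface {
	name() string
}

type tenantServer struct{}

func (tenantServer) name() string { return "tenant" }

// shutdownState holds the currently drainable server behind a mutex,
// roughly mirroring the PR's shutdownInterface.
type shutdownState struct {
	mu              sync.Mutex
	drainableServer drainable
}

// currentServer reads the pointer with an explicit Lock/copy/Unlock
// instead of a wrapping closure with defer: the critical section is a
// single read, so the shorter form stays obviously correct.
func (s *shutdownState) currentServer() drainable {
	s.mu.Lock()
	srv := s.drainableServer
	s.mu.Unlock()
	return srv
}

func main() {
	st := &shutdownState{drainableServer: tenantServer{}}
	fmt.Println(st.currentServer().name())
}
```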
As a quick follow up, the refactor I mentioned is strictly future looking. Not required for this PR.
Force-pushed from 75a863c to 8326c94.
startControlledServer is a bit of a beast, but that is a fish to fry for another day.
I reduced the quantity of changes to the function. PTAL.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson and @stevendanna)
pkg/jobs/registry.go
line 1074 at r2 (raw file):
Previously, stevendanna (Steven Danna) wrote…
We have examples of both patterns in the database, but I almost always put the Add() before launching the goroutine/async task. You don't have a guarantee that the scheduler has started the goroutine before you end up calling Wait().
Done.
pkg/server/server_controller_orchestration.go
line 229 at r2 (raw file):
Previously, stevendanna (Steven Danna) wrote…
As a quick follow up, the refactor I mentioned is strictly future looking. Not required for this PR.
Done.
pkg/server/server_controller_orchestration.go
line 561 at r2 (raw file):
Previously, stevendanna (Steven Danna) wrote…
[nit] I have a feeling this would become a little clearer to read if remaining became local to the scope of the block, and we set prevRemaining = remaining at the end of the loop, and then used for { here.
Done.
Also I made this code shared with pkg/cli/start.go as suggested earlier.
Force-pushed from 8326c94 to 83edc7f.
Left a few nits and one minor race/leak. I'll leave the latter to your judgement.
// by the tenant server any more.
drainCtx, cancel := c.stopper.WithCancelOnQuiesce(tenantCtx)
defer cancel()
shutdownInterface.maybeCallDrain(drainCtx)
I think we probably have other problems of this nature, so perhaps it isn't worth fixing, but I think the current structure will fail to drain the server if the following sequence happens:
- The propagate-close task starts and is running.
- The managed-tenant-server task starts and reaches line 349 but has not run line 350.
- createServerEntryLocked returns the server entry, the entry is added to the servers slice, and the servers slice mutex is unlocked.
- requestAllStopped() is called.
- propagate-close sees the stop request and runs maybeCallDrain(), but we haven't set the server yet, so it reports nothing to drain, even though our server already has drainable components started.
I think this is minor because the server won't be serving any user connections yet (since the orchestrator doesn't think it is started). So it just means any jobs or other internal components might not get drained.
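The window described above can be sketched in miniature: if the drain request races ahead of the registration of the drainable server, the drain call is a silent no-op. The `shutdownState`/`maybeCallDrain` names are hypothetical simplifications of the PR's types.

```go
package main

import "fmt"

// shutdownState mimics the orchestration state: drainableServer stays
// nil until the managed-tenant-server task registers the server.
type shutdownState struct {
	drainableServer func() string // nil until set by the server task
}

// maybeCallDrain drains the registered server, or does nothing if no
// server has been registered yet, which is exactly the race window the
// review describes: the server exists, but drain can't see it.
func (s *shutdownState) maybeCallDrain() string {
	if s.drainableServer == nil {
		return "nothing to drain"
	}
	return s.drainableServer()
}

func main() {
	st := &shutdownState{}
	// Stop is requested before the server task registered itself:
	fmt.Println(st.maybeCallDrain())
	// Once registered, the same call performs the drain:
	st.drainableServer = func() string { return "drained" }
	fmt.Println(st.maybeCallDrain())
}
```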
pkg/sql/stats/automatic_stats.go
Outdated
@@ -364,14 +375,34 @@ func (r *Refresher) getTableDescriptor(
	return desc
}

// WaitForJobShutdown(ctx context.Context) {
Stray comment
pkg/jobs/registry.go
Outdated
// NB: Check the implementation of drain before adding code that would
// make this block.
[nit] We can probably remove this comment now.
pkg/server/drain.go
Outdated
statsProvider.Flush(ctx)
statsProvider.Stop(ctx)

// Inform the async tasks for table stats that the ode is draining
[nit] ode is draining
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson, @mgartner, and @stevendanna)
pkg/jobs/registry.go
line 1958 at r6 (raw file):
Previously, stevendanna (Steven Danna) wrote…
[nit] We can probably remove this comment now.
Done.
pkg/server/drain.go
line 405 at r6 (raw file):
Previously, stevendanna (Steven Danna) wrote…
[nit] ode is draining
Done.
pkg/server/server_controller_orchestration.go
line 257 at r6 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I think we probably have other problems of this nature, so perhaps it isn't worth fixing, but I think the current structure will fail to drain the server if the following sequence happens:
- The propagate-close task starts and is running.
- The managed-tenant-server task starts and reaches line 349 but has not run line 350.
- createServerEntryLocked returns the server entry, the entry is added to the servers slice, and the servers slice mutex is unlocked.
- requestAllStopped() is called.
- propagate-close sees the stop request and runs maybeCallDrain(), but we haven't set the server yet, so it reports nothing to drain, even though our server already has drainable components started.
I think this is minor because the server won't be serving any user connections yet (since the orchestrator doesn't think it is started). So it just means any jobs or other internal components might not get drained.
I'm impressed and curious how you went about analyzing this condition 💯
The solution here is to split the instantiation of a new server and the starting of the new server into two separate steps (i.e. split (s *Server) startTenantServerInternal). We can then have the shutdown interface ready after instantiation and before start. This incidentally mirrors the dance done in cli/start.go for exactly the same reason.
I believe this is needed anyway to solve #97661 / #98868. I'll mull it over.
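The proposed split can be sketched as follows; `newTenantServer` and `start` here are simplified, hypothetical signatures, not the actual CockroachDB ones. The point is only the ordering: construct, wire up the shutdown/drain hooks, then start.

```go
package main

import (
	"errors"
	"fmt"
)

// server is a hypothetical tenant server whose construction and startup
// are separate steps, mirroring the fix described above.
type server struct {
	started bool
}

// newTenantServer only instantiates. Nothing long-running begins here,
// so a drain hook can be registered before any drainable work exists.
func newTenantServer() (*server, error) {
	return &server{}, nil
}

// start performs the startup work. Because the shutdown interface can
// be wired up between newTenantServer and start, a drain request that
// arrives in that window still finds the server.
func (s *server) start() error {
	if s.started {
		return errors.New("already started")
	}
	s.started = true
	return nil
}

func main() {
	srv, err := newTenantServer()
	if err != nil {
		panic(err)
	}
	// Wire up shutdown/drain hooks here, before start.
	if err := srv.start(); err != nil {
		panic(err)
	}
	fmt.Println("server started:", srv.started)
}
```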
pkg/sql/stats/automatic_stats.go
line 378 at r6 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Stray comment
Done.
Force-pushed from 83edc7f to 8d04053.
Force-pushed from 8d04053 to d0e1081.
Rebased on top of #100436.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson, @mgartner, and @stevendanna)
pkg/server/server_controller_orchestration.go
line 257 at r6 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
I'm impressed and curious how you went about analyzing this condition 💯
The solution here is to split the instantiation of a new server and the starting of the new server into two separate steps (i.e. split (s *Server) startTenantServerInternal). We can then have the shutdown interface ready after instantiation and before start. This incidentally mirrors the dance done in cli/start.go for exactly the same reason. I believe this is needed anyway to solve #97661 / #98868. I'll mull it over.
Done.
98459: server,autoconfig: automatic configuration via config tasks r=adityamaru a=knz
Epic: CRDB-23559. Informs #98431. All commits but the last are from #98993. This change introduces "auto config tasks", a mechanism through which configuration payloads ("tasks") can be injected into a running SQL service. This is driven via the "auto config runner" job that was introduced in the previous commit. The job listens for the arrival of new task definitions via a `Provider` interface. When new tasks are known, and previous tasks have completed, the runner creates a job for the first next task. Release note: None

100476: server/drain: shut down SQL subsystems gracefully before releasing table leases r=JeffSwenson,rytaft a=knz
Needed for #99941 and #99958. Epic: CRDB-23559. See individual commits for details.

100511: sqlccl: deflake TestGCTenantJobWaitsForProtectedTimestamps r=adityamaru,arulajmani a=knz
Fixes #94808. The tenant server must be shut down before the tenant record is removed; otherwise the tenant's SQL server will self-terminate by calling Stop() on its stopper, which in this case was shared with the parent cluster. Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Force-pushed from d0e1081 to a82e2c7.
100583: server,util: avoid retry forever during startup upon premature shutdown r=aliher1911 a=knz
Needed for #99958. Probably helps with #100578. We need to have the `RunIdempotentWithRetry` calls abort if the surrounding server shuts down prematurely. This happens most frequently in tests that fail for another reason, and also in multitenancy tests with multiple tenant servers side-by-side. Release note: None
Epic: None
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Force-pushed from 6950c76 to 19c93cd.
Force-pushed from 19c93cd to 6009cae.
…enantStartup Release note: None
Release note: None
Release note: None
Release note: None
ahead of splitting "new" vs "start" during construction. Release note: None
Release note: None
This peels the call to "start" from the `newTenantServer` interface and pulls it into the orchestration retry loop. This change also incidentally reveals an earlier misdesign: we are calling `newTenantServer` _then_ `start` in the same retry loop. If `new` succeeds but `start` fails, the next retry will call `newTenantServer` again *with the same stopper*, which will leak closers from the previous call to `new`. Release note: None
Prior to this patch, if an error occurred during the initialization or startup of a secondary tenant server, the initialization would leak state into the stopper defined for that tenant. Generally, reusing a stopper across server startup failures is not safe (an API violation). This patch fixes it by decoupling the intermediate stopper used for orchestration from the one used per tenant server. Release note: None
Prior to this patch, the test was not cleaning up its server stopper reliably at the end of each sub-test. This patch fixes it. Release note: None
This change ensures that tenant servers managed by the server controller receive a graceful drain request as part of the graceful drain process of the surrounding KV node. This change, in turn, ensures that SQL clients connected to these secondary tenant servers benefit from the same guarantees (and graceful periods) as clients to the system tenant. Release note: None
Force-pushed from 6009cae to 4d4c111.
bors r=stevendanna
Build succeeded.
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 8b7093b to blathers/backport-release-23.1-99958: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []; you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 23.1.x failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Epic: CRDB-23559
Fixes #92523.
All commits but the last are from #100436.
This change ensures that tenant servers managed by the server
controller receive a graceful drain request as part of the graceful
drain process of the surrounding KV node.
This change, in turn, ensures that SQL clients connected to these
secondary tenant servers benefit from the same guarantees (and
graceful periods) as clients to the system tenant.