-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: wait for active configuration profiles during startup #113666
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru and @stevendanna)
pkg/server/server_sql.go
line 1774 at r1 (raw file):
envs := s.execCfg.AutoConfigProvider.ActiveEnvironments() if len(envs) == 0 { log.Infof(ctx, "auto-configuration environments not set or already complete")
nit: preStart
is called in both system and application tenants. Do we have any auto config changes in the system tenant? If not, should we consider omitting this log message for system tenants then?
pkg/server/autoconfig/auto_config_test.go
line 109 at r1 (raw file):
} func (p *testProvider) ActiveEnvironments() []autoconfigpb.EnvironmentID {
nit: should we adjust the contract on Provider.ActiveEnvironments
to explicitly mention that nil
(or zero length slice) is returned once there are no more tasks left to run?
pkg/server/autoconfig/auto_config_test.go
line 348 at r1 (raw file):
Provider: provider, }, // Server: &server.TestingKnobs{
nit: leftover.
Before, we did not wait for these configuration profiles at startup. This produces confusing behaviour where the behaviour of the cluster changes substantially a few moments after startup. For instance the replication-source configuration profile runs SET CLUSTER SETT server.controller.default_target_cluster = 'application' If this is delayed until after we start accepting connections, then for a few moments new connections will go to the system tenant and then afterwards they will go to the application tenant. Here, we narrow the window of confusion by waiting for the configuration profiles to be complete during the preStart sequence in SQL. Note that this doesn't solve startup coordination. There are still at least two problems: 1. Async server startup still means the user may get an error for a few moments after startup until the server is started. 2. Async settings propagation still means that the default_target_cluster setting can still be delayed. Informs cockroachdb#111637 Release note: None
a4ebf87
to
0969935
Compare
bors r=yuzefovich |
Build failed: |
bors retry |
Build succeeded: |
113687: server: option to wait for default_target_cluster on connect r=yuzefovich a=stevendanna Virtual cluster startup is asynchronous. As a result, it can be hard to orchestrate bringing a virtual cluster up since the client needs to retry until the cluster is available. This makes it a little easier by optionally allowing connections to the default_target_cluster to be held until the cluster is online. As a result, a user who does: ALTER VIRTUAL CLUSTER app START SERVICE SHARED SET CLUSTER SETTING server.controller.default_target_cluster = 'app' Should be able to immediately connect to the same node without having to handle retries while the virtual cluster starts up. Note that we can extend this to wait for _any_ tenant. The restriction on only waiting on the default_target_cluster is that we need to thread a little state into the server controller in order to avoid pitfalls such as holding a lot of connections open for longer than needed for a tenant that will never become available because it doesn't exist. In a future PR, we will set this option by default in the replication-source and replication-target configuration profiles. Built on #113666 Informs #111637 Release note: None Co-authored-by: Steven Danna <[email protected]>
Before, we did not wait for these configuration profiles at startup. This produces confusing behaviour where the behaviour of the cluster changes substantially a few moments after startup.
For instance the replication-source configuration profile runs
SET CLUSTER SETTING server.controller.default_target_cluster = 'application'
If this is delayed until after we start accepting connections, then for a few moments new connections will go to the system tenant and then afterwards they will go to the application tenant.
Here, we narrow the window of confusion by waiting for the configuration profiles to be complete during the preStart sequence in SQL.
Note that this doesn't solve startup coordination. There are still at least two problems:
Async server startup still means the user may get an error for a few moments after startup until the server is started.
Async settings propagation still means that the default_target_cluster setting can still be delayed.
Informs #111637
Release note: None