-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: only run tenants with service mode 'shared' #95658
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
b2b2fac
to
cb7c7b9
Compare
7825bab
to
87a11e6
Compare
cb7c7b9
to
e1b76bd
Compare
87a11e6
to
ff07b82
Compare
ff07b82
to
d4daf0c
Compare
d13c9bc
to
91ccefe
Compare
1a0014b
to
52787d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed the second commit only.
Overall, I think the polling approach will work for now. Do we need to add a test to ensure that ensures that the STOP SERVICE command does tear down the server?
pkg/server/server_controller.go
Outdated
if _, err := c.createServer(ctx, tenantName, base.TestSharedProcessTenantArgs{}); err != nil { | ||
log.Warningf(ctx, "unable to start service for tenant: %v", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want any backoff for tenants that we can't start for some reason? Currently, we'll retry every second which seems a bit aggressive if we've gotten an error.
pkg/server/server_controller.go
Outdated
|
||
rows, err := ie.QueryBuffered(ctx, "fetch-running-tenants", | ||
nil, /* txn */ | ||
`SELECT name FROM system.tenants WHERE service_mode = $1 AND name IS NOT NULL`, mtinfopb.ServiceModeShared) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we prevent it on write, but do we want an extra condition here that the data_mode is also in the correct state?
pkg/server/server_controller.go
Outdated
if _, ok := reqTenants[name]; !ok { | ||
// Tenant currently running, but should not be running. Shut it down. | ||
if err := c.stopper.RunAsyncTask(ctx, "stop-tenant-async", func(ctx context.Context) { | ||
srv.server.stop(context.Background()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume we pass context.Background() here so that the stop process isn't canceled when we start shutting down. If not, the real reason could be worth a comment.
pkg/server/admin.go
Outdated
ie := s.internalExecutor | ||
rowIter, err := ie.QueryIterator(ctx, "list-tenants", nil, /* txn */ | ||
`SELECT name FROM system.tenants WHERE service_mode = $1 AND name IS NOT NULL`, mtinfopb.ServiceModeShared) | ||
if err != nil { | ||
return nil, err | ||
} | ||
defer func() { _ = rowIter.Close() }() | ||
|
||
var hasNext bool | ||
for hasNext, err = rowIter.Next(ctx); hasNext && err == nil; hasNext, err = rowIter.Next(ctx) { | ||
row := rowIter.Cur() | ||
tenantName := tree.MustBeDString(row[0]) | ||
tenantNames = append(tenantNames, roachpb.TenantName(tenantName)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see extracting this to a small function.
pkg/server/admin.go
Outdated
row := rowIter.Cur() | ||
tenantName := tree.MustBeDString(row[0]) | ||
tenantNames = append(tenantNames, roachpb.TenantName(tenantName)) | ||
} | ||
} | ||
|
||
var tenantList []*serverpb.Tenant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we could pre-allocate this since in the common case the tenantList will be the same length as tenantNames.
// ListTenants returns a list of tenants that are served | ||
// by shared-process services in this server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am assuming that existing callers of this API are OK with this change?
52787d8
to
5fcce46
Compare
2e5316d
to
273dc19
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add a test to ensure that ensures that the STOP SERVICE command does tear down the server?
Absolutely! I had forgotten. Good catch. Done.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @herkolategan, @renatolabs, @rhu713, and @stevendanna)
pkg/server/admin.go
line 4226 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I am assuming that existing callers of this API are OK with this change?
Yes, we chatted about it together.
pkg/server/admin.go
line 4247 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I could see extracting this to a small function.
Good idea. Done.
pkg/server/admin.go
line 4250 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I suppose we could pre-allocate this since in the common case the tenantList will be the same length as tenantNames.
Done.
pkg/server/server_controller.go
line 247 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I assume we pass context.Background() here so that the stop process isn't canceled when we start shutting down. If not, the real reason could be worth a comment.
Indeed - still added a comment to clarify.
pkg/server/server_controller.go
line 263 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Do we want any backoff for tenants that we can't start for some reason? Currently, we'll retry every second which seems a bit aggressive if we've gotten an error.
OK. I'll do that next.
pkg/server/server_controller.go
line 288 at r1 (raw file):
Previously, stevendanna (Steven Danna) wrote…
I know we prevent it on write, but do we want an extra condition here that the data_mode is also in the correct state?
Let's do it. Done.
273dc19
to
67c573c
Compare
bcee01c
to
2f56e7a
Compare
Prior to this patch, services for secondary tenants would be started automatically upon first use by a client. This commit changes this to auto-start services upfront for all tenants with service mode SHARED. (And shut down services for tenants with another service mode configured.) Release note: None
2f56e7a
to
3ff9bc9
Compare
TFYR! bors r=stevendanna |
Build failed: |
unrelated flake #96136 bors r=stevendanna |
Build succeeded: |
96143: bench: adjust the secondary tenant initialization r=yuzefovich a=knz With the changes in #95658 this can now be simplified. Release note: None Epic: CRDB-14537 Co-authored-by: Raphael 'kena' Poss <[email protected]>
Fixes #92739.
Fixes #96146.
RFC for context: #96147
Previous PRs:
Prior to this patch, services for secondary tenants would be started
automatically upon first use by a client.
This commit changes this to auto-start services upfront for all
tenants with service mode SHARED. (And shut down services for tenants
with another service mode configured.)
Release note: None
Epic: CRDB-14537