-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachtest: update multitenant/distsql to use new roachprod service APIs #115599
roachtest: update multitenant/distsql to use new roachprod service APIs #115599
Conversation
eedde77
to
fd14aa1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very excited about this change! 🎉
Reviewed 1 of 1 files at r1, 4 of 4 files at r2, 1 of 1 files at r3, 2 of 2 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @herkolategan and @srosenberg)
pkg/cmd/roachtest/cluster.go
line 1929 at r2 (raw file):
func (c *clusterImpl) configureClusterSettingOptions( defaultEnv map[string]interface{},
What's the rationale for this defaultEnv
parameter? From what I can see, we only use it for COCKROACH_CRASH_ON_SPAN_USE_AFTER_FINISH
. Is there a reason why we wouldn't want that variable in SQL processes? I'm wondering if we should enforce all cockroach processes to have the same default variables (just like we're doing for the random seed var) and eliminate this argument.
pkg/cmd/roachtest/cluster.go
line 2031 at r2 (raw file):
if settings.Secure { if err := c.RefetchCertsFromNode(ctx, 1); err != nil {
Should this really be 1
and not first node in c.MakeNodes(opts...)
?
pkg/cmd/roachtest/cluster.go
line 2011 at r4 (raw file):
} func (c *clusterImpl) StartServiceForVirtualClusterE(
This function could use some documentation, especially about externalNodes
and in-memory vs external process tenants.
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 71 at r3 (raw file):
c.Start(ctx, t.L(), option.DefaultStartOpts(), settings, c.Node(2)) c.Start(ctx, t.L(), option.DefaultStartOpts(), settings, c.Node(3)) storNodes := c.Range(1, 3)
Did you mean storageNodes
?
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 81 at r3 (raw file):
instStartOps.RoachprodOpts.Target = install.StartServiceForVirtualCluster instStartOps.RoachprodOpts.VirtualClusterName = tenantName instStartOps.RoachprodOpts.SQLInstance = sqlInstance
Interesting, we should probably find a way to package these into something higher level too.
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 85 at r3 (raw file):
t.L().Printf("Starting instance %d on node %d", i, node) instStartOps.RoachprodOpts.SQLPort = 0 instStartOps.RoachprodOpts.AdminUIPort = 0
Should we add a TODO to remove this once #111052 is resolved?
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 86 at r3 (raw file):
instStartOps.RoachprodOpts.SQLPort = 0 instStartOps.RoachprodOpts.AdminUIPort = 0 err := c.StartServiceForVirtualClusterE(ctx, t.L(), c.Node(node), instStartOps, settings, storNodes)
Why not StartServiceForVirtualCluster
?
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 101 at r3 (raw file):
m := c.NewMonitor(ctx, c.Nodes(1, 2, 3)) //inst1Conn, err := gosql.Open("postgres", c.PGUrl(ctx, 1)
Delete?
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 146 at r3 (raw file):
m.Wait() wg.Done() }()
I don't understand this block, why do we need it? I also don't see a wg.Wait()
call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @herkolategan)
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 81 at r4 (raw file):
instStartOps.RoachprodOpts.Target = install.StartServiceForVirtualCluster instStartOps.RoachprodOpts.VirtualClusterName = tenantName instStartOps.RoachprodOpts.SQLInstance = sqlInstance
Can someone remind me what SQLInstance
denotes? I can see that it's expected to be 0
in case of StartSharedProcessForVirtualCluster
, but is exact meaning is somewhat obscured by the implementation of service registry/discovery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @herkolategan)
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 65 at r4 (raw file):
// performance and resource limitations. We set the minimum range max bytes to // 1 byte to bypass the guardrails. install.MakeClusterSettings(install.SecureOption(true))
Remove?
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 81 at r4 (raw file):
Ok, I found the definition in the PR commit message [1] :).
A SQL Instance can be viewed as a unique identifier for the SQL Instance that is running on a VM in the event that there are more than one SQL Instance serving the same virtual cluster on that VM.
[1] #111582
eb045d1
to
b231bfa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @renatolabs and @srosenberg)
pkg/cmd/roachtest/cluster.go
line 1929 at r2 (raw file):
Previously, renatolabs (Renato Costa) wrote…
What's the rationale for this
defaultEnv
parameter? From what I can see, we only use it forCOCKROACH_CRASH_ON_SPAN_USE_AFTER_FINISH
. Is there a reason why we wouldn't want that variable in SQL processes? I'm wondering if we should enforce all cockroach processes to have the same default variables (just like we're doing for the random seed var) and eliminate this argument.
Good point, removed defaultEnv
and made the defaults common to both.
pkg/cmd/roachtest/cluster.go
line 2031 at r2 (raw file):
Previously, renatolabs (Renato Costa) wrote…
Should this really be
1
and not first node inc.MakeNodes(opts...)
?
Copied this code over from StartE
, but I think there is an implicit contract (in roachprod
) that the absolute first node of any cluster will always carry the certs. Thus after the nodes are started, or any subset of nodes are started, node 1 will contain the latest and will need to be refetched for roachtest
to operate from locally.
pkg/cmd/roachtest/cluster.go
line 2011 at r4 (raw file):
Previously, renatolabs (Renato Costa) wrote…
This function could use some documentation, especially about
externalNodes
and in-memory vs external process tenants.
Good point, I'll add some docs.
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 71 at r3 (raw file):
Previously, renatolabs (Renato Costa) wrote…
Did you mean
storageNodes
?
Naming is from before, but I like the more explicit storageNodes
, will update.
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 81 at r3 (raw file):
Previously, renatolabs (Renato Costa) wrote…
Interesting, we should probably find a way to package these into something higher level too.
True, I can add a convenience function for service start opts.
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 85 at r3 (raw file):
Previously, renatolabs (Renato Costa) wrote…
Should we add a TODO to remove this once #111052 is resolved?
Good idea, done.
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 86 at r3 (raw file):
Previously, renatolabs (Renato Costa) wrote…
Why not
StartServiceForVirtualCluster
?
Opted to try and keep the test similar to how it was with the same logging intact as previously, but can switch if it's cleaner.
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 101 at r3 (raw file):
Previously, renatolabs (Renato Costa) wrote…
Delete?
yep, leftover refactoring comment. good catch.
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 146 at r3 (raw file):
Previously, renatolabs (Renato Costa) wrote…
I don't understand this block, why do we need it? I also don't see a
wg.Wait()
call.
Good catch, I did forget the wait; not that it helped unfortunately. This was introduced as a workaround because of a bug I intend to log an issue for, but just realised the workaround also does not work, because I forgot the wait group wait. So I'll remove all of this and log the bug for how this gets stuck on local clusters.
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 65 at r4 (raw file):
Previously, srosenberg (Stan Rosenberg) wrote…
Remove?
Oops, yip my eyes glanced over that one.
pkg/cmd/roachtest/tests/multitenant_distsql.go
line 81 at r4 (raw file):
Previously, srosenberg (Stan Rosenberg) wrote…
Ok, I found the definition in the PR commit message [1] :).
A SQL Instance can be viewed as a unique identifier for the SQL Instance that is running on a VM in the event that there are more than one SQL Instance serving the same virtual cluster on that VM.
[1] #111582
Yes, not the clearest of terms, but denotes the ith SQL instance of an external process tenant on a specific VM.
5ce7353
to
1b14799
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Updated a few things from comments, and for quickref latest distsql only TC run just for sanity checking (passed): https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestNightlyGceBazel/13101830
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @renatolabs and @srosenberg)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r5, 4 of 4 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan and @srosenberg)
pkg/cmd/roachtest/option/options.go
line 61 at r14 (raw file):
// DefaultStartVirtualClusterOpts returns StartOpts for starting an external // process virtual cluster with the given tenant name and SQL instance. func DefaultStartVirtualClusterOpts(tenantName string, sqlInstance int) StartOpts {
Should we take the Target
as argument? Or at least make it explicit in the function name that this refers to external process deployment?
pkg/cmd/roachtest/option/options.go
line 71 at r14 (raw file):
// TODO(herko): remove this once dynamic port assignment is the default. startOpts.RoachprodOpts.SQLPort = 0 startOpts.RoachprodOpts.AdminUIPort = 0
The issue was closed today, we should be able to delete this now.
Epic: None Release Note: None
Previously the cluster interface only exposed a method to start storage nodes, but that is insufficient to start virtual clusters that have a separate method on the `roachprod` API (for starting). This change adds a new method `StartServiceForVirtualCluster` to the cluster interface to enable roachtests to start virtual clusters. Some refactoring was required to enable different sets of cluster settings, depending on what service type is going to be started. There are now two sets of cluster settings that can be utilised in `test_runner`. For virtual clusters `virtualClusterSettings` will be used, and for storage clusters `clusterSettings` will be utilised. Epic: None Release Note: None
Previously the multitenant distsql roachtest relied on an internal util in `roachtest` to start virtual clusters. This change updates the test to use the new official `roachtest` and `roachprod` APIs for starting virtual clusters. Fixes: cockroachdb#116019 Epic: None Release Note: None
Epic: None Release Note: None
Epic: None Release Note: None
1b14799
to
9f2551b
Compare
Add a convenience function to return `StartOpts` for starting an external process virtual cluster. Epic: None Release Note: None
9f2551b
to
555437e
Compare
TFTRs! bors r=renatolabs,srosenberg |
Build succeeded: |
This was originally removed in cockroachdb#115599 due to cockroachdb#114097 merging, but adminui was reverted in cockroachdb#117141 and mistakenly did not revert the special case for virtual clusters. Release note: None
This was originally removed in cockroachdb#115599 due to cockroachdb#114097 merging, but adminui was reverted in cockroachdb#117141 and mistakenly did not revert the special case for virtual clusters. Release note: None
117505: roachtest: assign adminui ports dynamically for virtual clusters r=srosenberg,renatolabs a=DarrylWong This was originally removed in #115599 due to #114097 merging, but adminui was reverted in #117141 and mistakenly did not revert the special case for virtual clusters. We unskip the multitenant/distsql tests as well. Release note: None Fixes: #117150 Fixes: #117149 Epic: None 117545: rpc: rm rangefeed RPC stream window special case r=erikgrinaker,miretskiy a=pav-kv The rangefeed stream window size tuning was introduced to mitigate OOM in rangefeeds caused by the excessive number of streams (one per `Range`). Since we now use mux rangefeeds (which multiplexes all the rangefeed traffic into a single stream), this setting is no longer needed, so this commit removes it. Part of #108992 Release note (ops change): `COCKROACH_RANGEFEED_RPC_INITIAL_WINDOW_SIZE` env variable has been removed, and rangefeed connection now uses the same window size as other RPC connections. 117554: sqlproxyccl: improve authentication throttle error r=JeffSwenson a=JeffSwenson The sql proxy will throttle connection attempts if a (client IP, tenant cluster) pair has too many authentication failures. The error is usually caused by a misconfigured password in a connection pool. This change replaces the "connection attempt throttled" error message with "too many failed authentication attempts". There is a hint that includes this message but not all drivers are configured to log hints. Fixes #117552 Co-authored-by: DarrylWong <[email protected]> Co-authored-by: Pavel Kalinnikov <[email protected]> Co-authored-by: Jeff <[email protected]>
This was originally removed in cockroachdb#115599 due to cockroachdb#114097 merging, but adminui was reverted in cockroachdb#117141 and mistakenly did not revert the special case for virtual clusters. Release note: None
This was originally removed in cockroachdb#115599 due to cockroachdb#114097 merging, but adminui was reverted in cockroachdb#117141 and mistakenly did not revert the special case for virtual clusters. Release note: None
Previously the multitenant distsql roachtest relied on an internal util in
roachtest
to start virtual clusters. This PR updates the test to use the new officialroachtest
androachprod
APIs for starting virtual clusters.Some additional changes were required to support upgrading the test. The cluster interface only exposed a method to start storage nodes, but that is insufficient to start virtual clusters that have a separate method on the
roachprod
API (for starting).This change adds a new method
StartServiceForVirtualCluster
to the cluster interface to enable roachtests to start virtual clusters. Some refactoring was required to enable different sets of cluster settings, depending on what servicetype is going to be started.
There are now two sets of cluster settings that can be utilised in
test_runner
. For virtual clustersvirtualClusterSettings
will be used, and for storage clustersclusterSettings
will be utilised.Fixes: #116019
Release Note: None
Epic: CRDB-31933