roachtest: remove direct go calls #134644

herkolategan · 2024-11-08T12:49:50Z

Previously, tests had to use bare goroutine calls to start a goroutine. This change removes all bare goroutine calls and replaces them with the new task APIs.

The use of errgroup.Group and Monitor still remains and will be addressed in a different PR.

Informs: #118214

Epic: None
Release note: None

Previously, tests had to use bare goroutine calls to start a goroutine. This change removes all bare goroutine calls and replaces them with the new task APIs.

Most of the tests already handle errors from goroutines via a channel. This logic is best left in its current state. New implementations can choose whether they want to fail a test by returning an error from a managed goroutine, or would rather handle errors some other way.

The use of `errgroup.Group` and `Monitor` still remains and will be addressed in a different PR.

Informs: cockroachdb#118214

Epic: None
Release note: None

DarrylWong · 2024-11-20T16:51:20Z

pkg/cmd/roachtest/tests/allocator.go

@@ -81,18 +82,10 @@ func registerAllocator(r registry.Registry) {
 		c.Start(ctx, t.L(), startOpts, install.MakeClusterSettings(), c.Range(start+1, nodes))
 		c.Run(ctx, option.WithNodes(c.Node(1)), "./cockroach workload init kv --drop {pgurl:1}")
 		for node := 1; node <= nodes; node++ {
-			node := node
-			// TODO(dan): Ideally, the test would fail if this queryload failed,


DarrylWong · 2024-11-20T17:59:35Z

pkg/cmd/roachtest/tests/multitenant_upgrade.go

 				defer wg.Done()
 				<-upgradeFinished
 				l.Printf("tenant upgrades finished")
-			}()
+				return nil
+			})

 			wg.Wait()


Do you think there's value in having a t.Wait that blocks until all goroutines return? Similar to how m.Wait functions? I imagine we will see more need of this when the monitor refactor happens.

Yes! I do see value in it, hence a follow-up PR that adds group management to the task manager: #135831
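For readers wondering what a blocking wait could look like, here is a rough sketch. The `group` type and its `Go` and `Wait` methods are hypothetical names chosen for this example, similar in spirit to `errgroup.Group`; the actual design is whatever the follow-up PR settles on. It replaces the manual `wg.Done()` / `wg.Wait()` bookkeeping seen in the snippet above with a single `Wait` call.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// group is a hypothetical sketch of task-group management. It is not the
// roachtest API.
type group struct {
	wg   sync.WaitGroup
	once sync.Once
	err  error // first error returned by any task in the group
}

// Go starts fn in a goroutine tracked by the group.
func (g *group) Go(fn func() error) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		if err := fn(); err != nil {
			g.once.Do(func() { g.err = err })
		}
	}()
}

// Wait blocks until every task started with Go has returned, then reports
// the first error, if any.
func (g *group) Wait() error {
	g.wg.Wait()
	return g.err
}

// waitForTenants starts n tasks that block until upgradeFinished is closed,
// then waits for all of them and returns how many completed.
func waitForTenants(n int) (int, error) {
	var g group
	var finished atomic.Int32
	upgradeFinished := make(chan struct{})
	for i := 0; i < n; i++ {
		g.Go(func() error {
			<-upgradeFinished
			finished.Add(1)
			return nil
		})
	}
	close(upgradeFinished)
	err := g.Wait()
	return int(finished.Load()), err
}

func main() {
	n, err := waitForTenants(3)
	fmt.Println("tasks finished:", n, "error:", err)
}
```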

DarrylWong · 2024-11-20T18:05:56Z

pkg/cmd/roachtest/tests/cancel.go

 							if _, err := runnerConn.Exec(setupQuery); err != nil {
 								errCh <- err
 								close(sem)
-								return
+								// Errors are handled in the main goroutine.
+								return nil //nolint:returnerrcheck


nit: it's not immediately obvious that returning errors in t.Go fails the test. I think it makes sense for that to be the behavior, but you have to go down a couple of layers and peek at the implementation to confirm that is the case.

Agree with Darryl's comment; we should make this explicit in the interface doc. Also, this test could technically be refactored further to explicitly return err instead of passing it into errCh, but I also understand if you want to reduce the footprint of all changes in this PR.

I had the same thought, will need to make it clear that returning an error from a task will result in a test failure. We could alternatively provide a version of the Go/GoWithCancel methods that does not take an error return.

From the footprint perspective, I didn't want to fiddle too much with already working code. But we could definitely make some of these more ergonomic. I'll create an issue for it: since engineers tend to borrow from existing code, it would be better if the examples were improved.

> We could alternatively provide a version of the Go/GoWithCancel methods that does not take an error return.

No bother, it's not that hard to add return nil :)

srosenberg · 2024-11-20T19:21:17Z

pkg/cmd/roachtest/tests/allocator.go

 				cmd := fmt.Sprintf("./cockroach workload run kv --tolerate-errors --min-block-bytes=8 --max-block-bytes=127 {pgurl%s}", c.Node(node))
-				l, err := t.L().ChildLogger(fmt.Sprintf(`kv-%d`, node))


Looks like this logger was previously unused?

Yep, I had to look twice before removing it, but it was in fact dead code.

srosenberg · 2024-11-20T19:29:14Z

pkg/cmd/roachtest/tests/cluster_init.go

@@ -87,11 +88,11 @@ func runClusterInit(ctx context.Context, t test.Test, c cluster.Cluster) {
 		t.L().Printf("checking that the SQL conns are not failing immediately")
 		errCh := make(chan error, len(dbs))
 		for _, db := range dbs {
-			db := db


Good riddance :)

srosenberg

Thanks for refactoring all "naked" go calls! LGTM. We should probably do SELECT_PROBABILITY=0.6 to make sure we hit enough tests to shake out any potential regressions.

herkolategan · 2024-11-21T16:15:04Z

> Thanks for refactoring all "naked" go calls! LGTM. We should probably do SELECT_PROBABILITY=0.6 to make sure we hit enough tests to shake out any potential regressions.

Definitely, I ran a targeted job on all the affected tests with a rather large regex. But I think a 0.6 probability would also make for a good final check. Will do one before merging today/tomorrow.

Added a comment to inform test implementors using the task API that returning an error from a Tasker goroutine will fail a test.

Informs: cockroachdb#118214

Epic: None
Release note: None

herkolategan · 2024-11-27T11:26:43Z

TFTRs!

bors r=srosenberg,DarrylWong

craig · 2024-11-27T11:56:01Z

Build succeeded:

herkolategan force-pushed the hbl/roachtest-remove-bare-go-routines branch 4 times, most recently from 118a08e to b9e52e5, on November 18, 2024 17:03

herkolategan force-pushed the hbl/roachtest-remove-bare-go-routines branch from b9e52e5 to 716be3f on November 19, 2024 14:23

herkolategan marked this pull request as ready for review November 20, 2024 16:41

herkolategan requested a review from a team as a code owner November 20, 2024 16:41

herkolategan requested review from nameisbhaskar and vidit-bhat and removed request for a team November 20, 2024 16:41

DarrylWong reviewed Nov 20, 2024

srosenberg reviewed Nov 20, 2024

srosenberg self-requested a review November 20, 2024 19:29

srosenberg approved these changes Nov 20, 2024

roachtest: add Tasker error handling comment

fd90592

DarrylWong approved these changes Nov 22, 2024

craig bot merged commit a86b127 into cockroachdb:master Nov 27, 2024
23 checks passed
