schemachanger: Refactor tests for concurrent schema changer behaviors #108451

Merged

Contributor

@Xiang-Gu Xiang-Gu commented Aug 9, 2023

  1. It cleans up some redundant tests of concurrent schema changer behavior and refactors them into a new, simpler, cleaner test.
  2. It adds an integration-style test for concurrent schema change behavior, in which we run many schema changes for an extended period of time and assert that all of them eventually succeed and that the descriptors end up in the expected state.

Fix #108140
Fix #107223

Epic: None
Release note: None

@cockroach-teamcity
Member

This change is Reviewable

@Xiang-Gu Xiang-Gu changed the title schemachanger: Cleanup redundant tests and refactor it schemachanger: Refactor tests for concurrent schema changer behaviors Aug 9, 2023
@Xiang-Gu Xiang-Gu force-pushed the concurrent-schema-changer-tests-refactoring branch from be61133 to b9dd59f Compare August 9, 2023 15:57
@Xiang-Gu
Contributor Author

Xiang-Gu commented Aug 9, 2023

The CI failures are flaky tests unrelated to this PR. Ready for a look!

@Xiang-Gu Xiang-Gu marked this pull request as ready for review August 9, 2023 16:43
@Xiang-Gu Xiang-Gu requested a review from a team as a code owner August 9, 2023 16:43
@rafiss
Collaborator

rafiss commented Aug 9, 2023

nice work! quick question, do you know if the new test would have been able to catch the bug described here? #106933

@Xiang-Gu
Contributor Author

@rafiss yes, but with a different error! I looked at the PR that fixes that bug; it essentially added two lines. So I did the following:

  1. Ran the test as it is in this PR; it passes.
  2. Removed one of the two lines and re-ran the test; rather quickly, it fails with the error
    schemachanger_test.go:770: pq: internal error: executing declarative schema change PreCommitPhase stage 2 of 2 with 3 MutationType ops (rollback=false) for DROP INDEX: error executing PreCommitPhase stage 2 of 2 with 3 MutationType ops: *scop.SetJobStateOnDescriptor: ... unexpected schema change job ID 889775506765152257 on table 107, expected 0
  3. Added that line back and re-ran the test; it passes.
  4. Removed the other line and re-ran the test; rather quickly, it fails with the same type of error.
  5. Added the other line back and re-ran the test; it passes.

This commit recognizes that there were previously three redundant
tests about concurrent schema changer behavior, so it deletes them and
rewrites them as a single, simpler test.

Release note: None
…hanger behavior

This commit adds an integration style test for concurrent schema
changer behaviors where we run multiple DDLs for an extended period of
time on a few descriptors and assert that they all eventually finish
and the descriptors end up in the expected state.

Release note: None
@Xiang-Gu Xiang-Gu force-pushed the concurrent-schema-changer-tests-refactoring branch from b9dd59f to c02d1f3 Compare August 10, 2023 14:21
Collaborator

@fqazi fqazi left a comment

@Xiang-Gu This is excellent stuff! A couple of suggestions for improvement; the transaction one can be ignored.

Reviewed 2 of 2 files at r1, 1 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Xiang-Gu)


pkg/sql/schemachanger/schemachanger_test.go line 635 at r2 (raw file):

			scName = newSCName
			t.Logf("RENAME SCHEMA TO %v", newSCName)
		} else if isPQErrWithCode(err, pgcode.UndefinedDatabase) {

Would it make sense to use catalog queries and transactions here instead? That way, we would get retry errors instead. But I think we can't because of the declarative schema changer, based on our Slack conversation.


pkg/sql/schemachanger/schemachanger_test.go line 643 at r2 (raw file):

	// A goroutine that renames table `testdb.testsc.t` randomly.
	go repeatWorkWithInterval("rename-tbl-worker", renameTblInterval, func() error {

ctxgroup is a cleaner pattern here and would simplify the code

Contributor Author

@Xiang-Gu Xiang-Gu left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @fqazi)


pkg/sql/schemachanger/schemachanger_test.go line 643 at r2 (raw file):

Previously, fqazi (Faizan Qazi) wrote…

ctxgroup is a cleaner pattern here and would simplify the code

I explored ctxgroup, but its functionality is not suitable for our case here.
Namely, ctxgroup allows us to spawn workers and wait for them to complete. Here we want the main routine to let the workers run indefinitely and signal them to finish once a timer fires.

@Xiang-Gu
Contributor Author

TFTR!
bors r+

@craig craig bot merged commit f194f92 into cockroachdb:master Aug 10, 2023
@craig
Contributor

craig bot commented Aug 10, 2023

Build succeeded:

@Xiang-Gu Xiang-Gu deleted the concurrent-schema-changer-tests-refactoring branch August 10, 2023 19:25