roachtest: add invalid descriptors post test validation #96949

Merged

Conversation

herkolategan
Collaborator

Add a validation check for the invalid descriptors virtual table against the cockroach cluster at the end of each roachtest. Assert that the table `crdb_internal.invalid_descriptors` is empty.

Resolves: #85330

Release note: None

@herkolategan herkolategan requested a review from a team as a code owner February 10, 2023 14:55
@herkolategan herkolategan requested review from smg260 and renatolabs and removed request for a team February 10, 2023 14:55
@cockroach-teamcity
Member

This change is Reviewable

Contributor

@renatolabs renatolabs left a comment

Minor comments, LGTM!

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan and @smg260)


pkg/cmd/roachtest/cluster.go line 1390 at r1 (raw file):

// crdb_internal.check_consistency(true, '', '') indicates that any ranges'
// replicas are inconsistent with each other. It uses the first node that
// is up to run the query.

This first sentence no longer applies here now that the logic was extracted to its own function.


pkg/cmd/roachtest/test_runner.go line 1078 at r1 (raw file):

		//
		// TODO(testinfra): figure out why this can still get stuck despite the
		// above.

I've often wondered about the comment above: I don't remember ever seeing these calls time out. Perhaps we should remove this comment/TODO. cc @tbg, if you have thoughts (as writer of the comment).

Doesn't have to be on this PR, I'm mostly thinking out loud here.


pkg/cmd/roachtest/test_runner.go line 1081 at r1 (raw file):

		db, node := c.ConnectToLiveNode(ctx, t)
		if db != nil {
			defer func() { _ = db.Close() }()

Nit: this reads slightly better as defer db.Close()
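
As a side note on this nit, both spellings close the handle and drop the returned error. The wrapped form only makes the discard explicit, which linters such as errcheck tend to ask for; whether roachtest's lint setup requires it is not stated in this thread. A minimal sketch, where `fakeDB` is a hypothetical stand-in for the real `db` value:

```go
package main

import (
	"errors"
	"fmt"
)

// fakeDB is a hypothetical stand-in for a database handle whose Close
// returns an error.
type fakeDB struct{ closed bool }

func (d *fakeDB) Close() error {
	d.closed = true
	return errors.New("close failed")
}

// closeDirect uses the shorter form suggested in the review. The error
// returned by Close is silently dropped.
func closeDirect(db *fakeDB) {
	defer db.Close()
}

// closeExplicitDiscard uses the form from the diff. The error is also
// dropped, but the blank assignment makes that visible to readers and linters.
func closeExplicitDiscard(db *fakeDB) {
	defer func() { _ = db.Close() }()
}

func main() {
	a, b := &fakeDB{}, &fakeDB{}
	closeDirect(a)
	closeExplicitDiscard(b)
	fmt.Println(a.closed, b.closed)
}
```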


pkg/cmd/roachtest/test_runner.go line 1082 at r1 (raw file):

		if db != nil {
			defer func() { _ = db.Close() }()
			t.L().Printf("running (fast) validation checks on node %d", node)

I know this comment was just moved, but "fast" is relative. Suggestion: running validation checks on node %d (<10m)

Contributor

@renatolabs renatolabs left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan and @smg260)


pkg/cmd/roachtest/cluster.go line 1390 at r1 (raw file):

Previously, renatolabs (Renato Costa) wrote…

This first sentence no longer applies here now that the logic was extracted to its own function.

Oops, I meant last sentence.

Collaborator Author

@herkolategan herkolategan left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @renatolabs, @smg260, and @tbg)


pkg/cmd/roachtest/cluster.go line 1390 at r1 (raw file):

Previously, renatolabs (Renato Costa) wrote…

Oops, I meant last sentence.

Done.


pkg/cmd/roachtest/test_runner.go line 1078 at r1 (raw file):

Previously, renatolabs (Renato Costa) wrote…

I've often wondered about the comment above: I don't remember ever seeing these calls time out. Perhaps we should remove this comment/TODO. cc @tbg, if you have thoughts (as writer of the comment).

Doesn't have to be on this PR, I'm mostly thinking out loud here.

Done.


pkg/cmd/roachtest/test_runner.go line 1082 at r1 (raw file):

Previously, renatolabs (Renato Costa) wrote…

I know this comment was just moved, but "fast" is relative. Suggestion: running validation checks on node %d (<10m)

Done.

@herkolategan herkolategan force-pushed the hbl/roachtest-invalid-descriptors-check branch 2 times, most recently from 526e2c9 to eed2df1 Compare February 17, 2023 13:56
@herkolategan herkolategan force-pushed the hbl/roachtest-invalid-descriptors-check branch from eed2df1 to f1ebb9e Compare February 17, 2023 13:57
@herkolategan
Collaborator Author

bors r=renatolabs

@craig
Contributor

craig bot commented Feb 17, 2023

Build failed:

@herkolategan
Collaborator Author

bors retry

@craig
Contributor

craig bot commented Feb 20, 2023

Build succeeded:

@craig craig bot merged commit a66ad8b into cockroachdb:master Feb 20, 2023
Member

@tbg tbg left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/test_runner.go line 1078 at r1 (raw file):

Previously, herkolategan (Herko Lategan) wrote…

Done.

I'm fine removing this. I haven't seen it in a while either (and also in the meantime rewrote the server-side impl of the consistency checks, so they are more likely to be cancelable now), though I also haven't been paying attention.

Successfully merging this pull request may close these issues.

test: assert that crdb_internal.invalid_descriptors is empty at the end of each roachtest
4 participants