-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: add setting to seed up consistency checks in tests #116892
Conversation
53176d1
to
bfb9b74
Compare
Does this imply that running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aadityasondhi, @DarrylWong, @itsbilal, and @srosenberg)
pkg/storage/engine.go
line 638 at r1 (raw file):
// argument specifies how long to wait for before attempting a flush to // force a transition to a file-only snapshot. WaitForFileOnly(context.Context, time.Duration) error
[nit] I'd name give the parameter a useful name, maybe gracePeriodBeforeFlush
. Ideally we'd plumb that through the implementation - dur
is not useful, and in fact is misleading: a generic duration argument to a method like Wait
usually means that we'll give up waiting after that duration.
bfb9b74
to
ce26047
Compare
Previously, we'd fall back to the 3 second consistency checker EventuallyFileOnlySnapshot (EFOS) wait in roachtests, which was slowing all of them down when we ran every replica through the consistency checker in post-test assertions. This change speeds up that consistency check in roachtest post-test assertions by flipping a new cluster setting to speed up EFOS waits for consistency checks after a roachtest finishes. Epic: none Unblocks cockroachdb#116330. Release note: None
ce26047
to
f2ae78b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
@erikgrinaker that's correct but only if we're running that command with little/no background writes happening. Flushes will result in forward progress on WaitForFileOnly
without the 3 second wait.
In general when we were making this change, we were under the impression that consistency checks are usually a background operation. If this is an undesirable knock-on effect of this change we can look into ways to speed up consistency checks too.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aadityasondhi, @DarrylWong, and @srosenberg)
pkg/storage/engine.go
line 638 at r1 (raw file):
Previously, RaduBerinde wrote…
[nit] I'd name give the parameter a useful name, maybe
gracePeriodBeforeFlush
. Ideally we'd plumb that through the implementation -dur
is not useful, and in fact is misleading: a generic duration argument to a method likeWait
usually means that we'll give up waiting after that duration.
Done.
bors r=RaduBerinde |
Build succeeded: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we were under the impression that consistency checks are usually a background operation. If this is an undesirable knock-on effect of this change we can look into ways to speed up consistency checks too.
That's generally true, except for crdb_internal.check_consistency()
. I doubt it's used much outside of tests. We could consider disabling EFOS for foreground consistency checks.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale)
117236: roachtest: fix consistency check settings error r=erikgrinaker a=erikgrinaker Fixes this error, which effectively disabled post-test consistency checks as we have to ignore errors: ``` SET CLUSTER SETTING cannot be used inside a multi-statement transaction ``` Follows #116892. Epic: none Release note: None Co-authored-by: Erik Grinaker <[email protected]>
We do know at this level whether we're running on behalf of the consistency queue or on behalf of the It would probably make sense to disable EFOS for both. If not, we should at least do it for |
…checks See discussion in cockroachdb#116892. Epic: None Release note: None
I'll make the less contentious change for now. #117297. |
blathers backport 23.2 |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from f2ae78b to blathers/backport-release-23.2-116892: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.2 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Nevermind, I guess we didn't enable this by default for 23.2. |
Previously, we'd fall back to the 3 second consistency checker EventuallyFileOnlySnapshot (EFOS) wait in roachtests, which was slowing all of
them down when we ran every replica through the consistency checker in post-test assertions. This change speeds up that consistency check in roachtest post-test assertions by flipping a new cluster setting to speed up EFOS waits for consistency checks after a roachtest finishes.
Epic: none
Unblocks #116330.
Release note: None