storage: fatal on corruption encountered in background #102252

Merged
merged 1 commit into from
Apr 25, 2023

@jbowens (Collaborator) commented Apr 25, 2023

Previously, on-disk corruption would only fatal the node if an iterator observed it. Corruption encountered by a background job, such as a compaction, would not fatal the node. The node could then churn through compactions that repeatedly failed, hurting cluster stability and user query latencies.

Now, on-disk corruption encountered by a background job also causes the node to exit immediately.

Epic: none
Fixes: #101101
Release note (ops change): When local corruption of data is encountered by a background job, a node will now exit immediately.
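
For readers unfamiliar with the pattern, here is a minimal sketch of the behavior described above. It is not the code in this PR. The names `errCorruption`, `errTransient`, and `newBackgroundErrorHandler` are hypothetical, and the sketch assumes the storage engine reports background-job failures through a callback and marks corruption with a sentinel error that survives wrapping.

```go
package main

import (
	"errors"
	"fmt"
)

// errCorruption is a hypothetical sentinel standing in for the storage
// engine's corruption error. It is not the engine's real identifier.
var errCorruption = errors.New("on-disk corruption")

// errTransient is a hypothetical non-corruption failure, such as a
// retryable I/O error.
var errTransient = errors.New("transient I/O error")

// newBackgroundErrorHandler returns a callback for errors raised by
// background jobs such as compactions. The fatal function is injected so
// that the exit path can be observed in tests; a real node would log the
// message and terminate the process there.
func newBackgroundErrorHandler(fatal func(msg string)) func(error) {
	return func(err error) {
		// errors.Is walks the wrap chain, so corruption is detected even
		// when the job added its own context to the error.
		if errors.Is(err, errCorruption) {
			fatal(fmt.Sprintf("local corruption detected: %v", err))
		}
		// Any other error is left to the job's normal retry path.
	}
}

func main() {
	// For demonstration the fatal function only prints.
	handler := newBackgroundErrorHandler(func(msg string) {
		fmt.Println("would exit:", msg)
	})

	// A non-corruption failure does not trigger the fatal path.
	handler(errTransient)

	// A compaction failing on corrupt data does, even when wrapped.
	handler(fmt.Errorf("compaction failed: %w", errCorruption))
}
```

Injecting `fatal` keeps the decision (is this corruption?) separate from the action (exit the node), which makes the decision testable without killing the test process.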

@jbowens added the backport-22.2.x and backport-23.1.x labels Apr 25, 2023
@jbowens requested a review from a team as a code owner April 25, 2023 15:56
@jbowens requested a review from itsbilal April 25, 2023 15:56
blathers-crl bot commented Apr 25, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity (Member)

This change is Reviewable

@RaduBerinde (Member) left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal)

@jbowens (Collaborator, Author) commented Apr 25, 2023

TFTR!

bors r=RaduBerinde

craig bot (Contributor) commented Apr 25, 2023

Build succeeded:

Development

Successfully merging this pull request may close these issues.

storage: background corruption should fatal the node