storage/pebble: panic pebble: flush-to key #54284
@itsbilal Please take a look. The SHA includes your
Yep, already investigated. Details are on Slack, but the core issue is a bad interaction between
For posterity, it is useful to copy those details to this issue.
Sure. Here's the relevant compaction input:

L4 input:

L3 input:

Compaction output into L4:

And then the compaction should finish, as the range del fragmenter is empty and no more keys remain. So we "hold off" the split suggestion, but it turns out that was the last key in the compaction, so we never get around to calling

Possible solutions:
Currently, it's possible for the `nonZeroSeqNumSplitter` to "withhold" a compaction split suggestion in such a way that the `limit` variable in the compaction loop is exceeded, only to advise a compaction split at a later point. This is usually not a concern, as we just reset `limit` when that compaction split is actually advised; but if the compaction runs out of keys in this narrow window, we leave the limit at a non-nil, already-passed key, which violates an invariant in the rangedel fragmenter: it can't truncate range tombstones to a passed key. Similar logic existed in an older iteration of this code, which is in use on the crl-release-20.2 branch; a refactor here introduced this bug. This change allows for a 3-way return value from `shouldSplitBefore`; in the case where the limit has been exceeded, we resort to resetting the limit as before. Will address cockroachdb/cockroach#54284 when this lands in cockroach master.
54273: changefeedccl/schemafeed: only sort the unsorted tail of descriptors r=ajwerner a=ajwerner

This led to a race detector warning firing (theorized). I'd love to validate that this is the bug, but I feel pretty good about it. Fixes #48459.

Release note: None

54358: sql/catalog/descs: don't hydrate dropped tables r=ajwerner a=ajwerner

The invariant is that types referenced by tables only exist for non-dropped tables. We were not checking the state of the table when choosing to hydrate. This led to pretty catastrophic failures when the invariant was violated. Fixes #54343.

Release note (bug fix): Fixed a bug from earlier alphas where dropping a database which contained tables using user-defined types could result in panics.

54417: kvserver: improve a comment around node liveness r=irfansharif a=irfansharif

Release note: None

54422: vendor: Bump pebble to 08b545a1f5403e31a76b48f46a780c8d59432f57 r=petermattis a=itsbilal

Changes pulled in:
```
08b545a1f5403e31a76b48f46a780c8d59432f57 compaction: Invalidate limit when a splitter defers a split suggestion
6e5e695d8b1c33c0c4687bd7e804e9aaac66d9dd db: remove unused compaction.maxExpandedBytes
```
Fixes #54284.

Release note: None.

Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Describe the problem
I was running a cluster trying to debug an issue in the allocator when I saw that a node had died with the following panic:
Environment: