-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvnemesis: skip validation for rolled-back locking reads in weak-isolation transactions #126417
Conversation
I successfully stressed it with savepoint ops frequency at 5 and disabled serializable ops. With these settings and without the patch in this PR, the failure in #125545 reproduces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing Mira! Should we also backport this to all branches that have the savepoint validation code?
Separately, would you mind taking a pass over all the current KVNemesis failures we have open which would be fixed by this? If the majority failures on the issue, IMO we should close them out, so that we get eyes on any failures that aren't explained by this.
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @miraradeva)
pkg/kv/kvnemesis/validator.go
line 1120 at r1 (raw file):
case *observedRead: // If v.observationFilter == observeLocking, this must be a locking read // from a weak-isolation transaction. If this locking read is rolled back
nit: wrap to 80 chars.
I'll backport to 24.1 and 23.2. This is the PR that introduced the weak-isolation-level validation: #111992. The earliest it exists is 23.2.
Yes, will do.
It is actually wrapped to 80 chars. |
TFTR! bors=arulajmani |
…ation transactions Currently, for weak-isolation-level transactions, kvnemesis does validation only for locking reads; such transactions ensure the locks are held until the transactions commit. However, this is not the case if a locking read is rolled back by a savepoint. In this case, the lock is not released eagerly, but it's also not guaranteed to be held; the lock may be released if the transaction is pushed. Once the read is no longer considered locking, weaker isolations levels do not require that its valid times intersect with the rest of the transaction's operations. In this patch, we adjust the kvnemesis validation to skip such rolled-back reads. Fixes: cockroachdb#125545 Release note: None
8e7f521
to
cc85997
Compare
bors=arulajmani |
bors r=arulajmani |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from cc85997 to blathers/backport-release-23.2-126417: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Actually, the savepoint validation logic is not there on 23.2: #110494. |
…weak-isolation transactions Previously, cockroachdb#126417 attempted to skip validation for rolled-back locking reads that tolerate write skew. However, it depended on the validator's `observeLocking` filter in `checkAtomicCommitted` to detect if a read tolerates write skew. However, that filter is set correctly only while processing ops. This patch adds a field to `observedRead` to mark it as tolerating write skew. The flag is set in `processOp` and used later in `checkAtomicCommitted`. Fixes: cockroachdb#128373 Release note: None
…weak-isolation transactions Previously, cockroachdb#126417 attempted to skip validation for rolled-back locking reads that tolerate write skew. However, it depended on the validator's `observeLocking` filter in `checkAtomicCommitted` to detect if a read tolerates write skew. However, that filter is set correctly only while processing ops. This patch adds a field to `observedRead` to indicate that the read should be skipped from validation if rolled back by a savepoint. The flag is set in `processOp` and used later in `checkAtomicCommitted`. Fixes: cockroachdb#128373 Release note: None
…weak-isolation transactions Previously, cockroachdb#126417 attempted to skip validation for rolled-back locking reads that tolerate write skew. However, it depended on the validator's `observeLocking` filter in `checkAtomicCommitted` to detect if a read tolerates write skew. However, that filter is set correctly only while processing ops. This patch adds a field to `observedRead` to indicate that the read should be skipped from validation if rolled back by a savepoint. The flag is set in `processOp` and used later in `checkAtomicCommitted`. Fixes: cockroachdb#128373 Release note: None
128813: changefeedccl: disambiguate kv buffer metrics & logs r=rharding6373,andyyang890 a=asg0451 Disambiguates metrics and logs for the 2 buffers used by the kv feed, so we can better identify bottlenecks. The old versions are kept around for backwards compatibility, though their use is not recommended. Fixes: #127647 Release note (enterprise change): Disambiguate metrics and logs for the 2 buffers used by the kv feed. Affected metrics now have a suffix indicating which buffer they correspond to. Affected metrics: `changefeed.buffer_entries.*`, `changefeed.buffer_entries_mem.*`, `changefeed.buffer_pushback_nanos.*`. The old versions are kept around for backwards compatibility, though their use is not recommended. 128847: kvnemesis: properly skip validation for rolled-back locking reads in weak-isolation transactions r=miraradeva a=miraradeva Previously, #126417 attempted to skip validation for rolled-back locking reads that tolerate write skew. However, it depended on the validator's `observeLocking` filter in `checkAtomicCommitted` to detect if a read tolerates write skew. However, that filter is set correctly only while processing ops. This patch adds a field to `observedRead` to mark it as tolerating write skew. The flag is set in `processOp` and used later in `checkAtomicCommitted`. Fixes: #128373 Release note: None 128982: clusterversion: allow skip-version upgrade without env var r=RaduBerinde a=renatolabs Starting from 24.3, we are now able to upgrade directly from `MinSupported` without the need to set any environment variables. Epic: none Release note: None 129020: sqlsmith: limit attempts to generate operators r=mgartner a=rharding6373 Some binary operators do not support simple scalar types, so sqlsmith can loop while trying to generate a binary operation for some types (e.g., pgvector). This PR limits the number of times sqlsmith will attempt to find a valid binary operation when `simpleScalarTypes` is specified. Epic: none Release note: none Co-authored-by: Miles Frankel <[email protected]> Co-authored-by: Mira Radeva <[email protected]> Co-authored-by: Renato Costa <[email protected]> Co-authored-by: rharding6373 <[email protected]>
…weak-isolation transactions Previously, #126417 attempted to skip validation for rolled-back locking reads that tolerate write skew. However, it depended on the validator's `observeLocking` filter in `checkAtomicCommitted` to detect if a read tolerates write skew. However, that filter is set correctly only while processing ops. This patch adds a field to `observedRead` to indicate that the read should be skipped from validation if rolled back by a savepoint. The flag is set in `processOp` and used later in `checkAtomicCommitted`. Fixes: #128373 Release note: None
…weak-isolation transactions Previously, #126417 attempted to skip validation for rolled-back locking reads that tolerate write skew. However, it depended on the validator's `observeLocking` filter in `checkAtomicCommitted` to detect if a read tolerates write skew. However, that filter is set correctly only while processing ops. This patch adds a field to `observedRead` to indicate that the read should be skipped from validation if rolled back by a savepoint. The flag is set in `processOp` and used later in `checkAtomicCommitted`. Fixes: #128373 Release note: None
Currently, for weak-isolation-level transactions, kvnemesis does validation only for locking reads; such transactions ensure the locks are held until the transactions commit. However, this is not the case if a locking read is rolled back by a savepoint. In this case, the lock is not released eagerly, but it's also not guaranteed to be held; the lock may be released if the transaction is pushed. Once the read is no longer considered locking, weaker isolations levels do not require that its valid times intersect with the rest of the transaction's operations.
In this patch, we adjust the kvnemesis validation to skip such rolled-back reads.
Fixes: #125545
Release note: None