Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvnemesis: skip validation for rolled-back locking reads in weak-isolation transactions #126417

Merged
merged 1 commit into from
Jul 2, 2024

Conversation

miraradeva
Copy link
Contributor

Currently, for weak-isolation-level transactions, kvnemesis does validation only for locking reads; such transactions ensure the locks are held until the transactions commit. However, this is not the case if a locking read is rolled back by a savepoint. In this case, the lock is not released eagerly, but it's also not guaranteed to be held; the lock may be released if the transaction is pushed. Once the read is no longer considered locking, weaker isolations levels do not require that its valid times intersect with the rest of the transaction's operations.

In this patch, we adjust the kvnemesis validation to skip such rolled-back reads.

Fixes: #125545

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@miraradeva
Copy link
Contributor Author

I successfully stressed it with savepoint ops frequency at 5 and disabled serializable ops. With these settings and without the patch in this PR, the failure in #125545 reproduces.

@miraradeva miraradeva marked this pull request as ready for review June 28, 2024 19:24
@miraradeva miraradeva requested a review from a team as a code owner June 28, 2024 19:24
@miraradeva miraradeva requested a review from arulajmani June 28, 2024 19:24
Copy link
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Thanks for fixing Mira! Should we also backport this to all branches that have the savepoint validation code?

Separately, would you mind taking a pass over all the current KVNemesis failures we have open which would be fixed by this? If the majority failures on the issue, IMO we should close them out, so that we get eyes on any failures that aren't explained by this.

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @miraradeva)


pkg/kv/kvnemesis/validator.go line 1120 at r1 (raw file):

		case *observedRead:
			// If v.observationFilter == observeLocking, this must be a locking read
			// from a weak-isolation transaction. If this locking read is rolled back

nit: wrap to 80 chars.

@miraradeva
Copy link
Contributor Author

miraradeva commented Jul 1, 2024

:lgtm:

Thanks for fixing Mira! Should we also backport this to all branches that have the savepoint validation code?

I'll backport to 24.1 and 23.2. This is the PR that introduced the weak-isolation-level validation: #111992. The earliest it exists is 23.2.

Separately, would you mind taking a pass over all the current KVNemesis failures we have open which would be fixed by this? If the majority failures on the issue, IMO we should close them out, so that we get eyes on any failures that aren't explained by this.

Yes, will do.

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @miraradeva)

pkg/kv/kvnemesis/validator.go line 1120 at r1 (raw file):

		case *observedRead:
			// If v.observationFilter == observeLocking, this must be a locking read
			// from a weak-isolation transaction. If this locking read is rolled back

nit: wrap to 80 chars.

It is actually wrapped to 80 chars.

@miraradeva miraradeva added backport-23.2.x Flags PRs that need to be backported to 23.2. backport-24.1.x Flags PRs that need to be backported to 24.1. labels Jul 1, 2024
@miraradeva
Copy link
Contributor Author

TFTR!

bors=arulajmani

…ation transactions

Currently, for weak-isolation-level transactions, kvnemesis does
validation only for locking reads; such transactions ensure the locks
are held until the transactions commit. However, this is not the case
if a locking read is rolled back by a savepoint. In this case, the lock
is not released eagerly, but it's also not guaranteed to be held; the
lock may be released if the transaction is pushed. Once the read is no
longer considered locking, weaker isolations levels do not require that
its valid times intersect with the rest of the transaction's
operations.

In this patch, we adjust the kvnemesis validation to skip such
rolled-back reads.

Fixes: cockroachdb#125545

Release note: None
@miraradeva miraradeva force-pushed the mira-125545-savepoint branch from 8e7f521 to cc85997 Compare July 1, 2024 19:35
@miraradeva
Copy link
Contributor Author

bors=arulajmani

@miraradeva
Copy link
Contributor Author

bors r=arulajmani

@craig craig bot merged commit c6dc40a into cockroachdb:master Jul 2, 2024
22 checks passed
Copy link

blathers-crl bot commented Jul 2, 2024

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from cc85997 to blathers/backport-release-23.2-126417: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@miraradeva
Copy link
Contributor Author

Actually, the savepoint validation logic is not there on 23.2: #110494.

@miraradeva miraradeva removed the backport-23.2.x Flags PRs that need to be backported to 23.2. label Jul 2, 2024
miraradeva added a commit to miraradeva/cockroach that referenced this pull request Aug 12, 2024
…weak-isolation transactions

Previously, cockroachdb#126417 attempted to skip validation for rolled-back locking
reads that tolerate write skew. However, it depended on the validator's
`observeLocking` filter in `checkAtomicCommitted` to detect if a read
tolerates write skew. However, that filter is set correctly only while
processing ops.

This patch adds a field to `observedRead` to mark it as tolerating write
skew. The flag is set in `processOp` and used later in
`checkAtomicCommitted`.

Fixes: cockroachdb#128373

Release note: None
miraradeva added a commit to miraradeva/cockroach that referenced this pull request Aug 13, 2024
…weak-isolation transactions

Previously, cockroachdb#126417 attempted to skip validation for rolled-back locking
reads that tolerate write skew. However, it depended on the validator's
`observeLocking` filter in `checkAtomicCommitted` to detect if a read
tolerates write skew. However, that filter is set correctly only while
processing ops.

This patch adds a field to `observedRead` to indicate that the read
should be skipped from validation if rolled back by a savepoint. The
flag is set in `processOp` and used later in `checkAtomicCommitted`.

Fixes: cockroachdb#128373

Release note: None
miraradeva added a commit to miraradeva/cockroach that referenced this pull request Aug 15, 2024
…weak-isolation transactions

Previously, cockroachdb#126417 attempted to skip validation for rolled-back locking
reads that tolerate write skew. However, it depended on the validator's
`observeLocking` filter in `checkAtomicCommitted` to detect if a read
tolerates write skew. However, that filter is set correctly only while
processing ops.

This patch adds a field to `observedRead` to indicate that the read
should be skipped from validation if rolled back by a savepoint. The
flag is set in `processOp` and used later in `checkAtomicCommitted`.

Fixes: cockroachdb#128373

Release note: None
craig bot pushed a commit that referenced this pull request Aug 15, 2024
128813: changefeedccl: disambiguate kv buffer metrics & logs r=rharding6373,andyyang890 a=asg0451

Disambiguates metrics and logs for the 2 buffers
used by the kv feed, so we can better identify
bottlenecks. The old versions are kept around for
backwards compatibility, though their use is not
recommended.

Fixes: #127647

Release note (enterprise change): Disambiguate
metrics and logs for the 2 buffers used by the kv
feed. Affected metrics now have a suffix
indicating which buffer they correspond to.
Affected metrics: `changefeed.buffer_entries.*`,
`changefeed.buffer_entries_mem.*`,
`changefeed.buffer_pushback_nanos.*`. The old
versions are kept around for backwards
compatibility, though their use is not
recommended.


128847: kvnemesis: properly skip validation for rolled-back locking reads in weak-isolation transactions r=miraradeva a=miraradeva

Previously, #126417 attempted to skip validation for rolled-back locking reads that tolerate write skew. However, it depended on the validator's `observeLocking` filter in `checkAtomicCommitted` to detect if a read tolerates write skew. However, that filter is set correctly only while processing ops.

This patch adds a field to `observedRead` to mark it as tolerating write skew. The flag is set in `processOp` and used later in `checkAtomicCommitted`.

Fixes: #128373

Release note: None

128982: clusterversion: allow skip-version upgrade without env var r=RaduBerinde a=renatolabs

Starting from 24.3, we are now able to upgrade directly from
`MinSupported` without the need to set any environment variables.

Epic: none

Release note: None

129020: sqlsmith: limit attempts to generate operators r=mgartner a=rharding6373

Some binary operators do not support simple scalar types, so sqlsmith can loop while trying to generate a binary operation for some types (e.g., pgvector). This PR limits the number of times sqlsmith will attempt to find a valid binary operation when `simpleScalarTypes` is specified.

Epic: none

Release note: none

Co-authored-by: Miles Frankel <[email protected]>
Co-authored-by: Mira Radeva <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: rharding6373 <[email protected]>
blathers-crl bot pushed a commit that referenced this pull request Aug 15, 2024
…weak-isolation transactions

Previously, #126417 attempted to skip validation for rolled-back locking
reads that tolerate write skew. However, it depended on the validator's
`observeLocking` filter in `checkAtomicCommitted` to detect if a read
tolerates write skew. However, that filter is set correctly only while
processing ops.

This patch adds a field to `observedRead` to indicate that the read
should be skipped from validation if rolled back by a savepoint. The
flag is set in `processOp` and used later in `checkAtomicCommitted`.

Fixes: #128373

Release note: None
blathers-crl bot pushed a commit that referenced this pull request Aug 15, 2024
…weak-isolation transactions

Previously, #126417 attempted to skip validation for rolled-back locking
reads that tolerate write skew. However, it depended on the validator's
`observeLocking` filter in `checkAtomicCommitted` to detect if a read
tolerates write skew. However, that filter is set correctly only while
processing ops.

This patch adds a field to `observedRead` to indicate that the read
should be skipped from validation if rolled back by a savepoint. The
flag is set in `processOp` and used later in `checkAtomicCommitted`.

Fixes: #128373

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-24.1.x Flags PRs that need to be backported to 24.1.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kv/kvnemesis: TestKVNemesisSingleNode failed
3 participants