Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: savepoints are broken with slow statements on master #94337

Closed
jordanlewis opened this issue Dec 27, 2022 · 6 comments · Fixed by #97471
Closed

sql: savepoints are broken with slow statements on master #94337

jordanlewis opened this issue Dec 27, 2022 · 6 comments · Fixed by #97471
Assignees
Labels
branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-kv KV Team

Comments

@jordanlewis
Copy link
Member

jordanlewis commented Dec 27, 2022

While investigating #93379, @rafiss and I have discovered some very broken behavior with savepoints on master.

The following logic test fails, but it should succeed (seeing nothing in the final output statement):

statement ok
CREATE TABLE a (id INT);
CREATE TABLE b (id INT8)

statement ok
BEGIN

statement ok
SELECT pg_sleep(1.5)

statement ok
SAVEPOINT s

statement ok
SELECT pg_sleep(1.5)

statement ok
INSERT INTO a(id) VALUES (0);
INSERT INTO b(id) VALUES (0)

statement ok
ROLLBACK TO SAVEPOINT s

query I
SELECT * FROM a
----

The pg_sleep statements are important to make this fail, suggesting timing issues.

Jira issue: CRDB-22839

@jordanlewis jordanlewis added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Dec 27, 2022
@rafiss
Copy link
Collaborator

rafiss commented Dec 27, 2022

I ran a git bisect, and found that 7ff05f1 is the first commit that causes Jordan's test to fail:

7ff05f15725617368a175cce187b15c5aba81698 is the first bad commit
commit 7ff05f15725617368a175cce187b15c5aba81698
Date:   Thu Aug 4 18:08:50 2022 -0400

    kvserver: allow certain read-only requests to drop latches before evaluation

@blathers-crl blathers-crl bot added the T-kv KV Team label Dec 27, 2022
@jordanlewis
Copy link
Member Author

1.5 seconds also seems close to the minimum - 1.25 doesn't work on my machine.

@arulajmani arulajmani self-assigned this Dec 27, 2022
@nvanbenschoten nvanbenschoten added the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Jan 4, 2023
@blathers-crl
Copy link

blathers-crl bot commented Jan 4, 2023

Hi @nvanbenschoten, please add branch-* labels to identify which branch(es) this release-blocker affects.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@nvanbenschoten nvanbenschoten added the branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 label Jan 4, 2023
@arulajmani
Copy link
Collaborator

The issue here seems to be that we're incorrectly determining that the scan request for SELECT * FROM a does not need access to the intent history. The determination happens here:

if seq <= meta.Txn.Sequence || ts.LessEq(meta.Timestamp.ToTimestamp()) {
needIntentHistory = true
}

However, this doesn't take into account ignoredSequenceNumbers for the transaction -- in this case, because the intent we scanned should be ignored because of the savepoint rollback, we do need access to the intent history. We should pass this information to ScanConflictingIntentsForDroppingLatchesEarly and resort to using the intent interleaving iterator if any of the intents were written at a sequence number that should be ignored.


The question still remains -- what's the deal with pg_sleep here?

In addition to the sequence numbers, we also look at the timestamp of the read request and compare it with the timestamp at which the intent is written at when deciding whether we need the intent history or not. We only determine that intent history isn't required if the read timestamp is greater than the timestamp at which the intent was written, for which the transaction needs to get pushed. Adding the pg_sleep causes this, by virtue of the closed timestamp interval.

arulajmani added a commit to arulajmani/cockroach that referenced this issue Jan 5, 2023
… required

Read-only requests that drop their latches early, before evaluation,
scan the lock table to resolve conflicts in
`ScanConflictingIntentsForDroppingLatchesEarly`. A request needs to
determine if it needs access to the intent history during the scan
phase. Typically, this happens if the lock table scan uncovers an
intent from the same transaction but at a higher sequence number.
However, we may also find intents at lower sequence numbers but
at higher timestamps (e.g, the intent was pushed by a conflicting
transaction) -- these intents need to be read, and to correctly do
that during the scan phase, we need access to the intent history.

Note that this only applies to intents at strictly higher timestamps.
Previously, we would determine we needed access to the intent history
even if the intent history was at the same timestamp as the read
request. In practice, this is quite common, which ends up meaning
that we default back to the intent interleaving iterator more often
than we need to. This patch fixes this behaviour.

References cockroachdb#94337 -- see explanation for "what's the deal with
`pg_sleep`.

Epic: none

Release note: None
@rafiss
Copy link
Collaborator

rafiss commented Feb 10, 2023

@arulajmani do you have a timeline for when this issue can be worked on? just wondering since it's still affecting nightly tests here: #95569

@arulajmani
Copy link
Collaborator

Thanks for the ping! It's on my plate for this milestone, but realistically, I was hoping to get to it early next week. Sorry for the trouble, I didn't realize this was actively affecting nightly tests.

arulajmani added a commit to arulajmani/cockroach that referenced this issue Feb 22, 2023
Read-only requests that qualify for dropping latches early scan the
lock table under latches to perform conflict resolution. If they don't
conflict with any concurrent transactions, they drop their latches and
scan the MVCC keyspace. However, if a request is trying to read a key
that was written to in its transaction, we may need access to the key's
intent history when performing the scan.

If scanning the lock table uncovers an intent by our own transaction, we
need access to the intent history if:

1. The request is reading at a lower sequence number than the found
intent's sequence number.
2. OR the request is reading at a strictly lower timestamp than the
intent's timestamp.
3. OR the found intent should be ignored because it was written as part
of a savepoing which was subsequently rolled back.
4. OR the found intent and read request belong to different txn epochs.

Prior to this patch, we weren't accounting for 3 and 4. Worse yet, our
determination for 2 was a less than equal to -- which meant we would
end up consulting the intent history in the common case. This meant
that existing tests which would have exercised 3 and 4 weren't failing
on us.

Instead of correcting the determination, this patch makes it such that
all read-your-own-write transactions consult intent history. This
avoids the subtle complexity of getting all these cases right and
keeping them in sync with the MVCC code that reads keys. It also seems
like scanning the lock table twice (once during conflict resolution
and once during the MVCC read) for read-your-own-write transactions
in the common case did not cause noticable regressions. We can always
optimize this later, if we find there's a need to.

Epic: none
Fixes: cockroachdb#94337

Release note: None
@rafiss rafiss changed the title sql: savepoints are sometimes broken with slow statements on master sql: savepoints are broken with slow statements on master Feb 24, 2023
arulajmani added a commit to arulajmani/cockroach that referenced this issue Mar 8, 2023
Read-only requests that qualify for dropping latches early scan the
lock table under latches to perform conflict resolution. If they don't
conflict with any concurrent transactions, they drop their latches and
scan the MVCC keyspace. However, if a request is trying to read a key
that was written to in its transaction, we may need access to the key's
intent history when performing the scan.

If scanning the lock table uncovers an intent by our own transaction, we
need access to the intent history if:

1. The request is reading at a lower sequence number than the found
intent's sequence number.
2. OR the request is reading at a strictly lower timestamp than the
intent's timestamp.
3. OR the found intent should be ignored because it was written as part
of a savepoing which was subsequently rolled back.
4. OR the found intent and read request belong to different txn epochs.

Prior to this patch, we weren't accounting for 3 and 4. Worse yet, our
determination for 2 was a less than equal to -- which meant we would
end up consulting the intent history in the common case. This meant
that existing tests which would have exercised 3 and 4 weren't failing
on us.

Instead of correcting the determination, this patch makes it such that
all read-your-own-write transactions consult intent history. This
avoids the subtle complexity of getting all these cases right and
keeping them in sync with the MVCC code that reads keys. It also seems
like scanning the lock table twice (once during conflict resolution
and once during the MVCC read) for read-your-own-write transactions
in the common case did not cause noticable regressions. We can always
optimize this later, if we find there's a need to.

Epic: none
Fixes: cockroachdb#94337

Release note: None
@craig craig bot closed this as completed in 0fda198 Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-kv KV Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants