sql,kv: add preliminary SQL savepoints support #45566
Conversation
Everything except r5, r6, r7 is elsewhere.
This is based on #43051, but I've changed a lot of stuff, so I think it's better to start with a new PR. Compared to that, this version has support for rolling back after an error, different behavior on rolling back after retriable errors, and removes the special treatment of the savepoint cockroach_restart when it comes to rollbacks.
I still need to see what more tests need to be written (but you added a bunch of great ones already, Rafa) and go over everything and spruce it up, but I think it's ready for a first pass. Not ready for nits yet, though.
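To make the headline feature concrete, here is a minimal client-side sketch of the error recovery this enables (illustrative only: the connection string, table name, and driver choice are assumptions, not part of this PR):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // CockroachDB speaks the Postgres wire protocol
)

func main() {
	db, err := sql.Open("postgres",
		"postgresql://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	tx, err := db.Begin()
	if err != nil {
		log.Fatal(err)
	}
	if _, err := tx.Exec("SAVEPOINT foo"); err != nil {
		log.Fatal(err)
	}
	// With this PR, a failed statement no longer dooms the transaction:
	// rolling back to the savepoint recovers from the error.
	if _, err := tx.Exec("INSERT INTO t VALUES (1)"); err != nil {
		if _, err := tx.Exec("ROLLBACK TO SAVEPOINT foo"); err != nil {
			log.Fatal(err)
		}
	}
	if err := tx.Commit(); err != nil {
		log.Fatal(err)
	}
}
```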
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @nvanbenschoten)
Force-pushed from 56e7ce6 to 7b4a5e3.
I've been adding various tests. Ready for review.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @nvanbenschoten)
I gave this a first pass. It's looking good and I'm very excited for this to be merged, although the PR still needs some polish. I also think it deserves a review from @knz, given that he was the other person most connected to this change.
s/descovering/discovering/ in the commit message.
Reviewed 4 of 4 files at r1, 1 of 1 files at r2, 3 of 3 files at r3, 37 of 37 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @knz)
pkg/kv/txn_coord_sender.go, line 179 at r4 (raw file):
epochBumpedLocked() // createSavepoint is used to populate a savepoint with all the state the
s/the needs/the transaction needs/
pkg/kv/txn_coord_sender.go, line 181 at r4 (raw file):
// createSavepoint is used to populate a savepoint with all the state the // needs to be restored on a rollback. createSavepoint(context.Context, *savepoint)
Do these need Locked suffixes?
pkg/kv/txn_coord_sender.go, line 184 at r4 (raw file):
// rollbackToSavepoint is used to restore the state previously saved by createSavepoint(). // implementations are allowed to modify the savepoint if they want to, optimizing it for future
s/implementations/Implementations/
pkg/kv/txn_coord_sender_savepoints.go, line 33 at r4 (raw file):
seqNum enginepb.TxnSeq // txnSpanRefresher fields
nit, period after both of these comments.
pkg/kv/txn_coord_sender_savepoints.go, line 42 at r4 (raw file):
ifBytes int64 // txnID is used to verify that a rollback is not used to paper
Move these fields up above seqNum to give the ordering here some notion of logical precedence.
pkg/kv/txn_coord_sender_savepoints.go, line 80 at r4 (raw file):
epoch: tc.mu.txn.Epoch, } for _, reqInt := range tc.interceptorStack {
👍 I like this approach.
pkg/kv/txn_coord_sender_savepoints.go, line 172 at r4 (raw file):
} if st.seqNum > 0 && st.txnID != tc.mu.txn.ID {
Why do we need these seqNum checks?
pkg/kv/txn_coord_sender_savepoints.go, line 180 at r4 (raw file):
} if st.seqNum < 0 || st.seqNum > tc.interceptorAlloc.txnSeqNumAllocator.writeSeq {
When is seqNum less than zero?
pkg/kv/txn_interceptor_committer.go, line 465 at r4 (raw file):
func (*txnCommitter) importLeafFinalState(*roachpb.LeafTxnFinalState) {} // epochBumpedLocked implements the txnReqInterceptor interface.
nit: not your change, but if we wanted to rename epochBumpedLocked to bumpEpochLocked to fit the naming scheme here more appropriately, I'd be very supportive. If not, I'll make the change after this PR lands.
pkg/kv/txn_interceptor_pipeliner.go, line 607 at r4 (raw file):
func (tp *txnPipeliner) createSavepoint(ctx context.Context, s *savepoint) { if tp.ifWrites.len() > 0 { s.ifWrites = tp.ifWrites.t.Clone()
Let's add a clone method to *inFlightWriteSet.
pkg/kv/txn_interceptor_pipeliner.go, line 615 at r4 (raw file):
// rollbackToSavepoint is part of the txnReqInterceptor interface. func (tp *txnPipeliner) rollbackToSavepoint(ctx context.Context, s *savepoint) { // Intersect the inflight writes from the savepoint to the ones from the
s/to/with/
pkg/kv/txn_interceptor_pipeliner.go, line 627 at r4 (raw file):
// higher seqnum for the same key. We don't keep track of what sequence // numbers we've verified and which we haven't, so we're going to assume // that the savepoint's write has not been verified.
I'm confused, we do keep track of what sequence numbers we've verified and which we haven't. There's a Sequence field in inFlightWrite.
pkg/kv/txn_interceptor_pipeliner.go, line 635 at r4 (raw file):
return true }) // TODO(andrei): Can I delete directly during the iteration above?
No, you can't. The btree library doesn't like that.
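(For readers outside the review: the usual workaround with github.com/google/btree is to collect matches during the iteration and delete them afterwards. A sketch with a hypothetical shouldDrop predicate, not the PR's actual code:)

```go
// The btree package forbids mutating the tree from inside Ascend, so
// gather the items to remove first and delete them once iteration ends.
func deleteMatching(tree *btree.BTree, shouldDrop func(btree.Item) bool) {
	var toDelete []btree.Item
	tree.Ascend(func(i btree.Item) bool {
		if shouldDrop(i) {
			toDelete = append(toDelete, i)
		}
		return true // keep iterating
	})
	for _, i := range toDelete {
		tree.Delete(i)
	}
}
```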
pkg/kv/txn_interceptor_pipeliner.go, line 654 at r4 (raw file):
// Restore the inflightWrites from the savepoint. if s.ifWrites == nil {
Should you be setting cloned here?
pkg/kv/txn_interceptor_pipeliner.go, line 655 at r4 (raw file):
// Restore the inflightWrites from the savepoint. if s.ifWrites == nil { tp.ifWrites.t = nil
Should we clear(true) here instead?
pkg/kv/txn_interceptor_pipeliner.go, line 825 at r4 (raw file):
s.t.Clear(reuse /* addNodesToFreelist */) s.bytes = 0 s.alloc.clear()
Is this safe when s.cloned == true?
pkg/kv/txn_interceptor_pipeliner_test.go, line 1231 at r4 (raw file):
} func TestTxnPipelinerSavepoints(t *testing.T) {
Give this a comment. We've been good about doing so in these tests so far.
Good test though.
pkg/kv/txn_interceptor_pipeliner_test.go, line 1244 at r4 (raw file):
tp.createSavepoint(ctx, s) // Some more write after the savepoint. One of them is on key "c" that is part
s/write/writes/
pkg/kv/txn_interceptor_pipeliner_test.go, line 1267 at r4 (raw file):
} // Now verify one of the writes. When we'll rollback to the savepoint below,
s/we'll/we/
pkg/kv/txn_interceptor_pipeliner_test.go, line 1292 at r4 (raw file):
require.NotNil(t, br) require.Equal(t, []roachpb.Span{{Key: roachpb.Key("a")}}, tp.footprint.asSlice()) require.Equal(t, 3, tp.ifWrites.len()) // We've verified one out of 4 writes.
Want to check the footprint?
pkg/kv/txn_interceptor_pipeliner_test.go, line 1307 at r4 (raw file):
{ var savepointWrites []inFlightWrite s.ifWrites.Ascend(func(i btree.Item) bool {
This is interesting. It begs the question of why we need to update the savepoint itself. Is it not enough to update the TxnCoordSender?
pkg/kv/txn_interceptor_seq_num_allocator.go, line 180 at r4 (raw file):
// createSavepoint is part of the txnReqInterceptor interface. func (s *txnSeqNumAllocator) createSavepoint(ctx context.Context, sp *savepoint) { sp.seqNum = s.writeSeq
It's worth adding a very small test for this. I know the implementation is trivial, but the role the txnSeqNumAllocator plays in populating savepoints is critical.
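(A test along these lines could be as small as the following sketch, reconstructed from the createSavepoint body quoted above; the test name and exact assertions are assumptions:)

```go
func TestTxnSeqNumAllocatorSavepoint(t *testing.T) {
	ctx := context.Background()
	var s txnSeqNumAllocator
	s.writeSeq = 3 // as if three writes were already sequenced

	sp := &savepoint{}
	s.createSavepoint(ctx, sp)

	// The savepoint must capture the write sequence number so that a
	// rollback can ignore everything written after it.
	require.Equal(t, enginepb.TxnSeq(3), sp.seqNum)
}
```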
pkg/kv/txn_interceptor_span_refresher_test.go, line 559 at r4 (raw file):
} func TestTxnSpanRefresherSavepoint(t *testing.T) {
Same point about adding a comment.
pkg/kv/testdata/savepoints, line 322 at r4 (raw file):
subtest end # !!!
!!!
Can this test be revived?
pkg/sql/conn_executor.go, line 2483 at r3 (raw file):
case *tree.ReleaseSavepoint: // TODO(knz): Sanitize this. if err := ex.validateSavepointName(t.Savepoint); err == nil {
I don't understand this. Why does validating the name indicate whether this is the cockroach_restart savepoint?
pkg/sql/conn_executor_exec.go, line 945 at r4 (raw file):
} // TODO(andrei/cuongdo): Figure out what statements to count here.
TODO(andrei): ...
I don't think you're gonna get any help with that one any time soon.
Or just delete the TODO. Is it useful anymore?
pkg/sql/conn_executor_exec.go, line 1006 at r4 (raw file):
} // execStmtInRollbackWaitState executes a statement in a txn that's in state
Is this used? I thought we just got rid of the restartWait state?
pkg/sql/conn_executor_savepoints.go, line 88 at r1 (raw file):
} // execSavepointInAbortedState runs a SAVEPOINT statement when a txn is aborted.
Why do we support running a SAVEPOINT while in the Aborted state? Postgres doesn't appear to:
nathan=# begin;
BEGIN
nathan=#
nathan=# savepoint a;
SAVEPOINT
nathan=#
nathan=# savepodint a;
ERROR: syntax error at or near "savepodint"
LINE 1: savepodint a;
^
nathan=# savepoint b;
ERROR: current transaction is aborted, commands ignored until end of transaction block
EDIT: this was answered later.
pkg/sql/conn_executor_savepoints.go, line 62 at r4 (raw file):
ev, payload := ex.makeErrEvent(err, s) return ev, payload, nil }
We actually create a kv.savepoint object for cockroach_restart now? I'm surprised. Isn't there a cost to this? Is this imposing strange behavior on the TxnCoordSender since we still want to restart when we roll back to these savepoints?
pkg/sql/conn_executor_savepoints.go, line 200 at r4 (raw file):
curID, curEpoch := ex.state.mu.txn.ID(), ex.state.mu.txn.Epoch() if !curID.Equal(entry.txnID) { return ex.makeErrEvent(roachpb.NewTransactionRetryWithProtoRefreshError(
Should we push these errors into client.RollbackToSavepoint?
pkg/sql/conn_executor_savepoints.go, line 264 at r4 (raw file):
// a savepoint if there was DDL performed "under it". // TODO(knz): support partial DDL cancellation in pending txns. numDDL int
Add more to this comment. This is the number of DDL statements that had been issued at the time that the savepoint was created, right?
pkg/sql/conn_executor_savepoints.go, line 276 at r4 (raw file):
// - if the epoch has changed, then we can only rollback if we performed a // refresh for the reads that are not rolled back. We currently don't do this. txnID uuid.UUID
I'm still confused about why we need this here and in the kv.savepoint object.
pkg/sql/conn_executor_savepoints.go, line 310 at r4 (raw file):
func (stack savepointStack) clone() savepointStack { cpy := make(savepointStack, len(stack))
This copy is unfortunate. Can we do something smarter with aliasing the slices and copying on write? I suspect that even without reference counting to eliminate redundant copies, it would be cheaper to simply copy on every call to push.
Do you mind comparing the relative frequency of these method calls so we can decide on the right solution here? I'm working under the assumption that setTxnRewindPos is going to be called far more often than execSavepointInOpenState.
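(To make the suggestion concrete, a copy-on-write variant might look roughly like the sketch below. It assumes a struct wrapper and a shared flag, neither of which is in the PR, which uses a plain slice type:)

```go
// savepointStack with copy-on-write: clone aliases the backing slice,
// and push copies only when the stack is shared with a clone.
type savepointStack struct {
	entries []savepoint
	shared  bool
}

func (s *savepointStack) clone() savepointStack {
	s.shared = true // both copies now alias entries
	return savepointStack{entries: s.entries, shared: true}
}

func (s *savepointStack) push(sp savepoint) {
	if s.shared {
		cpy := make([]savepoint, len(s.entries), len(s.entries)+1)
		copy(cpy, s.entries)
		s.entries, s.shared = cpy, false
	}
	s.entries = append(s.entries, sp)
}
```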
pkg/sql/conn_executor_savepoints.go, line 337 at r4 (raw file):
} // commitOnReleaseSavepointName is the name of the savepoint with special
Could you explain again what those semantics are?
Also, move this to the top of the file. It's buried down here but pretty important to understand in order to read the rest of the file.
pkg/sql/conn_executor_savepoints_test.go, line 193 at r4 (raw file):
func isOpenTxn(status string) bool { return status == "Open" || status == "NoTxn"
Can we use sql.OpenStateStr and sql.NoTxnStr here?
pkg/sql/conn_fsm.go, line 58 at r4 (raw file):
} // stateAborted is entered on retriable errors. A ROLLBACK TO SAVEPOINT can
And non-retryable errors, right?
pkg/sql/conn_fsm.go, line 92 at r4 (raw file):
func (stateOpen) State() {} func (stateAborted) State() {}
nit: stray line.
pkg/sql/conn_fsm.go, line 115 at r4 (raw file):
// makeEventTxnStartPayload creates an eventTxnStartPayload. // // Pass noRolledBackSavepoint for rolledBackSavepoint when the transaction is
What is this referring to?
pkg/sql/conn_fsm.go, line 144 at r4 (raw file):
// eventSavepointRollback is generated when we want to move from Aborted to Open // through a ROLLBACK TO SAVEPOINT <not cockroach_restart>. Note that it is not // generated when such a savepoint is rolled back to from the Open state. In
"to from"
pkg/sql/conn_fsm.go, line 190 at r4 (raw file):
// eventTxnRestart is generated by a rollback to a savepoint placed at the // beginning of the transaction (commonly SAVEPOINT cockroach_restart). type eventTxnRestart struct{}
What went into the decision to have a separate event instead of a single type eventSavepointRollback struct{Initial fsm.Bool}?
pkg/sql/conn_fsm.go, line 193 at r4 (raw file):
// eventTxnReleased is generated after a successful // RELEASE SAVEPOINT cockroach_restart. It moves the state to CommitWait.
Are these two events limited to cockroach_restart or just commonly linked to it like eventTxnRestart?
pkg/sql/conn_fsm.go, line 204 at r4 (raw file):
} func (eventTxnStart) Event() {}
nit: keep these in the same order as the type defs. Also ideally in the same order as the cases for a given state in the state transition map.
pkg/sql/logictest/testdata/logic_test/manual_retry, line 59 at r4 (raw file):
# BEGIN # # !!! rewrite this test somehow to assert the release behavior, not the initial status
!!!
pkg/sql/logictest/testdata/logic_test/manual_retry, line 141 at r4 (raw file):
BEGIN TRANSACTION; SAVEPOINT foo statement error pq: savepoint bar does not exist
Might as well quote the savepoint name so that we exactly mirror PG:
nathan=# BEGIN;
BEGIN
nathan=# ROLLBACK TO SAVEPOINT bar;
ERROR: savepoint "bar" does not exist
pkg/sql/logictest/testdata/logic_test/manual_retry, line 165 at r4 (raw file):
ROLLBACK TO SAVEPOINT "Foo Bar" query TB
Could you query with the column names here and below?
pkg/sql/opt/exec/execbuilder/builder.go, line 76 at r4 (raw file):
// IsDDL is set to true if the statement contains DDL. IsDDL bool
Someone else should sign off on whether this is a valid use of execbuilder.Builder.
cc. @RaduBerinde
pkg/sql/opt/exec/execbuilder/relational.go, line 144 at r4 (raw file):
var err error isDDL := opt.IsDDLOp(e)
nit: no need for the variable.
pkg/sql/testdata/savepoints, line 1 at r4 (raw file):
# This test exercises the savepoint state in the conn executor.
Nice tests!
pkg/sql/testdata/savepoints, line 303 at r4 (raw file):
subtest rollback_after_error # check that we can rollback after an error
nit: s/check/Check/ here and below.
pkg/sql/testdata/savepoints, line 330 at r4 (raw file):
2: SELECT crdb_internal.force_retry('100ms') -- pq: restart transaction: crdb_internal.force_retry(): TransactionRetryWithProtoRefreshError: forced by crdb_internal.force_retry() -- Open -> Aborted XXXX init(r) 3: ROLLBACK TO SAVEPOINT init -- 0 rows
It's too bad we aren't testing that this causes a txn retry and that the epoch is larger.
pkg/sql/testdata/savepoints, line 335 at r4 (raw file):
-- Open -> NoTxn #... (none) # Check that, after a retriable error, rolling back to anything an initial
"other than"?
pkg/sql/testdata/savepoints, line 526 at r4 (raw file):
# Test that the rewinding we do when performing an automatic retry restores the # savepoint stack properly. subtest rewing_on_automatic_restarts
s/rewing/rewind/
Generally I'll trust you on the implementation, but I'll sit down with this PR tomorrow to add more testing too. I believe the remaining issues should be rooted out via testing, including via the sqlsmith tests that Rohan already prepared for us. Also, Rafi recommends that we trigger the nightly run on the PR manually to pick up all the new feedback from ORM tests.
I have kicked off a full nightly run on the PR ahead of further changes, to pick up any news from ORMs: https://teamcity.cockroachdb.com/viewLog.html?buildId=1782977&
Force-pushed from 0f5c697 to a3ddc11.
s/descovering/discovering/ in the commit message.
done
I haven't finished responding to all the comments, but I've answered some. Take a look if you get a chance; the rest tomorrow.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @knz, and @nvanbenschoten)
pkg/kv/txn_coord_sender.go, line 181 at r4 (raw file):
the needs
done
pkg/kv/txn_coord_sender.go, line 184 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/implementations/Implementations/
done
pkg/kv/txn_coord_sender_savepoints.go, line 33 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit, period after both of these comments.
done
pkg/kv/txn_coord_sender_savepoints.go, line 42 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Move these fields up above seqNum to give the ordering here some notion of logical precedence.
done
pkg/kv/txn_coord_sender_savepoints.go, line 172 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Why do we need these seqNum checks?
because "initial savepoint" allow rollbacks after retries. See now.
pkg/kv/txn_coord_sender_savepoints.go, line 180 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
When is seqNum less than zero?
never, I guess that's the point of the assertion. Rafa added it, but I'd leave it.
pkg/kv/txn_interceptor_committer.go, line 465 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: not your change, but if we wanted to rename epochBumpedLocked to bumpEpochLocked to fit the naming scheme here more appropriately, I'd be very supportive. If not, I'll make the change after this PR lands.
meh
pkg/kv/txn_interceptor_pipeliner.go, line 607 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Let's add a clone method to *inFlightWriteSet.
well but I'm only cloning the tree, not other crap...
pkg/kv/txn_interceptor_pipeliner.go, line 615 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/to/with/
done
pkg/kv/txn_interceptor_pipeliner.go, line 627 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm confused, we do keep track of what sequence numbers we've verified and which we haven't. There's a Sequence field in inFlightWrite.
indeed. See now.
pkg/kv/txn_interceptor_pipeliner.go, line 635 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
No, you can't. The btree library doesn't like that.
ack
pkg/kv/txn_interceptor_pipeliner.go, line 654 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should you be setting cloned here?
It's already set from the moment we've created the savepoint. Or maybe I'm missing your point.
pkg/kv/txn_interceptor_pipeliner.go, line 655 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we clear(true) here instead?
Sometimes. See now.
Perhaps we could always do it if we relied on s.ifWrites == nil (the special nil value) to imply that no earlier savepoint has any writes, but it seems pretty fragile.
This is all pretty nasty; curious for your thoughts.
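(Roughly, the branch under discussion has this shape; the method names on the write set are assumptions pieced together from the snippets quoted in this thread, not the actual diff:)

```go
// Restore the in-flight writes captured by the savepoint. A nil
// s.ifWrites means nothing was in flight when it was created.
if s.ifWrites == nil {
	// Drop everything written since; reuse returns nodes to the freelist.
	tp.ifWrites.clear(true /* reuse */)
} else {
	// Re-clone so a second rollback to the same savepoint still works.
	tp.ifWrites.t = s.ifWrites.Clone()
}
```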
pkg/kv/txn_interceptor_pipeliner.go, line 825 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is this safe when s.cloned == true?
It is not safe when you can still roll back to any savepoints that might have writes in them. But we don't; we only call this on epoch bump (after which you can only roll back to "initial" savepoints), on commit, and, since this review, on a special case of rollbackToSavepoint when we're rolling back to an initial savepoint.
Added a comment to the method about it. And accepting thoughts.
pkg/kv/txn_interceptor_pipeliner_test.go, line 1231 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Give this a comment. We've been good about doing so in these tests so far.
Good test though.
done
pkg/kv/txn_interceptor_pipeliner_test.go, line 1244 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/write/writes/
done
pkg/kv/txn_interceptor_pipeliner_test.go, line 1267 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/we'll/we/
The future tense is meant to signal that the comment does not apply to the lines immediately below, but to something that's coming later.
pkg/kv/txn_interceptor_pipeliner_test.go, line 1292 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Want to check the footprint?
I did one line above, no?
pkg/kv/txn_interceptor_pipeliner_test.go, line 1307 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This is interesting. It begs the question of why we need to update the savepoint itself. Is it not enough to update the TxnCoordSender?
Updating the savepoint too is an optimization for the case when we roll back to it again.
pkg/kv/testdata/savepoints, line 322 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
!!!
Can this test be revived?
done
pkg/sql/conn_executor_exec.go, line 945 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
TODO(andrei): ...
I don't think you're gonna get any help with that one any time soon.
Or just delete the TODO. Is it useful anymore?
:)
gone
pkg/sql/conn_executor_exec.go, line 1006 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is this used? I thought we just got rid of the restartWait state?
not used. gone.
pkg/sql/conn_executor_savepoints.go, line 200 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we push these errors into client.RollbackToSavepoint?
done
pkg/sql/conn_executor_savepoints.go, line 276 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm still confused about why we need this here and in the kv.savepoint object.
they're gone, left just in kv
Force-pushed from 13b6c4c to 3c811c6.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @knz, and @nvanbenschoten)
pkg/kv/txn_coord_sender.go, line 179 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/the needs/the transaction needs/
done
pkg/kv/txn_interceptor_span_refresher_test.go, line 559 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Same point about adding a comment.
done
pkg/sql/conn_executor.go, line 2483 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I don't understand this. Why does validating the name indicate whether this is the cockroach_restart savepoint?
I don't know what was going on at the time of this commit, but in a later commit this is changed to isCommitOnReleaseSavepoint.
pkg/sql/conn_executor_savepoints.go, line 62 at r4 (raw file):
We actually create a kv.savepoint object for cockroach_restart now? I'm surprised. Isn't there a cost to this?
I've optimized the creation of kv.savepoints for initial savepoints - since the empty value suffices for those.
Is this imposing strange behavior on the TxnCoordSender since we still want to restart when we roll back to these savepoints?
We don't need to do anything special when rolling back to these savepoints - which is nice. If we're rolling back to them from the Open state, then they behave just like regular savepoints (we don't need to restart anything; we used to before this patch, but not any more). When rolling back to them from the Aborted state, the transaction will already have been restarted if there was a prior retriable error.
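(For readers following along, this is the client-side retry protocol those release semantics serve. A sketch only: real code should also check for the 40001 retriable-error code rather than retrying on any RELEASE failure:)

```go
func runWithRetry(db *sql.DB, work func(*sql.Tx) error) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op once the commit below succeeds

	if _, err := tx.Exec("SAVEPOINT cockroach_restart"); err != nil {
		return err
	}
	for {
		if err := work(tx); err != nil {
			return err
		}
		// RELEASE attempts to commit the underlying KV txn, surfacing
		// deferred serializability errors here instead of at COMMIT.
		if _, err := tx.Exec("RELEASE SAVEPOINT cockroach_restart"); err == nil {
			break
		}
		// Rolling back to the initial savepoint is guaranteed to work;
		// the txn keeps its locks, so the retry is likely to succeed.
		if _, err := tx.Exec("ROLLBACK TO SAVEPOINT cockroach_restart"); err != nil {
			return err
		}
	}
	return tx.Commit()
}
```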
pkg/sql/conn_executor_savepoints.go, line 310 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This copy is unfortunate. Can we do something smarter with aliasing the slices and copying on write? I suspect that even without reference counting to eliminate redundant copies, it would be cheaper to simply copy on every call to push.
Do you mind comparing the relative frequency of these method calls so we can decide on the right solution here? I'm working under the assumption that setTxnRewindPos is going to be called far more often than execSavepointInOpenState.
setTxnRewindPos should be called less often than execSavepointInOpenState; setTxnRewindPos is called on the order of once per transaction.
Still, I've tried for a hot minute to optimize the common handling of a stack of savepoints only consisting of the commitOnRelease one, but it's not entirely trivial because the name of that savepoint is not always cockroach_restart, depending on session variables. I've optimized the cloning of the nil slice, though, which should be useful for connections that never use savepoints.
I might try harder, but after this PR.
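(The nil-slice fast path presumably reduces to something like this, extrapolating from the clone() excerpt quoted earlier in the review:)

```go
func (stack savepointStack) clone() savepointStack {
	if stack == nil {
		// Connections that never issue SAVEPOINT allocate nothing here.
		return nil
	}
	cpy := make(savepointStack, len(stack))
	copy(cpy, stack)
	return cpy
}
```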
pkg/sql/conn_executor_savepoints.go, line 337 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Could you explain again what those semantics are?
Also, move this to the top of the file. It's buried down here but pretty important to understand in order to read the rest of the file.
done
pkg/sql/conn_executor_savepoints_test.go, line 193 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Can we use sql.OpenStateStr and sql.NoTxnStr here?
done
pkg/sql/conn_fsm.go, line 58 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
And non-retryable errors, right?
done
pkg/sql/conn_fsm.go, line 92 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: stray line.
done
pkg/sql/conn_fsm.go, line 115 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
What is this referring to?
stale. removed.
pkg/sql/conn_fsm.go, line 144 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"to from"
but it's correct :)
pkg/sql/conn_fsm.go, line 190 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
What went into the decision to have a separate event instead of a single type eventSavepointRollback struct{Initial fsm.Bool}?
pretty arbitrary, but I think it looks better in the state machine diagram
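(For contrast, the single-event alternative would carry the distinction as a payload flag; a sketch of the suggestion above, not what the PR does:)

```go
// One rollback event instead of two distinct event types.
type eventSavepointRollback struct {
	// Initial is fsm.True when rolling back to an initial savepoint,
	// i.e. what the separate eventTxnRestart expresses today.
	Initial fsm.Bool
}
```

Keeping two event types means the restart edge shows up as its own arrow in the generated state machine diagram, which is the readability point being made here.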
pkg/sql/logictest/testdata/logic_test/manual_retry, line 59 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
!!!
done
pkg/sql/opt/exec/execbuilder/relational.go, line 144 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: no need for the variable.
done
pkg/sql/testdata/savepoints, line 1 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Nice tests!
Mostly Raphael.
Force-pushed from 3c811c6 to e8b3ad4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @knz, and @nvanbenschoten)
pkg/kv/txn_interceptor_seq_num_allocator.go, line 180 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It's worth adding a very small test for this. I know the implementation is trivial, but the role the txnSeqNumAllocator plays in populating savepoints is critical.
done
pkg/sql/conn_executor_savepoints.go, line 264 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Add more to this comment. This is the number of DDL statements that had been issued at the time that the savepoint was created, right?
done
pkg/sql/conn_fsm.go, line 193 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Are these two events limited to cockroach_restart or just commonly linked to it like eventTxnRestart?
the former. Emphasized more in the comment. Releasing a normal savepoint doesn't generate an event.
pkg/sql/conn_fsm.go, line 204 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: keep these in the same order as the type defs. Also ideally in the same order as the cases for a given state in the state transition map.
shuffled a bit
pkg/sql/logictest/testdata/logic_test/manual_retry, line 141 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Might as well quote the savepoint name so that we exactly mirror PG:
nathan=# BEGIN;
BEGIN
nathan=# ROLLBACK TO SAVEPOINT bar;
ERROR: savepoint "bar" does not exist
done
pkg/sql/logictest/testdata/logic_test/manual_retry, line 165 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Could you query with the column names here and below?
done
pkg/sql/testdata/savepoints, line 303 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: s/check/Check/ here and below.
meh
pkg/sql/testdata/savepoints, line 330 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
It's too bad we aren't testing that this causes a txn retry and that the epoch is larger.
I've added a check that we don't see writes from before the rollback.
pkg/sql/testdata/savepoints, line 335 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"other than"?
done
pkg/sql/testdata/savepoints, line 526 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/rewing/rewind/
done
Force-pushed from e8b3ad4 to a99e49e.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @nvanbenschoten)
pkg/sql/opt/exec/execbuilder/builder.go, line 76 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Someone else should sign off on whether this is a valid use of execbuilder.Builder.
cc. @RaduBerinde
Looks fine to me. The whole setsystemconfig trigger thing could be passed to the execbuilder as a function since it's really higher-level logic IMO but no need to do that now.
Reviewed 2 of 3 files at r12, 30 of 41 files at r13, 3 of 3 files at r14.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @knz)
pkg/kv/txn_interceptor_pipeliner.go, line 683 at r14 (raw file):
} // If the tree has not been cloned before, we can attempt a fast path where we
Remove this comment.
pkg/kv/txn_interceptor_pipeliner.go, line 801 at r14 (raw file):
} // AsSlice returns the in-flight writes, ordered by key.
Since we're not exporting any of these methods, name this asSlice.
Force-pushed from 06fc1d2 to ebe8e55.
TFTR!
@rafiss, I've updated the pgjdbc blacklist, removing the 56 tests that are now passing.
I've also run the django test, and it passed a bunch of tests and wanted
var djangoBlacklist20_1 = blacklist{
"admin_views.tests.GroupAdminTest.test_group_permission_performance": "unknown",
}
but it also claimed that only 10 tests failed unexpectedly and making that change would remove more than 10. So I don't know what's going on and I haven't touched the blacklist.
I also ran the hibernate test, which claims that 20 tests failed unexpectedly. I don't immediately see a rhyme in the failed ones, so I'm gonna bury my head in the sand and move on.
sqlalchemy and typeorm passed without any fuss.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @knz and @nvanbenschoten)
pkg/kv/txn_interceptor_pipeliner.go, line 683 at r14 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Remove this comment.
done
pkg/kv/txn_interceptor_pipeliner.go, line 801 at r14 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Since we're not exporting any of these methods, name this asSlice.
done
bors r+
Merge conflict
Before this patch, a transaction was releasing its locks immediately after a statement encountered an error, when entering the Aborted state. With the upcoming support for savepoints, which will allow error recovery, this is no longer desirable (or, at least, not always desirable). This patch removes the rollback that happened when transitioning to the Aborted state. Instead, we defer the rollback to the final ROLLBACK statement, which transitions us out of the Aborted state. Release note: None
We have some metrics around savepoints which are referenced both in some SQL category and in a KV transactions category. I'm removing the references from the KV section; I don't think they belong. And generally I understand that this catalog is not used for anything. Release note: None
Release note (sql change): CockroachDB now collects separate sets of metrics for usage of SAVEPOINT: one set for regular SQL savepoints and one set for uses dedicated to CockroachDB's client-side transaction retry protocol.
This patch adds support for SAVEPOINT <foo>, RELEASE SAVEPOINT <foo>, ROLLBACK TO SAVEPOINT <foo>.

Before this patch, we only had support for the special savepoint cockroach_restart, which had to be placed at the beginning of the transaction and was specifically intended for dealing with transaction retries. This patch implements general support for savepoints, which provide an error recovery mechanism.

The connExecutor now maintains a stack of savepoints. Rolling back to a savepoint uses the recent KV api for ignoring a range of write sequence numbers.

At the SQL level, savepoints differ in two characteristics:
1) savepoints placed at the beginning of a transaction (i.e. before any KV operations) are marked as "initial". Rolling back to an initial savepoint is always possible. Rolling back to a non-initial savepoint is not possible after the transaction restarts (see below).
2) the savepoint named "cockroach_restart" retains special RELEASE semantics: releasing it (attempts to) commit the underlying KV txn. This continues to allow for discovering of deferred serializability errors (i.e. write timestamp pushes by the ts cache). As before, this RELEASE can fail with a retriable error, at which point the client can do ROLLBACK TO SAVEPOINT cockroach_restart (which is guaranteed to work because cockroach_restart needs to be an "initial" savepoint). The transaction continues to maintain all its locks after such an error. This is all in contrast to releasing any other savepoints, which cannot commit the txn and also never fails. See below for more discussion. The cockroach_restart savepoint is only special in its release behavior, not in its rollback behavior.

With the implementation of savepoints, the state machine driving a SQL connection's transactions becomes a lot simpler. There's no longer a distinction between an "Aborted" transaction and one that's in "RestartWait". Rolling back to a savepoint now works the same way across the two states, so RestartWait is gone.

This patch also improves the KV savepoints. They now capture and restore the state of the read spans and the in-flight writes.

Some things don't work (yet):
a) Rolling back to a savepoint created after a schema change will error out. This is because we don't yet snapshot the transaction's schema change state.
b) After a retriable error, you can only rollback to an initial savepoint. Attempting to rollback to a non-initial savepoint generates a retriable error again. If the transaction has been aborted, I think this is the right behavior; no recovery is possible since the transaction has lost its write intents. In the usual case where the transaction has not been aborted, I think we want something better, but it will take more work to get it. I think the behavior we want is the following: after a serializability failure, retrying just part of the transaction should be doable by attempting a ROLLBACK TO SAVEPOINT. This rollback should succeed if all the non-rolled-back reads can be refreshed to the desired timestamp. If they can be refreshed, then the client can simply retry the rolled back part of the transaction. If they can't, then the ROLLBACK should return a retriable error again, allowing the client to attempt a deeper rollback - and so on until the client rolls back to an initial savepoint (which succeeds by definition).

Implementing this would allow for the following nifty pattern:

    func fn_n() {
      for {
        SAVEPOINT savepoint_n
        try {
          fn_n+1()
        } catch retriable error {
          err := ROLLBACK TO SAVEPOINT outer
          if err != nil {
            throw err
          }
          continue
        }
        RELEASE SAVEPOINT savepoint_n
        break
      }
    }

The idea here is that the client is trying to re-do as little work as possible by successively rolling back to earlier and earlier savepoints. This pattern will technically work with the current patch already, except it will not actually help the client in any way since all the rollbacks will fail until we get to the very first savepoint.

There's an argument to be made for making RELEASE SAVEPOINT check for deferred serializability violations (and perhaps other deferred checks - like deferred constraint validation), although Postgres doesn't do any of these.

Anyway, I've left implementing this for a future patch because I want to do some KV work for supporting it nicely. Currently, the automatic restart behavior that KV transactions have is a pain in the ass since it works against what we're trying to do. For the time being, non-initial savepoints remember their txn ID and epoch, and attempting to rollback to them after these changes produces a retriable error automatically.

Fixes cockroachdb#45477
Touches cockroachdb#10735

Release note (sql change): SQL savepoints are now supported. SAVEPOINT <foo>, RELEASE SAVEPOINT <foo>, ROLLBACK TO SAVEPOINT <foo> now work. `SHOW SAVEPOINT STATUS` can be used to inspect the current stack of active savepoints.

Co-authored-by: Raphael 'kena' Poss [email protected]
Co-authored-by: Andrei Matei [email protected]
Force-pushed from ebe8e55 to f1e2a00.
bors r+
Build succeeded
There was no reason for you to rush this merge before I had a look, was there? In fact the release note here is incomplete and there are still a few missing tests. I'll issue the complementary PR.
actually you even botched it... I'll file the additional issues
I merged it cause I'm away until Tuesday. I thought you told me you had looked...
all good - we can fix things next week. Enjoy your weekend
This change updates the syntax diagram definitions and generated BNF for several SAVEPOINT-related statements, specifically:
- Add the SHOW SAVEPOINT STATUS statement to the list of syntax diagrams generated by pkg/cmd/docgen
- Add the SHOW SAVEPOINT STATUS BNF file to the other generated BNF files
- Update ROLLBACK TO SAVEPOINT to note that the savepoint name does not have to be 'cockroach_restart'
It uses the changes in cockroachdb#45794, which enabled docgen for SHOW SAVEPOINT STATUS. It is part of the work surrounding cockroachdb#45566, which added preliminary SQL savepoints support.
Release justification: low-risk update to documentation diagrams
Release note: None
45962: sql: re-add GC job on schema element deletion r=pbardea a=pbardea

This commit creates GC jobs upon the deletion of an index, table or database. Similarly to the previous implementation, it considers the walltime at which the schema change completed to be the drop time of the schema element.

Release note (sql change): Previously, after deleting an index, table, or database the relevant schema change job would change its running status to waiting for GC TTL. The schema change and the GC process are now decoupled into 2 jobs.

Release justification: This is a follow up to the migration of turning schema changes into actual jobs. This commit re-adds the ability to properly GC indexes and tables.

46048: docgen: update savepoint-related definitions, bnfs r=rmloveland a=rmloveland

This change updates the syntax diagram definitions and generated BNF for several SAVEPOINT-related statements, specifically:
- Add the SHOW SAVEPOINT STATUS statement to the list of syntax diagrams generated by pkg/cmd/docgen
- Add the SHOW SAVEPOINT STATUS BNF file to the other generated BNF files
- Update ROLLBACK TO SAVEPOINT to note that the savepoint name does not have to be 'cockroach_restart'

It uses the changes in #45794, which enabled docgen for SHOW SAVEPOINT STATUS. It is part of the work surrounding #45566, which added preliminary SQL savepoints support.

Release justification: low-risk update to documentation diagrams
Release note: None

Co-authored-by: Paul Bardea [email protected]
Co-authored-by: Rich Loveland [email protected]
Informs (but does not resolve) #10735.
This patch adds support for SAVEPOINT <foo>, RELEASE SAVEPOINT <foo>,
ROLLBACK TO SAVEPOINT <foo>.
Before this patch, we only had support for the special savepoint
cockroach_restart, which had to be placed at the beginning of the
transaction and was specifically intended for dealing with transaction
retries. This patch implements general support for savepoints, which
provide an error recovery mechanism.
The connExecutor now maintains a stack of savepoints. Rolling back to a
savepoint uses the recent KV api for ignoring a range of write sequence
numbers.
At the SQL level, savepoints differ in two characteristics:
1) savepoints placed at the beginning of a transaction (i.e. before any
KV operations) are marked as "initial". Rolling back to an initial
savepoint is always possible. Rolling back to a non-initial savepoint is
not possible after the transaction restarts (see below).
2) the savepoint named "cockroach_restart" retains special RELEASE
semantics: releasing it (attempts to) commit the underlying KV txn.
This continues to allow for discovering of deferred serializability
errors (i.e. write timestamp pushes by the ts cache). As before, this
RELEASE can fail with a retriable error, at which point the client can
do ROLLBACK TO SAVEPOINT cockroach_restart (which is guaranteed to work
because cockroach_restart needs to be an "initial" savepoint). The
transaction continues to maintain all its locks after such an error.
This is all in contrast to releasing any other savepoints, which cannot
commit the txn and also never fails. See below for more discussion.
The cockroach_restart savepoint is only special in its release behavior,
not in its rollback behavior.
With the implementation of savepoints, the state machine driving a SQL
connection's transactions becomes a lot simpler. There's no longer a
distinction between an "Aborted" transaction and one that's in
"RestartWait". Rolling back to a savepoint now works the same way across
the two states, so RestartWait is gone.
This patch also improves the KV savepoints. They now capture and restore
the state of the read spans and the in-flight writes.
Some things don't work (yet):
a) Rolling back to a savepoint created after a schema change will error
out. This is because we don't yet snapshot the transaction's schema
change state.
b) After a retriable error, you can only rollback to an initial
savepoint. Attempting to rollback to a non-initial savepoint generates a
retriable error again. If the transaction has been aborted, I think this
is the right behavior; no recovery is possible since the transaction has
lost its write intents. In the usual case where the transaction has not
been aborted, I think we want something better but it will take more
work to get it. I think the behavior we want is the following:
- after a serializability failure, retrying just part of the transaction
should be doable by attempting a ROLLBACK TO SAVEPOINT. This rollback
should succeed if all the non-rolled-back reads can be refreshed to the
desired timestamp. If they can be refreshed, then the client can simply
retry the rolled back part of the transaction. If they can't, then the
ROLLBACK should return a retriable error again, allowing the client to
attempt a deeper rollback - and so on until the client rolls back to an
initial savepoint (which succeeds by definition).
Implementing this would allow for the following nifty pattern:

    func fn_n() {
      for {
        SAVEPOINT savepoint_n
        try {
          fn_n+1()
        } catch retriable error {
          err := ROLLBACK TO SAVEPOINT outer
          if err != nil {
            throw err
          }
          continue
        }
        RELEASE SAVEPOINT savepoint_n
        break
      }
    }
The idea here is that the client is trying to re-do as little work as
possible by successively rolling back to earlier and earlier savepoints.
This pattern will technically work with the current patch already,
except it will not actually help the client in any way since all the
rollbacks will fail until we get to the very first savepoint.
There's an argument to be made for making RELEASE SAVEPOINT check for
deferred serializability violations (and perhaps other deferred checks -
like deferred constraint validation), although Postgres doesn't do any
of these.
Anyway, I've left implementing this for a future patch because I want to
do some KV work for supporting it nicely. Currently, the automatic
restart behavior that KV transactions have is a pain in the ass since it
works against what we're trying to do.
For the time-being, non-initial savepoints remember their txn ID and
epoch and attempting to rollback to them after these changes produces a
retriable error automatically.
Release note (sql change): SQL savepoints are now supported. SAVEPOINT
<foo>, RELEASE SAVEPOINT <foo>, ROLLBACK TO SAVEPOINT <foo> now work.
`SHOW SAVEPOINT STATUS` can be used to inspect the current stack of
active savepoints.
Co-authored-by: Raphael 'kena' Poss [email protected]
Co-authored-by: Andrei Matei [email protected]