kv: scan doesn't return row added in the same txn #40487
Comments
Could you expand on
Are you running multiple KV checks in the same batch or are you issuing multiple batches in parallel? |
Different batches, in parallel. When the batches are run serially I don't see the issue. They are checking different parent tables so it's a bit puzzling. |
diff --git i/pkg/kv/txn_coord_sender.go w/pkg/kv/txn_coord_sender.go
index 5a14058eda..44beb1bb4d 100644
--- i/pkg/kv/txn_coord_sender.go
+++ w/pkg/kv/txn_coord_sender.go
@@ -453,6 +453,7 @@ func NewTxnCoordSenderFactory(
 	if tcf.heartbeatInterval == 0 {
 		tcf.heartbeatInterval = base.DefaultTxnHeartbeatInterval
 	}
+	tcf.heartbeatInterval = 5 * base.DefaultTxnHeartbeatInterval
 	if tcf.metrics == (TxnMetrics{}) {
 		tcf.metrics = MakeTxnMetrics(metric.TestSampleInterval)
 	}
Discovered this yesterday. Increasing the heartbeat interval to trigger more aborted transactions drastically reduces how long it takes for the consistency error to manifest (from taking upwards of 60s to 7-9s, consistently). This points to our current flow cancellation procedure: when the root txn is aborted/abandoned, we set it up to not return results to the client so as to not miss seeing our own writes. Suspiciously we avoid doing this if the flow is local and is using the root txn, both of which apply here. |
CC @andreimatei |
The local can't miss to see its own writes; when using the @irfansharif, are you looking at this? Would you like me to take over? And if so, are Radu's instructions still the best repro? |
Banged my head over this all morning again to no avail, happy to have you take over. The best repro is with the patch I added above. The missing key error only ever occurs after a flurry of aborted transactions now to be retried. Naively this diff seemed awfully suspect, though I'm not familiar enough with pkg/sql to understand the reasoning as stated here. |
I was also going to pull on this thread next, though our usage of |
I think that comment refers to executing different |
Yeah, that was about |
I think the root/leaf stuff might be a red herring. None of the queries in question get run on remote nodes. The main query is a mutation, which can't be distributed. The postqueries contain nodes that prevent distribution ( |
So I think this is simply #25329; one can't have concurrent uses of a root transaction. And so I understand that the distsql union processor is probably broken. I think there's also something funky here:
cockroach/pkg/sql/distsql_running.go Line 543 in 2109039
That check was changed to only affect leaf transactions, but I think it's useless for leaves. More tomorrow. |
Weird.. I assume you can have multiple scans in parallel in a Leaf txn (or distsql would be badly broken)? We could maybe set up a Leaf txn even if it's all local (?) |
Yeah, So, we have a couple of requirements:
Concurrency is introduced by processors inputs with more than one stream, which require synchronizers (or simply a So the plan I've discussed with @jordanlewis is the following:
|
To clarify requirement 2: is it allowed to have read-only queries on leaves concurrently with a root mutation? |
yes |
I've worked on this and made progress. Have yet to deal with the columnar execution world, but @jordanlewis helped me figure out how stuff works there. @RaduBerinde , do you want to disallow mutations under a |
Yeah, very easy to do from opt. Will think about other cases. |
I've made good progress on this, but I've now hit something I wasn't prepared for: I naively thought that all |
So here's where we are on this:
But I've recently realized that there's also another problem, worse than b). Unfortunately, I believe my fix for b) in #41102 actually makes c) more likely. Because the PR introduces uses of the LeafTxn on the gateway too, queries that execute fully on the gateway but are not completely "fused" are now also susceptible. This means queries with a
In other words, I lied when I answered @RaduBerinde's question:
I said yes. I now believe the answer is no. Except the question was implicitly scoped with concurrent uses of the root and leaves on the gateway. I believe even (existing) uses of remote leaves are unsafe. |
Before this patch, a DistSQL flow running on its gateway node would use the RootTxn for all its processors (row-based flows) or all of its operators (vectorized flows). Some of these processors/operators can execute concurrently with one another. RootTxns don't support concurrent requests (see cockroachdb#25329), so some reads could fail to see the transaction's own writes.
This patch rectifies the situation by being discriminating about which parts of the gateway flow use the RootTxn and which parts use LeafTxns. LeafTxns can be used concurrently (and they can be used concurrently with a RootTxn too). Some things need the RootTxn: mutations. Since cockroachdb#40975, mutations can no longer run concurrently with other mutations. So, we use the RootTxn for all processors/operators fused with the DistSQLReceiver, and we use LeafTxns for all others. Processors and operators no longer use the txn from the FlowCtx (that field goes away).
The exact details on how we determine what runs in what transaction depend on the type of flow.
i) For row-based flows, the "head processor" and everything fused with it will use the RootTxn. All the other processors will use the Leaf. This is done in FlowBase.startInternal(). Processors get the txn to use through Run() and Start() (if they implement RowSource).
ii) For vectorized flows, I went a different route. These operators don't have a Run() method (they have an Init(), but that's more inconvenient to pass a txn to). They do, however, have facilities for visiting trees of OpNodes. So, the operators that need a txn declare that by implementing a new KVOp interface. At flow setup time we visit all the operators and give them the right txn, in colexec.SetTxn().
Fixes cockroachdb#40487
Release justification: bug fix
Release note (bug fix): Fix the possibility of confusing error messages caused by transactions that are about to abort failing to see previous writes performed by the same transaction.
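The vectorized approach in ii) can be illustrated with a minimal sketch. Only the names KVOp, OpNode and SetTxn come from the commit message; the method sets, the Txn type, scanOp and joinOp below are hypothetical stand-ins and not the actual colexec API.

```go
package main

import "fmt"

// Txn is a hypothetical stand-in for a transaction handle.
type Txn struct{ Name string }

// OpNode models a node in an operator tree (assumed shape, not the real interface).
type OpNode interface {
	ChildCount() int
	Child(i int) OpNode
}

// KVOp is implemented by operators that issue KV requests and so need a txn.
type KVOp interface {
	SetTxn(txn *Txn)
}

// scanOp is a leaf operator that reads from KV.
type scanOp struct{ txn *Txn }

func (s *scanOp) ChildCount() int  { return 0 }
func (s *scanOp) Child(int) OpNode { return nil }
func (s *scanOp) SetTxn(txn *Txn)  { s.txn = txn }

// joinOp has inputs but performs no KV requests itself, so it is not a KVOp.
type joinOp struct{ inputs []OpNode }

func (j *joinOp) ChildCount() int    { return len(j.inputs) }
func (j *joinOp) Child(i int) OpNode { return j.inputs[i] }

// setTxn visits the whole tree and hands txn to every operator that
// implements KVOp. It returns how many operators received the txn.
func setTxn(root OpNode, txn *Txn) int {
	n := 0
	if kv, ok := root.(KVOp); ok {
		kv.SetTxn(txn)
		n++
	}
	for i := 0; i < root.ChildCount(); i++ {
		n += setTxn(root.Child(i), txn)
	}
	return n
}

func main() {
	left, right := &scanOp{}, &scanOp{}
	tree := &joinOp{inputs: []OpNode{left, right}}
	n := setTxn(tree, &Txn{Name: "leaf"})
	fmt.Println(n, left.txn.Name, right.txn.Name) // 2 leaf leaf
}
```

Declaring the need for a txn through an interface keeps the flow setup code independent of concrete operator types: a new KV-reading operator only has to implement SetTxn to be picked up by the walk.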
This patch makes it so that we can optionally fuse the inputs into unordered and ordered syncs with the sync (and also with the sync's consumer). Before this patch, the fusing logic was bailing on fusing ordered syncs for no reason (other than code complexity). This patch embraces the complexity.
Before this patch, producers of unordered syncs were always run in parallel. This patch makes it so that they're optionally serialized (i.e. each source is consumed before moving on to the next source). The option is taken in case a query ends up running entirely on the gateway (either because it was forced to run on the gateway or because all the data happens to be on the gateway). If that's the case, then the concurrency was not giving us much. Disallowing concurrency in these cases fixes #40487: concurrent use of a Root txn is not kosher and some queries planned entirely on the gateway need to use a Root (i.e. mutations).
The serialization of row-based unordered syncs is done by implementing them through an orderedSync with no ordering. For vectorized ones, I've created a new SerialUnorderedSynchronizer.
Fixes #40487
Release justification: Fixes bug.
Release note: None
This patch makes it so that we can optionally fuse the inputs into unordered and ordered syncs with the sync (and also with the sync's consumer). Before this patch, the fusing logic was bailing on fusing ordered syncs for no reason (other than code complexity). This patch embraces the complexity.
Before this patch, producers of unordered syncs were always run in parallel. This patch makes it so that they're optionally serialized (i.e. each source is consumed before moving on to the next source). The option is taken in case a query ends up running entirely on the gateway (either because it was forced to run on the gateway or because all the data happens to be on the gateway). If that's the case, then the concurrency was not giving us much. Disallowing concurrency in these cases fixes cockroachdb#40487 for "regular queries": concurrent use of a Root txn is not kosher and some queries planned entirely on the gateway need to use a Root (i.e. mutations). Everything going through the "normal" planning process is now asserted to not result in any concurrency if it resulted in a single flow (on the gateway).
The serialization of row-based unordered syncs is done by implementing them through an orderedSync with no ordering. For vectorized ones, I've created a new SerialUnorderedSynchronizer.
Release justification: Fixes bug.
Release note: None
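To make "each source is consumed before moving on to the next source" concrete, here is a rough sketch of a serialized unordered synchronizer. The Source interface, sliceSource and the integer rows are assumptions made for illustration; this is not the SerialUnorderedSynchronizer from the patch, only the same idea in miniature.

```go
package main

import "fmt"

// Source is a hypothetical pull-based row producer; ok=false means exhausted.
type Source interface {
	Next() (row int, ok bool)
}

// sliceSource serves rows from an in-memory slice.
type sliceSource struct{ rows []int }

func (s *sliceSource) Next() (int, bool) {
	if len(s.rows) == 0 {
		return 0, false
	}
	r := s.rows[0]
	s.rows = s.rows[1:]
	return r, true
}

// serialUnorderedSync drains its inputs one after another on the caller's
// goroutine. No goroutines are spawned, so inputs never run concurrently
// and never issue concurrent requests on a shared transaction.
type serialUnorderedSync struct {
	inputs []Source
	cur    int // index of the input currently being drained
}

func (s *serialUnorderedSync) Next() (int, bool) {
	for s.cur < len(s.inputs) {
		if r, ok := s.inputs[s.cur].Next(); ok {
			return r, true
		}
		s.cur++ // current input is exhausted; move on to the next one
	}
	return 0, false
}

// drain pulls every row out of src.
func drain(src Source) []int {
	var out []int
	for {
		r, ok := src.Next()
		if !ok {
			return out
		}
		out = append(out, r)
	}
}

func main() {
	s := &serialUnorderedSync{inputs: []Source{
		&sliceSource{rows: []int{1, 2}},
		&sliceSource{},
		&sliceSource{rows: []int{3}},
	}}
	fmt.Println(drain(s)) // [1 2 3]
}
```

The trade-off is the one the commit message names: serial draining gives up parallelism between inputs, which matters little when every input runs on the gateway anyway.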
Before this patch, a DistSQL flow running on its gateway node would use the RootTxn for all its processors (row-based flows) or all of its operators (vectorized flows) if there are no remote flows. Some of these processors/operators can execute concurrently with one another. RootTxns don't support concurrent requests (see cockroachdb#25329), so some reads could fail to see the transaction's own writes.
This patch fixes things by using a LeafTxn on the gateway in case there's concurrency on the gateway or if there are any remote flows. In other words, the Root is used only if there are no remote flows and no concurrency. This is sufficient for supporting mutations (which need the Root), because mutations force everything to be planned on the gateway and so, thanks to the previous commit, there's no concurrency if that's the case.
Fixes cockroachdb#40487
Touches cockroachdb#24798
Release justification: Fixes bad bugs.
Release note: Fix a bug possibly leading to transactions failing to see their own previous writes (cockroachdb#40487).
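The selection rule stated above fits in a few lines. The sketch below is illustrative only: TxnKind, gatewayTxnKind and the two boolean inputs are hypothetical names, and how the real code detects remote flows or local concurrency is not shown in this thread.

```go
package main

import "fmt"

// TxnKind says which kind of transaction handle the gateway flow uses.
type TxnKind int

const (
	RootTxn TxnKind = iota
	LeafTxn
)

func (k TxnKind) String() string {
	if k == RootTxn {
		return "RootTxn"
	}
	return "LeafTxn"
}

// gatewayTxnKind encodes the rule from the commit message: the Root is used
// only if there are no remote flows and no concurrency on the gateway.
func gatewayTxnKind(hasRemoteFlows, hasLocalConcurrency bool) TxnKind {
	if hasRemoteFlows || hasLocalConcurrency {
		return LeafTxn
	}
	return RootTxn
}

func main() {
	fmt.Println(gatewayTxnKind(false, false)) // RootTxn
	fmt.Println(gatewayTxnKind(true, false))  // LeafTxn
	fmt.Println(gatewayTxnKind(false, true))  // LeafTxn
}
```

Mutations fall into the first case: they force planning onto the gateway, and the serialization from the previous commit removes gateway concurrency, so they still get the Root they need.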
I've been experimenting with TPCC and the new foreign key path. When I made an improvement that runs multiple FK checks in parallel, I started getting occasional violation errors. I was able to reproduce this locally and capture some traces.
Repro steps:
origin/fk-violation-repro branch; it feels like the error happens only when the server is under heavy load so I used GOMAXPROCS=4.
The error is always the same:
This is the parent row in the order table which should have been inserted in the same transaction.
I set up a local zipkin and below is the span for the join reader. You can find the relevant span by looking for the omfg tag ("annotation query" in zipkin UI). All 9 rows reference the same order row (57, 2, 3029) so we scan 1 span. The KV request returns no rows (0 loop iterations).
This is the corresponding distsender span:
Earlier spans in the transaction confirm that we did in fact insert a row with these values.