Batch inserts don't reliably work in 19.2 #43017
Comments
Zendesk ticket #4218 has been linked to this issue.
cc @RaduBerinde @andreimatei for triage
I don't think the planning or SQL execution of upsert has changed materially between 19.1 and 19.2. I would start with KV. We will likely need to reproduce.
Summarizing the private file in gdrive: The table has a composite PK with types (INT8, STRING). The INT8 portion of the PK is also a foreign key referencing another table, and has a secondary index (as required by the FK, even though it's redundant). The statements are
Tagging @nvanbenschoten since parallel commits is the most likely suspect for a change between 19.1 and 19.2 (also RETURNING NOTHING was made a no-op in #35959, also for 19.2).
The 7 concurrent transactions are each writing 256-row batches. Are we sure that these batches are disjoint from one another or could they be overlapping on either their primary key or on
Could we get the query plan for the COUNT statement? I tried reproducing but didn't have any luck. @ricardocrdb could you try to get more information about what exactly was generating the batch inserts? A script that reproduces would get us a very long way.
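As a starting point for such a script, here is a minimal Python sketch of the disjointness question above. It builds 7 batches of 256 (INT8, STRING) keys, checks every pair of batches for shared primary keys, and renders a multi-row UPSERT for each batch. The table name `child`, the column names `parent_id` and `name`, and the `%s` placeholder style (as used by psycopg2-style drivers) are assumptions for illustration, not the reporter's actual schema or client.

```python
from itertools import combinations

BATCH_SIZE = 256  # rows per UPSERT, as reported in the issue
NUM_TXNS = 7      # concurrent transactions, as reported in the issue


def make_batches(num_txns=NUM_TXNS, batch_size=BATCH_SIZE):
    """Build one batch of (INT8, STRING) primary keys per transaction.

    Each transaction uses its own integer key, so the batches are
    disjoint by construction.
    """
    return [
        [(txn, f"item-{i:04d}") for i in range(batch_size)]
        for txn in range(num_txns)
    ]


def overlapping_keys(batches):
    """Return {(i, j): shared_keys} for each pair of batches sharing a key."""
    overlaps = {}
    for (i, a), (j, b) in combinations(enumerate(batches), 2):
        shared = set(a) & set(b)
        if shared:
            overlaps[(i, j)] = shared
    return overlaps


def upsert_statement(batch, table="child"):
    """Render a parameterized multi-row UPSERT.

    The table and column names are hypothetical placeholders.
    """
    placeholders = ", ".join(["(%s, %s)"] * len(batch))
    sql = f"UPSERT INTO {table} (parent_id, name) VALUES {placeholders}"
    params = [value for row in batch for value in row]
    return sql, params
```

Running `overlapping_keys` over the batches a real client generates would answer the primary-key half of the question before any statement is sent to the cluster.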
If we suspect that parallel commits is responsible, we can have them try disabling it and seeing if they can still reproduce. This would help narrow down what's going wrong.
FWIW, I'd be pretty surprised if that changes anything because these transactions shouldn't even be using parallel commits. If they're only running a single KV batch which is touching more than 128 keys (the default for
@andy-kimball made a good point that we could easily test this by changing the
@ricardocrdb and I met with the original reporter of this issue today. After some back and forth, we were able to narrow the issue down to a bug in the vectorized execution engine's handling of ordered, limited scans. The customer was maintaining a partial state of a table in their client application and it was getting confused by a buggy
The customer seems to be OK with leaving the vectorized execution engine disabled until the issue is resolved. @ricardocrdb you mentioned that you were going to open a new issue to track the scan bug and attach the data set we were able to gather. Have you gotten a chance to do so? I'm going to go ahead and close this issue once we have the new one.
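For anyone who wants to sanity-check results from an ordered, limited scan on the client side, a rough Python sketch follows. It compares one page of rows against an unlimited scan of the same table and reports anything inconsistent. The function name and its arguments are hypothetical; it also assumes the ordering key is unique (such as the primary key) and that Python's tuple ordering matches the database's ordering for that key.

```python
def check_ordered_limited_scan(page, all_rows, limit, key=lambda row: row):
    """Return a list of problems found in an ORDER BY ... LIMIT result.

    page     -- rows returned by the ordered, limited scan
    all_rows -- rows returned by an unlimited scan of the same table
    limit    -- the LIMIT used for the page
    key      -- extracts the ordering key from a row (assumed unique)
    """
    problems = []
    keys = [key(row) for row in page]
    if keys != sorted(keys):
        problems.append("page is not sorted by the ordering key")
    if len(page) > limit:
        problems.append(f"page has {len(page)} rows, more than LIMIT {limit}")
    # With a unique key, the first `limit` rows of the sorted full scan
    # are the only correct answer for the page.
    expected = sorted(all_rows, key=key)[:limit]
    if list(page) != expected:
        problems.append("page differs from the first LIMIT rows of the full scan")
    return problems
```

An empty list means the page is consistent with the full scan. Both queries would need to read at the same timestamp for the comparison to be meaningful on a table that is being written to.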
A user is running into a possible regression in 19.2 where a transaction that involves batch UPSERTs of 256 rows commits, but the results are not shown when performing a count of the added rows. The user is performing an UPSERT that can be found at the location here. This also contains the DDL for the relevant tables for the query.
The customer has reported that the same transactions and batch UPSERT behave as expected in 19.1.3, as a COUNT shows the newly added rows.