sql: crdb_internal.jobs cannot be safely joined with other tables #62415
Comments
@fqazi to bisect
After bisecting it came down to:
@ajwerner Do you have any thoughts about the right thing here? This is some sort of issue between crdb_internal.force_retry (for the WHEN COMPLETE clause) and QueryBufferedEx. I don't know if creating a new transaction in the internal table would be correct either.
Fascinating. Maybe we can rework the query. Let me chew on this a bit. Thanks for the bisect.
I'm afraid we're violating some principles with some of the new streaming stuff inside of virtual tables. Namely, cockroach transactions don't support parallelism. To deal with this, we have this cool device called a leaf transaction, which isn't too heavy in terms of resources but comes with some rules for how to deal with its errors and its state propagation. I don't think we're using these leaf transactions when we kick off these concurrent scans. @yuzefovich let's sync up on what to do about this.
I'm worried that it's even worse than this. I'm worried we can't even get a leaf to work properly here.
SHOW JOBS WHEN COMPLETE (SELECT job_id FROM [SHOW JOBS]) broken
I briefly looked into this, and I agree that it is pretty bad. I'm not sure what to do as I'm not very familiar with the internals of how SQL interacts with KV txns, so definitely let's chat, Andrew. I also audited the usage of the streaming internal executor API and found the following possibly concerning additional callsites:
I've just talked with Andrew, and we believe we see a good path forward that doesn't require very invasive changes and should be easily backportable without sacrificing the gains that streaming internal executor work provided. The idea is that we'll require synchronization between the |
FWIW/FYI There's a tech note in docs/tech-notes that explains leaf and root txns and what is and isn't allowed.
@yuzefovich I'm re-opening this until we get the backports we want merged into 21.1.
#62923 is now merged which fixes this issue - hopefully - for good.
Describe the problem
This seems like a major regression. This works on 20.2.6. I don't understand the source of concurrency. This query does some crazy stuff down here:
cockroach/pkg/sql/delegate/show_jobs.go
Lines 63 to 71 in 0796cbc
The root of the problem is that we have this streaming iterator thing which we kick off underneath a virtual plan node. That thing, however, concurrently uses the same transaction as we use elsewhere. This leads to problems. The below query is just one example. There are plenty of other examples.