distsql: expressions evaluation uses the wrong Txn sometimes #41992
Comments
Very good idea. |
I started thinking about this issue, and here are some thoughts. I think we want to change
This will allow us to figure out what txn the flow should be using before
(I'm thinking that we probably should do option 2 with the flow also having a reference to txn, but without processors/operators having a reference to the flow.) This will allow us to remove The main complication of Thoughts? |
I think you're on to something here but this needs to be thought about a little more. It would become a performance problem to multiply the number of arguments (and overall number of bytes on the stack) threaded at every level of the Here's a more thorough analysis: the
So here you are. To start, IMHO merely threading Here's a strategy that probably works better:
In this story the However if we ever consider running multiple scalar evaluations for the same statement in parallel in the future, we may want to share a single |
Nice analysis. Here's a potential counterpoint: per-row Is this mutually agreeable? If so, I would argue that it's not a big deal to thread an extra argument through. But if you say it's still disagreeable that's fine. I just want us to move away from using "performance of |
I don't have a strong opinion on whether the I have a broader question than all of this, though. Do we actually ever need the transaction in builtins (or, at least, do the 18 pg_ builtins that currently use it really need to use it)? What is it exactly that needs to be done in the same transaction as the one running the statement? In other words, what writes from said txn need to be seen by the queries done by the builtins? Cause if there's no good reason to use it, I'd rather not use it at all. Is it about seeing dirty schema elements? Cause for that, perhaps we could use a more restricted schema accessor interface. |
So during my travel I thought of something which changed my position a little. See below. Replying to Andrei first:
yes - they inspect the schema descriptors via vtables.
DDL and privilege grants.
The builtins issue SQL queries against pg_catalog tables. Good luck virtualizing all this. (The complexity is not warranted IMHO)
I like this argument too. New learnings on my side:
Here's an approach. Look at So here I'd suggest something like this. an |
Simply moving the |
What you call "cosmetics" is another team's cognitive overhead or the avoidance thereof.
I certainly would support that work too! But maybe there's something in-between where we only extract part of the fields into a differently scoped struct. Also I'd like to underline that once there are multiple differently-scoped structs we'll need another struct to hold these references and pass them around, which is pretty much what I was suggesting above. |
We have marked this issue as stale because it has been inactive for |
still current |
We have marked this issue as stale because it has been inactive for |
@yuzefovich @rharding6373 @DrewKimball I think we fixed this right? |
I don't think so - Jane and Andrew did some work around the internal executors to tie them to a concrete txn, so perhaps this issue has been mostly addressed. However, this space seems quite fragile and deserves more attention at some point, and this issue has lots of useful context, so I'll keep it open. |
Gateway flows will sometimes erroneously use the Root txn for evaluating expressions, when they should be using a Leaf.
The process of creating the processors/operators in a flow makes a copy of the EvalCtx from the FlowCtx by using NewEvalCtx. This generally happens when processors/operators are instantiated. The EvalCtx contains a *client.Txn. On the gateway, that transaction is initially the Root transaction. However, if there are any remote flows or if the gateway flow has any concurrency, we'll later switch the transaction that the gateway flow uses to a Leaf. We're switching FlowCtx.Txn and FlowCtx.EvalCtx.Txn. Unfortunately, by the time we switch that txn, all the processors have already captured the old one for the purposes of expression evaluation. This is bad because the gateway flow can have concurrency in it, and RootTxns don't support concurrency (#25329). Worse, a RootTxn cannot be used in conjunction with remote leaves (this is another type of concurrency, really) because the Root might refresh at an inopportune time, causing write skew.
The txn is used during expression evaluation by some built-in functions - many pg_ ones use it to query system tables using the internalExecutor.
I think the solution here is to pass the transaction explicitly to the builtins that need it (one way or another). In fact I think it'd be a good idea to take the txn out of the EvalCtx completely and leave the EvalCtx for immutable session attributes.
This is related to #15670, which complains that the context.Context that is captured through the EvalCtx and used by expression evaluation is similarly the wrong one.
It is also related to #41222, which complains that we might be failing to collect the metadata of transactions used for expression evaluation. This all speaks to the fact that expression evaluation was kinda bolted onto DistSQL processors without sufficient smoothing.
Jira issue: CRDB-5401