sql: internal executor doesn't have session init'd properly and is just using go defaults #70888
Comments
What's the context in which this internal executor is being executed? You're correct that the global one hanging off the execCfg doesn't have any session data set. This came up recently when I was debugging something. The values from Lines 178 to 179 in d88ece5
If you use the |
I didn't catalog them all but I think it was mostly jobs related? |
Yeah, jobs were not doing anything with session state. My guess is that we should rework the "default" session state for the global internal executor. I think it's a pretty small change, but I suspect there'll be some surprising fallout. |
To be exact, I was going around in circles trying to figure out why EnableZigzagJoin was false. This is purely about initializing sessions with the correct default state, which isn't necessarily the same as the Go zero-value state; that seems like an accidental oversight. |
Right, agreed. I ran into this the other day too while doing something similar (slack thread). Very painful. I'm just saying the fix is to change the logic here, where we construct the session state if it's not provided (linked above and again below). I speculate that it'll break something subtle that's relying on these zero values. Lines 178 to 179 in d88ece5
|
I think #71246 might fix this -- is that right @RichardJCai ? |
Yeah I believe it should since we now have to pass session data to create the InternalExecutor. |
Woot, thanks, I'll verify and close this when that PR lands. |
Hm nevermind, I might have been mistaken. There is now a |
@rafiss This came up in Queries triage/discussion. I wasn't sure if there were plans to tackle it with https://cockroachlabs.atlassian.net/browse/CRDB-14492? |
cc @ZhouXing19 the (perhaps reluctantly) up and coming internal executor expert. |
I gave this another go on Friday and didn't get super far but regardless happy to collab or handoff. Probably one of those things better to get into 22.2 early to weed out all the gremlins. |
I'll pick this up, maybe along with #80262, which is refactoring the internal executor initialization. |
@ZhouXing19 When you take a look here, could you compile a list of settings where the go default differs from the CockroachDB default? |
What's the reason for needing this list? Also, I'm pretty sure the list is just what Tommy listed in the body of the issue. |
It would be nice to see where the Go defaults and the cluster defaults diverge; it would be a much smaller list. The problem is that not all internal executors are the same, and there are specific overrides at play (like migration jobs opting out of distsql). The crime is that things like InsertFastPath are disabled. |
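One way to compile such a list mechanically is to walk the fields of a "defaults" struct and report those whose value is not the Go zero value. The sketch below is only an illustration: `sessionData`, its fields, the chosen default values, and `divergingFields` are hypothetical stand-ins, not CockroachDB's actual types or API.

```go
package main

import (
	"fmt"
	"reflect"
)

// sessionData is a hypothetical, heavily simplified stand-in for the real
// session data struct. Field names are borrowed from the discussion.
type sessionData struct {
	EnableZigzagJoin bool
	InsertFastPath   bool
	SomeOffByDefault bool // hypothetical field whose default equals the zero value
}

// divergingFields returns the names of the struct fields in defaults whose
// value differs from the Go zero value for that field's type.
// It assumes defaults is a struct value (not a pointer).
func divergingFields(defaults interface{}) []string {
	v := reflect.ValueOf(defaults)
	t := v.Type()
	var names []string
	for i := 0; i < v.NumField(); i++ {
		if !v.Field(i).IsZero() {
			names = append(names, t.Field(i).Name)
		}
	}
	return names
}

func main() {
	// Illustrative defaults only; the real defaults are assumptions here.
	defaults := sessionData{EnableZigzagJoin: true, InsertFastPath: true}
	fmt.Println(divergingFields(defaults))
}
```

Per-executor overrides (such as a job opting out of distsql) would still need to be reviewed by hand, since a comparison like this only sees one set of defaults.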
I think I was more concerned here since cluster defaults can be updated so all these values can diverge but I think in this thread we're actually talking about the CRDB "default" assuming it's unchanged by users. I see what you mean now. |
Although you do bring up a good point about cluster settings that have side effects: for example, if sql.trace.txn.threshold is on, should that apply to internal executors? |
Made a list for entries in |
…ecutor() This commit introduces two functions that allow users to run SQL statements with an internal executor. We intend to limit the usage of a real internal executor to inside these functions, instead of leaving it free-floating or hanging off certain structs. In other words, we restrict the initialization of an internal executor. The motivation is that if an internal executor is used to run multiple SQL statements transactionally, these executions are expected to share the same set of info (such as descriptor collections) across their conn executors. However, this rule is easily forgotten by users of internal executors. Hence we provide an interface that wraps the initialization of internal executors together with the query executions, so that users don't need to worry about it. Informs: once all existing usages of the internal executors are replaced with the new interfaces proposed here, cockroachdb#70888 should be solved. Release note: None
I keep stumbling over this. I added a new feature and started with a cluster setting. Then I switched it to use a session variable, and I almost thought I had done it right, but then noticed that the CPU usage of background internal queries was still high. It's awkward because I don't know that I feel comfortable setting my session variable up as part of the "minimal session data". |
In the session data used by the internal executor, all the defaults appear to be just the Go zero values:
Is this correct? Shouldn't we init some values to their cluster setting defaults? I'm thinking of EnableZigzagJoin, InsertFastPath, DistSQLMode, VectorizeMode, and WorkMemLimit (it looks like we interpret 0 as the default, i.e. 64MB, here). I'm worried about internal queries running slowly because they aren't using all our chops.
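A minimal sketch of the concern, assuming a simplified struct: the `sessionData` type, the `newDefaultSessionData` helper, and the specific default values below are hypothetical and chosen only for illustration (the 64MB figure is the one mentioned above). A struct declared without an initializer gets Go zero values, so boolean optimizations end up disabled unless something sets them explicitly.

```go
package main

import "fmt"

// sessionData is a hypothetical, heavily simplified stand-in for the
// session data struct discussed in this issue.
type sessionData struct {
	EnableZigzagJoin bool
	InsertFastPath   bool
	WorkMemLimit     int64 // bytes; 0 is treated as "use the default" per the issue text
}

// newDefaultSessionData is a hypothetical constructor that applies explicit
// defaults instead of relying on Go zero values. The values are illustrative.
func newDefaultSessionData() sessionData {
	return sessionData{
		EnableZigzagJoin: true,
		InsertFastPath:   true,
		WorkMemLimit:     64 << 20, // 64MB
	}
}

func main() {
	var zero sessionData // what an uninitialized session looks like
	def := newDefaultSessionData()

	fmt.Printf("zero value: %+v\n", zero)
	fmt.Printf("defaults:   %+v\n", def)
}
```

Note that a sentinel like "0 means default" works for WorkMemLimit, but a bool has no spare value to play that role, which is why fields such as EnableZigzagJoin silently end up off.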
Jira issue: CRDB-10271