Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

physicalplan: preevaluate subqueries on LocalExprs and always set LocalExprs #49891

Merged
merged 3 commits into from
Jun 10, 2020

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Jun 5, 2020

physicalplan: preevaluate subqueries on LocalExprs

When the plan is local, we do not serialize expressions. Previously, in
such a case we would also not preevaluate the subqueries in the
expressions which made EXPLAIN (VEC) return unexpected plan (there
would tree.Subquery in the expression which we don't support in the
vectorized, so we would wrap the plan). Now we will preevalute the
subqueries before storing in the processor spec. AFAICT it affects only
this explain variant and nothing else.

Release note: None

colexec: improve expression parsing

This commit introduces colexec.ExprHelper that helps with expression
processing. Previously, we were allocating a new execinfra.ExprHelper
and were calling Init on it in order to get the typed expression from
possibly serialized representation of each expression. Now, this new
expression helper is reused between all expressions in the flow on
a single node.

There is one caveat, however: we need to make sure that we force
deserialization of the expressions during SupportsVectorized check if
the flow is scheduled to be run on a remote node (different from the one
that is performing the check). This is necessary to make sure that the
remote nodes will be able to deserialize the expressions without
encountering errors (if we don't force the serialization during the
check, we will use LocalExpr - if available - and might not catch
things that we don't support).

Release note: None

physicalplan: always store LocalExpr

Previously, we would set either LocalExpr (unserialized expression,
only when we have the full plan on a single node) or Expr (serialized
expression, when we have distributed plan as well as in some tests).
However, we could be setting both and making best effort to reuse
unserialized LocalExpr on the gateway even if the plan is distributed.
And this commit adds such behavior.

Fixes: #49810.

Release note: None

@yuzefovich yuzefovich added the do-not-merge bors won't merge a PR with this label. label Jun 5, 2020
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich force-pushed the eval-subquery branch 4 times, most recently from 0a4f2c0 to e19e92a Compare June 5, 2020 03:41
@yuzefovich yuzefovich changed the title WIP physicalplan: preevaluate subqueries on LocalExprs physicalplan: preevaluate subqueries on LocalExprs and always set LocalExprs Jun 5, 2020
@yuzefovich yuzefovich force-pushed the eval-subquery branch 4 times, most recently from ee33247 to 5310962 Compare June 10, 2020 03:07
When the plan is local, we do not serialize expressions. Previously, in
such a case we would also not preevaluate the subqueries in the
expressions which made `EXPLAIN (VEC)` return unexpected plan (there
would `tree.Subquery` in the expression which we don't support in the
vectorized, so we would wrap the plan). Now we will preevalute the
subqueries before storing in the processor spec. AFAICT it affects only
this explain variant and nothing else.

Release note: None
@yuzefovich yuzefovich removed the do-not-merge bors won't merge a PR with this label. label Jun 10, 2020
This commit introduces `colexec.ExprHelper` that helps with expression
processing. Previously, we were allocating a new `execinfra.ExprHelper`
and were calling `Init` on it in order to get the typed expression from
possibly serialized representation of each expression. Now, this new
expression helper is reused between all expressions in the flow on
a single node.

There is one caveat, however: we need to make sure that we force
deserialization of the expressions during `SupportsVectorized` check if
the flow is scheduled to be run on a remote node (different from the one
that is performing the check). This is necessary to make sure that the
remote nodes will be able to deserialize the expressions without
encountering errors (if we don't force the serialization during the
check, we will use `LocalExpr` - if available - and might not catch
things that we don't support).

Release note: None
Previously, we would set either `LocalExpr` (unserialized expression,
only we have the full plan on a single node) or `Expr` (serialized
expression, when we have distributed plan as well as in some tests).
However, we could be setting both and making best effort to reuse
unserialized `LocalExpr` on the gateway even if the plan is distributed.
And this commit adds such behavior.

Release note: None
Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r1, 8 of 8 files at r2, 6 of 6 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis)

@yuzefovich
Copy link
Member Author

yuzefovich commented Jun 10, 2020

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Jun 10, 2020

Build succeeded

@craig craig bot merged commit 2ae3d3c into cockroachdb:master Jun 10, 2020
@yuzefovich yuzefovich deleted the eval-subquery branch June 10, 2020 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

physicalplan: avoid serialization of expressions on the gateway in distributed flows
3 participants