Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: cleanup making copies of the eval context #126036

Merged
merged 4 commits into from
Jun 25, 2024

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Jun 21, 2024

This PR audits all processors to avoid now-redundant (as of #125715) copies of the eval.Context as well as performs some miscellaneous cleanup along the way.

Some microbenchmarks:

name                   old time/op    new time/op    delta
LastWriteWinsInsert-8    1.37ms ±13%    1.41ms ±16%    ~     (p=0.265 n=20+20)

name                   old alloc/op   new alloc/op   delta
LastWriteWinsInsert-8     253kB ± 2%     247kB ± 2%  -2.65%  (p=0.000 n=19+20)

name                   old allocs/op  new allocs/op  delta
LastWriteWinsInsert-8     1.67k ± 0%     1.67k ± 0%  -0.18%  (p=0.033 n=18+19)
FlowSetup/vectorize=true/distribute=true-8       144µs ± 3%     142µs ± 3%   -1.25%  (p=0.005 n=25+25)
FlowSetup/vectorize=true/distribute=false-8      139µs ± 6%     138µs ± 3%     ~     (p=0.260 n=24+22)
FlowSetup/vectorize=false/distribute=true-8      137µs ± 2%     138µs ± 4%     ~     (p=0.158 n=22+25)
FlowSetup/vectorize=false/distribute=false-8     132µs ± 2%     133µs ± 3%   +1.21%  (p=0.001 n=24+25)

name                                          old alloc/op   new alloc/op   delta
FlowSetup/vectorize=true/distribute=true-8      22.9kB ± 9%    18.9kB ± 5%  -17.18%  (p=0.000 n=24+21)
FlowSetup/vectorize=true/distribute=false-8     21.3kB ±10%    17.7kB ±12%  -17.04%  (p=0.000 n=22+22)
FlowSetup/vectorize=false/distribute=true-8     27.1kB ± 0%    25.3kB ± 0%   -6.62%  (p=0.000 n=20+20)
FlowSetup/vectorize=false/distribute=false-8    25.6kB ± 0%    23.8kB ± 0%   -6.99%  (p=0.000 n=20+20)

name                                          old allocs/op  new allocs/op  delta
FlowSetup/vectorize=true/distribute=true-8         234 ± 2%       231 ± 1%   -1.01%  (p=0.000 n=20+21)
FlowSetup/vectorize=true/distribute=false-8        219 ± 8%       217 ± 7%   -0.93%  (p=0.016 n=22+22)
FlowSetup/vectorize=false/distribute=true-8        229 ± 0%       228 ± 0%   -0.44%  (p=0.000 n=21+21)
FlowSetup/vectorize=false/distribute=false-8       212 ± 0%       211 ± 0%   -0.47%  (p=0.000 n=20+20)

Epic: CRDB-39063.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich force-pushed the eval-ctx branch 4 times, most recently from 12e4ac4 to c67bd53 Compare June 21, 2024 22:09
@yuzefovich yuzefovich requested a review from mgartner June 21, 2024 22:09
@yuzefovich yuzefovich marked this pull request as ready for review June 21, 2024 22:09
@yuzefovich yuzefovich requested review from a team as code owners June 21, 2024 22:09
@yuzefovich yuzefovich requested review from dt, rharding6373 and DrewKimball and removed request for a team, dt and rharding6373 June 21, 2024 22:09
Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 10 of 10 files at r1, 20 of 20 files at r2, 3 of 3 files at r3, 29 of 29 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @yuzefovich)


pkg/sql/execinfra/flow_context.go line 49 at r4 (raw file):

	// expressions with this EvalCtx should get a copy with NewEvalCtx instead
	// of using this field.
	EvalCtx *eval.Context

Would making this un-exported be helpful in limiting its usage?


pkg/sql/execinfra/processorsbase.go line 750 at r4 (raw file):

// execinfrapb.ProcessorSpec for more details).
//
// NB: it is assumed that the caller will not modify the eval context.

In the long-term it'd be nice to follow the example of mutable and immutable descriptors so that we can lean on the type system to enforce this. The API would make it impossible for an ImmutableEvalCtx to be mutated (outside of a small part of trusted code).


pkg/sql/execinfra/processorsbase.go line 794 at r4 (raw file):

		// If we haven't created a copy of the eval context, and we have some
		// renders, then we'll need to create a copy ourselves since we're going
		// to use the ExprHelper which might mutate the eval context.

ExprHelper mutates evalCtx only temporarily by pushing then popping an ivar container. Is that not safe? Or is there another mutation happening than I'm missing?

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 10 of 10 files at r1, 20 of 20 files at r2, 3 of 3 files at r3, 29 of 29 files at r4, all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/execinfra/processorsbase.go line 135 at r4 (raw file):

	coreOutputTypes []*types.T,
	semaCtx *tree.SemaContext,
	evalCtx *eval.Context,

[nit] Why not make it the responsibility of ProcOutputHelper to copy the evalCtx? It has the PostProcessSpec, so should be able to check for render expressions, right?

All processors that use ProcessorBaseNoHelper already have a reference
to the flow context, so storing it explicitly in the processor is
redundant.

Release note: None
This commit replaces some usages of `ProcessorBaseNoHelper.EvalCtx` field
with `FlowCtx.EvalCtx` whenever it's clear that it's safe (meaning that
those usages do not modify the eval context). It additionally replaces
usages of `EvalCtx.Settings` with `FlowCtx.Cfg.Settings`. Overall, the
goal is to reduce the usage of the eval context outside of actual
expression evaluation, and this commit should effectively be a no-op.

Release note: None
This commit removes embedded `execinfra.ProcOutputHelper`s from some
processors where they weren't actually used (because either the
post-processing spec is empty or the processor only produces metadata).
This should effectively be a no-op change.

Release note: None
This commit audits all processors to ensure that we make copies of the
eval context only when necessary. Up until recently every processor
mutated the context by modifying the `context.Context` field, but now
that field has been removed, so only a handful of processors mutate the
context (namely, whenever a processor explicitly mutates the context or
when it uses the rendering / filtering capabilities of the ExprHelper).
This commit makes it explicit whenever a copy is needed and uses the
flow's eval context when no mutation occurs. In order to clarify
different usages it removes `ProcessorBaseNoHelper.EvalCtx` field and
moves it into each processor if it needs it. Rendering capabilities are
used implicitly by many processors via `ProcOutputHelper`, and if
a processor itself doesn't need a mutable eval context, we can create
a copy for the helper without exposing it to the processor itself.

Release note: None
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball and @mgartner)


pkg/sql/execinfra/flow_context.go line 49 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Would making this un-exported be helpful in limiting its usage?

I don't think that we want to be limiting usage of this field per se - it exposes a lot of things that are needed by many processors.

I think the right way forward - philosophically speaking - is breaking down eval.Context into smaller pieces and moving some of them directly into FlowCtx object. Raphael did an audit of the fields in the eval.Context a while ago, but it'll be a lot of effort to actually achieve it. So for now treating FlowCtx.EvalCtx as a read-only field seems acceptable (vast majority of processors use it in this way).


pkg/sql/execinfra/processorsbase.go line 135 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] Why not make it the responsibility of ProcOutputHelper to copy the evalCtx? It has the PostProcessSpec, so should be able to check for render expressions, right?

Yep, good point, done.


pkg/sql/execinfra/processorsbase.go line 750 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

In the long-term it'd be nice to follow the example of mutable and immutable descriptors so that we can lean on the type system to enforce this. The API would make it impossible for an ImmutableEvalCtx to be mutated (outside of a small part of trusted code).

Yeah, it's one way to improve things. Another is to break down the eval.Context altogether.


pkg/sql/execinfra/processorsbase.go line 794 at r4 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

ExprHelper mutates evalCtx only temporarily by pushing then popping an ivar container. Is that not safe? Or is there another mutation happening than I'm missing?

Yeah, that's the only mutation AFAIK. The problem is that we might have concurrency between processors running on the same node, so if two processors were to use the same eval.Context via the ExprHelper, we could have corruption / data races.

@craig craig bot merged commit c6d2f7d into cockroachdb:master Jun 25, 2024
21 of 22 checks passed
@yuzefovich yuzefovich deleted the eval-ctx branch June 25, 2024 00:36
@mgartner
Copy link
Collaborator

pkg/sql/execinfra/processorsbase.go line 794 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Yeah, that's the only mutation AFAIK. The problem is that we might have concurrency between processors running on the same node, so if two processors were to use the same eval.Context via the ExprHelper, we could have corruption / data races.

Ahh makes sense. Thanks for explaining.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants