colbuilder: vectorizing rendering on top of wrapped processors #83689

yuzefovich · 2022-07-01T05:20:02Z

sql: remove no longer used fields from PostProcessSpec

This commit removes no longer used OriginalOutputColumns and
OriginalRenderExprs fields from the PostProcessSpec - they
are no longer needed because we now propagate the set of needed
columns in the IndexFetchSpec.

Release note: None

colbuilder: check native support via processor core rather than the spec

This commit is just a mechanical change.

Release note: None

colbuilder: vectorizing rendering on top of wrapped processors

Previously, whenever we needed to wrap a row-by-row processor into the
vectorized flow, we would pass in the whole PostProcessSpec to that
processor. When this was originally implemented several years ago, this
was needed so that processors could determine the set of "needed
columns" (which should be decoded from KV responses). However, recently
we refactored that, and now the set of needed columns is passed via the
IndexFetchSpec. This means that we no longer need to pass the whole
PostProcessSpec when wrapping, and this commit takes advantage of that
observation in order to vectorize the evaluation of render expressions.
In particular, vectorizing of render expressions allows us to plan more
efficient vectorized builtins. We still pass all other parts of the
PostProcessSpec (meaning that the wrapped processor is still
responsible for projections, limits, and offsets) since those operations
will limit the number of datums that need to be converted to the
columnar in-memory format as well as provide limit hints to the wrapped
processors.

Release note: None

cockroach-teamcity · 2022-07-01T05:20:10Z

This change is

This commit removes no longer used `OriginalOutputColumns` and `OriginalRenderExprs` fields from the `PostProcessSpec` - they are no longer needed because we now propagate the set of needed columns in the `IndexFetchSpec`. Release note: None

This commit is just a mechanical change. Release note: None

Previously, whenever we needed to wrap a row-by-row processor into the vectorized flow, we would pass in the whole `PostProcessSpec` to that processor. When this was originally implemented several years ago, this was needed so that processors could determine the set of "needed columns" (which should be decoded from KV responses). However, recently we refactored that, and now the set of needed columns is passed via the `IndexFetchSpec`. This means that we no longer need to pass the whole `PostProcessSpec` when wrapping, and this commit takes advantage of that observation in order to vectorize the evaluation of render expressions. In particular, vectorizing of render expressions allows us to plan more efficient vectorized builtins. We still pass all other parts of the `PostProcessSpec` (meaning that the wrapped processor is still responsible for projections, limits, and offsets) since those operations will limit the number of datums that need to be converted to the columnar in-memory format as well as provide limit hints to the wrapped processors. Release note: None

cucaroach

Reviewed 2 of 2 files at r1, 1 of 1 files at r2, 1 of 3 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @yuzefovich)

pkg/sql/opt/exec/execbuilder/testdata/tpch_vec line 20803 at r3 (raw file):

        │   └ *colexecprojconst.projNEBytesBytesConstOp
        └ *colexecbase.constInt64Op
          └ *colexec.bufferOp

Why did this change so much?

yuzefovich

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @michae2)

pkg/sql/opt/exec/execbuilder/testdata/tpch_vec line 20803 at r3 (raw file):

Previously, cucaroach (Tommy Reilly) wrote…

Why did this change so much?

Previously CASE statement was executed via the row-by-row engine as part of the PostProcessSpec of joinReader and now we vectorize it. caseOp has many branches and they are partially shown in EXPLAIN (VEC) diagram, and here we actually have a caseOp that is an input to another caseOp.

michae2

Nice! I had a question about the PostProcessSpec change, but otherwise this looks good.

Reviewed 2 of 2 files at r1, 1 of 1 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @DrewKimball, and @yuzefovich)

pkg/sql/execinfrapb/processors_base.proto line 45 at r1 (raw file):

  // population of the DistSQL diagrams, and if set, it takes precedence over
  // OutputColumns.
  repeated uint32 original_output_columns = 7 [packed = true];

nit: It looks like these fields were added in #71441 which I think was in 22.1 (is that right?). Is it a problem to remove these fields in 22.2? During an unfinalized upgrade from 22.1 to 22.2, could a 22.2 node receive a PostProcessSpec from a 22.1 node with these fields set?

yuzefovich

TFTRs!

bors r+

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @DrewKimball, and @michae2)

pkg/sql/execinfrapb/processors_base.proto line 45 at r1 (raw file):

is that right?

Yep, these fields were added in 22.1 release cycle, and we stopped using them also during 22.1 cycle (if I'm not mistaken) 😄

Is it a problem to remove these fields in 22.2?

No. Note that even though we're removing the fields, we're marking them as reserved which is backwards-compatible way of removing protobuf fields. In a mixed 22.1-22.2 setup the old nodes will populate these fields, but then the newer nodes will ignore them; if the newer node is the gateway, then it won't populate the fields, but the older nodes will still be able to deserialize the messages (but these fields will be left unset - which is ok).

craig · 2022-07-12T00:50:18Z

Build succeeded:

GitHub CI (Cockroach)

yuzefovich mentioned this pull request Jul 1, 2022

server: return non-live, live info per table #83677

Merged

yuzefovich added 2 commits July 1, 2022 11:15

colbuilder: check native support via processor core rather than the spec

023a057

This commit is just a mechanical change. Release note: None

yuzefovich force-pushed the vectorized-rendering branch from c3e1459 to c4ab930 Compare July 1, 2022 18:50

yuzefovich changed the title ~~WIP on vectorized rendering~~ colbuilder: vectorizing rendering on top of wrapped processors Jul 1, 2022

yuzefovich force-pushed the vectorized-rendering branch from c4ab930 to f5acd81 Compare July 1, 2022 18:54

yuzefovich requested review from michae2, cucaroach and a team July 2, 2022 04:37

yuzefovich marked this pull request as ready for review July 2, 2022 04:37

cucaroach reviewed Jul 6, 2022

View reviewed changes

yuzefovich commented Jul 6, 2022

View reviewed changes

yuzefovich requested review from a team and DrewKimball July 11, 2022 16:09

michae2 approved these changes Jul 11, 2022

View reviewed changes

yuzefovich commented Jul 11, 2022

View reviewed changes

craig bot merged commit a7a10ff into cockroachdb:master Jul 12, 2022

yuzefovich deleted the vectorized-rendering branch July 12, 2022 01:14

ajwerner mentioned this pull request Sep 22, 2022

sql/colexec/colbuilder: consider pooling render operators #88525

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

colbuilder: vectorizing rendering on top of wrapped processors #83689

colbuilder: vectorizing rendering on top of wrapped processors #83689

yuzefovich commented Jul 1, 2022 •

edited

Loading

cockroach-teamcity commented Jul 1, 2022

cucaroach left a comment

yuzefovich left a comment

michae2 left a comment

yuzefovich left a comment

craig bot commented Jul 12, 2022

colbuilder: vectorizing rendering on top of wrapped processors #83689

colbuilder: vectorizing rendering on top of wrapped processors #83689

Conversation

yuzefovich commented Jul 1, 2022 • edited Loading

cockroach-teamcity commented Jul 1, 2022

cucaroach left a comment

Choose a reason for hiding this comment

yuzefovich left a comment

Choose a reason for hiding this comment

michae2 left a comment

Choose a reason for hiding this comment

yuzefovich left a comment

Choose a reason for hiding this comment

craig bot commented Jul 12, 2022

yuzefovich commented Jul 1, 2022 •

edited

Loading