-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: speed up execbuilder phase #119095
opt: speed up execbuilder phase #119095
Conversation
Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
I've broken the second and third commits out into #119094. |
These changes significantly speed-up the execbuilder phase for complex queries with many relational expressions and columns. There is a regression for very simple queries, which previously benefitted from the zero-allocation, small number mode of
|
One more note: The manual |
The diff is quite large, so here's some commentary that might be helpful:
I'm happy to walk through it all or break this up in more PRs if that'd be helpful. |
e78a5e2
to
0e9d852
Compare
f918665
to
38daa14
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice! I just have some nits and suggestions.
Reviewed 1 of 1 files at r1, 50 of 50 files at r2, 2 of 2 files at r3, 5 of 5 files at r4, 3 of 3 files at r5, 8 of 8 files at r6, 4 of 4 files at r7, 5 of 5 files at r8, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @michae2)
pkg/sql/opt/exec/execbuilder/builder.go
line 303 at r6 (raw file):
} md := b.mem.Metadata() b.colOrdsAlloc.Init(md.MaxColumn())
[nit] Shouldn't this already have been handled in New
?
pkg/sql/opt/exec/execbuilder/col_ord_map.go
line 46 at r5 (raw file):
// // TODO(mgartner): It is probably unreasonable to have more than 2^31 // ordinals in an execution node, so this could be []int32.
Isn't the number of ordinals capped by the max column ID? and column IDs are represented int32.
pkg/sql/opt/exec/execbuilder/col_ord_map.go
line 168 at r7 (raw file):
// Clear clears the map. The allocated memory is retained for future reuse. func (m colOrdMap) Clear() {
[nit] Since this is used more often with this commit, it might be worth copying in a global zeros slice instead like here:
cockroach/pkg/col/coldata/bytes.go
Line 508 in 51b5a26
var zeroElements = make([]element, MaxBatchSize) |
cockroach/pkg/col/coldata/bytes.go
Lines 523 to 529 in 51b5a26
// Zero out all elements, up to the capacity, and then restore the length of | |
// the vector. | |
l := len(b.elements) | |
b.elements = b.elements[:cap(b.elements)] | |
for n := 0; n < len(b.elements); { | |
n += copy(b.elements[n:], zeroElements) | |
} |
pkg/sql/opt/exec/execbuilder/col_ord_map.go
line 148 at r8 (raw file):
// ordinal, the method neither needs to scan through each key/value pair to find // the current maximum ordinal, nor keep complex data structures to track it. func (m colOrdMap) OrdUpperBound() int {
Under what circumstances does this end up being approximate? Are there places other than the methods in this file where the map is mutated? If it's only in Set
, maybe it would be worth doing the scan only when the max ord gets replaced with a smaller ord? That seems like it might be a rare case.
We could also do something lazy, and instead scan the map in this method if the upper bound is unset, and just unset it in Set
when the largest ord gets replaced by something smaller.
pkg/sql/opt/exec/execbuilder/relational.go
line 2611 at r7 (raw file):
outputCols.Set(join.ContinuationCol, maxOrd+1) // allExprCols is only needed for the lifetime of the function, so free
[nit] allExprCols
seems stale.
38daa14
to
fc068cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @michae2)
pkg/sql/opt/exec/execbuilder/builder.go
line 303 at r6 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] Shouldn't this already have been handled in
New
?
Good catch. Done.
pkg/sql/opt/exec/execbuilder/col_ord_map.go
line 46 at r5 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Isn't the number of ordinals capped by the max column ID? and column IDs are represented int32.
I think you're right. And #119150 (comment) shows that this yields a nice performance improvement.
I think it would be nice to detect an on overflow in the conversion from int->int32 and error, just to be safe. I originally thought we'd need Set
to return an error rather than panic to do this safely, a la #119150, but I don't think that's necessarily true. I think I went overboard when I removed all execbuilder panics in #99981. These functions can panic, we just need to be careful about all the entry point having a panic-catcher, including closures we create in execbuilder, like in 5c0a632.
pkg/sql/opt/exec/execbuilder/col_ord_map.go
line 168 at r7 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] Since this is used more often with this commit, it might be worth copying in a global zeros slice instead like here:
cockroach/pkg/col/coldata/bytes.go
Line 508 in 51b5a26
var zeroElements = make([]element, MaxBatchSize)
cockroach/pkg/col/coldata/bytes.go
Lines 523 to 529 in 51b5a26
// Zero out all elements, up to the capacity, and then restore the length of // the vector. l := len(b.elements) b.elements = b.elements[:cap(b.elements)] for n := 0; n < len(b.elements); { n += copy(b.elements[n:], zeroElements) }
I tried this and didn't see much of a difference in performance. Interestingly, it seemed slightly slower on my GCE worker. Using the new clear
builtin in Go 1.21 (https://pkg.go.dev/builtin#clear) did seem slightly faster. I'll leave as-is for now since it wasn't a major difference.
pkg/sql/opt/exec/execbuilder/col_ord_map.go
line 148 at r8 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Under what circumstances does this end up being approximate? Are there places other than the methods in this file where the map is mutated? If it's only in
Set
, maybe it would be worth doing the scan only when the max ord gets replaced with a smaller ord? That seems like it might be a rare case.We could also do something lazy, and instead scan the map in this method if the upper bound is unset, and just unset it in
Set
when the largest ord gets replaced by something smaller.
Good idea. I've gone with your latter suggestion.
pkg/sql/opt/exec/execbuilder/relational.go
line 2611 at r7 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit]
allExprCols
seems stale.
Done.
I reran the benchmarks on a GCE instance after applying the changes suggested by Drew and fixing a few bugs.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 10 of 10 files at r9, 5 of 5 files at r10, 3 of 3 files at r11, 8 of 8 files at r12, 4 of 4 files at r13, 3 of 3 files at r14, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @michae2)
pkg/sql/opt/exec/execbuilder/col_ord_map.go
line 168 at r7 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I tried this and didn't see much of a difference in performance. Interestingly, it seemed slightly slower on my GCE worker. Using the new
clear
builtin in Go 1.21 (https://pkg.go.dev/builtin#clear) did seem slightly faster. I'll leave as-is for now since it wasn't a major difference.
Huh, good to know.
pkg/sql/opt/exec/execbuilder/col_ord_map.go
line 122 at r14 (raw file):
case m.maxIsUnknown(): // If the maximum ordinal is currently unknown, then leave it as-is. case m.ords[col] > 0 && m.ords[col] == int32(m.maxOrd) && ord < m.maxOrd:
Do you know under what circumstances we're changing a column's ordinal from anything other than "no ordinal"?
This commit adds a benchmark that measures only the execbuilder phase of optimization, and includes no other phases. Release note: None
As the execbuilder traverses a relational expression and recursively builds an `execPlan`, it creates mappings from column IDs to their ordinal position in the expression for each `execPlan` node. These mappings are used when building parent nodes to correctly map column IDs to indexed variables. In most cases the mappings are only used when building a parent, and never again. Prior to this commit, the column mappings were a field of `execPlan`, tying the lifetime of `execPlan` nodes and column mappings together. This commit decouples the lifetimes of both by removing the mapping field from `execPlan` and propagating mappings up as return values of recursive function calls. This will enable future optimizations that can reuse memory allocated for mappings that are no longer needed. Release note: None
This commit introduces a new struct, `colOrdMap`, which maps column IDs to ordinals. See the comment for `colOrdMap` for more details. This type will be used in execbuilder in future commits to store output column mappings. Release note: None
Output columns of execution nodes are now stored in `colOrdMap`s instead of `opt.ColMap`s. The `colOrdMapAllocator` struct, which is used to allocate new `colOrdMaps` has been added as a field of `Builder`. It currently is a simple implementation. Future commits will extend it to reuse allocated `colOrdMap`s when possible. Release note: None
This commit extends `colOrdOrdMapAllocator` with a `Free` method. Freed maps will be reused in future calls to `Alloc` instead of allocating a new map. The build functions of the major relational expressions have been updated to free maps when they are no longer needed. This reduces the number of maps allocated, especially for complex queries with many execution nodes. Informs cockroachdb#117546 Release note: None
fdf4841
to
a451975
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @michae2)
pkg/sql/opt/exec/execbuilder/col_ord_map.go
line 122 at r14 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Do you know under what circumstances we're changing a column's ordinal from anything other than "no ordinal"?
I haven't confirmed that it ever happens, but there's nothing preventing it. We could try to catch places where this happens by panicking in test builds if an ordinal is overwritten. I also explored changing the signature of Set
so that it returned an error if it was overwriting a previously set ordinal. It made usage of the map a bit more messy, so I opted to not bother with it, but I can explore that again if you think it'd be worthwhile.
Also, I just pushed a minor change to MaxOrd
such that it re-memoizes the max ordinal so that subsequent calls will be constant time, until the max ordinal is overwritten again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 60 of 60 files at r15, 55 of 55 files at r16, 53 of 53 files at r17, 58 of 58 files at r18, 54 of 54 files at r19, 1 of 53 files at r20, 2 of 2 files at r21, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @michae2)
pkg/sql/opt/exec/execbuilder/col_ord_map.go
line 122 at r14 (raw file):
I also explored changing the signature of
Set
so that it returned an error if it was overwriting a previously set ordinal. It made usage of the map a bit more messy, so I opted to not bother with it, but I can explore that again if you think it'd be worthwhile.
That's fine. I thought it might simplify things to assume maxOrd is never invalidated, but if it makes things messier the current form is fine.
This commit makes `colOrdMap.MaxOrd()` a constant-time operation in most cases. See the newly added comments for more details. Release note: None
a451975
to
3f4d099
Compare
TFTRs! bors r+ |
Build succeeded: |
opt/bench: add benchmark for execbuilder
This commit adds a benchmark that measures only the execbuilder phase of
optimization, and includes no other phases.
Release note: None
opt/execbuilder: remove column map from execPlan
As the execbuilder traverses a relational expression and recursively
builds an
execPlan
, it creates mappings from column IDs to theirordinal position in the expression for each
execPlan
node. Thesemappings are used when building parent nodes to correctly map column IDs
to indexed variables. In most cases the mappings are only used when
building a parent, and never again.
Prior to this commit, the column mappings were a field of
execPlan
,tying the lifetime of
execPlan
nodes and column mappings together.This commit decouples the lifetimes of both by removing the mapping
field from
execPlan
and propagating mappings up as return values ofrecursive function calls. This will enable future optimizations that can
reuse memory allocated for mappings that are no longer needed.
Release note: None
opt/exebuilder: introduce colOrdMap
This commit introduces a new struct,
colOrdMap
, which maps column IDsto ordinals. See the comment for
colOrdMap
for more details. This typewill be used in execbuilder in future commits to store output column
mappings.
Release note: None
opt/execbuilder: use colOrdMap to store output columns
Output columns of execution nodes are now stored in
colOrdMap
s insteadof
opt.ColMap
s. ThecolOrdMapAllocator
struct, which is used toallocate new
colOrdMaps
has been added as a field ofBuilder
. Itcurrently is a simple implementation. Future commits will extend it to
reuse allocated
colOrdMap
s when possible.Release note: None
opt/execbuilder: reuse allocated colOrdMaps
This commit extends
colOrdOrdMapAllocator
with aFree
method. Freedmaps will be reused in future calls to
Alloc
instead of allocating anew map. The build functions of the major relational expressions have
been updated to free maps when they are no longer needed. This reduces
the number of maps allocated, especially for complex queries with many
execution nodes.
Informs #117546
Release note: None
opt/execbuilder: faster maximum ordinal method for colOrdMap
This commit makes
colOrdMap.MaxOrd()
a constant-time operation in mostcases. See the newly added comments for more details.
Release note: None