sql: avoid copying ColumnDescriptors in initColsForScan #50727

nvanbenschoten · 2020-06-27T05:00:26Z

This change switches `scanNode` from constructing and passing around a `[]ColumnDescriptor` to constructing and passing around a `[]*ColumnDescriptor` which references the existing ColumnDescriptors in the TableDescriptor. This is in response to seeing the allocation in `initColsForScan` pop up as the single largest source of total heap allocations by size (`alloc_space`, the heap profile sample that most closely measures GC pressure) while running TPC-E. The allocation in `initColsForScan` was responsible for 4.1% of the `alloc_space` profile after a 30-minute run of the workload.

In general, this indicates that we should move away from copying these ColumnDescriptors around by value. They are currently 120 bytes each, which isn't huge, but also isn't small. Furthermore, unlike TableDescriptors, we almost never pass around only a single ColumnDescriptor. Instead, we're usually operating on every column touched by a query, so those 120 bytes add up fast. For instance, if we estimate that the average TPC-E query touches somewhere between 8 and 10 columns, then a single copy of all of these descriptors during the execution of a query (like we were doing in `initColsForScan`) requires allocating and copying roughly 1KB of memory (8 to 10 columns at 120 bytes each is 960 to 1,200 bytes).

Yahor, I'm assigning you for two reasons. One, because you seem to be working most closely with this code and likely have a good idea of how disruptive this kind of change will be. I don't want to split the world into functions that work with `[]ColumnDescriptor` and functions that work with `[]*ColumnDescriptor`. Two, I figured you'd be interested to know that I was running this using an older SHA and the second and third largest sources of allocations were in `createTableReaders` (3.54%) and `ColumnTypesWithMutations` (2.60%). Both had to do with constructing slices of `types.T` and both appear to have been fixed by c06277e. So nice job with that change!

yuzefovich

Nice find!

> the second and third largest sources of allocations were in
> createTableReaders (3.54%) and ColumnTypesWithMutations (2.60%). Both
> had to do with constructing slices of types.T and both appear to have been
> fixed by c06277e.

Woohoo!

> how disruptive this kind of change will be

I think it is reasonable to introduce such "duplication" of some methods given the performance gain we get. And I agree with your sentiment that we should be moving away from copying the column descriptors by value, and this change takes a first step towards that bright future :)

Reviewed 4 of 4 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained

nvanbenschoten · 2020-06-29T18:48:55Z

TFTR!

bors r+

craig · 2020-06-29T19:31:58Z

Build succeeded

GitHub CI (Cockroach)

nvanbenschoten requested a review from yuzefovich June 27, 2020 05:00

yuzefovich approved these changes Jun 27, 2020

craig bot merged commit dbb5ad1 into cockroachdb:master Jun 29, 2020

nvanbenschoten deleted the nvanbenschoten/colDescCpy branch June 30, 2020 20:31

yuzefovich mentioned this pull request Jul 8, 2020

sqlbase: switch to operating on slices of pointers to ColumnDescriptors #51118

Open
