colexec: introduce batches with dynamic capacity #52453
Conversation
Force-pushed from 9b76fed to 11c0672.
Force-pushed from bf08d14 to 3a93872.
Alright, I think this is RFAL. I ran a quick benchmark of the KV95 workload on a 3-node roachprod cluster from my laptop with
I'd take the numbers with a grain of salt (maybe they are due to variance from running the benchmark on the mac), but it looks like we might be able to get rid of
Some TPCC numbers, 3 node roachprod cluster with 100 warehouses, 1 minute of ramp and 5 minutes of load:
Great to see this, and awesome results!
Reviewed 68 of 68 files at r1, 22 of 22 files at r2, 13 of 13 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/col/coldata/batch.go, line 128 at r1 (raw file):
```go
func NewMemBatchNoCols(typs []*types.T, capacity int) Batch {
	if max := math.MaxUint16; capacity > max {
		panic(fmt.Sprintf(`batches cannot have length larger than %d; requested %d`, max, capacity))
```
s/length/capacity
pkg/col/coldata/vec.go, line 134 at r1 (raw file):
```go
// Capacity returns the capacity of the Golang's slice that is underlying
// this Vec. Note that if there is no "slice" (like in case of flat bytes),
// then "capacity" of such object is equal to its "length".
```
I think this last sentence is a bit vague (maybe just to me); how about: "then the capacity is equal to the number of elements"?
pkg/sql/colexec/dynamic_batch_size_helper.go, line 29 at r3 (raw file):
```go
}

// DynamicBatchSizeHelper is a utility struct that helps operators work with
```
I think this comment can use more fleshing out, i.e. how does it "help"?
pkg/sql/colexec/dynamic_batch_size_helper.go, line 40 at r3 (raw file):
```go
// grow the allocated capacity of the batch exponentially, until the batch
// reaches coldata.BatchSize().
func (d *DynamicBatchSizeHelper) ResetMaybeReallocate(
```
Why not make this a part of the allocator? Also, all operators should now be calling this method, right? Are we enforcing this in any way? What valid uses of the normal `Reset` are there?
pkg/sql/colexec/mergejoiner.go, line 606 at r1 (raw file):
```go
	bufferedGroup = &o.proberState.rBufferedGroup
}

// TODO(yuzefovich): reuse the same scratch batches when spillingQueue
```
I have an old branch that I'm hoping to revive.
pkg/sql/colexec/routers.go, line 415 at r3 (raw file):
```go
for toAppend := len(selection); toAppend > 0; {
	if o.mu.pendingBatch == nil {
		// TODO(yuzefovich): consider whether this should be a dynamic batch.
```
It's a good question. I don't think so because I consider this a fixed-size scratch buffer that we flush from. What do you think?
pkg/sql/colexec/sorttopk.go, line 60 at r3 (raw file):
```go
	// its input.
	topKSortSpooling topKSortState = iota
	// topKSortSpooling is the second state of the operator, indicating that
```
nit: s/topKSortSpooling/topKSortEmitting
pkg/sql/colmem/allocator.go, line 94 at r1 (raw file):
```go
// NewMemBatchWithMaxCapacity allocates a new in-memory coldata.Batch of
// coldata.BatchSize() capacity.
func (a *Allocator) NewMemBatchWithMaxCapacity(typs []*types.T) coldata.Batch {
```
Why not keep this as `NewMemBatch`?
pkg/sql/colmem/allocator.go, line 100 at r1 (raw file):
```go
// NewMemBatchWithFixedCapacity allocates a new in-memory coldata.Batch with
// the given capacity.
func (a *Allocator) NewMemBatchWithFixedCapacity(typs []*types.T, capacity int) coldata.Batch {
```
Why `Fixed` if it's going to be dynamic?
Force-pushed from 3a93872 to 71cd00f.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)
pkg/col/coldata/batch.go, line 128 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
s/length/capacity
Done.
pkg/col/coldata/vec.go, line 134 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
I think this last sentence is a bit vague (maybe just to me); how about: "then the capacity is equal to the number of elements"?
Done.
I agree, it's a bit vague for non-slice-backed types, but currently this method is only used to get the memory footprint, so it's OK if we don't define the contract perfectly. I think in the future we should be able to have pools of vectors of all types with different capacities that `colmem.Allocator` objects would draw from, but we're not there yet.
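For illustration, a minimal self-contained sketch of that contract, with a hypothetical `memVec` type standing in for a slice-backed `coldata.Vec`:

```go
package main

import "fmt"

// memVec is a hypothetical stand-in for a slice-backed coldata.Vec;
// the real method lives on coldata.Vec and covers all supported types.
type memVec struct {
	int64s []int64
}

// Capacity returns the capacity of the Go slice underlying this vector.
// For representations without a backing slice (like flat bytes), the
// capacity is defined as the number of elements.
func (v *memVec) Capacity() int {
	return cap(v.int64s)
}

func main() {
	v := &memVec{int64s: make([]int64, 4, 16)}
	fmt.Println(v.Capacity()) // 16: the slice's capacity, not its length
}
```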
pkg/sql/colexec/dynamic_batch_size_helper.go, line 29 at r3 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
I think this comment can use more fleshing out, i.e. how does it "help"?
Expanded the comment.
Update: removed the struct.
pkg/sql/colexec/dynamic_batch_size_helper.go, line 40 at r3 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Why not make this a part of the allocator? Also, all operators should now be calling this method, right? Are we enforcing this in any way? What valid uses of the normal `Reset` are there?
Made it a part of the allocator.
No, not all operators are expected to use this method - only those for which it makes sense to have the "dynamic size" behavior. My thinking is that all operators that instantiate batches to be returned as their output can be roughly divided into two groups:
- in the first group, the work the operator needs to perform in order to produce a single tuple of output is about the same regardless of whether that tuple is first, second, or last in the whole output stream (examples of such operators are `cFetcher` and `columnarizer`). Operators in this group want the "dynamic size" behavior.
- in the second group, the work the operator needs to perform in order to produce a single tuple is not "distributed uniformly" among all tuples (examples are the hash joiner and the hash aggregator). Such operators don't want the "dynamic size" behavior because it wouldn't be beneficial: they usually perform other non-batch-related "internal" allocations, so it wouldn't really matter if their output batch behaved dynamically.

In this PR, I first looked over all usages of the `Allocator.NewMemBatch` method to separate them into `NewMemBatchWithMaxCapacity` and `NewMemBatchWithFixedCapacity`, and the usages of the latter definitely don't need the dynamic behavior. Then, I looked at all usages of `NewMemBatchWithMaxCapacity` in the non-test files and singled out those that I think would benefit from the dynamic behavior, and I converted all such cases to the new pattern. The only operator I wasn't sure about is `routerOutputOp`, but I think that one should use a "max capacity" batch.

`ResetMaybeReallocate` effectively replaces `outputBatch.ResetInternalBatch` in the operators that want the dynamic behavior. However, since some operators still want the fixed behavior, we need to keep `coldata.Batch.ResetInternalBatch`.
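To make the mechanism concrete, here is a minimal self-contained sketch of the growth policy described above; the `batch` type and the function signature are toy stand-ins (the real method lives on `colmem.Allocator`, takes the column types, and operates on `coldata.Batch`, so the details differ):

```go
package main

import "fmt"

// maxBatchSize stands in for coldata.BatchSize().
const maxBatchSize = 1024

// batch is a toy stand-in for coldata.Batch.
type batch struct {
	capacity int
	length   int
}

// resetMaybeReallocate sketches the described policy: once the batch has
// reached the maximum size it is simply reset in place; otherwise a new
// batch is allocated with double the capacity, bounded below by
// minCapacity and above by maxBatchSize. The boolean reports whether
// a reallocation happened.
func resetMaybeReallocate(old *batch, minCapacity int) (*batch, bool) {
	if old == nil {
		return &batch{capacity: minCapacity}, true
	}
	if old.capacity >= maxBatchSize {
		old.length = 0
		return old, false
	}
	newCapacity := old.capacity * 2
	if newCapacity < minCapacity {
		newCapacity = minCapacity
	}
	if newCapacity > maxBatchSize {
		newCapacity = maxBatchSize
	}
	return &batch{capacity: newCapacity}, true
}

func main() {
	var b *batch
	for i := 0; i < 6; i++ {
		var reallocated bool
		b, reallocated = resetMaybeReallocate(b, 64 /* minCapacity */)
		// Prints capacities 64, 128, 256, 512, 1024, then 1024 with
		// reallocated=false once the maximum has been reached.
		fmt.Println(b.capacity, reallocated)
	}
}
```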
pkg/sql/colexec/mergejoiner.go, line 606 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
I have an old branch that I'm hoping to revive.
Done.
pkg/sql/colexec/routers.go, line 415 at r3 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
It's a good question. I don't think so because I consider this a fixed-size scratch buffer that we flush from. What do you think?
Yeah, I think so too.
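For context, the pattern being discussed, sketched with toy types (plain ints standing in for tuples; the real code appends to `o.mu.pendingBatch` and flushes it to the router output):

```go
package main

import "fmt"

// scratchCapacity stands in for coldata.BatchSize().
const scratchCapacity = 4

// appendAndFlush sketches the router-output pattern: tuples are copied into
// a fixed-size scratch buffer that is flushed whenever it fills up, so the
// buffer never benefits from growing dynamically.
func appendAndFlush(selection []int, flush func([]int)) {
	var pending []int
	consumed := 0
	for toAppend := len(selection); toAppend > 0; {
		if pending == nil {
			pending = make([]int, 0, scratchCapacity)
		}
		n := scratchCapacity - len(pending)
		if n > toAppend {
			n = toAppend
		}
		pending = append(pending, selection[consumed:consumed+n]...)
		consumed += n
		toAppend -= n
		if len(pending) == scratchCapacity {
			flush(pending)
			pending = nil
		}
	}
	if len(pending) > 0 {
		flush(pending) // the real code flushes leftovers later, on demand
	}
}

func main() {
	appendAndFlush([]int{1, 2, 3, 4, 5, 6, 7, 8, 9}, func(b []int) {
		fmt.Println(b) // [1 2 3 4], [5 6 7 8], [9]
	})
}
```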
pkg/sql/colmem/allocator.go, line 94 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Why not keep this as `NewMemBatch`?
I think `NewMemBatch` is a little too generic, and I want to force the user of `Allocator` to think through whether a batch with a fixed size, a batch with the maximum size, or a dynamic batch should be used. I'm worried that `NewMemBatch` would be treated as the default option without being given any thought.
pkg/sql/colmem/allocator.go, line 100 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Why `Fixed` if it's going to be dynamic?
The batch itself is not dynamic in size - currently we allocate a whole new batch with a bigger capacity.
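To illustrate the naming, a toy contrast between the two behaviors (the types are stand-ins; the real in-place reset is `coldata.Batch.ResetInternalBatch`):

```go
package main

import "fmt"

// batch is a toy stand-in for coldata.Batch.
type batch struct{ capacity, length int }

// resetInternalBatch is what a fixed-capacity batch gets: a reset in place.
func (b *batch) resetInternalBatch() { b.length = 0 }

// grow models the "dynamic" behavior: the old batch is abandoned and a
// brand-new batch with a larger capacity takes its place.
func grow(old *batch) *batch { return &batch{capacity: old.capacity * 2} }

func main() {
	fixed := &batch{capacity: 4, length: 4}
	fixed.resetInternalBatch()
	fmt.Println(fixed.capacity) // still 4: the batch itself never grows

	dynamic := &batch{capacity: 4}
	dynamic = grow(dynamic)
	fmt.Println(dynamic.capacity) // 8: a new batch, not a resized one
}
```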
Force-pushed from 71cd00f to 92f44b3.
Reviewed 35 of 35 files at r4, 22 of 22 files at r5, 12 of 12 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/colexec/dynamic_batch_size_helper.go, line 40 at r3 (raw file):
Previously, yuzefovich wrote…
Made it a part of the allocator.
No, not all operators are expected to use this method - only those for which it makes sense to have the "dynamic size" behavior. My thinking is that all operators that instantiate batches to be returned as their output can be roughly divided into two groups:
- in the first group, the work the operator needs to perform in order to produce a single tuple of output is about the same regardless of whether that tuple is first, second, or last in the whole output stream (examples of such operators are `cFetcher` and `columnarizer`). Operators in this group want the "dynamic size" behavior.
- in the second group, the work the operator needs to perform in order to produce a single tuple is not "distributed uniformly" among all tuples (examples are the hash joiner and the hash aggregator). Such operators don't want the "dynamic size" behavior because it wouldn't be beneficial: they usually perform other non-batch-related "internal" allocations, so it wouldn't really matter if their output batch behaved dynamically.

In this PR, I first looked over all usages of the `Allocator.NewMemBatch` method to separate them into `NewMemBatchWithMaxCapacity` and `NewMemBatchWithFixedCapacity`, and the usages of the latter definitely don't need the dynamic behavior. Then, I looked at all usages of `NewMemBatchWithMaxCapacity` in the non-test files and singled out those that I think would benefit from the dynamic behavior, and I converted all such cases to the new pattern. The only operator I wasn't sure about is `routerOutputOp`, but I think that one should use a "max capacity" batch.

`ResetMaybeReallocate` effectively replaces `outputBatch.ResetInternalBatch` in the operators that want the dynamic behavior. However, since some operators still want the fixed behavior, we need to keep `coldata.Batch.ResetInternalBatch`.
Did you write this comment before the 1:1? I think it's still worth discussing whether we want to have these two separate groups or just have dynamic batch sizes everywhere. I prefer going down the route of having dynamic batch sizes everywhere because it makes programming simpler and the cost of dynamic batch sizes should be amortized. Also, it's not clear that the second group is that well defined; e.g., I think you put the hash joiner in the second group, but we brought up the case of a single-row join. We didn't finish that discussion because we had to leave for the next meeting, but we were talking about how it needed to allocate a hash table anyway.
Is there a way we could measure the performance impact of having dynamic batch sizes everywhere?
Might be good to discuss the above point at standup. cc @jordanlewis @helenmhe
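One rough way to get a feel for the amortized cost is a toy microbenchmark; this sketch deliberately uses plain Go slices rather than the actual `colmem` API, so it only models the allocation pattern, not the real operators:

```go
package batchbench

import "testing"

const maxBatchSize = 1024

// sink prevents the compiler from optimizing the allocations away.
var sink []int64

// BenchmarkFixedCapacity models an operator that allocates its output
// batch at full capacity up front, once per iteration.
func BenchmarkFixedCapacity(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = make([]int64, maxBatchSize)
	}
}

// BenchmarkDynamicCapacity models an operator that reallocates its output
// batch with exponentially growing capacity until the maximum is reached;
// the total allocated memory is bounded by roughly 2x the final size.
func BenchmarkDynamicCapacity(b *testing.B) {
	for i := 0; i < b.N; i++ {
		for c := 1; c <= maxBatchSize; c *= 2 {
			sink = make([]int64, c)
		}
	}
}
```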
Force-pushed from 92f44b3 to c0337fa.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)
pkg/sql/colexec/dynamic_batch_size_helper.go, line 40 at r3 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Did you write this comment before the 1:1? I think it's still worth discussing whether we want to have these two separate groups or just have dynamic batch sizes everywhere. I prefer going down the route of having dynamic batch sizes everywhere because it makes programming simpler and the cost of dynamic batch sizes should be amortized. Also, it's not clear that the second group is that well defined; e.g., I think you put the hash joiner in the second group, but we brought up the case of a single-row join. We didn't finish that discussion because we had to leave for the next meeting, but we were talking about how it needed to allocate a hash table anyway.
Is there a way we could measure the performance impact of having dynamic batch sizes everywhere?
Yes, I did write this down before our 1:1.
I have gone ahead and audited all usages of the `NewMemBatchWith*Capacity` methods in non-test files and added the dynamic batch size behavior in several places. I also added a linter to prohibit calls to `NewMemBatchWithMaxCapacity` from non-test files so that engineers are forced to think about whether the dynamic batch size behavior is desired. The only operators that haven't been converted (but in theory could have been) are the aggregators (because reallocating an output batch breaks the contract of aggregate functions) and the `relative_rank` operators (because that code is already pretty hard to reason about, and introducing dynamic batch sizes would likely make things worse without giving any performance benefit).
I think it's not worth spending more time on this at this point.
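For illustration, a minimal standalone sketch of what such a lint check could look like; the actual linter lives in CockroachDB's lint infrastructure, so this file-walking version is only an approximation:

```go
package lint

import (
	"os"
	"path/filepath"
	"strings"
	"testing"
)

// TestNoMaxCapacityBatchesOutsideTests walks the source tree and fails if a
// non-test file calls NewMemBatchWithMaxCapacity, nudging authors toward
// the dynamic ResetMaybeReallocate pattern (or a deliberate exception).
func TestNoMaxCapacityBatchesOutsideTests(t *testing.T) {
	root := "." // assumption: run from the root of the package tree to lint
	err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if info.IsDir() || !strings.HasSuffix(path, ".go") || strings.HasSuffix(path, "_test.go") {
			return nil
		}
		src, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		if strings.Contains(string(src), "NewMemBatchWithMaxCapacity") {
			t.Errorf("%s: NewMemBatchWithMaxCapacity is prohibited outside tests", path)
		}
		return nil
	})
	if err != nil {
		t.Fatal(err)
	}
}
```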
Force-pushed from 1200b04 to 53b2018.
I figured out why the tests were failing (a problem with falling over from the in-memory hash joiner to the external hash joiner). I'll wait for a CI run and merge if green.
🎉 All dependencies have been resolved!
Force-pushed from 53b2018 to b556ae9.
But are the `SIGQUIT` CI failures concerning?
Reviewed 79 of 79 files at r7, 68 of 68 files at r8, 26 of 26 files at r10, 34 of 34 files at r11.
Reviewable status: complete! 1 of 0 LGTMs obtained
I'm thinking it's logic test timeouts.
That's surprising to me (edited with updated link).
Since we vary the batch size, a single run might not be representative. I also saw the timeout failure on your PR for the context cancellation fix. I have a hypothesis that the dynamic batch size logic might be making the tests run even slower (to be confirmed), so the timeouts are more likely to occur. Another thing is that I've seen timeouts a couple of times recently, so I'm pretty sure that on master we're currently often pretty close to the 30-minute limit.
I'm not so sure. Given that this change modifies a pretty fundamental part of the code, isn't it likely that there's something else going on? Do you have a link to those timeouts? Combing through the last ~10 runs on master (https://teamcity.cockroachdb.com/viewType.html?buildTypeId=Cockroach_EssentialCi&branch_Cockroach=%3Cdefault%3E&tab=buildTypeStatusDiv), the runtime is more or less steady at ~20-22 minutes, with one run at 25 minutes. Unfortunately, I don't think it's easy to see the batch size for successful builds.
I'm not sure either. I want to wait for another CI build on this branch before jumping to any conclusions, but I thought I'd share my current guess.
I see what you mean regarding the timeouts in the context cancellation PR. It doesn't seem like that's a normal timeout. It looks like …
I'm pretty sure the failures are timeouts - the dump of goroutines on SIGQUIT shows that a bunch of them have been running for around 29-30 minutes. Possibly it's a coincidence, but I think it's more likely to be timeouts (locally, the files that the tests failed on pass when run one at a time).
Force-pushed from 276a0a2 to 4b39ee2.
There was a simple bug in … I'll wait for a CI run and merge if green.
Force-pushed from 4b39ee2 to 895125b.
TFTR! bors r+
Build succeeded.
Depends on #52728.
col, colexec: introduce the concept of capacity to coldata.Batch
This commit introduces the concept of capacity to `coldata.Batch`, which describes the maximum number of tuples the batch can store. Note that it is a lower bound, meaning that some vectors in the batch might have larger underlying capacity (e.g. when they were appended to).

Additionally, this commit makes several mechanical changes to rename the methods.

Release note: None
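Conceptually, the change boils down to an extra accessor on the batch; a trimmed-down sketch (this interface is a stand-in, not the full `coldata.Batch`):

```go
package main

import "fmt"

// Batch is a trimmed-down stand-in for the coldata.Batch interface,
// showing only the accessors relevant to the new concept.
type Batch interface {
	// Length returns the number of tuples currently present in the batch.
	Length() int
	// Capacity returns the maximum number of tuples the batch can store.
	// It is a lower bound: some vectors in the batch might have a larger
	// underlying capacity, e.g. after having been appended to.
	Capacity() int
}

type memBatch struct {
	length   int
	capacity int
}

func (b *memBatch) Length() int   { return b.length }
func (b *memBatch) Capacity() int { return b.capacity }

func main() {
	var b Batch = &memBatch{length: 3, capacity: 8}
	fmt.Println(b.Length(), b.Capacity()) // 3 8
}
```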
colexec: remove custom input/output batch size logic from a few places
Ordered aggregator, hash and merge joiners, and hash router had custom input/output batch size logic that was put in place in order to increase test coverage. This, however, is no longer required since we now randomize the `coldata.BatchSize()` value during test runs, so that custom logic is now removed.

Additionally, this commit removes several unit tests of the merge joiner which are now exact copies of each other (previously, they had different output batch sizes set).

One notable change is that this commit removes a tiny optimization from the merge joiner for when there are no output columns (meaning we have a COUNT query).

This work has been done in order to ease follow-up work on dynamic batch sizes.

Release note: None
colexec: use batches with dynamic capacity in several operators
This commit introduces a `ResetMaybeReallocate` method on `colmem.Allocator` which might allocate a new batch (it uses exponential capacity growth until `coldata.BatchSize()` and also supports a minimum capacity argument). The method can be used by operators that want the "dynamic batch size" behavior.

All usages of `NewMemBatchWithMaxCapacity` and `NewMemBatchWithFixedCapacity` in non-test files have been audited, and most of the operators have been updated to exhibit the dynamic batch size behavior (the most notable exception being the aggregators, because aggregate functions currently hold on to their output vectors, so we can't just reallocate an output batch). The usage of `NewMemBatchWithMaxCapacity` is now prohibited in non-test files by a linter in order to encourage engineers to think about whether the dynamic batch size behavior is desired.
Resolves: #49796.
Release note: None