colexec: implement vectorized index join #67450

DrewKimball · 2021-07-10T00:28:17Z

This patch provides a vectorized implementation of the index join
operator. Span generation is accomplished using two utility operators.
spanEncoder operates on a single index key column and fills a Bytes
column with the encoding of that column for each row. spanAssembler
takes the output of each spanEncoder and generates spans, accounting
for table/index prefixes and possibly splitting the spans over column
families. Finally, the ColIndexJoin operator uses the generated spans
to perform a lookup on the table's primary index, returns all batches
resulting from the lookup, and repeats until the input is fully consumed.
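
To make the span-generation step concrete, here is a minimal sketch of how per-column encodings could be appended to a shared key prefix, one span per row. The `span` type, `assembleSpans`, and the placeholder end key are all hypothetical simplifications, not the code in this patch, and column-family splitting is left out entirely.

```go
package main

import "fmt"

// span is a simplified, hypothetical stand-in for a KV span [key, endKey).
type span struct {
	key, endKey []byte
}

// assembleSpans builds one span per row by appending each key column's
// encoded bytes to the shared table/index prefix. encodedCols[c][r] holds
// the encoding of key column c for row r.
func assembleSpans(prefix []byte, encodedCols [][][]byte, numRows int) []span {
	spans := make([]span, 0, numRows)
	for r := 0; r < numRows; r++ {
		key := append([]byte(nil), prefix...)
		for _, col := range encodedCols {
			key = append(key, col[r]...)
		}
		// Placeholder end key (assumption): the start key plus a 0xff byte.
		endKey := append(append([]byte(nil), key...), 0xff)
		spans = append(spans, span{key: key, endKey: endKey})
	}
	return spans
}

func main() {
	prefix := []byte("/t1/i1/")
	cols := [][][]byte{
		{[]byte("a"), []byte("b")},
		{[]byte("1"), []byte("2")},
	}
	for _, s := range assembleSpans(prefix, cols, 2) {
		fmt.Printf("%q - %q\n", s.key, s.endKey)
	}
}
```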

The ColIndexJoin operator queues up input rows until the memory footprint
of the rows reaches a preset limit (default 4MB for parity with the row
engine). This allows the cost of starting a scan to be amortized.
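
The batching behavior described above can be sketched roughly as follows. `rowBatcher` and its methods are hypothetical names, and the fixed 1 KiB row size in `main` is made up for illustration; the real operator estimates the footprint from the input vectors.

```go
package main

import "fmt"

// rowBatcher (hypothetical) buffers rows until their estimated memory
// footprint reaches limit, so that one scan can serve many input rows.
type rowBatcher struct {
	limit    int64
	buffered int64
	rows     int
}

// add records one row and reports whether the batch should be flushed.
func (b *rowBatcher) add(rowSize int64) bool {
	b.buffered += rowSize
	b.rows++
	return b.buffered >= b.limit
}

// flush returns the number of buffered rows and resets the batcher.
func (b *rowBatcher) flush() int {
	n := b.rows
	b.buffered, b.rows = 0, 0
	return n
}

func main() {
	b := &rowBatcher{limit: 4 << 20} // 4MB, matching the default mentioned above
	scans := 0
	for i := 0; i < 10000; i++ {
		if b.add(1024) { // pretend every row is 1 KiB
			scans++
			b.flush()
		}
	}
	if b.rows > 0 { // final partial batch
		scans++
		b.flush()
	}
	fmt.Println("scans started:", scans)
}
```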

Fixes #65905

Release note (sql change): The vectorized execution engine can now
perform a scan over an index, and then join on the primary index to
retrieve the required columns.

DrewKimball · 2021-07-10T00:37:02Z

There are a few things/questions I would like to draw attention to:

- For the spans slice in the SpanAssembler operator, I am registering each increase in the slice's capacity with the allocator. I'm not sure whether this is the correct thing to do, considering the slice is retrieved from and returned to a pool.
- I am registering the memory allocated for the span keys (the underlying byte slices). However, the span generation operators do not own this memory, since the kv span fetching code keeps references to the underlying bytes. So, hypothetically a very large index join could cause the operator to hit the memory limit no matter how the input is batched. Maybe the SpanAssembler operator should release that memory once the spans slice is reset, even if it can't actually be garbage collected yet?
- EncodeTableKey encodes Bytes and String types using EncodeStringAscending instead of EncodeBytesAscending. As far as I can tell, these do the same thing, except the former involves a few casts. Is it alright to just use EncodeBytesAscending and EncodeBytesDescending for all types that are stored in the vectorized engine as bytes?

DrewKimball · 2021-07-10T00:49:40Z

I ran the tpch workload for each query that has an index join vs master; here are the results:

Master:

drewkimball@Drews-MBP cockroach % bin/workload run tpch --queries '4,5,6,10,12,14,15,20' --max-ops 8 --concurrency 1
I210709 06:19:56.404415 1 workload/cli/run.go:396  [-] 1  creating load generator...
I210709 06:19:56.404533 1 workload/cli/run.go:427  [-] 2  creating load generator... done (took 123µs)
I210709 06:20:00.666842 52 workload/tpch/tpch.go:480  [-] 3  [q4] returned 5 rows after 4.26 seconds
I210709 06:20:05.971813 52 workload/tpch/tpch.go:480  [-] 4  [q5] returned 5 rows after 5.30 seconds
I210709 06:20:16.451468 52 workload/tpch/tpch.go:480  [-] 5  [q6] returned 1 rows after 10.48 seconds
I210709 06:20:21.190724 52 workload/tpch/tpch.go:480  [-] 6  [q10] returned 20 rows after 4.74 seconds
I210709 06:20:34.903941 52 workload/tpch/tpch.go:480  [-] 7  [q12] returned 2 rows after 13.71 seconds
I210709 06:20:36.416048 52 workload/tpch/tpch.go:480  [-] 8  [q14] returned 1 rows after 1.51 seconds
I210709 06:20:43.878553 52 workload/tpch/tpch.go:480  [-] 9  [q15] returned 1 rows after 7.46 seconds
I210709 06:21:05.556883 52 workload/tpch/tpch.go:480  [-] 10  [q20] returned 186 rows after 21.68 seconds

Branch:

drewkimball@Drews-MBP cockroach % bin/workload run tpch --queries '4,5,6,10,12,14,15,20' --max-ops 8 --concurrency 1
I210710 00:47:01.172921 1 workload/cli/run.go:396  [-] 1  creating load generator...
I210710 00:47:01.173037 1 workload/cli/run.go:427  [-] 2  creating load generator... done (took 122µs)
I210710 00:47:05.105648 54 workload/tpch/tpch.go:480  [-] 3  [q4] returned 5 rows after 3.93 seconds
I210710 00:47:09.042185 54 workload/tpch/tpch.go:480  [-] 4  [q5] returned 5 rows after 3.94 seconds
I210710 00:47:13.856669 54 workload/tpch/tpch.go:480  [-] 5  [q6] returned 1 rows after 4.81 seconds
I210710 00:47:17.383549 54 workload/tpch/tpch.go:480  [-] 6  [q10] returned 20 rows after 3.53 seconds
I210710 00:47:22.706510 54 workload/tpch/tpch.go:480  [-] 7  [q12] returned 2 rows after 5.32 seconds
I210710 00:47:23.572935 54 workload/tpch/tpch.go:480  [-] 8  [q14] returned 1 rows after 0.87 seconds
I210710 00:47:26.993538 54 workload/tpch/tpch.go:480  [-] 9  [q15] returned 1 rows after 3.42 seconds
I210710 00:47:37.965217 54 workload/tpch/tpch.go:480  [-] 10  [q20] returned 186 rows after 10.97 seconds

jordanlewis · 2021-07-10T21:53:20Z

Wow @DrewKimball! The results speak for themselves; the speedup far surpasses my expectations. Excellent work.

DrewKimball · 2021-07-10T22:58:09Z

@jordanlewis thanks! I should mention that most of the gains come from using larger batches of spans (default 4MB for the underlying key bytes). If someone more knowledgeable about the KV fetcher code than I am points out a problem with that, we may have to decrease the limit. Though, I found pretty nice gains with just 1MB as well.

yuzefovich

Awesome work and great speedup!

nit: I think "Addresses" doesn't automatically close the linked issue.

Reviewed 11 of 16 files at r1, 3 of 8 files at r2, 4 of 6 files at r3, 1 of 1 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @michae2)

pkg/col/coldata/bytes.go, line 413 at r4 (raw file):

}

// ResetForAppend is similar to Reset, but is also resets the offsets slice so

nit: s/is also/it also/.

pkg/sql/colexec/colbuilder/execplan.go, line 803 at r4 (raw file):

			}
			if core.JoinReader.LookupColumns != nil || !core.JoinReader.LookupExpr.Empty() {
				return r, errors.Newf("lookup join reader is unsupported in vectorized")

nit: this should be an assertion failure.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 14 at r4 (raw file):

// +build execgen_template
//
// This file is the execgen template for span_encoder.eg.go. It's formatted in a

nit: s/span_encoder/span_assembler/.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 55 at r4 (raw file):

	// Add span encoders to encode each primary key column as bytes. The
	// ColSpanAssembler will later append these together to form valid spans.
	var spanEncoders []spanEncoder

nit: we could allocate the slice of the correct capacity upfront.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 63 at r4 (raw file):

	b := spanAssemblerPool.Get().(*spanAssemblerBase)

	base := spanAssemblerBase{

nit: why not update b in-place while keeping the reference to the span slice?

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 68 at r4 (raw file):

		keyPrefix:       rowenc.MakeIndexKeyPrefix(codec, table, index.GetID()),
		spanEncoders:    spanEncoders,
		spanCols:        make([]*coldata.Bytes, len(spanEncoders)),

This slice could also be kept when putting the object back to the pool (only need to nil out all elements first).
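
For illustration, here is a rough sketch of keeping the slice across pool round trips. The `assembler` type and helper names are hypothetical, and `*[]byte` stands in for `*coldata.Bytes`; the point is only that elements are nil'ed out before `Put` so the pooled object does not pin old data.

```go
package main

import (
	"fmt"
	"sync"
)

// assembler is a hypothetical, stripped-down stand-in for spanAssemblerBase.
type assembler struct {
	cols []*[]byte // stands in for []*coldata.Bytes
}

var assemblerPool = sync.Pool{New: func() interface{} { return &assembler{} }}

// getAssembler reuses the pooled cols slice when it is large enough and
// only allocates when the capacity is insufficient.
func getAssembler(numCols int) *assembler {
	a := assemblerPool.Get().(*assembler)
	if cap(a.cols) >= numCols {
		a.cols = a.cols[:numCols]
	} else {
		a.cols = make([]*[]byte, numCols)
	}
	return a
}

// release nils out every element so the pooled object holds no references,
// then truncates the slice while keeping its capacity.
func (a *assembler) release() {
	for i := range a.cols {
		a.cols[i] = nil
	}
	a.cols = a.cols[:0]
	assemblerPool.Put(a)
}

func main() {
	a := getAssembler(3)
	buf := []byte("encoded")
	a.cols[0] = &buf
	a.release()
	b := getAssembler(2)
	fmt.Println(len(b.cols), b.cols[0] == nil)
	b.release()
}
```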

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 87 at r4 (raw file):

// ColSpanAssembler is a utility operator that generates a series of spans from
// input batches which can be used to perform an index join or lookup join.

super nit: maybe not mention lookup join since only index joins are now supported and lookup joins seem hard?

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 175 at r4 (raw file):

		op.spans = op.spans[:0]
	}
	op.shouldReset = false

super nit: this could be moved into the if above.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 189 at r4 (raw file):

	for i := 0; i < n; i++ {
		// Every key has a prefix encoding the table, index, etc.
		op.scratchKey = append(op.scratchKey[:0], op.keyPrefix...)

Do we modify scratchKey somewhere? If not, we could copy the key prefix only once, outside of the loop, and then slice the scratch up correctly here.
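
Something along these lines, as a sketch only (`buildKeys` is a made-up helper, and it copies each key out of the scratch buffer just so the result can be inspected):

```go
package main

import "fmt"

// buildKeys copies the prefix into the scratch buffer once, then for each
// row truncates the scratch back to the prefix length before appending the
// row-specific suffix.
func buildKeys(prefix []byte, suffixes [][]byte) [][]byte {
	scratch := append([]byte(nil), prefix...) // single copy of the prefix
	keys := make([][]byte, 0, len(suffixes))
	for _, s := range suffixes {
		// The prefix bytes are already in place; just slice and append.
		scratch = append(scratch[:len(prefix)], s...)
		// Copy out because scratch is reused on the next iteration.
		keys = append(keys, append([]byte(nil), scratch...))
	}
	return keys
}

func main() {
	keys := buildKeys([]byte("p/"), [][]byte{[]byte("aaa"), []byte("b")})
	for _, k := range keys {
		fmt.Printf("%q\n", k)
	}
}
```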

pkg/sql/colexec/execgen/cmd/execgen/span_encoder_gen.go, line 38 at r4 (raw file):

	s = assignAddRe.ReplaceAllString(s, makeTemplateFunctionCall("AssignSpanEncoding", 2))

	s = replaceManipulationFuncs(s)

nit: I think we don't need this.

pkg/sql/colfetcher/BUILD.bazel, line 25 at r4 (raw file):

        "//pkg/sql/colconv",
        "//pkg/sql/colencoding",
        "//pkg/sql/colexec/colexecjoin",

I think the dependency of colfetcher on colexecjoin is a bit unfortunate since the join package contains lots of generated code. I think it might be worth creating a separate package for the columnar span stuff.

pkg/sql/colfetcher/index_join.go, line 54 at r4 (raw file):

	mu          struct {
		syncutil.Mutex
		// rowsRead contains the number of total rows this ColBatchScan has

nit: s/ColBatchScan/ColIndexJoin/.

pkg/sql/colfetcher/index_join.go, line 74 at r4 (raw file):

var _ colexecop.KVReader = &ColIndexJoin{}
var _ execinfra.Releasable = &ColIndexJoin{}
var _ colexecop.Closer = &ColIndexJoin{}

super nit: these two lines could be squashed into one with ClosableOperator.

pkg/sql/colfetcher/index_join.go, line 121 at r4 (raw file):

			if !s.maintainOrdering {
				// Sort the spans for the following cases:
				// - For lookupJoinReaderType: this is so that we can rely upon the

nit: needs an adjustment.

pkg/sql/colfetcher/index_join.go, line 134 at r4 (raw file):

			// Handle metadata for these spans.
			if s.nodeID != 0 {

In ColBatchScan we also check that the flow is not local before retrieving the misplanned ranges, should we do the same here?

pkg/sql/colfetcher/index_join.go, line 266 at r4 (raw file):

	// Before we can safely use types from the table descriptor, we need to
	// make sure they are hydrated. In row execution engine it is done during
	// the processor initialization, but neither ColBatchScan nor cFetcher are

nit: s/ColBatchScan/ColIndexJoin/.

pkg/sql/colfetcher/index_join.go, line 279 at r4 (raw file):

	index := table.ActiveIndexes()[indexIdx]

	proc := &execinfra.ProcOutputHelper{}

Not sure if it's worth it, but in case post.OutputColumns is set (which is the case when we don't need to perform any rendering on top of the index join), we can avoid the allocation of the helper since OutputColumns will contain all needed columns.

pkg/sql/colfetcher/index_join.go, line 298 at r4 (raw file):

	fetcher.estimatedRowCount = 0
	if err := fetcher.Init(
		flowCtx.Codec(), allocator, execinfra.GetWorkMemLimit(flowCtx), false, /* false */

nit: suspicious comment.

yuzefovich

For the spans slice in the SpanAssembler operator, I am registering each increase in the slice's capacity with the allocator. I'm not sure whether this is the correct thing to do, considering the slice is retrieved from and returned to a pool.

I am registering the memory allocated for the span keys (the underlying byte slices). However, the span generation operators do not own this memory, since the kv span fetching code keeps references to the underlying bytes. So, hypothetically a very large index join could cause the operator to hit the memory limit no matter how the input is batched. Maybe the SpanAssembler operator should release that memory once the spans slice is reset, even if it can't actually be garbage collected yet?

I guess these questions might be best answered by Becca given her recent work on the memory accounting in the join reader. As I mentioned earlier, IMO it's ok for this PR to not concern itself with the memory accounting since this is what we have on master right now.

EncodeTableKey encodes Bytes and String types using EncodeStringAscending instead of EncodeBytesAscending. As far as I can tell, these do the same thing, except the former involves a few casts. Is it alright to just use EncodeBytesAscending and EncodeBytesDescending for all types that are stored in the vectorized engine as bytes?

Hm, my reading of the code is the same, so it seems reasonable to me.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @michae2)

DrewKimball

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @yuzefovich)

pkg/col/coldata/bytes.go, line 413 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/is also/it also/.

Done.

pkg/sql/colexec/colbuilder/execplan.go, line 803 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: this should be an assertion failure.

Done.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 14 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/span_encoder/span_assembler/.

Done.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 55 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: we could allocate the slice of the correct capacity upfront.

Done.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 63 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: why not update b in-place while keeping the reference to the span slice?

Done.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 68 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

This slice could also be kept when putting the object back to the pool (only need to nil out all elements first).

Done. May as well do the same with the spanEncoders slice, I guess.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 87 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: maybe not mention lookup join since only index joins are now supported and lookup joins seem hard?

Done.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 175 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: this could be moved into the if above.

Done.

pkg/sql/colexec/colexecjoin/span_assembler_tmpl.go, line 189 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Do we modify scratchKey somewhere? If not, we could copy the key prefix only once, outside of the loop, and then slice the scratch up correctly here.

Done.

pkg/sql/colexec/execgen/cmd/execgen/span_encoder_gen.go, line 38 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I think we don't need this.

Forgot to remove that after getting execgen:inline to work.

pkg/sql/colfetcher/BUILD.bazel, line 25 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I think the dependency of colfetcher on colexecjoin is a bit unfortunate since the join package contains lots of generated code. I think it might be worth creating a separate package for the columnar span stuff.

Done.

pkg/sql/colfetcher/index_join.go, line 54 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/ColBatchScan/ColIndexJoin/.

Done.

pkg/sql/colfetcher/index_join.go, line 74 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: these two lines could be squashed into one with ClosableOperator.

Done.

pkg/sql/colfetcher/index_join.go, line 121 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: needs an adjustment.

Done.

pkg/sql/colfetcher/index_join.go, line 134 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

In ColBatchScan we also check that the flow is not local before retrieving the misplanned ranges, should we do the same here?

Oh, yes we should. Done.

pkg/sql/colfetcher/index_join.go, line 266 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/ColBatchScan/ColIndexJoin/.

Done.

pkg/sql/colfetcher/index_join.go, line 279 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Not sure if it's worth it, but in case post.OutputColumns is set (which is the case when we don't need to perform any rendering on top of the index join), we can avoid the allocation of the helper since OutputColumns will contain all needed columns.

It's easy enough, so we may as well.

pkg/sql/colfetcher/index_join.go, line 298 at r4 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: suspicious comment.

Oops... fixed it.

yuzefovich

Reviewed 19 of 19 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @michae2)

pkg/sql/colexec/colbuilder/execplan.go, line 182 at r5 (raw file):

	case spec.Core.JoinReader != nil:
		if spec.Core.JoinReader.LookupColumns != nil || !spec.Core.JoinReader.LookupExpr.Empty() {
			return errors.Newf("lookup join reader is unsupported in vectorized")

nit: I think it is worth creating global variables for these two errors (I have seen profiles where errors.Newf calls were non-trivial in this file).
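
Roughly what I have in mind, sketched with the standard library `errors` package rather than the cockroachdb one (`checkSupported` is a made-up function): the error value is constructed once at package init instead of on every call.

```go
package main

import (
	"errors"
	"fmt"
)

// errLookupJoinUnsupported is created once, so the hot path only returns
// an existing value instead of allocating a new error each time.
var errLookupJoinUnsupported = errors.New("lookup join reader is unsupported in vectorized")

// checkSupported is a hypothetical stand-in for the check in the planner.
func checkSupported(isLookupJoin bool) error {
	if isLookupJoin {
		return errLookupJoinUnsupported
	}
	return nil
}

func main() {
	fmt.Println(checkSupported(true))
	fmt.Println(checkSupported(false))
}
```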

pkg/sql/colexec/colexecspan/dep_test.go, line 21 at r5 (raw file):

func TestNoLinkForbidden(t *testing.T) {
	buildutil.VerifyNoImports(t,
		"github.com/cockroachdb/cockroach/pkg/sql/colexec/colexecwindow", true,

s/colexecwindow/colexecspan/.

pkg/sql/colexec/colexecspan/main_test.go, line 33 at r5 (raw file):

)

var (

nit: looks like you instantiate most of these objects here and in TestSpanAssembler which seems redundant. Probably this file could be trimmed down (maybe only batch size randomization should be kept here).

pkg/sql/colfetcher/index_join.go, line 87 at r5 (raw file):

	// cFetcher. Note that ProcessorSpan method itself will check whether
	// tracing is enabled.
	s.Ctx, s.tracingSpan = execinfra.ProcessorSpan(s.Ctx, "colindexjoin")

nit: I think the trace should be started before initializing the input.

pkg/sql/colfetcher/index_join.go, line 131 at r5 (raw file):

			if !s.flowCtx.Local && s.nodeID != 0 {
				s.misplannedRanges = append(s.misplannedRanges,
					execinfra.MisplannedRanges(s.Ctx, spans, s.nodeID, s.flowCtx.Cfg.RangeCache)...)

Hm, I wonder whether MisplannedRanges is a noop before we actually perform and finish the scan. I have a feeling that it is, and if that's true, this call is not useful. I think I would delete this altogether and leave a TODO (given that in the join reader we don't collect the misplanned info either).

Now that I've typed this out, I think that it actually makes sense that we should not try to collect misplanned ranges metadata in case of the index/lookup joins because their placement is determined by the physical placement of the table readers which are the inputs, so probably even a TODO is not needed.

pkg/sql/colfetcher/index_join.go, line 288 at r5 (raw file):

	}

	tableArgs := row.FetcherTableArgs{

nit: do you think it's worth refactoring initCRowFetcher a bit and reusing it here?

pkg/sql/colfetcher/index_join.go, line 299 at r5 (raw file):

	fetcher := cFetcherPool.Get().(*cFetcher)
	fetcher.estimatedRowCount = 0

It might be worth plumbing through the estimated row count here since the index join is 1-to-1 from its input scan, and we might have a good estimate for that scan.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 56 at r5 (raw file):

	base.colFamStartKeys, base.colFamEndKeys = getColFamilyEncodings(neededCols, table, index)
	base.spanEncoders = base.spanEncoders[:0]
	base.spanCols = base.spanCols

nit: this line seems redundant.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 136 at r5 (raw file):

	// maxSpansLength tracks the largest length the spans field reaches, so that
	// all the Span objects can be deeply reset upon a call to close.
	maxSpansLength int

I wonder whether it'll be faster and maybe cleaner to not track this field, but in Release method slice up spans up to its capacity, nil out all elements, and then slice it up to 0 to put it back into the pool. Thoughts?
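
As a sketch of the idea (the `span` type and `resetSpans` are simplified, hypothetical stand-ins for `roachpb.Span` and the Release logic):

```go
package main

import "fmt"

// span is a simplified stand-in for roachpb.Span.
type span struct{ key, endKey []byte }

// resetSpans extends the slice to its full capacity, zeroes every element
// so no key bytes stay reachable, and returns it truncated to length 0.
// No separate "max length reached" counter is needed.
func resetSpans(spans []span) []span {
	spans = spans[:cap(spans)]
	for i := range spans {
		spans[i] = span{}
	}
	return spans[:0]
}

func main() {
	s := make([]span, 0, 4)
	s = append(s, span{key: []byte("a")}, span{key: []byte("b")})
	s = s[:1] // a later, shorter use leaves a stale element behind
	s = resetSpans(s)
	fmt.Println(len(s), cap(s), s[:2][1].key == nil)
}
```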

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 204 at r5 (raw file):

	// in the latter, the kv fetcher logic maintains references for an unknown
	// amount of time. The row engine doesn't account for these either so it isn't
	// be a regression, but we really should find a good way to account for the

nit: s/be a/a/.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 219 at r5 (raw file):

		return nil
	}
	b.shouldReset = true

Similarly, I wonder whether we should get rid of shouldReset with something like

func (b *spanAssemblerBase) GetSpans() roachpb.Spans {
  spans := b.spans
  b.spans = b.spans[:0]
  return spans
}

It slightly modifies the contract of GetSpans in that it now would return a zero-length slice and not nil if ConsumeBatch hasn't been called, but it seems ok to me.
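
A toy version of that contract, with strings standing in for spans and hypothetical names throughout. It assumes the spans slice starts out non-nil (e.g. it came from the pool), and that the caller is done with the returned slice before the next consume call, since both share one backing array.

```go
package main

import "fmt"

// assembler is a toy stand-in; strings take the place of spans.
type assembler struct{ spans []string }

func newAssembler() *assembler {
	return &assembler{spans: make([]string, 0, 4)}
}

func (a *assembler) consume(keys ...string) {
	a.spans = append(a.spans, keys...)
}

// getSpans hands out the accumulated spans and truncates the internal
// slice, so no shouldReset flag is required.
func (a *assembler) getSpans() []string {
	spans := a.spans
	a.spans = a.spans[:0]
	return spans
}

func main() {
	a := newAssembler()
	fmt.Println(len(a.getSpans())) // empty but non-nil before any consume
	a.consume("a", "b")
	fmt.Println(a.getSpans())
	fmt.Println(len(a.getSpans())) // already handed out, so empty again
}
```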

pkg/sql/colexec/colexecspan/span_encoder_tmpl.go, line 88 at r5 (raw file):

type spanEncoder interface {
	// next generates the encoding for the current key column for each row int the

nit: s/int/in/.

pkg/sql/colexec/colexecspan/span_encoder_tmpl.go, line 128 at r5 (raw file):

	}
	if op.outputBytes == nil {
		op.outputBytes = coldata.NewBytes(n)

This allocation seems to be unaccounted for which is undesirable I think.

DrewKimball

After some discussion with @yuzefovich and @rytaft, I've decided on the following approach to memory accounting for the span slice and key bytes: while the SpanAssembler maintains references to them, we register the memory usage in the allocator. Once the SpanAssembler operator loses the references, we release the memory even though it may still be referenced elsewhere (the spans slice goes to a pool, while the key bytes are referenced by the kv fetcher code).
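
In sketch form, the accounting policy looks something like this. The `account` type is a hypothetical stand-in for the allocator, and the real code tracks the spans slice capacity as well as the key bytes.

```go
package main

import "fmt"

// account is a hypothetical stand-in for the memory allocator/account.
type account struct{ used int64 }

func (a *account) grow(n int64)   { a.used += n }
func (a *account) shrink(n int64) { a.used -= n }

// assembler registers key bytes while it holds references to them.
type assembler struct {
	acc      *account
	keyBytes int64 // bytes currently registered for span keys
	spans    [][]byte
}

func (s *assembler) addKey(k []byte) {
	s.spans = append(s.spans, k)
	s.keyBytes += int64(len(k))
	s.acc.grow(int64(len(k)))
}

// getSpans hands the keys to the caller and releases their memory from the
// account, even though the caller may keep referencing the bytes.
func (s *assembler) getSpans() [][]byte {
	out := s.spans
	s.spans = nil
	s.acc.shrink(s.keyBytes)
	s.keyBytes = 0
	return out
}

func main() {
	acc := &account{}
	s := &assembler{acc: acc}
	s.addKey([]byte("abc"))
	s.addKey([]byte("de"))
	fmt.Println("registered:", acc.used)
	spans := s.getSpans()
	fmt.Println("after handoff:", acc.used, "spans:", len(spans))
}
```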

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @yuzefovich)

pkg/sql/colexec/colbuilder/execplan.go, line 182 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I think it is worth creating global variables for these two errors (I have seen profiles where errors.Newf calls were non-trivial in this file).

Done. Should this be done for the other errors, or do those seem uncommon enough to not matter much?

pkg/sql/colexec/colexecspan/dep_test.go, line 21 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

s/colexecwindow/colexecspan/.

Done.

pkg/sql/colexec/colexecspan/main_test.go, line 33 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: looks like you instantiate most of these objects here and in TestSpanAssembler which seems redundant. Probably this file could be trimmed down (maybe only batch size randomization should be kept here).

Done.

pkg/sql/colfetcher/index_join.go, line 87 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I think the trace should be started before initializing the input.

Done.

pkg/sql/colfetcher/index_join.go, line 131 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Hm, I wonder whether MisplannedRanges is a noop before we actually perform and finish the scan. I have a feeling that it is, and if that's true, this call is not useful. I think I would delete this altogether and leave a TODO (given that in the join reader we don't collect the misplanned info either).

Now that I've typed this out, I think that it actually makes sense that we should not try to collect misplanned ranges metadata in case of the index/lookup joins because their placement is determined by the physical placement of the table readers which are the inputs, so probably even a TODO is not needed.

Done.

pkg/sql/colfetcher/index_join.go, line 288 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: do you think it's worth refactoring initCRowFetcher a bit and reusing it here?

Sure, why not.

pkg/sql/colfetcher/index_join.go, line 299 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

It might be worth plumbing through the estimated row count here since the index join is 1-to-1 from its input scan, and we might have a good estimate for that scan.

Good idea. We can do even better - we can use the number of input rows for each span batch, and set it every time StartScan is called.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 56 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: this line seems redundant.

Done.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 136 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I wonder whether it'll be faster and maybe cleaner to not track this field, but in Release method slice up spans up to its capacity, nil out all elements, and then slice it up to 0 to put it back into the pool. Thoughts?

That sounds like a good idea.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 204 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/be a/a/.

Done.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 219 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Similarly, I wonder whether we should get rid of shouldReset with something like

func (b *spanAssemblerBase) GetSpans() roachpb.Spans {
  spans := b.spans
  b.spans = b.spans[:0]
  return spans
}

It slightly modifies the contract of GetSpans in that it now would return a zero-length slice and not nil if ConsumeBatch hasn't been called, but it seems ok to me.

I like it. Done.

pkg/sql/colexec/colexecspan/span_encoder_tmpl.go, line 88 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/int/in/.

Done.

pkg/sql/colexec/colexecspan/span_encoder_tmpl.go, line 128 at r5 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

This allocation seems to be unaccounted for which is undesirable I think.

Definitely an oversight. Fixed it.

yuzefovich

Great work! It might be worth having someone (maybe @jordanlewis) also take a look at span_assembler_tmpl, especially towards the bottom of the file, and at the unit test for it too.

Reviewed 10 of 15 files at r6, 7 of 8 files at r7, 1 of 1 files at r10.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @michae2)

pkg/sql/colexec/colbuilder/execplan.go, line 182 at r5 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Done. Should this be done for the other errors, or do those seem uncommon enough to not matter much?

Yeah, I think those aren't common enough and occur for the queries that are more heavy-weight (in which case an extra allocation isn't a big deal), but feel free to extract all errors from supportedNatively.

pkg/sql/colexec/colbuilder/execplan.go, line 276 at r10 (raw file):

	errExperimentalWrappingProhibited = errors.New("wrapping for non-JoinReader and non-LocalPlanNode cores is prohibited in vectorize=experimental_always")
	errWrappedCast                    = errors.New("mismatched types in NewColOperator and unsupported casts")
	errLookupJoinUnsupported          = errors.Newf("lookup join reader is unsupported in vectorized")

super nit: could use just New not Newf.

pkg/sql/colfetcher/index_join.go, line 299 at r5 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Good idea. We can do even better - we can use the number of input rows for each span batch, and set it every time StartScan is called.

Nice!

pkg/sql/colfetcher/index_join.go, line 82 at r10 (raw file):

	// tracing is enabled.
	s.Ctx, s.tracingSpan = execinfra.ProcessorSpan(s.Ctx, "colindexjoin")
	s.Input.Init(ctx)

Here we should use s.Ctx because in case the tracing was started, a new context was derived (and ctx points to the old one).
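
To illustrate why it matters, here is a small sketch where `startSpan` is a made-up stand-in for `execinfra.ProcessorSpan` that stores a name in a derived context. Anything initialized with the original `ctx` does not see the span.

```go
package main

import (
	"context"
	"fmt"
)

type ctxKey struct{}

// startSpan is a hypothetical stand-in for ProcessorSpan: it returns a new
// context derived from ctx that carries the span name.
func startSpan(ctx context.Context, name string) context.Context {
	return context.WithValue(ctx, ctxKey{}, name)
}

// spanName returns the span name stored in ctx, or "" if there is none.
func spanName(ctx context.Context) string {
	name, _ := ctx.Value(ctxKey{}).(string)
	return name
}

// demo reports what an input would observe when initialized with the
// derived context versus the original one.
func demo() (withDerived, withOriginal string) {
	ctx := context.Background()
	derived := startSpan(ctx, "colindexjoin")
	return spanName(derived), spanName(ctx)
}

func main() {
	d, o := demo()
	fmt.Printf("derived=%q original=%q\n", d, o)
}
```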

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 111 at r10 (raw file):

type spanAssemblerBase struct {
	allocator *colmem.Allocator
	keyBytes  int

nit: quick comment for keyBytes would be helpful.

yuzefovich · 2021-07-15T14:36:36Z

I guess one idea on how to increase the test coverage is to introduce another "test against processor" test for the index join in distsql/columnar_operator_test.go. However, I imagine that it might be annoying to set up, and I think we might already have sufficient coverage via the logic tests given how common index joins are, so it might not be worth pursuing this idea.

DrewKimball

I'll take a stab at it in a bit and see if it seems worth the trouble to add some "against processor" testing.

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @michae2, and @yuzefovich)

pkg/sql/colexec/colbuilder/execplan.go, line 276 at r10 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: could use just New not Newf.

Done.

pkg/sql/colfetcher/index_join.go, line 82 at r10 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Here we should use s.Ctx because in case the tracing was started, a new context was derived (and ctx points to the old one).

Done.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 111 at r10 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: quick comment for keyBytes would be helpful.

Done.

DrewKimball · 2021-08-09T20:43:27Z

All right, I've made the batch-size limiting more like the row engine. The number of rows from which to generate spans is now determined by an estimate of the memory footprint of those rows. The memory size of the lookup batch is calculated by adding up the memory used by the underlying data for the row (e.g. 8 bytes for int64, etc.) as well as some extra to account for overhead due to EncDatum structs. While the vectorized engine doesn't actually use EncDatums, we add the overhead anyway to achieve parity with the row engine.

One difference is that we cannot include the EncDatum encoded field in the memory usage calculation. In cases when encoded is not nil, this leads to larger batch sizes. However, they will only be about as large as they are when encoded is nil, so batch sizes should at least be of the same order. @jordanlewis @yuzefovich do you think this seems reasonable? Should I be multiplying the memory estimate by some constant in order to simulate the encoded field?
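
For reference, the estimate has roughly this shape. The overhead constant below is a made-up number, not the actual EncDatum size, and the multiplier is left as a parameter since its value is the open question.

```go
package main

import "fmt"

// encDatumOverhead is a hypothetical per-value overhead in bytes; the real
// value would come from the size of the EncDatum struct.
const encDatumOverhead = 32

// estimateRowSize sums the underlying data size of each value in a row,
// scales it by multiplier to approximate the encoded field, and adds the
// fixed per-value overhead.
func estimateRowSize(valueSizes []int64, multiplier float64) int64 {
	var size int64
	for _, v := range valueSizes {
		size += int64(float64(v)*multiplier) + encDatumOverhead
	}
	return size
}

func main() {
	row := []int64{8, 8} // e.g. two int64 columns
	fmt.Println(estimateRowSize(row, 1.0))
	fmt.Println(estimateRowSize(row, 2.0))
}
```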

yuzefovich

@jordanlewis is out for a week, and I think both he and I would like the vectorized index join input batches to be as close as possible to those of the join reader, so introducing a multiplier sounds good to me. I agree that the encoded field of an EncDatum shouldn't be larger than the Datum field in size, so maybe use a 1.25 multiplier as a somewhat arbitrary average?

Reviewed 16 of 16 files at r13.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @erikgrinaker, @jordanlewis, and @michae2)

pkg/sql/colexec/colbuilder/execplan.go, line 826 at r11 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

This setup code looks identical to that in the TableReader initialization section. Maybe we could introduce a helper function? I expect we'll also do this again if/when we support vectorized for general lookup joins, zigzag joins, etc.

+1

pkg/sql/colfetcher/cfetcher.go, line 1596 at r13 (raw file):

func (rf *cFetcher) Release() {
	rf.accountingHelper.Close()
	rf.table.Release()

nit: maybe rename AccountingHelper.Close to Release?

pkg/sql/colfetcher/index_join.go, line 220 at r13 (raw file):

		return s.startIdx
	}
	for endIdx = s.startIdx; endIdx < n; endIdx++ {

I'm thinking that maybe we should have a quick check whether including the whole batch will put us over the limit or not, using GetBatchMemSize (the fact that it includes the selection vector is probably negligible). Thoughts?

What's different here versus the cFetcher is that here we already have the full batch in hand, so we can know exactly the footprint of the batch, whereas in the cFetcher we're still building out the batch, one row at a time.
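
A sketch of the fast path I'm picturing. `findEndIdx` and its parameters are hypothetical, `batchSize` is assumed to be a cheaply available footprint for the whole batch, and the row that reaches the limit is assumed to be included.

```go
package main

import "fmt"

// findEndIdx returns the exclusive end index of the rows to take from the
// current batch. used is the footprint already queued, limit the memory
// limit for one lookup batch.
func findEndIdx(rowSizes []int64, batchSize int64, startIdx int, used, limit int64) int {
	// Fast path: the whole batch fits, so skip the per-row loop.
	if startIdx == 0 && used+batchSize < limit {
		return len(rowSizes)
	}
	// Slow path: add rows one at a time until the limit is reached.
	for i := startIdx; i < len(rowSizes); i++ {
		used += rowSizes[i]
		if used >= limit {
			return i + 1
		}
	}
	return len(rowSizes)
}

func main() {
	sizes := []int64{4, 4, 4, 4}
	fmt.Println(findEndIdx(sizes, 16, 0, 0, 100)) // whole batch fits
	fmt.Println(findEndIdx(sizes, 16, 0, 0, 10))  // limit hit mid-batch
}
```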

pkg/sql/colfetcher/index_join.go, line 250 at r13 (raw file):

		switch vec.CanonicalTypeFamily() {
		case types.DecimalFamily:
			rowSize += int64(tree.SizeOfDecimal(&vec.Decimal()[idx]))

I'd be curious to see whether these interface conversions (i.e. calling Decimal() or Bytes()) show up non-negligibly in the CPU profiles. I'm thinking that we might have to perform that conversion one time, once we receive the new input batch and only if we see that using the whole batch will put us over the limit.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 90 at r13 (raw file):

	// ConsumeBatch generates lookup spans from input batches and stores them to
	// later be returned by GetSpans. Spans are generated only for rows in the
	// range [startIdx, endIdx). If given indices such that startIdx >= endIdx,

super nit: sounds a bit unusual, maybe "If given indices are such that ..."?

DrewKimball

I ended up using 2.0 as the per-value multiplier

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @erikgrinaker, @jordanlewis, @michae2, and @yuzefovich)

pkg/sql/colexec/colbuilder/execplan.go, line 826 at r11 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

+1

Done.

pkg/sql/colfetcher/cfetcher.go, line 1596 at r13 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: maybe rename AccountingHelper.Close to Release?

Done.

pkg/sql/colfetcher/index_join.go, line 220 at r13 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I'm thinking that maybe we should have a quick check whether including the whole batch will put us over the limit or not, using GetBatchMemSize (the fact that it includes the selection vector is probably negligible). Thoughts?

What's different here versus the cFetcher is that here we already have the full batch in hand, so we can know the exact footprint of the batch, whereas in the cFetcher we're still building out the batch, one row at a time.

That's a nice idea, though GetBatchMemSize won't work because it will produce a much smaller number than the row engine would on the same data. I'll write a helper method to handle it.

pkg/sql/colfetcher/index_join.go, line 250 at r13 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I'd be curious to see whether these interface conversions (i.e. calling Decimal() or Bytes()) show up non-negligibly in the CPU profiles. I'm thinking that we might have to perform that conversion one time, once we receive the new input batch and only if we see that using the whole batch will put us over the limit.

Added some fields to the mem struct so the conversions can be performed up-front. I've been hard-pressed to find a query to bench with that doesn't spend most of its time in KV, but I think this is worth optimizing anyway because it's called per-row.
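
Roughly, the idea of performing the conversions up-front can be sketched as follows. This is an illustration only: `column`, `memHelper`, `setBatch`, and `rowSize` are hypothetical names, and only `Bytes`-like columns are modeled.

```go
package main

import "fmt"

// column is a hypothetical stand-in for a vector whose typed data sits
// behind an interface, so every typed access pays for a type assertion.
type column struct {
	data interface{}
}

// Bytes asserts the underlying data to a slice of byte slices.
func (c column) Bytes() [][]byte { return c.data.([][]byte) }

// memHelper caches the typed view of each column so that the conversion
// happens once per batch instead of once per row.
type memHelper struct {
	byteCols [][][]byte
}

// setBatch performs the conversions up front for a new batch.
func (m *memHelper) setBatch(cols []column) {
	m.byteCols = m.byteCols[:0]
	for _, c := range cols {
		m.byteCols = append(m.byteCols, c.Bytes())
	}
}

// rowSize sums the value lengths in row idx using only the cached slices.
func (m *memHelper) rowSize(idx int) int64 {
	var size int64
	for _, col := range m.byteCols {
		size += int64(len(col[idx]))
	}
	return size
}

func main() {
	cols := []column{
		{data: [][]byte{[]byte("ab"), []byte("cde")}},
		{data: [][]byte{[]byte("f"), []byte("")}},
	}
	var m memHelper
	m.setBatch(cols)
	fmt.Println(m.rowSize(0), m.rowSize(1))
}
```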

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 182 at r11 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Things might get trickier for lookup joins, since we won't necessarily know up front whether rows can be split into column families (null values disallow it)

I'm going to stick with the current algorithm for now, because (a) I think it's a bit simpler and (b) it's more obvious how to extend it to more complicated span generation for lookup joins.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 90 at r13 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: sounds a bit unusual, maybe "If given indices are such that ..."?

Done.

yuzefovich

Reviewed 5 of 5 files at r14, 12 of 13 files at r15.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @erikgrinaker, @jordanlewis, @michae2, and @yuzefovich)

pkg/sql/colfetcher/index_join.go, line 220 at r13 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

That's a nice idea, though GetBatchMemSize won't work because it will produce a much smaller number than the row engine would on the same data. I'll write a helper method to handle it.

I think we could add some stuff on top of GetBatchMemSize to be essentially what we have in getBatchSize right now.

If I'm reading the code correctly, getBatchSize = GetBatchMemSize() * memEstimateMultiplier + l * w * memEstimateAdditive where l = batch.Length() and w = batch.Width(). Maybe also we need to add EncDatumRowOverhead * l.

DrewKimball

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @erikgrinaker, @jordanlewis, @michae2, and @yuzefovich)

pkg/sql/colfetcher/index_join.go, line 220 at r13 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I think we could add some stuff on top of GetBatchMemSize to be essentially what we have in getBatchSize right now.

If I'm reading the code correctly, getBatchSize = GetBatchMemSize() * memEstimateMultiplier + l * w * memEstimateAdditive where l = batch.Length() and w = batch.Width(). Maybe also we need to add EncDatumRowOverhead * l.

Yeah, I guess it doesn't need to be exact, just good enough. Done.
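
For reference, the estimate discussed above could look roughly like this. The function name and the values of `memEstimateAdditive` and `encDatumRowOverhead` are placeholders chosen for illustration; only the 2.0 multiplier comes from this thread.

```go
package main

import "fmt"

const (
	// memEstimateMultiplier is the per-value multiplier mentioned earlier.
	memEstimateMultiplier = 2
	// memEstimateAdditive is a placeholder per-value overhead in bytes.
	memEstimateAdditive = 8
	// encDatumRowOverhead is a placeholder per-row overhead in bytes.
	encDatumRowOverhead = 24
)

// estimateBatchSize approximates the footprint the row engine would report
// for the same data, given the columnar size of the batch in bytes, the
// number of rows (length), and the number of columns (width).
func estimateBatchSize(batchMemSize int64, length, width int) int64 {
	l, w := int64(length), int64(width)
	return batchMemSize*memEstimateMultiplier +
		l*w*memEstimateAdditive +
		l*encDatumRowOverhead
}

func main() {
	// 1000 bytes of columnar data, 10 rows, 3 columns.
	fmt.Println(estimateBatchSize(1000, 10, 3))
}
```

The estimate needs only the batch-level size, length, and width, so it can be computed without touching individual rows.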

yuzefovich

I think it'd be good to get the performance numbers on the relevant TPCH queries as well as a sanity check on the number of "input batches" from the perspective of the index join both in the row-by-row and vectorized engine (to make sure that our estimates are reasonable).

Otherwise,

Reviewed 4 of 4 files at r16.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball, @erikgrinaker, @jordanlewis, and @michae2)

DrewKimball

Batch sizes look good: https://gist.github.com/DrewKimball/e8d523d32671e9fa4e16777b471c110a

TPCH perf:
Master:

drewkimball@drews-mbp cockroach % bin/workload run tpch --queries '4,5,6,10,12,14,15,20' --max-ops 8 --concurrency 1
I210811 04:11:22.719317 1 workload/cli/run.go:404  [-] 1  creating load generator...
I210811 04:11:22.719452 1 workload/cli/run.go:435  [-] 2  creating load generator... done (took 136µs)
I210811 04:11:25.526325 82 workload/tpch/tpch.go:480  [-] 3  [q4] returned 5 rows after 2.81 seconds
I210811 04:11:30.293969 82 workload/tpch/tpch.go:480  [-] 4  [q5] returned 5 rows after 4.77 seconds
I210811 04:11:40.634066 82 workload/tpch/tpch.go:480  [-] 5  [q6] returned 1 rows after 10.34 seconds
I210811 04:11:44.439470 82 workload/tpch/tpch.go:480  [-] 6  [q10] returned 20 rows after 3.81 seconds
I210811 04:11:56.159936 82 workload/tpch/tpch.go:480  [-] 7  [q12] returned 2 rows after 11.72 seconds
I210811 04:11:57.293476 82 workload/tpch/tpch.go:480  [-] 8  [q14] returned 1 rows after 1.13 seconds
I210811 04:12:03.696872 82 workload/tpch/tpch.go:480  [-] 9  [q15] returned 1 rows after 6.40 seconds
I210811 04:12:21.027038 82 workload/tpch/tpch.go:480  [-] 10  [q20] returned 186 rows after 17.33 seconds

Branch:

drewkimball@drews-mbp cockroach % bin/workload run tpch --queries '4,5,6,10,12,14,15,20' --max-ops 8 --concurrency 1
I210811 04:14:30.642055 1 workload/cli/run.go:404  [-] 1  creating load generator...
I210811 04:14:30.642258 1 workload/cli/run.go:435  [-] 2  creating load generator... done (took 206µs)
I210811 04:14:33.409946 36 workload/tpch/tpch.go:480  [-] 3  [q4] returned 5 rows after 2.77 seconds
I210811 04:14:37.602520 36 workload/tpch/tpch.go:480  [-] 4  [q5] returned 5 rows after 4.19 seconds
I210811 04:14:46.229121 36 workload/tpch/tpch.go:480  [-] 5  [q6] returned 1 rows after 8.63 seconds
I210811 04:14:50.058010 36 workload/tpch/tpch.go:480  [-] 6  [q10] returned 20 rows after 3.83 seconds
I210811 04:15:00.080270 36 workload/tpch/tpch.go:480  [-] 7  [q12] returned 2 rows after 10.02 seconds
I210811 04:15:01.063998 36 workload/tpch/tpch.go:480  [-] 8  [q14] returned 1 rows after 0.98 seconds
I210811 04:15:06.490356 36 workload/tpch/tpch.go:480  [-] 9  [q15] returned 1 rows after 5.43 seconds
I210811 04:15:22.838457 36 workload/tpch/tpch.go:480  [-] 10  [q20] returned 186 rows after 16.35 seconds

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball, @erikgrinaker, @jordanlewis, and @michae2)

yuzefovich

nit: the commit message as well as the PR description need an update.

Reviewed 1 of 1 files at r17.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @erikgrinaker, @jordanlewis, and @michae2)

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 40 at r16 (raw file):

// NewColSpanAssembler returns a ColSpanAssembler operator that is able to
// generate lookup spans from input batches. The given size limit determines

nit: we've removed the size limit.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 161 at r16 (raw file):

	// Every key has a prefix encoding the table, index, etc.
	op.scratchKey = append(op.scratchKey[:0], op.keyPrefix...)

nit: we could do this once, maybe in Init or in the constructor, to save some work (but since this is roughly once per batch, feel free to disregard). We could also use len(op.keyPrefix) for prefixLen.

yuzefovich · 2021-08-11T06:07:43Z

I also kicked off tpchvec/perf to compare the results, average over 10 runs each:

Q  | old   | new  | improvement, %
-- |  --   |  --  | --
4  | 1.88  | 1.83 | 2.732240437
5  | 2.12  | 2.04 | 3.921568627
6  | 6.57  | 5.65 | 16.28318584
10 | 2.36  | 2.21 | 6.787330317
12 | 7.18  | 5.83 | 23.15608919
14 | 0.69  | 0.61 | 13.1147541
15 | 3.48  | 3.09 | 12.62135922
20 | 10.15 | 9.04 | 12.27876106
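
The percentages in the last column appear to be computed relative to the new time, i.e. (old - new) / new * 100. A small sketch to reproduce them (the helper name is made up for illustration):

```go
package main

import "fmt"

// improvementPct returns the speedup of newSec over oldSec as a percentage
// of the new time.
func improvementPct(oldSec, newSec float64) float64 {
	return (oldSec - newSec) / newSec * 100
}

func main() {
	// Q6 and Q12 from the table above.
	fmt.Printf("%.2f\n", improvementPct(6.57, 5.65))
	fmt.Printf("%.2f\n", improvementPct(7.18, 5.83))
}
```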

This patch provides a vectorized implementation of the index join operator. Span generation is accomplished using two utility operators. `spanEncoder` operates on a single index key column and fills a `Bytes` column with the encoding of that column for each row. `spanAssembler` takes the output of each `spanEncoder` and generates spans, accounting for table/index prefixes and possibly splitting the spans over column families. Finally, the `ColIndexJoin` operator uses the generated spans to perform a lookup on the table's primary index, returns all batches resulting from the lookup, and repeats until the input is fully consumed.

The `ColIndexJoin` operator queues up input rows until the memory footprint of the rows reaches a preset limit (default 4MB for parity with the row engine). This allows the cost of starting a scan to be amortized.

Addresses cockroachdb#65905

Release note (sql change): The vectorized execution engine can now perform a scan over an index, and then join on the primary index to retrieve the required columns.

DrewKimball

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @erikgrinaker, @jordanlewis, @michae2, and @yuzefovich)

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 40 at r16 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: we've removed the size limit.

Done.

pkg/sql/colexec/colexecspan/span_assembler_tmpl.go, line 161 at r16 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: we could do this once, maybe in Init or in the constructor, to save some work (but since this is roughly once per batch, feel free to disregard). We could also use len(op.keyPrefix) for prefixLen.

Nice catch. I guess we may as well just store the prefix directly in the scratch key. Done.
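
A sketch of what storing the prefix directly in the scratch key might look like. The `assembler` type and its methods are hypothetical, not the actual span assembler code, and the string keys stand in for real key encodings.

```go
package main

import "fmt"

// assembler is a hypothetical stand-in for the span assembler. The key
// prefix is written into scratchKey once, in the constructor.
type assembler struct {
	scratchKey []byte
	prefixLen  int
}

// newAssembler copies the table/index prefix into the scratch key up front.
func newAssembler(keyPrefix []byte) *assembler {
	return &assembler{
		scratchKey: append([]byte(nil), keyPrefix...),
		prefixLen:  len(keyPrefix),
	}
}

// makeKey truncates the scratch key back to the prefix, appends the encoded
// column values, and returns a copy, since the scratch buffer is reused.
func (a *assembler) makeKey(encoded ...[]byte) []byte {
	a.scratchKey = a.scratchKey[:a.prefixLen]
	for _, e := range encoded {
		a.scratchKey = append(a.scratchKey, e...)
	}
	return append([]byte(nil), a.scratchKey...)
}

func main() {
	a := newAssembler([]byte("t1/i2/"))
	fmt.Println(string(a.makeKey([]byte("a"), []byte("b"))))
	fmt.Println(string(a.makeKey([]byte("c"))))
}
```

Truncating to `prefixLen` replaces re-appending the prefix for every key, while the buffer's capacity is retained across calls.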

DrewKimball · 2021-08-11T19:08:20Z

TFTRs!

DrewKimball · 2021-08-11T19:08:27Z

bors r+

craig · 2021-08-11T19:08:30Z

👎 Rejected by label

DrewKimball · 2021-08-11T19:08:53Z

bors r+

craig · 2021-08-11T20:16:04Z

Build succeeded:

GitHub CI (Cockroach)

DrewKimball requested review from yuzefovich, michae2 and a team July 10, 2021 00:28

DrewKimball requested a review from a team as a code owner July 10, 2021 00:28

DrewKimball force-pushed the index-join branch from 3528ab4 to bf4a78e Compare July 10, 2021 00:45

DrewKimball force-pushed the index-join branch 2 times, most recently from c2a6e2a to 54c86e5 Compare July 10, 2021 20:47

DrewKimball force-pushed the index-join branch 3 times, most recently from 6d58e0d to dae3f86 Compare July 12, 2021 17:31

yuzefovich reviewed Jul 13, 2021

DrewKimball force-pushed the index-join branch from dae3f86 to 10c5995 Compare July 14, 2021 05:59

DrewKimball commented Jul 14, 2021

yuzefovich reviewed Jul 14, 2021

DrewKimball force-pushed the index-join branch from 10c5995 to d284eda Compare July 15, 2021 03:18

DrewKimball commented Jul 15, 2021

DrewKimball force-pushed the index-join branch 4 times, most recently from 0cec42f to ef3b8e5 Compare July 15, 2021 06:27

yuzefovich approved these changes Jul 15, 2021

DrewKimball force-pushed the index-join branch from ef3b8e5 to 8bf7da3 Compare July 16, 2021 19:30

DrewKimball commented Jul 16, 2021

DrewKimball requested review from erikgrinaker and removed request for a team August 9, 2021 20:30

yuzefovich reviewed Aug 10, 2021

DrewKimball force-pushed the index-join branch 2 times, most recently from 7de64d4 to f12c46b Compare August 10, 2021 10:20

DrewKimball commented Aug 10, 2021

yuzefovich reviewed Aug 10, 2021

DrewKimball force-pushed the index-join branch from f12c46b to 2158334 Compare August 10, 2021 22:41

DrewKimball commented Aug 10, 2021

yuzefovich approved these changes Aug 10, 2021

DrewKimball commented Aug 11, 2021

DrewKimball force-pushed the index-join branch from 2158334 to 710b947 Compare August 11, 2021 04:50

yuzefovich reviewed Aug 11, 2021

DrewKimball force-pushed the index-join branch from 710b947 to 5a6f53b Compare August 11, 2021 07:49

DrewKimball commented Aug 11, 2021

erikgrinaker removed their request for review August 11, 2021 11:16

DrewKimball removed the do-not-merge label Aug 11, 2021

craig bot merged commit 6b2e2c4 into cockroachdb:master Aug 11, 2021

DrewKimball deleted the index-join branch August 11, 2021 20:31

jordanlewis mentioned this pull request Sep 3, 2021

colexec: add support for lookup join in vectorized engine #69822

Open

8 tasks

jseldess mentioned this pull request Sep 8, 2021

colexec: implement vectorized index join cockroachdb/docs#11418

Closed
