opt: support locality optimized search for scans with more than 1 row #69395

rytaft · 2021-08-26T02:34:13Z

This commit updates the logic for planning locality optimized search to
allow the optimization to be planned if there are fewer than 100,000
keys selected.

The optimization is not yet supported for scans with a hard limit.

Informs #64862

Release justification: Low risk, high benefit change to existing functionality.

Release note (performance improvement): locality optimized search
is now supported for scans that are guaranteed to return 100000 keys
or less. This optimization allows the execution engine to avoid
visiting remote regions if all requested keys are found in the local
region, thus reducing the latency of the query.
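
The idea in the release note can be illustrated with a toy model. The sketch below is not CockroachDB's implementation: the `region` type and the `localityOptimizedLookup` function are hypothetical names introduced only for illustration, and the keys are assumed to be distinct. It shows the local region being consulted first, with remote regions visited only when some requested key was not found locally.

```go
package main

import "fmt"

// region is a hypothetical stand-in for the rows stored in one region,
// keyed by primary key.
type region map[int]string

// localityOptimizedLookup searches the local region first and falls back to
// the remote regions only if at least one requested key is missing locally.
// It returns the rows found and whether any remote region was visited.
// The keys are assumed to be distinct.
func localityOptimizedLookup(keys []int, local region, remotes []region) (map[int]string, bool) {
	found := make(map[int]string, len(keys))
	for _, k := range keys {
		if v, ok := local[k]; ok {
			found[k] = v
		}
	}
	// Every key was satisfied locally, so the remote regions are skipped.
	if len(found) == len(keys) {
		return found, false
	}
	// Otherwise, look for the still-missing keys in the remote regions.
	for _, r := range remotes {
		for _, k := range keys {
			if _, ok := found[k]; ok {
				continue
			}
			if v, ok := r[k]; ok {
				found[k] = v
			}
		}
	}
	return found, true
}

func main() {
	local := region{1: "a", 2: "b"}
	remotes := []region{{3: "c"}}

	rows, visitedRemote := localityOptimizedLookup([]int{1, 2}, local, remotes)
	fmt.Println(len(rows), visitedRemote)

	rows, visitedRemote = localityOptimizedLookup([]int{1, 3}, local, remotes)
	fmt.Println(len(rows), visitedRemote)
}
```

In the first call both keys live in the local region, so no remote region is touched; in the second call key 3 is missing locally, which forces the fallback.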

cockroach-teamcity · 2021-08-26T02:34:20Z

cucaroach

Why is there a 10k limit? Is it an OOM avoidance thing?

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @yuzefovich)

rytaft · 2021-08-26T12:51:42Z

It's just the kv batch size, added a comment.

yuzefovich

Nice!

Reviewed 4 of 5 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)

-- commits, line 5 at r2:
super nit: s/are fewer/are no more/.

pkg/sql/opt_exec_factory.go, line 561 at r2 (raw file):

	left, right exec.Node, reqOrdering exec.OutputOrdering, hardLimit uint64,
) (exec.Node, error) {
	if hardLimit > 1 {

Should we assert that hardLimit <= 10000?

pkg/sql/opt/xform/scan_funcs.go, line 155 at r2 (raw file):

	// TODO(rytaft): Revisit this when we have a more accurate cost model for data
	// distribution.
	const localityOptScanMaxRows = 10000

I think this number has actually been increased to 100k recently by Andrei, so I wonder whether we should just use row.productionKVBatchSize directly here (possibly moving into some common base package)?

rytaft

TFTR!

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @yuzefovich)

-- commits, line 5 at r2:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: s/are fewer/are no more/.

Done.

pkg/sql/opt_exec_factory.go, line 561 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Should we assert that hardLimit <= 10000?

I'm inclined not to, since 10000 was just an arbitrary number. We had this assertion for 1 row before since we thought the optimization wouldn't work with more than 1 row, but as I've discovered, it works just fine. I'd rather not have to keep this assertion in sync with whatever limit we place in the optimizer.

pkg/sql/opt/xform/scan_funcs.go, line 155 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I think this number has actually been increased to 100k recently by Andrei, so I wonder whether we should just use row.productionKVBatchSize directly here (possibly moving into some common base package)?

Done. I made a new package since none of the existing packages seemed appropriate (added in a separate commit). Let me know if you have an idea of a better place for it.

yuzefovich

Reviewed 26 of 26 files at r3, 5 of 5 files at r4, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)

-- commits, line 24 at r4:
nit: s/10000/100000/.

-- commits, line 34 at r4:
ditto

pkg/sql/opt_exec_factory.go, line 561 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I'm inclined not to, since 10000 was just an arbitrary number. We had this assertion for 1 row before since we thought the optimization wouldn't work with more than 1 row, but as I've discovered, it works just fine. I'd rather not have to keep this assertion in sync with whatever limit we place in the optimizer.

Makes sense.

pkg/sql/opt/xform/scan_funcs.go, line 155 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Done. I made a new package since none of the existing packages seemed appropriate (added in a separate commit). Let me know if you have an idea of a better place for it.

Looks good to me.

rytaft

Thanks!

bors r+

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde, @rytaft, and @yuzefovich)

-- commits, line 24 at r4:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/10000/100000/.

Done.

-- commits, line 34 at r4:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

ditto

Done.

craig · 2021-08-27T20:43:55Z

Build failed (retrying...):

GitHub CI (Cockroach)

This commit creates a new package called rowinfra and moves some constants and type definitions from the row package to the new package so that they can be used externally without adding a dependency on the row package.

Note: I don't really like the name rowinfra, but I think it works given the current name of the row package. I think it would be better to rename row to something that hints at its actual purpose, which seems to be as the interface between the kv and sql layers. If we rename row, then we can rename rowinfra to match.

Release note: None

Release justification: Low risk, high benefit change to existing functionality that is needed for a follow-on commit.

rytaft · 2021-08-27T21:04:25Z

bors r-

I needed to rebase.

craig · 2021-08-27T21:04:27Z

Canceled.

rytaft · 2021-08-27T21:06:16Z

bors r+

craig · 2021-08-28T00:38:34Z

Build failed (retrying...):

GitHub CI (Cockroach)

yuzefovich · 2021-08-28T01:26:30Z

bors r-

This needs a rewrite of regional_by_row logic test (which I'll do in a minute).

craig · 2021-08-28T01:26:32Z

Canceled.

This commit updates the logic for planning locality optimized search to allow the optimization to be planned if there are no more than 100,000 keys selected.

The optimization is not yet supported for scans with a hard limit.

Informs cockroachdb#64862

Release justification: Low risk, high benefit change to existing functionality.

Release note (performance improvement): locality optimized search is now supported for scans that are guaranteed to return 100,000 keys or less. This optimization allows the execution engine to avoid visiting remote regions if all requested keys are found in the local region, thus reducing the latency of the query.
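
The planning rule described in this commit message can be sketched as a small predicate. This is only an illustration under stated assumptions, not the code from the PR: `canPlanLocalityOptimizedScan` and its parameters are hypothetical, and the constant is declared locally here, whereas the review discussion has it living in a shared package. The name `productionKVBatchSize` and the value 100,000 come from the discussion above.

```go
package main

import "fmt"

// productionKVBatchSize uses the constant name mentioned in the review.
// The value 100000 is the one quoted in the discussion; declaring it in this
// file instead of a shared package is a simplification for the sketch.
const productionKVBatchSize = 100000

// canPlanLocalityOptimizedScan is a hypothetical helper. It reports whether
// the optimization may be planned: the scan must have no hard limit, and the
// number of selected keys must be known and no more than the batch size.
// A hardLimit of 0 is assumed here to mean "no hard limit".
func canPlanLocalityOptimizedScan(maxKeys uint64, keyCountKnown bool, hardLimit uint64) bool {
	if hardLimit != 0 {
		// Scans with a hard limit are not yet supported.
		return false
	}
	return keyCountKnown && maxKeys <= productionKVBatchSize
}

func main() {
	fmt.Println(canPlanLocalityOptimizedScan(100000, true, 0))
	fmt.Println(canPlanLocalityOptimizedScan(100001, true, 0))
	fmt.Println(canPlanLocalityOptimizedScan(10, true, 5))
}
```

Tying the threshold to a single shared constant is what lets the optimizer track the KV batch size without a separate assertion that would have to be kept in sync.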

yuzefovich · 2021-08-28T01:34:24Z

I removed the DISTSQL option from the logic test (because the physical plan differs between vectorize=on and vectorize=off configs).

bors r+

craig · 2021-08-28T06:58:01Z

Build succeeded:

GitHub CI (Cockroach)

rytaft · 2021-08-28T13:21:42Z

I removed the DISTSQL option from the logic test (because the physical plan differs between vectorize=on and vectorize=off configs).

Thanks!

rytaft requested review from yuzefovich, RaduBerinde and a team August 26, 2021 02:34

rytaft requested a review from a team as a code owner August 26, 2021 02:34

cucaroach reviewed Aug 26, 2021

rytaft force-pushed the local-opt-scan branch from 2691ea2 to c2735ee Compare August 26, 2021 12:51

yuzefovich reviewed Aug 26, 2021

rytaft force-pushed the local-opt-scan branch from c2735ee to 1eb0587 Compare August 27, 2021 12:33

rytaft requested a review from a team August 27, 2021 12:33

rytaft commented Aug 27, 2021

yuzefovich approved these changes Aug 27, 2021

rytaft force-pushed the local-opt-scan branch from 1eb0587 to 881b828 Compare August 27, 2021 19:24

rytaft commented Aug 27, 2021

rytaft force-pushed the local-opt-scan branch from 881b828 to 65fe2c7 Compare August 27, 2021 21:05

yuzefovich force-pushed the local-opt-scan branch from 65fe2c7 to 3af8181 Compare August 28, 2021 01:33

craig bot merged commit b8223fb into cockroachdb:master Aug 28, 2021

cockroach-teamcity mentioned this pull request Aug 28, 2021

opt: support locality optimized search for scans with more than 1 row cockroachdb/docs#11158

Closed

jseldess mentioned this pull request Sep 8, 2021

opt: support locality optimized search for scans with more than 1 row cockroachdb/docs#11615

Closed
