-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: support locality optimized search for scans with more than 1 row #69395
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there a 10k limit? Is it an oom avoidance thing?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @yuzefovich)
It's just the kv batch size, added a comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Reviewed 4 of 5 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)
-- commits, line 5 at r2:
super nit: s/are fewer/are no more/
.
pkg/sql/opt_exec_factory.go, line 561 at r2 (raw file):
left, right exec.Node, reqOrdering exec.OutputOrdering, hardLimit uint64, ) (exec.Node, error) { if hardLimit > 1 {
Should we assert that hardLimit <= 10000
?
pkg/sql/opt/xform/scan_funcs.go, line 155 at r2 (raw file):
// TODO(rytaft): Revisit this when we have a more accurate cost model for data // distribution. const localityOptScanMaxRows = 10000
I think this number has actually been increased to 100k recently by Andrei, so I wonder whether we should just use row.productionKVBatchSize
directly here (possibly moving into some common base package)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @yuzefovich)
Previously, yuzefovich (Yahor Yuzefovich) wrote…
super nit:
s/are fewer/are no more/
.
Done.
pkg/sql/opt_exec_factory.go, line 561 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Should we assert that
hardLimit <= 10000
?
I'm inclined not to, since 10000 was just an arbitrary number. We had this assertion for 1 row before since we thought the optimization wouldn't work with more than row, but as I've discovered, it works just fine. I'd rather not have to keep this assertion in sync with whatever limit we place in the optimizer.
pkg/sql/opt/xform/scan_funcs.go, line 155 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I think this number has actually been increased to 100k recently by Andrei, so I wonder whether we should just use
row.productionKVBatchSize
directly here (possibly moving into some common base package)?
Done. I made a new package since none of the existing packages seemed appropriate (added in a separate commit). Let me know if you have an idea of a better place for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 26 of 26 files at r3, 5 of 5 files at r4, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)
-- commits, line 24 at r4:
nit: s/10000/100000/
.
-- commits, line 34 at r4:
ditto
pkg/sql/opt_exec_factory.go, line 561 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
I'm inclined not to, since 10000 was just an arbitrary number. We had this assertion for 1 row before since we thought the optimization wouldn't work with more than row, but as I've discovered, it works just fine. I'd rather not have to keep this assertion in sync with whatever limit we place in the optimizer.
Makes sense.
pkg/sql/opt/xform/scan_funcs.go, line 155 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Done. I made a new package since none of the existing packages seemed appropriate (added in a separate commit). Let me know if you have an idea of a better place for it.
Looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde, @rytaft, and @yuzefovich)
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit:
s/10000/100000/
.
Done.
Previously, yuzefovich (Yahor Yuzefovich) wrote…
ditto
Done.
Build failed (retrying...): |
This commit creates a new package called rowinfra and moves some constants and type definitions from the row package to the new package so that they can be used externally without adding a dependency on the row package. Note: I don't really like the name rowinfra, but I think it works given the current name of the row package. I think it would be better to rename row to something that hints at it's actual purpose, which seems to be as the interface between the kv and sql layers. If we rename row, then we can rename rowinfra to match. Release note: None Release justification: Low risk, high benefit change to existing functionality that is needed for a follow-on commit.
bors r- I needed to rebase. |
Canceled. |
bors r+ |
Build failed (retrying...): |
bors r- This needs a rewrite of |
Canceled. |
This commit updates the logic for planning locality optimized search to allow the optimization the be planned if there are no more than 100,000 keys selected. The optimization is not yet supported for scans with a hard limit. Informs cockroachdb#64862 Release justification: Low risk, high benefit change to existing functionality. Release note (performance improvement): locality optimized search is now supported for scans that are guaranteed to return 100,000 keys or less. This optimization allows the execution engine to avoid visiting remote regions if all requested keys are found in the local region, thus reducing the latency of the query.
65fe2c7
to
3af8181
Compare
I removed the bors r+ |
Build succeeded: |
Thanks! |
This commit updates the logic for planning locality optimized search to
allow the optimization the be planned if there are fewer than 100,000
keys selected.
The optimization is not yet supported for scans with a hard limit.
Informs #64862
Release justification: Low risk, high benefit change to existing functionality.
Release note (performance improvement): locality optimized search
is now supported for scans that are guaranteed to return 100000 keys
or less. This optimization allows the execution engine to avoid
visiting remote regions if all requested keys are found in the local
region, thus reducing the latency of the query.