-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: make some tweaks to using the vectorized engine #46080
Conversation
a566b45
to
90e1811
Compare
90e1811
to
4f6028e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @ericharmeling on the rename
Reviewed 12 of 12 files at r1, 12 of 12 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/cmd/roachtest/tpchvec.go, line 515 at r2 (raw file):
vectorizeSetting := "off" if vectorize { for versionPrefix, vectorizeOption := range vectorizeOptionByVersionPrefix {
nit: might be cleaner to get the prefix from the version and access the map using that.
pkg/sql/explain_vec.go, line 79 at r2 (raw file):
if flowCtx.EvalCtx.SessionData.VectorizeMode == sessiondata.VectorizeOff { return errors.New("vectorize set to 'off'")
Why is this changing? Please also add a comment detailing the reasoning for this behavior.
pkg/sql/colexec/execplan.go, line 229 at r1 (raw file):
return false, errors.Newf("distinct with error on duplicates not supported") } if mode != sessiondata.VectorizeExperimentalOn && mode != sessiondata.VectorizeExperimentalAlways {
Extract this check into a helper function and reuse for window fns.
4f6028e
to
09011ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)
pkg/cmd/roachtest/tpchvec.go, line 515 at r2 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
nit: might be cleaner to get the prefix from the version and access the map using that.
There is already a precedent in this file of doing this kind of map iteration and prefix matching. I'd keep it this way because it's hard to mess this up whereas if we switch to extracting a prefix from the version first, I think that would be more error-prone.
pkg/sql/explain_vec.go, line 79 at r2 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Why is this changing? Please also add a comment detailing the reasoning for this behavior.
Added a comment.
pkg/sql/colexec/execplan.go, line 229 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Extract this check into a helper function and reuse for window fns.
Good point, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 13 of 13 files at r3, 13 of 13 files at r4.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/explain_vec.go, line 79 at r2 (raw file):
Previously, yuzefovich wrote…
Added a comment.
So currently when running EXPLAIN
vectorize
will show false
if vectorize is off. But now running an EXPLAIN
will return an error? Or is this only for EXPLAIN (VEC)
Release justification: bug fixes and low-risk updates to new functionality. This commit enhances `isSupported` check during the vectorized execution planning so that operators that cannot run in `auto` mode (unordered distinct, percent_rank, and cume_dist) would be reported as "unsupported" which will enable wrapping the processor cores into the vectorized flow. Previously, we would refuse to vectorize the whole flow which is a mistake. Release note: None
Release justification: bug fixes and low-risk updates to new functionality. This commit renames `experimental_on` option of `vectorize` variable to `on` since we're now confident in the correctness. This commit also changes the behavior of `EXPLAIN (VEC)` slightly - previously, we were setting `vectorize` to (what was) `experimental_on` and then running `SupportsVectorized` check. Now we will return an error if `vectorize` is set to `off` and in other cases we will run the check with the current `vectorize` mode. This is done so that `EXPLAIN (VEC)` better reflects reality. Release note (sql change): `experimental_on` option for `vectorize` session variable has been renamed to `on`. The only things that will not run with `auto` but will run with `on` are unordered distinct and two window functions (`percent_rank` and `cume_dist`), otherwise, the two options are identical.
09011ff
to
c8cc60b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)
pkg/sql/explain_vec.go, line 79 at r2 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
So currently when running
EXPLAIN
vectorize
will showfalse
if vectorize is off. But now running anEXPLAIN
will return an error? Or is this only forEXPLAIN (VEC)
This change affects only EXPLAIN (VEC)
variant:
root@:26257/tpch> set vectorize=off;
SET
Time: 292µs
root@:26257/tpch> explain (vec) select count(*) from lineitem;
ERROR: vectorize is set to 'off'
root@:26257/tpch> explain select count(*) from lineitem;
tree | field | description
------------+-------------+----------------
| distributed | true
| vectorized | false
group | |
│ | aggregate 0 | count_rows()
│ | scalar |
└── scan | |
| table | lineitem@l_ok
| spans | ALL
(8 rows)
Time: 669µs
root@:26257/tpch> set vectorize=auto;
SET
Time: 296µs
root@:26257/tpch> explain select count(*) from lineitem;
tree | field | description
------------+-------------+----------------
| distributed | true
| vectorized | true
group | |
│ | aggregate 0 | count_rows()
│ | scalar |
└── scan | |
| table | lineitem@l_ok
| spans | ALL
(8 rows)
Time: 879µs
root@:26257/tpch> explain (vec) select count(*) from lineitem;
text
-------------------------------
│
└ Node 1
└ *colexec.countOp
└ *colexec.colBatchScan
(4 rows)
Regular EXPLAIN
didn't override the vectorize setting, so no changes are needed in there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto)
pkg/sql/logictest/logic.go, line 564 at r6 (raw file):
"local", "local-vec-off", "local-vec",
@asubiotto what do you think about removing local-vec
, fakedist-vec
, fakedist-vec-disk
configs from the ist of the default ones because they are basically the same as without "-vec" suffix? We would explicitly put this configs into the "distinct" and "window" logic tests and probably would get almost the same coverage.
That's a good idea but I would probably wait until the 20.1 branch gets cut. My reasoning is that we might still want to switch off some new functionality but keep the testing coverage without adding explicit configs to the test files. Thanks for sharing the EXPLAIN output! Looks good. |
TFTR! bors r+ |
Noticed a typo in the comment. bors r- |
Canceled |
Never mind, I think the comment is good. bors r+ |
Build failed (retrying...) |
Build succeeded |
colexec: enable wrapping of unordered distinct in vectorized flow
Release justification: bug fixes and low-risk updates to new
functionality.
This commit enhances
isSupported
check during the vectorized executionplanning so that operators that cannot run in
auto
mode (unordereddistinct, percent_rank, and cume_dist) would be reported as
"unsupported" which will enable wrapping the processor cores into the
vectorized flow. Previously, we would refuse to vectorize the whole flow
which is a mistake.
Release note: None
sql: rename vectorize
experimental_on
toon
Release justification: bug fixes and low-risk updates to new
functionality.
This commit renames
experimental_on
option ofvectorize
variable toon
since we're now confident in the correctness. This commit alsochanges the behavior of
EXPLAIN (VEC)
slightly - previously, we weresetting
vectorize
to (what was)experimental_on
and then runningSupportsVectorized
check. Now we will return an error ifvectorize
is set to
off
and in other cases we will run the check with thecurrent
vectorize
mode. This is done so thatEXPLAIN (VEC)
betterreflects reality.
Release note (sql change):
experimental_on
option forvectorize
session variable has been renamed to
on
. The only things that willnot run with
auto
but will run withon
are unordered distinct andtwo window functions (
percent_rank
andcume_dist
), otherwise, thetwo options are identical.