colexec: plan disk-spilling enabled operators when vectorize=auto #45582

Merged
merged 3 commits into from
Mar 2, 2020

@asubiotto asubiotto commented Mar 2, 2020

Each commit enables one of the HashRouter, Sorter, or HashJoiner when vectorize=auto and includes the relevant test changes (all of them plan changes).

The only thing of note is that EXPLAIN ANALYZE plans include both row and column stats for wrapped operators. This seems to be expected (according to vectorize_local), and naively changing it results in other failures (stats unexpectedly not being collected), so I will leave it up to discussion whether we want to change this. Regardless, I think it is out of scope for this PR.

Closes #45172

@asubiotto asubiotto requested a review from yuzefovich March 2, 2020 15:40
@asubiotto asubiotto requested a review from a team as a code owner March 2, 2020 15:40
@cockroach-teamcity

This change is Reviewable

asubiotto commented Mar 2, 2020

Note that no tests are yet run with fakedist-vec-disk, since it is best to wait until #45505 is fixed. A future commit/PR will do this.

@yuzefovich yuzefovich left a comment

:lgtm:

Reviewed 3 of 3 files at r1, 16 of 16 files at r2, 12 of 12 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)


pkg/sql/colflow/vectorized_flow.go, line 948 at r1 (raw file):

			// buffer an unlimited number of tuples, even though it falls back to
			// disk. vectorize=auto does support this.
			return nil, errors.Errorf("hash router encountered when vectorize=auto")

nit: s/auto/192auto/g.

The HashRouter can spill to disk, so it may now be planned when vectorize=auto.

Release note: None (this behavior change will be called out in a following
commit)

The sort operator can spill to disk, so it may now be planned when
vectorize=auto.

Release note (sql change): sorts are run using the vectorized engine when
vectorize=auto (default configuration)

The hash joiner operator can spill to disk, so it may now be planned when
vectorize=auto.

Release note (sql change): hash joins are run using the vectorized engine when
vectorize=auto (default configuration)
@asubiotto

bors r=yuzefovich

craig bot commented Mar 2, 2020

Build succeeded

@craig craig bot merged commit e314075 into cockroachdb:master Mar 2, 2020
@andreimatei

Can this be responsible for this failure?

testdata/logic_test/postgresjoin:1577: SELECT 'x' AS "xxx", * FROM J1_TBL t1 (a, b, c) JOIN J2_TBL t2 (a, d) USING (a) ORDER BY a, d
                expected success, but found
                (XX000) internal error: unexpected error from the vectorized runtime: file does not exist
                error.go:70: in func1()

https://teamcity.cockroachdb.com/viewLog.html?buildId=1779463&buildTypeId=Cockroach_UnitTests
