opt/xform: make merge join have smaller table on the right side #71849

Merged
merged 2 commits into from
Oct 26, 2021

Conversation

yuzefovich
Member

@yuzefovich yuzefovich commented Oct 21, 2021

sql: ignore last noop processor on gateway when moving single flow

When figuring out whether a physical plan consists of flows on multiple
nodes, we want to ignore the noop processor that might be planned on the
gateway for the sole purpose of propagating the results back to the client
(in other words, when we have a flow on the gateway consisting only of
a noop processor). Previously, we forgot to do that.
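
To illustrate the check described in this commit message, here is a minimal sketch. The `processor` type and the `nodesDoingWork` helper are hypothetical names invented for this example; they are not the actual physical planning code, and node IDs are plain integers for simplicity.

```go
package main

import "fmt"

// processor is a hypothetical, simplified description of one processor in a
// physical plan: the node it runs on and whether it is a noop.
type processor struct {
	node   int
	isNoop bool
}

// nodesDoingWork counts the nodes whose flows matter when deciding whether a
// plan spans multiple nodes. A gateway flow that consists of a single noop
// processor is skipped, since it only forwards results to the client.
func nodesDoingWork(procs []processor, gateway int) int {
	perNode := make(map[int][]processor)
	for _, p := range procs {
		perNode[p.node] = append(perNode[p.node], p)
	}
	count := 0
	for node, ps := range perNode {
		if node == gateway && len(ps) == 1 && ps[0].isNoop {
			continue
		}
		count++
	}
	return count
}

func main() {
	plan := []processor{
		{node: 1, isNoop: true},  // gateway: only forwards results
		{node: 2, isNoop: false}, // remote node: does the actual work
	}
	fmt.Println(nodesDoingWork(plan, 1) > 1) // false: effectively a single flow
}
```

Without the gateway special case, the same plan would be counted as running on two nodes even though only one of them does any real work.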

Release note: None

opt/xform: make merge join have smaller table on the right side

A PR that is currently in flight will change the implementation of the
vectorized merge joiner so that it always processes the left input in
a streaming fashion, whereas the right input might be (partially)
buffered in some cases. To account for this difference, this commit
changes the cost model by introducing a small factor on the right row
count, which ensures that a join with the smaller input on the right
side is preferred to the symmetric join.

This commit also handles the case of partial joins in a similar manner
to how we handle the hash join.

Fixes: #71790.

Release note: None

@yuzefovich yuzefovich requested review from RaduBerinde and a team October 21, 2021 22:06
@yuzefovich yuzefovich requested a review from a team as a code owner October 21, 2021 22:06
@yuzefovich yuzefovich force-pushed the merge-join-cost-model branch from bb40195 to 90c4cf2 Compare October 21, 2021 23:29
@yuzefovich
Member Author

There are still some stats tests that need an update, but I'll do that at the very end.

There is also a suspicious failure with TestPGTest - it's possible that the way I'm modifying *memo.MergeJoinExpr is wrong (or maybe it's not allowed at all). Will wait for a review on that one too.

@RaduBerinde
Member

What modification of memo.MergeJoinExpr are you referring to?

@yuzefovich
Member Author

This one:

    join.Left, join.Right = join.Right, join.Left
    join.LeftEq, join.RightEq = join.RightEq, join.LeftEq
    join.LeftOrdering, join.RightOrdering = join.RightOrdering, join.LeftOrdering

@yuzefovich
Member Author

This is needed since we didn't introduce separate (Left|Right)Semi and (Left|Right)Anti joins and is similar to what we already do for the hash joins (although there things were simpler, without having to modify the memo directly).

@RaduBerinde
Member

Oh, I see. Let's not do that; just use some local variables. If the full plan is cached and reused, multiple threads might be doing this at the same time.
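
A minimal sketch of the local-variable approach, assuming simplified stand-in types: `mergeJoinExpr`, `joinInputs`, and `inputsFor` are hypothetical names made up for this illustration and do not reflect the real `memo.MergeJoinExpr` definition. The point is only that the swap happens on per-call copies, so an expression shared through a cached plan is never written to.

```go
package main

import "fmt"

// mergeJoinExpr is a hypothetical stand-in for an expression that may be
// shared by several goroutines when a cached plan is reused.
type mergeJoinExpr struct {
	Left, Right     string
	LeftEq, RightEq []int
}

// joinInputs holds per-call copies of the fields we may need to swap.
type joinInputs struct {
	left, right     string
	leftEq, rightEq []int
}

// inputsFor returns the (possibly commuted) inputs without modifying join.
func inputsFor(join *mergeJoinExpr, commute bool) joinInputs {
	in := joinInputs{
		left: join.Left, right: join.Right,
		leftEq: join.LeftEq, rightEq: join.RightEq,
	}
	if commute {
		// Swap only the local copies; the shared expression stays read-only.
		in.left, in.right = in.right, in.left
		in.leftEq, in.rightEq = in.rightEq, in.leftEq
	}
	return in
}

func main() {
	shared := &mergeJoinExpr{Left: "a", Right: "b", LeftEq: []int{1}, RightEq: []int{2}}
	in := inputsFor(shared, true)
	fmt.Println(in.left, in.right)         // b a
	fmt.Println(shared.Left, shared.Right) // a b
}
```

Because `inputsFor` only reads from `join`, concurrent callers working from the same cached expression cannot race with each other on these fields.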

@yuzefovich yuzefovich force-pushed the merge-join-cost-model branch from 90c4cf2 to ddda439 Compare October 22, 2021 20:34
@yuzefovich yuzefovich force-pushed the merge-join-cost-model branch from ddda439 to a9f717b Compare October 22, 2021 20:36
@yuzefovich
Member Author

Makes sense, fixed.

However, TestPGTest is still failing (even if I comment out the part about commuting the LeftSemi/LeftAnti join) with

    --- FAIL: TestPGTest/row_description (0.34s)
        datadriven.go:75: 
            testdata/pgtest/row_description:126: RowDescription
            expected:
            {"Type":"RowDescription","Fields":[{"Name":"v1","TableOID":52,"TableAttributeNumber":1,"DataTypeOID":20,"DataTypeSize":8,"TypeModifier":-1,"Format":0},{"Name":"v2","TableOID":53,"TableAttributeNumber":2,"DataTypeOID":20,"DataTypeSize":8,"TypeModifier":-1,"Format":0}]}
            
            found:
            {"Type":"RowDescription","Fields":[{"Name":"v1","TableOID":0,"TableAttributeNumber":0,"DataTypeOID":20,"DataTypeSize":8,"TypeModifier":-1,"Format":0},{"Name":"v2","TableOID":0,"TableAttributeNumber":0,"DataTypeOID":20,"DataTypeSize":8,"TypeModifier":-1,"Format":0}]}

Any idea why that could be the case?

@yuzefovich
Member Author

I confirmed that it is the second commit that fails the test. Maybe the plan for the query used in TestPGTest changed and we have some bug in there?

@yuzefovich yuzefovich force-pushed the merge-join-cost-model branch from a9f717b to ec89dce Compare October 22, 2021 21:14
@yuzefovich
Member Author

Alright, this is a pre-existing bug (filed #71891), so for now I think it's OK to ignore the table OIDs.

@yuzefovich yuzefovich force-pushed the merge-join-cost-model branch 3 times, most recently from 7e76ac9 to 9a53a4b Compare October 23, 2021 02:02
@yuzefovich
Member Author

I think I fixed up all the test files. For now I didn't include the changes to the stats quality tests because there are changes in queries for which the plans haven't changed. Anyway, RFAL.

Contributor

@cucaroach cucaroach left a comment

Reviewed 2 of 2 files at r1, 11 of 11 files at r3, 11 of 11 files at r4, 1 of 1 files at r5, 13 of 13 files at r6, 15 of 15 files at r7, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde)

Collaborator

@rytaft rytaft left a comment

nit: xfrom -> xform in the commit/PR message title

The stats quality test changes LGTM

Reviewed 2 of 2 files at r1, 11 of 11 files at r3, 11 of 11 files at r4, 1 of 1 files at r5, 13 of 13 files at r6, 15 of 15 files at r7, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @yuzefovich)


-- commits, line 23 at r6:
nit: a similar -> a similar fashion?


pkg/sql/opt/xform/coster.go, line 860 at r6 (raw file):

	// ensures that a join with the smaller right side is preferred to the
	// symmetric join.
	cost := memo.Cost(leftRowCount+1.25*rightRowCount) * cpuCostFactor

Have you compared the impact on plans if you also multiply leftRowCount by 0.75? (We might want to consider doing this if there are any cases where a different join type is now preferred over merge join, but I didn't see many of those...)

@yuzefovich yuzefovich force-pushed the merge-join-cost-model branch from 9a53a4b to d240e5c Compare October 25, 2021 16:57
@yuzefovich yuzefovich changed the title opt/xfrom: make merge join have smaller table on the right side opt/xform: make merge join have smaller table on the right side Oct 25, 2021
Member Author

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)


pkg/sql/opt/xform/coster.go, line 860 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Have you compared the impact on plans if you also multiply leftRowCount by 0.75? (We might want to consider doing this if there are any cases where a different join type is now preferred over merge join, but I didn't see many of those...)

I don't think I saw any merge joins change into non-merge joins. Is your reasoning that we'd like to keep the cost of merge joins roughly unchanged? Would we be happy with 0.75 x 1.25 multiples? Maybe 0.9 x 1.1 would be better?

Collaborator

@rytaft rytaft left a comment

Reviewed all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @yuzefovich)


-- commits, line 24 at r8:
nit: to how we handle


pkg/sql/opt/xform/coster.go, line 860 at r6 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I don't think I saw any merge joins change into non-merge joins. Is your reasoning that we'd like to keep the cost of merge joins roughly unchanged? Would we be happy with 0.75 x 1.25 multiples? Maybe 0.9 x 1.1 would be better?

I think I saw at least one non-trivial plan change. But I agree that there weren't many.

Either way, I do think it is worth trying to be sure that we're not increasing the total cost of merge joins with this change. If anything, we should decrease it since your execution change makes them more efficient, not less. But anyway, either 0.75 x 1.25 or 0.9 x 1.1 seems like it would probably do the right thing -- you might just need to give it a try and see which plans change.

@yuzefovich yuzefovich force-pushed the merge-join-cost-model branch from d240e5c to e29aa70 Compare October 25, 2021 17:53
Member Author

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @RaduBerinde, and @rytaft)


pkg/sql/opt/xform/coster.go, line 860 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I think I saw at least one non-trivial plan change. But I agree that there weren't many.

Either way, I do think it is worth trying to be sure that we're not increasing the total cost of merge joins with this change. If anything, we should decrease it since your execution change makes them more efficient, not less. But anyway, either 0.75 x 1.25 or 0.9 x 1.1 seems like it would probably do the right thing -- you might just need to give it a try and see which plans change.

I think I'll do 0.9 x 1.1 and will regenerate the test files.

I looked over the PR and still don't see any plans that actually changed - do you mind pointing me to it?
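
For reference, a rough sketch of what the 0.9 x 1.1 weighting does to the comparison between a join and its symmetric counterpart. The `mergeJoinCost` helper is a hypothetical stand-in rather than the actual code in `coster.go`, and the `cpuCostFactor` value here is only illustrative.

```go
package main

import "fmt"

// cpuCostFactor is an illustrative value for this sketch; the real coster
// defines its own constant.
const cpuCostFactor = 0.01

// Multipliers from the discussion: slightly discount the streamed left side
// and slightly penalize the possibly buffered right side.
const (
	leftFactor  = 0.9
	rightFactor = 1.1
)

// mergeJoinCost is a hypothetical, simplified version of the row-count part
// of the merge join cost.
func mergeJoinCost(leftRowCount, rightRowCount float64) float64 {
	return (leftFactor*leftRowCount + rightFactor*rightRowCount) * cpuCostFactor
}

func main() {
	small, large := 100.0, 1000.0
	smallOnRight := mergeJoinCost(large, small)
	smallOnLeft := mergeJoinCost(small, large)
	fmt.Println(smallOnRight < smallOnLeft) // true
}
```

Since the two factors sum to 2, inputs of equal size cost about the same as with the unweighted `leftRowCount + rightRowCount`, which matches the goal of not increasing the overall cost of merge joins.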

Collaborator

@rytaft rytaft left a comment

Reviewed 2 of 2 files at r9, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @yuzefovich)


pkg/sql/opt/xform/coster.go, line 860 at r6 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I think I'll do 0.9 x 1.1 and will regenerate the test files.

I looked over the PR and still don't see any plans that actually changed - do you mind pointing me to it?

I added a comment -- only found one.


pkg/sql/opt/xform/coster.go, line 857 at r9 (raw file):

	// The vectorized merge join in some cases buffers rows from the right side
	// whereas the left side is processed in a streaming fashion. To account for
	// this difference, we add small factors to both row counts which ensures

super nit: you're not really "adding" a factor to both row counts now... maybe use "multiply" instead?


pkg/sql/opt/xform/testdata/external/pgjdbc, line 203 at r9 (raw file):

      │    │    │    │    │    │    │    ├── columns: n.oid:2!null n.nspname:3!null c.oid:7!null c.relname:8!null c.relnamespace:9!null c.relkind:24!null attrelid:44!null attname:45 atttypid:46 attlen:48 attnum:49!null atttypmod:52 a.attnotnull:56 attisdropped:60!null
      │    │    │    │    │    │    │    ├── fd: ()-->(3,60), (2)==(9), (9)==(2), (7)==(44), (44)==(7)
      │    │    │    │    │    │    │    ├── inner-join (merge)

This is the plan that seems to have changed to eliminate merge join in favor of hash

@yuzefovich yuzefovich force-pushed the merge-join-cost-model branch from e29aa70 to d44169b Compare October 25, 2021 18:58
@yuzefovich yuzefovich force-pushed the merge-join-cost-model branch from d44169b to fc18f45 Compare October 25, 2021 18:59
Member Author

@yuzefovich yuzefovich left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cucaroach, @RaduBerinde, and @rytaft)


pkg/sql/opt/xform/testdata/external/pgjdbc, line 203 at r9 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

This is the plan that seems to have changed to eliminate merge join in favor of hash

Oh indeed, thanks. This is a merge join again now.

Collaborator

@rytaft rytaft left a comment

nit: further down in the PR message you still have "opt/xfrom"

Otherwise, :lgtm:

Reviewed 9 of 9 files at r10, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)

Member Author

@yuzefovich yuzefovich left a comment

nit: further down in the PR message you still have "opt/xfrom"

Fixed.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)

@craig
Contributor

craig bot commented Oct 25, 2021

Build failed (retrying...):

@craig craig bot merged commit 0595c1f into cockroachdb:master Oct 26, 2021
@craig
Contributor

craig bot commented Oct 26, 2021

Build succeeded:

@yuzefovich yuzefovich deleted the merge-join-cost-model branch October 26, 2021 00:38
Successfully merging this pull request may close these issues.

opt: suboptimal ordering of tables for merge join
5 participants