[SPARK-20366] [SQL] Fix recursive join reordering: inside joins are not reordered #17668

Closed


@wzhfy wzhfy commented Apr 18, 2017

What changes were proposed in this pull request?

If a plan has multi-level successive joins, e.g.:

         Join
         /   \
     Union   t5
      /   \
    Join  t4
    /   \
  Join  t3
  /  \
 t1   t2

Currently, we fail to reorder the nested joins, i.e. the joins over t1, t2 and t3.

In join reordering, we use OrderedJoin to mark a join as already ordered, so that when transforming down the plan, these joins are not reordered again.

But there's a problem in the definition of OrderedJoin:
The real join node is a constructor parameter, not a child. This breaks the transform procedure, because mapChildren applies the transform function only to parameters that are also children, so the plan beneath an OrderedJoin is never transformed.
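To make the parameter-versus-child distinction concrete, here is a minimal toy sketch in Scala. It is not Spark's code: `Node`, `Leaf`, `Wrapper`, `rewrittenArgs` and `visited` are hypothetical names, and `rewrittenArgs` only imitates the idea that a constructor argument is handed to the transform function when it is also listed in `children`.

```scala
object MapChildrenSketch {
  // Toy stand-in for a tree node; none of these are Spark's classes.
  sealed trait Node extends Product {
    def children: Seq[Node]

    // Assumption for illustration: a constructor argument is passed to `f`
    // only if it is also one of this node's children.
    def rewrittenArgs(f: Node => Node): List[Any] =
      productIterator.map {
        case n: Node if children.contains(n) => f(n)
        case other => other
      }.toList
  }

  case class Leaf(name: String) extends Node {
    def children: Seq[Node] = Nil
  }

  case class Join(left: Node, right: Node) extends Node {
    def children: Seq[Node] = Seq(left, right)
  }

  // Wrapper style: it reports the join's children, but its only
  // constructor argument is the join itself, which is not a child.
  case class Wrapper(join: Join) extends Node {
    def children: Seq[Node] = join.children
  }

  // Records which nodes the transform function was actually applied to.
  def visited(node: Node): List[String] = {
    val seen = scala.collection.mutable.ListBuffer.empty[String]
    node.rewrittenArgs { n => seen += n.toString; n }
    seen.toList
  }
}
```

In this sketch, `visited` on a plain `Join` reports both inputs, while `visited` on a `Wrapper` around the same join reports nothing, because the wrapped join is an argument that is not a child.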

In this patch, we change OrderedJoin to a class having the same structure as a join node.
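As a rough illustration of why the join-shaped structure helps, the toy sketch below runs a top-down transform over the plan from the diagram above. It is a simplified model under stated assumptions, not the patch itself: `Plan`, `Table`, `Union`, `Join` and `OrderedJoin` here are hypothetical two-field classes, and `markOrdered` merely marks each join as ordered instead of performing any real reordering.

```scala
object ReorderSketch {
  // Toy plan tree; these are not Spark's classes.
  sealed trait Plan {
    def mapChildren(f: Plan => Plan): Plan

    // Apply the rule to this node first, then recurse into the children
    // of whatever node the rule produced.
    def transformDown(rule: PartialFunction[Plan, Plan]): Plan =
      rule.applyOrElse(this, identity[Plan]).mapChildren(_.transformDown(rule))
  }

  case class Table(name: String) extends Plan {
    def mapChildren(f: Plan => Plan): Plan = this
  }

  case class Union(left: Plan, right: Plan) extends Plan {
    def mapChildren(f: Plan => Plan): Plan = Union(f(left), f(right))
  }

  case class Join(left: Plan, right: Plan) extends Plan {
    def mapChildren(f: Plan => Plan): Plan = Join(f(left), f(right))
  }

  // Same shape as Join: its inputs are ordinary children, so the
  // traversal keeps descending below an already ordered join.
  case class OrderedJoin(left: Plan, right: Plan) extends Plan {
    def mapChildren(f: Plan => Plan): Plan = OrderedJoin(f(left), f(right))
  }

  // Stand-in for the real reordering rule: only marks a Join as ordered.
  val markOrdered: PartialFunction[Plan, Plan] = {
    case Join(l, r) => OrderedJoin(l, r)
  }

  // The plan from the diagram in the description.
  val plan: Plan =
    Join(
      Union(
        Join(Join(Table("t1"), Table("t2")), Table("t3")),
        Table("t4")),
      Table("t5"))

  val result: Plan = plan.transformDown(markOrdered)
}
```

Because `OrderedJoin` exposes `left` and `right` as children in this model, the rule also reaches the joins under the `Union`, so every `Join` in `result` ends up marked.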

How was this patch tested?

Add a corresponding test case.


wzhfy commented Apr 18, 2017

cc @cloud-fan @hvanhovell


SparkQA commented Apr 18, 2017

Test build #75891 has finished for PR 17668 at commit 522d2fa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • case class OrderedJoin(

@cloud-fan

LGTM, merging to master!

@asfgit asfgit closed this in 321b4f0 Apr 18, 2017
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…t reordered

Author: wangzhenhua <[email protected]>

Closes apache#17668 from wzhfy/recursiveReorder.