Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-23079][SQL] Fix query constraints propagation with aliases #20270

Closed
wants to merge 2 commits into from

Conversation

gengliangwang
Copy link
Member

@gengliangwang gengliangwang commented Jan 15, 2018

What changes were proposed in this pull request?

Previously, PR #19201 fix the problem of non-converging constraints.

However, the case below will fail.


spark.range(5).write.saveAsTable("t")
val t = spark.read.table("t")
val left = t.withColumn("xid", $"id" + lit(1)).as("x")
val right = t.withColumnRenamed("id", "xid").as("y")
val df = left.join(right, "xid").filter("id = 3").toDF()
checkAnswer(df, Row(4, 3))

Because aliasMap replace all the aliased child. See the test case in PR for details.

This PR is to fix this bug by removing useless aliasMap.

Duplicated to #20278, this PR will be closed after test finished.
This PR makes sense but it takes time to understand the whole code.

How was this patch tested?

Unit test

@SparkQA
Copy link

SparkQA commented Jan 15, 2018

Test build #86135 has finished for PR 20270 at commit e9dd769.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

/**
* Infers an additional set of constraints from a given set of equality constraints.
* For e.g., if an operator has constraints of the form (`a = 5`, `a = b`), this returns an
* additional constraint of the form `b = 5`.
*/
private def inferAdditionalConstraints(constraints: Set[Expression]): Set[Expression] = {
val aliasedConstraints = eliminateAliasedExpressionInConstraints(constraints)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us revert this back?

@SparkQA
Copy link

SparkQA commented Jan 16, 2018

Test build #86173 has started for PR 20270 at commit 8a22e1d.

@SparkQA
Copy link

SparkQA commented Jan 16, 2018

Test build #86175 has started for PR 20270 at commit 05c32bf.

@SparkQA
Copy link

SparkQA commented Jan 16, 2018

Test build #86176 has started for PR 20270 at commit aaad66a.

@gengliangwang gengliangwang changed the title [SPARK-23079] Fix query constraints propagation with aliases [SPARK-23079][SQL] Fix query constraints propagation with aliases Jan 16, 2018
@gengliangwang
Copy link
Member Author

retest this please

@SparkQA
Copy link

SparkQA commented Jan 17, 2018

Test build #86217 has finished for PR 20270 at commit aaad66a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

ghost pushed a commit to dbtsai/spark that referenced this pull request Jan 17, 2018
## What changes were proposed in this pull request?

Previously, PR apache#19201 fix the problem of non-converging constraints.
After that PR apache#19149 improve the loop and constraints is inferred only once.
So the problem of non-converging constraints is gone.

However, the case below will fail.

```

spark.range(5).write.saveAsTable("t")
val t = spark.read.table("t")
val left = t.withColumn("xid", $"id" + lit(1)).as("x")
val right = t.withColumnRenamed("id", "xid").as("y")
val df = left.join(right, "xid").filter("id = 3").toDF()
checkAnswer(df, Row(4, 3))

```

Because `aliasMap` replace all the aliased child. See the test case in PR for details.

This PR is to fix this bug by removing useless code for preventing non-converging constraints.
It can be also fixed with apache#20270, but this is much simpler and clean up the code.

## How was this patch tested?

Unit test

Author: Wang Gengliang <[email protected]>

Closes apache#20278 from gengliangwang/FixConstraintSimple.
asfgit pushed a commit that referenced this pull request Jan 17, 2018
## What changes were proposed in this pull request?

Previously, PR #19201 fix the problem of non-converging constraints.
After that PR #19149 improve the loop and constraints is inferred only once.
So the problem of non-converging constraints is gone.

However, the case below will fail.

```

spark.range(5).write.saveAsTable("t")
val t = spark.read.table("t")
val left = t.withColumn("xid", $"id" + lit(1)).as("x")
val right = t.withColumnRenamed("id", "xid").as("y")
val df = left.join(right, "xid").filter("id = 3").toDF()
checkAnswer(df, Row(4, 3))

```

Because `aliasMap` replace all the aliased child. See the test case in PR for details.

This PR is to fix this bug by removing useless code for preventing non-converging constraints.
It can be also fixed with #20270, but this is much simpler and clean up the code.

## How was this patch tested?

Unit test

Author: Wang Gengliang <[email protected]>

Closes #20278 from gengliangwang/FixConstraintSimple.

(cherry picked from commit 8598a98)
Signed-off-by: Wenchen Fan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants