Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20273] [SQL] Disallow Non-deterministic Filter push-down into Join Conditions #17585

Closed
wants to merge 1 commit into from

Conversation

gatorsmile
Copy link
Member

@gatorsmile gatorsmile commented Apr 10, 2017

What changes were proposed in this pull request?

sql("SELECT t1.b, rand(0) as r FROM cachedData, cachedData t1 GROUP BY t1.b having r > 0.5").show()

We will get the following error:

Job aborted due to stage failure: Task 1 in stage 4.0 failed 1 times, most recent failure: Lost task 1.0 in stage 4.0 (TID 8, localhost, executor driver): java.lang.NullPointerException
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown Source)
	at org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoinExec$$anonfun$org$apache$spark$sql$execution$joins$BroadcastNestedLoopJoinExec$$boundCondition$1.apply(BroadcastNestedLoopJoinExec.scala:87)
	at org.apache.spark.sql.execution.joins.BroadcastNestedLoopJoinExec$$anonfun$org$apache$spark$sql$execution$joins$BroadcastNestedLoopJoinExec$$boundCondition$1.apply(BroadcastNestedLoopJoinExec.scala:87)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)

Filters could be pushed down to the join conditions by the optimizer rule PushPredicateThroughJoin. However, Analyzer blocks users to add non-deterministics conditions (For details, see the PR #7535).

We should not push down non-deterministic conditions; otherwise, we need to explicitly initialize the non-deterministic expressions. This PR is to simply block it.

How was this patch tested?

Added a test case

@gatorsmile
Copy link
Member Author

cc @cloud-fan

@SparkQA
Copy link

SparkQA commented Apr 10, 2017

Test build #75644 has started for PR 17585 at commit be3fb64.

@cloud-fan
Copy link
Contributor

retest this please

@cloud-fan
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Apr 10, 2017

Test build #75650 has finished for PR 17585 at commit be3fb64.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit asfgit closed this in fd711ea Apr 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants