[SPARK-22895] [SQL] Push down the deterministic predicates that are after the first non-deterministic #20069

Closed
wants to merge 4 commits into from

Conversation

gatorsmile
Member

What changes were proposed in this pull request?

Currently, we do not guarantee the evaluation order of conjuncts in either the Filter or the Join operator. The same is true of mainstream RDBMS vendors such as DB2 and MS SQL Server. Thus, we should also push down the deterministic predicates that come after the first non-deterministic predicate, where possible.

How was this patch tested?

Updated the existing test cases.

@SparkQA

SparkQA commented Dec 24, 2017

Test build #85355 has finished for PR 20069 at commit ad6607c.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

retest this please

@SparkQA

SparkQA commented Dec 24, 2017

Test build #85356 has finished for PR 20069 at commit ad6607c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val (pushDown, stayUp) = {
val pushDownCondition: Expression => Boolean =
p => p.deterministic && !p.references.contains(watermark.eventTime)
if (SQLConf.get.outOfOrderPredicateEvaluationEnabled) {
Member

Nit: Create a function that takes an `Expression => Boolean` parameter together with `predicates: Seq[Expression]`; `partitionByDeterminism` can then call it with `e => e.deterministic` as the function argument.
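
For illustration, a minimal sketch of the kind of helper suggested here. It is not the code from this PR: the names `PredicateSplit`, `splitPredicates`, and `outOfOrderEnabled` are hypothetical, the helper is generic in `A` so that it runs without Spark on the classpath (in Catalyst, `A` would be `Expression`), and the choice of `partition` when the flag is on and `span` when it is off is an assumption about what the two branches do.

```scala
object PredicateSplit {
  // Hypothetical helper: splits conjuncts into (pushDown, stayUp).
  // `canPushDown` plays the role of the Expression => Boolean parameter.
  def splitPredicates[A](
      predicates: Seq[A],
      outOfOrderEnabled: Boolean)(
      canPushDown: A => Boolean): (Seq[A], Seq[A]) = {
    if (outOfOrderEnabled) {
      // Assumption: with the flag on, every qualifying conjunct is pushed down,
      // regardless of its position in the condition.
      predicates.partition(canPushDown)
    } else {
      // Assumption: with the flag off, only the leading run of qualifying
      // conjuncts is pushed down; everything from the first miss stays up.
      predicates.span(canPushDown)
    }
  }
}
```

Under these assumptions, a determinism-only split would pass `e => e.deterministic` as `canPushDown`, while the watermark case could pass the `pushDownCondition` shown in the diff above.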

@SparkQA

SparkQA commented Dec 29, 2017

Test build #85508 has finished for PR 20069 at commit ac8a801.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

Contributor

@jiangxb1987 jiangxb1987 left a comment

LGTM

@@ -768,6 +768,7 @@ object SQLConf {
.booleanConf
.createWithDefault(true)


Contributor

nit: duplicated empty line?

@SparkQA

SparkQA commented Dec 31, 2017

Test build #85550 has finished for PR 20069 at commit fdef44a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

Thanks! Merged to master.

@asfgit asfgit closed this in cfbe11e Dec 31, 2017
@@ -851,7 +851,7 @@ object PushDownPredicate extends Rule[LogicalPlan] with PredicateHelper {

case filter @ Filter(condition, union: Union) =>
// Union could change the rows, so non-deterministic predicate can't be pushed down
- val (pushDown, stayUp) = splitConjunctivePredicates(condition).span(_.deterministic)
+ val (pushDown, stayUp) = splitConjunctivePredicates(condition).partition(_.deterministic)
Member

What does "after the first non-deterministic" mean? Doesn't this simply partition the predicates into deterministic and non-deterministic ones? Does it take the "first" non-deterministic predicate into account at all?

@viirya
IIUC, with span, the deterministic predicates after the first non-deterministic will not be pushed down, but with partition, all deterministic predicates will be pushed down.
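
To make the difference concrete, here is a small standalone sketch using plain Scala collections. `Pred` is a hypothetical stand-in for a Catalyst `Expression` (only the `deterministic` flag matters here), and the three conjuncts are made-up examples rather than anything taken from the PR's tests.

```scala
// Hypothetical stand-in for a Catalyst Expression.
final case class Pred(sql: String, deterministic: Boolean)

object SpanVsPartition {
  // Conjuncts of a condition such as: a = 1 AND rand() < 0.5 AND b = 2
  val conjuncts: Seq[Pred] = Seq(
    Pred("a = 1", deterministic = true),
    Pred("rand() < 0.5", deterministic = false),
    Pred("b = 2", deterministic = true)
  )

  // span: takes the longest deterministic prefix and stops at the first
  // non-deterministic conjunct, so "b = 2" stays above the Union.
  val (spanDown, spanUp) = conjuncts.span(_.deterministic)

  // partition: inspects every conjunct, so "b = 2" is pushed down as well.
  val (partDown, partUp) = conjuncts.partition(_.deterministic)

  def main(args: Array[String]): Unit = {
    println(s"span:      pushDown=${spanDown.map(_.sql)} stayUp=${spanUp.map(_.sql)}")
    println(s"partition: pushDown=${partDown.map(_.sql)} stayUp=${partUp.map(_.sql)}")
  }
}
```

With `span`, only `a = 1` is pushed down; with `partition`, both `a = 1` and `b = 2` are pushed down and only `rand() < 0.5` stays up.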
