Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-10437][SQL] Support aggregation expressions in Order By #8599

Closed
wants to merge 6 commits into from

Conversation

viirya
Copy link
Member

@viirya viirya commented Sep 4, 2015

JIRA: https://issues.apache.org/jira/browse/SPARK-10437

If an expression in SortOrder is a resolved one, such as count(1), the corresponding rule in Analyzer to make it work in order by will not be applied.

@SparkQA
Copy link

SparkQA commented Sep 4, 2015

Test build #41996 has finished for PR 8599 at commit 452cfb5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -561,7 +561,7 @@ class Analyzer(
}

case sort @ Sort(sortOrder, global, aggregate: Aggregate)
if aggregate.resolved && !sort.resolved =>
if aggregate.resolved =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to set up a stop condition for this rule, or something like SELECT a, SUM(b) FROM t GROUP BY a ORDER BY a will go through this rule again and again until reach the fixed point. How about changing the end of this rule to:

if (evaluatedOrderings == sortOrder) {
  sort
} else {
  Project(aggregate.output,
    Sort(evaluatedOrderings, global,
      aggregate.copy(aggregateExpressions = originalAggExprs ++ needsPushDown)))
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I've updated it.

@SparkQA
Copy link

SparkQA commented Sep 4, 2015

Test build #42003 has finished for PR 8599 at commit e65f4db.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

LGTM except a minor comment for test.
cc @marmbrus

@SparkQA
Copy link

SparkQA commented Sep 5, 2015

Test build #42036 has finished for PR 8599 at commit 8f73c40.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member Author

viirya commented Sep 5, 2015

retest this please.

@SparkQA
Copy link

SparkQA commented Sep 5, 2015

Test build #42038 has finished for PR 8599 at commit 8f73c40.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 5, 2015

Test build #42040 has finished for PR 8599 at commit 6023c8b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member Author

viirya commented Sep 5, 2015

retest this please.

1 similar comment
@viirya
Copy link
Member Author

viirya commented Sep 5, 2015

retest this please.

@SparkQA
Copy link

SparkQA commented Sep 5, 2015

Test build #42057 has finished for PR 8599 at commit 6023c8b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class BlockFetchException(messages: String, throwable: Throwable)

@viirya
Copy link
Member Author

viirya commented Sep 6, 2015

ping @liancheng

@viirya
Copy link
Member Author

viirya commented Sep 8, 2015

ping @liancheng @marmbrus

|GROUP BY a
|ORDER BY count(*)
""".stripMargin),
Row(2) :: Row(2) :: Row(2) :: Row(2) :: Nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: And also add more typical unit tests like:
"""
|SELECT a
|FROM orderByData
|GROUP BY a
|ORDER BY a, count(*), sum(b)
""".stripMargin),

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I've added it.

@SparkQA
Copy link

SparkQA commented Sep 9, 2015

Test build #42186 has finished for PR 8599 at commit 046a8cb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Instance(w: Double, a: Vector, b: Double)
    • class WeibullGenerator(

@liancheng
Copy link
Contributor

[SQ] in the PR description should be [SQL].

@viirya viirya changed the title [SPARK-10437][SQ] Support aggregation expressions in Order By [SPARK-10437][SQL] Support aggregation expressions in Order By Sep 9, 2015
@viirya
Copy link
Member Author

viirya commented Sep 9, 2015

@liancheng Updated. Thanks.

@viirya
Copy link
Member Author

viirya commented Sep 10, 2015

@liancheng Any more comment?

@marmbrus
Copy link
Contributor

Thanks, merged to master.

@asfgit asfgit closed this in 841972e Sep 15, 2015
@viirya viirya deleted the orderby-agg branch December 27, 2023 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants