-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-22361][SQL][TEST] Add unit test for Window Frames #20019
Conversation
ok to test |
Test build #85153 has finished for PR 20019 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a test checking that rangeBetween throws an AnalysisException if it gets a non-numeric column as input, as demonstrated by
Seq("a").toDF("k").select(min('k).over(Window.orderBy('k).rangeBetween(-1, 1))).show
Add a test checking that rangeBetween throws an AnalysisException if it gets a column ordered by multiple order by expressions, as demonstrated by
Seq((1, 2)).toDF("k1", "k2").select(min('k1).over(Window.orderBy('k1, 'k2).rangeBetween(-1, 1))).show
Add tests checking that the above mentioned rules are ignored if range is not bounded, as demonstrated by
Seq((1, 2)).toDF("k1", "k2").select(min('k1).over(Window.orderBy('k1, 'k2).rangeBetween(Window.unboundedPreceding, Window.unboundedFollowing))).show
and
Seq("a").toDF("k").select(min('k).over(Window.orderBy('k).rangeBetween(Window.unboundedPreceding, Window.unboundedFollowing))).show
@smurakozi nice catch, added them. Additionally found a nit. |
Test build #85190 has finished for PR 20019 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the effort! Only some minor issues.
It would also be great to improve the test for window frames parser, which currently is in ExpressionParserSuite.test("window function expressions")
.
assert(e.message.contains("Boundary end is not a valid integer: 2147483648")) | ||
} | ||
|
||
test("range between should accept at most one ORDER BY expression when unbounded") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"unbounded" -> "bounded"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
"window specification with multiple order by expressions")) | ||
} | ||
|
||
test("range between should accept numeric values only when bounded") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a TypeCoercion issue, I'm not sure whether we should cover it here or TypeCoercionSuite
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I bit enhanced this area but not yet moved. What do you think, would it be good idea to move this test there? This is as window related as type coercion.
also cc @gatorsmile whom I believe should be interested to look at this. |
ExpressionParserSuite window related part is a bit reorganized and enhanced. |
Test build #85267 has finished for PR 20019 at commit
|
@@ -263,21 +263,60 @@ class ExpressionParserSuite extends PlanTest { | |||
"sum(product + 1) over (partition by ((product / 2) + 1) order by 2)", | |||
WindowExpression('sum.function('product + 1), | |||
WindowSpecDefinition(Seq('product / 2 + 1), Seq(Literal(2).asc), UnspecifiedFrame))) | |||
} | |||
|
|||
test("range/rows window function expressions") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about window function expressions on various window frames
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just back from holiday. What do you mean exactly here? Range/rows between test with most of the combinations are covered. If you could give an example it would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiangxb1987 gentle ping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the late response. Just update the test case name.
val frameTypes = Seq(("rows", RowFrame), ("range", RangeFrame)) | ||
val boundaries = Seq( | ||
("10 preceding", -Literal(10), CurrentRow), | ||
("unbounded preceding", UnboundedPreceding, CurrentRow), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tiny nit: could you add comments on each group of boundaries to explain what they are trying to cover?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change is on the way...
Test build #85592 has finished for PR 20019 at commit
|
Test build #85803 has finished for PR 20019 at commit
|
retest this please |
Test build #86191 has finished for PR 20019 at commit
|
LGTM |
Thanks! Merged to master/2.3 |
## What changes were proposed in this pull request? There are already quite a few integration tests using window frames, but the unit tests coverage is not ideal. In this PR the already existing tests are reorganized, extended and where gaps found additional cases added. ## How was this patch tested? Automated: Pass the Jenkins. Author: Gabor Somogyi <[email protected]> Closes #20019 from gaborgsomogyi/SPARK-22361. (cherry picked from commit a9b845e) Signed-off-by: gatorsmile <[email protected]>
@gatorsmile @jiangxb1987 @smurakozi Thanks for the help! |
What changes were proposed in this pull request?
There are already quite a few integration tests using window frames, but the unit tests coverage is not ideal.
In this PR the already existing tests are reorganized, extended and where gaps found additional cases added.
How was this patch tested?
Automated: Pass the Jenkins.