Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor: Begin porting some window tests to sqllogictests #5199

Merged
merged 1 commit into from
Feb 13, 2023

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Feb 6, 2023

Which issue does this PR close?

Part of #4495

Rationale for this change

While reviewing #5171 I realized that many of the tests are still written directly in rust, which is not bad but figured I would start porting them to sqllogictests so we can avoid adding to the porting work

I think sqllogictests are easier to write (less boilerplate setup) and maintain (no rust recompilation to update output files). This is also the first sqllogictest I know of that has expain plans, which is exciting

What changes are included in this PR?

Port the first few tests from window.rs to sqllogictests

Are these changes tested?

Yes, they are all tests

Are there any user-facing changes?

No

@github-actions github-actions bot added core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Feb 6, 2023
@alamb
Copy link
Contributor Author

alamb commented Feb 6, 2023

cc @mustafasrepo

@alamb alamb marked this pull request as draft February 6, 2023 17:36
@alamb
Copy link
Contributor Author

alamb commented Feb 6, 2023

There is something wrong with the plan diff comparison -- will review shortly

@alamb alamb marked this pull request as ready for review February 12, 2023 21:24
Comment on lines +143 to +147
0 0 220 44 10 0 5
0 1 220 44 10 0 5
0 2 220 44 10 0 5
0 3 220 44 10 0 5
0 4 220 44 10 0 5
Copy link
Contributor

@mustafasrepo mustafasrepo Feb 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the tests that use partitioned data e.g table test. Results are different than the previous version. I guess the reason for that is that previously first row was not used in the calculations (Maybe there were parsed as header.) I think, the previous results were wrong. Is this so? What do you think about it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there were parsed as header.)

Yes I think this is exactly what was happening -- that is a good observation

I think it is ok that the results changed as the point of the test seems to be to cover a broad range of window function results rather than any specific window function or case (as in skipping the header row seems like a oversight in the original test rather than an important behavior to replicate)

@mustafasrepo
Copy link
Contributor

Thanks @alamb for this PR. LGTM!.

ORDER BY c1, c2
LIMIT 5
----
0 0 1 0 0 0 0 1 0 0 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the careful reader will note that the original test does not have any data for c2 = 0 (because I think it uses a slightly different test setup)

Comment on lines +143 to +147
0 0 220 44 10 0 5
0 1 220 44 10 0 5
0 2 220 44 10 0 5
0 3 220 44 10 0 5
0 4 220 44 10 0 5
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe there were parsed as header.)

Yes I think this is exactly what was happening -- that is a good observation

I think it is ok that the results changed as the point of the test seems to be to cover a broad range of window function results rather than any specific window function or case (as in skipping the header row seems like a oversight in the original test rather than an important behavior to replicate)

@alamb alamb merged commit 98dba2b into apache:master Feb 13, 2023
@ursabot
Copy link

ursabot commented Feb 13, 2023

Benchmark runs are scheduled for baseline = 9565887 and contender = 98dba2b. 98dba2b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@alamb alamb deleted the alamb/port_window_tests branch February 18, 2023 17:48
jiangzhx pushed a commit to jiangzhx/arrow-datafusion that referenced this pull request Feb 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants