-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Executing LocalLimitExec with no column should not return an Err #5709
Conversation
@@ -462,7 +462,8 @@ impl LimitStream { | |||
.iter() | |||
.map(|col| col.slice(0, col.len().min(batch_rows))) | |||
.collect(); | |||
Some(RecordBatch::try_new(batch.schema(), limited_columns).unwrap()) | |||
let options = RecordBatchOptions::new().with_row_count(Option::from(batch_rows)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kazuyukitanimura
extending call RecordBatch
to be called with options gives more flexibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it is nice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution @kazuyukitanimura (and the assist @comphead )
Would it be possible to make an end to end reproducer (either with SQL or with the DataFrame API) to show the problem? It would be nice to ensure there aren't other issues with creating zero column batches.
I tried to reproduce the error and I was not able to:
DataFusion CLI v20.0.0
❯ create table foo(x int);
0 rows in set. Query took 0.004 seconds.
❯ insert into foo values(1);
0 rows in set. Query took 0.003 seconds.
❯ insert into foo values(2);
0 rows in set. Query took 0.002 seconds.
❯ explain select 1 from foo limit 1;
+---------------+---------------------------------------------------+
| plan_type | plan |
+---------------+---------------------------------------------------+
| logical_plan | Projection: Int64(1) |
| | Limit: skip=0, fetch=1 |
| | TableScan: foo projection=[x], fetch=1 |
| physical_plan | ProjectionExec: expr=[1 as Int64(1)] |
| | GlobalLimitExec: skip=0, fetch=1 |
| | MemoryExec: partitions=1, partition_sizes=[2] |
| | |
+---------------+---------------------------------------------------+
2 rows in set. Query took 0.004 seconds.
❯ select 1 from foo limit 1;
+----------+
| Int64(1) |
+----------+
| 1 |
+----------+
@@ -462,7 +462,8 @@ impl LimitStream { | |||
.iter() | |||
.map(|col| col.slice(0, col.len().min(batch_rows))) | |||
.collect(); | |||
Some(RecordBatch::try_new(batch.schema(), limited_columns).unwrap()) | |||
let options = RecordBatchOptions::new().with_row_count(Option::from(batch_rows)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it is nice
Thanks @alamb that will have to be a separate work from this PR. I have some examples, but I need to think through how to remove some proprietary code from that. Additionally I suspect there will be other execs that have the same issue, so I would like make the reproducer generic. Would you mind approving this PR first given this is pretty much the same as #4912 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the real value in an end to end reproducer is so that we don't accidentally break the usecase in some other later refactor
To have end to end reproducer, you need to be able to project empty-column relation (with non-empty row), not sure if DataFusion allows such thing in SQL/DataFrame API. |
Which issue does this PR close?
Closes #5701
Rationale for this change
Bug fix
What changes are included in this PR?
Use
options
to explicitly setrow_count
, closely related / similar solution to #4912Are these changes tested?
Yes
Are there any user-facing changes?
No