Execute plans during plan tests to make sure plans are valid and executable. #8230

mustafasrepo · 2023-11-16T08:38:42Z

Is your feature request related to a problem or challenge?

In the link, @alamb suggested that adding a run pass to the plans would be better to make sure that plan is indeed runnable. I think this feature is really nice. We can add this support to the tests under following files:

replace_with_order_preserving_variants.rs
enforce_distirbution.rs
enforce_sorting.rs

Describe the solution you'd like

I think, we can execute optimized ExecutionPlans, inside assert_optimized! macro after checking that plan is as expected. After collecting all the result (preferably with single row) with collect(plan, task_ctx).await? we would expect success at the output. If there is any inconsistency in the schema, execution will show errors.

I think for this task we can use EmptyExec with single row at the source.

Describe alternatives you've considered

No response

Additional context

No response

The text was updated successfully, but these errors were encountered:

Lordworms · 2024-01-22T01:15:57Z

May I take a try at it? Since I am new here, maybe one week should be enough.

mustafasrepo · 2024-01-22T10:07:51Z

May I take a try at it? Since I am new here, maybe one week should be enough.

Sure.

findepi · 2024-07-03T20:25:03Z

I tried to do this by appending the following at the end of assert_optimized definition in replace_with_order_preserving_variants.rs:

@@ -399,10 +405,17 @@ mod tests {
             let actual = get_plan_string(&optimized_physical_plan);
             assert_eq!(
                 expected_optimized_lines, actual,
                 "\n**Optimized Plan Mismatch\n\nexpected:\n\n{expected_optimized_lines:#?}\nactual:\n\n{actual:#?}\n\n"
             );
+
+            let ctx = SessionContext::new();
+            let object_store = InMemory::new();
+            object_store.put(&object_store::path::Path::from("file_path"), bytes::Bytes::from("").into()).await?;
+            ctx.register_object_store(&Url::parse("test://").unwrap(), Arc::new(object_store));
+            let task_ctx = Arc::new(TaskContext::from(&ctx));
+            collect(optimized_physical_plan, task_ctx).await?;
         };
     }

but this leads to internal error: entered unreachable code errors from

datafusion/datafusion/core/src/test/mod.rs

Line 300 in 5501e8e

unreachable!()

and partition not used yet errors from

datafusion/datafusion/physical-plan/src/repartition/mod.rs

Line 599 in b76c1b7

.expect("partition not used yet");

Clearly the context setup is not sufficient to execute the plans. @mustafasrepo would you be willing to give me a hand and suggest what I am doing wrong?

mustafasrepo · 2024-07-04T06:09:02Z

I think, the reason for these bugs is that. Source doesn't emit any data (Or doesn't expect to be called.). I think, we might need to update sources to support generating data. For doing so we need to update
csv_exec_sorted and stream_exec_ordered implementations to generate sources that can emit data. One possible approach would be updating csv_exec_sorted such that it generates a MemoryExec with single row (Method could be renamed to memory_exec_sorted. The reason I propose MemoryExec is that it is easier to give data during initialization for this operator).

However, corresponding change might be a bit tricky for StreamingTableExec since its stream is mock and doesn't support any data generation. Hence I propose to execute tests with actual data for just bounded cases (e.g. cases where source is constructed with csv_exec_sorted call.)

mustafasrepo added enhancement New feature or request good first issue Good for newcomers labels Nov 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Execute plans during plan tests to make sure plans are valid and executable. #8230

Execute plans during plan tests to make sure plans are valid and executable. #8230

mustafasrepo commented Nov 16, 2023 •

edited

Loading

Lordworms commented Jan 22, 2024

mustafasrepo commented Jan 22, 2024

findepi commented Jul 3, 2024

mustafasrepo commented Jul 4, 2024

Execute plans during plan tests to make sure plans are valid and executable. #8230

Execute plans during plan tests to make sure plans are valid and executable. #8230

Comments

mustafasrepo commented Nov 16, 2023 • edited Loading

Is your feature request related to a problem or challenge?

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Lordworms commented Jan 22, 2024

mustafasrepo commented Jan 22, 2024

findepi commented Jul 3, 2024

mustafasrepo commented Jul 4, 2024

mustafasrepo commented Nov 16, 2023 •

edited

Loading