Fix DataFusion test and try to make ballista compile #4

yjshen · 2021-09-17T04:24:28Z

No description provided.

into arrow2-merge

yjshen · 2021-09-17T09:37:11Z

datafusion/src/physical_plan/expressions/try_cast.rs

@@ -236,7 +236,7 @@ mod tests {
    #[test]
    fn invalid_cast() {
        // Ensure a useful error happens at plan time if invalid casts are used
-        let schema = Schema::new(vec![Field::new("a", DataType::Int32, false)]);
+        let schema = Schema::new(vec![Field::new("a", DataType::Null, false)]);


Int32 -> LargeBinary is a valid cast in arrow2, so I changed the test to use Null, which is still rejected.
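
For readers unfamiliar with the test, here is a minimal sketch of what "a useful error at plan time" means. The `Ty` enum, `can_cast`, and `plan_try_cast` are hypothetical stand-ins, not arrow2 or DataFusion APIs, and the rule table models only the two cases mentioned in the comment above (Int32 -> LargeBinary accepted, Null -> LargeBinary rejected).

```rust
// Hypothetical stand-ins for illustration; this is not arrow2's DataType
// and the rule table below is not arrow2's real cast matrix.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    Null,
    Int32,
    LargeBinary,
}

/// Toy rule table: identity casts plus Int32 -> LargeBinary are accepted.
fn can_cast(from: Ty, to: Ty) -> bool {
    from == to || matches!((from, to), (Ty::Int32, Ty::LargeBinary))
}

/// Validate the cast while the plan is built, so the error surfaces
/// before any data is read.
fn plan_try_cast(from: Ty, to: Ty) -> Result<(Ty, Ty), String> {
    if can_cast(from, to) {
        Ok((from, to))
    } else {
        Err(format!("Unsupported cast from {:?} to {:?}", from, to))
    }
}

fn main() {
    // The old test input no longer fails, so it cannot exercise the error path.
    assert!(plan_try_cast(Ty::Int32, Ty::LargeBinary).is_ok());
    // The new test input is still rejected when the plan is built.
    let err = plan_try_cast(Ty::Null, Ty::LargeBinary).unwrap_err();
    println!("{}", err);
}
```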

yjshen · 2021-09-17T09:38:34Z

datafusion/src/physical_plan/sort_preserving_merge.rs

@@ -965,7 +965,7 @@ mod tests {
                options: Default::default(),
            },
            PhysicalSortExpr {
-                expr: col("c7", &schema).unwrap(),
+                expr: col("c12", &schema).unwrap(),


The original sort key c7 is not distinctive enough: the same value appears many times, so the order of tied rows is not well defined.

alamb · 2021-09-19

👍 The same thing happens in arrow 6.0 (as pointed out by @houqp). More details here: https://github.com/apache/arrow-datafusion/pull/984/files#r705557467
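
A minimal sketch of why ties in the sort key matter, assuming the test compares row order between two differently executed sorts of the same data. The rows and the `Row` layout are made up for illustration; they are not the values of c7 or c12 in the test file.

```rust
// (low-cardinality key, high-cardinality key, row id); all values are invented.
type Row = (u8, f64, u32);

/// Sort by the key that contains duplicates. The sort is stable, so tied
/// rows keep whatever order they arrived in.
fn ids_sorted_by_low(mut rows: Vec<Row>) -> Vec<u32> {
    rows.sort_by_key(|r| r.0);
    rows.iter().map(|r| r.2).collect()
}

/// Sort by the key whose values are all distinct.
fn ids_sorted_by_high(mut rows: Vec<Row>) -> Vec<u32> {
    rows.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    rows.iter().map(|r| r.2).collect()
}

fn main() {
    let a: Vec<Row> = vec![(1, 0.9, 10), (1, 0.2, 11), (2, 0.5, 12)];
    // Same rows, different arrival order (e.g. partitions read in another order).
    let mut b = a.clone();
    b.swap(0, 1);

    // With a tied key the final row order depends on the arrival order.
    assert_ne!(ids_sorted_by_low(a.clone()), ids_sorted_by_low(b.clone()));
    // With a distinct key the final row order is fully determined.
    assert_eq!(ids_sorted_by_high(a), ids_sorted_by_high(b));
}
```

Both orderings produced by the tied key are valid sorts, which is why an exact row-by-row comparison is only safe on a key without duplicates.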

yjshen · 2021-09-17T09:39:46Z

datafusion/src/execution/dataframe_impl.rs

-                "| e  | 0.01479305307777301         | 0.9965400387585364          | 0.48600669271341534         | 10.206140546981722          | 21                            | 21                                     |",
+                "| c  | 0.0494924465469434          | 0.991517828651004           | 0.6600456536439785          | 13.860958726523547          | 21                            | 21                                     |",
+                "| d  | 0.061029375346466685        | 0.9748360509016578          | 0.48855379387549835         | 8.79396828975897            | 18                            | 18                                     |",
+                "| e  | 0.01479305307777301         | 0.9965400387585364          | 0.48600669271341557         | 10.206140546981727          | 21                            | 21                                     |",


I think this is due to floating-point inaccuracy, so the difference is acceptable.
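
A small sketch of the effect, assuming the two arrow implementations accumulate values in a different order (that cause is a guess, not something confirmed in this thread). Floating-point addition is not associative, so a different order can change the last digits. The `approx_eq` helper and its tolerance are illustrative choices, not part of DataFusion.

```rust
/// Relative comparison; the tolerance is an arbitrary illustrative choice.
fn approx_eq(a: f64, b: f64, rel_tol: f64) -> bool {
    let scale = a.abs().max(b.abs()).max(f64::MIN_POSITIVE);
    (a - b).abs() <= rel_tol * scale
}

fn main() {
    // The same three addends, grouped differently.
    let left = (0.1_f64 + 0.2) + 0.3;
    let right = 0.1_f64 + (0.2 + 0.3);
    assert_ne!(left, right); // the results are not bit-identical
    assert!(approx_eq(left, right, 1e-12)); // but they agree within tolerance

    // The old and new values for row "e" in the diff above differ only
    // in the last few digits.
    assert!(approx_eq(0.48600669271341534, 0.48600669271341557, 1e-12));
    assert!(approx_eq(10.206140546981722, 10.206140546981727, 1e-12));
}
```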

houqp · 2021-09-18T05:05:10Z

datafusion/src/physical_optimizer/pruning.rs

@@ -1394,7 +1394,7 @@ mod tests {
        let expr = col("b1").not().eq(lit(true));
        let p = PruningPredicate::try_new(&expr, schema).unwrap();
        let result = p.prune(&statistics).unwrap();
-        assert_eq!(result, vec![true, false, false, true, true]);
+        assert_eq!(result, vec![true, true, false, true, true]);


Interesting, looks like arrow2 fixed a bug that exists in arrow-rs?

yjshen · 2021-09-18

I don't have much background on this. Since this test covers pruning of a != true, I think the current behavior is expected.
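
To make the expected result easier to reason about, here is a simplified model of pruning for `NOT b1 = true`. It is not DataFusion's `PruningPredicate` implementation: `keep_for_not_b1` and `prune_all` are hypothetical helpers, nulls are ignored, and the (min, max) statistics are assumed for illustration rather than copied from the test. A container can be skipped only when its statistics prove that no row has `b1 = false`.

```rust
/// Decide whether a container must be kept for the predicate `NOT b1 = true`.
/// `min` is the container's minimum value of b1; `None` means unknown.
/// Because false < true, a minimum of `true` proves that every row is true.
fn keep_for_not_b1(min: Option<bool>) -> bool {
    match min {
        Some(true) => false, // all rows are true: no row can match, safe to prune
        Some(false) => true, // at least one false row exists: must keep
        None => true,        // unknown statistics: cannot prune
    }
}

/// Apply the rule to a list of per-container (min, max) statistics.
fn prune_all(stats: &[(Option<bool>, Option<bool>)]) -> Vec<bool> {
    stats.iter().map(|(min, _max)| keep_for_not_b1(*min)).collect()
}

fn main() {
    // Hypothetical (min, max) statistics for five containers.
    let stats = [
        (Some(false), Some(false)),
        (Some(false), Some(true)),
        (Some(true), Some(true)),
        (None, None),
        (Some(false), None),
    ];
    assert_eq!(prune_all(&stats), vec![true, true, false, true, true]);
}
```

Under these assumed statistics, the second container (min false, max true) may hold a false row and has to be kept, which agrees with the updated assertion.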

* wip
* more
* Make scalar.rs compile
* Fix various compilation error due to API difference
* Make datafusion core compile
* fmt
* wip
* wip: compile ballista
* Pass all datafusion tests
* Compile ballista

yjshen added 11 commits September 13, 2021 22:43

wip

a02bdab

more

b0ea1a8

Merge remote-tracking branch 'qp/arrow2-merge' into arrow2-merge

cd62984

Make scalar.rs compile

1231465

Merge remote-tracking branch 'qp/arrow2-merge' into arrow2-merge

94fd251

Fix various compilation error due to API difference

71b36aa

Make datafusion core compile

0b70766

fmt

932f7df

wip

1d00e0b

wip: compile ballista

d2ed5ce

Merge branch 'arrow2-merge' of https://github.com/houqp/arrow-datafusion into arrow2-merge

a5326c5

github-actions bot added the ballista label Sep 17, 2021

Pass all datafusion tests

00b7711

github-actions bot added the datafusion label Sep 17, 2021

yjshen commented Sep 17, 2021

Compile ballista

1eb510a

yjshen changed the title ~~WIP: to make ballista compile~~ Fix test and try to make ballista compile Sep 17, 2021

yjshen changed the title ~~Fix test and try to make ballista compile~~ Fix DataFusion test and try to make ballista compile Sep 17, 2021

houqp reviewed Sep 18, 2021

houqp mentioned this pull request Sep 18, 2021

Update DataFusion to arrow 6.0 apache/datafusion#984

Merged

houqp merged commit caf5b22 into houqp:arrow2-merge Sep 18, 2021


yjshen commented Sep 17, 2021

yjshen Sep 17, 2021

yjshen Sep 17, 2021

alamb Sep 19, 2021

yjshen Sep 17, 2021

houqp Sep 18, 2021

yjshen Sep 18, 2021
