update blaze version 2.0.7-SNAPSHOT (#312)

Co-authored-by: zhangli20 <[email protected]>
kwai · Nov 9, 2023 · 224697d
1 parent 7f2bccf
Showing 10 changed files with 177 additions and 265 deletions.
diff --git a/.github/workflows/tpcds.yml b/.github/workflows/tpcds.yml
@@ -51,15 +51,15 @@ jobs:
         uses: actions/upload-artifact@v3
         with:
           name: blaze-engine-spark303
-          path: target/blaze-engine-spark303-pre-2.0.6-SNAPSHOT.jar
+          path: target/blaze-engine-spark303-pre-*-SNAPSHOT.jar
 
       - name: Build Spark333
         run: mvn package -Ppre -Pspark333
       - name: Upload Spark333
         uses: actions/upload-artifact@v3
         with:
           name: blaze-engine-spark333
-          path: target/blaze-engine-spark333-pre-2.0.6-SNAPSHOT.jar
+          path: target/blaze-engine-spark333-pre-*-SNAPSHOT.jar
 
   setup-spark:
     name: Setup Spark

diff --git a/README.md b/README.md
@@ -114,10 +114,10 @@ comparison with vanilla Spark on TPC-DS 1TB dataset. The benchmark result shows
 Stay tuned and join us for more upcoming thrilling numbers.
 
 Query time:
-![20230925-query-time](./benchmark-results/blaze-query-time-comparison-20230925.png)
+![20231108-query-time](./benchmark-results/blaze-query-time-comparison-20231108.png)
 
 Cluster resources:
-![20230925-resources](./benchmark-results/blaze-executor-time-comparison-20230925.png)
+![20231108-resources](./benchmark-results/blaze-cluster-resources-cost-comparison-20231108.png)
 
 We also encourage you to benchmark Blaze and share the results with us. 🤗
 

diff --git a/RELEASES.md b/RELEASES.md
@@ -0,0 +1,20 @@
+# blaze-v2.0.7
+
+## Features
+* Supports native BroadcastNestedLoopJoinExec.
+* Supports multithread UDF evaluation.
+* Supports spark.files.ignoreCorruptFiles.
+* Supports input batch statistics.
+
+## Performance
+* Improves get_json_object() performance by reducing duplicated json parsing.
+* Improves parquet reading performance by skipping utf-8 validation.
+* Supports cached expression evaluator in native AggExec.
+* Supports column pruning during native evaluation.
+* Prefer native sort even if child is non-native.
+
+## Bugfix
+* Fix missing outputPartitioning in NativeParquetExec.
+* Fix missing native converting checks in parquet scan.
+* Fix inconsistency: implement spark-compatible float to int casting.
+* Avoid closing hadoop fs for reusing in cache.
diff --git a/benchmark-results/20230925.md b/benchmark-results/20230925.md