update blaze version 2.0.7-SNAPSHOT (#312)
Co-authored-by: zhangli20 <[email protected]>
richox and zhangli20 authored Nov 9, 2023
1 parent 7f2bccf commit 224697d
Showing 10 changed files with 177 additions and 265 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/tpcds.yml
@@ -51,15 +51,15 @@ jobs:
uses: actions/upload-artifact@v3
with:
name: blaze-engine-spark303
- path: target/blaze-engine-spark303-pre-2.0.6-SNAPSHOT.jar
+ path: target/blaze-engine-spark303-pre-*-SNAPSHOT.jar

- name: Build Spark333
run: mvn package -Ppre -Pspark333
- name: Upload Spark333
uses: actions/upload-artifact@v3
with:
name: blaze-engine-spark333
- path: target/blaze-engine-spark333-pre-2.0.6-SNAPSHOT.jar
+ path: target/blaze-engine-spark333-pre-*-SNAPSHOT.jar

setup-spark:
name: Setup Spark
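The workflow change above replaces a hard-coded version in the artifact path with a `*` wildcard, so the upload step no longer needs editing on each version bump. A minimal sketch of that idea follows, using Python's `fnmatch` as a stand-in for the action's own glob matching (an assumption: the two matchers are not identical in general, but they agree for a single `*` within a file name). The helper name `matches_artifact` and the candidate file list are hypothetical.

```python
from fnmatch import fnmatch

# Pattern taken from the updated Spark303 upload step.
PATTERN = "target/blaze-engine-spark303-pre-*-SNAPSHOT.jar"


def matches_artifact(path: str, pattern: str = PATTERN) -> bool:
    """Return True if `path` matches the shell-style wildcard pattern."""
    return fnmatch(path, pattern)


# Hypothetical build outputs, used only for illustration.
candidates = [
    "target/blaze-engine-spark303-pre-2.0.6-SNAPSHOT.jar",
    "target/blaze-engine-spark303-pre-2.0.7-SNAPSHOT.jar",
    "target/blaze-engine-spark333-pre-2.0.7-SNAPSHOT.jar",
]

# Keep only the jars the Spark303 pattern would pick up.
matched = [p for p in candidates if matches_artifact(p)]
```

Both Spark303 jars match regardless of version, while the Spark333 jar does not, which is why each profile keeps its own pattern.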
4 changes: 2 additions & 2 deletions README.md
@@ -114,10 +114,10 @@ comparison with vanilla Spark on TPC-DS 1TB dataset. The benchmark result shows
Stay tuned and join us for more upcoming thrilling numbers.

Query time:
- ![20230925-query-time](./benchmark-results/blaze-query-time-comparison-20230925.png)
+ ![20231108-query-time](./benchmark-results/blaze-query-time-comparison-20231108.png)

Cluster resources:
- ![20230925-resources](./benchmark-results/blaze-executor-time-comparison-20230925.png)
+ ![20231108-resources](./benchmark-results/blaze-cluster-resources-cost-comparison-20231108.png)

We also encourage you to benchmark Blaze and share the results with us. 🤗

20 changes: 20 additions & 0 deletions RELEASES.md
@@ -0,0 +1,20 @@
# blaze-v2.0.7

## Features
* Supports native BroadcastNestedLoopJoinExec.
* Supports multithread UDF evaluation.
* Supports spark.files.ignoreCorruptFiles.
* Supports input batch statistics.

## Performance
* Improves get_json_object() performance by reducing duplicated json parsing.
* Improves parquet reading performance by skipping utf-8 validation.
* Supports cached expression evaluator in native AggExec.
* Supports column pruning during native evaluation.
* Prefer native sort even if child is non-native.

## Bugfix
* Fix missing outputPartitioning in NativeParquetExec.
* Fix missing native converting checks in parquet scan.
* Fix inconsistency: implement spark-compatible float to int casting.
* Avoid closing hadoop fs for reusing in cache.
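The float-to-int bugfix listed above does not say what the inconsistency was. As a rough illustration only, the sketch below assumes that "spark-compatible" refers to Spark's non-ANSI cast, which follows the JVM narrowing conversion: NaN becomes 0, out-of-range values saturate at the 32-bit limits, and everything else truncates toward zero. The function name `jvm_float_to_int` is hypothetical and is not taken from the Blaze code base.

```python
import math

# Bounds of a signed 32-bit integer (Spark's IntegerType).
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def jvm_float_to_int(x: float) -> int:
    """Cast a float to a 32-bit int using JVM-style narrowing rules (assumed)."""
    if math.isnan(x):
        return 0              # NaN maps to 0
    if x >= INT_MAX:
        return INT_MAX        # saturate on positive overflow, including +inf
    if x <= INT_MIN:
        return INT_MIN        # saturate on negative overflow, including -inf
    return math.trunc(x)      # otherwise truncate toward zero
```

A native engine whose cast returned null or wrapped around on overflow would disagree with these rules on NaN and out-of-range inputs, which is the kind of mismatch such a fix would target.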
260 changes: 0 additions & 260 deletions benchmark-results/20230925.md

This file was deleted.