From 8d91fb2d7165db9232fe751c084a825c9183df20 Mon Sep 17 00:00:00 2001
From: Yuan Zhou
Date: Mon, 29 Mar 2021 13:05:35 +0800
Subject: [PATCH] mention limits

Signed-off-by: Yuan Zhou
---
 README.md          | 2 +-
 docs/limitation.md | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 0a90a3137..20b0a6f55 100644
--- a/README.md
+++ b/README.md
@@ -101,7 +101,7 @@ orders.createOrReplaceTempView("orders")
 spark.sql("select * from orders where o_orderdate > date '1998-07-26'").show(20000, false)
 ```
 
-The result should showup on Spark console and you can check the DAG diagram with some Columnar Processing stage.
+The result should show up on the Spark console, and you can check the DAG diagram for the Columnar Processing stages. Native SQL engine still lacks some features; please check out the [limitations](./docs/limitation.md).
 
 ## Performance data
 
diff --git a/docs/limitation.md b/docs/limitation.md
index 92887d4ff..a4b66f5e1 100644
--- a/docs/limitation.md
+++ b/docs/limitation.md
@@ -4,11 +4,13 @@
 Native SQL engine currenlty works with Spark 3.0.0 only. There are still some trouble with latest Shuffle/AQE API from Spark 3.0.1, 3.0.2 or 3.1.x.
 
 ## Operator limitations
+All performance-critical operators in TPC-H/TPC-DS should be supported. For unsupported operators, Native SQL engine automatically falls back to the row-based operators in vanilla Spark.
+
 ### Columnar Projection with Filter
 We used 16 bit selection vector for filter so the max batch size need to be < 65536
 
 ### Columnar Sort
-To reduce the peak memory usage, we used smaller data structure(uin16_t). This limits
+Columnar Sort does not support spilling to disk yet. To reduce peak memory usage, we use a smaller data structure (uint16_t), which limits
 - the max batch size to be < 65536
 - the number of batches in one partiton to be < 65536