From 79855b92f5868ab9dee9b437bec374b416cf05e1 Mon Sep 17 00:00:00 2001
From: guo-shaoge
Date: Thu, 1 Aug 2024 15:44:32 +0800
Subject: [PATCH 1/7] system-variables: add tiflash_hashagg_preaggregation_mode

Signed-off-by: guo-shaoge
---
 system-variables.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/system-variables.md b/system-variables.md
index f27cb58b8f76e..936eef193f3bc 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -5866,6 +5866,24 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
+### tiflash_hashagg_preaggregation_mode New in v8.3.0
+
+- Scope: SESSION | GLOBAL
+- Persists to cluster: Yes
+- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): Yes
+- Type: String
+- Default value: `auto`
+- Value options: `auto`, `force_streaming`, `force_preagg`
+- This variable is used to control the pre-aggregation policy for the first stage of two-staged or three-staged HashAgg pushed down to TiFlash:
+    - `force_preagg`: TiFlash will enforce pre-aggregation in the first stage of HashAgg.
+    - `force_streaming`: TiFlash will directly pass the data to the next stage of HashAgg without pre-aggregation.
+    - `auto`: TiFlash will automatically decide whether to perform pre-aggregation based on the observed workload's reduction rate.
+
+> **Note:**
+>
+> - This variable only takes effect when [`tiflash_mem_quota_query_per_node`](/system-variables.md#tiflash_mem_quota_query_per_node-new-in-v740) is greater than `0`. In other words, if [tiflash_mem_quota_query_per_node](/system-variables.md#tiflash_mem_quota_query_per_node-new-in-v740) is `0` or `-1`, query-level spilling will not be enabled even if `tiflash_query_spill_ratio` is greater than `0`.
+> - When TiFlash query-level spilling is enabled, the spilling thresholds for individual TiFlash operators automatically become invalidated. In other words, if both [`tiflash_mem_quota_query_per_node`](/system-variables.md#tiflash_mem_quota_query_per_node-new-in-v740) and `tiflash_query_spill_ratio` are greater than 0, the three variables [tidb_max_bytes_before_tiflash_external_sort](/system-variables.md#tidb_max_bytes_before_tiflash_external_sort-new-in-v700), [tidb_max_bytes_before_tiflash_external_group_by](/system-variables.md#tidb_max_bytes_before_tiflash_external_group_by-new-in-v700), and [tidb_max_bytes_before_tiflash_external_join](/system-variables.md#tidb_max_bytes_before_tiflash_external_join-new-in-v700) become invalidated automatically, equivalent to setting them to `0`.
+
 ### tikv_client_read_timeout New in v7.4.0

 - Scope: SESSION | GLOBAL

From 8c44c263804370f81b562bc20c5f7ead75b92a3c Mon Sep 17 00:00:00 2001
From: guo-shaoge
Date: Thu, 1 Aug 2024 15:50:57 +0800
Subject: [PATCH 2/7] fix

Signed-off-by: guo-shaoge
---
 system-variables.md | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/system-variables.md b/system-variables.md
index 936eef193f3bc..07873274c1a6e 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -5879,11 +5879,6 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
     - `force_streaming`: TiFlash will directly pass the data to the next stage of HashAgg without pre-aggregation.
     - `auto`: TiFlash will automatically decide whether to perform pre-aggregation based on the observed workload's reduction rate.

-> **Note:**
->
-> - This variable only takes effect when [`tiflash_mem_quota_query_per_node`](/system-variables.md#tiflash_mem_quota_query_per_node-new-in-v740) is greater than `0`. In other words, if [tiflash_mem_quota_query_per_node](/system-variables.md#tiflash_mem_quota_query_per_node-new-in-v740) is `0` or `-1`, query-level spilling will not be enabled even if `tiflash_query_spill_ratio` is greater than `0`.
-> - When TiFlash query-level spilling is enabled, the spilling thresholds for individual TiFlash operators automatically become invalidated. In other words, if both [`tiflash_mem_quota_query_per_node`](/system-variables.md#tiflash_mem_quota_query_per_node-new-in-v740) and `tiflash_query_spill_ratio` are greater than 0, the three variables [tidb_max_bytes_before_tiflash_external_sort](/system-variables.md#tidb_max_bytes_before_tiflash_external_sort-new-in-v700), [tidb_max_bytes_before_tiflash_external_group_by](/system-variables.md#tidb_max_bytes_before_tiflash_external_group_by-new-in-v700), and [tidb_max_bytes_before_tiflash_external_join](/system-variables.md#tidb_max_bytes_before_tiflash_external_join-new-in-v700) become invalidated automatically, equivalent to setting them to `0`.
-
 ### tikv_client_read_timeout New in v7.4.0

 - Scope: SESSION | GLOBAL

From a64e97520a63f233c2ae102805eb9c01fdac6c6a Mon Sep 17 00:00:00 2001
From: guo-shaoge
Date: Tue, 6 Aug 2024 16:11:38 +0800
Subject: [PATCH 3/7] fix def

Signed-off-by: guo-shaoge
---
 system-variables.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/system-variables.md b/system-variables.md
index 07873274c1a6e..f8caa9c5c9dd5 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -5872,7 +5872,7 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
 - Persists to cluster: Yes
 - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): Yes
 - Type: String
-- Default value: `auto`
+- Default value: `force_preagg`
 - Value options: `auto`, `force_streaming`, `force_preagg`
 - This variable is used to control the pre-aggregation policy for the first stage of two-staged or three-staged HashAgg pushed down to TiFlash:
     - `force_preagg`: TiFlash will enforce pre-aggregation in the first stage of HashAgg.

From d1cf817890ac34850ab180fdefa20b29487549bf Mon Sep 17 00:00:00 2001
From: guo-shaoge
Date: Tue, 6 Aug 2024 16:53:45 +0800
Subject: [PATCH 4/7] refine

Signed-off-by: guo-shaoge
---
 system-variables.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/system-variables.md b/system-variables.md
index f8caa9c5c9dd5..77776a4e36089 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -5871,11 +5871,11 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
 - Scope: SESSION | GLOBAL
 - Persists to cluster: Yes
 - Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): Yes
-- Type: String
+- Type: Enumeration
 - Default value: `force_preagg`
-- Value options: `auto`, `force_streaming`, `force_preagg`
+- Value options: `force_preagg`, `force_streaming`, `auto`
 - This variable is used to control the pre-aggregation policy for the first stage of two-staged or three-staged HashAgg pushed down to TiFlash:
-    - `force_preagg`: TiFlash will enforce pre-aggregation in the first stage of HashAgg.
+    - `force_preagg`: TiFlash will enforce pre-aggregation in the first stage of HashAgg. This was the equivalent behavior of TiFlash before this variable was introduced.
     - `force_streaming`: TiFlash will directly pass the data to the next stage of HashAgg without pre-aggregation.
     - `auto`: TiFlash will automatically decide whether to perform pre-aggregation based on the observed workload's reduction rate.

From 7d2e8d7b0c4d085ff9730cc2e58386faf9f1ef2b Mon Sep 17 00:00:00 2001
From: guo-shaoge
Date: Wed, 7 Aug 2024 17:09:29 +0800
Subject: [PATCH 5/7] refine

Signed-off-by: guo-shaoge
---
 system-variables.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/system-variables.md b/system-variables.md
index 77776a4e36089..062261d32a894 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -5874,10 +5874,10 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
 - Type: Enumeration
 - Default value: `force_preagg`
 - Value options: `force_preagg`, `force_streaming`, `auto`
-- This variable is used to control the pre-aggregation policy for the first stage of two-staged or three-staged HashAgg pushed down to TiFlash:
-    - `force_preagg`: TiFlash will enforce pre-aggregation in the first stage of HashAgg. This was the equivalent behavior of TiFlash before this variable was introduced.
-    - `force_streaming`: TiFlash will directly pass the data to the next stage of HashAgg without pre-aggregation.
-    - `auto`: TiFlash will automatically decide whether to perform pre-aggregation based on the observed workload's reduction rate.
+- This variable controls which pre-aggregation strategy is used in the first stage of a two-stage or three-stage HashAgg pushed down to TiFlash:
+    - `force_preagg`: TiFlash forces pre-aggregation in the first stage of HashAgg, same with behavior prior to version v8.3.0.
+    - `force_streaming`: TiFlash directly sends data to the next stage of HashAgg without pre-aggregation.
+    - `auto`: TiFlash automatically selects whether to perform pre-aggregation based on the current workload's aggregation degree.

 ### tikv_client_read_timeout New in v7.4.0

From 83921101b0272a6da1205ebe95a1b5fe78cfcf06 Mon Sep 17 00:00:00 2001
From: Aolin
Date: Wed, 7 Aug 2024 18:07:59 +0800
Subject: [PATCH 6/7] refine wording

---
 system-variables.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/system-variables.md b/system-variables.md
index 062261d32a894..2e0ae5a47b3fd 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -5874,8 +5874,8 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
 - Type: Enumeration
 - Default value: `force_preagg`
 - Value options: `force_preagg`, `force_streaming`, `auto`
-- This variable controls which pre-aggregation strategy is used in the first stage of a two-stage or three-stage HashAgg pushed down to TiFlash:
-    - `force_preagg`: TiFlash forces pre-aggregation in the first stage of HashAgg, same with behavior prior to version v8.3.0.
+- This variable controls the pre-aggregation strategy used in the first stage of two-stage or three-stage HashAgg operations pushed down to TiFlash:
+    - `force_preagg`: TiFlash forces pre-aggregation in the first stage of HashAgg. This behavior is consistent with the behavior before v8.3.0.
     - `force_streaming`: TiFlash directly sends data to the next stage of HashAgg without pre-aggregation.
     - `auto`: TiFlash automatically selects whether to perform pre-aggregation based on the current workload's aggregation degree.

From 1e36de3464d18dbd578d91e057b53c0b54437fca Mon Sep 17 00:00:00 2001
From: Aolin
Date: Wed, 14 Aug 2024 12:30:26 +0800
Subject: [PATCH 7/7] Apply suggestions from code review

Co-authored-by: Grace Cai
---
 system-variables.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/system-variables.md b/system-variables.md
index 2e0ae5a47b3fd..4b2e2dc999f46 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -5874,10 +5874,10 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).
 - Type: Enumeration
 - Default value: `force_preagg`
 - Value options: `force_preagg`, `force_streaming`, `auto`
-- This variable controls the pre-aggregation strategy used in the first stage of two-stage or three-stage HashAgg operations pushed down to TiFlash:
-    - `force_preagg`: TiFlash forces pre-aggregation in the first stage of HashAgg. This behavior is consistent with the behavior before v8.3.0.
+- This variable controls the pre-aggregation strategy used during the first stage of two-stage or three-stage HashAgg operations pushed down to TiFlash:
+    - `force_preagg`: TiFlash forces pre-aggregation during the first stage of HashAgg. This behavior is consistent with the behavior before v8.3.0.
     - `force_streaming`: TiFlash directly sends data to the next stage of HashAgg without pre-aggregation.
-    - `auto`: TiFlash automatically selects whether to perform pre-aggregation based on the current workload's aggregation degree.
+    - `auto`: TiFlash automatically chooses whether to perform pre-aggregation based on the current workload's aggregation degree.

 ### tikv_client_read_timeout New in v7.4.0