Make performance of TPCH q15 stable (#4570) #4709
Closed
This is an automated cherry-pick of #4570
Signed-off-by: xufei [email protected]
What problem does this PR solve?
Issue Number: close #4451
Problem Summary:
What is changed and how it works?
Actually, there are 2 possible solutions:
1. Handle the conversion in executeOnBlock.
2. In Aggregator::mergeAndConvertToBlocks, check the result_size_bytes, and if it exceeds the threshold, convert all the hash tables into two-level hash tables.
For the first solution, converting the hash table to a two-level hash table can be done by each thread in the first stage of ParallelAggregating. For the second solution, the conversion is executed in a single thread.
I've done some tests for both solutions and found that the first solution has a ~20% performance gain compared to the second solution for TPCH q15, so I chose the first solution.
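The threshold check in the first solution can be sketched roughly as follows. This is only an illustration, not the TiFlash code: AggregatedData, executeOnBlockSketch, threshold_bytes and NUM_BUCKETS are hypothetical names, std::unordered_map stands in for the real hash tables, the bucket count of 256 is an assumption, and the byte estimate is deliberately crude.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one thread's aggregation state: key -> count(*).
class AggregatedData
{
public:
    static constexpr std::size_t NUM_BUCKETS = 256; // assumed bucket count
    using Table = std::unordered_map<std::int64_t, std::uint64_t>;

    void add(std::int64_t key)
    {
        if (two_level)
            ++buckets[bucketOf(key)][key];
        else
            ++single[key];
    }

    // Number of distinct keys currently stored.
    std::size_t size() const
    {
        if (!two_level)
            return single.size();
        std::size_t n = 0;
        for (const auto & bucket : buckets)
            n += bucket.size();
        return n;
    }

    // Crude estimate (key + counter per entry); a real table would track allocated bytes.
    std::size_t sizeInBytes() const
    {
        return size() * (sizeof(std::int64_t) + sizeof(std::uint64_t));
    }

    bool isTwoLevel() const { return two_level; }

    // Redistribute every entry of the single-level table into its bucket.
    void convertToTwoLevel()
    {
        assert(!two_level);
        for (const auto & [key, count] : single)
            buckets[bucketOf(key)][key] += count;
        single.clear();
        two_level = true;
    }

    std::uint64_t count(std::int64_t key) const
    {
        const Table & table = two_level ? buckets[bucketOf(key)] : single;
        auto it = table.find(key);
        return it == table.end() ? 0 : it->second;
    }

private:
    static std::size_t bucketOf(std::int64_t key)
    {
        return std::hash<std::int64_t>{}(key) % NUM_BUCKETS;
    }

    bool two_level = false;
    Table single;
    std::array<Table, NUM_BUCKETS> buckets;
};

// First-stage step run by each thread on its own data: aggregate one block,
// then convert to two-level once the estimated size reaches the threshold.
void executeOnBlockSketch(AggregatedData & data, const std::vector<std::int64_t> & block_keys, std::size_t threshold_bytes)
{
    for (auto key : block_keys)
        data.add(key);
    if (!data.isTwoLevel() && data.sizeInBytes() >= threshold_bytes)
        data.convertToTwoLevel();
}
```

Because each thread calls executeOnBlockSketch on its own AggregatedData, the conversion work is spread across threads, whereas checking only at merge time would leave it to a single thread.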
Why not enable two-level hash table by default for all the cases:
Testing query:
select count(*),id from test group by id;
The table test has 65536 rows, and all rows have the same id, so the above query returns only 1 row.
The testing cluster has only 1 TiFlash, the CPU has 36 cores, and each query is run 1000 times.
Check List
Tests
Side effects
Documentation
Release note