Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize exchange operator & Support mpp version #6596

Merged
merged 104 commits into from
Jan 30, 2023

Conversation

solotzg
Copy link
Contributor

@solotzg solotzg commented Jan 6, 2023

What problem does this PR solve?

Issue Number: close #6620

Problem Summary:

  • Optimize exchange operator
  • Support mpp-version in mpp tasks

What is changed and how it works?

Base Modules & Utils

  • add -march=haswell flag for a few modules
  • add avx2_byte_count to optimize size_t countBytesInFilter(const UInt8 * filt, size_t sz).
  • optimize a few implementations for ColumnsCommon.cpp, DataTypeString.cpp,
  • use std::chrono::steady_clock instead of system_clock in a few modules

Mpp Version Check

TiFlash will check mpp-version in FlashService::EstablishMPPConnection, FlashService::CancelMPPTask, FlashService::EstablishMPPConnection.

if failed to check mpp-version is valid, tiflash will return grpc error grpc::StatusCode::CANCELLED with message like:

ERROR 1105 (HY000): rpc error: code = Canceled desc = Failed to handling mpp dispatch request, reason=`Invalid mpp version -1, TiFlash expects version: min 0, max 1, should upgrade TiDB/planner`

Hash Partition Writer

            // compression method flag; NONE = 0x02, LZ4 = 0x82, ZSTD = 0x90
            // ... 
            // header meta:
            //     columns count;
            //     total row count (multi parts);
            //     for each column:
            //         column name;
            //         column type;
            // for each part:
            //     row count;
            //     columns data;
  • HashPartitionWriterV1 supports compression mode:
    • NONE: no compression
      • if target tunnel model is TunnelSenderMode::LOCAL, must use method NONE.
    • FAST: fast compression/decompression speed, use method LZ4.
    • HIGH_COMPRESSION: high compression (HC) ratio mode, use method ZSTD.
  • NewMPPExchangeWriter will generate HashPartitionWriterV1 when exchange type is tipb::ExchangeType::Hash and mpp-version of dag context is not 0.
  • ExchangeReceiverBase<RPCContext>::decodeChunks will check version of mpp::MPPDataPacket
    • MPPDataPacketV0: use previous way.
    • MPPDataPacketV1: use new format. check first byte(compression method flag) to determine if data is compressed.

Grafana

Add panel Exchange Bytes/Seconds in Grafana

  • hash_none_compression_remote
  • hash_none_compression_local
  • hash_lz4_compression
  • hash_zstd_compression
  • hash_original: original data size sent by hash exchange
    • hash_original_all ≈ hash_none_remote + hash_none_local + hash_lz4 * lz4-compress-rate + hash_zstd * zstd-compress-rate
  • broadcast_passthrough_original
  • broadcast_passthrough_none_compression_local
  • broadcast_passthrough_none_compression_remote

image

Config

[profiles]
[profiles.default]
batch_send_min_limit_compression = -1 # default minimal chunk size of exchanging data among TiFlash when using data compression

If batch_send_min_limit_compression LT 0, HashPartitionWriterV1 will use 8192 * partition_num.

Benchmark

Notice: in the master branch, #6713 will cause performance regression, try to resolve it and get benchmark below

Time(s) NONE FAST(LZ4) HC(ZSTD) Improvement: (Original) / (FAST) - 1.0 Improvement: (Original) / (HC) - 1.0
Q1 3.25 2.92 2.99 11.30% 8.70%
Q2 1.44 1.24 1.51 16.13% -4.64%
Q3 3.39 3.12 3.05 8.65% 11.15%
Q4 2.65 2.58 2.52 2.71% 5.16%
Q5 8.09 6.88 7.55 17.59% 7.15%
Q6 0.7 0.7 0.7 0.00% 0.00%
Q7 3.39 2.99 3.39 13.38% 0.00%
Q8 6.01 5.27 4.93 14.04% 21.91%
Q9 22.31 19.03 21.71 17.24% 2.76%
Q10 4.6 4.4 4.46 4.55% 3.14%
Q11 1.24 0.91 0.91 36.26% 36.26%
Q12 1.71 1.78 1.71 -3.93% 0.00%
Q13 3.79 3.52 3.72 7.67% 1.88%
Q14 0.84 0.84 0.84 0.00% 0.00%
Q15 1.85 1.85 1.78 0.00% 3.93%
Q16 1.11 1.11 1.17 0.00% -5.13%
Q17 7.21 6.95 7.28 3.74% -0.96%
Q18 8.62 7.42 8.42 16.17% 2.38%
Q19 1.85 1.85 1.98 0.00% -6.57%
Q20 1.44 1.38 1.51 4.35% -4.64%
Q21 9.76 9.23 9.7 5.74% 0.62%
Q22 0.7 0.7 0.77 0.00% -9.09%
SUM 95.95 86.67 92.6 10.71% 3.62%

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

TiFlash supports data compression in MPP Exchange operators

Signed-off-by: Zhigao Tong <[email protected]>
Signed-off-by: Zhigao Tong <[email protected]>
1
Signed-off-by: Zhigao Tong <[email protected]>
3
Signed-off-by: Zhigao Tong <[email protected]>
4
Signed-off-by: Zhigao Tong <[email protected]>
5
Signed-off-by: Zhigao Tong <[email protected]>
6
Signed-off-by: Zhigao Tong <[email protected]>
7
Signed-off-by: Zhigao Tong <[email protected]>
8
Signed-off-by: Zhigao Tong <[email protected]>
Signed-off-by: Zhigao Tong <[email protected]>
Signed-off-by: Zhigao Tong <[email protected]>
Signed-off-by: Zhigao Tong <[email protected]>
Signed-off-by: Zhigao Tong <[email protected]>
Signed-off-by: Zhigao Tong <[email protected]>
Signed-off-by: Zhigao Tong <[email protected]>
@ti-chi-bot ti-chi-bot added the status/LGT2 Indicates that a PR has LGTM 2. label Jan 30, 2023
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 30, 2023
@solotzg
Copy link
Contributor Author

solotzg commented Jan 30, 2023

/merge

@ti-chi-bot
Copy link
Member

@solotzg: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: e5376a5

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jan 30, 2023
@solotzg
Copy link
Contributor Author

solotzg commented Jan 30, 2023

/hold

@ti-chi-bot ti-chi-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 30, 2023
@solotzg
Copy link
Contributor Author

solotzg commented Jan 30, 2023

/rebuild

@solotzg
Copy link
Contributor Author

solotzg commented Jan 30, 2023

/run-integration-test

@solotzg
Copy link
Contributor Author

solotzg commented Jan 30, 2023

/merge

@ti-chi-bot
Copy link
Member

@solotzg: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@solotzg
Copy link
Contributor Author

solotzg commented Jan 30, 2023

/run-unit-test

@solotzg
Copy link
Contributor Author

solotzg commented Jan 30, 2023

/run-all-tests

@solotzg
Copy link
Contributor Author

solotzg commented Jan 30, 2023

/unhold

@ti-chi-bot ti-chi-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 30, 2023
@ti-chi-bot ti-chi-bot merged commit c2d9582 into pingcap:master Jan 30, 2023
@solotzg solotzg deleted the test-exchange-compress branch January 30, 2023 08:10
solotzg added a commit to solotzg/tiflash that referenced this pull request Feb 1, 2023
solotzg added a commit to solotzg/tiflash that referenced this pull request Feb 1, 2023
solotzg added a commit to solotzg/tiflash that referenced this pull request Feb 1, 2023
@ti-chi-bot ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Feb 2, 2023
ywqzzy pushed a commit to ywqzzy/tiflash_1 that referenced this pull request Feb 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/enhancement The issue or PR belongs to an enhancement. type/performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exchange Operator Enhancement
5 participants