
more flexible follower read dispatch strategy #35926

Open
glorv opened this issue Jul 4, 2022 · 3 comments
glorv (Contributor) commented Jul 4, 2022

Enhancement

In cloud environments, TiDB clusters are commonly deployed across multiple Availability Zones (AZs). Because cross-AZ traffic costs much more than traffic within the same AZ, we can reduce cost by dispatching read requests to TiKV/TiFlash nodes in the local AZ with higher priority.

#28033 introduced a new tidb_replica_read option, "closest-replica", which always dispatches all read requests to the closest TiKV. But since the follower read feature needs to propose an extra readIndex message through the Raft state machine, this overhead may cause a performance regression. For coprocessor requests with small responses, reading from the local AZ cannot save traffic, because the extra readIndex messages are always cross-AZ traffic, and the extra CPU usage spent processing these Raft messages can hurt performance.

I propose adding a new option, "closest-adaptive", for the tidb_replica_read variable. This option dispatches a request to the nearest store only when the estimated response size is bigger than a certain threshold, which is defined by a new variable, "tidb_adaptive_closest_read_threshold".

BTW, when the coprocessor traffic from TiDB instances in different AZs is not evenly distributed, some stores may be heavily loaded while others are lightly loaded. This can cause high tail latency and low overall throughput. So when follower read is enabled, the TiKV client may need to track the load of each store and force a fallback to leader read when some stores' load is high.

@glorv glorv added the type/enhancement The issue or PR belongs to an enhancement. label Jul 4, 2022
@glorv glorv self-assigned this Jul 4, 2022
xhebox (Contributor) commented Jul 5, 2022

/cc @Yisaer

nolouch (Member) commented Jul 28, 2022

So when follower read is enabled, the TiKV client may need to track the load of each store and force a fallback to leader read when some stores' load is high.

How can we make sure PD's decision and the client's decision don't affect each other?

glorv (Contributor, Author) commented Jul 29, 2022

So when follower read is enabled, the TiKV client may need to track the load of each store and force a fallback to leader read when some stores' load is high.

How can we make sure PD's decision and the client's decision don't affect each other?

In this design, we can't. Per @BusyJay's suggestion, we can move the decision making to PD, with the client only consuming the PD-generated result. That way, we can make decisions for region scheduling and traffic dispatch together.

BTW, because the client can adjust its decision very quickly, even if we implement the algorithm on the client side it is not a big problem. When PD performs scheduling for some hot reads, the client side can adjust the portion of traffic sent to each peer on the fly. In my benchmark, the traffic was rebalanced within a few minutes.
