kvclient: follower reads can be sent to slow node resulting in high latency #120519
Labels
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Describe the problem
Some mitigations were put in place as part of #112351 to prevent follower reads from being sent to decommissioned or draining nodes, but they are insufficient. Specifically, there are two additional scenarios in which we should prevent sending follower reads.
A complementary solution would be to implement #109320, which would mitigate some of the impact, but the two features will work better together. There is also some handling in how we sort replicas to reduce sending requests to nodes with high RTT; however, this only handles very extreme problems and doesn't handle the typical issues we see.
To reproduce
Various faults can be induced in step 2 above, including stopping a node for an extended outage, slowing disk IO throughput, creating an index that puts uneven load on the system, or introducing general network flakiness to a node.
Jira issue: CRDB-36726