[CHIA-1638] Pace block requests #18729

arvidn · 2024-10-17T23:18:20Z

Purpose:

This patch addresses a problem during long sync, where the syncing node may send request_blocks at a rate exceeding the peer's outbound rate limit for respond_blocks. When that happens, the peer stops responding and the syncing node closes the connection. Over time, all peers may get disconnected and syncing stalls. Restarting the sync takes at least 30 seconds plus weight proof validation.

The root of this problem is that the rate limit for request_blocks is not aligned with the rate limit for respond_blocks. They are 500 msgs/minute and 100 msgs/minute respectively (but then scaled to 30%). It never makes sense to send more requests than the peer is willing to respond to, so these limits should really be the same.
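
To make the mismatch concrete, here is the arithmetic as a small sketch. The constant and function names are illustrative only and are not identifiers from the chia codebase; the numbers are the ones quoted above.

```python
# Figures quoted in this description; names are hypothetical, for illustration.
REQUEST_BLOCKS_PER_MINUTE = 500   # limit for request_blocks
RESPOND_BLOCKS_PER_MINUTE = 100   # limit for respond_blocks
SCALE_PERCENT = 30                # both limits are scaled to 30%


def effective_limit(per_minute: int, percent: int = SCALE_PERCENT) -> int:
    """Messages per minute after applying the percentage scaling."""
    return per_minute * percent // 100


effective_requests = effective_limit(REQUEST_BLOCKS_PER_MINUTE)
effective_responses = effective_limit(RESPOND_BLOCKS_PER_MINUTE)

# Shortest gap between requests that keeps the peer within its response limit.
min_interval_seconds = 60 / effective_responses

print(effective_requests, effective_responses, min_interval_seconds)
```

Under these figures the syncing node may send 150 requests per minute while the peer will only answer 30, which works out to one response every 2 seconds.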

This patch hard-codes the expected outbound rate limit of the peer and paces requests to never exceed that limit, thus maintaining a steady (albeit slow) sync. The rate is at most one request every 2 seconds.

We already send requests to multiple peers, if we have more than one. This patch keeps track of the timestamp, per peer, when it's OK to send the next request. Sometimes, this timestamp can be in the past. This happens if one peer stalls for a long time and we "miss" the time to send a request to another peer (or the same peer for that matter).

The rate limit is enforced over 60 seconds at a time, so we allow "catching up" by only incrementing the timestamp by the rate limit minimum (2 seconds). However, if a peer takes too long to respond, we penalize it by bumping the timestamp to the current time. This creates a weak affinity to request more from faster peers.
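
The per-peer bookkeeping described above can be sketched roughly as follows. This is one reading of the description, not the code in the patch: `RequestPacer`, its methods, the string peer ids, and the `SLOW_RESPONSE` threshold are all assumptions made for illustration. Only the 2-second interval comes from the text.

```python
from __future__ import annotations

from dataclasses import dataclass, field

MIN_INTERVAL = 2.0   # seconds between requests to one peer (from the description)
SLOW_RESPONSE = 5.0  # hypothetical cutoff for "too long"; not taken from the patch


@dataclass
class RequestPacer:
    # peer id -> earliest time at which the next request may be sent
    next_allowed: dict[str, float] = field(default_factory=dict)

    def pick_peer(self, peers: list[str], now: float) -> tuple[str, float]:
        """Return the peer with the earliest timestamp and how long to wait."""
        peer = min(peers, key=lambda p: self.next_allowed.get(p, 0.0))
        wait = max(0.0, self.next_allowed.get(peer, 0.0) - now)
        return peer, wait

    def record(self, peer: str, sent_at: float, finished_at: float) -> None:
        """Update the peer's timestamp once its response has arrived."""
        previous = self.next_allowed.get(peer, sent_at)
        if finished_at - sent_at > SLOW_RESPONSE:
            # Slow peer: drop any catch-up credit by jumping to the current time.
            self.next_allowed[peer] = finished_at
        else:
            # Advance by the minimum only, so a timestamp in the past can catch up.
            self.next_allowed[peer] = previous + MIN_INTERVAL


pacer = RequestPacer()
pacer.record("fast", sent_at=100.0, finished_at=100.5)
pacer.record("slow", sent_at=100.5, finished_at=110.0)
chosen, wait = pacer.pick_peer(["fast", "slow"], now=110.0)
print(chosen, wait)
```

In this sketch the fast peer's timestamp lags behind the clock, so it keeps being chosen ahead of the slow peer, which is the weak affinity mentioned above.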

There are still issues with our concept of rate limits. For instance, there is a configuration option to scale the rate limits. But the effective limits are never communicated over the protocol, so there's no way of knowing whether a peer has tweaked its limits.

Current Behavior:

During long sync, we request blocks as fast as we can (with a single request outstanding at a time). This risks pushing peers over the limit, stalling, and having to restart the sync.

New Behavior:

During long sync, we pace the block requests to peers to never exceed the (presumed) rate limit for block requests.

Testing Notes:

Manually tested on my node. I sync about 3x faster.

github-actions · 2024-10-18T16:15:54Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions · 2024-10-18T16:35:44Z

Conflicts have been resolved. A maintainer will review the pull request shortly.

github-actions · 2024-10-22T19:14:59Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions · 2024-10-25T15:41:40Z

Conflicts have been resolved. A maintainer will review the pull request shortly.

concerns have been addressed

github-actions · 2024-11-01T16:55:35Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions · 2024-11-01T17:18:57Z

Conflicts have been resolved. A maintainer will review the pull request shortly.

github-actions · 2024-11-01T18:21:55Z

| File | Coverage | Missing Lines |
|---|---|---|
| `chia/full_node/full_node.py` | 84.0% | lines 1179, 1190, 1194, 1202 |

| Total | Missing | Coverage |
|---|---|---|
| 26 lines | 4 lines | 84% |

arvidn force-pushed the pace-block-requests branch from d5768f2 to 432e953 October 18, 2024 14:35

arvidn added the Changed label (Required label for PR that categorizes merge commit message as "Changed" for changelog) Oct 18, 2024

github-actions bot added the merge_conflict label (Branch has conflicts that prevent merge to main) Oct 18, 2024

arvidn changed the title ~~Pace block requests~~ [CHIA-1638] Pace block requests Oct 18, 2024

arvidn force-pushed the pace-block-requests branch from 432e953 to ff3ffd6 October 18, 2024 16:35

github-actions bot removed the merge_conflict label (Branch has conflicts that prevent merge to main) Oct 18, 2024

arvidn force-pushed the pace-block-requests branch from ff3ffd6 to ac7893a October 18, 2024 16:52

github-actions bot added the merge_conflict label (Branch has conflicts that prevent merge to main) Oct 22, 2024

arvidn force-pushed the pace-block-requests branch from ac7893a to ced7e0f October 25, 2024 15:41

github-actions bot removed the merge_conflict label (Branch has conflicts that prevent merge to main) Oct 25, 2024

arvidn marked this pull request as ready for review October 25, 2024 15:42

arvidn requested a review from a team as a code owner October 25, 2024 15:42

arvidn requested a review from almogdepaz October 25, 2024 15:51

github-actions bot added the coverage-diff label Oct 25, 2024

almogdepaz reviewed Oct 30, 2024

chia/full_node/full_node.py (resolved)

chia/full_node/full_node.py (outdated, resolved)

emlowe reviewed Oct 30, 2024

chia/full_node/full_node.py (resolved)

emlowe reviewed Oct 30, 2024

chia/server/ws_connection.py (outdated, resolved)

arvidn requested a review from almogdepaz October 31, 2024 11:14

altendky previously requested changes Oct 31, 2024

chia/full_node/full_node.py (outdated, resolved)

arvidn requested a review from emlowe October 31, 2024 13:34

arvidn closed this Nov 1, 2024

arvidn reopened this Nov 1, 2024

emlowe previously approved these changes Nov 1, 2024

arvidn added the ready_to_merge label (Submitter and reviewers think this is ready) and removed the coverage-diff label Nov 1, 2024

github-actions bot added the merge_conflict label (Branch has conflicts that prevent merge to main) Nov 1, 2024

arvidn added 4 commits November 1, 2024 18:18

pace block requests to avoid the peer hitting its response rate limit…

6ed8d37

…. Prioritize fast peers over slow ones

fix mypy warning

cca6cfa

simplify making WSChiaConnection usable as a dictionary key

f961009

address review comments

6dc9af2

arvidn dismissed emlowe’s stale review via 6dc9af2 November 1, 2024 17:18

arvidn force-pushed the pace-block-requests branch from d451773 to 6dc9af2 November 1, 2024 17:18

github-actions bot removed the merge_conflict label (Branch has conflicts that prevent merge to main) Nov 1, 2024

emlowe approved these changes Nov 1, 2024

github-actions bot added the coverage-diff label Nov 1, 2024

pmaslana removed the coverage-diff label Nov 1, 2024

pmaslana merged commit 12a089f into main Nov 1, 2024
362 of 363 checks passed

pmaslana deleted the pace-block-requests branch November 1, 2024 19:10
