Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: reduce reproposals of AddSSTable and other large commands #98563

Open
erikgrinaker opened this issue Mar 14, 2023 · 3 comments
Open
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-kv KV Team

Comments

@erikgrinaker
Copy link
Contributor

erikgrinaker commented Mar 14, 2023

We've seen that Raft can run nodes out of memory during overload, in particular with large commands like AddSSTable (see e.g. #73376). One possible contributing factor here is reproposals, where an overloaded Raft group that takes more than 3 seconds to ack a proposal may continually stuff the same command into Raft over and over, worsening the overload. We should consider throttling reproposals of large commands.

Jira issue: CRDB-25334

Epic CRDB-39898

@erikgrinaker erikgrinaker added C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-kv-replication labels Mar 14, 2023
@blathers-crl
Copy link

blathers-crl bot commented Mar 14, 2023

cc @cockroachdb/replication

@tbg
Copy link
Member

tbg commented Mar 16, 2023

Btw I have logging for this in #98576 and it does happen:

Details
$ grep -Eo 'reproposed .*, at applied' reprop.txt  | grep -Eo '(.*, at applied)' | sort | uniq -c | sort -rn
     58 reproposed 1 (6.7 MiB, at applied
     50 reproposed 1 (10 MiB, at applied
     46 reproposed 1 (8.8 MiB, at applied
     28 reproposed 1 (12 MiB, at applied
     24 reproposed 1 (7.9 MiB, at applied
     18 reproposed 1 (11 MiB, at applied
     15 reproposed 1 (7.3 MiB, at applied
     14 reproposed 1 (6.6 MiB, at applied
     12 reproposed 1 (7.5 MiB, at applied
     12 reproposed 1 (7.4 MiB, at applied
     10 reproposed 1 (6.8 MiB, at applied
      5 reproposed 1 (7.6 MiB, at applied
      4 reproposed 1 (7.1 MiB, at applied
      4 reproposed 1 (6.9 MiB, at applied
      3 reproposed 1 (7.2 MiB, at applied
      3 reproposed 1 (2.6 MiB, at applied
      2 reproposed 3 (1.0 KiB, at applied
      2 reproposed 1 (7.0 MiB, at applied
      2 reproposed 1 (5.3 MiB, at applied
      2 reproposed 1 (3.2 MiB, at applied
      2 reproposed 1 (1.4 KiB, at applied
      2 reproposed 1 (1.2 KiB, at applied
      1 reproposed 2 (8.8 MiB, at applied
      1 reproposed 2 (6.7 MiB, at applied
      1 reproposed 2 (11 MiB, at applied
      1 reproposed 1 (9.7 MiB, at applied
      1 reproposed 1 (9.6 MiB, at applied
      1 reproposed 1 (9.4 MiB, at applied
       ....
      (44 lines in total)

tbg added a commit to tbg/cockroach that referenced this issue May 9, 2023
We saw this likely contribute to OOMs during log application in
cockroachlabs/support#2287.

Touches cockroachdb#98563.

Epic: CRDB-25503
Release note: None
@tbg
Copy link
Member

tbg commented Jul 10, 2023

I have a (completely untested) prototype here: #106505

Not planning to work on this due to competing priorities. I am also not too convinced this is still a large problem, since CheckQuorum now defaults to on (at least on master) and the cases in which refreshes are problematic have good likelihood to also lose leadership, in which case refreshing adds very little additional work to the system since these proposals just get dropped.

I also think there is a more appealing solution (assuming this would still be an issue) which involves preventing proposals when we see evidence that they can't be made to apply. This would be done by proposing an essentially empty command that bumps the lease applied index. The upshot of this strategy is that if we can get that command to apply, we can actually let go completely of the "large" stuck command, i.e. resolve its latches, signal the caller with an unambiguous error, etc.

We can also leverage at least partial knowledge of what made it in the raft log, which is possible due to recent refactors (see onProposalsDropped1). If we revisited the lease protocol and as a result could disable raft proposal forwarding, we could make this a lot more airtight2. More details on this in #103908 (comment).

Footnotes

  1. https://github.com/cockroachdb/cockroach/blob/24da48fbef335f2085955fa3524c02ebb104e66e/pkg/kv/kvserver/replica_proposal_buf_test.go#L68-L70

  2. https://github.com/cockroachdb/cockroach/issues/105172

@tbg tbg removed their assignment Jul 10, 2023
@exalate-issue-sync exalate-issue-sync bot added T-kv KV Team and removed T-kv-replication labels Jun 28, 2024
@github-project-automation github-project-automation bot moved this to Incoming in KV Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-performance Perf of queries or internals. Solution not expected to change functional behavior. T-kv KV Team
Projects
No open projects
Status: Incoming
Development

No branches or pull requests

2 participants