kvserver: single node unquiesces on every tick #63295

Closed
tbg opened this issue Apr 8, 2021 · 2 comments · Fixed by #103266
tbg commented Apr 8, 2021

Describe the problem

On a single-node cluster, user ranges with moderate traffic (~10qps) emit ~5 empty raft entries per second because this code

if log.V(3) {
	log.Infof(ctx, "unquiescing %d: waking leader", r.RangeID)
}

fires erroneously on every raft tick (200ms). This could affect single-node peak performance.
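The arithmetic behind the observed rate can be sketched as follows (a toy model, not kvserver code: it just relates the 200ms tick interval to the expected number of empty "wake the leader" entries, assuming one unquiesce per tick):

```go
package main

import "fmt"

// emptyEntriesPerSecond models the buggy behavior: if the replica
// re-quiesces and is woken again on every raft tick, each tick costs
// one empty proposal.
func emptyEntriesPerSecond(tickMillis int) float64 {
	return 1000.0 / float64(tickMillis)
}

func main() {
	// 200ms ticks -> 5 empty entries per second, matching the ~291
	// entries observed over a 60s interval in the logs below.
	fmt.Printf("expected empty entries/sec at 200ms ticks: %.0f\n", emptyEntriesPerSecond(200))
}
```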

To Reproduce

$ ./cockroach start-single-node --logtostderr --insecure --vmodule=replica_application_state_machine=1 2>&1 | grep -F 'no-op on empty Raft entry'
[...]
I210408 08:33:53.850172 278 kv/kvserver/replica_application_state_machine.go:446 ⋮ [n1,s1,r37/1:‹/{Table/53-Max}›] 173  applying command with forced error: ‹kv/kvserver/replica_application_state_machine.go›:157: no-op on empty Raft entry
I210408 08:33:54.050195 303 kv/kvserver/replica_application_state_machine.go:446 ⋮ [n1,s1,r37/1:‹/{Table/53-Max}›] 174  applying command with forced error: ‹kv/kvserver/replica_application_state_machine.go›:157: no-op on empty Raft entry
I210408 08:33:54.250296 307 kv/kvserver/replica_application_state_machine.go:446 ⋮ [n1,s1,r37/1:‹/{Table/53-Max}›] 175  applying command with forced error: ‹kv/kvserver/replica_application_state_machine.go›:157: no-op on empty Raft entry
I210408 08:33:54.450539 232 kv/kvserver/replica_application_state_machine.go:446 ⋮ [n1,s1,r37/1:‹/{Table/53-Max}›] 176  applying command with forced error: ‹kv/kvserver/replica_application_state_machine.go›:157: no-op on empty Raft entry
I210408 08:33:54.650708 228 kv/kvserver/replica_application_state_machine.go:446 ⋮ [n1,s1,r37/1:‹/{Table/53-Max}›] 177  applying command with forced error: ‹kv/kvserver/replica_application_state_machine.go›:157: no-op on empty Raft entry
I210408 08:33:54.850304 322 kv/kvserver/replica_application_state_machine.go:446 ⋮ [n1,s1,r37/1:‹/{Table/53-Max}›] 178  applying command with forced error: ‹kv/kvserver/replica_application_state_machine.go›:157: no-op on empty Raft entry
I210408 08:33:54.906795 249 kv/kvserver/replica_application_state_machine.go:446 ⋮ [n1,s1,r35/1:‹/Table/{39-40}›] 179  applying command with forced error: ‹kv/kvserver/replica_application_state_machine.go›:157: no-op on empty Raft entry
[...]

Over a 60s interval, I count 291 instances for r37 alone (the range receiving the load), i.e. ~5 per second.

./cockroach debug raft-log cockroach-data/ 37 shows these empty entries and also shows that the term remains stable, i.e. the entries are not the result of raft spuriously re-electing the leader.

I then went looking for places where we propose empty commands, and voilà:

if log.V(3) {
	log.Infof(ctx, "unquiescing %d: waking leader", r.RangeID)
}

I verified that this fires just before each empty command, i.e. it's conclusively the origin.

Environment:

Additional context
Stumbled upon this in #62791

Jira issue: CRDB-6512

Epic CRDB-25209

@tbg tbg added A-kv-replication Relating to Raft, consensus, and coordination. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Apr 8, 2021
@tbg tbg changed the title kvserver: unexpected frequent empty raft entries kvserver: single node unquiesces on every tick Apr 8, 2021
@jlinder jlinder added the T-kv KV Team label Jun 16, 2021
@erikgrinaker erikgrinaker added T-kv-replication and removed T-kv KV Team labels May 31, 2022
erikgrinaker (Contributor) commented:
In some sense, this is expected. We aggressively quiesce ranges as soon as we can, and the only real pacing is replication and application lag, which is ~0 with a single replica. We also see this on regular RF=3 clusters, where ranges are constantly quiescing and unquiescing under moderate load.

We could add another pacing mechanism here, e.g. only quiesce if we've gone 5 ticks without any proposals. Is something like that what you had in mind here?
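The pacing idea above can be sketched like this (illustrative only; the replica type, quiesceAfterTicks, and method names here are hypothetical, not the actual kvserver API):

```go
package main

import "fmt"

// quiesceAfterTicks is the number of consecutive ticks without a
// proposal before the range is allowed to quiesce.
const quiesceAfterTicks = 5

type replica struct {
	ticksSinceProposal int
	quiesced           bool
}

// onProposal resets the pacing counter and unquiesces the range.
func (r *replica) onProposal() {
	r.ticksSinceProposal = 0
	r.quiesced = false
}

// onTick quiesces only once enough idle ticks have elapsed, instead of
// quiescing aggressively on the first opportunity.
func (r *replica) onTick() {
	r.ticksSinceProposal++
	if r.ticksSinceProposal >= quiesceAfterTicks {
		r.quiesced = true
	}
}

func main() {
	r := &replica{}
	r.onProposal()
	for i := 0; i < 4; i++ {
		r.onTick()
	}
	fmt.Println("quiesced after 4 idle ticks:", r.quiesced)
	r.onTick()
	fmt.Println("quiesced after 5 idle ticks:", r.quiesced)
}
```

Under steady ~10qps load, the counter would keep resetting, so the range would stay unquiesced rather than thrashing through a quiesce/unquiesce cycle on every tick.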

Alternatively, we could skip the proposal when we believe ourselves to be the leader and don't know about any other replicas, since there's no one else to wake up.
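That alternative amounts to a guard like the following (a hedged sketch; shouldWakeLeader and its parameters are made up for illustration and do not exist in kvserver):

```go
package main

import "fmt"

// shouldWakeLeader decides whether unquiescing needs to propose an
// empty "wake the leader" entry. If we are the leader and know of no
// other replicas (the single-node case), there is no follower or
// remote leader to wake, so the proposal can be skipped.
func shouldWakeLeader(isLeader bool, replicaCount int) bool {
	if isLeader && replicaCount == 1 {
		return false // sole replica: nothing to wake
	}
	return true
}

func main() {
	fmt.Println("single-node, leader:", shouldWakeLeader(true, 1))
	fmt.Println("RF=3, leader:", shouldWakeLeader(true, 3))
	fmt.Println("RF=3, follower:", shouldWakeLeader(false, 3))
}
```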

erikgrinaker (Contributor) commented:
Submitted #103266 to only quiesce after 4 ticks (2 seconds) with no proposals. I've seen this have non-negligible cost with steady write load, both in single- and multi-node clusters.

@craig craig bot closed this as completed in e0c35b4 May 18, 2023