Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: ensure leadMaxSupported cannot regress when a leader bumps its term #133764

Closed
arulajmani opened this issue Oct 30, 2024 · 0 comments · Fixed by #133873
Closed

raft: ensure leadMaxSupported cannot regress when a leader bumps its term #133764

arulajmani opened this issue Oct 30, 2024 · 0 comments · Fixed by #133873
Assignees
Labels
A-leader-leases Related to the introduction of leader leases branch-master Failures and bugs on the master branch. branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker T-kv KV Team

Comments

@arulajmani
Copy link
Collaborator

arulajmani commented Oct 30, 2024

Describe the problem

While staring at some raft code with @nvanbenschoten earlier today, as one does, we re-discovered this TODO:

cockroach/pkg/raft/raft.go

Lines 1353 to 1356 in 77b210b

// TODO(arul): add special handling for the r.lead == r.id case with an
// assertion to ensure the LeaderSupportExpiration is in the past before
// campaigning.
if r.supportingFortifiedLeader() && r.lead != r.id {

As written, it's possible for a raft leader to regress leadMaxSupported after it steps down, if it calls an election, doesn't win it, but de-fortifies followers and allowing another follower to win an election. This is possible because the TODO isn't addressed and this logic in inFortifyLease:

cockroach/pkg/raft/raft.go

Lines 1480 to 1486 in 77b210b

inFortifyLease := r.supportingFortifiedLeader() &&
// NB: A fortified leader is allowed to bump its term. It'll need to
// re-fortify once if it gets elected at the higher term though, so the
// leader must take care to not regress its supported expiration.
// However, at the follower, we grant the fortified leader our vote at
// the higher term.
r.lead != m.From &&


To fix this, we could either:

  1. Address the TODO and ensure leadMaxSupported is in the past by calling CanDefortify in hup.
  2. OR remove the special case handling in inFortifyLease and check supportingFortifedLeader without exception in hup.

We settled on approach 2, as we no longer need this implicit de-fortification now that we have the explicit MsgDeFortifyLeader flow.

Jira issue: CRDB-43740

Epic CRDB-37522

@arulajmani arulajmani added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-kv KV Team A-leader-leases Related to the introduction of leader leases labels Oct 30, 2024
@arulajmani arulajmani self-assigned this Oct 30, 2024
@cockroachdb cockroachdb deleted a comment from blathers-crl bot Oct 30, 2024
@arulajmani arulajmani added branch-master Failures and bugs on the master branch. branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 GA-blocker labels Oct 30, 2024
arulajmani added a commit to arulajmani/cockroach that referenced this issue Oct 30, 2024
Previously, it was possible for a leader to regress leaderMaxSupported
by calling an election at a higher term. This was because we weren't
careful to recognize this case on the leader, and followers had special
case handling that allowed a leader to override `inFortifyLease`.
Together, this could cause lease regressions for LeaderLeases in some
rare cases.

This patch fixes this issue by removing special case handling in
`inFortifyLease`. We also remove a special case in the handling of
`MsgHup` which allowed a leader to step down and campaign without
de-fortifying itself.

Fixes cockroachdb#133764

Release note: None
arulajmani added a commit to arulajmani/cockroach that referenced this issue Oct 30, 2024
Previously, it was possible for a leader to regress leaderMaxSupported
by calling an election at a higher term. This was because we weren't
careful to recognize this case on the leader, and followers had special
case handling that allowed a leader to override `inFortifyLease`.
Together, this could cause lease regressions for LeaderLeases in some
rare cases.

This patch fixes this issue by removing special case handling in
`inFortifyLease`. We also remove a special case in the handling of
`MsgHup` which allowed a leader to step down and campaign without
de-fortifying itself.

Fixes cockroachdb#133764

Release note: None
arulajmani added a commit to arulajmani/cockroach that referenced this issue Oct 31, 2024
Previously, it was possible for a leader to regress leaderMaxSupported
by calling an election at a higher term. This was because we weren't
careful to recognize this case on the leader, and followers had special
case handling that allowed a leader to override `inFortifyLease`.
Together, this could cause lease regressions for LeaderLeases in some
rare cases.

This patch fixes this issue by removing special case handling in
`inFortifyLease`. We also remove a special case in the handling of
`MsgHup` which allowed a leader to step down and campaign without
de-fortifying itself.

Fixes cockroachdb#133764

Release note: None
craig bot pushed a commit that referenced this issue Nov 4, 2024
132877: kv: deflake TestLeaseQueueLeasePreferencePurgatoryError r=nvanbenschoten a=nvanbenschoten

Fixes #132652.

This commit fixes a flake in TestLeaseQueueLeasePreferencePurgatoryError where a delayed span config split could trip up the test and cause it to flake. We now wait for the span config split to be processed before waiting on the lease queue's purgatory, which should eliminate the flake.

Release note: None

133568: changefeedccl: add PTS to system.users r=andyyang890 a=asg0451

Add `system.users` to the list of system tables
that changefeeds protect with PTS. This table is
required for CDC Queries.

Part of: #128806

Release note (enterprise change): Add
`system.users` to the list of system tables that
changefeeds protect with PTS. This table is
required for CDC Queries.


133873: raft: ensure leaderMaxSupported does not regress when bumping terms r=nvanbenschoten a=arulajmani

Previously, it was possible for a leader to regress leaderMaxSupported by calling an election at a higher term. This was because we weren't careful to recognize this case on the leader, and followers had special case handling that allowed a leader to override `inFortifyLease`. Together, this could cause lease regressions for LeaderLeases in some rare cases.

This patch fixes this issue by removing special case handling in `inFortifyLease`. We also remove a special case in the handling of `MsgHup` which allowed a leader to step down and campaign without de-fortifying itself.

Fixes #133764

Release note: None

133895: ui: reroute v2 db page to legacy page when cluster is unfinalized r=xinhaoz a=xinhaoz

The migrations that support the v2 db page runs on finalization. Rather than show a non-functional page in an unfinalized cluster state let's reroute to the legacy page.

Epic: CRDB-37558
Fixes: #133843

Release note (ui change): The v2 db pages will only be available post ugprade finalization to 24.3. Prior to that we'll continue to show the legacy page when the cluster is in its unfinalized state.

134082: cli: improve tsdump upload concurrency behavior r=arjunmahishi a=dhartunian

- wrap concurrent error collection and status printing in necessary mutexes.
- reduce goroutine spawning to 10 for primary loop which avoids rate limit errors from datadog.

Release note: None

134216: sql/catalog: avoid multiple interface boxing allocs from `descpb.NameInfo` r=nvanbenschoten a=nvanbenschoten

When we cast a `descpb.NameInfo` to a `catalog.NameKey`, we incur at least one heap allocation to box the struct in an interface. This commit ensures that it's only one, and not multiple.

The change was motivated by observing two heap allocations in `lookupStoreCacheID`. Ideally, we would have no heap allocations here, but this is a step in the right direction.

While here, we also clean up a few similar cases.

```
name                            old time/op    new time/op    delta
Sysbench/SQL/oltp_read_only       36.3ms ± 4%    36.0ms ± 3%    ~     (p=0.730 n=9+9)
Sysbench/SQL/oltp_write_only      19.3ms ± 6%    18.8ms ± 4%  -2.36%  (p=0.043 n=9+10)
Sysbench/SQL/oltp_read_write      58.7ms ± 3%    57.2ms ± 2%  -2.49%  (p=0.000 n=10+10)
Sysbench/SQL/oltp_point_select    1.61ms ± 1%    1.56ms ± 2%  -2.59%  (p=0.000 n=9+10)
Sysbench/SQL/oltp_begin_commit     594µs ± 4%     608µs ± 7%    ~     (p=0.089 n=10+10)

name                            old alloc/op   new alloc/op   delta
Sysbench/SQL/oltp_read_only       1.05MB ± 1%    1.04MB ± 0%  -0.64%  (p=0.004 n=9+10)
Sysbench/SQL/oltp_write_only       405kB ± 1%     404kB ± 1%    ~     (p=0.720 n=9+10)
Sysbench/SQL/oltp_read_write      1.50MB ± 1%    1.51MB ± 1%    ~     (p=0.447 n=9+10)
Sysbench/SQL/oltp_point_select    35.3kB ± 2%    35.1kB ± 2%    ~     (p=0.278 n=9+10)
Sysbench/SQL/oltp_begin_commit    18.9kB ± 0%    18.9kB ± 1%    ~     (p=0.055 n=10+10)

name                            old allocs/op  new allocs/op  delta
Sysbench/SQL/oltp_read_only        6.96k ± 1%     6.86k ± 1%  -1.37%  (p=0.000 n=9+10)
Sysbench/SQL/oltp_write_only       3.62k ± 1%     3.58k ± 0%  -0.96%  (p=0.000 n=9+10)
Sysbench/SQL/oltp_read_write       11.0k ± 1%     10.9k ± 1%  -0.56%  (p=0.030 n=9+10)
Sysbench/SQL/oltp_point_select       317 ± 1%       312 ± 0%  -1.74%  (p=0.000 n=9+10)
Sysbench/SQL/oltp_begin_commit       140 ± 0%       140 ± 0%    ~     (p=1.000 n=10+10)
```

Epic: None
Release note: None

134217: sql: fix recently introduced data race r=yuzefovich a=yuzefovich

Recently, in a5b8d06 we introduced a possible data race. Namely, that change modifies a proto inside of the physical plan to cap the slice of types during the vectorized operator planning in order to prevent a type schema corruption due to slice aliasing. However, it's been assumed in a couple of places that the physical plan is immutable. This commit clarifies that the physical plan is immutable after its finalization and moves the capping into the finalization.

Fixes: #133934.

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Miles Frankel <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Xin Hao Zhang <[email protected]>
Co-authored-by: David Hartunian <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
craig bot pushed a commit that referenced this issue Nov 5, 2024
133132: sql/tests: create sysbench microbenchmark suite r=nvanbenschoten a=nvanbenschoten

This commit adds a new microbenchmark suite which emulates `sysbench`, with subtests for `oltp_read_only`, `oltp_write_only`, and `oltp_read_write`. The structure of the benchmark suite makes it easy to add more subtests in the future (e.g. for `oltp_multi_insert`).

The suite is designed to run the same workload against different levels of the CockroachDB stack, with the initial implementation targeting SQL (and below) through the `sysbenchSQL` driver and KV (and below) through the `sysbenchKV` driver. An example of additional an driver that could be added in the future is `sysbenchPebble`. It would also make sense to test drivers under different configurations. For example, we should add a variant of the `sysbenchKV` driver which disables the local RPC fast-path.

The goal of this suite is to provide developers with a way to quickly measure the performance of CockroachDB against sysbench in a Go microbenchmark environment. This will help with identifying opportunities for performance improvements and with providing a rapid feedback loop for evaluating the effectiveness of performance changes.

The initial benchmark performance looks like:
```
name                        time/op
Sysbench/SQL/OltpReadOnly   2.88ms ±16%
Sysbench/SQL/OltpWriteOnly  2.32ms ±16%
Sysbench/SQL/OltpReadWrite  5.47ms ± 6%
Sysbench/KV/OltpReadOnly     445µs ± 5%
Sysbench/KV/OltpWriteOnly    594µs ± 5%
Sysbench/KV/OltpReadWrite   1.07ms ± 4%

name                        alloc/op
Sysbench/SQL/OltpReadOnly    965kB ± 1%
Sysbench/SQL/OltpWriteOnly   487kB ± 4%
Sysbench/SQL/OltpReadWrite  1.34MB ± 0%
Sysbench/KV/OltpReadOnly     264kB ± 1%
Sysbench/KV/OltpWriteOnly    184kB ± 4%
Sysbench/KV/OltpReadWrite    440kB ± 0%

name                        allocs/op
Sysbench/SQL/OltpReadOnly    6.26k ± 1%
Sysbench/SQL/OltpWriteOnly   3.40k ± 0%
Sysbench/SQL/OltpReadWrite   9.65k ± 0%
Sysbench/KV/OltpReadOnly       650 ± 0%
Sysbench/KV/OltpWriteOnly    1.08k ± 1%
Sysbench/KV/OltpReadWrite    1.73k ± 0%
```

Epic: None
Release Note: None

133643: storage: bump Pebble to the updated Category API r=RaduBerinde a=RaduBerinde

Changes:

 * [`71bb6ba2`](cockroachdb/pebble@71bb6ba2) sstable: make Category an enum

Release note: none.
Fixes #133507


133873: raft: ensure leaderMaxSupported does not regress when bumping terms r=nvanbenschoten a=arulajmani

Previously, it was possible for a leader to regress leaderMaxSupported by calling an election at a higher term. This was because we weren't careful to recognize this case on the leader, and followers had special case handling that allowed a leader to override `inFortifyLease`. Together, this could cause lease regressions for LeaderLeases in some rare cases.

This patch fixes this issue by removing special case handling in `inFortifyLease`. We also remove a special case in the handling of `MsgHup` which allowed a leader to step down and campaign without de-fortifying itself.

Fixes #133764

Release note: None

134127: kvcoord: add metric for async requests throttling duration r=yuzefovich a=yuzefovich

This commit adds another metric `distsender.batches.async.throttled_cumulative_duration_nanos` which exposes how long partial batches were throttled for. This can be useful when adjusting the DistSender concurrency limit.

Informs: #131125.
Epic: None

Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Arul Ajmani <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig craig bot closed this as completed in 231aba1 Nov 5, 2024
blathers-crl bot pushed a commit that referenced this issue Nov 5, 2024
Previously, it was possible for a leader to regress leaderMaxSupported
by calling an election at a higher term. This was because we weren't
careful to recognize this case on the leader, and followers had special
case handling that allowed a leader to override `inFortifyLease`.
Together, this could cause lease regressions for LeaderLeases in some
rare cases.

This patch fixes this issue by removing special case handling in
`inFortifyLease`. We also remove a special case in the handling of
`MsgHup` which allowed a leader to step down and campaign without
de-fortifying itself.

Fixes #133764

Release note: None
nvanbenschoten pushed a commit to nvanbenschoten/cockroach that referenced this issue Nov 5, 2024
Previously, it was possible for a leader to regress leaderMaxSupported
by calling an election at a higher term. This was because we weren't
careful to recognize this case on the leader, and followers had special
case handling that allowed a leader to override `inFortifyLease`.
Together, this could cause lease regressions for LeaderLeases in some
rare cases.

This patch fixes this issue by removing special case handling in
`inFortifyLease`. We also remove a special case in the handling of
`MsgHup` which allowed a leader to step down and campaign without
de-fortifying itself.

Fixes cockroachdb#133764

Release note: None
arulajmani added a commit that referenced this issue Nov 6, 2024
Previously, it was possible for a leader to regress leaderMaxSupported
by calling an election at a higher term. This was because we weren't
careful to recognize this case on the leader, and followers had special
case handling that allowed a leader to override `inFortifyLease`.
Together, this could cause lease regressions for LeaderLeases in some
rare cases.

This patch fixes this issue by removing special case handling in
`inFortifyLease`. We also remove a special case in the handling of
`MsgHup` which allowed a leader to step down and campaign without
de-fortifying itself.

Fixes #133764

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-leader-leases Related to the introduction of leader leases branch-master Failures and bugs on the master branch. branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker T-kv KV Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant