
loqrecovery: use captured meta range content for LOQ plans #94239

Merged: 1 commit merged into cockroachdb:master from loq_05online_plan_replicas on Jan 17, 2023

Conversation

@aliher1911 aliher1911 (Contributor) commented Dec 23, 2022

Note: only the last commit belongs to this PR. Will update the description once #93157 lands.

Previously, the loss of quorum recovery planner used local replica info collected from all nodes to find the source of truth for replicas that lost quorum.
With the online approach, local info snapshots are not atomic. This could cause the planner to fail if available replicas are caught in different states on different nodes.
This commit adds an alternative planning approach for when range metadata is available. Instead of fixing individual replicas that can't make progress, it uses the descriptors from the metadata to find ranges that can't make progress, and updates their replicas to recover from loss of quorum.
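As a rough illustration of the metadata-based check: a range has lost quorum when fewer than a majority of the voters listed in its descriptor survive on live stores. The Go sketch below uses hypothetical simplified types (ReplicaDescriptor, RangeDescriptor, hasQuorum), not the actual loqrecovery API:

package main

import "fmt"

// ReplicaDescriptor identifies one replica of a range (a hypothetical
// simplification of the real descriptor type).
type ReplicaDescriptor struct {
	NodeID, StoreID int
}

// RangeDescriptor lists the voting replicas of a range, as read from
// the captured meta range content.
type RangeDescriptor struct {
	RangeID int
	Voters  []ReplicaDescriptor
}

// hasQuorum reports whether a strict majority of the range's voters
// survive on live stores.
func hasQuorum(d RangeDescriptor, liveStores map[int]bool) bool {
	live := 0
	for _, r := range d.Voters {
		if liveStores[r.StoreID] {
			live++
		}
	}
	return 2*live > len(d.Voters)
}

func main() {
	liveStores := map[int]bool{1: true, 2: true, 3: true} // s4 and s5 are dead
	d := RangeDescriptor{
		RangeID: 4,
		Voters: []ReplicaDescriptor{
			{NodeID: 2, StoreID: 2}, {NodeID: 4, StoreID: 4}, {NodeID: 5, StoreID: 5},
		},
	}
	if !hasQuorum(d, liveStores) {
		// The planner would pick a surviving replica and promote it,
		// discarding the dead ones, as in the plan output below.
		fmt.Printf("r%d lost quorum and needs recovery\n", d.RangeID)
	}
}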
This commit also adds a replica collection stage as part of the make-plan command itself. To collect info from a cluster instead of from files, provide --host and the other standard cluster connection flags (--certs-dir, --insecure, etc.) as appropriate.

Example command output for a local cluster with 3 out of 5 nodes surviving looks like:

~/tmp ❯❯❯ cockroach debug recover make-plan --insecure --host 127.0.0.1:26257 >recover-plan.json
Nodes scanned:           3
Total replicas analyzed: 228
Ranges without quorum:   15
Discarded live replicas: 0

Proposed changes:
  range r4:/System/tsd updating replica (n2,s2):3 to (n2,s2):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
  range r80:/Table/106/1 updating replica (n1,s1):1 to (n1,s1):14. Discarding available replicas: [], discarding dead replicas: [(n5,s5):3,(n4,s4):2].
  range r87:/Table/106/1/"paris"/"\xcc\xcc\xcc\xcc\xcc\xcc@\x00\x80\x00\x00\x00\x00\x00\x00(" updating replica (n1,s1):1 to (n1,s1):14. Discarding available replicas: [], discarding dead replicas: [(n5,s5):3,(n4,s4):2].
  range r88:/Table/106/1/"seattle"/"ffffffH\x00\x80\x00\x00\x00\x00\x00\x00\x14" updating replica (n3,s3):3 to (n3,s3):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
  range r105:/Table/106/1/"washington dc"/"L\xcc\xcc\xcc\xcc\xccL\x00\x80\x00\x00\x00\x00\x00\x00\x0f" updating replica (n3,s3):3 to (n3,s3):14. Discarding available replicas: [], discarding dead replicas: [(n5,s5):1,(n4,s4):2].
  range r98:/Table/107/1/"boston"/"333333D\x00\x80\x00\x00\x00\x00\x00\x00\x03" updating replica (n2,s2):3 to (n2,s2):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
  range r95:/Table/107/1/"seattle"/"ffffffH\x00\x80\x00\x00\x00\x00\x00\x00\x06" updating replica (n3,s3):2 to (n3,s3):15. Discarding available replicas: [], discarding dead replicas: [(n4,s4):4,(n5,s5):3].
  range r125:/Table/107/1/"washington dc"/"DDDDDDD\x00\x80\x00\x00\x00\x00\x00\x00\x04" updating replica (n3,s3):2 to (n3,s3):14. Discarding available replicas: [], discarding dead replicas: [(n4,s4):1,(n5,s5):3].
  range r115:/Table/108/1/"boston"/"8Q\xeb\x85\x1e\xb8B\x00\x80\x00\x00\x00\x00\x00\x00n" updating replica (n2,s2):3 to (n2,s2):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
  range r104:/Table/108/1/"new york"/"\x1c(\xf5\u008f\\I\x00\x80\x00\x00\x00\x00\x00\x007" updating replica (n2,s2):2 to (n2,s2):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):3].
  range r102:/Table/108/1/"seattle"/"p\xa3\xd7\n=pD\x00\x80\x00\x00\x00\x00\x00\x00\xdc" updating replica (n3,s3):2 to (n3,s3):15. Discarding available replicas: [], discarding dead replicas: [(n4,s4):4,(n5,s5):3].
  range r126:/Table/108/1/"washington dc"/"Tz\xe1G\xae\x14L\x00\x80\x00\x00\x00\x00\x00\x00\xa5" updating replica (n3,s3):2 to (n3,s3):14. Discarding available replicas: [], discarding dead replicas: [(n4,s4):1,(n5,s5):3].
  range r86:/Table/108/3 updating replica (n1,s1):1 to (n1,s1):14. Discarding available replicas: [], discarding dead replicas: [(n4,s4):3,(n5,s5):2].
  range r59:/Table/109/1 updating replica (n2,s2):3 to (n2,s2):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].
  range r65:/Table/111/1 updating replica (n3,s3):3 to (n3,s3):15. Discarding available replicas: [], discarding dead replicas: [(n5,s5):4,(n4,s4):2].

Discovered dead nodes would be marked as decommissioned:
  n4, n5


Proceed with plan creation [y/N] y
Plan created.
To stage recovery application in half-online mode invoke:

'cockroach debug recover apply-plan  --host=127.0.0.1:26257 --insecure=true <plan file>'

Alternatively distribute plan to below nodes and invoke 'debug recover apply-plan --store=<store-dir> <plan file>' on:
- node n2, store(s) s2
- node n1, store(s) s1
- node n3, store(s) s3

Release note: None

Fixes: #93038
Fixes: #93046

@cockroach-teamcity (Member)

This change is Reviewable

@aliher1911 aliher1911 force-pushed the loq_05online_plan_replicas branch 4 times, most recently from 973507f to 70e07e9 on December 28, 2022 14:41
@aliher1911 aliher1911 changed the title from "loqrecovery: add replica info collection to admin server" to "loqrecovery: use captured meta range content for LOQ plans" on Dec 28, 2022
@aliher1911 aliher1911 force-pushed the loq_05online_plan_replicas branch 5 times, most recently from 809abfb to 5305b54 on December 29, 2022 21:33
@aliher1911 aliher1911 force-pushed the loq_05online_plan_replicas branch 10 times, most recently from c727406 to 36bace0 on January 9, 2023 13:07
@aliher1911 aliher1911 marked this pull request as ready for review January 9, 2023 13:09
@aliher1911 aliher1911 requested review from a team as code owners January 9, 2023 13:09
@aliher1911 aliher1911 requested review from a team January 9, 2023 13:09
@aliher1911 aliher1911 requested a review from a team as a code owner January 9, 2023 13:09
@aliher1911 aliher1911 force-pushed the loq_05online_plan_replicas branch from 36bace0 to 8b8eb85 on January 10, 2023 12:59
@erikgrinaker erikgrinaker (Contributor) left a comment

Just minor stuff.

pkg/kv/kvserver/loqrecovery/utils.go (resolved)
storeNames = append(storeNames, fmt.Sprintf("s%d", id))
}
return strings.Join(storeNames, ", ")
}

func (s storeIDSet) intersect(other storeIDSet) storeIDSet {
Contributor

We allow use of Go generics now (#93735). This seems like a good opportunity to try it out with some generic map intersect/diff functions (or just use a library which provides it). Doesn't have to happen in this PR though, or at all.
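For illustration, a generic map intersect could look like the sketch below; this is a hypothetical standalone example, not code from this PR or from any particular library:

package main

import "fmt"

// intersect returns the set of keys present in both maps.
func intersect[K comparable, V any](a, b map[K]V) map[K]struct{} {
	out := make(map[K]struct{})
	for k := range a {
		if _, ok := b[k]; ok {
			out[k] = struct{}{}
		}
	}
	return out
}

func main() {
	left := map[int]struct{}{1: {}, 2: {}, 3: {}}
	right := map[int]struct{}{2: {}, 3: {}, 4: {}}
	fmt.Println(intersect(left, right)) // map[2:{} 3:{}]
}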

@@ -1415,7 +1415,9 @@ func init() {
 	f.StringVarP(&debugRecoverPlanOpts.outputFileName, "plan", "o", "",
 		"filename to write plan to")
 	f.IntSliceVar(&debugRecoverPlanOpts.deadStoreIDs, "dead-store-ids", nil,
-		"list of dead store IDs")
+		"list of dead store IDs (can't be used together with dead-node-ids)")
Contributor

Should we just get rid of dead-store-ids, to avoid the complexity of supporting both?

Contributor Author

I think we should get rid of dead-store-ids in 23.2, to avoid any confusion on a mixed cluster where the old option, which was the primary means, disappears after upgrade. It should be less of an issue from 23.1 to 23.2, where we don't expect people to use dead-store-ids anymore.
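If both flags are kept for a transition release, the "can't be used together" rule from the help text above can also be enforced declaratively. Here is a minimal sketch using spf13/cobra's MarkFlagsMutuallyExclusive (available since cobra v1.5); it shows one possible approach, not necessarily how this PR validates the flags:

package main

import (
	"fmt"
	"os"

	"github.com/spf13/cobra"
)

func main() {
	var deadStoreIDs, deadNodeIDs []int

	cmd := &cobra.Command{
		Use: "make-plan",
		RunE: func(cmd *cobra.Command, args []string) error {
			fmt.Println("stores:", deadStoreIDs, "nodes:", deadNodeIDs)
			return nil
		},
	}
	f := cmd.Flags()
	f.IntSliceVar(&deadStoreIDs, "dead-store-ids", nil,
		"list of dead store IDs (can't be used together with dead-node-ids)")
	f.IntSliceVar(&deadNodeIDs, "dead-node-ids", nil,
		"list of dead node IDs (can't be used together with dead-store-ids)")
	// Cobra rejects invocations that set both flags at once.
	cmd.MarkFlagsMutuallyExclusive("dead-store-ids", "dead-node-ids")

	if err := cmd.Execute(); err != nil {
		os.Exit(1)
	}
}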

pkg/cli/debug_recover_loss_of_quorum.go (outdated, resolved)
pkg/cli/debug_recover_loss_of_quorum.go (outdated, resolved)
pkg/kv/kvserver/loqrecovery/plan.go (outdated, resolved)
pkg/kv/kvserver/loqrecovery/plan.go (resolved)
pkg/kv/kvserver/loqrecovery/plan.go (resolved)
pkg/kv/kvserver/loqrecovery/utils.go (resolved)
pkg/kv/kvserver/loqrecovery/plan.go (outdated, resolved)
@aliher1911 aliher1911 force-pushed the loq_05online_plan_replicas branch from 8b8eb85 to 0618f19 on January 11, 2023 17:36
@aliher1911 aliher1911 requested review from a team as code owners January 11, 2023 17:36
@aliher1911 aliher1911 requested review from rhu713 and jayshrivastava and removed request for a team January 11, 2023 17:36
@dhartunian dhartunian removed the request for review from a team January 11, 2023 17:45
@aliher1911 aliher1911 force-pushed the loq_05online_plan_replicas branch 8 times, most recently from 9b527a8 to f21dbac on January 16, 2023 16:24
Previously, the loss of quorum recovery planner used local replica
info collected from all nodes to find the source of truth for
replicas that lost quorum.
With the online approach, local info snapshots are not atomic.
This could cause the planner to fail if available replicas are
caught in different states on different nodes.
This commit adds an alternative planning approach for when range
metadata is available. Instead of fixing individual replicas that
can't make progress, it uses the descriptors from the metadata to
find ranges that can't make progress, and updates their replicas
to recover from loss of quorum.
This commit also adds a replica collection stage as part of the
make-plan command itself. To collect info from a cluster instead
of from files, provide --host and the other standard cluster
connection flags (--certs-dir, --insecure, etc.) as appropriate.

Release note: None
@aliher1911 aliher1911 force-pushed the loq_05online_plan_replicas branch from f21dbac to b5d5fc3 on January 17, 2023 17:05
@aliher1911 aliher1911 (Contributor Author)

bors r=erikgrinaker

@craig craig bot commented Jan 17, 2023

Build failed (retrying...):

@craig craig bot commented Jan 17, 2023

Build succeeded:

@craig craig bot merged commit 6fd5b04 into cockroachdb:master Jan 17, 2023