Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cluster id check for hash kv handler #15924

Merged

Conversation

CaojiamingAlan
Copy link
Contributor

@CaojiamingAlan CaojiamingAlan commented May 19, 2023

Fix #15548
This PR only adds cluster id check for hashKVHandler. Cluster id checks for other handlers(member, promote, version, etc.) will be in separate PRs.

@serathius
Copy link
Member

Change overall looks good, but would like to have more tests as this is pretty critical functionality. If possible I would also like to have an e2e tests for inplace cluster restore. Do you think you can add an e2e tests?

@CaojiamingAlan
Copy link
Contributor Author

Change overall looks good, but would like to have more tests as this is pretty critical functionality. If possible I would also like to have an e2e tests for inplace cluster restore. Do you think you can add an e2e tests?

Sure. Working on it.

@CaojiamingAlan CaojiamingAlan requested a review from serathius May 21, 2023 18:40
Copy link
Member

@jmhbnz jmhbnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work on this @CaojiamingAlan.

Can you please squash the two identically named commits for etcdserver: add e2e test for periodic check after in-place recovery.

tests/e2e/corrupt_test.go Outdated Show resolved Hide resolved
@CaojiamingAlan CaojiamingAlan force-pushed the add-cluster-id-check-for-hashKVHandler branch from 216a75b to 903a440 Compare May 21, 2023 19:44
@CaojiamingAlan CaojiamingAlan requested a review from jmhbnz May 21, 2023 21:05
@CaojiamingAlan
Copy link
Contributor Author

@serathius I've added the tests. Could you please review them? thx.

@chaochn47
Copy link
Member

Hi @CaojiamingAlan I can help review the PR in case Marrek bandwidth is limited, give me a hour to refresh my mind on this area of code.

Copy link
Member

@chaochn47 chaochn47 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some nit-picks.

server/etcdserver/corrupt.go Outdated Show resolved Hide resolved
server/etcdserver/corrupt_test.go Outdated Show resolved Hide resolved
server/etcdserver/corrupt_test.go Outdated Show resolved Hide resolved
server/etcdserver/corrupt_test.go Outdated Show resolved Hide resolved
server/etcdserver/corrupt_test.go Outdated Show resolved Hide resolved
@chaochn47 chaochn47 mentioned this pull request May 24, 2023
@CaojiamingAlan
Copy link
Contributor Author

@chaochn47 Comments fixed. Thank you for your review.

@chaochn47
Copy link
Member

chaochn47 commented May 24, 2023

Thanks @CaojiamingAlan, please consider squash all the commits into one before asking for merging the PR~

@CaojiamingAlan CaojiamingAlan force-pushed the add-cluster-id-check-for-hashKVHandler branch from 3c8a40b to b16b4d4 Compare May 24, 2023 22:07
@CaojiamingAlan
Copy link
Contributor Author

@chaochn47 Squashed.

tests/e2e/corrupt_test.go Outdated Show resolved Hide resolved
Copy link
Member

@serathius serathius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some comments.

@CaojiamingAlan CaojiamingAlan force-pushed the add-cluster-id-check-for-hashKVHandler branch from 1113d40 to 21afa04 Compare June 9, 2023 15:52
api/v3rpc/rpctypes/error.go Outdated Show resolved Hide resolved
server/etcdserver/corrupt.go Outdated Show resolved Hide resolved
@CaojiamingAlan CaojiamingAlan force-pushed the add-cluster-id-check-for-hashKVHandler branch 2 times, most recently from 4525a37 to 3c48272 Compare June 16, 2023 15:33
@ahrtr
Copy link
Member

ahrtr commented Jul 5, 2023

@CaojiamingAlan Have you raised a followup ticket for #15924 (comment)?

Please also rebase this PR although github doesn't show any conflict.

@CaojiamingAlan CaojiamingAlan force-pushed the add-cluster-id-check-for-hashKVHandler branch from 3c48272 to f787be4 Compare July 5, 2023 17:31
@CaojiamingAlan
Copy link
Contributor Author

@ahrtr Have just rebased the PR and raised #16178. I was waiting for this to be merged first so didn't raise it previously. Sorry for misunderstanding what you meant.

BTW, are all errors supposed to be in api/v3rpc/rpctypes/error.go? I've just realized that Raft and corrupt uses HTTP rather than gRPC. I think using errors with the 'grpc' prefix is a bit confusing.

@CaojiamingAlan CaojiamingAlan force-pushed the add-cluster-id-check-for-hashKVHandler branch from f787be4 to 79b212e Compare July 5, 2023 18:04
@ahrtr
Copy link
Member

ahrtr commented Jul 5, 2023

I think using errors with the 'grpc' prefix is a bit confusing.

Right. We should avoid defining errors in api/v3rpc/rpctypes/error.go in this case. Instead, we should follow exactly the same pattern as mvcc.ErrCompacted

server/etcdserver/corrupt.go Outdated Show resolved Hide resolved
@CaojiamingAlan CaojiamingAlan force-pushed the add-cluster-id-check-for-hashKVHandler branch from 79b212e to 0afc201 Compare July 5, 2023 18:31
server/etcdserver/corrupt.go Outdated Show resolved Hide resolved
@CaojiamingAlan CaojiamingAlan force-pushed the add-cluster-id-check-for-hashKVHandler branch from 0afc201 to eff9517 Compare July 5, 2023 19:09
Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks @CaojiamingAlan

@ahrtr ahrtr merged commit 226e2cf into etcd-io:main Jul 5, 2023
@CaojiamingAlan CaojiamingAlan deleted the add-cluster-id-check-for-hashKVHandler branch July 17, 2023 18:47
@serathius
Copy link
Member

Was this backported to v3.4 and v3.5? @CaojiamingAlan can you backport?

@ahrtr ahrtr mentioned this pull request Oct 27, 2023
14 tasks
ahrtr added a commit to ahrtr/etcd that referenced this pull request Nov 22, 2023
ahrtr added a commit to ahrtr/etcd that referenced this pull request Nov 22, 2023
ahrtr added a commit to ahrtr/etcd that referenced this pull request Nov 22, 2023
ahrtr added a commit to ahrtr/etcd that referenced this pull request Nov 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Etcd corruption detection triggers during in-place cluster recovery
5 participants