Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wal: add Verify function to perform corruption check on wal contents #10516

Merged
merged 2 commits into from
Mar 12, 2019

Conversation

shreyas-s-rao
Copy link
Contributor

@shreyas-s-rao shreyas-s-rao commented Mar 3, 2019

Signed-off-by: Shreyas Rao [email protected]

To allow verification of the WAL contents with minimal memory footprint. Added a Verify function to the wal package that performs a corruption check on the WAL contents and reports any errors. This serves to be a streamlined version of the ReadAll method on the WAL struct, as it allows for a quick validation of the WAL contents before starting etcd from an existing data directory without the overhead of returning the entire WAL data in-memory to the caller. Also refactored the openAtIndex function to allow the Verify function to reuse code for creating decoder.

Fixes #10497

@codecov-io
Copy link

codecov-io commented Mar 3, 2019

Codecov Report

Merging #10516 into master will decrease coverage by 0.29%.
The diff coverage is 63.76%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #10516     +/-   ##
=========================================
- Coverage   71.88%   71.59%   -0.3%     
=========================================
  Files         393      393             
  Lines       36521    36572     +51     
=========================================
- Hits        26254    26183     -71     
- Misses       8449     8556    +107     
- Partials     1818     1833     +15
Impacted Files Coverage Δ
wal/wal.go 54.31% <63.76%> (+0.79%) ⬆️
pkg/fileutil/purge.go 65.9% <0%> (-6.82%) ⬇️
pkg/netutil/netutil.go 63.11% <0%> (-6.56%) ⬇️
etcdctl/ctlv3/command/lease_command.go 65.34% <0%> (-5.95%) ⬇️
pkg/adt/interval_tree.go 88.28% <0%> (-5.41%) ⬇️
lease/leasehttp/http.go 63.97% <0%> (-4.42%) ⬇️
proxy/grpcproxy/watch.go 88.55% <0%> (-4.22%) ⬇️
proxy/grpcproxy/watcher.go 89.79% <0%> (-4.09%) ⬇️
etcdserver/v2_server.go 80.76% <0%> (-3.85%) ⬇️
pkg/transport/listener.go 54.5% <0%> (-3.8%) ⬇️
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6da17cd...5f36a97. Read the comment docs.

@shreyas-s-rao
Copy link
Contributor Author

@xiang90 I see that the integration test is failing on Travis CI. I ran the test locally on master branch at 2c69559 and it is still failing. PTAL. Is there anything to be done from my end?

wal/wal.go Outdated Show resolved Hide resolved
wal/wal.go Outdated Show resolved Hide resolved
@jingyih
Copy link
Contributor

jingyih commented Mar 4, 2019

cc @wenjiaswe

wal/wal.go Outdated Show resolved Hide resolved
wal/wal.go Outdated Show resolved Hide resolved
wal/wal.go Outdated Show resolved Hide resolved
@shreyas-s-rao shreyas-s-rao force-pushed the wal-verify-func branch 2 times, most recently from ba5ed69 to dc8552d Compare March 8, 2019 15:52
wal/wal.go Show resolved Hide resolved
wal/wal.go Outdated Show resolved Hide resolved
@shreyas-s-rao shreyas-s-rao force-pushed the wal-verify-func branch 2 times, most recently from eb554ce to 593172e Compare March 11, 2019 09:47
@shreyas-s-rao
Copy link
Contributor Author

@xiang90 @spzala The fmt test on Travis seems to be failing with the error vendored licenses do not match given bill of materials. PTAL.

@xiang90
Copy link
Contributor

xiang90 commented Mar 11, 2019

@shreyas-s-rao awesome. can you write a small test for verify func?

@shreyas-s-rao
Copy link
Contributor Author

@xiang90 I have added a test for Verify, that ensures that the Verify function detects WAL corruption.

wal/wal_test.go Outdated Show resolved Hide resolved
@xiang90
Copy link
Contributor

xiang90 commented Mar 12, 2019

@shreyas-s-rao a few nits. looks good to me after fixing them.

@xiang90
Copy link
Contributor

xiang90 commented Mar 12, 2019

lgtm

@shreyas-s-rao
Copy link
Contributor Author

@xiang90 Thanks for the review!

Could you please add this PR to your next release v3.3.13? It would really help unblock an issue in the project I work on (etcd-backup-restore), as Verify is very important for our use case. Thanks.

@xiang90
Copy link
Contributor

xiang90 commented Mar 12, 2019

@shreyas-s-rao we do not usually backport new features. but i will let @gyuho and @jpbetz make the decision.

@shreyas-s-rao
Copy link
Contributor Author

@xiang90 Ah ok. Thanks for the info.

@jpbetz
Copy link
Contributor

jpbetz commented Mar 12, 2019

This adds new functionality, so is clearly more than a bug fix, but does not include an API change, so it's backward compatible. Given that it protects users against corruption, I'm slightly in favor of this being backported. But it's @gyuho's call as the patch manager of 3.3. I'll support whatever he decides.

@xiang90
Copy link
Contributor

xiang90 commented Mar 12, 2019

tests failures are unrelated.

@xiang90 xiang90 merged commit 4478993 into etcd-io:master Mar 12, 2019
@shreyas-s-rao
Copy link
Contributor Author

shreyas-s-rao commented Mar 13, 2019

@xiang90 @jpbetz @gyuho Is there a defined release cycle for etcd? I just wanted to know the ETA on the next patch release, say v3.3.13

@shreyas-s-rao
Copy link
Contributor Author

@xiang90 @jpbetz @gyuho I would like to know if there has been any decision taken on the release for this PR? I am only asking from the perspective of my project which is currently in need of the Verify feature, and it would really help us out if we know the tentative release dates for the next patch or release, based on which we need to make a few internal decisions within our project. Thanks.

@jpbetz
Copy link
Contributor

jpbetz commented Mar 15, 2019

@shreyas-s-rao Looks like the last 3.3 patch release was on 2019-02-07. They're typically released every month or two, depending on how many fixes have been accumulated. There are very few backports to 3.3.13 so far, so I'd not rush to release the next patch version yet, but probably some time this month? @gyuho Any thoughts?

@PadmaB
Copy link

PadmaB commented Mar 22, 2019

@gyuho any thoughts on the above request? This fix is important for us, so knowing the release date of next patch version will help us plan better internally. Thanks!

@PadmaB
Copy link

PadmaB commented Mar 29, 2019

We are close to end of the month, is there any release planned? Any updates on this will be highly appreciated. Thanks!

@gyuho
Copy link
Contributor

gyuho commented Mar 29, 2019

can you help backport?

@shreyas-s-rao
Copy link
Contributor Author

Sure @gyuho . What is the process for that?

@gyuho
Copy link
Contributor

gyuho commented Apr 2, 2019

@shreyas-s-rao You can check out the upstream release branch (e.g. release-3.3), and create a PR against that branch with cherry-picked patches.

@shreyas-s-rao
Copy link
Contributor Author

@gyuho Sure. I've raised #10603 on release-3.3 branch.

@shreyas-s-rao shreyas-s-rao deleted the wal-verify-func branch April 16, 2019 05:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

8 participants