Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sanity check for revision numbers #85

Merged
merged 1 commit into from
Jan 18, 2019

Conversation

shreyas-s-rao
Copy link
Collaborator

What this PR does / why we need it:
This PR adds a sanity check feature to the data validation package, to prevent possible data loss during initialization, by ensuring that the etcd revision is greater than or equal to the latest snapshot revision.

Which issue(s) this PR fixes:
Fixes #67

Special notes for your reviewer:

Release note:

Added a sanity check to prevent data loss during initialization, by ensuring that the etcd revision is greater than or equal to the latest snapshot revision

@shreyas-s-rao shreyas-s-rao changed the title Add sanity check for revision numbers [WIP] Add sanity check for revision numbers Dec 12, 2018
@shreyas-s-rao
Copy link
Collaborator Author

Will push unit tests shortly.

@shreyas-s-rao
Copy link
Collaborator Author

@georgekuruvillak I played around a bit with the ConsistentIndex approach you suggested. Turns out ConsistentIndex is very different from Revision number, as it's more closely associated with the keys rather than time-ordered events in etcd. ConsistentIndex initially tends to correspond to the total number of keys in the db when etcd is started, and decreases when the db is compacted. So I don't think we can use this for the sanity check. Please confirm.

@swapnilgm swapnilgm added exp/beginner Issue that requires only basic skills status/in-progress Issue is in progress/work needs/review Needs review size/s Size of pull request is small (see gardener-robot robot/bots/size.py) platform/all priority/normal area/robustness Robustness, reliability, resilience related needs/lgtm Needs approval for merging needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Dec 14, 2018
@shreyas-s-rao shreyas-s-rao force-pushed the revision-safety-check branch 4 times, most recently from 4440e3d to 2fec95a Compare December 26, 2018 05:47
@shreyas-s-rao shreyas-s-rao changed the title [WIP] Add sanity check for revision numbers Add sanity check for revision numbers Dec 26, 2018
@shreyas-s-rao shreyas-s-rao force-pushed the revision-safety-check branch 2 times, most recently from 15583e7 to 50f6ae5 Compare January 3, 2019 06:42
@shreyas-s-rao
Copy link
Collaborator Author

I have been able to drastically reduce the time taken to get revision number of etcd from the db file, by eliminating iterating through the db buckets and KV pairs. Ready for review.

Copy link
Contributor

@georgekuruvillak georgekuruvillak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some suggestions. Please do check it out.

Copy link
Contributor

@swapnilgm swapnilgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have commented changes in PR. Please address them.

Currently, lot of code related to test seems to be redundant at multiple places, e.g. pkg/snapshot/restorer/restorer_suite_test.go and pkg/initializer/validator/validator_suite_test.go. Also, functions like startEmbeddedEtcd. We will have to refactor it soon. For this PR, i am fine with it. But we should take next task, to refactor it, cause now basic test framework, all package is set up and we can work optimisation of it. We will create new issue for this.

cmd/initializer.go Show resolved Hide resolved
pkg/initializer/validator/datavalidator.go Show resolved Hide resolved
pkg/initializer/validator/datavalidator.go Outdated Show resolved Hide resolved
pkg/initializer/validator/datavalidator.go Outdated Show resolved Hide resolved
pkg/initializer/validator/datavalidator.go Outdated Show resolved Hide resolved
pkg/initializer/validator/datavalidator_test.go Outdated Show resolved Hide resolved
pkg/initializer/validator/datavalidator_test.go Outdated Show resolved Hide resolved
pkg/server/httpAPI.go Outdated Show resolved Hide resolved
@shreyas-s-rao shreyas-s-rao force-pushed the revision-safety-check branch 2 times, most recently from a671d82 to c7f1ec5 Compare January 7, 2019 09:01
@shreyas-s-rao
Copy link
Collaborator Author

@swapnilgm @georgekuruvillak Thanks for your reviews. I've addressed your review comments. PTAL.

@swapnilgm swapnilgm added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 7, 2019
@gardener-robot-ci-1 gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 7, 2019
@swapnilgm
Copy link
Contributor

@shreyas-s-rao Thanks for the changes. Overall LGTM. You also, have to update the integration tests accordingly. Please check the last test pipeline logs for the details. https://concourse.ci.infra.gardener.cloud/teams/gardener/pipelines/etcd-backup-restore-master/jobs/master-pull-request-job/builds/33

@swapnilgm
Copy link
Contributor

Update: As we saw issue while testing the PR, that for empty ETCD the sanity check fails. Shreyas, will be updating PR for handling this case.

@shreyas-s-rao
Copy link
Collaborator Author

@swapnilgm I've updated the PR and fixed the integration tests as well. PTAL.

@swapnilgm swapnilgm added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 16, 2019
@gardener-robot-ci-1 gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 16, 2019
Copy link
Contributor

@swapnilgm swapnilgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@swapnilgm swapnilgm merged commit e007440 into gardener:master Jan 18, 2019
@shreyas-s-rao shreyas-s-rao deleted the revision-safety-check branch January 22, 2019 05:44
@PadmaB PadmaB modified the milestones: ---, 0.4.1 Jan 23, 2019
@gardener-robot gardener-robot added priority/3 Priority (lower number equals higher priority) and removed priority/3 Priority (lower number equals higher priority) labels Mar 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/robustness Robustness, reliability, resilience related exp/beginner Issue that requires only basic skills needs/lgtm Needs approval for merging needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/review Needs review platform/all size/s Size of pull request is small (see gardener-robot robot/bots/size.py) status/in-progress Issue is in progress/work
Projects
None yet
6 participants