Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/etcd data validation #93

Merged

Conversation

georgekuruvillak
Copy link
Contributor

@georgekuruvillak georgekuruvillak commented Jan 16, 2019

What this PR does / why we need it:
Design decision on how to go ahead with data validation.
Which issue(s) this PR fixes:
Fixes #71
Etcd data directory validation takes a long time and resource to perform. Currently, it gets triggered when etcd restarts. This PR changes the logic to perform full validation only when etcd restarts erroneously.
Special notes for your reviewer:
Please go through the design doc so that we can come to a consensus on how to proceed ahead with the validation steps and division of responsibilities between the script in etcd container and the sidecar.
Release note:

Unnecessary data validation will now be skipped, allowing for quicker etcd restarts.

@georgekuruvillak georgekuruvillak force-pushed the fix/etcd-data-validation branch 3 times, most recently from a0be0da to f39faa6 Compare January 30, 2019 12:07
@georgekuruvillak georgekuruvillak added component/etcd-backup-restore ETCD Backup & Restore needs/review Needs review exp/intermediate Issue that requires some project experience size/s Size of pull request is small (see gardener-robot robot/bots/size.py) platform/all needs/lgtm Needs approval for merging needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jan 30, 2019
Copy link
Collaborator

@shreyas-s-rao shreyas-s-rao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@georgekuruvillak Thanks for the well-written PR! Just a few typos and suggestions. Other than that, LGTM! 👍🏼

hack/templates/etcd-statefulset.yaml.tpl Outdated Show resolved Hide resolved
hack/templates/etcd-statefulset.yaml.tpl Outdated Show resolved Hide resolved
doc/validation.md Outdated Show resolved Hide resolved
doc/validation.md Outdated Show resolved Hide resolved
doc/validation.md Outdated Show resolved Hide resolved
doc/validation.md Outdated Show resolved Hide resolved
@shreyas-s-rao shreyas-s-rao added the needs/rebase Needs git rebase label Feb 8, 2019
@shreyas-s-rao
Copy link
Collaborator

@georgekuruvillak Please change the bootstrap script in the helm chart as well, at

Also, in our example YAMLs, the label selector in the Service spec (app: etcd) does not match the labels given for the StatefulSet (app: etcd-${cloud}) at

Could you please make the necessary changes for this as well? Thanks.

@georgekuruvillak georgekuruvillak changed the title Fix/etcd data validation [WIP]Fix/etcd data validation Feb 11, 2019
@swapnilgm swapnilgm added this to the 0.6.0 milestone Feb 19, 2019
Copy link
Collaborator

@amshuman-kr amshuman-kr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just renamed MARKER to VALIDATION_MARKER (also the actual file name) to better reflect its usage.

DISCLAIMER: I have just typed the changes up on the browser. Not tested!

doc/validation.md Outdated Show resolved Hide resolved
* Do directory content validation.
* Start etcd
* Yes
* Check if previous exit was normal
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this second check required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I have changed the proposal as how the validation is done now.

@@ -23,23 +23,102 @@ metadata:
data:
bootstrap.sh: |-
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the bootstrap.sh be specific to the cloud provider?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is generated from the template file. So the code generator generates across all the cloud providers. This will be changed when the helm charts are introduced.

example/etcd-statefulset-aws.yaml Outdated Show resolved Hide resolved
done
}

if [ ! -f $MARKER ] ;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if [ ! -f $MARKER ] ;
if [ ! -f $VALIDATION_MARKER ] ;

hack/templates/etcd-statefulset.yaml.tpl Outdated Show resolved Hide resolved
hack/templates/etcd-statefulset.yaml.tpl Outdated Show resolved Hide resolved
hack/templates/etcd-statefulset.yaml.tpl Outdated Show resolved Hide resolved
hack/templates/etcd-statefulset.yaml.tpl Outdated Show resolved Hide resolved
hack/templates/etcd-statefulset.yaml.tpl Outdated Show resolved Hide resolved
@georgekuruvillak georgekuruvillak force-pushed the fix/etcd-data-validation branch 2 times, most recently from e82b5f1 to 02e6e14 Compare March 18, 2019 12:56
@swapnilgm swapnilgm changed the title [WIP]Fix/etcd data validation Fix/etcd data validation Mar 19, 2019
@shreyas-s-rao
Copy link
Collaborator

@georgekuruvillak Can you also add a test or two for testing in sanity mode? And also modify the tests to pass the mode argument to the Validate method.

cmd/miscellaneous.go Outdated Show resolved Hide resolved
Copy link
Contributor

@swapnilgm swapnilgm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Some minor changes considering the logging perspective. Please address those. Also, Please address comments from Amshu.

chart/templates/etcd-bootstrap-configmap.yaml Outdated Show resolved Hide resolved
chart/templates/etcd-bootstrap-configmap.yaml Outdated Show resolved Hide resolved
@georgekuruvillak georgekuruvillak force-pushed the fix/etcd-data-validation branch from 02e6e14 to d4a0a5f Compare March 21, 2019 06:36
@georgekuruvillak
Copy link
Contributor Author

@georgekuruvillak Can you also add a test or two for testing in sanity mode? And also modify the tests to pass the mode argument to the Validate method.

I have made the changes and have added test for sanity check. PTAL

Copy link
Collaborator

@shreyas-s-rao shreyas-s-rao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@georgekuruvillak Thanks for making all the suggested changes. LGTM! 👍🏼

@shreyas-s-rao shreyas-s-rao added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/lgtm Has approval for merging and removed needs/lgtm Needs approval for merging needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/rebase Needs git rebase needs/review Needs review labels Mar 21, 2019
@gardener-robot-ci-1 gardener-robot-ci-1 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Mar 21, 2019
@shreyas-s-rao shreyas-s-rao merged commit 807897f into gardener:master Mar 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/etcd-backup-restore ETCD Backup & Restore exp/intermediate Issue that requires some project experience needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) platform/all reviewed/lgtm Has approval for merging size/s Size of pull request is small (see gardener-robot robot/bots/size.py)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve DB file verification time
5 participants