Improve DB file verification time #71

Closed
swapnilgm opened this issue Nov 23, 2018 · 4 comments · Fixed by #93
Labels
area/performance Performance (across all domains, such as control plane, networking, storage, etc.) related component/etcd-backup-restore ETCD Backup & Restore exp/intermediate Issue that requires some project experience platform/all priority/2 Priority (lower number equals higher priority) topology/seed Affects Seed clusters

Comments

@swapnilgm
Contributor

As the etcd DB size grows to around 1 GB, DB file validation takes 3-5 minutes. We need to analyse this and reduce the verification time for the DB file. We should also check how IOPS affects the verification time.

Environment:
Etcd version: 3.3.10
Etcd-backup-restore version: 0.3.1

@swapnilgm swapnilgm added component/etcd-backup-restore ETCD Backup & Restore exp/intermediate Issue that requires some project experience platform/all status/accepted Issue was accepted as something we need to work on priority/normal area/performance Performance (across all domains, such as control plane, networking, storage, etc.) related labels Nov 23, 2018
@amshuman-kr
Collaborator

amshuman-kr commented Nov 24, 2018 via email

@vlerenc
Member

vlerenc commented Nov 24, 2018

Right, that was what I was thinking/fearing as well. If it cannot be avoided, maybe we can at least skip it when etcd was properly shut down (where it is least likely that something breaks), which would speed up regular scenarios significantly, e.g. when the control plane is updated or the seed is rolled. That should actually be possible to implement: write a marker and keep it unless the process stops regularly; then validate only if the marker is still on disk when starting etcd?

@amshuman-kr
Collaborator

amshuman-kr commented Nov 24, 2018 via email

@georgekuruvillak
Contributor

georgekuruvillak commented Nov 28, 2018

Start etcd first. If the etcd process crashes, we can perform a data validation check and then do a restore if needed, basically inverting the flow we have at the moment. The etcd container restarts after the data directory validation/restore operation and the cycle continues. We should do a sanity check (#67) before taking the first full snapshot and making etcd ready for service.

@swapnilgm swapnilgm added this to the 0.5.0 milestone Dec 5, 2018
@gardener-robot-ci-1 gardener-robot-ci-1 added lifecycle/stale Nobody worked on this for 6 months (will further age) and removed lifecycle/stale Nobody worked on this for 6 months (will further age) labels Feb 4, 2019
@swapnilgm swapnilgm modified the milestones: 0.5.0, 0.6.0 Feb 19, 2019
@gardener-robot-ci-1 gardener-robot-ci-1 removed the status/accepted Issue was accepted as something we need to work on label Mar 22, 2019
@vlerenc vlerenc added priority/critical Needs to be resolved soon, because it impacts users negatively topology/seed Affects Seed clusters labels Apr 17, 2019
@gardener-robot gardener-robot added priority/2 Priority (lower number equals higher priority) and removed priority/critical Needs to be resolved soon, because it impacts users negatively labels Mar 8, 2021

6 participants