Improve DB file verification time #71
Comments
Worst-case, we might want to rethink whether we need to validate on every restart.
From: Swapnil Mhamane <[email protected]>
Date: Friday, 23 November 2018 at 7:34 PM
Subject: [gardener/etcd-backup-restore] Improve DB file verification time (#71)
As the etcd DB size grows to around 1 GB, DB file validation takes 3-5 minutes. We need to analyse this and reduce the DB file verification time. We should also check the effect of IOPS on the verification time.
Environment:
Etcd version: 3.3.10
Etcd-backup-restore version: 0.3.1
|
Right, that was what I was thinking/fearing as well. If it cannot be avoided, maybe we can at least skip it when etcd was properly shut down (where it is least likely that something breaks), which would speed up regular scenarios significantly, e.g. when the control plane is updated or the seed is rolled. That should actually be possible to implement: write a marker and keep it unless the process stops regularly; then validate only if the marker is still on disk when starting etcd? |
If I am not mistaken, we already found that the time taken for verification is proportional to the size of the database.
So, from my perspective, trying to optimize the actual verification process is just delaying the inevitable.
The only workable long-term solution is to avoid unnecessary verification.
|
Start etcd. On an etcd process crash, we can try to perform a data validation check and then do a restore if needed, basically inverting the flow we have at the moment. The etcd container restarts after the data directory validation/restore operation and the cycle continues. We should do a sanity check (#67) before taking the first full snapshot and making etcd ready for service. |