sync controller: replace revision file with full diff each interval #1892
+1 to this approach, I think this is much much simpler!
eb22309 to 2e6a0f8
I fixed the tests, and removed the code that gets/sets the revision file. The only other thing to consider doing is adding some code to remove any existing revision files.
Seems like we could just leave the older metadata/revision files there, unless we think they might be used by some other tooling and it doesn't make sense to keep out-of-date revision files around (in which case this would be breaking anyway).
Much simpler. 👍
I'm fine with leaving the existing revision files in place.
I like the simpler approach!
Have you observed any time differences (based on log messages would be simplest, I suppose) between the old approach and this one?
Haven't specifically looked, though performance seemed fine (roughly instantaneous) in the testing I did. I can try with some bigger data sets.
00805df to 4482702
…erval Signed-off-by: Steve Kriss <[email protected]>
4482702 to cc6f204
Tweaked logging so it only info-logs if there are actually backups to sync into the cluster for a location; otherwise, it debug-logs.
The algorithm I'm using now should actually be faster than the old one. Previously, when syncing a location, we'd first get a list of all backups in the bucket, then make a K8s API call per backup in the BSL to check if that backup exists in-cluster. Now, I'm making a single List call (to a lister) upfront to get all backups in the cluster, and then doing an in-memory set diff to figure out which ones should be synced in. So a bunch of over-the-wire API calls are saved.
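A rough sketch of the difference in call counts is below. It uses a hypothetical fakeCluster type that only counts calls; it is not Velero's client or lister, and all names in it are made up for illustration. With a real lister the List would typically be served from a local cache, so counting it as one call is if anything pessimistic.

```go
package main

import "fmt"

// fakeCluster is a hypothetical stand-in for the cluster that counts calls.
type fakeCluster struct {
	backups map[string]struct{}
	calls   int
}

func newFakeCluster(names ...string) *fakeCluster {
	c := &fakeCluster{backups: make(map[string]struct{}, len(names))}
	for _, name := range names {
		c.backups[name] = struct{}{}
	}
	return c
}

// Get simulates one round trip that checks whether a single backup exists.
func (c *fakeCluster) Get(name string) bool {
	c.calls++
	_, ok := c.backups[name]
	return ok
}

// List simulates one call that returns the names of all backups.
func (c *fakeCluster) List() map[string]struct{} {
	c.calls++
	out := make(map[string]struct{}, len(c.backups))
	for name := range c.backups {
		out[name] = struct{}{}
	}
	return out
}

// missingPerBackup mirrors the old approach: one existence check per
// backup found in the bucket.
func missingPerBackup(c *fakeCluster, inBucket []string) []string {
	var missing []string
	for _, name := range inBucket {
		if !c.Get(name) {
			missing = append(missing, name)
		}
	}
	return missing
}

// missingWithList mirrors the new approach: one List up front, then an
// in-memory set diff.
func missingWithList(c *fakeCluster, inBucket []string) []string {
	inCluster := c.List()
	var missing []string
	for _, name := range inBucket {
		if _, ok := inCluster[name]; !ok {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	bucket := []string{"b1", "b2", "b3", "b4"}

	oldStyle := newFakeCluster("b1", "b2")
	fmt.Println(missingPerBackup(oldStyle, bucket), "calls:", oldStyle.calls)

	newStyle := newFakeCluster("b1", "b2")
	fmt.Println(missingWithList(newStyle, bucket), "calls:", newStyle.calls)
}
```

Both functions return the same backups to sync; the per-backup version makes one call per backup in the bucket, while the list-based version makes one call regardless of bucket size.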
@nrb PTAL.
* upstream/master: (38 commits)
  sync controller: replace revision file with full diff each interval (vmware-tanzu#1892)
  Increment logging for item backupper (vmware-tanzu#1904)
  Add LD_LIBRARY_PATH as an env varible for the use of vsphere plugin (vmware-tanzu#1893)
  Remove unused flag (vmware-tanzu#1913)
  Use layers in the builder Dockerfile (vmware-tanzu#1907)
  Fix for vmware-tanzu#1888: check item's original namespace, not remapped one, for inclusion/exclusion (vmware-tanzu#1909)
  fail on make verify if generated CRDs differ (vmware-tanzu#1906)
  velero API type changes for structural schema CRDs (vmware-tanzu#1898)
  Generate CRDs with structural schema (vmware-tanzu#1885)
  Plan for moving plugin repos (vmware-tanzu#1870)
  move plugin proto updating into make update (vmware-tanzu#1887)
  Add features package (vmware-tanzu#1849)
  GCP: support specifying Cloud KMS key name for backup storage locations (vmware-tanzu#1879)
  Adds to website (vmware-tanzu#1882)
  proposal for generating Velero CRDs with structural schema (vmware-tanzu#1875)
  Improve contributing docs (vmware-tanzu#1852)
  [doc] Diagram (image) now mentions velero (vmware-tanzu#1877)
  AWS: add support for arbitrary SSE algorithms, e.g. AES256 (vmware-tanzu#1869)
  update restic docs for PR vmware-tanzu#1807 (vmware-tanzu#1867)
  changelog for PR vmware-tanzu#1864
  ...
Signed-off-by: Steve Kriss [email protected]
Closes #1343
This is kind of what I was thinking ref. #1343 (comment). Instead of relying on the revision file to tell us when BSL contents change, we get the full list of backups from the BSL (this is just a single API call to the object store) and do a setwise comparison (based on name) against the backups in the cluster. We then sync/import any that are in the BSL but not in the cluster.

I also removed the limitation that the sync period must be >= 1 minute - I figure users should be able to set it lower if they want.

I haven't yet removed the actual updating of the revision file when backups happen - we could choose to leave that in for now, or remove it entirely.

If this makes sense, I'll proceed with tests/etc.
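To make the idea concrete, here is a minimal sketch of one sync pass under simplifying assumptions: backups are identified by name only, the cluster is a hypothetical in-memory type, and importing a backup just records its name. None of these names come from the Velero codebase.

```go
package main

import (
	"fmt"
	"sort"
)

// cluster is a hypothetical in-memory stand-in for the backups that
// already exist in the cluster, keyed by name.
type cluster struct {
	backups map[string]struct{}
}

// importBackup records a backup as present in the cluster. A real sync
// would download metadata from the object store and create the resource.
func (c *cluster) importBackup(name string) {
	c.backups[name] = struct{}{}
}

// syncOnce runs a single pass: every name found in the store but not in
// the cluster is imported. It returns the imported names, sorted.
func syncOnce(inStore []string, c *cluster) []string {
	var imported []string
	for _, name := range inStore {
		if _, ok := c.backups[name]; ok {
			continue // already in the cluster, nothing to do
		}
		c.importBackup(name)
		imported = append(imported, name)
	}
	sort.Strings(imported)
	return imported
}

func main() {
	c := &cluster{backups: map[string]struct{}{"nightly-1": {}}}
	store := []string{"nightly-3", "nightly-1", "nightly-2"}

	fmt.Println("first pass: ", syncOnce(store, c))
	fmt.Println("second pass:", syncOnce(store, c))
}
```

Because each pass recomputes the full diff, a second pass over an unchanged store imports nothing, so no revision marker is needed to detect changes, and running the pass more often than once a minute only costs the extra listing.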