-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow backup storage locations to specify backup sync period or toggle off sync #1936
Conversation
57e57cc
to
0aa7035
Compare
@@ -37,6 +38,11 @@ import ( | |||
"github.com/vmware-tanzu/velero/pkg/plugin/clientmgmt" | |||
) | |||
|
|||
const ( | |||
backupSyncPeriodKey = "backupSyncPeriod" | |||
skipBackupSyncKey = "skipBackupSync" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use backupSyncPeriod: 0
be a sentinel value for skipping sync, so we only have one variable to check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes sense. I had initially thought of doing it that way but figured adding a config variable for skipping sync would make it more explicit. I'll update the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's certainly true that it's more explicit, which tends to be good. Curious what others think, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the changes to use backupSyncPeriod: 0 as a sentinel value for skipping sync all ready, just waiting for input from other reviewers then will update the PR.
5173fef
to
3bdeedc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove the --backup-sync-period
option in the Velero server?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @betta1 - few comments that we need to resolve before proceeding here.
@@ -134,6 +136,32 @@ func (c *backupSyncController) run() { | |||
|
|||
for _, location := range locations { | |||
log := c.logger.WithField("backupLocation", location.Name) | |||
|
|||
if backupSyncPeriod, ok := location.Spec.Config[backupSyncPeriodKey]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we're adding this to BackupStorageLocations, it makes more sense to be a first-class Spec
field (optional) than an opaque config key, since it's not provider-specific. It would also mean it could be strongly typed as a duration field. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I did find this strange. velero install
should also set the default value when creating a BSL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree we could have this as a first class Spec field on the BSL. I had brought this issue up during the community call 2 weeks ago and from the discussion at the time having this as a config key seemed an ok approach and I also found it less intrusive change than adding it as a Spec field. I initially viewed the custom backup sync period as something that BSLs can opt in if they want to have sync periods that are greater than the server's global backup sync period or toggle of sync entirely but we can def have this as a Spec field. In addition to the Spec field, we'd probably also need to add backup-sync-period flag to both velero-install and velero backup-location create cli commands as well.
log.Debug("Checking if backups need to be synced at this time for this location") | ||
lastSync := location.Status.LastSyncedTime | ||
nextSync := lastSync.Add(syncDuration) | ||
if time.Now().UTC().Before(nextSync) { | ||
c.logger.WithFields(logrus.Fields{ | ||
"backupLocation": location.Name, | ||
"lastSync": lastSync, | ||
"nextSync": nextSync, | ||
}).Debug("Sync not required at this time") | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The relationship between this logic and the --backup-sync-period
could produce some unintuitive behavior in certain instances. For example, if --backup-sync-period
is set to 1h
, but a specific BSL has a value of 1m
, that BSL will still only get synced every 1h
, because the run()
function that this check is in will only be run every hour.
I would like to do a little more thinking on how this should work. One option would be to have the run()
function run frequently (i.e. every 30s? minute? TBD), and each time it runs, check each BSL to see if it should be synced, based on its last-synced time, and either its specific sync interval, or the default value (as provided via --backup-sync-period
).
I am also curious - are you mainly interested in being able to disable sync for a specific BSL, or is the custom interval equally important? What's the use case for the custom interval?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, we should change this to the approach proposed by @skriss. The CronJob controller checks every 10 seconds, but I think 30s/1m is also reasonable for this usecase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skriss I agree with that approach as well. We're interested in both. For our use case we have an object store plugin and currently Velero queries the backup target every 1 min which is too frequent especially when we have several backup storage locations for a given backup target. The backup target may be shared with other tenants and we'd like to configure how frequently we need to connect to the target so we don't hit it with too many connection requests. There're other cases where the backup target might be offline e.g during an upgrade etc and the ability to toggle off backup sync in this case would be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skriss @prydonius I updated the PR as the approach suggested, please take a look and let me know any edits I need to make. Sorry for the late update I was OOO towards end of last week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no worries - I'll take a look soon.
18e6bff
to
5baa50b
Compare
5baa50b
to
d5e20eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@betta1 this mostly looks good, added a couple small specific comments. A couple general comments as well:
- https://github.com/vmware-tanzu/velero/blob/master/pkg/cmd/server/server.go#L197 - the help text can probably now be tweaked to indicate that this is the default sync period for BSLs if none is explicitly specified for a BSL
- I'm wondering if we want to make the hard-coded 1-minute resync period configurable? Prior to this change, a user could set it to e.g.
1s
if they wanted; now, 1 minute is effectively the minimum interval for syncing a BSL. I'm also wary of adding too many flags, so interested in others' thoughts here. We could also just lower that hardcoded value (30s? 10s?), which should reduce the need for customizing it.
Adding a flag for it is going to be a little confusing IMO (users might be confused about whether they should set that or set the sync period in the BSL). I think we should just reduce it to 10s or 30s. I don't really expect anyone will want to sync this faster than 10s, and especially since CronJob has this limitation also. |
…e off sync Signed-off-by: Antony Bett <[email protected]>
d5e20eb
to
132dea8
Compare
I agree, I think just lowering the hardcoded value for the resync period should be sufficient for now (probably we can create an issue to track exposing this as a configurable flag later if users are interested). I've lowered it to 30s as I don't want this change to cause Velero to consume more cycles/resources just running the resync process too frequently. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm - thanks @betta1!
pkg/cmd/cli/backuplocation/create.go
Outdated
@@ -78,6 +82,7 @@ func (o *CreateOptions) BindFlags(flags *pflag.FlagSet) { | |||
flags.StringVar(&o.Provider, "provider", o.Provider, "name of the backup storage provider (e.g. aws, azure, gcp)") | |||
flags.StringVar(&o.Bucket, "bucket", o.Bucket, "name of the object storage bucket where backups should be stored") | |||
flags.StringVar(&o.Prefix, "prefix", o.Prefix, "prefix under which all Velero data should be stored within the bucket. Optional.") | |||
flags.DurationVar(&o.BackupSyncPeriod, "backup-sync-period", defaultBackupSyncPeriod, "how often to ensure all Velero backups in object storage exist as Backup API objects in the cluster. Optional. Set this to `0s` to disable sync") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering if we want to not set an explicit value for the BSL if the user doesn't specify the flag, rather than defaulting it to a minute here? If we did that, then the server-level setting would apply. The benefit is that if the user then changes the server-level setting, it would affect all BSLs that were not set up with an explicit sync period, whereas now, they're all getting an individual setting of 1m by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this would actually be ideal. I'd like to make this change and the only issue i'm running into is how can I detect that the user didn't specify the flag? Atm when I set the default value to o.BackupSyncPeriod, like so:
flags.DurationVar(&o.BackupSyncPeriod, "backup-sync-period", o.BackupSyncPeriod, "how often..."
it gets parsed as 0s which is an issue since this means disable sync. Any suggestion on how to detect this flag wasn't provided?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can use this method: https://godoc.org/github.com/spf13/pflag#FlagSet.Changed
but I'm not 100% sure off the top of my head, would need to experiment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @skriss, I modified to check if the user explicitly specified a backup sync period and only then set the backup sync period for the BSL. Otherwise if user does not specify the flag the BSL backup sync period is nil which will have the BSL use the server side backup sync.
otherwise LGTM! |
Signed-off-by: Antony Bett <[email protected]>
132dea8
to
9b6d157
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
This resolves #1934