Should service action (restore) remove all node ports from a service? #354

Closed

ncdc opened this issue Mar 6, 2018 · 14 comments
Labels: Breaking change (Impacts backwards compatibility), Enhancement/User (End-User Enhancement to Velero)
Milestone: v1.0.0

Comments

@ncdc
Contributor

ncdc commented Mar 6, 2018

We are currently removing all nodePorts from all services when we restore them:

https://github.com/heptio/ark/blob/973f630cc7b2bdd464cc4e78d776ab4099a43381/pkg/restore/service_action.go#L60-L63
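For context, the effect of those lines is roughly the following. This is a minimal sketch, not the actual source; it assumes the action simply zeroes out each port's nodePort so the API server assigns fresh ones on restore, and it uses a typed Service where the real action operates on unstructured data.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// clearNodePorts zeroes every nodePort on the Service; a value of 0 tells the
// API server to allocate a new port when the object is created.
func clearNodePorts(svc *corev1.Service) {
	for i := range svc.Spec.Ports {
		svc.Spec.Ports[i].NodePort = 0
	}
}

func main() {
	svc := &corev1.Service{
		Spec: corev1.ServiceSpec{
			Ports: []corev1.ServicePort{{Port: 80, NodePort: 30080}},
		},
	}
	clearNodePorts(svc)
	fmt.Println(svc.Spec.Ports[0].NodePort) // prints 0
}
```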

This doesn't really make sense to me. @skriss do you remember what the original motivation for this was?

@skriss
Contributor

skriss commented Mar 6, 2018

IIRC it was because keeping them makes the "clone namespace" use case fail due to port collisions.

@ncdc
Contributor Author

ncdc commented Mar 6, 2018

Ok, that makes sense. I don't think they should generally be dropped by default, though. We should think through ways to allow the user to express the desired behavior in a clean manner.

@ncdc
Contributor Author

ncdc commented Mar 8, 2018

cc @jbeda - another UX question. How do we support both 1) restore into a new cluster and 2) restore into a new cluster under a different namespace?

@jbeda

jbeda commented Mar 8, 2018

My answer would be similar to #192 (comment). By all means, let's come up with granular flags for stuff like this, or have a config file for specifying what gets remapped how. But then package up the common use cases under a single flag, and document it so that users can tweak it if they want.

@ncdc
Contributor Author

ncdc commented Mar 9, 2018

Makes sense. Maybe we allow users to remap node ports as an option?

ark restore create --from-backup foo --namespace-mappings bob1:bob2 --node-port-mappings 40001:40002,41234:42345
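If something like that existed, the server-side handling could be a small lookup applied to each restored Service. A hedged sketch; the --node-port-mappings flag above is only a proposal, and the helper here is hypothetical:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// remapNodePorts rewrites nodePorts according to a user-supplied mapping,
// e.g. {40001: 40002, 41234: 42345} parsed from the proposed
// --node-port-mappings flag. Ports not present in the map are left untouched.
func remapNodePorts(svc *corev1.Service, mappings map[int32]int32) {
	for i, p := range svc.Spec.Ports {
		if target, ok := mappings[p.NodePort]; ok {
			svc.Spec.Ports[i].NodePort = target
		}
	}
}

func main() {
	svc := &corev1.Service{
		Spec: corev1.ServiceSpec{
			Ports: []corev1.ServicePort{{Port: 80, NodePort: 40001}},
		},
	}
	remapNodePorts(svc, map[int32]int32{40001: 40002, 41234: 42345})
	fmt.Println(svc.Spec.Ports[0].NodePort) // prints 40002
}
```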

@jbeda

jbeda commented Mar 9, 2018

That sounds great. But from a user's point of view, you'd want to be able to ask "What node ports does this backup/namespace use?" and "Will those conflict with the cluster I'm trying to apply to?" Answering those questions doesn't fit cleanly into this model, unfortunately.

In addition, you may want to let the system pick the node port as you remap. Unfortunately, it is difficult to know whether the node port in the backup was assigned by the system during admission control or assigned explicitly by the user. Once Daniel's big multi-merge thing goes in, we may have more data. To really tell what the intent was, we'd have to start cracking into the annotation added by kubectl apply.

Perhaps we need some sort of restore dry run that helps suss out these issues and gives you a summary of the resources to be created, along with perhaps testing those against the target cluster? That is a pretty big new feature, though.

@ncdc
Contributor Author

ncdc commented Mar 9, 2018

Yeah, this is potentially a pretty big thing... Right now it's "fire and forget" when it comes to restoring. The server does the best it can, but it's all automated and there's no opportunity for user input. Nor do we have a holistic in-memory view of either the backed-up resources or the destination cluster's resources, so we can't currently easily evaluate everything and say "hey, your node port is going to conflict".

nrb pushed a commit to nrb/velero that referenced this issue Mar 29, 2018
When restoring resources that raise an already exists error, check their
equality before logging a message on the restore. If they're the same
except for some metadata, don't generate a message.

The restore process was modified so that if an object had an empty
namespace string, no namespace key is created on the object. This was to
avoid manipulating the copy of the current cluster's object by adding
the target namespace.

There are some cases right now that are known to not be equal via this
method:

- The `default` ServiceAccount in a namespace will not match, primarily
because of differing default tokens. These will be handled in their own
patch
- IP addresses for Services are recorded in the backup object, but are
either not present on the cluster object, or different. An issue for
this already exists at vmware-tanzu#354
- Endpoints have differing values for `renewTime`. This may be
insubstantial, but isn't currently handled by the resetMetadataAndStatus
function.
- PersistentVolume objects in the backup have a `claimRef` section,
while those in cluster do not.

Signed-off-by: Nolan Brubaker <[email protected]>
nrb pushed a commit to nrb/velero that referenced this issue Apr 5, 2018
When restoring resources that raise an already exists error, check their
equality before logging a message on the restore. If they're the same
except for some metadata, don't generate a message.

The restore process was modified so that if an object had an empty
namespace string, no namespace key is created on the object. This was to
avoid manipulating the copy of the current cluster's object by adding
the target namespace.

There are some cases right now that are known to not be equal via this
method:

- The `default` ServiceAccount in a namespace will not match, primarily
because of differing default tokens. These will be handled in their own
patch
- IP addresses for Services are recorded in the backup object, but are
either not present on the cluster object, or different. An issue for
this already exists at vmware-tanzu#354
- Endpoints have differing values for `renewTime`. This may be
insubstantial, but isn't currently handled by the resetMetadataAndStatus
function.
- PersistentVolume objects do not match on spec fields, such as
claimRef and cloud provider persistent disk info

Signed-off-by: Nolan Brubaker <[email protected]>
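The equality check described in these commit messages boils down to stripping volatile metadata and status from both objects before comparing them. A rough sketch of that idea, assuming unstructured objects; the field list here is illustrative, not the one from the patch:

```go
package restore

import (
	"reflect"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// stripVolatileFields removes fields that legitimately differ between a
// backed-up object and the copy already in the cluster, so the comparison
// only sees fields the user actually specified.
func stripVolatileFields(obj *unstructured.Unstructured) *unstructured.Unstructured {
	out := obj.DeepCopy()
	unstructured.RemoveNestedField(out.Object, "status")
	unstructured.RemoveNestedField(out.Object, "metadata", "uid")
	unstructured.RemoveNestedField(out.Object, "metadata", "resourceVersion")
	unstructured.RemoveNestedField(out.Object, "metadata", "creationTimestamp")
	return out
}

// objectsMatch reports whether the backed-up and in-cluster objects are equal
// once the volatile fields above are ignored; if they match, the restore can
// skip logging an "already exists" message.
func objectsMatch(backup, cluster *unstructured.Unstructured) bool {
	return reflect.DeepEqual(stripVolatileFields(backup).Object, stripVolatileFields(cluster).Object)
}
```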
@rosskukulinski rosskukulinski added this to the v1.0.0 milestone Jun 15, 2018
@rosskukulinski
Contributor

Ref #448

This is tricky, because NodePort can be defined by the user, or can be selected by kube-controller-manager at creation time. IMO, if kube-controller-manager picked the NodePort, then let that process happen again. If the user specified it, then we probably need a remap mechanism. I don’t believe this information is actually available to Ark after the fact.

This is a big-impact issue that touches UX (via CLI flags) and user expectations. Will need more thought around use-cases.

@jbeda

jbeda commented Jun 18, 2018

This is tricky, but I think there is a generally accepted way here. There is a set of annotations that lists the specified state of the resource; kubectl apply uses these. From there you can figure out user intent. It isn't 100% guaranteed to work, but it is something that we can document.

I'd say:

  1. If the annotation is there and the user specified the node port, then keep that specification. If they didn't, then remove it from the backup (see the sketch after this list).
  2. If the annotation is missing, then default to something and give users an option to modify that default. I'm leaning toward removing the node port specification but could go either way.
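A sketch of what step 1 could look like. The annotation key is the standard one kubectl apply writes; everything else (the helper name, the fallback behavior) is illustrative rather than a description of Ark's code:

```go
package restore

import (
	"encoding/json"

	corev1 "k8s.io/api/core/v1"
)

const lastAppliedAnnotation = "kubectl.kubernetes.io/last-applied-configuration"

// userSpecifiedNodePorts returns the nodePorts the user explicitly put in the
// spec they applied, based on kubectl's last-applied-configuration annotation.
// If the annotation is missing or unparsable, it returns nil and the caller
// falls back to the default behavior (e.g. dropping all nodePorts).
func userSpecifiedNodePorts(svc *corev1.Service) map[int32]bool {
	raw, ok := svc.Annotations[lastAppliedAnnotation]
	if !ok {
		return nil
	}
	var applied corev1.Service
	if err := json.Unmarshal([]byte(raw), &applied); err != nil {
		return nil
	}
	ports := map[int32]bool{}
	for _, p := range applied.Spec.Ports {
		if p.NodePort != 0 {
			ports[p.NodePort] = true
		}
	}
	return ports
}
```

The restore action could then keep only the nodePorts in that set and zero out the rest.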

Over time, we may need a way to specify a generic munging policy so that users can specify how/where to remove stuff like this on a per-restore basis.

Also be aware that there is work to make that annotation more official with a feature called "server-side apply", and we should be tracking that. BryanL has experience with this issue wrt ksonnet, and it may be worth reaching out to him.

@rosskukulinski
Contributor

WFM +1

Perhaps #474 could help in terms of generic munging policy down the road.

@rosskukulinski rosskukulinski added Enhancement/User End-User Enhancement to Velero and removed Enhancement labels Jun 25, 2018
@ncdc ncdc added the Breaking change Impacts backwards compatibility label Jun 27, 2018
@timoreimann
Contributor

timoreimann commented Jul 16, 2018

Just to add one user data point: we just encountered this very issue when running a test recovery with Ark on one of our namespaces, where the omission of the node ports prevented some access patterns we had employed before. In this case, it would have been desirable to default to keeping the specified node ports, or at least to have an option to enforce that behavior.

It's nothing overly critical for us right now, though, as we pin node ports in a single, non-production use case only. However, it's something we'll have to look out for when creating new service specs.

The approach @jbeda describes, leveraging the specified JSON state from the annotation (when available), sounds like a reasonable first step. If that's something we could agree on, I'd be happy to take a stab at a PR. Please let me know.

@rosskukulinski
Contributor

Thanks for volunteering, @timoreimann! @ncdc / @skriss, does Joe's approach work for you? [My suggestion for (2) is to default to removing nodePort and add a CLI flag to enable keeping it.]

@skriss
Contributor

skriss commented Jul 17, 2018

The approach makes sense to me, and I think removing nodePort by default during restore is sensible. I'd still back everything up, which is in line with Ark's existing practice.

@skriss
Contributor

skriss commented Oct 12, 2018

This was fixed by #712; closing.

@skriss skriss closed this as completed Oct 12, 2018
weshayutin pushed a commit to weshayutin/velero that referenced this issue Oct 21, 2024