Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Select namespace to include by its labelSelector #7492

Open
kaovilai opened this issue Mar 4, 2024 · 20 comments · May be fixed by #8342
Open

Select namespace to include by its labelSelector #7492

kaovilai opened this issue Mar 4, 2024 · 20 comments · May be fixed by #8342
Assignees
Labels
Breaking change Impacts backwards compatibility kind/release-note kind/requirement Needs triage We need discussion to understand problem and decide the priority
Milestone

Comments

@kaovilai
Copy link
Member

kaovilai commented Mar 4, 2024

Describe the problem/challenge you have

User expect when specifying backup selector to match a namespace labelselector that all content of that namespace is included in backup.

When we are saying 'namespace backup' we mean backup of the namespace content. Similarly as it works when using spec.includedNamespaces

old request prior to #8342

includedNamespacesByLabelSelector Equivalent to includedNamespaces in Backup/Restore CR but not by name, but namespace's labelSelector.

User would not have to label each item inside the namespace, they can update which namespaces have this label, which on a future run manually or via schedule, could desirably result in different list of namespaces being included.

Not to be confused with #7105 which wants to exclude namespaces not containing backup items selected by label.

Describe the solution you'd like

#8342

old solutions prior to #8342

includedNamespacesByLabelSelector Equivalent to includedNamespaces in Backup/Restore CR but not by name, but namespace's labelSelector.

User would not have to label each item inside the namespace, they can update which namespaces have this label, which on a future run manually or via schedule, could desirably result in different list of namespaces being included.

Not to be confused with #7105 which wants to exclude namespaces not containing backup items selected by label.

Anything else you would like to add:

Environment:

  • Velero version (use velero version):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version:
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "The project would be better with this feature added"
  • 👎 for "This feature will not enhance the project in a meaningful way"

Hackaround

# not tested, just quick notes
NSes=$(kubectl get ns --label -oname)
echo '
backup…
includedNamespaces: $NSes
' | kubectl create -f --

and if you need this to be a schedule, can probably also wrap this into a kubernetes cronjob

@kaovilai kaovilai changed the title Select namespace to include by it's labelSelector Select namespace to include by its labelSelector Mar 4, 2024
@blackpiglet
Copy link
Contributor

@kaovilai
Thanks for proposing this solution.
I suggest to put this on hold for a while. If there is a way to make it work in #7105, there is no need to introduce a new filter.
Although we can make the decision about which filter has higher priority, more filters will introduce confusion to the user.

@kaovilai
Copy link
Member Author

kaovilai commented Mar 8, 2024

Ok. I think we can make it work within 7105.

If user label a namespace to backup, I think they would want everything in it.. thus it should behave equivalently to includedNamespaces.

No need for new flag.

@reasonerjt reasonerjt added Needs triage We need discussion to understand problem and decide the priority kind/requirement labels Mar 11, 2024
@kaovilai kaovilai closed this as not planned Won't fix, can't repro, duplicate, stale Apr 17, 2024
@blackpiglet
Copy link
Contributor

blackpiglet commented Apr 19, 2024

The PR #7697 addresses this requirement.

@kaovilai
Copy link
Member Author

Re closing as completed per 7697.

@muellerfabi
Copy link

Re closing as completed per 7697.

tbh after checking that PR, I still don't get how to backup selected namespaces upon their label.

@blackpiglet
Copy link
Contributor

@muellerfabi
There are two rules:

  • When the backup's namespace filter is not set, or all namespaces are included, and the backup's label selectors are set, the namespaces, that do not have backup including resources, are not collected.
  • If the namespace I/E filters and the (Or)LabelSelectors selected namespaces are different. The tracker takes the union of them.

@kaovilai
Copy link
Member Author

kaovilai commented May 8, 2024

If the namespace I/E filters and the (Or)LabelSelectors selected namespaces are different. The tracker takes the union of them.

Which is equivalent to --includedNamespaces right?

@blackpiglet
Copy link
Contributor

Say there is a namespace ns1 labeled as demo=issue-7492, and a namespace ns2, then create a backup with parameters --selector demo=issue-7493 and --include-namespaces ns2.
Both namespace ns1 and ns2 are included in the backup.

@kaovilai
Copy link
Member Author

reports in #8340 say resources under labeled namespaces that are itself not labeled are not included.

@kaovilai
Copy link
Member Author

reopening for

  • at least some unit tests that validate functionality.
  • or fix/discussion of requirement whether unlabeled resources under labeled namespace is supposed to be included
  • and/or if we are ok with "breaking change" to complete this requirement.

@kaovilai kaovilai reopened this Oct 23, 2024
kaovilai added a commit to kaovilai/velero that referenced this issue Oct 23, 2024
kaovilai added a commit to kaovilai/velero that referenced this issue Oct 23, 2024
@lorenzofelletti
Copy link

Personally, I see two ways for implementing this – slightly different semantically – and I don't have a strong preference for one or the other. Either way, is a feature I'd really like to have when defining backups.

Approach 1

One way is the one proposed in #8342, i.e. if a matching label selector is put on the namespace this means that we want all resources in that namespace to be backed up also, thus treating them as "subresources of the namespace".

Pros

  • easier to implement
  • does not require changing the spec

Cons

  • changes the behaviour of labelSelector
  • mixes up namespace and resource selection logic (what if one's not interested in all the resources in the namespace?).

Approach 2

Another way to approach the problem, which is consistent with the current way backups work, is to have a separate selector, say namespaceLabelSelector, that is specific only to find which namespaces one is interested to backup, and then leave the current labelSelector behaviour unchanged of selecting individual resources.

Pros

  • consistent with having some spec fields to define the namespaces I'm interested in, and some other to select resources
  • allows to still select only individual resources inside the namespace using labelSelector as usual.

Cons

  • requires changing the spec.

@kaovilai
Copy link
Member Author

#8342 way is chosen because I believe overall sentiment is to not "introduce new filter". And I think expectations of most users when trying to label NS is like approach 1. but approach 2 is ok too and was the original proposal.

@kaovilai Thanks for proposing this solution. I suggest to put this on hold for a while. If there is a way to make it work in #7105, there is no need to introduce a new filter. Although we can make the decision about which filter has higher priority, more filters will introduce confusion to the user.

@reasonerjt reasonerjt added Breaking change Impacts backwards compatibility kind/release-note labels Oct 27, 2024
@kaovilai
Copy link
Member Author

kaovilai commented Oct 29, 2024

@lorenzofelletti @shubham-pampattiwar please help list scenarios where you may find ability to use label to choose includedNamespaces better than writing individual namespaces into backup spec.

#8342 (comment)

From what I'm gathering, it appears so user don't have to motify their backup yaml and reuse the same backup as their need changes by labeling namespaces needed via other tools like kubectl or even argocd.

I need to double check the implementation that backup describe lists the namespaces included by label.

@lorenzofelletti
Copy link

A use case in which that would be a nice feature to have is to define the Schedule once, and then don't have to worry about keeping it in sync every time you add a new namespace.

@blackpiglet
Copy link
Contributor

blackpiglet commented Oct 31, 2024

Thanks for the input, but I didn't quite understand your point.
Using the label selector to select the namespace should work the same as the namespace filter in the scenario you mentioned.
They both can specify which namespaces to include in the backup and adding new namespaces in the k8s cluster doesn't impact the namespace selection.

@lorenzofelletti
Copy link

Sorry I didn't explain my point clearly enough in the previous message.

This is a scenario I encountered that would improve with such a feature:

  • We manage N clusters
  • on each of them we don't know beforehand which namespaces will be added, and the number of namespaces vary over time.

One invariant we know is that all namespaces we care to backup will have a specific label.

Such a feature would allow us to define the Schedule once, then deploy it on each cluster, and never having to update it (unless for functional changes).

Right now, on the other hand, we need to tailor the Schedule to each cluster it will be deployed on (because each cluster will have its own list of namespaces), and also whenever someone adds a new namespace need to make sure that the new namespace is added to the list of included namespaces.
Not a huge deal, but such a feature would allow us to automate the logic "namespace has label -> thus backup up it" without having to manually reproduce it in each cluster, and having to keep it in sync over time.

@sergioperez
Copy link

sergioperez commented Nov 19, 2024

Hello,

A very similar scenario is:

  • There are different Velero Schedules in a cluster (hourly, daily, weekly, monthly)
  • Each schedule targets a subset of namespaces labeled as "backup=weekly"
  • The application owners are able to control this label in a cluster, hence they are able to have some control about how their application is backed up

This would let us dynamically select namespaces with their content without manually injecting labels into all the resources. At the same time, the Schedules would not need to be updated when a new namespace is created.

@kaovilai
Copy link
Member Author

kaovilai commented Nov 19, 2024

So basically scenarios where users expecting backup do not have direct access to velero resources.

@kaovilai
Copy link
Member Author

kaovilai commented Dec 4, 2024

From community meeting @sseago suggested that to avoid breaking change, add a new field to backup.

so user would have to set backup.spec.includResourcesInLabeledNamespaces etc.

@reasonerjt
Copy link
Contributor

@kaovilai I recall in the meeting we agreed that we don't want to introduce more flags or attributes to control the behavior of velero in this scenario. Therefore, if we think this is worth fixing it's ok we introduce this break change.

@reasonerjt reasonerjt assigned kaovilai and unassigned blackpiglet Dec 11, 2024
@kaovilai kaovilai added this to the v1.16 milestone Dec 11, 2024
@kaovilai kaovilai added this to OADP Dec 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Breaking change Impacts backwards compatibility kind/release-note kind/requirement Needs triage We need discussion to understand problem and decide the priority
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants