helm: better support parallel deployments #650

Closed
marquiz opened this issue Nov 16, 2021 · 14 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@marquiz
Contributor

marquiz commented Nov 16, 2021

What would you like to be added:

Spun out from #640 (comment):

Other Helm apps may also want to add NFD as a dependency. Actually, custom/local NFD sources look like they were designed for that reason. What needs to be done here is to make sure that there is a configuration for adding NFD as a Helm chart dependency in a way that it won't overlap with other NFD Helm chart dependencies and/or other NFD instances that may be present in the cluster (e.g. the volume mounts that configure the local source is an example of such an overlap, and this PR is another).

We should properly support multiple parallel Helm deployments. That is, we should isolate them sufficiently to avoid races and clashes.

Why is this needed:

Helm makes it possible to have multiple parallel deployments, and we should try our best to support this.

marquiz added the kind/feature label on Nov 16, 2021

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) on Feb 14, 2022

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Mar 16, 2022

@marquiz
Contributor Author

marquiz commented Mar 16, 2022

Still a valid issue; helping hands would be welcome.
/remove-lifecycle rotten

k8s-ci-robot removed the lifecycle/rotten label on Mar 16, 2022

@k8s-triage-robot

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Jun 14, 2022

@marquiz
Contributor Author

marquiz commented Jul 8, 2022

#831 certainly is one step in solving this issue.

@jasine do you have any comments on this issue? Have you tried multiple parallel Helm-based NFD deployments?

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label on Jul 8, 2022

@jasine
Contributor

jasine commented Jul 9, 2022

> #831 certainly is one step in solving this issue.
>
> @jasine do you have any comments on this issue? Have you tried multiple parallel Helm-based NFD deployments?
>
> /remove-lifecycle stale

@marquiz I just tried to make a second deployment on the cluster, and it failed with the following error:

Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: CustomResourceDefinition "nodefeaturerules.nfd.k8s-sigs.io" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "nfd-2": current value is "nfd-1"

The reason is that the CRD nodefeaturerules.nfd.k8s-sigs.io is placed under the manifests folder, while Helm encourages placing CRDs in the crds folder.

When I renamed manifests to crds, multiple parallel Helm-based NFD deployments succeeded.
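
The error above can be read as an ownership check. As a simplified illustration (not Helm's actual code), the Python sketch below models how a cluster-scoped resource annotated for release nfd-1 is rejected when release nfd-2 tries to install it. The function name `ownership_error` is hypothetical, and the real check also involves other metadata that is left out here. Resources in a chart's crds folder are handled differently: Helm installs them only if they are not already present, which would explain why the rename lets a second release install.

```python
# Simplified model of the ownership check behind the error above.
# This is an illustration only, not Helm's implementation.

RELEASE_NAME_KEY = "meta.helm.sh/release-name"


def ownership_error(existing_annotations, release_name):
    """Return an error string if the resource belongs to another release, else None."""
    current = existing_annotations.get(RELEASE_NAME_KEY)
    if current != release_name:
        return (
            f'key "{RELEASE_NAME_KEY}" must equal "{release_name}": '
            f'current value is "{current}"'
        )
    return None


# The cluster-scoped CRD was created by the first release, "nfd-1".
crd_annotations = {RELEASE_NAME_KEY: "nfd-1"}

# The owning release passes the check.
print(ownership_error(crd_annotations, "nfd-1"))
# A second release is rejected, as in the error message above.
print(ownership_error(crd_annotations, "nfd-2"))
```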

@eliaskoromilas
Contributor

I think that, as of NFD v0.11.1, there is no "real" need for parallel deployments. Even custom feature sources can be applied with a simple NodeFeatureRule. A cluster-scoped NFD deployment is just enough to address the needs of every app that either uses the out-of-the-box feature labels, dynamically specifies new features using the local filesystem, or registers custom rules through the operator.

Having said that, I wouldn't suggest using NFD as a direct Helm dependency, but instead as a requirement in a higher layer (e.g. Helmfile).
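
For readers unfamiliar with the resource mentioned above, here is a minimal sketch of what registering a custom rule might look like, built as a Python dictionary and printed as JSON. The object name, rule name, label, and the "dummy" kernel module are placeholders chosen for this example, and the `nfd.k8s-sigs.io/v1alpha1` API version is assumed to match the NFD release discussed here. Note that the object carries no namespace, since NodeFeatureRule is cluster-scoped.

```python
import json

# Hypothetical rule: the object name, rule name, label, and the "dummy"
# kernel module are placeholders for this sketch.
node_feature_rule = {
    "apiVersion": "nfd.k8s-sigs.io/v1alpha1",  # assumed API version
    "kind": "NodeFeatureRule",
    "metadata": {"name": "example-rule"},  # cluster-scoped: no namespace
    "spec": {
        "rules": [
            {
                "name": "example kernel module rule",
                # Label to set on nodes where the rule matches.
                "labels": {"example-dummy-module": "true"},
                # Match nodes that have the "dummy" kernel module loaded.
                "matchFeatures": [
                    {
                        "feature": "kernel.loadedmodule",
                        "matchExpressions": {"dummy": {"op": "Exists"}},
                    }
                ],
            }
        ]
    },
}

manifest = json.dumps(node_feature_rule, indent=2)
print(manifest)
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` on a cluster where the NFD CRD is installed.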

@marquiz
Contributor Author

marquiz commented Aug 9, 2022

> I think that, as of NFD v0.11.1, there is no "real" need for parallel deployments. Even custom feature sources can be applied with a simple NodeFeatureRule. A cluster-scoped NFD deployment is just enough to address the needs of every app that either uses the out-of-the-box feature labels, dynamically specifies new features using the local filesystem, or registers custom rules through the operator.

Yeah, I fully agree on this 👍 We're really trying to make it unnecessary to have multiple parallel NFD deployments. The `NodeFeatureRule` already causes some fuss with parallel deployments, as it's a cluster-scoped (non-namespaced) resource and, by default, only the default instance processes those resources. #828 will complicate matters further, probably making parallel NFD installations unsupported when/if gRPC communication is dropped from NFD. So, I think I'm not going to invest my time in this issue anymore.

Having said that, I'm still open to contributions if somebody wants to work on this. Currently I see two shortcomings in parallel installs: CRDs (thanks @jasine) and the hostPath mounts for hooks and feature files (/etc/kubernetes/node-feature-discovery/{source.d/ | features.d/}). The CRD problem would be an easy fix: just rename the manifests folder to crds. For the mounts, we could think about "namespacing" the host dirs, e.g. /etc/kubernetes/node-feature-discovery/source.d.{INSTANCE}/
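
The "namespacing" idea for host directories could be sketched roughly as follows. This is only an illustration of the naming scheme: the helper `instance_host_dirs` is hypothetical, and applying the same suffix to features.d is an assumption extrapolated from the source.d example, not something NFD implements.

```python
from pathlib import PurePosixPath

BASE_DIR = PurePosixPath("/etc/kubernetes/node-feature-discovery")


def instance_host_dirs(instance):
    """Return per-instance hostPath directories for hooks and feature files."""
    # Reject empty names and names that would escape the base directory.
    if not instance or "/" in instance:
        raise ValueError(f"invalid instance name: {instance!r}")
    return {
        "source.d": str(BASE_DIR / f"source.d.{instance}"),
        # Assumption: features.d follows the same pattern as source.d.
        "features.d": str(BASE_DIR / f"features.d.{instance}"),
    }


# Two parallel deployments get disjoint host directories.
dirs_a = instance_host_dirs("nfd-1")
dirs_b = instance_host_dirs("nfd-2")
print(dirs_a["source.d"])
print(dirs_b["source.d"])
```

In a chart, the instance name would most naturally come from the Helm release name, so that each release mounts its own directories.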

@k8s-triage-robot

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Nov 7, 2022

@vaibhav2107
Contributor

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label on Nov 14, 2022

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label on Feb 12, 2023

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Mar 14, 2023

@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

k8s-ci-robot closed this as not planned on Apr 13, 2023

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".



6 participants