Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a new API for managing Security Groups #1752

Open
EmilienM opened this issue Nov 21, 2023 · 14 comments
Open

Implement a new API for managing Security Groups #1752

EmilienM opened this issue Nov 21, 2023 · 14 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@EmilienM
Copy link
Contributor

EmilienM commented Nov 21, 2023

/kind feature

How it works today

Currently, if OpenStackCluster.spec.managedSecurityGroups is True, security groups will be created and added to the Bastion (if it exists), the control plane and the worker nodes respectively.

By default, these groups have rules that allow the following traffic:

  • Bastion node
    • SSH traffic from other nodes
  • Control plane nodes
    • SSH traffic from other nodes
    • API server traffic from anywhere
    • Etcd traffic from other control plane nodes
    • Kubelet traffic from other cluster nodes
    • Calico CNI traffic from other cluster nodes
  • Worker nodes
    • SSH traffic from other nodes
    • Node port traffic from anywhere
    • Kubelet traffic from other cluster nodes
    • Calico CNI traffic from other cluster nodes

Additionally:

  • If this is not flexible enough, pre-existing security groups (via their name) can be added to the spec of an OpenStackMachineTemplate, via its OpenStackMachineSpec and for the Bastion in OpenStackCluster.
  • To permit all traffic between cluster nodes on all ports and protocols, we can set OpenStackCluster.spec.allowAllInClusterTraffic to True. However, API server and node port traffic is still permitted from anywhere, as with the default rules.
  • If OpenStackCluster.spec.apiServerLoadBalancer.enabled is true, it is possible to allow additional ports (both TCP & UDP will be opened!) from anywhere via OpenStackCluster.spec.apiServerLoadBalancer.additionalPorts.

Proposed solution

2 options:

  • Via the OpenStackCluster, we want to design a new API that will allow our users to specify their security group rules for the Control Plane and the workers, and offer flexibility. CAPO will manage these rules. If no rules are provided, then we maintain the current behaviour of not managing any security group and it's up to the user to handle it.

Or:

  • Create a new CRD OpenStackSecurityGroup and a new Controller (not necessarily by the way, the CRD can be managed by OpenStackCluster controller?)

And then:

  • Keep the option of providing pre-existing security groups in OpenStackMachineTemplate, via its OpenStackMachineSpec and for the Bastion in OpenStackCluster. They'll not be exclusive to the provided rules in the new API, they'll be able to live together and both sets of rules will be applied to the machines.
  • Deprecate OpenStackCluster.spec.apiServerLoadBalancer.additionalPorts as we'll have an API to create Control Plane Security Group rules (with more flexibility than we have now with this field). Conversion will happen between this parameter and the one one for Control Plane Security Group rules, for backward compatibility.
  • We want to design a migration path to the new API, allowing our users to keep the legacy Security Groups so they can safely upgrade and at some point use the new API. Maybe rename OpenStackCluster.spec.managedSecurityGroups to OpenStackCluster.spec.legacyManagedSecurityGroups ? However the new API will be mutually exclusive with the old ManagedSecurityGroups field, to avoid complexity.
  • OpenStackCluster.spec.allowAllInClusterTraffic can be kept (otherwise upgrade path becomes complicated) and ingress/egress traffic will be allowed within the cluster through additional rules in the managed Control Plane & Worker security groups.

API Change

// OpenStackClusterSpec defines the desired state of OpenStackCluster.
type OpenStackClusterSpec struct {

[...]

	// LegacyManagedSecurityGroups determines whether OpenStack security groups for the cluster
	// will be managed by the OpenStack provider or whether pre-existing security groups will
	// be specified as part of the configuration.
	// By default, the managed security groups have rules that allow the Kubelet, etcd, the
	// Kubernetes API server and the Calico CNI plugin to function correctly.
        // This is deprecated, `ControlPlaneSecurityGroup` and `WorkerSecurityGroup` should be used instead.
	// +optional
	LegacyManagedSecurityGroups bool `json:"legacyLanagedSecurityGroups"`

	// ControlPlaneSecurityGroup contains all the information about the OpenStack Security
	// Group that needs to be applied to control plane nodes.
	// By default, we will create a security group with rules that allow the Kubelet, etcd and
	// the Kubernetes API server to function correctly.
	// +optional
	ControlPlaneSecurityGroup *SecurityGroup `json:"controlPlaneSecurityGroup,omitempty"`

	// WorkerSecurityGroup contains all the information about the OpenStack Security
	// Group that needs to be applied to worker nodes.
	// By default, we will create a security group with rules that allow the Kubelet to
	// function correctly.
	// +optional
	WorkerSecurityGroup *SecurityGroup `json:"workerSecurityGroup,omitempty"`
}

or a new CRD.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 21, 2023
@EmilienM
Copy link
Contributor Author

/assign EmilienM

@jichenjc
Copy link
Contributor

Keep the option of providing pre-existing security groups in OpenStackMachineTemplate, via its OpenStackMachineSpec and for the Bastion in OpenStackCluster. They'll not be exclusive to the provided rules in the new API, they'll be able to live together and both sets of rules will be applied to the machines.

I think the original idea is to create a json file and describe the security group including its rules ?
and by default we will load them into CAPO something like a CR

will the above proposal be same or customer need create a CR by themselves?

Keep the option of providing pre-existing security groups in OpenStackMachineTemplate, via its OpenStackMachineSpec and for the Bastion in OpenStackCluster. They'll not be exclusive to the provided rules in the new API, they'll be able to live together and both sets of rules will be applied to the machines.

to confirm ,we still support pre-created sec group in openstack API layer then use in our template, correct?

@wwentland
Copy link
Contributor

There have been discussions about this in various places, for example:

I really like the way this is addressed in CAPA, which defines a CNIIngressRule resource, which could serve as an inspiration here as well. It might make sense to mirror their API/naming scheme where possible, to make it easier for users who are working with both providers.

@EmilienM
Copy link
Contributor Author

I really like the way this is addressed in CAPA, which defines a CNIIngressRule resource, which could serve as an inspiration here as well. It might make sense to mirror their API/naming scheme where possible, to make it easier for users who are working with both providers.

Me too, I very often take inspiration from CAPZ or CAPA when I think of new APIs in CAPO. However I also like to look at existing namings in CAPO and I think keeping it consistent with other fields is sometimes more helpful than trying to copy from other providers.

@EmilienM
Copy link
Contributor Author

I think the original idea is to create a json file and describe the security group including its rules ? and by default we will load them into CAPO something like a CR

Yes exactly.

will the above proposal be same or customer need create a CR by themselves?

I don't understand that question, but the customer provides the rules via the new API and CAPO manages the Security Group rules.

to confirm ,we still support pre-created sec group in openstack API layer then use in our template, correct?

Yes, we should not touch that part and that's what I mean when I say "keep the option of (...)".

@EmilienM
Copy link
Contributor Author

let's use this KEP to discuss design: #1756

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 20, 2024
@EmilienM
Copy link
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 21, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 21, 2024
@EmilienM
Copy link
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 21, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 19, 2024
@EmilienM
Copy link
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 19, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 17, 2024
@EmilienM
Copy link
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants