diff --git a/docs/proposals/20201020-management-cluster-operator.md b/docs/proposals/20201020-management-cluster-operator.md
index 0a6881e9a3d9..983c6d5714f3 100644
--- a/docs/proposals/20201020-management-cluster-operator.md
+++ b/docs/proposals/20201020-management-cluster-operator.md
@@ -6,6 +6,9 @@ authors:
 reviewers:
   - "@vincepri"
   - "@ncdc"
+  - "@justinsb"
+  - "@detiber"
+  - "@CecileRobertMichon"
 creation-date: 2020-09-14
 last-updated: 2020-10-20
 status: implementable (aspirational :wink:)
@@ -62,29 +65,37 @@ Any discrepancies should be rectified in the main Cluster API glossary.
 ## Summary

 The clusterctl CLI currently handles the lifecycle of Cluster API
-providers installed in a management cluster. It provides a great Day 1
-experience in getting CAPI up and running. However, clusterctl’s imperative
+providers installed in a management cluster. It provides a great Day 0 and Day
+1 experience in getting CAPI up and running. However, clusterctl’s imperative
 design makes it difficult for cluster admins to stand up and manage CAPI
 management clusters in their own preferred way.

 This proposal provides a solution that leverages a declarative API and an
 operator to empower admins to handle the lifecycle of providers within the
-management cluster
+management cluster.

 ## Motivation

-In its current form clusterctl is designed to provide a super simple user
-experience for day 1 operations of a Cluster API management cluster with a
-single infrastructure provider.
+In its current form clusterctl is designed to provide a simple user experience
+for day 1 operations of a Cluster API management cluster.

 However such design is not optimized for supporting declarative approaches
-when operating Cluster API management clusters. For example, in order to
-upgrade a cluster we now need to supply all the information that was provided
-initially during a `clusterctl init` which is inconvenient in many cases such
-as distributed teams and CI pipelines where the configuration needs to be
-stored and synced externally.
+when operating Cluster API management clusters.
-
-With the management cluster operator, we aim to address these limitations by
+
+These declarative approaches are important to enable GitOps workflows in case
+users don't want to rely solely on the `clusterctl` CLI.
+
+Providing a declarative API also enables us to leverage controller-runtime's
+new component config and allows us to configure the controller manager and even
+the resource limits of the provider's deployment.
+
+Another example is improving cluster upgrades. In order to upgrade a cluster
+we now need to supply all the information that was provided initially during a
+`clusterctl init` which is inconvenient in many cases such as distributed
+teams and CI pipelines where the configuration needs to be stored and synced
+externally.
+
+With the management cluster operator, we aim to address these use cases by
 introducing an operator that handles the lifecycle of providers within the
 management cluster based on a declarative API.

@@ -92,23 +103,23 @@ management cluster based on a declarative API.

 - Define an API that enables declarative management of the lifecycle of
   Cluster API providers
-- Define and document how providers’ installation and upgrades using the
-  operator should work in air-gapped environments.
-- Define the lifecycle of the operator itself.
+- Support air-gapped environments through sufficient documentation initially.
 - Identify and document differences between clusterctl CLI and the operator in
   managing the lifecycle of providers, if any.
 - Define how the clusterctl CLI should be changed in order to interact with
   the management cluster operator in a transparent and effective way.
+- Support upgrading from an earlier version of Cluster API to one managed by
+  the operator. At a minimum, upgrading from a v1alpha3 version of Cluster API
+  (i.e. v0.3.[TBD]) will be supported.

 ### Non-Goals/Future Work

-- Deprecate the clusterctl CLI.
 - Manage cluster templates using the operator.
 - Implement an operator driven version of `clusterctl move`.
 - Manage cert-manager using the operator.
 - Support multiple installations of the same provider within a management
   cluster in light of [issue 3042] and [issue 3354].
-- Support multiple template processors.
+- Support any template processing engines.
 - Support the installation of v1alpha3 providers using the operator.

 ## Proposal
@@ -120,11 +131,8 @@ management cluster based on a declarative API.
 1. As an admin, I would like to have an easy and declarative way to change
    controller settings (e.g. enabling pprof for debugging).
 1. As an admin, I would like to have an easy and declarative way to change the
-   resource requirements such as limits and requests for a provider
-   deployment.
-1. As an admin, I would like to upgrade the Cluster API providers in a
-   management cluster without having to provide the same information I used
-   when I did `clusterctl init`.
+   resource requirements (e.g. limits and requests for a provider
+   deployment).
 1. As an admin, I would like to have the option to use clusterctl CLI as of
    today, without being concerned about the operator.
 1. As an admin, I would like to be able to install the operator using kubectl
@@ -138,6 +146,9 @@ The existing `Provider` type used by the clusterctl CLI will be deprecated and
 its instances will be migrated to instances of the new API types as defined in
 the next section.

+The management cluster operator will be responsible for migrating the existing
+provider types to support GitOps workflows excluding `clusterctl`.
+
 #### New API Types

 These are the new API types being defined.
@@ -157,19 +168,8 @@ type CoreProvider struct {
     metav1.TypeMeta   `json:",inline"`
     metav1.ObjectMeta `json:"metadata,omitempty"`
-
-    Spec   CoreProviderSpec   `json:"spec,omitempty"`
-    Status CoreProviderStatus `json:"status,omitempty"`
-}
-
-// CoreProviderSpec defines the desired state of CoreProvider
-type CoreProviderSpec struct {
-    ProviderSpec `json:",inline"`
-}
-
-// CoreProviderStatus defines the observed state of CoreProvider
-type CoreProviderStatus struct {
-    ProviderStatus `json:",inline"`
+    Spec   ProviderSpec   `json:"spec,omitempty"`
+    Status ProviderStatus `json:"status,omitempty"`
 }

 // BootstrapProvider is the Schema for the BootstrapProviders API
@@ -177,18 +177,8 @@ type BootstrapProvider struct {
     metav1.TypeMeta   `json:",inline"`
     metav1.ObjectMeta `json:"metadata,omitempty"`
-    Spec   BootstrapProviderSpec   `json:"spec,omitempty"`
-    Status BootstrapProviderStatus `json:"status,omitempty"`
-}
-
-// BootstrapProviderSpec defines the desired state of BootstrapProvider
-type BootstrapProviderSpec struct {
-    ProviderSpec `json:",inline"`
-}
-
-// BootstrapProviderStatus defines the observed state of BootstrapProvider
-type BootstrapProviderStatus struct {
-    ProviderStatus `json:",inline"`
+    Spec   ProviderSpec   `json:"spec,omitempty"`
+    Status ProviderStatus `json:"status,omitempty"`
 }

 // ControlPlaneProvider is the Schema for the ControlPlaneProviders API
@@ -196,18 +186,8 @@ type ControlPlaneProvider struct {
     metav1.TypeMeta   `json:",inline"`
     metav1.ObjectMeta `json:"metadata,omitempty"`
-    Spec   ControlPlaneProviderSpec   `json:"spec,omitempty"`
-    Status ControlPlaneProviderStatus `json:"status,omitempty"`
-}
-
-// ControlPlaneProviderSpec defines the desired state of ControlPlaneProvider
-type ControlPlaneProviderSpec struct {
-    ProviderSpec `json:",inline"`
-}
-
-// ControlPlaneProviderStatus defines the observed state of ControlPlaneProvider
-type ControlPlaneProviderStatus struct {
-    ProviderStatus `json:",inline"`
+    Spec   ProviderSpec   `json:"spec,omitempty"`
+    Status ProviderStatus `json:"status,omitempty"`
 }

 // InfrastructureProvider is the Schema for the InfrastructureProviders API
@@ -215,18 +195,8 @@ type InfrastructureProvider struct {
     metav1.TypeMeta   `json:",inline"`
     metav1.ObjectMeta `json:"metadata,omitempty"`
-    Spec   InfrastructureProviderSpec   `json:"spec,omitempty"`
-    Status InfrastructureProviderStatus `json:"status,omitempty"`
-}
-
-// InfrastructureProviderSpec defines the desired state of InfrastructureProvider
-type InfrastructureProviderSpec struct {
-    ProviderSpec `json:",inline"`
-}
-
-// InfrastructureProviderStatus defines the observed state of InfrastructureProvider
-type InfrastructureProviderStatus struct {
-    ProviderStatus `json:",inline"`
+    Spec   ProviderSpec   `json:"spec,omitempty"`
+    Status ProviderStatus `json:"status,omitempty"`
 }
 ```

@@ -239,7 +209,7 @@ Infrastructure.
 type ProviderSpec struct {
     // Version indicates the provider version.
     // +optional
-    Version string `json:"version,omitempty"`
+    Version *string `json:"version,omitempty"`

     // Manager defines the properties that can be enabled on the controller manager for the provider.
     // +optional
@@ -247,17 +217,22 @@ type ProviderSpec struct {

     // Deployment defines the properties that can be enabled on the deployment for the provider.
     // +optional
-    Deployment DeploymentSpec `json:"deployment,omitempty"`
+    Deployment *DeploymentSpec `json:"deployment,omitempty"`

     // SecretName is the name of the Secret providing the configuration
     // variables for the current provider instance, like e.g. credentials.
     // Such configurations will be used when creating or upgrading provider components.
+    // The contents of the secret will be treated as immutable. If changes need
+    // to be made, a new object can be created and the name should be updated.
+    // The contents should be in the form of key:value.
     // +optional
     SecretName *string

     // FetchConfig determines how the operator will fetch the components and metadata for the provider.
     // If nil, the operator will try to fetch components according to default
-    // settings embedded in the operator and in clusterctl.
+    // embedded fetch configuration for the given kind and `ObjectMeta.Name`.
+    // For example, the infrastructure name `aws` will fetch artifacts from
+    // https://github.com/kubernetes-sigs/cluster-api-provider-aws/releases.
     // +optional
     FetchConfig *FetchConfiguration `json:"fetchConfig,omitempty"`

@@ -270,12 +245,13 @@ type ProviderSpec struct {
 // ManagerSpec defines the properties that can be enabled on the controller manager for the provider.
 type ManagerSpec struct {
     // ControllerManagerConfigurationSpec defines the desired state of GenericControllerManagerConfiguration.
-    ctrlruntime.ControllerManagerConfigurationSpec
+    ctrlruntime.ControllerManagerConfigurationSpec `json:",inline"`

     // ProfilerAddress defines the bind address to expose the pprof profiler (e.g. localhost:6060).
     // Default empty, meaning the profiler is disabled.
+    // Controller Manager flag is --profiler-address.
     // +optional
-    ProfilerAddress string `json:"profilerAddress,omitempty"`
+    ProfilerAddress *string `json:"profilerAddress,omitempty"`

     // MaxConcurrentReconciles is the maximum number of concurrent Reconciles
     // which can be run. Defaults to 10.
@@ -283,16 +259,18 @@ type ManagerSpec struct {
     MaxConcurrentReconciles *int `json:"maxConcurrentReconciles,omitempty"`

     // Verbosity set the logs verbosity. Defaults to 1.
+    // Controller Manager flag is --verbosity.
     // +optional
     Verbosity int `json:"verbosity,omitempty"`

     // Debug, if set, will override a set of fields with opinionated values for
     // a debugging session. (Verbosity=5, ProfilerAddress=localhost:6060)
     // +optional
-    Debug bool `json:"debug, omitempty"`
+    Debug bool `json:"debug,omitempty"`

     // FeatureGates define provider specific feature flags that will be passed
     // in as container args to the provider's controller manager.
+    // Controller Manager flag is --feature-gates.
     FeatureGates map[string]bool `json:"featureGates, omitempty"`
 }

@@ -314,11 +292,16 @@ type ContainerSpec struct {
     // Name of the container. Cannot be updated.
     Name string `json:"name"`

-    // Docker image name
+    // Container Image Name
     // +optional
-    Image string `json:"image,omitempty"`
+    Image *ImageMeta `json:"image,omitempty"`

     // Args represents extra provider specific flags that are not encoded as fields in this API.
+    // Explicit controller manager properties defined in the `Provider.ManagerSpec`
+    // will have higher precedence than those defined in `ContainerSpec.Args`.
+    // For example, `ManagerSpec.SyncPeriod` will be used instead of the
+    // container arg `--sync-period` if both are defined.
+    // The same holds for `ManagerSpec.FeatureGates` and `--feature-gates`.
     // +optional
     Args map[string]string `json:"args,omitempty"`

@@ -327,9 +310,27 @@ type ContainerSpec struct {
     Resources *corev1.ResourceRequirements `json:"resources,omitempty"`
 }

+// ImageMeta allows customizing the image used
+type ImageMeta struct {
+    // Repository sets the container registry to pull images from.
+    // +optional
+    Repository *string `json:"repository,omitempty"`
+
+    // Name allows specifying a name for the image.
+    // +optional
+    Name *string `json:"name,omitempty"`
+
+    // Tag allows specifying a tag for the image.
+    // +optional
+    Tag *string `json:"tag,omitempty"`
+}
+
 // FetchConfiguration determines the way to fetch the components and metadata for the provider.
 type FetchConfiguration struct {
-    // URL to be used for fetching provider’s components and metadata from a remote repository.
+    // URL to be used for fetching the provider’s components and metadata from a remote GitHub repository.
+    // For example, https://github.com/{owner}/{repository}/releases
+    // The version of the release will be `ProviderSpec.Version` if defined
+    // otherwise the `latest` version will be computed and used.
     // +optional
     URL *string `json:"url,omitempty"`

@@ -344,7 +345,8 @@ type FetchConfiguration struct {
 type ProviderStatus struct {
     // Contract will contain the core provider contract that the provider is
     // abiding by, like e.g. v1alpha3.
-    Contract string `json:"contract,omitempty"`
+    // +optional
+    Contract *string `json:"contract,omitempty"`

     // Conditions define the current service state of the cluster.
     // +optional
@@ -366,6 +368,8 @@ type ProviderStatus struct {
   as commonly used in the Kubernetes ecosystem; if this value is nil when a
   new provider is created, the operator will determine the version to use
   applying the same rules implemented in clusterctl (latest).
+  Once the latest version is calculated it will be set in
+  `ProviderSpec.Version`.
 - The content of `ProviderSpec.SecretName` will be treated as immutable. If
   changes need to be made, a new object can be created and the name should be
   updated.
@@ -605,9 +609,14 @@ As a final consideration, please note that
 - The operator executes installation for 1 provider at time, while `clusterctl
   init` manages installation of a group of providers with a single operation.
 - `clusterctl init` uses environment variables and a local configuration file,
-  while the operator uses ConfigMap; given that we want the users to preserve
+  while the operator uses a Secret; given that we want the users to preserve
   current behaviour in clusterctl, the init operation should be modified to
   transfer local configuration to the cluster.
+  As part of `clusterctl init`, it will obtain the list of variables required
+  by the provider components and read the corresponding values from the config
+  or environment variables and build the secret.
+  Any image overrides defined in the clusterctl config will also be applied to
+  the provider's components.

 ##### Upgrading a provider