Cluster overrides functionality in gen2 clusters (#364)
* Added overrides support for gen2 clusters

* refactored the code to make the override feature work on existing bc design

* Added a check if the cluster already exists before creating patch file

* Added documentation for clusteroverides design proposal

* Added override in user facing docs(gen2_Tutorial.md)

* Addressed all the reviews(Cleared confusion in patch revision and patch repo path)

* Edited the desc for patchreporevision

* Addressed the reviews(Code refactoring)

* Removed unnecessary code

* renamed overriden

* Handled the errors in delete function

* Refactored the code

Co-authored-by: Jayanth Reddy <[email protected]>
2 people authored and cruizen committed Dec 16, 2022
1 parent cb43a0e commit 02e676b
Showing 19 changed files with 862 additions and 20 deletions.
1 change: 1 addition & 0 deletions cmd/basecluster/preparegit.go
@@ -2,6 +2,7 @@ package basecluster

import (
"fmt"

"github.com/argoproj/argo-cd/v2/util/cli"
"github.com/arlonproj/arlon/pkg/argocd"
bcl "github.com/arlonproj/arlon/pkg/basecluster"
1 change: 1 addition & 0 deletions cmd/cluster/cluster.go
@@ -42,6 +42,7 @@ func checkForArgocd(c *cobra.Command, args []string) {
	_, err := appIf.List(context.Background(), &apppkg.ApplicationQuery{Selector: &query})
	if err != nil {
		fmt.Println("ArgoCD auth token has expired....Login to ArgoCD again")
		fmt.Println(err)
		os.Exit(1)
	}
}
31 changes: 30 additions & 1 deletion cmd/cluster/create.go
@@ -1,10 +1,12 @@
package cluster

import (
"context"
_ "embed"
"fmt"
"os"

argoapp "github.com/argoproj/argo-cd/v2/pkg/apiclient/application"
"github.com/argoproj/argo-cd/v2/pkg/apis/application/v1alpha1"
"github.com/argoproj/argo-cd/v2/util/cli"
arlonv1 "github.com/arlonproj/arlon/api/v1"
@@ -24,13 +26,17 @@ func createClusterCommand() *cobra.Command {
var argocdNs string
var arlonNs string
var arlonRepoUrl string
var patchRepoUrl string
var arlonRepoRevision string
var arlonRepoPath string
var patchRepoPath string
var clusterRepoUrl string
var repoAlias string
var clusterRepoRevision string
var patchRepoRevision string
var clusterRepoPath string
var clusterName string
var overridesDir string
var outputYaml bool
var profileName string
command := &cobra.Command{
@@ -55,6 +61,20 @@ func createClusterCommand() *cobra.Command {
if err != nil {
	return fmt.Errorf("failed to get repository credentials: %s", err)
}
overridden := false
if overridesDir != "" {
	_, err = appIf.Get(context.Background(),
		&argoapp.ApplicationQuery{Name: &clusterName})
	if err == nil {
		return fmt.Errorf("arlon cluster already exists")
	}
	err = cluster.CreatePatchDir(config, clusterName, patchRepoUrl, argocdNs,
		patchRepoPath, patchRepoRevision, clusterRepoRevision, overridesDir, clusterRepoUrl, clusterRepoPath)
	if err != nil {
		return fmt.Errorf("failed to create patch files directory: %s", err)
	}
	overridden = true
}
createInArgoCd := !outputYaml
baseClusterName, err := bcl.ValidateGitDir(creds,
	clusterRepoUrl, clusterRepoRevision, clusterRepoPath)
@@ -81,9 +101,14 @@ func createClusterCommand() *cobra.Command {
	return fmt.Errorf("failed to create arlon app: %s", err)
}
// Create "cluster app" for cluster
if overridden {
	clusterRepoUrl = patchRepoUrl
	clusterRepoPath = patchRepoPath
	clusterRepoRevision = patchRepoRevision
}
clusterApp, err := cluster.CreateClusterApp(appIf, argocdNs,
	clusterName, baseClusterName, clusterRepoUrl, clusterRepoRevision,
-	clusterRepoPath, createInArgoCd)
+	clusterRepoPath, createInArgoCd, overridden)
if err != nil {
	return fmt.Errorf("failed to create cluster app: %s", err)
}
@@ -132,13 +157,17 @@ func createClusterCommand() *cobra.Command {
command.Flags().StringVar(&argocdNs, "argocd-ns", "argocd", "the argocd namespace")
command.Flags().StringVar(&arlonNs, "arlon-ns", "arlon", "the arlon namespace")
command.Flags().StringVar(&arlonRepoUrl, "arlon-repo-url", "https://github.com/arlonproj/arlon.git", "the git repository url for arlon template")
command.Flags().StringVar(&patchRepoUrl, "patch-repo-url", "", "the git repository url for base cluster template")
command.Flags().StringVar(&arlonRepoRevision, "arlon-repo-revision", "v0.9.0", "the git revision for arlon template")
command.Flags().StringVar(&arlonRepoPath, "arlon-repo-path", "pkg/cluster/manifests", "the git repository path for arlon template")
command.Flags().StringVar(&patchRepoPath, "patch-repo-path", "", "the git repository path for base cluster template")
command.Flags().StringVar(&clusterRepoUrl, "repo-url", "", "the git repository url for cluster template")
command.Flags().StringVar(&repoAlias, "repo-alias", gitrepo.RepoDefaultCtx, "git repository alias to use")
command.Flags().StringVar(&clusterRepoRevision, "repo-revision", "main", "the git revision for cluster template")
command.Flags().StringVar(&patchRepoRevision, "patch-repo-revision", "main", "the git revision for patch files")
command.Flags().StringVar(&clusterRepoPath, "repo-path", "", "the git repository path for cluster template")
command.Flags().StringVar(&clusterName, "cluster-name", "", "the cluster name")
command.Flags().StringVar(&overridesDir, "overrides-dir", "", "path to the corresponding patch file to the cluster")
command.Flags().BoolVar(&outputYaml, "output-yaml", false, "output root applications YAML instead of deploying to ArgoCD")
command.Flags().StringVar(&profileName, "profile", "", "profile name (if specified, must refer to dynamic profile)")
command.MarkFlagRequired("cluster-name")
33 changes: 33 additions & 0 deletions docs/gen2_Tutorial.md
@@ -370,6 +370,34 @@ arlon cluster create --cluster-name <clusterName> --repo-path <pathToDirectory>
arlon cluster create --cluster-name <clusterName> --repo-alias prod --repo-path <pathToDirectory> [--output-yaml] [--profile <profileName>] [--repo-revision <repoRevision>]
```


## gen2 cluster creation with overrides

Cluster overrides let you construct different clusters from the same base manifest by applying patches to it.
The feature is built on top of the existing base cluster design, so a user can still create a cluster from the base manifest using the same command as in the previous step (gen2 cluster creation).
To create a cluster with overrides to the base manifest, place the corresponding patch files in a dedicated local folder that contains nothing but patch files. For example, the following patch file overrides the replica count to 2:

```yaml
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: .*
spec:
  replicas: 2
```

Refer to this [document](https://blog.scottlowe.org/2019/11/12/using-kustomize-with-cluster-api-manifests/) to learn more about patch files.
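
The requirement that the overrides folder contain nothing but patch files can be checked before anything is pushed. The sketch below is a hypothetical helper, not Arlon's actual validation: the function name `nonPatchFiles` is invented here, and it assumes a patch file is any entry with a `.yaml` or `.yml` extension. It takes the directory listing as a slice of names so that it stays independent of the filesystem.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// nonPatchFiles returns the entries that do not look like YAML patch files.
// Assumption (not from Arlon): a patch file is any name ending in .yaml or .yml.
func nonPatchFiles(names []string) []string {
	var bad []string
	for _, n := range names {
		ext := strings.ToLower(filepath.Ext(n))
		if ext != ".yaml" && ext != ".yml" {
			bad = append(bad, n)
		}
	}
	return bad
}

func main() {
	// Hypothetical listing of an overrides folder.
	listing := []string{"replicas-patch.yaml", "README.md"}
	if bad := nonPatchFiles(listing); len(bad) > 0 {
		fmt.Printf("overrides dir contains non-patch files: %v\n", bad)
	}
}
```

A check like this would most naturally run before the patch folder is created in the patch repo, so that a stray file fails fast instead of ending up in git.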

The command to create a gen2 workload cluster from the base cluster manifest with overrides is:

```shell
arlon cluster create --cluster-name <cluster-name> --repo-url <repo url where base manifest is present> --repo-path <repo path to the base manifest> --overrides-dir <path to the patch files folder> --patch-repo-url <repo url where patch files should be stored> --patch-repo-path <repo path to store the patch files>
```

Running the above command creates a folder named after the cluster under the patch repo path of the patch repo url. The folder contains the patch files, kustomization.yaml and configurations.yaml, which are used to create the cluster app.
Note that the patch repo url can be the same as or different from the base manifest repo url, depending on the user's requirements; a separate repository can be used for storing a cluster's patch files.

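
The resulting layout in the patch repo can be sketched as a small path computation. This is an illustration only: `patchDirFiles` is a hypothetical name, and the code assumes that local patch files keep their base names when copied into the folder named after the cluster. Only the folder name and the two generated file names come from the description above.

```go
package main

import (
	"fmt"
	"path"
)

// patchDirFiles lists the repo-relative files expected in a cluster's patch folder:
// <patchRepoPath>/<clusterName>/{kustomization.yaml, configurations.yaml, <patch files>}.
// Assumption: each local patch file is stored under its base name.
func patchDirFiles(patchRepoPath, clusterName string, localPatchFiles []string) []string {
	dir := path.Join(patchRepoPath, clusterName)
	files := []string{
		path.Join(dir, "kustomization.yaml"),
		path.Join(dir, "configurations.yaml"),
	}
	for _, p := range localPatchFiles {
		files = append(files, path.Join(dir, path.Base(p)))
	}
	return files
}

func main() {
	for _, f := range patchDirFiles("overrides", "cluster1", []string{"/tmp/patches/replicas.yaml"}) {
		fmt.Println(f)
	}
}
```
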
## gen2 cluster update
To update the profiles of a gen2 workload cluster:
@@ -399,6 +427,8 @@ Arlon creates between 2 and 3 ArgoCD application resources to compose a gen2 clu
an optional profile is specified at cluster creation time). When you destroy a gen2 cluster, Arlon will find all related ArgoCD applications
and clean them up.

If the cluster being deleted was created using patch files, the controller first cleans the git repo where the cluster's patch files are stored, and then finds and cleans up all the related ArgoCD applications.

## Known issues and limitations

Gen2 clusters are powerful because the base cluster can be arbitrarily complex and feature rich. Since they are fairly
@@ -408,6 +438,8 @@ new and still evolving, gen2 clusters have several known limitations relative to
which is an exact clone of the base cluster except for the names of its resources and their namespace.
The work-around is to make a copy of the base cluster directory, push the new directory, make
the desired changes, commit & push the changes, and register the directory as a new base cluster.
* Clusters created directly from the base manifest are completely declarative, whereas clusters created using overrides are not.
* If a user passes a patch repo url that differs from the repo where the base manifest is present, ArgoCD won't be able to detect changes in the base manifest repository, but it will detect all changes in the patch repo for the cluster.
* If you modify and commit a change to one or more properties of the base cluster that the underlying Cluster API provider deems as "immutable", new
workload clusters created from the base cluster will have the modified propert(ies), but ArgoCD will flag existing clusters as OutOfSync, since
the provider will continually reject attempts to apply the new property values. The existing clusters continue to function, despite appearing unhealthy
@@ -422,6 +454,7 @@ Examples of immutable properties:

* Most fields of AWSMachineTemplate (instance type, labels, etc...)


## For more information

For more details on gen2 clusters, refer to the [design document](baseclusters.md).
115 changes: 115 additions & 0 deletions docs/gen2_overrides_proposal_1.md
@@ -0,0 +1,115 @@
# Gen2 Cluster Overrides - Proposal 1
This is a design proposal for gen2 cluster overrides. With the current gen2 design, we can deploy multiple clusters with the same specifications from one base cluster. But what if we want to deploy a cluster with a different sshkeyname from the same manifest? To allow deploying clusters with different specifications from the same base cluster, we are introducing the concept of cluster overrides: deploying clusters with different specs from the same manifest by overriding only the specs we want to change.

We have 2 different approaches to overriding fields in a cluster:

1. Converting the git repo where the base cluster is present to helm charts
2. Overriding specific fields using kustomize

In the first approach, we let the user upload the base manifest to the repo and deploy the cluster from it, and then convert it into a helm chart so that fields in the manifest can be overridden. The downside of this approach is that there is no specific template for the base manifest: a user can use any form of template, in which case we will not be able to convert the manifest to a helm chart.

So we continue with the second approach. In the kustomize approach, we create an overlays folder parallel to the base cluster folder, containing one folder per cluster, named after the cluster. Each cluster-named folder contains the override files specific to that cluster. An example of the folder structure is shown below:

### A directory layout

    .
    ├── Basecluster          # Basecluster folder (contains base manifest)
    └── Overlays             # Contains folders specific to each cluster created from base manifest
        ├── Cluster1         # Contains overrides corresponding to cluster1
        └── Cluster2         # Contains overrides corresponding to cluster2


### Let's consider an example case to understand the kustomize approach

1. Let's consider three different clusters on AWS. The management cluster already exists.

2. Two of these clusters will run in the AWS “us-west-2” region, while the third will run in the “us-east-2” region.

3. One of the two “us-west-2” clusters will use larger instance types to accommodate more resource-intensive workloads.

Now we need to get our git repo ready: push the base manifest into a folder named base and, for overriding, create a folder for each cluster, named after the cluster, inside an overlays folder that sits parallel to the base folder.

## Setting up the Directory Structure

To accommodate this use case, we will need to use a directory structure that supports the use of kustomize overlays. Therefore, the directory structure would look like this for the project:

    (parent)
    |- base
    |- overlays
        |- usw2-cluster1
        |- usw2-cluster2
        |- use2-cluster1

The base directory will store the base manifest for the final Cluster API manifests, as well as a kustomization.yaml file that identifies these Cluster API manifests as resources for kustomize to use.

The contents of the kustomization.yaml file in the base folder are as follows:

```
resources:
- basemanifest.yaml
```

This kustomization.yaml states that the resources for the cluster are in the basemanifest.yaml file.

## What do the cluster-named folders contain?

The interesting parts begin with the overlays. Each cluster has its own directory under the overlays directory, where you provide the cluster-specific patches that kustomize will use to generate the YAML for that cluster.

Let's begin with the "usw2-cluster1" overlay. To understand what is required, you first need to know which modifications must be made to the base configuration to produce the appropriate configuration for this specific cluster.

We can use two methods for patches:
1. JSON patches
2. YAML patches

With JSON patches, we have to write a JSON file that replaces fields in the manifests. We would need to write a separate file for each replacement, which quickly becomes tedious to maintain.

Example of a JSON patch:

    [
      { "op": "replace",
        "path": "/metadata/name",
        "value": "usw2-cluster-1" },
      { "op": "replace",
        "path": "/spec/infrastructureRef/name",
        "value": "usw2-cluster-1" }
    ]
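
To make the mechanics of the JSON patch above concrete, here is a minimal sketch that applies `replace` operations to a document decoded into nested maps. It is a deliberately reduced subset of JSON Patch, not kustomize's implementation: `applyReplace` is a hypothetical helper that handles object keys only (no array indices, no `~0`/`~1` escapes) and rejects every operation other than `replace`. The sample document is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// op is one JSON Patch operation; only "replace" is handled in this sketch.
type op struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// applyReplace applies replace operations in place to a document held as nested maps.
// Simplifications: object keys only, no array indices, no "~0"/"~1" escape handling.
func applyReplace(doc map[string]interface{}, patch []byte) error {
	var ops []op
	if err := json.Unmarshal(patch, &ops); err != nil {
		return err
	}
	for _, o := range ops {
		if o.Op != "replace" {
			return fmt.Errorf("unsupported op %q", o.Op)
		}
		keys := strings.Split(strings.TrimPrefix(o.Path, "/"), "/")
		cur := doc
		// Walk down to the parent object of the target field.
		for _, k := range keys[:len(keys)-1] {
			next, ok := cur[k].(map[string]interface{})
			if !ok {
				return fmt.Errorf("path %q not found", o.Path)
			}
			cur = next
		}
		last := keys[len(keys)-1]
		// "replace" requires the target to exist already.
		if _, ok := cur[last]; !ok {
			return fmt.Errorf("path %q not found", o.Path)
		}
		cur[last] = o.Value
	}
	return nil
}

func main() {
	// Made-up stand-in for a Cluster manifest.
	doc := map[string]interface{}{
		"metadata": map[string]interface{}{"name": "base"},
		"spec": map[string]interface{}{
			"infrastructureRef": map[string]interface{}{"name": "base"},
		},
	}
	patch := []byte(`[
	  {"op": "replace", "path": "/metadata/name", "value": "usw2-cluster-1"},
	  {"op": "replace", "path": "/spec/infrastructureRef/name", "value": "usw2-cluster-1"}
	]`)
	if err := applyReplace(doc, patch); err != nil {
		panic(err)
	}
	out, _ := json.Marshal(doc)
	fmt.Println(string(out))
}
```

Each operation names exactly one path, which is why overriding many fields this way means listing every one of them explicitly.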

So, let's discuss the YAML approach, which makes the overrides much easier to handle:

    ---
    apiVersion: cluster.x-k8s.io/v1alpha2
    kind: Machine
    metadata:
      name: .*
      labels:
        cluster.x-k8s.io/cluster-name: "usw2-cluster1"

This adds a cluster-name label to the manifest. With a YAML patch we can both add and replace fields in a single file, which is more convenient than writing a separate JSON patch operation for every field.
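
The add-or-replace behaviour of a YAML patch can be approximated by a recursive map merge. The sketch below is a simplification and not kustomize's strategic merge algorithm: `mergePatch` is a hypothetical helper that merges nested maps key by key and lets any other patch value replace the base value. It ignores lists, merge directives, and how the patch selects its target resources. The Machine name is made up.

```go
package main

import "fmt"

// mergePatch overlays patch onto base in place: nested maps are merged key by key,
// and any non-map value in patch replaces (or adds) the value in base.
func mergePatch(base, patch map[string]interface{}) {
	for k, pv := range patch {
		pm, patchIsMap := pv.(map[string]interface{})
		bm, baseIsMap := base[k].(map[string]interface{})
		if patchIsMap && baseIsMap {
			mergePatch(bm, pm)
			continue
		}
		base[k] = pv
	}
}

func main() {
	// Made-up stand-in for a Machine from the base manifest.
	machine := map[string]interface{}{
		"kind":     "Machine",
		"metadata": map[string]interface{}{"name": "controlplane-0"},
	}
	// The patch adds a label that the base manifest does not have.
	patch := map[string]interface{}{
		"metadata": map[string]interface{}{
			"labels": map[string]interface{}{
				"cluster.x-k8s.io/cluster-name": "usw2-cluster1",
			},
		},
	}
	mergePatch(machine, patch)
	fmt.Println(machine)
}
```

Fields that the patch does not mention, such as `metadata.name` here, are left untouched.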

For this last example, you again need two things in the cluster's overlay folder: the patch file itself and a kustomization.yaml that references it.

This kustomization.yaml file points to both the base cluster manifest and the patch file, so it works as the link between the two.
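
As a rough illustration of such a linking file, the sketch below renders a minimal overlay kustomization.yaml from a base reference and a list of patch files. The `resources` and `patches` fields are standard kustomize fields, but everything else is an assumption: `renderKustomization` is a hypothetical helper, the `../../base` reference simply follows the directory layout shown earlier, and how each patch selects its target resources (for example through a `target` selector) is left out. This is not the file Arlon generates.

```go
package main

import (
	"fmt"
	"strings"
)

// renderKustomization returns a minimal overlay kustomization.yaml that
// references the base directory and lists the given patch files.
func renderKustomization(baseRef string, patchFiles []string) string {
	var b strings.Builder
	b.WriteString("apiVersion: kustomize.config.k8s.io/v1beta1\n")
	b.WriteString("kind: Kustomization\n")
	b.WriteString("resources:\n")
	fmt.Fprintf(&b, "- %s\n", baseRef)
	if len(patchFiles) > 0 {
		b.WriteString("patches:\n")
		for _, p := range patchFiles {
			fmt.Fprintf(&b, "- path: %s\n", p)
		}
	}
	return b.String()
}

func main() {
	// Hypothetical overlay at overlays/usw2-cluster1, two levels below the parent folder.
	fmt.Print(renderKustomization("../../base", []string{"patch.yaml"}))
}
```
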

With this approach, the present base cluster design needs a redesign, since we would skip the name suffix method we were using before to create a manifest for each cluster with its own names.

In this approach, instead of the configurations.yaml (needed for the name suffix), we will have a folder for each cluster, with the ArgoCD path pointing to the cluster folder. This lets us skip the name suffix method we were using before.

Using this approach, we can override any field in the manifest, without limitation, before creating a cluster.

### UX (User experience):

To give the user the freedom to override any part of the base manifest, we ask the user to point to a YAML file in which the fields have been overridden.

This is also easier for the user, because they generate the manifest file anyway; they only need to make changes to the already generated file and point to it.

We should also make sure that the base manifest in git and the overridden manifest file are comparable. Example of a command:

```arlon cluster create <cluster name> --repo-url <repo url> --repo-path <repo path> --overrides <path to overridden manifest file>```

## Limitations:

- Manifests (base and overlays) for the base cluster as well as the workload clusters reside in the same repository. This means those who create a workload cluster need write access to the base cluster repository, which might not be the case in enterprises.
- If the manifests (base and overlays) are instead kept in different repositories, they need a link to each other. As of now, if we update the base cluster while the manifest is in one repo and the patches are in another, ArgoCD will not pick up the updated changes in the base manifest.

- The main goal of gen2 clusters was to remove the dependency on git to store metadata and make the clusters completely declarative, unlike gen1 clusters. But here we re-introduce a dependency on the Arlon API (library) and git (state in git with a directory structure).
- Although we can make this approach declarative by introducing another controller (CRD), this would increase the overall complexity of the issue and of Arlon.

- Using this approach, we might not be able to prefix a name in the base manifest. This is an issue because some resources generate external resources, such as an AWS load balancer, and we need to avoid naming conflicts; hence a name prefix (not sufficient on its own) plus name references, as in gen2, are required.