Commit 53ff80f

Move bottom-up design from google doc to markdown

Alexander Matyushentsev authored and alexmt committed Nov 11, 2019
1 parent 921c307 commit 53ff80f
Showing 2 changed files with 207 additions and 3 deletions.
208 changes: 206 additions & 2 deletions specs/design-bottom-up.md
@@ -1,3 +1,207 @@
# GitOps Engine Design - Bottom Up [On Hold - see [Top Down](./design-top-down.md)]

Original Google Doc: https://docs.google.com/document/d/10OvgB9nMmleYPxPlLuXq6RNFodJSKZN7GUSNTyfwEhw/edit#
## Summary

The Bottom Up approach assumes that teams identify similar components in Argo CD and Flux and merge them one by one. The following five components have been identified so far:

* Access to Git repositories
* Kubernetes resource cache
* Manifest Generation
* Resource reconciliation
* Sync Planning

The rest of the document describes separate proposals for how exactly each component could be merged.

## Kubernetes Resource Cache

Both Argo CD and Flux have to load target cluster resources. This is required to enable the following main use cases:
- Collect resources that are no longer in Git and delete them or warn the user.
- Present information about cluster state to the user: the Argo CD UI, the fluxctl `list-images` and `list-workloads` commands.
- Compare the state of the cluster with the configuration in Git.

The projects use different approaches to collect cluster state information. Argo CD leverages the Kubernetes watch APIs to maintain
a lightweight cluster state cache. Flux fetches the required resources whenever information is needed.

The problem is that Kubernetes does not provide an SQL-like API for efficiently finding the required resources, so in
some cases Flux has to load the whole cluster/namespace state into memory and scan the in-memory resource list. This
approach is time- and memory-consuming and also puts pressure on the Kubernetes API server.

### Goals

Extract the Argo CD caching logic (argoproj/argo-cd/controller/cache) into a reusable component that maintains a lightweight cluster state cache.

### Non-Goals
Support multi-cluster caching. The ability to maintain caches for multiple clusters is implemented in Argo CD, but it is tightly coupled
to how Argo CD stores cluster credentials and adds too much complexity.

### Proposal

The cluster cache component encapsulates interaction with the Kubernetes APIs and allows quick, thread-safe inspection of
Kubernetes resources. The component is responsible for the following tasks:

- Identifies the resource APIs supported by the target cluster and provides API metadata (e.g. whether an API is namespaced or cluster-scoped).
- Notifies about changes in the resource APIs supported by the target cluster (e.g. added CRDs, removed CRDs ...).
- Loads the initial state and watches for changes in every supported resource API.
- Handles changes in the available APIs: starts/stops watches; removes obsolete APIs from the cache.

The component does not cache whole resource manifests because that would require too much memory. Instead, it stores only
resource identifiers and relationships between resources. Whole resource manifests or any other resource metadata should
be cached by the component user using the event handlers.

The component watches only the preferred version of each resource API, so the resource objects passed to the event handlers have the
structure of the preferred version.

The component is responsible for handling the following Kubernetes API edge cases:

* Resources of the deprecated extensions API group have duplicates in the apps, networking.k8s.io, and policy groups (see the sketch below).
* A ReplicaSet from the apps group might reference a Deployment from the extensions group as a parent.
* The relationship between Service and Endpoint is not explicit: [kubernetes/#28483](https://github.com/kubernetes/kubernetes/issues/28483)
* The relationship between ServiceAccount and Token is not explicit.
* Resources of the deprecated OpenShift groups authorization.openshift.io and project.openshift.io create
duplicates in the rbac.authorization.k8s.io and core groups.
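
For illustration, below is a minimal sketch of how deduplication of the deprecated extensions group could look. The `canonicalGroup` map and `normalize` function are hypothetical names, not part of the proposed API, and the mapping covers only a few kinds:

```golang
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/runtime/schema"
)

// canonicalGroup maps kinds from deprecated API groups to the group the cache
// treats as canonical, so a resource observed under two groups is stored once.
var canonicalGroup = map[schema.GroupKind]schema.GroupKind{
	{Group: "extensions", Kind: "Deployment"}:    {Group: "apps", Kind: "Deployment"},
	{Group: "extensions", Kind: "ReplicaSet"}:    {Group: "apps", Kind: "ReplicaSet"},
	{Group: "extensions", Kind: "DaemonSet"}:     {Group: "apps", Kind: "DaemonSet"},
	{Group: "extensions", Kind: "Ingress"}:       {Group: "networking.k8s.io", Kind: "Ingress"},
	{Group: "extensions", Kind: "NetworkPolicy"}: {Group: "networking.k8s.io", Kind: "NetworkPolicy"},
}

// normalize rewrites a GroupKind from a deprecated group to its canonical equivalent.
func normalize(gk schema.GroupKind) schema.GroupKind {
	if canonical, ok := canonicalGroup[gk]; ok {
		return canonical
	}
	return gk
}

func main() {
	fmt.Println(normalize(schema.GroupKind{Group: "extensions", Kind: "Deployment"})) // Deployment.apps
}
```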

#### Top-Level Component APIs

The listing below shows the top-level API exposed by the cluster cache component:

```golang
import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// ResourceID is a unique resource identifier.
type ResourceID struct {
	// Namespace is empty for cluster-scoped resources.
	Namespace string
	Name      string
	Group     string
	Kind      string
}

type ListOptions struct {
	// A selector to restrict the list of returned objects by their labels.
	Selector metav1.LabelSelector
	// Restricts the list of returned objects to the given namespaces. If not empty, only namespaced resources are returned.
	Namespaces []string
	// If set to true, only namespaced objects are returned.
	NamespacedOnly bool
	// If set to true, only cluster-level objects are returned.
	ClusterLevelOnly bool
	// If set to true, only objects without owners are returned.
	TopLevelOnly bool
}

// Cache provides a set of methods to access the cached cluster's state. All methods are thread-safe.
type Cache interface {
	// List returns a list of resource ids which match the specified list options.
	List(options ListOptions) ([]ResourceID, error)
	// GetResourceAPIMetadata returns API metadata (metav1.APIResource includes information about supported verbs,
	// whether the API is namespaced or cluster-level, etc.).
	GetResourceAPIMetadata(gk schema.GroupKind) (metav1.APIResource, error)
	// IterateChildTree builds a DAG using parent-child relationships based on the ownerReferences resource field and
	// traverses resources in topological order starting from the specified root ids.
	IterateChildTree(roots []ResourceID, action func(key ResourceID) error) error
}
```

The Cache interface methods serve the following use cases:

List:
* Returns resources managed by Argo CD/Flux, typically top-level resources marked with a special label.
* Returns orphaned namespace resources. This enables the Argo CD feature of warning the user if a namespace has any unmanaged resources.

GetResourceAPIMetadata:
* Answers whether a resource is namespace-scoped or cluster-scoped. This is useful in two cases:
  * to gracefully handle user errors when a cluster-level resource in Git has a namespace set (this is incorrect, but kubectl gracefully handles such errors);
  * to set a fallback namespace on namespaced resources that have no namespace.
* Helps to create a dynamic Kubernetes client and specify the resource/kind.

IterateChildTree:
* Allows Argo CD to get information about the resource tree, which is used to visualize cluster state in the UI (see the usage sketch below).
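
As a usage sketch tying these together: the `printResourceTree` function and the `app.kubernetes.io/instance` label below are illustrative assumptions, and `cache` is assumed to be a `Cache` obtained from `NewClusterCache`:

```golang
import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// printResourceTree lists the top-level resources managed by the GitOps tool,
// identified here by an example label, then walks each resource tree in
// topological order, similar to what the Argo CD UI visualization needs.
func printResourceTree(cache Cache) error {
	managed, err := cache.List(ListOptions{
		Selector: metav1.LabelSelector{
			MatchLabels: map[string]string{"app.kubernetes.io/instance": "my-app"},
		},
		TopLevelOnly: true,
	})
	if err != nil {
		return err
	}
	return cache.IterateChildTree(managed, func(key ResourceID) error {
		fmt.Printf("%s/%s (%s, %s)\n", key.Namespace, key.Name, key.Group, key.Kind)
		return nil
	})
}
```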

#### Customizations
The listing below contains a set of data structures that allow customizing the caching behavior.

```golang
import (
	"time"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/client-go/rest"
)

type ResourceFilter struct {
	APIGroups []string
	Kinds     []string
}

type ResourcesAPIFilter struct {
	// ResourceExclusions holds the API groups/kinds which should be excluded.
	ResourceExclusions []ResourceFilter
	// ResourceInclusions holds the only API groups/kinds that should be included. If empty, everything is included.
	ResourceInclusions []ResourceFilter
}

// ResourceEventHandlers is a set of handlers which are executed when resources are created/updated/deleted.
type ResourceEventHandlers struct {
	OnCreated func(obj *unstructured.Unstructured)
	OnUpdated func(updated *unstructured.Unstructured)
	OnDeleted func(key ResourceID)
}

// Settings contains a list of parameters which customize the caching behavior.
type Settings struct {
	Filter        ResourcesAPIFilter
	EventHandlers ResourceEventHandlers
	Namespaces    []string
	ResyncPeriod  time.Duration
}

func NewClusterCache(config *rest.Config, settings Settings) (Cache, error)
```

ResourceEventHandlers:
A set of callbacks that are executed when the cache changes. Useful for collecting and caching additional information about resources, for example:
* Cache the whole manifest of a managed resource to use it later for reconciliation.
* Cache resource metadata such as the list of images or the health status.

ResourcesAPIFilter:
Enables limiting the set of monitored resource APIs.

Namespaces:
Allows switching the component into namespace-only mode. If one or more namespaces are specified, the component ignores cluster-level resources and watches only resources in the specified namespaces.

NOTE: The Kubernetes API allows listing/watching resources either in a single namespace or in the whole cluster, so if more than one namespace is specified the component has to start a separate set of watches for each namespace.

ResyncPeriod:
Specifies the interval after which the cluster cache should be automatically invalidated. A construction sketch covering these settings follows below.
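
Putting the settings together, a minimal construction sketch; the kubeconfig path, the excluded API group, the namespaces, and the `newCache` function name are placeholder values for illustration:

```golang
import (
	"time"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/client-go/tools/clientcmd"
)

func newCache() (Cache, error) {
	// Load cluster credentials; Argo CD/Flux would supply these from their own config.
	config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
	if err != nil {
		return nil, err
	}
	settings := Settings{
		// Exclude a noisy API group from watching.
		Filter: ResourcesAPIFilter{
			ResourceExclusions: []ResourceFilter{{APIGroups: []string{"events.k8s.io"}}},
		},
		// Collect additional per-resource metadata as the cache changes.
		EventHandlers: ResourceEventHandlers{
			OnUpdated: func(obj *unstructured.Unstructured) {
				// e.g. extract container images or compute health status and store them.
			},
		},
		// Namespace-only mode: a separate set of watches is started per namespace.
		Namespaces:   []string{"default", "team-a"},
		ResyncPeriod: 10 * time.Minute,
	}
	return NewClusterCache(config, settings)
}
```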

#### Health Assessment (optional)

The health assessment subcomponent provides the ability to get health information about a given resource. The health assessment package is not directly related to caching, but it leverages functionality provided by the caching component and is therefore proposed for inclusion in it.

The health assessment logic is available in the package argoproj/argo-cd/util/health and includes the following features:
* Support for several Kubernetes built-in resources such as Pod, ReplicaSet, Ingress, and a few others.
* A framework that allows customizing the health assessment logic using Lua scripts. The framework includes testing infrastructure.

The health information is represented by the following data structure:

```golang
type HealthStatus struct {
	Status  HealthStatusCode
	Message string
}
```

The health status might take one of the following values:
* Healthy/Degraded - self-explanatory.
* Progressing - the resource is not healthy yet, but there is still a chance it will become Healthy.
* Unknown - the health assessment failed; the error message is in the `Message` field.
* Suspended - the resource is neither progressing nor degraded. For example, a Deployment is considered suspended if the `spec.paused` field is set to true.
* Missing - the expected resource is missing.

The library API is represented by a single method:

```golang
type HealthAssessor interface {
	GetResourceHealth(obj *unstructured.Unstructured) (*HealthStatus, error)
}
```
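
A sketch of how health assessment could be combined with the cache event handlers; the `healthTracker` type is a hypothetical consumer, not part of the proposal:

```golang
import (
	"sync"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
)

// healthTracker caches the last known health status per resource,
// refreshed from the cluster cache event handlers.
type healthTracker struct {
	mu       sync.Mutex
	assessor HealthAssessor
	statuses map[ResourceID]HealthStatus
}

// onUpdated is intended to be registered as the ResourceEventHandlers.OnUpdated callback.
func (t *healthTracker) onUpdated(obj *unstructured.Unstructured) {
	health, err := t.assessor.GetResourceHealth(obj)
	if err != nil || health == nil {
		return
	}
	id := ResourceID{
		Namespace: obj.GetNamespace(),
		Name:      obj.GetName(),
		Group:     obj.GroupVersionKind().Group,
		Kind:      obj.GetKind(),
	}
	t.mu.Lock()
	t.statuses[id] = *health
	t.mu.Unlock()
}
```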

#### Additional Considerations

The live state cache could be useful for the docker-registry monitoring feature: the `OnUpdated` resource event handler can be used to maintain image pull secrets. However, if the docker registry part is extracted into a separate binary, we would have to run a separate instance of the cluster cache, which means 2x more Kubernetes API calls. A possible workaround would be to optionally point the Docker Registry Monitor at Flux.

## Reconciliation [WIP]
## Access to Git repositories [WIP]
## Manifest Generation [WIP]
## Resource reconciliation [WIP]
## Sync Planning [WIP]
2 changes: 1 addition & 1 deletion specs/design-top-down.md
@@ -1,4 +1,4 @@
# GitOps Engine Design - Top Down [WIP]

## Summary

