This repository has been archived by the owner on Jun 8, 2022. It is now read-only.

add trait workload interaction design #23

Merged 1 commit on May 18, 2020
204 changes: 204 additions & 0 deletions design/one-pager-trait-workload-interaction-mechanism.md
@@ -0,0 +1,204 @@
# Traits and workloads interaction mechanism in OAM

* Owner: Ryan Zhang (@ryanzhang-oss)
* Reviewers: Crossplane Maintainers
* Status: Draft

## Terminology

* **CRD (Custom Resource Definition)** : A standard Kubernetes Custom Resource Definition
* **CR (Custom Resource)** : An instance of a Kubernetes type that was defined using a CRD
* **GVK (Group Version Kind)** : The API Group, Version, and Kind for a type of Kubernetes
resource (including CRDs)
* **Workload child resources** : The Kubernetes resources generated by a workload controller. They
should all have a controller reference pointing to the parent workload instance.

## Background
Traits and workloads are the two major types of resources in OAM. A trait usually affects how a
Kubernetes resource operates, either directly (by changing its spec) or indirectly (by adding an
ingress or a sidecar). However, the current OAM implementation does not contain a generic mechanism
for traits to locate the corresponding resource to modify.

We will use the following hypothetical OAM application as the baseline to illustrate the problem
and our solution.

```yaml
apiVersion: core.oam.dev/v1alpha2
kind: WorkloadDefinition
metadata:
  name: mydbs.standard.oam.dev
spec:
  definitionRef:
    name: mydbs.standard.oam.dev
---
apiVersion: core.oam.dev/v1alpha2
kind: TraitDefinition
metadata:
  name: manualscalertraits.core.oam.dev
spec:
  definitionRef:
    name: manualscalertraits.core.oam.dev
---
apiVersion: core.oam.dev/v1alpha2
kind: Component
metadata:
  name: example-db
spec:
  workload:
    apiVersion: standard.oam.dev/v1alpha2
    kind: Mydb
    metadata:
      name: mydb-example
    spec:
      containers:
        - name: mysql
          image: mysql:latest
---
apiVersion: core.oam.dev/v1alpha2
kind: ApplicationConfiguration
metadata:
  name: example-appconfig
spec:
  components:
    - componentName: example-db
      traits:
        - trait:
            apiVersion: core.oam.dev/v1alpha2
            kind: ManualScalerTrait
            metadata:
              name: example-appconfig-trait
            spec:
              replicaCount: 3
```

The problem is twofold:
1. A trait controller needs a way to find the workload that it is applied to.
   - In the example, the manual scaler trait needs to know that it is supposed to scale the
     `example-db` workload. However, we want to keep the applicationConfiguration controller
     agnostic to the schema of any `trait` or `workload` it generates, so that it stays extensible.
     Thus, the applicationConfiguration controller needs to emit a `ManualScalerTrait` CR that
     contains a reference to the `example-db` workload without knowing the trait's specific schema.

2. A trait controller needs to know the exact resources it should modify. Note that these
   resources are most likely not the workload itself.
   - Using the same example, just knowing the `example-db` workload is not enough for the
     `ManualScalerTrait` to work. The trait controller does not work with the `example-db` workload
     directly. It needs to find the actual Kubernetes resources that the `example-db`
     workload generates, and only then can it modify the `replica` field in their spec.

## Goals
In order to maximize the extensibility of our OAM implementation, our solution needs to meet the
following two design objectives.
1. **Extensible trait system**: We want to allow a `trait` to apply to any eligible `workload`
   instead of just a list of specific ones. This means that we want to empower a trait developer to
   write the controller code once and have it work for any new `workload` that this `trait` can
   apply to in the future.
   - Using the example again, the `ManualScalerTrait` should work with any workload that
     generates a Kubernetes resource that has a `replica` field in its spec, even if the
     workload does not exist when the `ManualScalerTrait` is implemented.
2. **Adopting existing CRDs**: The mechanism cannot put any limit on the `trait` or `workload`
   CRDs. This means that we cannot assume any pre-defined CRD fields in any `trait` or `workload`
   beyond Kubernetes conventions (i.e. `spec` or `status`).
   - For example, the following `EtcdBackup` resource can be used as a `trait` in an OAM
     application to apply to an `EtcdCluster` workload. Here, the `etcdEndpoints` field in
     the trait signals which `workload` it applies to, and we need to accommodate
     this type of `trait`.
```yaml
apiVersion: "etcd.database.coreos.com/v1beta2"
kind: "EtcdBackup"
metadata:
  name: example-etcd-cluster-backup
spec:
  etcdEndpoints: [<etcd-cluster-endpoints>]
  storageType: S3
  s3:
    path: <full-s3-path>
    awsSecret: <aws-secret>
```


## Proposal
The overall idea is for the applicationConfiguration controller to fill in critical information
in the workload and trait CRs it emits. In addition, we will provide a helper library so that
trait controller developers can locate the resources they need with a simple function call.
Here is the list of changes that we propose.
1. The applicationConfiguration controller no longer assumes that all `trait` CRDs contain a
   `spec.workloadRef` field that conforms to the OAM definition. It only fills in the workload's
   GVK and name in a `trait` CR if its CRD has a `spec.workloadRef` field defined as below.
```yaml
workloadRef:
  properties:
    apiVersion:
      type: string
    kind:
      type: string
    name:
      type: string
  required:
  - apiVersion
  - kind
  - name
  type: object
```
2. Add a `childResourceKinds` field in the WorkloadDefinition.
   Currently, a workloadDefinition is nothing but a shim of a real workload CRD. We propose to add
   an **optional** field called `childResourceKinds` to the schema of the workloadDefinition. We
   encourage workload owners to fill in this field when they register their controllers with the
   OAM system. This is the way for them to declare the types of the Kubernetes resources their
   workload controller actually generates. In our example, the workload definition can claim to
   generate `Deployment` and `Service` child resources.
```yaml
apiVersion: core.oam.dev/v1alpha2
kind: WorkloadDefinition
metadata:
  name: mydbs.standard.oam.dev
spec:
  definitionRef:
    name: mydbs.standard.oam.dev
  childResourceKinds:
    - apiVersion: apps/v1
      kind: Deployment
    - apiVersion: v1
      kind: Service
```
3. The OAM runtime will provide a helper library. The library uses the following logic to help a
   trait developer locate the resources for the trait to modify.
   1. Get the corresponding `workload` instance from the Kubernetes cluster using the information
      inserted by the applicationConfiguration controller in the `trait` CR.
   2. Fetch the corresponding `workloadDefinition` CR following an
      [OAM convention](https://github.com/oam-dev/spec/blob/master/3.workload.md#definitionref).
      The convention requires that the name of the `workloadDefinition` CR is the name of the
      `workload` CRD it refers to. For example, the name of the `workloadDefinition` CR
      that refers to a `containerizedworkloads.core.oam.dev` CRD is exactly
      `containerizedworkloads.core.oam.dev` as well.
   3. Fetch all the `childResourceKinds` values in the corresponding `workloadDefinition` instance.
   4. List the child resources of each GVK and filter them by owner reference. Here, we assume that
      all the child resources that the workload controller generates have a controller reference
      pointing back to the workload instance.
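
The proposed changes can be sketched in a few lines. The following is a minimal, cluster-free
illustration and not the helper library itself: Kubernetes objects are modeled as plain dicts
shaped like their JSON manifests, namespaces are ignored, and every function name
(`fill_workload_ref`, `locate_child_resources`, `is_controlled_by`) as well as the `get_object`,
`list_objects`, and `definition_name_for` callbacks are hypothetical stand-ins for real API calls.

```python
from typing import Any, Callable, Dict, List

# Assumption: objects are plain dicts shaped like Kubernetes JSON manifests.
K8sObject = Dict[str, Any]


def schema_has_workload_ref(open_api_schema: K8sObject) -> bool:
    """True if the trait CRD's openAPIV3Schema declares spec.workloadRef."""
    spec_schema = open_api_schema.get("properties", {}).get("spec", {})
    return "workloadRef" in spec_schema.get("properties", {})


def fill_workload_ref(trait: K8sObject, workload: K8sObject,
                      trait_schema: K8sObject) -> K8sObject:
    """Change 1: fill spec.workloadRef (in place) only if the CRD defines it."""
    if schema_has_workload_ref(trait_schema):
        trait.setdefault("spec", {})["workloadRef"] = {
            "apiVersion": workload["apiVersion"],
            "kind": workload["kind"],
            "name": workload["metadata"]["name"],
        }
    return trait


def is_controlled_by(child: K8sObject, workload: K8sObject) -> bool:
    """True if `child` carries a controller owner reference to `workload`."""
    workload_uid = workload["metadata"]["uid"]
    owner_refs = child.get("metadata", {}).get("ownerReferences") or []
    return any(
        ref.get("controller") is True and ref.get("uid") == workload_uid
        for ref in owner_refs
    )


def locate_child_resources(
    trait: K8sObject,
    get_object: Callable[[str, str, str], K8sObject],
    list_objects: Callable[[str, str], List[K8sObject]],
    definition_name_for: Callable[[str, str], str],
) -> List[K8sObject]:
    """Change 3: find the child resources a trait should modify."""
    # Step 1: resolve the workload from the reference filled into the trait.
    ref = trait["spec"]["workloadRef"]
    workload = get_object(ref["apiVersion"], ref["kind"], ref["name"])

    # Step 2: by convention the WorkloadDefinition is named after the CRD.
    definition = get_object(
        "core.oam.dev/v1alpha2",
        "WorkloadDefinition",
        definition_name_for(ref["apiVersion"], ref["kind"]),
    )

    # Steps 3 and 4: list every declared child kind, keep controlled objects.
    children: List[K8sObject] = []
    for child_kind in definition["spec"].get("childResourceKinds") or []:
        for candidate in list_objects(child_kind["apiVersion"], child_kind["kind"]):
            if is_controlled_by(candidate, workload):
                children.append(candidate)
    return children
```

Under these assumptions, a `ManualScalerTrait` controller would call `locate_child_resources`
with its own CR and then patch the `replica` field on each returned object that has one. Because
`childResourceKinds` is optional, the sketch returns an empty list when the field is absent.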

## Impact to the existing system
Here are the impacts of this mechanism on the existing OAM components:
- ApplicationConfiguration: This mechanism requires minimal changes in the
  applicationConfiguration controller.
- Workload: This mechanism does not affect workload controller implementation.
- Trait: This mechanism is optional, so all existing trait controllers still work. Any existing
  trait that wants to take advantage of the extensibility of OAM needs to be modified to use it.
  Any trait that only applies to a certain type of workload, such as the
  `EtcdBackup` trait, doesn't need to use this mechanism.
- WorkloadDefinition: Workload owners can modify the existing workloadDefinition if needed.

## Alternative approach
1. One alternative approach is to make the applicationConfiguration controller watch all
   the possible workload child resources and insert the child resources' GVKs and names into the
   corresponding workload CR. We do not recommend this approach, as it increases the complexity
   of the applicationConfiguration controller and makes it more of an availability liability.
2. Another approach is to implement a separate type for binding traits to workloads. This would
   work, but a label or annotation seems to be a more natural place to record the information.
   Otherwise, we need a way for the trait to discover the binding instance first.

## Extra labels
There might be cases where a workload generates more than one resource with the same GVK and only
wants to expose a subset of them to traits. In this case, we can add a pre-defined label such as
`core.oam.dev/expose=true` for the workload owner to indicate which resources to expose. This
section only illustrates that this is a solvable problem; it is beyond the scope of this
proposal for now.
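
As a rough illustration of how such a label could be honored, the sketch below filters a list of
child resources down to the ones labeled `core.oam.dev/expose=true`. Objects are again modeled as
plain dicts, the function name `exposed_resources` is hypothetical, and treating unlabeled
resources as hidden is an assumption: the proposal does not define that behavior.

```python
from typing import Any, Dict, List

# Label key taken from the example in the text; label values are strings.
EXPOSE_LABEL = "core.oam.dev/expose"


def exposed_resources(children: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the child resources labeled core.oam.dev/expose=true.

    Assumption: resources without the label are not exposed to traits.
    """
    exposed = []
    for child in children:
        labels = child.get("metadata", {}).get("labels") or {}
        if labels.get(EXPOSE_LABEL) == "true":
            exposed.append(child)
    return exposed
```

A trait controller could apply this filter to the child resources it has already located, so
that a workload generating several resources of the same GVK only exposes the labeled ones.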