Automatic resource pruning EP #109 (merged May 3, 2022)

---
title: Automatic Resource Pruning
authors:
- "@ryantking"
reviewers:
- "@jmrodri"
- "@gallettilance"
approvers:
- "@jmrodri"
creation-date: 2022-01-28
last-updated: 2022-01-28
status: implementable
---

## Summary

This EP aims to provide an easy way for operator authors to limit the number of ephemeral resources that may
exist on the cluster at any point in time. In this context, we define an ephemeral resource as a short-lived resource
that is continuously created as the operator runs. Short-lived does not refer to a specific amount of time but rather to the
fact that the resource has a defined life span. For example, a web server running in a `Pod` is not ephemeral because it
will run until an external force such as a cluster administrator or CD pipeline acts on it, while a log-rotating script
running in a `Pod` is ephemeral since it will run until it finishes its defined work. Operator authors will be able
to employ different strategies to limit the number of ephemeral resources, such as adding a numerical limit, an age
limit, or something custom to the author's use case.

## Motivation

Often, operators will create ephemeral resources during execution. For example, if we imagine an operator that
implements Kubernetes' builtin `CronJob` functionality, every time the operator reconciles and finds a `CronJob` to run,
it creates a new `Job` object to represent a single execution of the `CronJob`. Users will often want to have access to
these ephemeral resources in order to view historical data, but want to limit the number that can exist on the system.
Looking again at Kubernetes out-of-the-box functionalities, users can configure the retention policy for resources such
as `ReplicaSets` and `Jobs` to maintain a certain amount of historical data. Operator authors should have a defined path
for implementing the same functionality for their operators.

### Goals

- Add a library to [operator-lib](https://github.com/operator-framework/operator-lib) that houses this functionality
with a user-friendly API.
- Add a happy path for what we determine to be common use cases, such as removing `Pods` in a finished state if they are
older than a set age.
- Provide an easy way for operator authors to plug in custom logic for their custom resource types and use cases.

### Non-Goals

- Add auto-pruning to any of the templates or scaffolding functionality.
- Adding auto-pruning support for Helm or Ansible operators.

## Proposal

The proposed implementation is adding a package, `prune`, to
[operator-lib](https://github.com/operator-framework/operator-lib) that exposes this functionality. There will be a
primary entry point function that takes in configuration and prunes resources accordingly. The configuration will accept
one or many resource types, a pruning strategy, namespaces, label selectors, and other common settings such as a dry run
mode, hooks, and logging configuration.
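As a rough illustration of that configuration surface, the following self-contained sketch shows one possible shape. All names here (`Config`, `GVK`, `MaxAge`, and so on) are hypothetical stand-ins, not the proposed operator-lib API, and `GVK` substitutes for apimachinery's `schema.GroupVersionKind` to keep the example dependency-free.

```go
package main

import (
	"fmt"
	"time"
)

// GVK identifies a resource type; a stand-in for schema.GroupVersionKind.
type GVK struct{ Group, Version, Kind string }

// Config is a hypothetical shape for the prune entry point's configuration:
// one or many resource types, namespaces, label selectors, and common
// settings such as dry-run mode and a strategy knob.
type Config struct {
	GVKs          []GVK
	Namespaces    []string
	LabelSelector string
	DryRun        bool
	MaxAge        time.Duration // example strategy setting
}

func main() {
	cfg := Config{
		GVKs:       []GVK{{"batch", "v1", "Job"}, {"", "v1", "Pod"}},
		Namespaces: []string{"default"},
		DryRun:     true,
		MaxAge:     24 * time.Hour,
	}
	fmt.Println(len(cfg.GVKs), cfg.DryRun)
}
```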

Another aspect of the library will be determining when it can and cannot prune a resource. For example, the prune
functionality should not remove a running `Pod` until it has completed, even if it meets the strategy criteria. The
library will expose a way for operator authors to specify what criteria make a resource of a specific type safe to
prune; this can include checking annotations, a status, or any other data available on the resource. An important
distinction to draw is the difference between a strategy and checking whether it is safe to prune a resource (henceforth
called the "is-pruneable" functionality). Strategies look generically at collections of resources and decide which
resources in the collection to prune, if any. They should only take criteria common to all Kubernetes resources into
account such as the count of resources and creation timestamp. The is-pruneable functionality conversely looks at one
and only one resource at a given point in time to determine whether or not the current prune task can remove a resource
based on its specific data.

Note that strategies will not be programmatically limited to being resource-agnostic, but it will be a defined best
practice to write strategies in such a way. One exception to this recommendation will be when the operator author wants
to prune based on a cumulative value such as the summation of a field across multiple resources. The operator author
must then be sure to add safeguards to the strategy to avoid unexpected behavior if used with an incompatible resource.
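A minimal sketch of such a guarded, resource-specific strategy follows. The `sized` type and `maxTotalBytes` function are invented for illustration; the point is the type assertion that fails loudly instead of misbehaving on an incompatible resource.

```go
package main

import "fmt"

// sized is the resource-specific data this strategy depends on; a real
// strategy would assert to a concrete API type instead.
type sized struct {
	Name  string
	Bytes int
}

// maxTotalBytes prunes resources once the cumulative Bytes across the
// collection exceeds a cap. Because it is not resource-agnostic, it guards
// against incompatible input rather than assuming the assertion succeeds.
func maxTotalBytes(capBytes int, objs []any) ([]sized, error) {
	var total int
	var prune []sized
	for _, o := range objs {
		s, ok := o.(sized)
		if !ok {
			return nil, fmt.Errorf("incompatible resource %T: strategy requires sized", o)
		}
		total += s.Bytes
		if total > capBytes {
			prune = append(prune, s)
		}
	}
	return prune, nil
}

func main() {
	objs := []any{sized{"a", 40}, sized{"b", 40}, sized{"c", 40}}
	prune, err := maxTotalBytes(100, objs)
	fmt.Println(len(prune), err)
}
```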

### User Stories

#### Story 1

As an operator author, I want to limit the number of `Jobs` in a completed state on the cluster so that the long term
storage cost of my operator has a cap.

#### Story 2

As an operator author, I want to limit the number of `Pods` in a completed state on the cluster and preserve the
`Pods` in an error state so that the long term storage cost of my operator has a soft cap and the user can see status
information about failed `Pods` before manually removing them.

#### Story 3

As an operator author, I want to limit the number of `Pods` with a custom heuristic based on the creation timestamp so
that the long term storage cost of my operator has a cap based on my operator's logic.

#### Story 4

As an operator author, I want to prune custom resources with specific status information when there is a certain
number of them so that the long term storage cost of my operator has a cap.

#### Story 5

As an operator author, I want to prune both `Jobs` and `Pods` of a certain age so that the long term storage cost of my
operator has a cap and there are no orphaned resources.

### Implementation Details

- A strategy is a function that takes in a collection of resources and returns a collection of resources to remove.
- The identifier for a resource type will be a `GroupVersionKind` value.
- An is-pruneable function takes in one instance of a resource and returns an error value that indicates if it is safe
to prune or has other issues.
- The library will provide built-in is-pruneable functions for `Pods` and `Jobs` that can be overwritten.
- A registry will hold a mapping of resource types (`GVKs`) to is-pruneable functions.

A proposed go API is in [Appendix A](#appendix-a).
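The registry bullet above can be sketched as follows. `GroupVersionKind` is reduced to a local struct so the example stays self-contained, and the names mirror, but are not, the proposed operator-lib API.

```go
package main

import (
	"errors"
	"fmt"
)

// GroupVersionKind stands in for apimachinery's schema.GroupVersionKind.
type GroupVersionKind struct{ Group, Version, Kind string }

// IsPruneableFunc reports whether one resource instance is safe to prune;
// a nil return means safe.
type IsPruneableFunc func(obj any) error

// registry maps resource types to their is-pruneable checks. Entries for
// built-in types (Pods, Jobs) would be pre-populated and overwritable.
var registry = map[GroupVersionKind]IsPruneableFunc{}

// RegisterIsPruneableFunc installs (or overwrites) the check for a type.
func RegisterIsPruneableFunc(gvk GroupVersionKind, f IsPruneableFunc) {
	registry[gvk] = f
}

func main() {
	jobGVK := GroupVersionKind{"batch", "v1", "Job"}
	RegisterIsPruneableFunc(jobGVK, func(obj any) error {
		m, ok := obj.(map[string]any)
		if !ok || m["status"] != "Complete" {
			return errors.New("job has not completed")
		}
		return nil
	})
	// Look up the check by GVK and run it against one resource's data.
	fmt.Println(registry[jobGVK](map[string]any{"status": "Complete"}))
}
```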

### Risks and Mitigations

The primary risk associated with this EP is exposing too many knobs and features in the day 1 implementation. We can
mitigate this by only exposing functionality that is absolutely needed. APIs are easy to grow, but near-impossible to
shrink.

## Design Details

### Test Plan

The following components will be unit tested:

- The built-in strategies.
- The built-in is-pruneable functions.
- The main prune routine.

The feature author will add an integration test suite that runs the prune routine in the use cases defined in the user
stories.

## Implementation History

[operator-framework/operator-lib#75](https://github.com/operator-framework/operator-lib/pull/75): Implements a first
pass of the prune package with only support for `Jobs` and `Pods`. The API is also slightly different from the one
proposed in this EP.

## Drawbacks

The user will need to manually integrate this functionality into their operator since it is a library.

## Alternatives

An alternative approach would be adding this logic to the core SDK and scaffolding it optionally during operator
generation. The primary drawbacks of this approach are the increased implementation complexity and the difficulty of
adding it to existing operators.
## Open Questions

- What are the predefined use cases that we want to support? Currently we support pruning completed `Jobs` and `Pods` by
age and max count.

### Implementation-specific

- What type of Kubernetes object should we generically work with? E.g. `metav1.Object` or `runtime.Object`?
- How do we specify which Kubernetes objects to delete? Pass back another list of objects? We just need name, namespace,
and `GVK`.
- Which Kubernetes client should we work with? Dynamic client due to custom resource types?
- Should we register `IsPruneable` functions or a `ResourceConfig` structure that will hold that function and
potentially additional configuration?

## Appendix A

The following is the proposed Go API:

```go
// StrategyFunc takes a list of resources and returns the subset to prune.
type StrategyFunc func(ctx context.Context, objs []runtime.Object) ([]runtime.Object, error)

// ErrUnpruneable indicates that it is not allowed to prune a specific object.
type ErrUnpruneable struct {
	Obj    runtime.Object
	Reason string
}

// IsPruneableFunc is a function that checks the data of an object to see whether or not it is safe to prune it.
// It should return `nil` if it is safe to prune, `ErrUnpruneable` if it is unsafe, or another error.
// It should safely assert the object is the expected type, otherwise it might panic.
type IsPruneableFunc func(obj runtime.Object) error

// RegisterIsPruneableFunc registers a function to check whether it is safe to prune resources of a certain type.
func RegisterIsPruneableFunc(gvk schema.GroupVersionKind, isPruneable IsPruneableFunc) { /* ... */ }

// Pruner is an object that runs a prune job.
type Pruner struct {
	// ...
}

// PrunerOption configures the pruner.
type PrunerOption func(p *Pruner)

// NewPruner returns a pruner that prunes objects using the configured strategy.
func NewPruner(client dynamic.Interface, opts ...PrunerOption) Pruner { return Pruner{} }

// Prune runs the pruner.
func (p Pruner) Prune(ctx context.Context) error { return nil }
```