-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatic resource pruning EP #109
Conversation
Signed-off-by: Ryan King <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass review. A few typos. I will do a more thorough review next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good work. Some input from my side
|
||
#### Story 4 | ||
|
||
As an operator author, I want to prune a custom resources with specific status information when there is a certain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- "a custom resource"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that we want to limit it to "status" informtaion. Fields in spec may in addition also get considered, a class for VolumeSnapshotContent for instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not take status
to mean Kubernetes status
field. But a more generic status as in a specific condition or state. @ryantking thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not take status
to mean Kubernetes status
field. But a more generic status as in a specific condition or state. @ryantking thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in practice it will be contained in the status
field, but the best practice for where to store operator-maintained data is out of scope for this EP so yes this can refer to any generic state information stored on the resource.
type IsPruneableFunc func(obj *runtime.Object) error | ||
|
||
// RegisterIsPruneableFunc registers a function to check whether it is safe to prune a resources of a certain type. | ||
func RegisterIsPrunableFunc(gvk schema.GroupVersionKind, isPruneable IsPruneableFunc) { /* ... */ } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure why IsPruneableFunc get registered and StrategyFunc not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think because IsPrunableFunc
has intimate knowledge of the GVK. While the Strategy is agnostic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished my review. A few suggested rephrasings. Nicely done.
- Swap word "ephemeral" out for "unbounded" - Add alternative Go API proposal. Signed-off-by: Ryan King <[email protected]>
Signed-off-by: Ryan King <[email protected]>
Signed-off-by: Ryan King <[email protected]>
Signed-off-by: Ryan King <[email protected]>
@ryantking the API proposed in Appendix A was easier for me to rationalize and understand. I was able to easily picture a workflow for that API. The API proposed in Appendix B was definitely more flexible but the workflow felt more awkward to me. Like it was more complicated but for little gain in the short term. Basically I could not think of a scenario that was not doable with the API from Appendix A. |
@ryantking |
|
||
### Implementation-specific | ||
|
||
- What type of Kubernetes object should we generically work with? E.g. `metav1.Object`or `runtime.Object`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote would be client.Object
from controller-runtime, which is:
type Object interface {
metav1.Object
runtime.Object
}
That interface supports all well-behaved root API types and users get compile-time checks rather than having to resort to runtime type assertions when they need to access the other half of the object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joelanford will this work with an operator that was written with library-go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know for sure, but I would think so. You can pass a client.Object
to any function that accepts a metav1.Object
or a runtime.Object
, and all API root types (e.g. custom objects of CRDs and all built-in API types that are apply-able on-cluster) implement both interfaces.
An alternative approach would be adding this logic to the core SDK and scaffolding it optionally during operation | ||
generation. The primary drawbacks with this approach are the increased complexity to the implementation and adding it to | ||
existing operators. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A second alternative would be to implement a brand new operator that exposes a new set of prune APIs, e.g.:
PruneStrategy
ClusterPruneStrategy
Where the spec could look something like:
spec:
objects:
- group: example.com
version: v1
kind: MyKind
matchers:
- selector:
matchLabels:
example.com/is-completed: true
- maxAge: 10h
default:
matchers:
- selector:
matchLabels:
example.com/is-completed: true
- maxCount: 50
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joelanford would this operator be deployed with each operator? I can't see how a new operator would help individual operators cleanup their resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would likely be deployed once per cluster, and then two scenarios would be in play:
- each operator could self-configure pruning by laying down one of the prune strategy objects, either as part of the operator deployment itself or associated with an operand
- a cluster admin could provide their own prune strategy objects to cover their specific needs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The feature already exists for jobs and TTL. Here is the KEP, which foresees that it can be extended to pods.
IMHO a drawback of a generic controller is that it is either limited to the greatest common factor, which basically means the resource metadata or you need to implement special logic for each resource.
A good example of that it the quota controller. Generic is quota on count of instances. And then you have specialized components for pods, services and PVCs with specific logic for resource requests and limits.
Having the specialized logic together with the controller that created the resources may offer more flexibility to the operator author.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fgiloux Yep, all good points. I'm not necessarily suggesting we should actually pursue the alternative, but it may be helpful to consider it in whatever design we come up with.
The main motivation for something like a separate controller is to make it extremely easy for an operator author to say, "here, prune this stuff in this way" without having to worry about the details and mechanics of how all that happens.
We could implement this as a library that operator authors plug into their controllers as well. For example, perhaps they would just add something in main.go
like this:
maxAge := time.Minute * 60
if err := prune.Prune(ctx, mgr, prune.Config{
Selectors: prune.Selectors{
&myapi.OtherKind{}: prune.MaxAge(maxAge),
},
PruneInterval: 5 * time.Minute,
}); err != nll {
setupLog.Error(err, "pruner failed")
os.Exit(1)
}
where the prune package has something like:
package prune
type Selectors map[client.Object]Selector
type Selector interface {
Select(ctx context.Context, in []client.Object) ([]client.Object, error)
}
type SelectorFunc func(ctx context.Context, in []client.Object) ([]client.Object, error)
func (f SelectorFunc) Select(ctx context.Context, in []client.Object) ([]client.Object, error) {
return f(ctx, in)
}
func MaxAge(maxAge time.Duration) SelectorFunc {
return SelectorFunc(func(ctx context.Context, in []client.Object) ([]client.Object, error) {
var out []client.Object
for _, obj := range in {
if in.GetCreationTimestamp().Before(time.Now().Add(-maxAge)) {
out = append(out, obj)
}
}
return out
})
}
// IsPruneableFunc is a function that checks a the data of an object to see whether or not it is safe to prune it. | ||
// It should return `nil` if it is safe to prune, `ErrUnpruneable` if it is unsafe, or another error. | ||
// It should safely assert the object is the expected type, otherwise it might panic. | ||
type IsPruneableFunc func(obj *runtime.Object) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to be able to pass a logger. Either explicitly (my preference) or through the context as controller-runtime does. A logger should be passed to StrategyFunc and Prune which have context as parameter, in the same way.
@ryantking I have had a look at the latest version of the enhancement proposal and it looks good to me. The only things that I have slight concerns with are:
|
@fgiloux As far as the tenancy aspect goes, in the example of nightly pipelines vs release pipelines, I think that you could solve it with an func pipelineIsPruneable(obj client.Object) error {
pipeline, ok := obj.(*pipelinesv1.Pipeline)
// check assertion
if isRelease(pipeline) and age(pipeline) < 3 * time.Week {
return ErrUnprunable
}
// additional logic
return nil
} Would that pseudo-code work in theory? Another option for multi-tenancy is to create multiple Re: logging, I'll make sure the proposal incorporates the logging changes you made to the current implementation. |
@ryantking I think it is important to consider personas:
|
|
||
// Pruner is an object that runs a prune job. | ||
type Pruner struct { | ||
// ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having had a closer look at the EP I would very much like to see what you intend to have in type Pruner struct
to understand what can be configurable through type PrunerOption func(p *Pruner)
@fgiloux I'm still struggling to understand which specific design choices in the proposed API limit the use cases you are laying out. If you are giving me use cases that you want to make sure are covered, then can you present them as user stories so I can add them to the EP? If there are use cases that you think are not possible with the proposed API then I think we should talk offline in greater detail to figure out what changes I need to make. The way I look at it, there are two real ways the user implements prune logic:
A concrete example would be an Let me know your thoughts or feel free to reach out if you want to talk in more detail. |
@ryantking I am fine with the proposal. Mind that I have little karma on this repository as I am not part of the Operator Framework organisation. |
Signed-off-by: Ryan King [email protected]