[Plugin] Flux Operator #3829
Comments
@vsoch this is great to see, happy to help where we can! So the main changes that need to happen to add a new plugin:
There are a few relatively recent examples that lay a great foundation for how backend plugins can be added. This is a comment laying out the PRs that added the dask plugin and here is the issue for the ray plugin. I would advise you to take a look through the process that both of those plugins went through, and I would be happy to fill in the gaps.
Also @vsoch please join slack.flyte.org. I am sure the community would love this and love to help. We would also love to learn what you plan to do with it.
@vsoch This is great to see. Please let us know if any further questions arise in the process; we'd like to see this integration happen too. Also, we host a bi-weekly contributor meetup where all the maintainers, steering committee members, and new/existing contributors discuss ideas. It'd be great to have you there. More info here
heyo! Apologies for the small silence - I had 3x the amount of normal meetings this week and didn't have enough time to program! I have this on my TODO and worst case will be a few weeks away - I will definitely keep you in the loop. Thank you for keeping the issue open!
okay I'm starting with flyteidl. My question is pretty simple - how do I know what fields to create here?

syntax = "proto3";

import "flyteidl/core/tasks.proto";

package flyteidl.plugins;

option go_package = "github.com/flyteorg/flyteidl/gen/pb-go/flyteidl/plugins";

// Custom Proto for Dask Plugin.
message DaskJob {
    // Spec for the scheduler pod.
    DaskScheduler scheduler = 1;

    // Spec of the default worker group.
    DaskWorkerGroup workers = 2;
}

// Specification for the scheduler pod.
message DaskScheduler {
    // Optional image to use. If unset, will use the default image.
    string image = 1;

    // Resources assigned to the scheduler pod.
    core.Resources resources = 2;
}

message DaskWorkerGroup {
    // Number of workers in the group.
    uint32 number_of_workers = 1;

    // Optional image to use for the pods of the worker group. If unset, will use the default image.
    string image = 2;

    // Resources assigned to the all pods of the worker group.
    // As per https://kubernetes.dask.org/en/latest/kubecluster.html?highlight=limit#best-practices
    // it is advised to only set limits. If requests are not explicitly set, the plugin will make
    // sure to set requests==limits.
    // The plugin sets ` --memory-limit` as well as `--nthreads` for the workers according to the limit.
    core.Resources resources = 3;
}

It looks like this covers only a tiny subset of the resources listed here: https://kubernetes.dask.org/en/latest/operator_resources.html. Is this the overlapping set between what flyte needs and what dask provides? What about the other fields in the CRD?
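For readers following along, the `requests==limits` rule mentioned in the `DaskWorkerGroup` comments can be sketched in a few lines. This is only an illustration of the behaviour that comment describes, not the plugin's actual code: the helper name `resolve_resources` and the dictionary layout are assumptions made up for this example.

```python
from typing import Dict, Optional


def resolve_resources(
    limits: Dict[str, str],
    requests: Optional[Dict[str, str]] = None,
) -> Dict[str, Dict[str, str]]:
    """Build a Kubernetes-style resources block.

    Hypothetical helper: it mirrors the rule from the DaskWorkerGroup
    comment, where requests that are not explicitly set default to the limits.
    """
    # Copy the inputs so the returned block never aliases the caller's dicts.
    effective_requests = dict(requests) if requests else dict(limits)
    return {"limits": dict(limits), "requests": effective_requests}


if __name__ == "__main__":
    # Only limits are given, so requests are filled in from them.
    print(resolve_resources({"cpu": "2", "memory": "4Gi"}))
    # Explicit requests are kept as they are.
    print(resolve_resources({"cpu": "2"}, {"cpu": "1"}))
```

Read this way, the proto only needs to carry the handful of values a user sets per task, while anything derived from them (such as the requests above) can be computed on the plugin side. Whether that is the intended split for every CRD field is still the open question here.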
@vsoch any updates on this?
I don’t think anyone ever answered my question?
Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable.
Hi! 👋 I develop the Flux Operator (https://flux-framework.org/flux-operator/), which is conceptually like the MPI operator: it brings up a Flux Framework cluster (that acts as a job) to run a scoped piece of work. I'm interested in adding it as a plugin (and can also do the development work for it), but I wanted to check first about the order of operations.
I had first cloned https://github.com/flyteorg/flyteplugins, and I started adding the operator under k8s until I noticed that the others (e.g., dask) had a DaskJob that is also defined under the flyteidl repository. So I think the correct order of operations (and what I want to check here) is:
Are there any more pieces? Thanks for the help! I tried flyte out this week and really loved it - it already has support for several CRDs I've been hoping to see in one place, so I'm eager to see support for our operator here as well.