
[Feature] Support Volcano for batch scheduling #755

Merged · 30 commits merged into master on Dec 1, 2022

Conversation

@tgaddair (Contributor) commented Nov 22, 2022

This PR is based on the branch created by @loleek and the Spark Operator integration with Volcano.

Why are these changes needed?

To enable advanced multi-tenancy features for running multiple Ray clusters in a shared environment.

Related issue number

Closes #697

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@kevin85421 (Member)

Thanks @tgaddair for this contribution!

In my opinion, KubeRay shouldn't be too opinionated about external tools. Maybe the better solution is to make sure the exposed interfaces are enough for users to configure which tools they want to use. cc @DmitriGekhtman @Jeffwan

@DmitriGekhtman (Collaborator)

> Thanks @tgaddair for this contribution!
>
> In my opinion, KubeRay shouldn't be too opinionated about external tools. Maybe the better solution is to make sure the exposed interfaces are enough for users to configure which tools they want to use.

I generally agree.

Counterarguments in this case:

  • From what I can tell, Volcano is pretty well established at this point as the go-to thing for batch scheduling. (The spark operator has built-in support.)
  • The change is not too heavy-weight and is guarded by a flag. (So users still have to opt-in.)
  • This might be the most minimal change for natural support.

> make sure the exposed interfaces are enough for users to configure which tools they want to use

@kevin85421 do you have an idea for a less invasive interface change that could work here?

@DmitriGekhtman (Collaborator)

It does appear that the same functionality could be achieved with some boilerplate -- namely, adding the relevant annotations, configuring the podgroup yourself, and kubectl-applying a yaml file structured as

```yaml
podgroup
---
raycluster
```
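A hedged sketch of what that combined file might look like. All names, sizes, and the annotation wiring below are illustrative assumptions, not taken from this PR:

```yaml
# Hypothetical combined manifest: a Volcano PodGroup applied alongside a
# RayCluster in one kubectl apply. Names and values are assumptions.
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: ray-cluster-pg              # assumed name
spec:
  minMember: 3                      # gang size: head pod + 2 workers (assumed)
---
apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  name: raycluster-example          # assumed name
spec:
  headGroupSpec:
    template:
      metadata:
        annotations:
          scheduling.k8s.io/group-name: ray-cluster-pg  # ties the pods to the PodGroup
      spec:
        schedulerName: volcano
  # workerGroupSpecs with the same annotation and schedulerName omitted for brevity
```

The point of the boilerplate version is that the user, not the operator, is responsible for keeping `minMember` and the group-name annotation in sync with the cluster spec.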

@kevin85421 (Member)

> From what I can tell, Volcano is pretty well established at this point as the go-to thing for batch scheduling. (The spark operator has built-in support.)

I know it is a well-established tool, but it also increases the complexity of the KubeRay operator. We need to write down the comparison between different solutions for Volcano integration and choose the best one.

In addition, the Spark operator behaves oddly from a Spark perspective (at least on the Spark side). For example, it creates a driver for each Spark job, whereas in most production cases a cluster has only one driver. Hence, the Spark operator seems unusual to me, so I am not sure it is a good example to follow.

> The change is not too heavy-weight and is guarded by a flag. (So users still have to opt-in.)

Currently, it is not heavy-weight. However, if we integrate a lot of external tools into the KubeRay operator, it will be heavy.

> This might be the most minimal change for natural support.
>
> @kevin85421 do you have an idea for a less invasive interface change that could work here?

No, I don't have much context on Volcano; that's why I want to see a comparison between the different solutions.

I don't have a strong opinion about how to integrate Volcano, but I would like to see the comparison so that we can choose the best option.

@DmitriGekhtman (Collaborator) commented Nov 22, 2022

These are great points!

> it also increases the complexity of the KubeRay operator

The operator does have enough issues as it is :)

Yeah, if it's possible for us to document how to do it oneself that would be great.

I imagine a developer building infrastructure on top of KubeRay could write their own operator to manage a FancyProprietaryRayCluster custom resource. This operator could create a RayCluster and some extra stuff (ingresses, volcano podgroups, whatever) for each FancyProprietaryRayCluster.

@Jeffwan (Collaborator) commented Nov 23, 2022

I will help review this as well. Please wait for my comments.

@DmitriGekhtman (Collaborator) commented Nov 23, 2022

As @tgaddair explained yesterday, the reason "native" integration is required is that Volcano needs creation of a PodGroup with an owner reference to the RayCluster CR before the RayCluster's pods are reconciled.

I think all of this is reasonable. We should probably take some steps to mitigate the risk from increased code complexity.
Some things that would need to be done to merge this:

  1. Use an annotation to toggle batch scheduling support, rather than an operator flag.
  2. Express this feature in a modular way that would allow plugging in other batch schedulers.
  3. Write some CI tests.
  4. Do some manual testing and record the testing steps in this PR's comment thread.
  5. Document this support in the "features" section of the docs. In that section, identify yourself as the maintainer of this integration (see the note here for an example.)
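For illustration, the PodGroup-with-owner-reference arrangement described above might look roughly like this; every field value here is an assumption for the sketch, not taken from the PR:

```yaml
# Hypothetical PodGroup as the operator might create it: owned by the
# RayCluster CR (so Kubernetes garbage-collects it with the cluster) and
# created before the RayCluster's pods are reconciled.
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  name: raycluster-sample-pg        # assumed name
  ownerReferences:
    - apiVersion: ray.io/v1alpha1
      kind: RayCluster
      name: raycluster-sample       # the owning RayCluster CR (assumed)
      uid: <filled in from the live CR at creation time>
spec:
  minMember: 3                      # total pods that must be schedulable together
```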

@Jeffwan (Collaborator) left a comment

Overall this looks good to me. I just left some minor comments.

I know some other users use solutions other than Volcano, but sig-scheduling has not been able to unify the PodGroup definition, and it's normally hard to support multiple ones. Volcano looks good to me since it's already a CNCF project.

```go
if EnableBatchScheduler {
	var minMember int32
	var totalResource corev1.ResourceList
	if instance.Spec.EnableInTreeAutoscaling == nil || !*instance.Spec.EnableInTreeAutoscaling {
```
Collaborator:

Should we use minMember to indicate the minimum gang size, or use a separate annotation?

Contributor Author:

Volcano considers minMember to be the minimum gang size, which we set to the total replica count (if not using autoscaling) or the total minReplicas count (when using autoscaling).

Did you want to make this more configurable so that users can specify their own min gang size?
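The rule described above can be sketched in Go with simplified, hypothetical types; the real KubeRay spec types differ, and the head-pod accounting here is an assumption about the PR's exact arithmetic:

```go
package main

import "fmt"

// workerGroup and clusterSpec are illustrative stand-ins for the real
// KubeRay types in ray-operator/apis; field names are assumptions.
type workerGroup struct {
	Replicas    int32
	MinReplicas int32
}

type clusterSpec struct {
	EnableInTreeAutoscaling bool
	WorkerGroups            []workerGroup
}

// computeMinMember mirrors the behavior described in the thread: the
// PodGroup's minMember counts the head pod plus either the total replica
// count (no autoscaling) or the total minReplicas (with autoscaling).
func computeMinMember(spec clusterSpec) int32 {
	var minMember int32 = 1 // head pod (assumed to be included)
	for _, g := range spec.WorkerGroups {
		if spec.EnableInTreeAutoscaling {
			minMember += g.MinReplicas
		} else {
			minMember += g.Replicas
		}
	}
	return minMember
}

func main() {
	static := clusterSpec{WorkerGroups: []workerGroup{{Replicas: 4, MinReplicas: 1}}}
	autoscaled := clusterSpec{EnableInTreeAutoscaling: true, WorkerGroups: []workerGroup{{Replicas: 4, MinReplicas: 1}}}
	fmt.Println(computeMinMember(static))     // 5: head + 4 replicas
	fmt.Println(computeMinMember(autoscaled)) // 2: head + 1 minReplica
}
```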

ray-operator/controllers/ray/utils/util.go (outdated, resolved)
@tgaddair tgaddair marked this pull request as ready for review November 30, 2022 01:07
@tgaddair (Contributor Author)

@Jeffwan thanks for the review! I made some updates to make the implementation more general, so we can add in additional schedulers in the future. This also makes the implementation consistent with the Spark Operator, which follows a similar convention. I'll take some time to address your comments and ensure tests are passing.

Co-authored-by: Dmitri Gekhtman <[email protected]>
Signed-off-by: Travis Addair <[email protected]>
@DmitriGekhtman (Collaborator) commented Nov 30, 2022

Last request from me:
For the doc page, could you document a simple example workflow that demonstrates the gang scheduling functionality?
That would help people understand the usefulness of this integration. It'd also serve as a manual (later, automated) e2e test of the integration.

@tgaddair (Contributor Author) commented Dec 1, 2022

Hey @DmitriGekhtman, thanks for the great feedback. I added a pretty complete end-to-end example in the docs, and verified it worked as written. Let me know what you think!

I believe this PR should be good to go at this point.

@DmitriGekhtman (Collaborator)

Looks great, one last suggestion concerning memory allocation for the Ray head pod.

@tgaddair (Contributor Author) commented Dec 1, 2022

Thanks @DmitriGekhtman, updated the example to use 2Gi of RAM for the head node.

@DmitriGekhtman (Collaborator) left a comment

Thanks for the excellent contribution.
This comes just in time for the KubeRay 0.4.0 branch cut!

@DmitriGekhtman DmitriGekhtman merged commit d6aef8b into master Dec 1, 2022
```yaml
metadata:
  name: kuberay-test-queue
spec:
  weight: 1
```
Contributor:

Could you explain weight in the doc?

```yaml
  name: kuberay-test-queue
spec:
  weight: 1
  capability:
```
Contributor:

Is it possible to update the queue's capability when a new pod group is waiting to be scheduled?

Contributor Author:

Yes, will update docs to clarify this.
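For context, the two Queue fields under discussion work roughly as follows in Volcano: `weight` sets the queue's proportional (soft) share of cluster resources relative to other queues, while `capability` sets a hard upper bound. A hedged sketch with illustrative values (the resource numbers are assumptions, not from this PR's docs):

```yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: kuberay-test-queue
spec:
  weight: 1            # proportional share relative to other queues (soft)
  capability:          # hard cap on what this queue may consume (values assumed)
    cpu: 4
    memory: 6Gi
```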

"sigs.k8s.io/controller-runtime/pkg/builder"
)

type BatchScheduler interface {
Contributor:

Add some comments to explain the interface and all of its functions.

```go
	AddMetadataToPod(app *rayiov1alpha1.RayCluster, pod *v1.Pod)
}

type BatchSchedulerFactory interface {
```
Contributor:

ditto

@sihanwang41 (Contributor)

Oops, looks like I left my comments late. @tgaddair feel free to open a follow-up PR to address them. Thank you for the contribution!

@DmitriGekhtman (Collaborator)

Sorry, @sihanwang41, I was a bit trigger happy!

@tgaddair feel free to follow-up on Sihan's documentation comments in another PR!

@tgaddair tgaddair deleted the volcano-int branch December 1, 2022 19:57
@tgaddair (Contributor Author) commented Dec 1, 2022

Thanks @sihanwang41, will update in a follow-up that also fixes a typo in the example.

@tgaddair (Contributor Author) commented Dec 1, 2022

@sihanwang41 addressed comments in #776.

@kevin85421 (Member) commented Dec 1, 2022

The document is very detailed! Thanks @tgaddair for this high-quality contribution!

It feels risky to merge this big feature into master at the last minute before the v0.4.0 branch cut. Should we include this PR in release v0.4.0 or move it to the next release? cc @DmitriGekhtman @sihanwang41

Risk:

  • Although this feature is disabled by default, we will not have a chance to change the exposed interfaces once we release them in v0.4.0.
  • We do not know the Kubernetes compatibility for Volcano. ([Feature] Support batch scheduling and queueing #213 (comment) => I am not sure about the accuracy of this comment, but we need to be serious about it.)

Any thoughts? Thanks!

Updated:

  • The exposed interface is simple (only 1 flag --enable-batch-scheduler).
  • Release managers will test Kubernetes compatibility before v0.4.0 release.

It is okay to include this PR in release v0.4.0 after we test it.

@DmitriGekhtman (Collaborator) commented Dec 1, 2022

Thanks @kevin85421, these are reasonable concerns!
I think the key priority for the release will be testing compatibility of core, stable functionality (the operator and especially the RayCluster controller) against a range of Kubernetes versions. This integration is an alpha feature.

Also, eventually, we need automated tools to validate a Kubernetes compatibility matrix -- I think that's tracked somewhere.

lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023
Adds Volcano integration for KubeRay.

Signed-off-by: Travis Addair <[email protected]>
Co-authored-by: dengkai02 <[email protected]>
Co-authored-by: Dmitri Gekhtman <[email protected]>

## Run Ray Cluster with Volcano scheduler

Add the `ray.io/scheduler-name: volcano` label to your RayCluster CR to submit the cluster pods to Volcano for scheduling.
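A minimal sketch of that label on a RayCluster CR; the cluster name is a placeholder and the rest of the spec is elided:

```yaml
apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  name: test-cluster                 # assumed name
  labels:
    ray.io/scheduler-name: volcano   # submit this cluster's pods via Volcano
spec:
  # head and worker group specs unchanged; omitted for brevity
```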
Contributor:

Can KubeRay support integrating the RayJob CRD with Volcano?


Successfully merging this pull request may close these issues.

[Feature] Investigate integration of KubeRay and Volcano, add to docs
6 participants