[Docs][KubeRay] Update KubeRay + Kueue guides to use newer versions of Kueue (ray-project#48564)

## Why are these changes needed?

Update KubeRay + Kueue guides to use newer versions of Kueue

## Related issue number

<!-- For example: "Closes ray-project#1234" -->

## Checks

- [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [X] I've made sure the tests are passing. Note that there might be a few flaky tests; see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

Signed-off-by: Andrew Sy Kim <[email protected]>
andrewsykim authored Nov 5, 2024
1 parent 12e58ab commit 11aca59
Showing 2 changed files with 7 additions and 11 deletions.
@@ -37,16 +37,16 @@ Create a GKE cluster with the `enable-autoscaling` option:
```bash
gcloud container clusters create kuberay-gpu-cluster \
--num-nodes=1 --min-nodes 0 --max-nodes 1 --enable-autoscaling \
---zone=us-west1-b --machine-type e2-standard-4 --cluster-version 1.29
+--zone=us-east4-c --machine-type e2-standard-4
```

Create a GPU node pool with the `enable-queued-provisioning` option enabled:
```bash
-gcloud beta container node-pools create gpu-node-pool \
+gcloud container node-pools create gpu-node-pool \
--accelerator type=nvidia-l4,count=1,gpu-driver-version=latest \
--enable-queued-provisioning \
--reservation-affinity=none \
---zone us-west1-b \
+--zone us-east4-c \
--cluster kuberay-gpu-cluster \
--num-nodes 0 \
--min-nodes 0 \
@@ -55,14 +55,10 @@ gcloud beta container node-pools create gpu-node-pool \
--machine-type g2-standard-4
```

-This command creates a node pool which initially has zero nodes. Use the `gcloud beta` command because some of the flags have beta status.
+This command creates a node pool which initially has zero nodes.
The `--enable-queued-provisioning` flag enables "queued provisioning" in the Kubernetes node autoscaler using the ProvisioningRequest API. More details are below.
You need to use the `--reservation-affinity=none` flag because GKE doesn't support Node Reservations with ProvisioningRequest.

-:::{note}
-"enable-queued-provisioning" is only available on versions 1.28+ with the `gcloud beta` command
-:::
-

## Install the KubeRay operator

@@ -71,9 +67,9 @@ The KubeRay operator Pod must be on the CPU node if you set up the taint for the

## Install Kueue

-Install Kueue with the ProvisioningRequest API enabled.
+Install the latest released version of Kueue.
```
-kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.6.0/manifests-alpha-enabled.yaml
+kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.8.2/manifests.yaml
```

See [Kueue Installation](https://kueue.sigs.k8s.io/docs/installation/#install-a-released-version) for more details on installing Kueue.
@@ -29,7 +29,7 @@ The KubeRay operator Pod must be on the CPU node if you set up the taint for the
## Step 2: Install Kueue

```bash
-VERSION=v0.6.0
+VERSION=v0.8.2
kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/$VERSION/manifests.yaml
```
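
The version-pinned install in this step can be wrapped in a small helper that checks the tag before it reaches `kubectl`. This is only a sketch and isn't part of the commit: `kueue_manifest_url` is a hypothetical function name, and the URL pattern is assumed to hold for other Kueue releases the way it does for v0.8.2. The script only prints the command instead of running it, so it works without a cluster.

```bash
# Hypothetical helper (not from the Ray docs): print the release manifest URL
# for a Kueue version tag, following the URL pattern used in the guide.
kueue_manifest_url() {
  # Loose glob check that the tag looks like vN.N.N; anything else is rejected.
  case "$1" in
    v[0-9]*.[0-9]*.[0-9]*) ;;
    *) echo "unexpected version tag: $1" >&2; return 1 ;;
  esac
  printf 'https://github.com/kubernetes-sigs/kueue/releases/download/%s/manifests.yaml\n' "$1"
}

# Default to the version this commit pins; override by exporting VERSION.
VERSION="${VERSION:-v0.8.2}"

# Print the install command for review rather than applying it directly.
echo "kubectl apply --server-side -f $(kueue_manifest_url "$VERSION")"
```

Keeping the tag in one variable means a future version bump touches a single line, which is the same reason the guide uses `$VERSION` in the URL.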
