Commit
Add a note about the need to restart the kueue pod (kubernetes-sigs#2685)
mimowo authored Jul 24, 2024
1 parent f679118 commit 014d4ec
Showing 9 changed files with 46 additions and 1 deletion.
5 changes: 5 additions & 0 deletions site/content/en/docs/tasks/run/jobsets.md
@@ -17,6 +17,11 @@ This guide is for [batch users](/docs/tasks#batch-user) that have a basic unders

2. See [JobSet Installation](https://jobset.sigs.k8s.io/docs/installation/) for installation and configuration details of JobSet Operator.

{{% alert title="Note" color="note" %}}
In order to use JobSet you need to restart Kueue after the installation.
You can do it by running: `kubectl delete pods -lcontrol-plane=controller-manager -nkueue-system`.
{{% /alert %}}
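
The restart described in the note can be wrapped in a small shell function. This is a minimal sketch and not part of the commit: `restart_kueue`, `KUBECTL`, and `KUEUE_NAMESPACE` are hypothetical names introduced here for illustration, while the label selector and namespace are taken from the note itself. The follow-up `get pods` call is only a convenience for checking that replacement pods have appeared.

```shell
# Assumed override hooks (not defined by Kueue): set KUBECTL=echo to preview
# the commands without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"
KUEUE_NAMESPACE="${KUEUE_NAMESPACE:-kueue-system}"

restart_kueue() {
  # Delete the Kueue controller-manager pods, using the selector from the note.
  "$KUBECTL" delete pods -l control-plane=controller-manager -n "$KUEUE_NAMESPACE" &&
    # List the pods with the same selector to confirm that replacements exist.
    "$KUBECTL" get pods -l control-plane=controller-manager -n "$KUEUE_NAMESPACE"
}
```

Call `restart_kueue` once the integration has been installed. Setting `KUBECTL=echo` first prints the two commands instead of running them.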

## JobSet definition

When running [JobSets](https://jobset.sigs.k8s.io/docs/concepts/) on
7 changes: 6 additions & 1 deletion site/content/en/docs/tasks/run/kubeflow/mpijobs.md
@@ -16,7 +16,12 @@ Check [administer cluster quotas](/docs/tasks/manage/administer_cluster_quotas)

Check [the MPI Operator installation guide](https://github.com/kubeflow/mpi-operator#installation).

You can [modify kueue configurations from installed releases](/docs/installation#install-a-custom-configured-released-version) to include MPIJobs as an allowed workload.
You can [modify kueue configurations from installed releases](/docs/installation#install-a-custom-configured-released-version) to include MPIJobs as an allowed workload.

{{% alert title="Note" color="note" %}}
In order to use MPIJob you need to restart Kueue after the installation.
You can do it by running: `kubectl delete pods -lcontrol-plane=controller-manager -nkueue-system`.
{{% /alert %}}

## MPI Operator definition

5 changes: 5 additions & 0 deletions site/content/en/docs/tasks/run/kubeflow/mxjobs.md
@@ -20,6 +20,11 @@ Note that the minimum requirement training-operator version is v1.7.0.

You can [modify kueue configurations from installed releases](/docs/installation#install-a-custom-configured-released-version) to include MXJobs as an allowed workload.

{{% alert title="Note" color="note" %}}
In order to use Training Operator you need to restart Kueue after the installation.
You can do it by running: `kubectl delete pods -lcontrol-plane=controller-manager -nkueue-system`.
{{% /alert %}}

## MXJob definition

### a. Queue selection
5 changes: 5 additions & 0 deletions site/content/en/docs/tasks/run/kubeflow/paddlejobs.md
@@ -20,6 +20,11 @@ Note that the minimum requirement training-operator version is v1.7.0.

You can [modify kueue configurations from installed releases](/docs/installation#install-a-custom-configured-released-version) to include PaddleJobs as an allowed workload.

{{% alert title="Note" color="note" %}}
In order to use Training Operator you need to restart Kueue after the installation.
You can do it by running: `kubectl delete pods -lcontrol-plane=controller-manager -nkueue-system`.
{{% /alert %}}

## PaddleJob definition

### a. Queue selection
5 changes: 5 additions & 0 deletions site/content/en/docs/tasks/run/kubeflow/pytorchjobs.md
@@ -20,6 +20,11 @@ Note that the minimum requirement training-operator version is v1.7.0.

You can [modify kueue configurations from installed releases](/docs/installation#install-a-custom-configured-released-version) to include PyTorchJobs as an allowed workload.

{{% alert title="Note" color="note" %}}
In order to use Training Operator you need to restart Kueue after the installation.
You can do it by running: `kubectl delete pods -lcontrol-plane=controller-manager -nkueue-system`.
{{% /alert %}}

## PyTorchJob definition

### a. Queue selection
5 changes: 5 additions & 0 deletions site/content/en/docs/tasks/run/kubeflow/tfjobs.md
@@ -20,6 +20,11 @@ Note that the minimum requirement training-operator version is v1.7.0.

You can [modify kueue configurations from installed releases](/docs/installation#install-a-custom-configured-released-version) to include TFJobs as an allowed workload.

{{% alert title="Note" color="note" %}}
In order to use Training Operator you need to restart Kueue after the installation.
You can do it by running: `kubectl delete pods -lcontrol-plane=controller-manager -nkueue-system`.
{{% /alert %}}

## TFJob definition

### a. Queue selection
5 changes: 5 additions & 0 deletions site/content/en/docs/tasks/run/kubeflow/xgboostjobs.md
@@ -20,6 +20,11 @@ Note that the minimum requirement training-operator version is v1.7.0.

You can [modify kueue configurations from installed releases](/docs/installation#install-a-custom-configured-released-version) to include XGBoostJobs as an allowed workload.

{{% alert title="Note" color="note" %}}
In order to use Training Operator you need to restart Kueue after the installation.
You can do it by running: `kubectl delete pods -lcontrol-plane=controller-manager -nkueue-system`.
{{% /alert %}}

## XGBoostJob definition

### a. Queue selection
5 changes: 5 additions & 0 deletions site/content/en/docs/tasks/run/rayclusters.md
@@ -19,6 +19,11 @@ This guide is for [batch users](/docs/tasks#batch-user) that have a basic unders

3. See [KubeRay Installation](https://ray-project.github.io/kuberay/deploy/installation/) for installation and configuration details of KubeRay.

{{% alert title="Note" color="note" %}}
In order to use RayCluster you need to restart Kueue after the installation.
You can do it by running: `kubectl delete pods -lcontrol-plane=controller-manager -nkueue-system`.
{{% /alert %}}

## RayCluster definition

When running [RayClusters](https://docs.ray.io/en/latest/cluster/getting-started.html) on
5 changes: 5 additions & 0 deletions site/content/en/docs/tasks/run/rayjobs.md
@@ -18,6 +18,11 @@ This guide is for [batch users](/docs/tasks#batch-user) that have a basic unders

2. See [KubeRay Installation](https://ray-project.github.io/kuberay/deploy/installation/) for installation and configuration details of KubeRay.

{{% alert title="Note" color="note" %}}
In order to use RayJob you need to restart Kueue after the installation.
You can do it by running: `kubectl delete pods -lcontrol-plane=controller-manager -nkueue-system`.
{{% /alert %}}

## RayJob definition

When running [RayJobs](https://ray-project.github.io/kuberay/guidance/rayjob/) on