Organize tasks into folders (#1888)
Change-Id: I031c8806294cc128751cb9edb6a62a09c4fb2225
alculquicondor authored Mar 25, 2024
1 parent 6d45624 commit 19ee32e
Showing 38 changed files with 108 additions and 53 deletions.
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
@@ -21,12 +21,12 @@ Read the [overview](https://kueue.sigs.k8s.io/docs/overview/) to learn more.
- **Resource management:** Support resource fair sharing and [preemption](https://kueue.sigs.k8s.io/docs/concepts/cluster_queue/#preemption) with a variety of policies between different tenants.
- **Dynamic resource reclaim:** A mechanism to [release](https://kueue.sigs.k8s.io/docs/concepts/workload/#dynamic-reclaim) quota as the pods of a Job complete.
- **Resource flavor fungibility:** Quota [borrowing or preemption](https://kueue.sigs.k8s.io/docs/concepts/cluster_queue/#flavorfungibility) in ClusterQueue and Cohort.
- **Integrations:** Built-in support for popular jobs, e.g. [BatchJob](https://kueue.sigs.k8s.io/docs/tasks/run_jobs/), [Kubeflow training jobs](https://kueue.sigs.k8s.io/docs/tasks/run_kubeflow_jobs/), [RayJob](https://kueue.sigs.k8s.io/docs/tasks/run_rayjobs/), [RayCluster](https://kueue.sigs.k8s.io/docs/tasks/run_rayclusters/), [JobSet](https://kueue.sigs.k8s.io/docs/tasks/run_jobsets/), [plain Pod](https://kueue.sigs.k8s.io/docs/tasks/run_plain_pods/).
- **Integrations:** Built-in support for popular jobs, e.g. [BatchJob](https://kueue.sigs.k8s.io/docs/tasks/run/jobs/), [Kubeflow training jobs](https://kueue.sigs.k8s.io/docs/tasks/run/kubeflow/), [RayJob](https://kueue.sigs.k8s.io/docs/tasks/run/rayjobs/), [RayCluster](https://kueue.sigs.k8s.io/docs/tasks/run/rayclusters/), [JobSet](https://kueue.sigs.k8s.io/docs/tasks/run/jobsets/), [plain Pod](https://kueue.sigs.k8s.io/docs/tasks/run/plain_pods/).
- **System insight:** Build-in [prometheus metrics](https://kueue.sigs.k8s.io/docs/reference/metrics/) to help monitor the state of the system, as well as Conditions.
- **AdmissionChecks:** A mechanism for internal or external components to influence whether a workload can be [admitted](https://kueue.sigs.k8s.io/docs/concepts/admission_check/).
- **Advanced autoscaling support:** Integration with cluster-autoscaler's [provisioningRequest](https://kueue.sigs.k8s.io/docs/admission-check-controllers/provisioning/#job-using-a-provisioningrequest) via admissionChecks.
- **Sequential admission:** A simple implementation of [all-or-nothing scheduling](https://kueue.sigs.k8s.io/docs/tasks/setup_sequential_admission/).
- **Partial admission:** Allows jobs to run with a [smaller parallelism](https://kueue.sigs.k8s.io/docs/tasks/run_jobs/#partial-admission), based on available quota, if the application supports it.
- **Sequential admission:** A simple implementation of [all-or-nothing scheduling](https://kueue.sigs.k8s.io/docs/tasks/manage/setup_sequential_admission/).
- **Partial admission:** Allows jobs to run with a [smaller parallelism](https://kueue.sigs.k8s.io/docs/tasks/run/jobs/#partial-admission), based on available quota, if the application supports it.

## Production Readiness status

2 changes: 1 addition & 1 deletion cmd/experimental/README.md
Original file line number Diff line number Diff line change
@@ -30,4 +30,4 @@ Keep in mind the following rules for each integration:
mark the integration as stale for at most 2 releases. After that, Kueue maintainers will remove
the folder.
- Based on user feedback, the [Kueue maintainers](/OWNERS), at their discretion, might choose to
move the [integration to pkg/controller/jobs](https://kueue.sigs.k8s.io/docs/tasks/integrate_a_custom_job/).
move the [integration to pkg/controller/jobs](https://kueue.sigs.k8s.io/docs/tasks/dev/integrate_a_custom_job/).
2 changes: 1 addition & 1 deletion site/content/en/docs/concepts/cluster_queue.md
Original file line number Diff line number Diff line change
@@ -487,5 +487,5 @@ If set to `None` or `spec.stopPolicy` is removed the ClusterQueue will to normal

- Create [local queues](/docs/concepts/local_queue)
- Create [resource flavors](/docs/concepts/resource_flavor) if you haven't already done so.
- Learn how to [administer cluster quotas](/docs/tasks/administer_cluster_quotas).
- Learn how to [administer cluster quotas](/docs/tasks/manage/administer_cluster_quotas).
- Read the [API reference](/docs/reference/kueue.v1beta1/#kueue-x-k8s-io-v1beta1-ClusterQueue) for `ClusterQueue`
8 changes: 4 additions & 4 deletions site/content/en/docs/concepts/multikueue.md
Original file line number Diff line number Diff line change
@@ -64,10 +64,10 @@ Known Limitations:
An approach similar to the one described for [`batch/Job`](#batchjob) is taken into account to overcome this.

## Submitting Jobs
In a [configured MultiKueue environemnt](/docs/tasks/setup_multikueue), you can submit any MultiKueue supported job to the Manager cluster, targeting a ClusterQueue configured for Multikueue.
In a [configured MultiKueue environemnt](/docs/tasks/manage/setup_multikueue), you can submit any MultiKueue supported job to the Manager cluster, targeting a ClusterQueue configured for Multikueue.
Kueue delegates the job to the configured worker clusters without any additional configuration changes.

## What’s next?
- Learn how to [setup a MultiKueue environment](/docs/tasks/setup_multikueue/)
- Learn how to [submit JobSets](/docs/tasks/run_jobsets/#jobset-definition) to a running Kueue cluster.
- Learn how to [submit batch/Jobs](/docs/tasks/run_jobs/#1-define-the-job) to a running Kueue cluster.
- Learn how to [setup a MultiKueue environment](/docs/tasks/manage/setup_multikueue/)
- Learn how to [submit JobSets](/docs/tasks/run/jobsets/#jobset-definition) to a running Kueue cluster.
- Learn how to [submit batch/Jobs](/docs/tasks/run/jobs/#1-define-the-job) to a running Kueue cluster.
2 changes: 1 addition & 1 deletion site/content/en/docs/concepts/workload.md
Original file line number Diff line number Diff line change
@@ -154,5 +154,5 @@ the requeueState (`.status.requeueState`) will be reset to null.
## What's next

- Learn about [workload priority class](/docs/concepts/workload_priority_class).
- Learn how to [run jobs](/docs/tasks/run_jobs)
- Learn how to [run jobs](/docs/tasks/run/jobs)
- Read the [API reference](/docs/reference/kueue.v1beta1/#kueue-x-k8s-io-v1beta1-Workload) for `Workload`
4 changes: 2 additions & 2 deletions site/content/en/docs/concepts/workload_priority_class.md
Original file line number Diff line number Diff line change
@@ -110,6 +110,6 @@ Workload's `PriorityClassSource` and `PriorityClassName` fields are immutable.

## What's next?

- Learn how to [run jobs](/docs/tasks/run_jobs)
- Learn how to [run jobs with workload priority](/docs/tasks/run_job_with_workload_priority)
- Learn how to [run jobs](/docs/tasks/run/jobs)
- Learn how to [run jobs with workload priority](/docs/tasks/manage/run_job_with_workload_priority)
- Read the [API reference](/docs/reference/kueue.v1beta1/#kueue-x-k8s-io-v1beta1-WorkloadPriorityClass) for `WorkloadPriorityClass`
4 changes: 2 additions & 2 deletions site/content/en/docs/overview/_index.md
Original file line number Diff line number Diff line change
@@ -28,12 +28,12 @@ A core design principle for Kueue is to avoid duplicating mature functionality i
- **Resource management:** Support resource fair sharing and [preemption](/docs/concepts/cluster_queue/#preemption) with a variety of policies between different tenants.
- **Dynamic resource reclaim:** A mechanism to [release](/docs/concepts/workload/#dynamic-reclaim) quota as the pods of a Job complete.
- **Resource flavor fungibility:** Quota [borrowing or preemption](/docs/concepts/cluster_queue/#flavorfungibility) in ClusterQueue and Cohort.
- **Integrations:** Built-in support for popular jobs, e.g. [BatchJob](/docs/tasks/run_jobs/), [Kubeflow training jobs](/docs/tasks/run_kubeflow_jobs/), [RayJob](/docs/tasks/run_rayjobs/), [RayCluster](/docs/tasks/run_rayclusters/), [JobSet](/docs/tasks/run_jobsets/), [plain Pod](/docs/tasks/run_plain_pods/).
- **Integrations:** Built-in support for popular jobs, e.g. [BatchJob](/docs/tasks/run/jobs/), [Kubeflow training jobs](/docs/tasks/run/kubeflow/), [RayJob](/docs/tasks/run/rayjobs/), [RayCluster](/docs/tasks/run/rayclusters/), [JobSet](/docs/tasks/run/jobsets/), [plain Pod](/docs/tasks/run/plain_pods/).
- **System insight:** Built-in [prometheus metrics](/docs/reference/metrics/) to help monitor the state of the system, as well as Conditions.
- **AdmissionChecks:** A mechanism for internal or external components to influence whether a workload can be [admitted](/docs/concepts/admission_check/).
- **Advanced autoscaling support:** Integration with cluster-autoscaler's [provisioningRequest](/docs/admission-check-controllers/provisioning/#job-using-a-provisioningrequest) via admissionChecks.
- **Sequential admission:** A simple implementation of [all-or-nothing scheduling](/docs/tasks/setup_sequential_admission/).
- **Partial admission:** Allows jobs to run with a [smaller parallelism](/docs/tasks/run_jobs/#partial-admission), based on available quota, if the application supports it.
- **Partial admission:** Allows jobs to run with a [smaller parallelism](/docs/tasks/run/jobs/#partial-admission), based on available quota, if the application supports it.

## High-level Kueue operation

31 changes: 16 additions & 15 deletions site/content/en/docs/tasks/_index.md
Original file line number Diff line number Diff line change
@@ -19,37 +19,38 @@ quotas and queues.

As a batch administrator, you can learn how to:

- [Setup role-based access control](/docs/tasks/rbac)
- [Setup role-based access control](manage/rbac)
to Kueue objects.
- [Administer cluster quotas](/docs/tasks/administer_cluster_quotas) with ClusterQueues and LocalQueues.
- Setup [Sequential Admission with Ready Pods](/docs/tasks/setup_sequential_admission).
- [Administer cluster quotas](manage/administer_cluster_quotas) with ClusterQueues and LocalQueues.
- Setup [Sequential Admission with Ready Pods](manage/setup_sequential_admission).
- As a batch administrator, you can learn how to
[monitor pending workloads](/docs/tasks/monitor_pending_workloads).
- As a batch administrator, you can learn how to [run a Kueue managed Jobs with a custom WorkloadPriority](/docs/tasks/run_job_with_workload_priority).
- As a batch administrator, you can learn how to [setup a MultiKueue environment](/docs/tasks/setup_multikueue).
[monitor pending workloads](manage/monitor_pending_workloads).
- As a batch administrator, you can learn how to [run a Kueue managed Jobs with a custom WorkloadPriority](manage/run_job_with_workload_priority).
- As a batch administrator, you can learn how to [setup a MultiKueue environment](manage/setup_multikueue).

### Batch user

A _batch user_ runs [workloads](/docs/concepts/workload). A typical
batch user is a researcher, AI/ML engineer, data scientist, among others.

As a batch user, you can learn how to:
- [Run a Kueue managed batch/Job](/docs/tasks/run_jobs).
- [Run a Kueue managed Flux MiniCluster](/docs/tasks/run_flux_minicluster).
- [Run a Kueue managed Kubeflow Job](/docs/tasks/run_kubeflow_jobs).
- [Run a Kueue managed batch/Job](run/jobs).
- [Run a Kueue managed Flux MiniCluster](run/flux_miniclusters).
- [Run a Kueue managed Kubeflow Job](run/kubeflow).
Kueue supports MPIJob v2beta1, PyTorchJob, TFJob, XGBoostJob, PaddleJob, and MXJob.
- [Run a Kueue managed KubeRay RayJob](/docs/tasks/run_rayjobs).
- [Submit Kueue jobs from Python](/docs/tasks/run_python_jobs).
- [Run a Kueue managed plain Pod](/docs/tasks/run_plain_pods).
- [Run a Kueue managed JobSet](/docs/tasks/run_jobsets).
- [Run a Kueue managed KubeRay RayJob](run/rayjobs).
- [Run a Kueue managed KubeRay RayCluster](run/rayclusters).
- [Submit Kueue jobs from Python](run/python_jobs).
- [Run a Kueue managed plain Pod](run/plain_pods).
- [Run a Kueue managed JobSet](run/jobsets).

### Platform developer

A _platform developer_ integrates Kueue with other software and/or contributes to Kueue.

As a platform developer, you can learn how to:
- [Integrate a custom Job with Kueue](/docs/tasks/integrate_a_custom_job).
- [Enable pprof endpoints](/docs/tasks/enabling_pprof_endpoints).
- [Integrate a custom Job with Kueue](dev/integrate_a_custom_job).
- [Enable pprof endpoints](dev/enabling_pprof_endpoints).

## Troubleshooting

7 changes: 7 additions & 0 deletions site/content/en/docs/tasks/dev/_index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
title: "Developer Tools"
weight: 3
date: 2024-03-22
description: >
As a _platform developer_, you can integrate with or develop for Kueue.
---
7 changes: 7 additions & 0 deletions site/content/en/docs/tasks/manage/_index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
title: "Manage Kueue"
weight: 1
date: 2024-03-22
description: >
As a _batch administrator_, you can manage Kueue.
---
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---
title: "Administer Cluster Quotas"
date: 2022-03-14
weight: 3
weight: 2
description: >
Manage your cluster resource quotas and to establish fair sharing rules among the tenants.
---
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
---

title: "Monitor pending Workloads"
linkTitle: "Monitor pending Workloads"
weight: 3
date: 2023-12-05
description: >
How to monitor pending Workloads
---

Kueue provides two ways of monitoring pending Workloads. For Kueue 0.6 and newer, the preferred way to monitor pending Workloads is using the on-demand API.
Original file line number Diff line number Diff line change
@@ -3,7 +3,7 @@ title: "Pending workloads in Status"
date: 2023-09-27
weight: 3
description: >
Pending workloads in Status
Obtain the pending workloads in ClusterQueue and LocalQueue statuses.
---

This page shows you how to monitor pending workloads.
Original file line number Diff line number Diff line change
@@ -3,7 +3,7 @@ title: "Pending Workloads on-demand"
date: 2023-12-05
weight: 3
description: >
Pending Workloads on-demand
Obtain the pending Workloads via the on-demand visibility API
---

This page shows you how to monitor pending workloads with VisibilityOnDemand feature.
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---
title: "Setup RBAC"
date: 2022-02-14
weight: 2
weight: 1
description: >
Setup role-based access control (RBAC) in your cluster to control the types of users that can view and create Kueue objects.
---
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---
title: "Run job with WorkloadPriority"
date: 2023-10-02
weight: 8
weight: 4
description: >
Run job with WorkloadPriority, which is independent from Pod's priority
---
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---
title: "Sequential Admission with Ready Pods"
date: 2022-03-14
weight: 4
weight: 5
description: >
Simple implementation of the all-or-nothing scheduling
---
15 changes: 0 additions & 15 deletions site/content/en/docs/tasks/monitor_pending_workloads/_index.md

This file was deleted.

7 changes: 7 additions & 0 deletions site/content/en/docs/tasks/run/_index.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
title: "Run Workloads"
weight: 2
date: 2024-03-22
description: >
As a _batch user_, you can run workloads.
---
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
---
title: "Run A Flux MiniCluster"
linkTitle: "Flux MiniClusters"
date: 2022-02-14
weight: 6
description: >
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
---
title: "Run A Job"
title: "Run A Kubernetes Job"
linkTitle: "Kubernetes Jobs"
date: 2022-02-14
weight: 5
description: >
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
---
title: "Run A JobSet"
linkTitle: "Jobsets"
date: 2023-06-16
weight: 7
description: >
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---

title: "Run with Kubeflow"
linkTitle: "Run with Kubeflow"
linkTitle: "Kubeflow Jobs"
weight: 6
date: 2023-08-23
description: >
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
---
title: "Run Plain Pods"
linkTitle: "Plain Pods"
date: 2023-09-27
weight: 6
description: >
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
---
title: "Run Jobs Using Python"
linkTitle: "Python"
date: 2023-07-05
weight: 7
description: >
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
---
title: "Run A RayCluster"
linkTitle: "RayClusters"
date: 2024-01-17
weight: 6
description: >
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
---
title: "Run A RayJob"
linkTitle: "RayJobs"
date: 2023-05-18
weight: 6
description: >
30 changes: 30 additions & 0 deletions site/static/_redirects
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
###############################################
# set server-side redirects in this file #
# see https://www.netlify.com/docs/redirects/ #
# test at https://play.netlify.com/redirects #
###############################################

/docs/tasks/administer_cluster_quotas /doc/tasks/manage/administer_cluster_quotas 301
/docs/tasks/monitor_pending_workloads /doc/tasks/manage/monitor_pending_workloads 301
/docs/tasks/rbac /doc/tasks/manage/rbac 301
/docs/tasks/run_job_with_workload_priority /doc/tasks/manage/run_job_with_workload_priority 301
/docs/tasks/setup_multikueue /doc/tasks/manage/setup_multikueue 301
/docs/tasks/setup_sequential_admission /doc/tasks/manage/setup_sequential_admission 301

/docs/tasks/enabling_pprof_endpoints /doc/tasks/dev/enabling_pprof_endpoints 301
/docs/tasks/integrate_a_custom_job /doc/tasks/dev/integrate_a_custom_job 301

/docs/tasks/run_flux_minicluster /docs/tasks/run/flux_miniclusters 301
/docs/tasks/run_jobs /docs/tasks/run/jobs 301
/docs/tasks/run_jobsets /docs/tasks/run/jobsets 301
/docs/tasks/run_kubeflow_jobs /docs/tasks/run/kubeflow 301
/docs/tasks/run_plain_pods /docs/tasks/run/plain_pods 301
/docs/tasks/run_rayclusters /docs/tasks/run/rayclusters 301
/docs/tasks/run_rayjobs /docs/tasks/run/rayjobs 301

/docs/tasks/run_kubeflow_jobs/run_mpijobs /docs/tasks/run/kubeflow/mpijobs 301
/docs/tasks/run_kubeflow_jobs/run_mxjobs /docs/tasks/run/kubeflow/mxjobs 301
/docs/tasks/run_kubeflow_jobs/run_paddlejobs /docs/tasks/run/kubeflow/paddlejobs 301
/docs/tasks/run_kubeflow_jobs/run_pytorchjobs /docs/tasks/run/kubeflow/pytorchjobs 301
/docs/tasks/run_kubeflow_jobs/run_tfjobs /docs/tasks/run/kubeflow/tfjobs 301
/docs/tasks/run_kubeflow_jobs/run_xgboostjobs /docs/tasks/run/kubeflow/xgboostjobs 301
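
Review note: each rule in the `_redirects` file above has the form `source target status`. The targets for the `manage/` and `dev/` pages begin with `/doc/`, while the `run/` targets begin with `/docs/`. The commit does not say whether that difference is intentional. A minimal sketch of a sanity check for such a file follows. The helper names `parse_redirects` and `suspicious_targets` are hypothetical, the `/docs/` prefix is an assumption taken from the other links in this commit, and treating a missing status as 301 is an assumption about Netlify's default.

```python
def parse_redirects(text):
    """Parse `source target [status]` lines, skipping blanks and `#` comments."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue  # not a complete rule; ignore it in this sketch
        source, target = parts[0], parts[1]
        # Assumption: a rule without an explicit status is a 301.
        status = int(parts[2]) if len(parts) > 2 else 301
        rules.append((source, target, status))
    return rules


def suspicious_targets(rules, expected_prefix="/docs/"):
    """Return the rules whose target does not start with the expected prefix."""
    return [rule for rule in rules if not rule[1].startswith(expected_prefix)]


# Two rules copied from the diff above, one of each kind.
SAMPLE = """
# set server-side redirects in this file
/docs/tasks/rbac /doc/tasks/manage/rbac 301
/docs/tasks/run_jobs /docs/tasks/run/jobs 301
"""

if __name__ == "__main__":
    for source, target, status in suspicious_targets(parse_redirects(SAMPLE)):
        print(f"check target: {source} -> {target} ({status})")
```

Running a check like this against the full file would list every rule whose target falls outside `/docs/`, which makes it easy to confirm or correct them in a follow-up.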
2 changes: 1 addition & 1 deletion site/static/examples/python/README.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
# Kueue in Python

Documentation for these examples can be found [on the Kueue documentation site](https://kueue.sigs.k8s.io/docs/tasks/run_python_jobs/).
Documentation for these examples can be found [on the Kueue documentation site](https://kueue.sigs.k8s.io/docs/tasks/run/python_jobs/).
