Are you...
- Engaged in MLOps or ML research? 🔍
- Frustrated by YAML and kubectl complexities? 😤
- Wishing for a Python-Kubernetes blend in your ML pipelines? 💖
Discover KubeJobs:
- Python-Powered Kubernetes: Write Kubernetes operations in Python, get rid of YAML files. 🎉
- MLOps Made Magical: Manage Kubernetes Jobs, PVs, PVCs effortlessly, tailored for ML workflows. 🪄
- Research Reading, Writing Kubernetes: Set up and run multiple jobs for varied experiment parameters systematically. 📚
- Visualized Data: Track Kubernetes job and pod information via a Streamlit-powered web interface. 🖥️
Kubejobs is a Python package for creating and running Kubernetes Jobs, Persistent Volumes (PVs), and Persistent Volume Claims (PVCs). This package simplifies the process of deploying and managing jobs, PVs, and PVCs on Kubernetes, making it easier for users without Kubernetes experience to get started.
- Easy creation and management of Kubernetes Jobs through Python code.
- Supports GPU selection, shared memory size configuration, environment variables, and volume mounts.
- Initialize the KubernetesJob class using command-line arguments with Google Fire.
- Automatically generates YAML files for Kubernetes Jobs.
Install KubeJobs using pip:
pip install kubejobs
The KubernetesPod
class helps you create a Kubernetes Pod, generate its YAML configuration, and run the pod. Kubernetes Pods are the smallest deployable units in Kubernetes and can contain one or more containers. They have no automatic restart policy, making them suitable for short-lived tasks.
from kubejobs.pods import KubernetesPod
# Create a Kubernetes Pod with a name, container image, and command
pod = KubernetesPod(
name="my-pod",
image="ubuntu:20.04",
command=["/bin/bash", "-c", "echo 'Hello, World!'"],
)
# Generate the YAML configuration for the Pod
print(pod.generate_yaml())
# Run the Pod on the Kubernetes cluster
pod.run()
The KubernetesJob
class helps you create a Kubernetes Job, generate its YAML configuration, and run the job. Kubernetes Jobs are useful for running short-lived, one-off tasks in your cluster, under-the-hood, they create one or more Pods to execute the task.
from kubejobs.jobs import KubernetesJob
# Create a Kubernetes Job with a name, container image, and command
job = KubernetesJob(
name="my-job",
image="ubuntu:20.04",
command=["/bin/bash", "-c", "echo 'Hello, World!'"],
)
# Generate the YAML configuration for the Job
print(job.generate_yaml())
# Run the Job on the Kubernetes cluster
job.run()
You can mount an NFS partition to the Kubernetes Job by specifying the NFS server address, path, and mount options.
from kubejobs.jobs import KubernetesJob
job = KubernetesJob(
...
volume_mounts={
"nfs": {"mountPath": "/nfs", "server": "10.24.1.255", "path": "/"}
},
)
Note that both server
and path
are required fields for the NFS volume mount.
The create_jobs_for_experiments function allows you to create and run a series of Kubernetes Jobs for a list of commands. This can be helpful for running multiple experiments or tasks with different parameters in parallel.
from kubejobs.jobs import create_jobs_for_experiments
# List of commands to run as separate Kubernetes Jobs
commands = [
"python experiment.py --param1 value1",
"python experiment.py --param1 value2",
"python experiment.py --param1 value3",
]
# Create and run Kubernetes Jobs for each command in the list
create_jobs_for_experiments(
commands,
image="nvcr.io/nvidia/cuda:12.0.0-cudnn8-devel-ubuntu22.04",
gpu_type="nvidia.com/gpu", # Request GPU resources
gpu_limit=1, # Number of GPU resources to allocate
backoff_limit=4, # Maximum number of retries before marking job as failed
)
The create_pvc function helps you create a Persistent Volume Claim (PVC) in your Kubernetes cluster. PVCs are used to request storage resources from your cluster, allowing your applications to store and retrieve data.
from kubejobs.jobs import create_pvc
# Create a PVC with a name, storage size, and access mode
create_pvc("my-pvc", "5Ti", access_modes=["ReadWriteOnce"])
The create_pv function helps you create a Persistent Volume (PV) in your Kubernetes cluster. PVs represent physical storage resources in a cluster, which can be consumed by PVCs. This allows you to manage storage resources independently from applications that use them.
from kubejobs.jobs import create_pv
# Create a PV with a name, storage size, storage class, access mode, and other properties
create_pv(
pv_name="my-pv",
storage="1500Gi",
storage_class_name="my-storage-class",
access_modes=["ReadOnlyMany"],
pv_type="local", # Type of Persistent Volume, either 'local' or 'gcePersistentDisk'
claim_name="my-pvc", # Name of the PVC to bind to the PV
local_path="/mnt/data", # Path on the host for a local PV (required when pv_type is 'local')
)
web_pod_info.py
fetches, processes, and displays Kubernetes pod information. It can produce both console-based and web-based outputs using the Rich and Streamlit libraries, respectively.
- Fetch Kubernetes Pod information from a specific namespace.
- Display the fetched data in a table format in the console.
- Utilize Streamlit to render the data in a web interface.
Run the script from the command line with an optional namespace
parameter:
python kubejobs/web_pod_info.py --namespace=<your-namespace>
Run the script as above, and then navigate to the Streamlit URL displayed in the console to view the web table.
streamlit run kubejobs/web_pod_info.py
web_job_info.py
is a utility script for fetching and displaying information about Kubernetes jobs. It uses the Rich library for console-based output and Streamlit for web-based visualization.
- Fetch Kubernetes Job information from a specific namespace.
- Display the fetched data in a table format in the console.
- Utilize Streamlit to render the data in a web interface.
Run the script from the command line with an optional namespace
parameter:
python kubejobs/web_job_info.py --namespace=<your-namespace>
Run the script as above, and then navigate to the Streamlit URL displayed in the console to view the web table.
streamlit run kubejobs/web_job_info.py
manage_user_jobs.py
is a Python script for listing or deleting Kubernetes jobs initiated by a specific user. It offers console-based output using the Rich library.
- Fetch Kubernetes Jobs from a specific namespace.
- Filter jobs based on username and a search term.
- Optionally delete jobs that meet the filter criteria.
Run the script from the command line with optional parameters for namespace, username, search term, and actions.
python kubejobs/manage_user_jobs.py --namespace=<your-namespace> --username=<your-username> --term=<search-term> --delete=<True/False>
For more detailed examples and usage information, please refer to the official documentation.
Contributions are welcome! If you'd like to contribute, please:
- Fork the repository
- Create a new branch for your feature or bugfix
- Commit your changes and push them to your fork
- Open a pull request
KubeJobs is released under the MIT License.