From f70af363a0dd4824e1faafe67280fffedc7b7960 Mon Sep 17 00:00:00 2001 From: Rui Vieira Date: Tue, 22 Oct 2024 14:23:41 +0100 Subject: [PATCH] Merge LM-Eval dev branch (#337) * Add lm-eval-service controller (#258) * feat: Initial database support (#246) * Initial database support - Add status checking - Add better storage flags - Add spec.storage.format validation - Add DDL -Add HIBERNATE format to DB (test) - Update service image - Revert identifier to DATABASE - Update CR options (remove mandatory data) * Remove default DDL generation env var * Update service image to latest tag * Add migration awareness * Add updating pods for migration * Change JDBC url from mysql to mariadb * Fix TLS mount * Revert images * Remove redundant logic * Fix comments * feat: Add TLS certificate mount on ModelMesh (#255) * feat: Add TLS certificate mount on ModelMesh * Revert from http to https until https://github.com/kserve/modelmesh/pull/147 is merged * Add lm-eval-service controller refactor the existing TrustyAIService controller and add LMEvalService controller Signed-off-by: Yihong Wang --------- Signed-off-by: Yihong Wang Co-authored-by: Rui Vieira * fix: Fix typo in operator's arguments (#261) Operator's arguments changed from `--eanble-services` to `--enable-services`. trustyai.opendatahub.io_lmevaljobs.yaml and zz_generated.deepcopy.go regenerated. 
* feat: Add LMES driver build to GHA (#272) * sync: sync dev/lm-eval with main branch (#271) * feat: Initial database support (#246) * Initial database support - Add status checking - Add better storage flags - Add spec.storage.format validation - Add DDL -Add HIBERNATE format to DB (test) - Update service image - Revert identifier to DATABASE - Update CR options (remove mandatory data) * Remove default DDL generation env var * Update service image to latest tag * Add migration awareness * Add updating pods for migration * Change JDBC url from mysql to mariadb * Fix TLS mount * Revert images * Remove redundant logic * Fix comments * feat: Add TLS certificate mount on ModelMesh (#255) * feat: Add TLS certificate mount on ModelMesh * Revert from http to https until https://github.com/kserve/modelmesh/pull/147 is merged * Pin oc version, ubi version (#263) * Restore checkout of trustyai-exp (#265) * Add operator installation robustness (#266) * fix: Skip InferenceService patching for KServe RawDeployment (#262) * feat: ConfigMap key to disable KServe Serverless configuration (#267) * feat: Add support for custom certificates in database connection (#259) * Add TLS endpoint for ModelMesh payload processors. 
(#268) Keep non-TLS endpoint for KServe Serverless (disabled by default) --------- Signed-off-by: Yihong Wang Co-authored-by: Rui Vieira Co-authored-by: Rob Geada * Weekly sync up of dev/lm-eval branch (#278) * feat: Initial database support (#246) * Initial database support - Add status checking - Add better storage flags - Add spec.storage.format validation - Add DDL -Add HIBERNATE format to DB (test) - Update service image - Revert identifier to DATABASE - Update CR options (remove mandatory data) * Remove default DDL generation env var * Update service image to latest tag * Add migration awareness * Add updating pods for migration * Change JDBC url from mysql to mariadb * Fix TLS mount * Revert images * Remove redundant logic * Fix comments * feat: Add TLS certificate mount on ModelMesh (#255) * feat: Add TLS certificate mount on ModelMesh * Revert from http to https until https://github.com/kserve/modelmesh/pull/147 is merged * Pin oc version, ubi version (#263) * Restore checkout of trustyai-exp (#265) * Add operator installation robustness (#266) * fix: Skip InferenceService patching for KServe RawDeployment (#262) * feat: ConfigMap key to disable KServe Serverless configuration (#267) * feat: Add support for custom certificates in database connection (#259) * Add TLS endpoint for ModelMesh payload processors. (#268) Keep non-TLS endpoint for KServe Serverless (disabled by default) * fix: Correct maxSurge and maxUnavailable (#275) * feat: Add support for custom DB names (#257) * feat: Add support for custom DB names * fix: Correct custom DB name --------- Signed-off-by: Yihong Wang Co-authored-by: Rui Vieira Co-authored-by: Rob Geada * Driver updates job's status periodically (#280) The driver periodically updates the LMEvalJob.Status.Message field with output from lm-eval. It captures messages matching a pattern like `Running text generation: 81%|`, which users can then read to check the progress of the job. 
Signed-off-by: Yihong Wang * Add Dockerfile for LMES job image (#276) Add Dockerfile for LMES job image and the needed files Signed-off-by: Yihong Wang * feat: Add overlays (#283) * feat: Add overlays * Remove redundant lmes-tas overlay. Change job image name. * Add job image build (#284) * Change job image to use midstream lm-evaluation-harness (#285) * feat: support batch size (#290) Add batch size support to the LMEvalJob, leveraging the `--batch_size` option of `lm-evaluation-harness`. This only affects local models. The `--batch_size` option doesn't work for remote inference APIs. Signed-off-by: Yihong Wang * Add the `openai` package into the lmes job image (#292) Update the LMES job's Dockerfile to include the `openai` package. Signed-off-by: Yihong Wang * fix: fix dependency error in the job image (#296) Split up the unitxt and openai dependencies to avoid the conflict. Signed-off-by: Yihong Wang * feat: add device detection in lmes driver (#298) Added a new feature in the LMES driver to detect the available devices by using the PyTorch API. This feature can be disabled by passing the `--detect-device false` option. Signed-off-by: Yihong Wang * feat: support unitxt recipes (#301) Add new fields in the CRD to support unitxt recipes and leverage the driver to create the corresponding yaml files of the unitxt recipes. Signed-off-by: Yihong Wang * feat: support custom dataset (#309) Updated the CRD data struct to allow users to specify a custom Unitxt card in JSON format. The custom Unitxt card is equivalent to a custom dataset definition. Also restructured and updated the CRD to support Volumes, VolumeMounts, Env, Resources, Labels, and Annotations. Signed-off-by: Yihong Wang * feat: new pulling mechanism for job statuses (#314) Update the driver to keep running even after the user program finishes. 
The driver provides two APIs: - GetStatus(): retrieve job status - Shutdown(): properly tear down the driver On the controller side, it uses the `pod/exec` resource to run the driver command, which invokes the driver APIs to retrieve the job status and shut down the driver when the job is done. Signed-off-by: Yihong Wang * Move operator's cmd/operator/main.go to cmd/main.go to keep operator-sdk compatibility (#295) * Remove hardcoded job's user ID (#322) * Fix mkdir command in Job dockerfile (#330) * Refactor some lmesreconcile methods (#323) * Refactor lmes reconcile options Signed-off-by: ted chang * Update controllers/lmes/lmevaljob_controller.go Co-authored-by: Yihong Wang * Update controllers/lmes/lmevaljob_controller.go Co-authored-by: Yihong Wang Signed-off-by: ted chang --------- Signed-off-by: ted chang Co-authored-by: Yihong Wang * tidy: clean up lmes-job image (#333) Remove BAM related packages and patch. Signed-off-by: Yihong Wang * Enable job suspend for Kueue (#317) * Refactor lmes reconcile options Signed-off-by: ted chang * Update controllers/lmes/lmevaljob_controller.go Co-authored-by: Yihong Wang * Update controllers/lmes/lmevaljob_controller.go Co-authored-by: Yihong Wang Signed-off-by: ted chang * Enable job suspend for Kueue Signed-off-by: ted chang --------- Signed-off-by: ted chang Co-authored-by: Yihong Wang * Add overlay placeholders for main merge (#334) * sync: sync up dev/lm-eval branch with main branch (#336) * [CI] Run tests from trustyai-tests (#279) * Change Dockerfile to clone trustyai-tests * Add PYTEST_MARKERS env and remove TESTS_REGEX * RHOAIENG-12274: Update operator's overlays (#287) * Update operator's overlays * Update kustomization.yaml * Add devflag printout to GH Action comment (#289) * Add timeout loop to DSC install (#305) * RHOAIENG-13625: Add DBAvailable status to CR (#304) * Add DBAvailable status to CR * Remove probes * Add KServe destination rule for Inference Services in the ServiceMesh (#315) * Add DestinationRule creation for 
KServe serverless * Add permissions for destination rules * Add role for destination rules * Add missing role for creating destination rules * Fix spacing in DestinationRule template * Add check if DestinationRule CRD is present before creating it (#316) * Add check for DestinationRule CRD * Add API extensions to operator's scheme * Add permission for CRD resource * Fix operator metrics service target port (#320) * Add readiness probes (#312) * Enable KServe serverless in the rhoai overlay (#321) * Update overlay images (#331) * Add correct CA cert to JDBC (#324) * Add correct CA cert to JDBC * Add require SSL * Support for VirtualServices for InferenceLogger traffic (#332) * Generate KServe Inference Logger in conformance with DestinationRule and VirtualService * Add VirtualService creation for models in the mesh * Add permissions for VirtualServices * Update manifests for VirtualServices * Fix VirtualServiceName variable * fix yaml linter after the sync Signed-off-by: Yihong Wang * tidy the go.mod and go.sum as well Signed-off-by: Yihong Wang --------- Signed-off-by: Yihong Wang Co-authored-by: Adolfo Aguirrezabal Co-authored-by: Rui Vieira Co-authored-by: Rob Geada Co-authored-by: Rui Vieira --------- Signed-off-by: Yihong Wang Signed-off-by: ted chang Co-authored-by: Yihong Wang Co-authored-by: Rob Geada Co-authored-by: ted chang Co-authored-by: Adolfo Aguirrezabal --- .github/workflows/build-and-push.yaml | 26 +- .github/workflows/controller-tests.yaml | 2 +- .yamllint.yaml | 4 +- Dockerfile | 6 +- Dockerfile.driver | 25 + Dockerfile.lmes-job | 27 + Makefile | 9 +- PROJECT | 17 +- api/lmes/v1alpha1/groupversion_info.go | 43 + api/lmes/v1alpha1/lmevaljob_types.go | 296 ++ api/lmes/v1alpha1/zz_generated.deepcopy.go | 325 ++ api/{ => tas}/v1alpha1/groupversion_info.go | 0 .../v1alpha1/trustyaiservice_types.go | 0 .../v1alpha1/zz_generated.deepcopy.go | 1 - cmd/lmes_driver/main.go | 214 ++ cmd/lmes_driver/main_test.go | 73 + main.go => cmd/main.go | 29 +- 
config/base/kustomization.yaml | 2 +- config/base/params.env | 8 + .../trustyai.opendatahub.io_lmevaljobs.yaml | 3273 +++++++++++++++++ ...styai.opendatahub.io_trustyaiservices.yaml | 20 +- config/crd/kustomization.yaml | 1 + config/manager/manager.yaml | 2 + config/overlays/lmes/kustomization.yaml | 8 + config/overlays/lmes/lmes-only-patch.yaml | 14 + config/overlays/odh/params.env | 7 + config/overlays/rhoai/kustomization.yaml | 4 + config/overlays/rhoai/params.env | 9 +- config/overlays/rhoai/tas-only-patch.yaml | 14 + config/rbac/role.yaml | 61 +- controllers/constants/version.go | 6 + controllers/controllers.go | 81 + controllers/lmes.go | 23 + controllers/lmes/config.go | 108 + controllers/lmes/constants.go | 43 + controllers/lmes/driver/driver.go | 469 +++ controllers/lmes/driver/driver_test.go | 317 ++ controllers/lmes/lmevaljob_controller.go | 901 +++++ controllers/lmes/lmevaljob_controller_test.go | 957 +++++ controllers/tas.go | 23 + controllers/{ => tas}/certificates.go | 5 +- controllers/{ => tas}/config_maps.go | 6 +- controllers/{ => tas}/config_maps_test.go | 9 +- controllers/{ => tas}/constants.go | 3 +- controllers/{ => tas}/database.go | 4 +- controllers/{ => tas}/deployment.go | 10 +- controllers/{ => tas}/deployment_test.go | 21 +- controllers/{ => tas}/destination_rule.go | 6 +- controllers/{ => tas}/events.go | 4 +- controllers/{ => tas}/finalizers.go | 5 +- controllers/{ => tas}/inference_services.go | 27 +- controllers/{ => tas}/monitor.go | 9 +- controllers/{ => tas}/monitor_test.go | 5 +- controllers/{ => tas}/oauth.go | 12 +- controllers/{ => tas}/route.go | 9 +- controllers/{ => tas}/route_test.go | 5 +- controllers/{ => tas}/secrets.go | 5 +- controllers/{ => tas}/service_accounts.go | 8 +- .../{ => tas}/service_accounts_test.go | 3 +- controllers/{ => tas}/services.go | 12 +- controllers/{ => tas}/statuses.go | 4 +- controllers/{ => tas}/statuses_test.go | 4 +- controllers/{ => tas}/storage.go | 5 +- controllers/{ => 
tas}/storage_test.go | 5 +- controllers/{ => tas}/subreconciler.go | 7 +- controllers/{ => tas}/suite_test.go | 13 +- controllers/{ => tas}/templates/parser.go | 0 .../templates/service/deployment.tmpl.yaml | 0 .../service/destination-rule.tmpl.yaml | 0 .../templates/service/route.tmpl.yaml | 0 .../service/service-internal.tmpl.yaml | 0 .../service/service-monitor-central.tmpl.yaml | 0 .../service/service-monitor-local.tmpl.yaml | 0 .../templates/service/service-tls.tmpl.yaml | 0 .../service/virtual-service.tmpl.yaml | 0 .../{ => tas}/trustyaiservice_controller.go | 21 +- controllers/{ => tas}/virtual_service.go | 6 +- controllers/utils.go | 78 - controllers/utils/utils.go | 57 + controllers/version.go | 5 - go.mod | 76 +- go.sum | 161 +- patch/lmes/models.patch | 12 + 83 files changed, 7730 insertions(+), 340 deletions(-) create mode 100644 Dockerfile.driver create mode 100644 Dockerfile.lmes-job create mode 100644 api/lmes/v1alpha1/groupversion_info.go create mode 100644 api/lmes/v1alpha1/lmevaljob_types.go create mode 100644 api/lmes/v1alpha1/zz_generated.deepcopy.go rename api/{ => tas}/v1alpha1/groupversion_info.go (100%) rename api/{ => tas}/v1alpha1/trustyaiservice_types.go (100%) rename api/{ => tas}/v1alpha1/zz_generated.deepcopy.go (99%) create mode 100644 cmd/lmes_driver/main.go create mode 100644 cmd/lmes_driver/main_test.go rename main.go => cmd/main.go (79%) create mode 100644 config/crd/bases/trustyai.opendatahub.io_lmevaljobs.yaml create mode 100644 config/overlays/lmes/kustomization.yaml create mode 100644 config/overlays/lmes/lmes-only-patch.yaml create mode 100644 config/overlays/rhoai/tas-only-patch.yaml create mode 100644 controllers/constants/version.go create mode 100644 controllers/controllers.go create mode 100644 controllers/lmes.go create mode 100644 controllers/lmes/config.go create mode 100644 controllers/lmes/constants.go create mode 100644 controllers/lmes/driver/driver.go create mode 100644 controllers/lmes/driver/driver_test.go 
create mode 100644 controllers/lmes/lmevaljob_controller.go create mode 100644 controllers/lmes/lmevaljob_controller_test.go create mode 100644 controllers/tas.go rename controllers/{ => tas}/certificates.go (97%) rename controllers/{ => tas}/config_maps.go (95%) rename controllers/{ => tas}/config_maps_test.go (94%) rename controllers/{ => tas}/constants.go (98%) rename controllers/{ => tas}/database.go (96%) rename controllers/{ => tas}/deployment.go (97%) rename controllers/{ => tas}/deployment_test.go (99%) rename controllers/{ => tas}/destination_rule.go (96%) rename controllers/{ => tas}/events.go (94%) rename controllers/{ => tas}/finalizers.go (93%) rename controllers/{ => tas}/inference_services.go (91%) rename controllers/{ => tas}/monitor.go (97%) rename controllers/{ => tas}/monitor_test.go (98%) rename controllers/{ => tas}/oauth.go (92%) rename controllers/{ => tas}/route.go (97%) rename controllers/{ => tas}/route_test.go (98%) rename controllers/{ => tas}/secrets.go (97%) rename controllers/{ => tas}/service_accounts.go (95%) rename controllers/{ => tas}/service_accounts_test.go (99%) rename controllers/{ => tas}/services.go (83%) rename controllers/{ => tas}/statuses.go (99%) rename controllers/{ => tas}/statuses_test.go (99%) rename controllers/{ => tas}/storage.go (97%) rename controllers/{ => tas}/storage_test.go (97%) rename controllers/{ => tas}/subreconciler.go (95%) rename controllers/{ => tas}/suite_test.go (97%) rename controllers/{ => tas}/templates/parser.go (100%) rename controllers/{ => tas}/templates/service/deployment.tmpl.yaml (100%) rename controllers/{ => tas}/templates/service/destination-rule.tmpl.yaml (100%) rename controllers/{ => tas}/templates/service/route.tmpl.yaml (100%) rename controllers/{ => tas}/templates/service/service-internal.tmpl.yaml (100%) rename controllers/{ => tas}/templates/service/service-monitor-central.tmpl.yaml (100%) rename controllers/{ => tas}/templates/service/service-monitor-local.tmpl.yaml (100%) 
rename controllers/{ => tas}/templates/service/service-tls.tmpl.yaml (100%) rename controllers/{ => tas}/templates/service/virtual-service.tmpl.yaml (100%) rename controllers/{ => tas}/trustyaiservice_controller.go (94%) rename controllers/{ => tas}/virtual_service.go (96%) delete mode 100644 controllers/utils.go create mode 100644 controllers/utils/utils.go delete mode 100644 controllers/version.go create mode 100644 patch/lmes/models.patch diff --git a/.github/workflows/build-and-push.yaml b/.github/workflows/build-and-push.yaml index 98bf455d..8c9de159 100644 --- a/.github/workflows/build-and-push.yaml +++ b/.github/workflows/build-and-push.yaml @@ -56,6 +56,8 @@ jobs: echo "GITHUB.HEAD_REF: ${{ github.head_ref }}" echo "SHA: ${{ github.event.pull_request.head.sha }}" echo "MAIN IMAGE AT: ${{ vars.QUAY_RELEASE_REPO }}:latest" + echo "LMES DRIVER IMAGE AT: ${{ vars.QUAY_RELEASE_LMES_DRIVER_REPO }}:latest" + echo "LMES JOB IMAGE AT: ${{ vars.QUAY_RELEASE_LMES_JOB_REPO }}:latest" echo "CI IMAGE AT: quay.io/trustyai/trustyai-service-operator-ci:${{ github.event.pull_request.head.sha }}" # # Set environments depending on context @@ -64,27 +66,41 @@ jobs: run: | echo "TAG=${{ github.event.pull_request.head.sha }}" >> $GITHUB_ENV echo "IMAGE_NAME=quay.io/trustyai/trustyai-service-operator-ci" >> $GITHUB_ENV + echo "DRIVER_IMAGE_NAME=quay.io/trustyai/ta-lmes-driver-ci" >> $GITHUB_ENV + echo "JOB_IMAGE_NAME=quay.io/trustyai/ta-lmes-job-ci" >> $GITHUB_ENV - name: Set main-branch environment if: env.BUILD_CONTEXT == 'main' run: | echo "TAG=latest" >> $GITHUB_ENV echo "IMAGE_NAME=${{ vars.QUAY_RELEASE_REPO }}" >> $GITHUB_ENV + echo "DRIVER_IMAGE_NAME=${{ vars.QUAY_RELEASE_LMES_DRIVER_REPO }}" >> $GITHUB_ENV + echo "JOB_IMAGE_NAME=${{ vars.QUAY_RELEASE_LMES_JOB_REPO }}" >> $GITHUB_ENV - name: Set tag environment if: env.BUILD_CONTEXT == 'tag' run: | echo "TAG=${{ github.ref_name }}" >> $GITHUB_ENV echo "IMAGE_NAME=${{ vars.QUAY_RELEASE_REPO }}" >> $GITHUB_ENV + echo 
"DRIVER_IMAGE_NAME=${{ vars.QUAY_RELEASE_LMES_DRIVER_REPO }}" >> $GITHUB_ENV + echo "JOB_IMAGE_NAME=${{ vars.QUAY_RELEASE_LMES_JOB_REPO }}" >> $GITHUB_ENV - # Run docker commands + # Run docker commands - name: Put expiry date on CI-tagged image if: env.BUILD_CONTEXT == 'ci' run: sed -i 's#summary="odh-trustyai-service-operator\"#summary="odh-trustyai-service-operator" \\ \n quay.expires-after=7d#' Dockerfile - name: Log in to Quay run: docker login -u ${{ secrets.QUAY_ROBOT_USERNAME }} -p ${{ secrets.QUAY_ROBOT_SECRET }} quay.io - - name: Build image + - name: Build main image run: docker build -t ${{ env.IMAGE_NAME }}:$TAG . - - name: Push to Quay CI repo + - name: Push main image to Quay run: docker push ${{ env.IMAGE_NAME }}:$TAG + - name: Build LMES driver image + run: docker build -f Dockerfile.driver -t ${{ env.DRIVER_IMAGE_NAME }}:$TAG . + - name: Push LMES driver image to Quay + run: docker push ${{ env.DRIVER_IMAGE_NAME }}:$TAG + - name: Build LMES job image + run: docker build -f Dockerfile.lmes-job -t ${{ env.JOB_IMAGE_NAME }}:$TAG . 
+ - name: Push LMES job image to Quay + run: docker push ${{ env.JOB_IMAGE_NAME }}:$TAG # Create CI Manifests - name: Set up manifests for CI @@ -127,6 +143,10 @@ jobs: 📦 [PR image](https://quay.io/trustyai/trustyai-service-operator-ci:${{ github.event.pull_request.head.sha }}): `quay.io/trustyai/trustyai-service-operator-ci:${{ github.event.pull_request.head.sha }}` + 📦 [LMES driver image](https://quay.io/trustyai/ta-lmes-driver:${{ github.event.pull_request.head.sha }}): `quay.io/trustyai/ta-lmes-driver:${{ github.event.pull_request.head.sha }}` + + 📦 [LMES job image](https://quay.io/trustyai/ta-lmes-job:${{ github.event.pull_request.head.sha }}): `quay.io/trustyai/ta-lmes-job:${{ github.event.pull_request.head.sha }}` + 🗂️ [CI manifests](https://github.com/trustyai-explainability/trustyai-service-operator-ci/tree/operator-${{ env.TAG }}) ``` diff --git a/.github/workflows/controller-tests.yaml b/.github/workflows/controller-tests.yaml index 989953d8..ab97ee13 100644 --- a/.github/workflows/controller-tests.yaml +++ b/.github/workflows/controller-tests.yaml @@ -13,7 +13,7 @@ jobs: - name: Setup Go uses: actions/setup-go@v4 with: - go-version: '1.19.0' + go-version: '1.21.12' - name: Download & install envtest binaries run: | diff --git a/.yamllint.yaml b/.yamllint.yaml index 8b994d09..46877bee 100644 --- a/.yamllint.yaml +++ b/.yamllint.yaml @@ -6,4 +6,6 @@ rules: level: warning hyphens: max-spaces-after: 1 - level: warning \ No newline at end of file + level: warning + indentation: + indent-sequences: consistent diff --git a/Dockerfile b/Dockerfile index 3a90f972..e4dd8ae8 100644 --- a/Dockerfile +++ b/Dockerfile @@ -1,5 +1,5 @@ # Build the manager binary -FROM registry.access.redhat.com/ubi8/go-toolset:1.21 as builder +FROM registry.access.redhat.com/ubi8/go-toolset:1.21 AS builder ARG TARGETOS ARG TARGETARCH @@ -12,7 +12,7 @@ COPY go.sum go.sum RUN go mod download # Copy the go source -COPY main.go main.go +COPY cmd/ cmd/ COPY api/ api/ COPY controllers/ 
controllers/ @@ -22,7 +22,7 @@ COPY controllers/ controllers/ # the docker BUILDPLATFORM arg will be linux/arm64 when for Apple x86 it will be linux/amd64. Therefore, # by leaving it empty we can ensure that the container and binary shipped on it will have the same platform. USER root -RUN CGO_ENABLED=0 GOOS=${TARGETOS:-linux} GOARCH=${TARGETARCH} go build -a -o manager main.go +RUN CGO_ENABLED=0 GOOS=${TARGETOS:-linux} GOARCH=${TARGETARCH} go build -a -o manager cmd/main.go # Use distroless as minimal base image to package the manager binary # Refer to https://github.com/GoogleContainerTools/distroless for more details diff --git a/Dockerfile.driver b/Dockerfile.driver new file mode 100644 index 00000000..e9a51f44 --- /dev/null +++ b/Dockerfile.driver @@ -0,0 +1,25 @@ +FROM registry.access.redhat.com/ubi8/go-toolset:1.21 AS builder + +WORKDIR /go/src/github.com/trustyai-explainability/trustyai-service-operator +# Copy the Go Modules manifests +COPY go.mod go.mod +COPY go.sum go.sum +# cache deps before building and copying source so that we don't need to re-download as much +# and so that source changes don't invalidate our downloaded layer +RUN go mod download +# Copy the go source +COPY cmd/ cmd/ +COPY api/ api/ +COPY controllers/ controllers/ + +RUN GO111MODULE=on CGO_ENABLED=0 GOOS=linux go build -tags netgo -ldflags '-extldflags "-static"' -o /bin/driver ./cmd/lmes_driver/*.go + +FROM registry.access.redhat.com/ubi8/ubi-minimal:latest + +COPY --from=builder /bin/driver /bin/driver + +USER 65532:65532 + +WORKDIR /bin + +ENTRYPOINT [ "/bin/driver" ] \ No newline at end of file diff --git a/Dockerfile.lmes-job b/Dockerfile.lmes-job new file mode 100644 index 00000000..fbfe63da --- /dev/null +++ b/Dockerfile.lmes-job @@ -0,0 +1,27 @@ +FROM registry.access.redhat.com/ubi9/python-311@sha256:fccda5088dd13d2a3f2659e4c904beb42fc164a0c909e765f01af31c58affae3 + +USER root +RUN sed -i.bak 's/include-system-site-packages = false/include-system-site-packages = true/' 
/opt/app-root/pyvenv.cfg + +USER default +WORKDIR /opt/app-root/src +RUN mkdir /opt/app-root/src/hf_home && chmod g+rwx /opt/app-root/src/hf_home +RUN mkdir /opt/app-root/src/output && chmod g+rwx /opt/app-root/src/output +RUN mkdir /opt/app-root/src/my_tasks && chmod g+rwx /opt/app-root/src/my_tasks +RUN mkdir -p /opt/app-root/src/my_catalogs/cards && chmod -R g+rwx /opt/app-root/src/my_catalogs +RUN mkdir -p /opt/app-root/src/.cache +ENV PATH="/opt/app-root/bin:/opt/app-root/src/.local/bin/:/opt/app-root/src/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" + +# Clone the Git repository, check out v0.4.4 and install the Python package +RUN git clone https://github.com/opendatahub-io/lm-evaluation-harness.git && \ + cd lm-evaluation-harness && git checkout 543617fef9ba885e87f8db8930fbbff1d4e2ca49 && \ + pip install --no-cache-dir --user -e .[api] + +RUN python -c 'from lm_eval.tasks.unitxt import task; import os.path; print("class: !function " + task.__file__.replace("task.py", "task.Unitxt"))' > ./my_tasks/unitxt + +ENV PYTHONPATH=/opt/app-root/src/.local/lib/python3.11/site-packages:/opt/app-root/src/lm-evaluation-harness:/opt/app-root/src:/opt/app-root/src/server +ENV HF_HOME=/opt/app-root/src/hf_home +ENV UNITXT_ARTIFACTORIES=/opt/app-root/src/my_catalogs + +CMD ["/opt/app-root/bin/python"] + diff --git a/Makefile b/Makefile index 605395f1..57e8ae28 100644 --- a/Makefile +++ b/Makefile @@ -7,6 +7,9 @@ VERSION ?= 1.17.0 BUILD_TOOL ?= podman +# enable TrustyAIService by default for `make run` +ENABLED_SERVICES ?= TAS + # CHANNELS define the bundle channels used in the bundle. # Add a new line here if you would like to change its default config. (E.g CHANNELS = "candidate,fast,stable") # To re-generate a bundle for other specific channels without changing the standard setup, you can: @@ -111,11 +114,11 @@ test: manifests generate fmt vet envtest ## Run tests. .PHONY: build build: manifests generate fmt vet ## Build manager binary. 
- go build -o bin/manager main.go + go build -o bin/manager cmd/main.go .PHONY: run run: manifests generate fmt vet ## Run a controller from your host. - go run ./main.go + go run ./cmd/main.go --enable-services $(ENABLED_SERVICES) # If you wish built the manager image targeting other platforms you can use the --platform flag. # (i.e. docker build --platform linux/arm64 ). However, you must enable docker buildKit for it. @@ -182,7 +185,7 @@ ENVTEST ?= $(LOCALBIN)/setup-envtest ## Tool Versions KUSTOMIZE_VERSION ?= v3.8.7 -CONTROLLER_TOOLS_VERSION ?= v0.11.1 +CONTROLLER_TOOLS_VERSION ?= v0.16.3 KUSTOMIZE_INSTALL_SCRIPT ?= "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" .PHONY: kustomize diff --git a/PROJECT b/PROJECT index a4ee0204..113c2fb1 100644 --- a/PROJECT +++ b/PROJECT @@ -4,7 +4,7 @@ # More info: https://book.kubebuilder.io/reference/project-config.html domain: opendatahub.io layout: -- go.kubebuilder.io/v3 +- go.kubebuilder.io/v4 plugins: manifests.sdk.operatorframework.io/v2: {} scorecard.sdk.operatorframework.io/v2: {} @@ -18,6 +18,19 @@ resources: domain: opendatahub.io group: trustyai kind: TrustyAIService - path: github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1 + path: github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1 version: v1alpha1 +- api: + crdVersion: v1 + namespaced: true + controller: true + domain: opendatahub.io + group: trustyai + kind: LMEvalJob + path: github.com/trustyai-explainability/trustyai-service-operator/api/lmes/v1alpha1 + version: v1alpha1 + webhooks: + defaulting: true + validation: true + webhookVersion: v1 version: "3" diff --git a/api/lmes/v1alpha1/groupversion_info.go b/api/lmes/v1alpha1/groupversion_info.go new file mode 100644 index 00000000..9699b596 --- /dev/null +++ b/api/lmes/v1alpha1/groupversion_info.go @@ -0,0 +1,43 @@ +/* +Copyright 2024. 
+ +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +// Package v1alpha1 contains API Schema definitions for the trustyai.opendatahub.io v1alpha1 API group +// +kubebuilder:object:generate=true +// +groupName=trustyai.opendatahub.io +package v1alpha1 + +import ( + "k8s.io/apimachinery/pkg/runtime/schema" + "sigs.k8s.io/controller-runtime/pkg/scheme" +) + +const ( + GroupName = "trustyai.opendatahub.io" + Version = "v1alpha1" + KindName = "LMEvalJob" + FinalizerName = "trustyai.opendatahub.io/lmes-finalizer" +) + +var ( + // GroupVersion is group version used to register these objects + GroupVersion = schema.GroupVersion{Group: GroupName, Version: Version} + + // SchemeBuilder is used to add go types to the GroupVersionKind scheme + SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion} + + // AddToScheme adds the types in this group-version to the given scheme. + AddToScheme = SchemeBuilder.AddToScheme +) diff --git a/api/lmes/v1alpha1/lmevaljob_types.go b/api/lmes/v1alpha1/lmevaljob_types.go new file mode 100644 index 00000000..3ec86f88 --- /dev/null +++ b/api/lmes/v1alpha1/lmevaljob_types.go @@ -0,0 +1,296 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. 
+You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +package v1alpha1 + +import ( + "fmt" + "strings" + + corev1 "k8s.io/api/core/v1" + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" +) + +// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN! +// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized. + +// Represent a job's status +// +kubebuilder:validation:Enum=New;Scheduled;Running;Complete;Cancelled;Suspended +type JobState string + +const ( + // The job is just created + NewJobState JobState = "New" + // The job is scheduled and waiting for available resources to run it + ScheduledJobState JobState = "Scheduled" + // The job is running + RunningJobState JobState = "Running" + // The job is complete + CompleteJobState JobState = "Complete" + // The job is cancelled + CancelledJobState JobState = "Cancelled" + // The job is suspended + SuspendedJobState JobState = "Suspended" +) + +// +kubebuilder:validation:Enum=NoReason;Succeeded;Failed;Cancelled +type Reason string + +const ( + // Job is still running and no final result yet + NoReason Reason = "NoReason" + // Job finished successfully + SucceedReason Reason = "Succeeded" + // Job failed + FailedReason Reason = "Failed" + // Job is cancelled + CancelledReason Reason = "Cancelled" +) + +type Arg struct { + Name string `json:"name"` + Value string `json:"value,omitempty"` +} + +type Card struct { + // Unitxt card's ID + // +optional + Name string `json:"name,omitempty"` + // A JSON string for a custom unitxt card which contains the custom dataset. 
+ // Use the documentation here: https://www.unitxt.ai/en/latest/docs/adding_dataset.html#adding-to-the-catalog + // to compose a custom card, store it as a JSON file, and use the JSON content as the value here. + // +optional + Custom string `json:"custom,omitempty"` +} + +// Use a task recipe to form a custom task. It maps to the Unitxt Recipe +// Find details of the Unitxt Recipe here: +// https://www.unitxt.ai/en/latest/unitxt.standard.html#unitxt.standard.StandardRecipe +type TaskRecipe struct { + // The Unitxt dataset card + Card Card `json:"card"` + // The Unitxt template + Template string `json:"template"` + // The Unitxt Task + // +optional + Task *string `json:"task,omitempty"` + // Metrics + // +optional + Metrics []string `json:"metrics,omitempty"` + // The Unitxt format + // +optional + Format *string `json:"format,omitempty"` + // A limit number of records to load + // +optional + LoaderLimit *int `json:"loaderLimit,omitempty"` + // Number of fewshot + // +optional + NumDemos *int `json:"numDemos,omitempty"` + // The pool size for the fewshot + // +optional + DemosPoolSize *int `json:"demosPoolSize,omitempty"` +} + +type TaskList struct { + // TaskNames from lm-eval's task list + TaskNames []string `json:"taskNames,omitempty"` + // Task Recipes specifically for Unitxt + TaskRecipes []TaskRecipe `json:"taskRecipes,omitempty"` +} + +func (t *TaskRecipe) String() string { + var b strings.Builder + b.WriteString(fmt.Sprintf("card=%s,template=%s", t.Card.Name, t.Template)) + if t.Task != nil { + b.WriteString(fmt.Sprintf(",task=%s", *t.Task)) + } + if len(t.Metrics) > 0 { + b.WriteString(fmt.Sprintf(",metrics=[%s]", strings.Join(t.Metrics, ","))) + } + if t.Format != nil { + b.WriteString(fmt.Sprintf(",format=%s", *t.Format)) + } + if t.LoaderLimit != nil { + b.WriteString(fmt.Sprintf(",loader_limit=%d", *t.LoaderLimit)) + } + if t.NumDemos != nil { + b.WriteString(fmt.Sprintf(",num_demos=%d", *t.NumDemos)) + } + if t.DemosPoolSize != nil { + 
b.WriteString(fmt.Sprintf(",demos_pool_size=%d", *t.DemosPoolSize)) + } + return b.String() +} + +type LMEvalContainer struct { + // Define Env information for the main container + // +optional + Env []corev1.EnvVar `json:"env,omitempty"` + // Define the volume mount information + // +optional + VolumeMounts []corev1.VolumeMount `json:"volumeMounts,omitempty"` + // Compute Resources required by this container. + // More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ + // +optional + Resources *corev1.ResourceRequirements `json:"resources,omitempty"` +} + +// The following Getter-ish functions avoid nil pointer panic +func (c *LMEvalContainer) GetEnv() []corev1.EnvVar { + if c == nil { + return nil + } + return c.Env +} + +func (c *LMEvalContainer) GetVolumMounts() []corev1.VolumeMount { + if c == nil { + return nil + } + return c.VolumeMounts +} + +func (c *LMEvalContainer) GetResources() *corev1.ResourceRequirements { + if c == nil { + return nil + } + return c.Resources +} + +type LMEvalPodSpec struct { + // Extra container data for the lm-eval container + // +optional + Container *LMEvalContainer `json:"container,omitempty"` + // Specify the volumes information for the lm-eval and sidecar containers + // +optional + Volumes []corev1.Volume `json:"volumes,omitempty"` + // Specify extra containers for the lm-eval job + // FIXME: aggregate the sidecar containers into the pod + // +optional + SideCars []corev1.Container `json:"sideCars,omitempty"` +} + +// The following Getter-ish functions avoid nil pointer panic +func (p *LMEvalPodSpec) GetContainer() *LMEvalContainer { + if p == nil { + return nil + } + return p.Container +} + +func (p *LMEvalPodSpec) GetVolumes() []corev1.Volume { + if p == nil { + return nil + } + return p.Volumes +} + +func (p *LMEvalPodSpec) GetSideCards() []corev1.Container { + if p == nil { + return nil + } + return p.SideCars +} + +// LMEvalJobSpec defines the desired state of LMEvalJob +type 
LMEvalJobSpec struct { + // INSERT ADDITIONAL SPEC FIELDS - desired state of cluster + // Important: Run "make" to regenerate code after modifying this file + + // Model name + Model string `json:"model"` + // Args for the model + // +optional + ModelArgs []Arg `json:"modelArgs,omitempty"` + // Evaluation task list + TaskList TaskList `json:"taskList"` + // Sets the number of few-shot examples to place in context + // +optional + NumFewShot *int `json:"numFewShot,omitempty"` + // Accepts an integer, or a float between 0.0 and 1.0 . If passed, will limit + // the number of documents to evaluate to the first X documents (if an integer) + // per task or first X% of documents per task + // +optional + Limit string `json:"limit,omitempty"` + // Map to `--gen_kwargs` parameter for the underlying library. + // +optional + GenArgs []Arg `json:"genArgs,omitempty"` + // If this flag is passed, then the model's outputs, and the text fed into the + // model, will be saved at per-document granularity + // +optional + LogSamples *bool `json:"logSamples,omitempty"` + // Batch size for the evaluation. This is used by the models that run and are loaded + // locally and not apply for the commercial APIs. + BatchSize *int `json:"batchSize,omitempty"` + // Specify extra information for the lm-eval job's pod + // +optional + Pod *LMEvalPodSpec `json:"pod,omitempty"` + // Suspend keeps the job but without pods. 
This is intended to be used by the Kueue integration + // +optional + Suspend bool `json:"suspend,omitempty"` +} + +// LMEvalJobStatus defines the observed state of LMEvalJob +type LMEvalJobStatus struct { + // Important: Run "make" to regenerate code after modifying this file + + // The name of the Pod that runs the evaluation job + // +optional + PodName string `json:"podName,omitempty"` + // State of the job + // +optional + State JobState `json:"state,omitempty"` + // Final result of the job + // +optional + Reason Reason `json:"reason,omitempty"` + // Message about the current/final status + // +optional + Message string `json:"message,omitempty"` + // Information when was the last time the job was successfully scheduled. + // +optional + LastScheduleTime *metav1.Time `json:"lastScheduleTime,omitempty"` + // Information when the job's state changes to Complete. + // +optional + CompleteTime *metav1.Time `json:"completeTime,omitempty"` + // Evaluation results + // +optional + Results string `json:"results,omitempty"` +} + +// +kubebuilder:object:root=true +// +kubebuilder:subresource:status +// +kubebuilder:printcolumn:name="State",type=string,JSONPath=`.status.state` +// LMEvalJob is the Schema for the lmevaljobs API +type LMEvalJob struct { + metav1.TypeMeta `json:",inline"` + metav1.ObjectMeta `json:"metadata,omitempty"` + + Spec LMEvalJobSpec `json:"spec,omitempty"` + Status LMEvalJobStatus `json:"status,omitempty"` +} + +// +kubebuilder:object:root=true + +// LMEvalJobList contains a list of LMEvalJob +type LMEvalJobList struct { + metav1.TypeMeta `json:",inline"` + metav1.ListMeta `json:"metadata,omitempty"` + Items []LMEvalJob `json:"items"` +} + +func init() { + SchemeBuilder.Register(&LMEvalJob{}, &LMEvalJobList{}) +} diff --git a/api/lmes/v1alpha1/zz_generated.deepcopy.go b/api/lmes/v1alpha1/zz_generated.deepcopy.go new file mode 100644 index 00000000..802ebab4 --- /dev/null +++ b/api/lmes/v1alpha1/zz_generated.deepcopy.go @@ -0,0 +1,325 @@ 
+//go:build !ignore_autogenerated + +/* +Copyright 2023. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +// Code generated by controller-gen. DO NOT EDIT. + +package v1alpha1 + +import ( + "k8s.io/api/core/v1" + runtime "k8s.io/apimachinery/pkg/runtime" +) + +// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. +func (in *Arg) DeepCopyInto(out *Arg) { + *out = *in +} + +// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Arg. +func (in *Arg) DeepCopy() *Arg { + if in == nil { + return nil + } + out := new(Arg) + in.DeepCopyInto(out) + return out +} + +// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. +func (in *Card) DeepCopyInto(out *Card) { + *out = *in +} + +// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Card. +func (in *Card) DeepCopy() *Card { + if in == nil { + return nil + } + out := new(Card) + in.DeepCopyInto(out) + return out +} + +// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. 
+func (in *LMEvalContainer) DeepCopyInto(out *LMEvalContainer) { + *out = *in + if in.Env != nil { + in, out := &in.Env, &out.Env + *out = make([]v1.EnvVar, len(*in)) + for i := range *in { + (*in)[i].DeepCopyInto(&(*out)[i]) + } + } + if in.VolumeMounts != nil { + in, out := &in.VolumeMounts, &out.VolumeMounts + *out = make([]v1.VolumeMount, len(*in)) + for i := range *in { + (*in)[i].DeepCopyInto(&(*out)[i]) + } + } + if in.Resources != nil { + in, out := &in.Resources, &out.Resources + *out = new(v1.ResourceRequirements) + (*in).DeepCopyInto(*out) + } +} + +// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new LMEvalContainer. +func (in *LMEvalContainer) DeepCopy() *LMEvalContainer { + if in == nil { + return nil + } + out := new(LMEvalContainer) + in.DeepCopyInto(out) + return out +} + +// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. +func (in *LMEvalJob) DeepCopyInto(out *LMEvalJob) { + *out = *in + out.TypeMeta = in.TypeMeta + in.ObjectMeta.DeepCopyInto(&out.ObjectMeta) + in.Spec.DeepCopyInto(&out.Spec) + in.Status.DeepCopyInto(&out.Status) +} + +// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new LMEvalJob. +func (in *LMEvalJob) DeepCopy() *LMEvalJob { + if in == nil { + return nil + } + out := new(LMEvalJob) + in.DeepCopyInto(out) + return out +} + +// DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object. +func (in *LMEvalJob) DeepCopyObject() runtime.Object { + if c := in.DeepCopy(); c != nil { + return c + } + return nil +} + +// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. 
+func (in *LMEvalJobList) DeepCopyInto(out *LMEvalJobList) { + *out = *in + out.TypeMeta = in.TypeMeta + in.ListMeta.DeepCopyInto(&out.ListMeta) + if in.Items != nil { + in, out := &in.Items, &out.Items + *out = make([]LMEvalJob, len(*in)) + for i := range *in { + (*in)[i].DeepCopyInto(&(*out)[i]) + } + } +} + +// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new LMEvalJobList. +func (in *LMEvalJobList) DeepCopy() *LMEvalJobList { + if in == nil { + return nil + } + out := new(LMEvalJobList) + in.DeepCopyInto(out) + return out +} + +// DeepCopyObject is an autogenerated deepcopy function, copying the receiver, creating a new runtime.Object. +func (in *LMEvalJobList) DeepCopyObject() runtime.Object { + if c := in.DeepCopy(); c != nil { + return c + } + return nil +} + +// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. +func (in *LMEvalJobSpec) DeepCopyInto(out *LMEvalJobSpec) { + *out = *in + if in.ModelArgs != nil { + in, out := &in.ModelArgs, &out.ModelArgs + *out = make([]Arg, len(*in)) + copy(*out, *in) + } + in.TaskList.DeepCopyInto(&out.TaskList) + if in.NumFewShot != nil { + in, out := &in.NumFewShot, &out.NumFewShot + *out = new(int) + **out = **in + } + if in.GenArgs != nil { + in, out := &in.GenArgs, &out.GenArgs + *out = make([]Arg, len(*in)) + copy(*out, *in) + } + if in.LogSamples != nil { + in, out := &in.LogSamples, &out.LogSamples + *out = new(bool) + **out = **in + } + if in.BatchSize != nil { + in, out := &in.BatchSize, &out.BatchSize + *out = new(int) + **out = **in + } + if in.Pod != nil { + in, out := &in.Pod, &out.Pod + *out = new(LMEvalPodSpec) + (*in).DeepCopyInto(*out) + } +} + +// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new LMEvalJobSpec. 
+func (in *LMEvalJobSpec) DeepCopy() *LMEvalJobSpec { + if in == nil { + return nil + } + out := new(LMEvalJobSpec) + in.DeepCopyInto(out) + return out +} + +// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. +func (in *LMEvalJobStatus) DeepCopyInto(out *LMEvalJobStatus) { + *out = *in + if in.LastScheduleTime != nil { + in, out := &in.LastScheduleTime, &out.LastScheduleTime + *out = (*in).DeepCopy() + } + if in.CompleteTime != nil { + in, out := &in.CompleteTime, &out.CompleteTime + *out = (*in).DeepCopy() + } +} + +// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new LMEvalJobStatus. +func (in *LMEvalJobStatus) DeepCopy() *LMEvalJobStatus { + if in == nil { + return nil + } + out := new(LMEvalJobStatus) + in.DeepCopyInto(out) + return out +} + +// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. +func (in *LMEvalPodSpec) DeepCopyInto(out *LMEvalPodSpec) { + *out = *in + if in.Container != nil { + in, out := &in.Container, &out.Container + *out = new(LMEvalContainer) + (*in).DeepCopyInto(*out) + } + if in.Volumes != nil { + in, out := &in.Volumes, &out.Volumes + *out = make([]v1.Volume, len(*in)) + for i := range *in { + (*in)[i].DeepCopyInto(&(*out)[i]) + } + } + if in.SideCars != nil { + in, out := &in.SideCars, &out.SideCars + *out = make([]v1.Container, len(*in)) + for i := range *in { + (*in)[i].DeepCopyInto(&(*out)[i]) + } + } +} + +// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new LMEvalPodSpec. +func (in *LMEvalPodSpec) DeepCopy() *LMEvalPodSpec { + if in == nil { + return nil + } + out := new(LMEvalPodSpec) + in.DeepCopyInto(out) + return out +} + +// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. 
+func (in *TaskList) DeepCopyInto(out *TaskList) { + *out = *in + if in.TaskNames != nil { + in, out := &in.TaskNames, &out.TaskNames + *out = make([]string, len(*in)) + copy(*out, *in) + } + if in.TaskRecipes != nil { + in, out := &in.TaskRecipes, &out.TaskRecipes + *out = make([]TaskRecipe, len(*in)) + for i := range *in { + (*in)[i].DeepCopyInto(&(*out)[i]) + } + } +} + +// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new TaskList. +func (in *TaskList) DeepCopy() *TaskList { + if in == nil { + return nil + } + out := new(TaskList) + in.DeepCopyInto(out) + return out +} + +// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil. +func (in *TaskRecipe) DeepCopyInto(out *TaskRecipe) { + *out = *in + out.Card = in.Card + if in.Task != nil { + in, out := &in.Task, &out.Task + *out = new(string) + **out = **in + } + if in.Metrics != nil { + in, out := &in.Metrics, &out.Metrics + *out = make([]string, len(*in)) + copy(*out, *in) + } + if in.Format != nil { + in, out := &in.Format, &out.Format + *out = new(string) + **out = **in + } + if in.LoaderLimit != nil { + in, out := &in.LoaderLimit, &out.LoaderLimit + *out = new(int) + **out = **in + } + if in.NumDemos != nil { + in, out := &in.NumDemos, &out.NumDemos + *out = new(int) + **out = **in + } + if in.DemosPoolSize != nil { + in, out := &in.DemosPoolSize, &out.DemosPoolSize + *out = new(int) + **out = **in + } +} + +// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new TaskRecipe. 
+func (in *TaskRecipe) DeepCopy() *TaskRecipe { + if in == nil { + return nil + } + out := new(TaskRecipe) + in.DeepCopyInto(out) + return out +} diff --git a/api/v1alpha1/groupversion_info.go b/api/tas/v1alpha1/groupversion_info.go similarity index 100% rename from api/v1alpha1/groupversion_info.go rename to api/tas/v1alpha1/groupversion_info.go diff --git a/api/v1alpha1/trustyaiservice_types.go b/api/tas/v1alpha1/trustyaiservice_types.go similarity index 100% rename from api/v1alpha1/trustyaiservice_types.go rename to api/tas/v1alpha1/trustyaiservice_types.go diff --git a/api/v1alpha1/zz_generated.deepcopy.go b/api/tas/v1alpha1/zz_generated.deepcopy.go similarity index 99% rename from api/v1alpha1/zz_generated.deepcopy.go rename to api/tas/v1alpha1/zz_generated.deepcopy.go index 1ddafeca..2c7b65d0 100644 --- a/api/v1alpha1/zz_generated.deepcopy.go +++ b/api/tas/v1alpha1/zz_generated.deepcopy.go @@ -1,5 +1,4 @@ //go:build !ignore_autogenerated -// +build !ignore_autogenerated /* Copyright 2023. diff --git a/cmd/lmes_driver/main.go b/cmd/lmes_driver/main.go new file mode 100644 index 00000000..537310fb --- /dev/null +++ b/cmd/lmes_driver/main.go @@ -0,0 +1,214 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+*/ + +package main + +import ( + "context" + "encoding/json" + "flag" + "fmt" + "io" + "os" + "strings" + + ctrl "sigs.k8s.io/controller-runtime" + "sigs.k8s.io/controller-runtime/pkg/log" + "sigs.k8s.io/controller-runtime/pkg/log/zap" + + "github.com/spf13/viper" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/lmes/driver" +) + +const ( + OutputPath = "/opt/app-root/src/output" +) + +type strArrayArg []string + +func (t *strArrayArg) Set(value string) error { + *t = append(*t, value) + return nil +} + +func (t *strArrayArg) String() string { + // ":" is assumed to be a safe separator for task recipes + return strings.Join(*t, ":") +} + +var ( + taskRecipes strArrayArg + customCards strArrayArg + copy = flag.String("copy", "", "copy this binary to specified destination path") + getStatus = flag.Bool("get-status", false, "Get current status") + shutdown = flag.Bool("shutdown", false, "Shutdown the driver") + outputPath = flag.String("output-path", OutputPath, "output path") + detectDevice = flag.Bool("detect-device", false, "detect available device(s), CUDA or CPU") + driverLog = ctrl.Log.WithName("driver") +) + +func init() { + flag.Var(&taskRecipes, "task-recipe", "task recipe") + flag.Var(&customCards, "custom-card", "a JSON string that represents a custom card") +} + +func main() { + opts := zap.Options{ + Development: true, + } + opts.BindFlags(flag.CommandLine) + + flag.Parse() + viper.AutomaticEnv() + + log.SetLogger(zap.New(zap.UseFlagOptions(&opts))) + ctx := context.Background() + args := flag.Args() + + if *copy != "" { + // copy exec to destination + if err := copyExec(*copy); err != nil { + driverLog.Error(err, "failed to copy binary") + os.Exit(1) + return + } + os.Exit(0) + return + } + + if *getStatus { + getStatusOrDie(ctx) + return + } + + if *shutdown { + shutdownOrDie(ctx) + return + } + + if len(args) == 0 { + driverLog.Error(fmt.Errorf("no user program"), "empty args") + os.Exit(1) + } + + driverOpt := 
driver.DriverOption{ + Context: ctx, + OutputPath: *outputPath, + DetectDevice: *detectDevice, + Logger: driverLog, + TaskRecipes: taskRecipes, + CustomCards: customCards, + Args: args, + } + + driver, err := driver.NewDriver(&driverOpt) + if err != nil { + driverLog.Error(err, "Driver.NewDriver failed") + os.Exit(1) + } + + var exitCode = 0 + if err := driver.Run(); err != nil { + driverLog.Error(err, "Driver.Run failed") + exitCode = 1 + } + os.Exit(exitCode) +} + +func copyExec(destination string) (err error) { + defer func() { + if err != nil { + err = fmt.Errorf("copy this binary to %s: %w", destination, err) + } + }() + + path, err := findThisBinary() + if err != nil { + return err + } + src, err := os.Open(path) + if err != nil { + return err + } + defer src.Close() + dst, err := os.OpenFile(destination, os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0o555) // 0o555 -> readable and executable by all + if err != nil { + return err + } + defer dst.Close() + if _, err = io.Copy(dst, src); err != nil { + return err + } + return dst.Close() +} + +func findThisBinary() (string, error) { + bin, err := os.Executable() + if err != nil { + return "", fmt.Errorf("failed to find executable: %w", err) + } + return bin, nil +} + +func getStatusOrDie(ctx context.Context) { + driver, err := driver.NewDriver(&driver.DriverOption{ + Context: ctx, + OutputPath: *outputPath, + DetectDevice: *detectDevice, + Logger: driverLog, + }) + + if err != nil { + driverLog.Error(err, "failed to initialize the driver") + os.Exit(1) + } + + status, err := driver.GetStatus() + if err != nil { + driverLog.Error(err, "failed to get status", "error", err.Error()) + os.Exit(1) + } + + b, err := json.Marshal(status) + if err != nil { + driverLog.Error(err, "json serialization failed", "error", err.Error()) + os.Exit(1) + } + + fmt.Print(string(b)) + os.Exit(0) +} + +func shutdownOrDie(ctx context.Context) { + driver, err := driver.NewDriver(&driver.DriverOption{ + Context: ctx, + OutputPath: *outputPath, +
DetectDevice: *detectDevice, + Logger: driverLog, + }) + + if err != nil { + driverLog.Error(err, "failed to initialize the driver") + os.Exit(1) + } + + err = driver.Shutdown() + if err != nil { + driverLog.Error(err, "failed to shutdown", "error", err.Error()) + os.Exit(1) + } + os.Exit(0) +} diff --git a/cmd/lmes_driver/main_test.go b/cmd/lmes_driver/main_test.go new file mode 100644 index 00000000..070ec88c --- /dev/null +++ b/cmd/lmes_driver/main_test.go @@ -0,0 +1,73 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+*/ + +package main + +import ( + "context" + "flag" + "os" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/lmes/driver" + "sigs.k8s.io/controller-runtime/pkg/log/zap" +) + +func Test_ArgParsing(t *testing.T) { + os.Args = []string{ + "/opt/app-root/src/bin/driver", + "--output-path", "/opt/app-root/src/output", + "--detect-device", + "--task-recipe", "card=unitxt.card1,template=unitxt.template,metrics=[unitxt.metric1,unitxt.metric2],format=unitxt.format,num_demos=5,demos_pool_size=10", + "--task-recipe", "card=unitxt.card2,template=unitxt.template2,metrics=[unitxt.metric3,unitxt.metric4],format=unitxt.format,num_demos=5,demos_pool_size=10", + "--", + "sh", "-c", "python", + } + + opts := zap.Options{ + Development: true, + } + opts.BindFlags(flag.CommandLine) + + flag.Parse() + + args := flag.Args() + + assert.Equal(t, true, *detectDevice) + assert.Equal(t, strArrayArg{ + "card=unitxt.card1,template=unitxt.template,metrics=[unitxt.metric1,unitxt.metric2],format=unitxt.format,num_demos=5,demos_pool_size=10", + "card=unitxt.card2,template=unitxt.template2,metrics=[unitxt.metric3,unitxt.metric4],format=unitxt.format,num_demos=5,demos_pool_size=10", + }, taskRecipes) + + dOption := driver.DriverOption{ + Context: context.Background(), + OutputPath: *outputPath, + DetectDevice: *detectDevice, + Logger: driverLog, + TaskRecipes: taskRecipes, + Args: args, + } + + assert.Equal(t, []string{ + "card=unitxt.card1,template=unitxt.template,metrics=[unitxt.metric1,unitxt.metric2],format=unitxt.format,num_demos=5,demos_pool_size=10", + "card=unitxt.card2,template=unitxt.template2,metrics=[unitxt.metric3,unitxt.metric4],format=unitxt.format,num_demos=5,demos_pool_size=10", + }, dOption.TaskRecipes) + + assert.Equal(t, []string{ + "sh", "-c", "python", + }, dOption.Args) +} diff --git a/main.go b/cmd/main.go similarity index 79% rename from main.go rename to cmd/main.go index ce5ee19e..dcb17fef 
100644 --- a/main.go +++ b/cmd/main.go @@ -18,6 +18,7 @@ package main import ( "flag" + "fmt" "os" kservev1alpha1 "github.com/kserve/kserve/pkg/apis/serving/v1alpha1" @@ -37,8 +38,11 @@ import ( "sigs.k8s.io/controller-runtime/pkg/healthz" "sigs.k8s.io/controller-runtime/pkg/log/zap" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + lmesv1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/lmes/v1alpha1" + tasv1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" "github.com/trustyai-explainability/trustyai-service-operator/controllers" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/constants" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/utils" //+kubebuilder:scaffold:imports ) @@ -49,7 +53,8 @@ var ( func init() { utilruntime.Must(clientgoscheme.AddToScheme(scheme)) - utilruntime.Must(trustyaiopendatahubiov1alpha1.AddToScheme(scheme)) + utilruntime.Must(tasv1alpha1.AddToScheme(scheme)) + utilruntime.Must(lmesv1alpha1.AddToScheme(scheme)) utilruntime.Must(monitoringv1.AddToScheme(scheme)) utilruntime.Must(kservev1alpha1.AddToScheme(scheme)) utilruntime.Must(kservev1beta1.AddToScheme(scheme)) @@ -62,11 +67,15 @@ func main() { var metricsAddr string var enableLeaderElection bool var probeAddr string + var configMap string + var enabledServices controllers.EnabledServices flag.StringVar(&metricsAddr, "metrics-bind-address", ":8080", "The address the metric endpoint binds to.") flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081", "The address the probe endpoint binds to.") flag.BoolVar(&enableLeaderElection, "leader-elect", false, "Enable leader election for controller manager. 
"+ "Enabling this will ensure there is only one active controller manager.") + flag.Var(&enabledServices, "enable-services", "Specify a list of services to enable and use ',' as the separator") + flag.StringVar(&configMap, "configmap", constants.ConfigMap, "The configmap that stores settings for the operator") opts := zap.Options{ Development: true, } @@ -75,6 +84,11 @@ func main() { ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts))) + if enabledServices.Empty() { + setupLog.Error(fmt.Errorf("no service is specified"), "please specify at least one service") + os.Exit(1) + } + mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{ Scheme: scheme, MetricsBindAddress: metricsAddr, @@ -102,18 +116,13 @@ func main() { recorder := mgr.GetEventRecorderFor("trustyai-service-operator") - ns, err := controllers.GetNamespace() + ns, err := utils.GetNamespace() if err != nil { setupLog.Error(err, "unable to operator's namespace") } - if err = (&controllers.TrustyAIServiceReconciler{ - Client: mgr.GetClient(), - Scheme: mgr.GetScheme(), - Namespace: ns, - EventRecorder: recorder, - }).SetupWithManager(mgr); err != nil { - setupLog.Error(err, "unable to create controller", "controller", "TrustyAIService") + if err = controllers.SetupControllers(enabledServices, mgr, ns, configMap, recorder); err != nil { + setupLog.Error(err, "unable to initialize controller(s)") os.Exit(1) } //+kubebuilder:scaffold:builder diff --git a/config/base/kustomization.yaml b/config/base/kustomization.yaml index 0247b9bb..f50d41df 100644 --- a/config/base/kustomization.yaml +++ b/config/base/kustomization.yaml @@ -47,4 +47,4 @@ vars: name: config apiVersion: v1 fieldref: - fieldpath: data.kServeServerless \ No newline at end of file + fieldpath: data.kServeServerless diff --git a/config/base/params.env b/config/base/params.env index 4a9c77e0..e0de8034 100644 --- a/config/base/params.env +++ b/config/base/params.env @@ -2,3 +2,11 @@ trustyaiServiceImage=quay.io/trustyai/trustyai-service:latest 
trustyaiOperatorImage=quay.io/trustyai/trustyai-service-operator:latest oauthProxyImage=quay.io/openshift/origin-oauth-proxy:4.14.0 kServeServerless=enabled +lmes-driver-image=quay.io/trustyai/ta-lmes-driver:latest +lmes-pod-image=quay.io/trustyai/ta-lmes-job:latest +lmes-pod-checking-interval=10s +lmes-image-pull-policy=Always +lmes-max-batch-size=24 +lmes-default-batch-size=8 +lmes-detect-device=true + diff --git a/config/crd/bases/trustyai.opendatahub.io_lmevaljobs.yaml b/config/crd/bases/trustyai.opendatahub.io_lmevaljobs.yaml new file mode 100644 index 00000000..db26ae25 --- /dev/null +++ b/config/crd/bases/trustyai.opendatahub.io_lmevaljobs.yaml @@ -0,0 +1,3273 @@ +--- +apiVersion: apiextensions.k8s.io/v1 +kind: CustomResourceDefinition +metadata: + annotations: + controller-gen.kubebuilder.io/version: v0.16.3 + name: lmevaljobs.trustyai.opendatahub.io +spec: + group: trustyai.opendatahub.io + names: + kind: LMEvalJob + listKind: LMEvalJobList + plural: lmevaljobs + singular: lmevaljob + scope: Namespaced + versions: + - additionalPrinterColumns: + - jsonPath: .status.state + name: State + type: string + name: v1alpha1 + schema: + openAPIV3Schema: + description: LMEvalJob is the Schema for the lmevaljobs API + properties: + apiVersion: + description: |- + APIVersion defines the versioned schema of this representation of an object. + Servers should convert recognized schemas to the latest internal value, and + may reject unrecognized values. + More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources + type: string + kind: + description: |- + Kind is a string value representing the REST resource this object represents. + Servers may infer this from the endpoint the client submits requests to. + Cannot be updated. + In CamelCase. 
+ More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds + type: string + metadata: + type: object + spec: + description: LMEvalJobSpec defines the desired state of LMEvalJob + properties: + batchSize: + description: |- + Batch size for the evaluation. This is used by the models that run and are loaded + locally and not apply for the commercial APIs. + type: integer + genArgs: + description: Map to `--gen_kwargs` parameter for the underlying library. + items: + properties: + name: + type: string + value: + type: string + required: + - name + type: object + type: array + limit: + description: |- + Accepts an integer, or a float between 0.0 and 1.0 . If passed, will limit + the number of documents to evaluate to the first X documents (if an integer) + per task or first X% of documents per task + type: string + logSamples: + description: |- + If this flag is passed, then the model's outputs, and the text fed into the + model, will be saved at per-document granularity + type: boolean + model: + description: Model name + type: string + modelArgs: + description: Args for the model + items: + properties: + name: + type: string + value: + type: string + required: + - name + type: object + type: array + numFewShot: + description: Sets the number of few-shot examples to place in context + type: integer + pod: + description: Specify extra information for the lm-eval job's pod + properties: + container: + description: Extra container data for the lm-eval container + properties: + env: + description: Define Env information for the main container + items: + description: EnvVar represents an environment variable present + in a Container. + properties: + name: + description: Name of the environment variable. Must + be a C_IDENTIFIER. + type: string + value: + description: |- + Variable references $(VAR_NAME) are expanded + using the previously defined environment variables in the container and + any service environment variables. 
If a variable cannot be resolved, + the reference in the input string will be unchanged. Double $$ are reduced + to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. + "$$(VAR_NAME)" will produce the string literal "$(VAR_NAME)". + Escaped references will never be expanded, regardless of whether the variable + exists or not. + Defaults to "". + type: string + valueFrom: + description: Source for the environment variable's value. + Cannot be used if value is not empty. + properties: + configMapKeyRef: + description: Selects a key of a ConfigMap. + properties: + key: + description: The key to select. + type: string + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + optional: + description: Specify whether the ConfigMap or + its key must be defined + type: boolean + required: + - key + type: object + x-kubernetes-map-type: atomic + fieldRef: + description: |- + Selects a field of the pod: supports metadata.name, metadata.namespace, `metadata.labels['']`, `metadata.annotations['']`, + spec.nodeName, spec.serviceAccountName, status.hostIP, status.podIP, status.podIPs. + properties: + apiVersion: + description: Version of the schema the FieldPath + is written in terms of, defaults to "v1". + type: string + fieldPath: + description: Path of the field to select in + the specified API version. + type: string + required: + - fieldPath + type: object + x-kubernetes-map-type: atomic + resourceFieldRef: + description: |- + Selects a resource of the container: only resources limits and requests + (limits.cpu, limits.memory, limits.ephemeral-storage, requests.cpu, requests.memory and requests.ephemeral-storage) are currently supported. 
+ properties: + containerName: + description: 'Container name: required for volumes, + optional for env vars' + type: string + divisor: + anyOf: + - type: integer + - type: string + description: Specifies the output format of + the exposed resources, defaults to "1" + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + resource: + description: 'Required: resource to select' + type: string + required: + - resource + type: object + x-kubernetes-map-type: atomic + secretKeyRef: + description: Selects a key of a secret in the pod's + namespace + properties: + key: + description: The key of the secret to select + from. Must be a valid secret key. + type: string + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + optional: + description: Specify whether the Secret or its + key must be defined + type: boolean + required: + - key + type: object + x-kubernetes-map-type: atomic + type: object + required: + - name + type: object + type: array + resources: + description: |- + Compute Resources required by this container. + More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ + properties: + claims: + description: |- + Claims lists the names of resources, defined in spec.resourceClaims, + that are used by this container. + + This is an alpha field and requires enabling the + DynamicResourceAllocation feature gate. + + This field is immutable. It can only be set for containers. + items: + description: ResourceClaim references one entry in PodSpec.ResourceClaims. + properties: + name: + description: |- + Name must match the name of one entry in pod.spec.resourceClaims of + the Pod where this field is used. It makes that resource available + inside a container. 
+ type: string + required: + - name + type: object + type: array + x-kubernetes-list-map-keys: + - name + x-kubernetes-list-type: map + limits: + additionalProperties: + anyOf: + - type: integer + - type: string + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + description: |- + Limits describes the maximum amount of compute resources allowed. + More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ + type: object + requests: + additionalProperties: + anyOf: + - type: integer + - type: string + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + description: |- + Requests describes the minimum amount of compute resources required. + If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, + otherwise to an implementation-defined value. + More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ + type: object + type: object + volumeMounts: + description: Define the volume mount information + items: + description: VolumeMount describes a mounting of a Volume + within a container. + properties: + mountPath: + description: |- + Path within the container at which the volume should be mounted. Must + not contain ':'. + type: string + mountPropagation: + description: |- + mountPropagation determines how mounts are propagated from the host + to container and the other way around. + When not set, MountPropagationNone is used. + This field is beta in 1.10. + type: string + name: + description: This must match the Name of a Volume. + type: string + readOnly: + description: |- + Mounted read-only if true, read-write otherwise (false or unspecified). + Defaults to false. 
+ type: boolean + subPath: + description: |- + Path within the volume from which the container's volume should be mounted. + Defaults to "" (volume's root). + type: string + subPathExpr: + description: |- + Expanded path within the volume from which the container's volume should be mounted. + Behaves similarly to SubPath but environment variable references $(VAR_NAME) are expanded using the container's environment. + Defaults to "" (volume's root). + SubPathExpr and SubPath are mutually exclusive. + type: string + required: + - mountPath + - name + type: object + type: array + type: object + sideCars: + description: |- + Specify extra containers for the lm-eval job + FIXME: aggregate the sidecar containers into the pod + items: + description: A single application container that you want to + run within a pod. + properties: + args: + description: |- + Arguments to the entrypoint. + The container image's CMD is used if this is not provided. + Variable references $(VAR_NAME) are expanded using the container's environment. If a variable + cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced + to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. "$$(VAR_NAME)" will + produce the string literal "$(VAR_NAME)". Escaped references will never be expanded, regardless + of whether the variable exists or not. Cannot be updated. + More info: https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shell + items: + type: string + type: array + command: + description: |- + Entrypoint array. Not executed within a shell. + The container image's ENTRYPOINT is used if this is not provided. + Variable references $(VAR_NAME) are expanded using the container's environment. If a variable + cannot be resolved, the reference in the input string will be unchanged. Double $$ are reduced + to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. 
"$$(VAR_NAME)" will + produce the string literal "$(VAR_NAME)". Escaped references will never be expanded, regardless + of whether the variable exists or not. Cannot be updated. + More info: https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#running-a-command-in-a-shell + items: + type: string + type: array + env: + description: |- + List of environment variables to set in the container. + Cannot be updated. + items: + description: EnvVar represents an environment variable + present in a Container. + properties: + name: + description: Name of the environment variable. Must + be a C_IDENTIFIER. + type: string + value: + description: |- + Variable references $(VAR_NAME) are expanded + using the previously defined environment variables in the container and + any service environment variables. If a variable cannot be resolved, + the reference in the input string will be unchanged. Double $$ are reduced + to a single $, which allows for escaping the $(VAR_NAME) syntax: i.e. + "$$(VAR_NAME)" will produce the string literal "$(VAR_NAME)". + Escaped references will never be expanded, regardless of whether the variable + exists or not. + Defaults to "". + type: string + valueFrom: + description: Source for the environment variable's + value. Cannot be used if value is not empty. + properties: + configMapKeyRef: + description: Selects a key of a ConfigMap. + properties: + key: + description: The key to select. + type: string + name: + description: |- + Name of the referent. 
+ More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + optional: + description: Specify whether the ConfigMap + or its key must be defined + type: boolean + required: + - key + type: object + x-kubernetes-map-type: atomic + fieldRef: + description: |- + Selects a field of the pod: supports metadata.name, metadata.namespace, `metadata.labels['']`, `metadata.annotations['']`, + spec.nodeName, spec.serviceAccountName, status.hostIP, status.podIP, status.podIPs. + properties: + apiVersion: + description: Version of the schema the FieldPath + is written in terms of, defaults to "v1". + type: string + fieldPath: + description: Path of the field to select in + the specified API version. + type: string + required: + - fieldPath + type: object + x-kubernetes-map-type: atomic + resourceFieldRef: + description: |- + Selects a resource of the container: only resources limits and requests + (limits.cpu, limits.memory, limits.ephemeral-storage, requests.cpu, requests.memory and requests.ephemeral-storage) are currently supported. + properties: + containerName: + description: 'Container name: required for + volumes, optional for env vars' + type: string + divisor: + anyOf: + - type: integer + - type: string + description: Specifies the output format of + the exposed resources, defaults to "1" + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + resource: + description: 'Required: resource to select' + type: string + required: + - resource + type: object + x-kubernetes-map-type: atomic + secretKeyRef: + description: Selects a key of a secret in the + pod's namespace + properties: + key: + description: The key of the secret to select + from. Must be a valid secret key. + type: string + name: + description: |- + Name of the referent. 
+ More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + optional: + description: Specify whether the Secret or + its key must be defined + type: boolean + required: + - key + type: object + x-kubernetes-map-type: atomic + type: object + required: + - name + type: object + type: array + envFrom: + description: |- + List of sources to populate environment variables in the container. + The keys defined within a source must be a C_IDENTIFIER. All invalid keys + will be reported as an event when the container is starting. When a key exists in multiple + sources, the value associated with the last source will take precedence. + Values defined by an Env with a duplicate key will take precedence. + Cannot be updated. + items: + description: EnvFromSource represents the source of a + set of ConfigMaps + properties: + configMapRef: + description: The ConfigMap to select from + properties: + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + optional: + description: Specify whether the ConfigMap must + be defined + type: boolean + type: object + x-kubernetes-map-type: atomic + prefix: + description: An optional identifier to prepend to + each key in the ConfigMap. Must be a C_IDENTIFIER. + type: string + secretRef: + description: The Secret to select from + properties: + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + optional: + description: Specify whether the Secret must be + defined + type: boolean + type: object + x-kubernetes-map-type: atomic + type: object + type: array + image: + description: |- + Container image name. 
+ More info: https://kubernetes.io/docs/concepts/containers/images + This field is optional to allow higher level config management to default or override + container images in workload controllers like Deployments and StatefulSets. + type: string + imagePullPolicy: + description: |- + Image pull policy. + One of Always, Never, IfNotPresent. + Defaults to Always if :latest tag is specified, or IfNotPresent otherwise. + Cannot be updated. + More info: https://kubernetes.io/docs/concepts/containers/images#updating-images + type: string + lifecycle: + description: |- + Actions that the management system should take in response to container lifecycle events. + Cannot be updated. + properties: + postStart: + description: |- + PostStart is called immediately after a container is created. If the handler fails, + the container is terminated and restarted according to its restart policy. + Other management of the container blocks until the hook completes. + More info: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks + properties: + exec: + description: Exec specifies the action to take. + properties: + command: + description: |- + Command is the command line to execute inside the container, the working directory for the + command is root ('/') in the container's filesystem. The command is simply exec'd, it is + not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use + a shell, you need to explicitly call out to that shell. + Exit status of 0 is treated as live/healthy and non-zero is unhealthy. + items: + type: string + type: array + type: object + httpGet: + description: HTTPGet specifies the http request + to perform. + properties: + host: + description: |- + Host name to connect to, defaults to the pod IP. You probably want to set + "Host" in httpHeaders instead. + type: string + httpHeaders: + description: Custom headers to set in the request. + HTTP allows repeated headers. 
+ items: + description: HTTPHeader describes a custom + header to be used in HTTP probes + properties: + name: + description: The header field name + type: string + value: + description: The header field value + type: string + required: + - name + - value + type: object + type: array + path: + description: Path to access on the HTTP server. + type: string + port: + anyOf: + - type: integer + - type: string + description: |- + Name or number of the port to access on the container. + Number must be in the range 1 to 65535. + Name must be an IANA_SVC_NAME. + x-kubernetes-int-or-string: true + scheme: + description: |- + Scheme to use for connecting to the host. + Defaults to HTTP. + type: string + required: + - port + type: object + tcpSocket: + description: |- + Deprecated. TCPSocket is NOT supported as a LifecycleHandler and kept + for the backward compatibility. There are no validation of this field and + lifecycle hooks will fail in runtime when tcp handler is specified. + properties: + host: + description: 'Optional: Host name to connect + to, defaults to the pod IP.' + type: string + port: + anyOf: + - type: integer + - type: string + description: |- + Number or name of the port to access on the container. + Number must be in the range 1 to 65535. + Name must be an IANA_SVC_NAME. + x-kubernetes-int-or-string: true + required: + - port + type: object + type: object + preStop: + description: |- + PreStop is called immediately before a container is terminated due to an + API request or management event such as liveness/startup probe failure, + preemption, resource contention, etc. The handler is not called if the + container crashes or exits. The Pod's termination grace period countdown begins before the + PreStop hook is executed. Regardless of the outcome of the handler, the + container will eventually terminate within the Pod's termination grace + period (unless delayed by finalizers). 
Other management of the container blocks until the hook completes + or until the termination grace period is reached. + More info: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks + properties: + exec: + description: Exec specifies the action to take. + properties: + command: + description: |- + Command is the command line to execute inside the container, the working directory for the + command is root ('/') in the container's filesystem. The command is simply exec'd, it is + not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use + a shell, you need to explicitly call out to that shell. + Exit status of 0 is treated as live/healthy and non-zero is unhealthy. + items: + type: string + type: array + type: object + httpGet: + description: HTTPGet specifies the http request + to perform. + properties: + host: + description: |- + Host name to connect to, defaults to the pod IP. You probably want to set + "Host" in httpHeaders instead. + type: string + httpHeaders: + description: Custom headers to set in the request. + HTTP allows repeated headers. + items: + description: HTTPHeader describes a custom + header to be used in HTTP probes + properties: + name: + description: The header field name + type: string + value: + description: The header field value + type: string + required: + - name + - value + type: object + type: array + path: + description: Path to access on the HTTP server. + type: string + port: + anyOf: + - type: integer + - type: string + description: |- + Name or number of the port to access on the container. + Number must be in the range 1 to 65535. + Name must be an IANA_SVC_NAME. + x-kubernetes-int-or-string: true + scheme: + description: |- + Scheme to use for connecting to the host. + Defaults to HTTP. + type: string + required: + - port + type: object + tcpSocket: + description: |- + Deprecated. 
TCPSocket is NOT supported as a LifecycleHandler and kept + for the backward compatibility. There are no validation of this field and + lifecycle hooks will fail in runtime when tcp handler is specified. + properties: + host: + description: 'Optional: Host name to connect + to, defaults to the pod IP.' + type: string + port: + anyOf: + - type: integer + - type: string + description: |- + Number or name of the port to access on the container. + Number must be in the range 1 to 65535. + Name must be an IANA_SVC_NAME. + x-kubernetes-int-or-string: true + required: + - port + type: object + type: object + type: object + livenessProbe: + description: |- + Periodic probe of container liveness. + Container will be restarted if the probe fails. + Cannot be updated. + More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes + properties: + exec: + description: Exec specifies the action to take. + properties: + command: + description: |- + Command is the command line to execute inside the container, the working directory for the + command is root ('/') in the container's filesystem. The command is simply exec'd, it is + not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use + a shell, you need to explicitly call out to that shell. + Exit status of 0 is treated as live/healthy and non-zero is unhealthy. + items: + type: string + type: array + type: object + failureThreshold: + description: |- + Minimum consecutive failures for the probe to be considered failed after having succeeded. + Defaults to 3. Minimum value is 1. + format: int32 + type: integer + grpc: + description: |- + GRPC specifies an action involving a GRPC port. + This is a beta field and requires enabling GRPCContainerProbe feature gate. + properties: + port: + description: Port number of the gRPC service. Number + must be in the range 1 to 65535. 
+ format: int32 + type: integer + service: + default: "" + description: |- + Service is the name of the service to place in the gRPC HealthCheckRequest + (see https://github.com/grpc/grpc/blob/master/doc/health-checking.md). + + If this is not specified, the default behavior is defined by gRPC. + type: string + required: + - port + type: object + httpGet: + description: HTTPGet specifies the http request to perform. + properties: + host: + description: |- + Host name to connect to, defaults to the pod IP. You probably want to set + "Host" in httpHeaders instead. + type: string + httpHeaders: + description: Custom headers to set in the request. + HTTP allows repeated headers. + items: + description: HTTPHeader describes a custom header + to be used in HTTP probes + properties: + name: + description: The header field name + type: string + value: + description: The header field value + type: string + required: + - name + - value + type: object + type: array + path: + description: Path to access on the HTTP server. + type: string + port: + anyOf: + - type: integer + - type: string + description: |- + Name or number of the port to access on the container. + Number must be in the range 1 to 65535. + Name must be an IANA_SVC_NAME. + x-kubernetes-int-or-string: true + scheme: + description: |- + Scheme to use for connecting to the host. + Defaults to HTTP. + type: string + required: + - port + type: object + initialDelaySeconds: + description: |- + Number of seconds after the container has started before liveness probes are initiated. + More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes + format: int32 + type: integer + periodSeconds: + description: |- + How often (in seconds) to perform the probe. + Default to 10 seconds. Minimum value is 1. + format: int32 + type: integer + successThreshold: + description: |- + Minimum consecutive successes for the probe to be considered successful after having failed. + Defaults to 1. 
Must be 1 for liveness and startup. Minimum value is 1. + format: int32 + type: integer + tcpSocket: + description: TCPSocket specifies an action involving + a TCP port. + properties: + host: + description: 'Optional: Host name to connect to, + defaults to the pod IP.' + type: string + port: + anyOf: + - type: integer + - type: string + description: |- + Number or name of the port to access on the container. + Number must be in the range 1 to 65535. + Name must be an IANA_SVC_NAME. + x-kubernetes-int-or-string: true + required: + - port + type: object + terminationGracePeriodSeconds: + description: |- + Optional duration in seconds the pod needs to terminate gracefully upon probe failure. + The grace period is the duration in seconds after the processes running in the pod are sent + a termination signal and the time when the processes are forcibly halted with a kill signal. + Set this value longer than the expected cleanup time for your process. + If this value is nil, the pod's terminationGracePeriodSeconds will be used. Otherwise, this + value overrides the value provided by the pod spec. + Value must be non-negative integer. The value zero indicates stop immediately via + the kill signal (no opportunity to shut down). + This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. + Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset. + format: int64 + type: integer + timeoutSeconds: + description: |- + Number of seconds after which the probe times out. + Defaults to 1 second. Minimum value is 1. + More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes + format: int32 + type: integer + type: object + name: + description: |- + Name of the container specified as a DNS_LABEL. + Each container in a pod must have a unique name (DNS_LABEL). + Cannot be updated. + type: string + ports: + description: |- + List of ports to expose from the container. 
Not specifying a port here + DOES NOT prevent that port from being exposed. Any port which is + listening on the default "0.0.0.0" address inside a container will be + accessible from the network. + Modifying this array with strategic merge patch may corrupt the data. + For more information See https://github.com/kubernetes/kubernetes/issues/108255. + Cannot be updated. + items: + description: ContainerPort represents a network port in + a single container. + properties: + containerPort: + description: |- + Number of port to expose on the pod's IP address. + This must be a valid port number, 0 < x < 65536. + format: int32 + type: integer + hostIP: + description: What host IP to bind the external port + to. + type: string + hostPort: + description: |- + Number of port to expose on the host. + If specified, this must be a valid port number, 0 < x < 65536. + If HostNetwork is specified, this must match ContainerPort. + Most containers do not need this. + format: int32 + type: integer + name: + description: |- + If specified, this must be an IANA_SVC_NAME and unique within the pod. Each + named port in a pod must have a unique name. Name for the port that can be + referred to by services. + type: string + protocol: + default: TCP + description: |- + Protocol for port. Must be UDP, TCP, or SCTP. + Defaults to "TCP". + type: string + required: + - containerPort + type: object + type: array + x-kubernetes-list-map-keys: + - containerPort + - protocol + x-kubernetes-list-type: map + readinessProbe: + description: |- + Periodic probe of container service readiness. + Container will be removed from service endpoints if the probe fails. + Cannot be updated. + More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes + properties: + exec: + description: Exec specifies the action to take. 
+ properties: + command: + description: |- + Command is the command line to execute inside the container, the working directory for the + command is root ('/') in the container's filesystem. The command is simply exec'd, it is + not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use + a shell, you need to explicitly call out to that shell. + Exit status of 0 is treated as live/healthy and non-zero is unhealthy. + items: + type: string + type: array + type: object + failureThreshold: + description: |- + Minimum consecutive failures for the probe to be considered failed after having succeeded. + Defaults to 3. Minimum value is 1. + format: int32 + type: integer + grpc: + description: |- + GRPC specifies an action involving a GRPC port. + This is a beta field and requires enabling GRPCContainerProbe feature gate. + properties: + port: + description: Port number of the gRPC service. Number + must be in the range 1 to 65535. + format: int32 + type: integer + service: + default: "" + description: |- + Service is the name of the service to place in the gRPC HealthCheckRequest + (see https://github.com/grpc/grpc/blob/master/doc/health-checking.md). + + If this is not specified, the default behavior is defined by gRPC. + type: string + required: + - port + type: object + httpGet: + description: HTTPGet specifies the http request to perform. + properties: + host: + description: |- + Host name to connect to, defaults to the pod IP. You probably want to set + "Host" in httpHeaders instead. + type: string + httpHeaders: + description: Custom headers to set in the request. + HTTP allows repeated headers. + items: + description: HTTPHeader describes a custom header + to be used in HTTP probes + properties: + name: + description: The header field name + type: string + value: + description: The header field value + type: string + required: + - name + - value + type: object + type: array + path: + description: Path to access on the HTTP server. 
+ type: string + port: + anyOf: + - type: integer + - type: string + description: |- + Name or number of the port to access on the container. + Number must be in the range 1 to 65535. + Name must be an IANA_SVC_NAME. + x-kubernetes-int-or-string: true + scheme: + description: |- + Scheme to use for connecting to the host. + Defaults to HTTP. + type: string + required: + - port + type: object + initialDelaySeconds: + description: |- + Number of seconds after the container has started before liveness probes are initiated. + More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes + format: int32 + type: integer + periodSeconds: + description: |- + How often (in seconds) to perform the probe. + Default to 10 seconds. Minimum value is 1. + format: int32 + type: integer + successThreshold: + description: |- + Minimum consecutive successes for the probe to be considered successful after having failed. + Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1. + format: int32 + type: integer + tcpSocket: + description: TCPSocket specifies an action involving + a TCP port. + properties: + host: + description: 'Optional: Host name to connect to, + defaults to the pod IP.' + type: string + port: + anyOf: + - type: integer + - type: string + description: |- + Number or name of the port to access on the container. + Number must be in the range 1 to 65535. + Name must be an IANA_SVC_NAME. + x-kubernetes-int-or-string: true + required: + - port + type: object + terminationGracePeriodSeconds: + description: |- + Optional duration in seconds the pod needs to terminate gracefully upon probe failure. + The grace period is the duration in seconds after the processes running in the pod are sent + a termination signal and the time when the processes are forcibly halted with a kill signal. + Set this value longer than the expected cleanup time for your process. + If this value is nil, the pod's terminationGracePeriodSeconds will be used. 
Otherwise, this + value overrides the value provided by the pod spec. + Value must be non-negative integer. The value zero indicates stop immediately via + the kill signal (no opportunity to shut down). + This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. + Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset. + format: int64 + type: integer + timeoutSeconds: + description: |- + Number of seconds after which the probe times out. + Defaults to 1 second. Minimum value is 1. + More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes + format: int32 + type: integer + type: object + resources: + description: |- + Compute Resources required by this container. + Cannot be updated. + More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ + properties: + claims: + description: |- + Claims lists the names of resources, defined in spec.resourceClaims, + that are used by this container. + + This is an alpha field and requires enabling the + DynamicResourceAllocation feature gate. + + This field is immutable. It can only be set for containers. + items: + description: ResourceClaim references one entry in + PodSpec.ResourceClaims. + properties: + name: + description: |- + Name must match the name of one entry in pod.spec.resourceClaims of + the Pod where this field is used. It makes that resource available + inside a container. + type: string + required: + - name + type: object + type: array + x-kubernetes-list-map-keys: + - name + x-kubernetes-list-type: map + limits: + additionalProperties: + anyOf: + - type: integer + - type: string + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + description: |- + Limits describes the maximum amount of compute resources allowed. 
+ More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ + type: object + requests: + additionalProperties: + anyOf: + - type: integer + - type: string + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + description: |- + Requests describes the minimum amount of compute resources required. + If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, + otherwise to an implementation-defined value. + More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ + type: object + type: object + securityContext: + description: |- + SecurityContext defines the security options the container should be run with. + If set, the fields of SecurityContext override the equivalent fields of PodSecurityContext. + More info: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/ + properties: + allowPrivilegeEscalation: + description: |- + AllowPrivilegeEscalation controls whether a process can gain more + privileges than its parent process. This bool directly controls if + the no_new_privs flag will be set on the container process. + AllowPrivilegeEscalation is true always when the container is: + 1) run as Privileged + 2) has CAP_SYS_ADMIN + Note that this field cannot be set when spec.os.name is windows. + type: boolean + capabilities: + description: |- + The capabilities to add/drop when running containers. + Defaults to the default set of capabilities granted by the container runtime. + Note that this field cannot be set when spec.os.name is windows. 
+ properties: + add: + description: Added capabilities + items: + description: Capability represent POSIX capabilities + type + type: string + type: array + drop: + description: Removed capabilities + items: + description: Capability represent POSIX capabilities + type + type: string + type: array + type: object + privileged: + description: |- + Run container in privileged mode. + Processes in privileged containers are essentially equivalent to root on the host. + Defaults to false. + Note that this field cannot be set when spec.os.name is windows. + type: boolean + procMount: + description: |- + procMount denotes the type of proc mount to use for the containers. + The default is DefaultProcMount which uses the container runtime defaults for + readonly paths and masked paths. + This requires the ProcMountType feature flag to be enabled. + Note that this field cannot be set when spec.os.name is windows. + type: string + readOnlyRootFilesystem: + description: |- + Whether this container has a read-only root filesystem. + Default is false. + Note that this field cannot be set when spec.os.name is windows. + type: boolean + runAsGroup: + description: |- + The GID to run the entrypoint of the container process. + Uses runtime default if unset. + May also be set in PodSecurityContext. If set in both SecurityContext and + PodSecurityContext, the value specified in SecurityContext takes precedence. + Note that this field cannot be set when spec.os.name is windows. + format: int64 + type: integer + runAsNonRoot: + description: |- + Indicates that the container must run as a non-root user. + If true, the Kubelet will validate the image at runtime to ensure that it + does not run as UID 0 (root) and fail to start the container if it does. + If unset or false, no such validation will be performed. + May also be set in PodSecurityContext. If set in both SecurityContext and + PodSecurityContext, the value specified in SecurityContext takes precedence. 
+ type: boolean + runAsUser: + description: |- + The UID to run the entrypoint of the container process. + Defaults to user specified in image metadata if unspecified. + May also be set in PodSecurityContext. If set in both SecurityContext and + PodSecurityContext, the value specified in SecurityContext takes precedence. + Note that this field cannot be set when spec.os.name is windows. + format: int64 + type: integer + seLinuxOptions: + description: |- + The SELinux context to be applied to the container. + If unspecified, the container runtime will allocate a random SELinux context for each + container. May also be set in PodSecurityContext. If set in both SecurityContext and + PodSecurityContext, the value specified in SecurityContext takes precedence. + Note that this field cannot be set when spec.os.name is windows. + properties: + level: + description: Level is SELinux level label that applies + to the container. + type: string + role: + description: Role is a SELinux role label that applies + to the container. + type: string + type: + description: Type is a SELinux type label that applies + to the container. + type: string + user: + description: User is a SELinux user label that applies + to the container. + type: string + type: object + seccompProfile: + description: |- + The seccomp options to use by this container. If seccomp options are + provided at both the pod & container level, the container options + override the pod options. + Note that this field cannot be set when spec.os.name is windows. + properties: + localhostProfile: + description: |- + localhostProfile indicates a profile defined in a file on the node should be used. + The profile must be preconfigured on the node to work. + Must be a descending path, relative to the kubelet's configured seccomp profile location. + Must only be set if type is "Localhost". + type: string + type: + description: |- + type indicates which kind of seccomp profile will be applied. 
+ Valid options are: + + Localhost - a profile defined in a file on the node should be used. + RuntimeDefault - the container runtime default profile should be used. + Unconfined - no profile should be applied. + type: string + required: + - type + type: object + windowsOptions: + description: |- + The Windows specific settings applied to all containers. + If unspecified, the options from the PodSecurityContext will be used. + If set in both SecurityContext and PodSecurityContext, the value specified in SecurityContext takes precedence. + Note that this field cannot be set when spec.os.name is linux. + properties: + gmsaCredentialSpec: + description: |- + GMSACredentialSpec is where the GMSA admission webhook + (https://github.com/kubernetes-sigs/windows-gmsa) inlines the contents of the + GMSA credential spec named by the GMSACredentialSpecName field. + type: string + gmsaCredentialSpecName: + description: GMSACredentialSpecName is the name + of the GMSA credential spec to use. + type: string + hostProcess: + description: |- + HostProcess determines if a container should be run as a 'Host Process' container. + This field is alpha-level and will only be honored by components that enable the + WindowsHostProcessContainers feature flag. Setting this field without the feature + flag will result in errors when validating the Pod. All of a Pod's containers must + have the same effective HostProcess value (it is not allowed to have a mix of HostProcess + containers and non-HostProcess containers). In addition, if HostProcess is true + then HostNetwork must also be set to true. + type: boolean + runAsUserName: + description: |- + The UserName in Windows to run the entrypoint of the container process. + Defaults to the user specified in image metadata if unspecified. + May also be set in PodSecurityContext. If set in both SecurityContext and + PodSecurityContext, the value specified in SecurityContext takes precedence. 
+ type: string + type: object + type: object + startupProbe: + description: |- + StartupProbe indicates that the Pod has successfully initialized. + If specified, no other probes are executed until this completes successfully. + If this probe fails, the Pod will be restarted, just as if the livenessProbe failed. + This can be used to provide different probe parameters at the beginning of a Pod's lifecycle, + when it might take a long time to load data or warm a cache, than during steady-state operation. + This cannot be updated. + More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes + properties: + exec: + description: Exec specifies the action to take. + properties: + command: + description: |- + Command is the command line to execute inside the container, the working directory for the + command is root ('/') in the container's filesystem. The command is simply exec'd, it is + not run inside a shell, so traditional shell instructions ('|', etc) won't work. To use + a shell, you need to explicitly call out to that shell. + Exit status of 0 is treated as live/healthy and non-zero is unhealthy. + items: + type: string + type: array + type: object + failureThreshold: + description: |- + Minimum consecutive failures for the probe to be considered failed after having succeeded. + Defaults to 3. Minimum value is 1. + format: int32 + type: integer + grpc: + description: |- + GRPC specifies an action involving a GRPC port. + This is a beta field and requires enabling GRPCContainerProbe feature gate. + properties: + port: + description: Port number of the gRPC service. Number + must be in the range 1 to 65535. + format: int32 + type: integer + service: + default: "" + description: |- + Service is the name of the service to place in the gRPC HealthCheckRequest + (see https://github.com/grpc/grpc/blob/master/doc/health-checking.md). + + If this is not specified, the default behavior is defined by gRPC. 
+ type: string + required: + - port + type: object + httpGet: + description: HTTPGet specifies the http request to perform. + properties: + host: + description: |- + Host name to connect to, defaults to the pod IP. You probably want to set + "Host" in httpHeaders instead. + type: string + httpHeaders: + description: Custom headers to set in the request. + HTTP allows repeated headers. + items: + description: HTTPHeader describes a custom header + to be used in HTTP probes + properties: + name: + description: The header field name + type: string + value: + description: The header field value + type: string + required: + - name + - value + type: object + type: array + path: + description: Path to access on the HTTP server. + type: string + port: + anyOf: + - type: integer + - type: string + description: |- + Name or number of the port to access on the container. + Number must be in the range 1 to 65535. + Name must be an IANA_SVC_NAME. + x-kubernetes-int-or-string: true + scheme: + description: |- + Scheme to use for connecting to the host. + Defaults to HTTP. + type: string + required: + - port + type: object + initialDelaySeconds: + description: |- + Number of seconds after the container has started before liveness probes are initiated. + More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes + format: int32 + type: integer + periodSeconds: + description: |- + How often (in seconds) to perform the probe. + Default to 10 seconds. Minimum value is 1. + format: int32 + type: integer + successThreshold: + description: |- + Minimum consecutive successes for the probe to be considered successful after having failed. + Defaults to 1. Must be 1 for liveness and startup. Minimum value is 1. + format: int32 + type: integer + tcpSocket: + description: TCPSocket specifies an action involving + a TCP port. + properties: + host: + description: 'Optional: Host name to connect to, + defaults to the pod IP.' 
+ type: string + port: + anyOf: + - type: integer + - type: string + description: |- + Number or name of the port to access on the container. + Number must be in the range 1 to 65535. + Name must be an IANA_SVC_NAME. + x-kubernetes-int-or-string: true + required: + - port + type: object + terminationGracePeriodSeconds: + description: |- + Optional duration in seconds the pod needs to terminate gracefully upon probe failure. + The grace period is the duration in seconds after the processes running in the pod are sent + a termination signal and the time when the processes are forcibly halted with a kill signal. + Set this value longer than the expected cleanup time for your process. + If this value is nil, the pod's terminationGracePeriodSeconds will be used. Otherwise, this + value overrides the value provided by the pod spec. + Value must be non-negative integer. The value zero indicates stop immediately via + the kill signal (no opportunity to shut down). + This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate. + Minimum value is 1. spec.terminationGracePeriodSeconds is used if unset. + format: int64 + type: integer + timeoutSeconds: + description: |- + Number of seconds after which the probe times out. + Defaults to 1 second. Minimum value is 1. + More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes + format: int32 + type: integer + type: object + stdin: + description: |- + Whether this container should allocate a buffer for stdin in the container runtime. If this + is not set, reads from stdin in the container will always result in EOF. + Default is false. + type: boolean + stdinOnce: + description: |- + Whether the container runtime should close the stdin channel after it has been opened by + a single attach. When stdin is true the stdin stream will remain open across multiple attach + sessions. 
If stdinOnce is set to true, stdin is opened on container start, is empty until the + first client attaches to stdin, and then remains open and accepts data until the client disconnects, + at which time stdin is closed and remains closed until the container is restarted. If this + flag is false, a container processes that reads from stdin will never receive an EOF. + Default is false + type: boolean + terminationMessagePath: + description: |- + Optional: Path at which the file to which the container's termination message + will be written is mounted into the container's filesystem. + Message written is intended to be brief final status, such as an assertion failure message. + Will be truncated by the node if greater than 4096 bytes. The total message length across + all containers will be limited to 12kb. + Defaults to /dev/termination-log. + Cannot be updated. + type: string + terminationMessagePolicy: + description: |- + Indicate how the termination message should be populated. File will use the contents of + terminationMessagePath to populate the container status message on both success and failure. + FallbackToLogsOnError will use the last chunk of container log output if the termination + message file is empty and the container exited with an error. + The log output is limited to 2048 bytes or 80 lines, whichever is smaller. + Defaults to File. + Cannot be updated. + type: string + tty: + description: |- + Whether this container should allocate a TTY for itself, also requires 'stdin' to be true. + Default is false. + type: boolean + volumeDevices: + description: volumeDevices is the list of block devices + to be used by the container. + items: + description: volumeDevice describes a mapping of a raw + block device within a container. + properties: + devicePath: + description: devicePath is the path inside of the + container that the device will be mapped to. 
+ type: string + name: + description: name must match the name of a persistentVolumeClaim + in the pod + type: string + required: + - devicePath + - name + type: object + type: array + volumeMounts: + description: |- + Pod volumes to mount into the container's filesystem. + Cannot be updated. + items: + description: VolumeMount describes a mounting of a Volume + within a container. + properties: + mountPath: + description: |- + Path within the container at which the volume should be mounted. Must + not contain ':'. + type: string + mountPropagation: + description: |- + mountPropagation determines how mounts are propagated from the host + to container and the other way around. + When not set, MountPropagationNone is used. + This field is beta in 1.10. + type: string + name: + description: This must match the Name of a Volume. + type: string + readOnly: + description: |- + Mounted read-only if true, read-write otherwise (false or unspecified). + Defaults to false. + type: boolean + subPath: + description: |- + Path within the volume from which the container's volume should be mounted. + Defaults to "" (volume's root). + type: string + subPathExpr: + description: |- + Expanded path within the volume from which the container's volume should be mounted. + Behaves similarly to SubPath but environment variable references $(VAR_NAME) are expanded using the container's environment. + Defaults to "" (volume's root). + SubPathExpr and SubPath are mutually exclusive. + type: string + required: + - mountPath + - name + type: object + type: array + workingDir: + description: |- + Container's working directory. + If not specified, the container runtime's default will be used, which + might be configured in the container image. + Cannot be updated. 
+ type: string + required: + - name + type: object + type: array + volumes: + description: Specify the volumes information for the lm-eval and + sidecar containers + items: + description: Volume represents a named volume in a pod that + may be accessed by any container in the pod. + properties: + awsElasticBlockStore: + description: |- + awsElasticBlockStore represents an AWS Disk resource that is attached to a + kubelet's host machine and then exposed to the pod. + More info: https://kubernetes.io/docs/concepts/storage/volumes#awselasticblockstore + properties: + fsType: + description: |- + fsType is the filesystem type of the volume that you want to mount. + Tip: Ensure that the filesystem type is supported by the host operating system. + Examples: "ext4", "xfs", "ntfs". Implicitly inferred to be "ext4" if unspecified. + More info: https://kubernetes.io/docs/concepts/storage/volumes#awselasticblockstore + type: string + partition: + description: |- + partition is the partition in the volume that you want to mount. + If omitted, the default is to mount by volume name. + Examples: For volume /dev/sda1, you specify the partition as "1". + Similarly, the volume partition for /dev/sda is "0" (or you can leave the property empty). + format: int32 + type: integer + readOnly: + description: |- + readOnly value true will force the readOnly setting in VolumeMounts. + More info: https://kubernetes.io/docs/concepts/storage/volumes#awselasticblockstore + type: boolean + volumeID: + description: |- + volumeID is unique ID of the persistent disk resource in AWS (Amazon EBS volume). + More info: https://kubernetes.io/docs/concepts/storage/volumes#awselasticblockstore + type: string + required: + - volumeID + type: object + azureDisk: + description: azureDisk represents an Azure Data Disk mount + on the host and bind mount to the pod. + properties: + cachingMode: + description: 'cachingMode is the Host Caching mode: + None, Read Only, Read Write.' 
+ type: string + diskName: + description: diskName is the Name of the data disk in + the blob storage + type: string + diskURI: + description: diskURI is the URI of data disk in the + blob storage + type: string + fsType: + description: |- + fsType is Filesystem type to mount. + Must be a filesystem type supported by the host operating system. + Ex. "ext4", "xfs", "ntfs". Implicitly inferred to be "ext4" if unspecified. + type: string + kind: + description: 'kind expected values are Shared: multiple + blob disks per storage account Dedicated: single + blob disk per storage account Managed: azure managed + data disk (only in managed availability set). defaults + to shared' + type: string + readOnly: + description: |- + readOnly Defaults to false (read/write). ReadOnly here will force + the ReadOnly setting in VolumeMounts. + type: boolean + required: + - diskName + - diskURI + type: object + azureFile: + description: azureFile represents an Azure File Service + mount on the host and bind mount to the pod. + properties: + readOnly: + description: |- + readOnly defaults to false (read/write). ReadOnly here will force + the ReadOnly setting in VolumeMounts. + type: boolean + secretName: + description: secretName is the name of secret that + contains Azure Storage Account Name and Key + type: string + shareName: + description: shareName is the azure share Name + type: string + required: + - secretName + - shareName + type: object + cephfs: + description: cephFS represents a Ceph FS mount on the host + that shares a pod's lifetime + properties: + monitors: + description: |- + monitors is Required: Monitors is a collection of Ceph monitors + More info: https://examples.k8s.io/volumes/cephfs/README.md#how-to-use-it + items: + type: string + type: array + path: + description: 'path is Optional: Used as the mounted + root, rather than the full Ceph tree, default is /' + type: string + readOnly: + description: |- + readOnly is Optional: Defaults to false (read/write). 
ReadOnly here will force + the ReadOnly setting in VolumeMounts. + More info: https://examples.k8s.io/volumes/cephfs/README.md#how-to-use-it + type: boolean + secretFile: + description: |- + secretFile is Optional: SecretFile is the path to key ring for User, default is /etc/ceph/user.secret + More info: https://examples.k8s.io/volumes/cephfs/README.md#how-to-use-it + type: string + secretRef: + description: |- + secretRef is Optional: SecretRef is reference to the authentication secret for User, default is empty. + More info: https://examples.k8s.io/volumes/cephfs/README.md#how-to-use-it + properties: + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + type: object + x-kubernetes-map-type: atomic + user: + description: |- + user is optional: User is the rados user name, default is admin + More info: https://examples.k8s.io/volumes/cephfs/README.md#how-to-use-it + type: string + required: + - monitors + type: object + cinder: + description: |- + cinder represents a cinder volume attached and mounted on kubelets host machine. + More info: https://examples.k8s.io/mysql-cinder-pd/README.md + properties: + fsType: + description: |- + fsType is the filesystem type to mount. + Must be a filesystem type supported by the host operating system. + Examples: "ext4", "xfs", "ntfs". Implicitly inferred to be "ext4" if unspecified. + More info: https://examples.k8s.io/mysql-cinder-pd/README.md + type: string + readOnly: + description: |- + readOnly defaults to false (read/write). ReadOnly here will force + the ReadOnly setting in VolumeMounts. + More info: https://examples.k8s.io/mysql-cinder-pd/README.md + type: boolean + secretRef: + description: |- + secretRef is optional: points to a secret object containing parameters used to connect + to OpenStack. + properties: + name: + description: |- + Name of the referent. 
+ More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + type: object + x-kubernetes-map-type: atomic + volumeID: + description: |- + volumeID used to identify the volume in cinder. + More info: https://examples.k8s.io/mysql-cinder-pd/README.md + type: string + required: + - volumeID + type: object + configMap: + description: configMap represents a configMap that should + populate this volume + properties: + defaultMode: + description: |- + defaultMode is optional: mode bits used to set permissions on created files by default. + Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. + YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. + Defaults to 0644. + Directories within the path are not affected by this setting. + This might be in conflict with other options that affect the file + mode, like fsGroup, and the result can be other mode bits set. + format: int32 + type: integer + items: + description: |- + items if unspecified, each key-value pair in the Data field of the referenced + ConfigMap will be projected into the volume as a file whose name is the + key and content is the value. If specified, the listed keys will be + projected into the specified paths, and unlisted keys will not be + present. If a key is specified which is not present in the ConfigMap, + the volume setup will error unless it is marked optional. Paths must be + relative and may not contain the '..' path or start with '..'. + items: + description: Maps a string key to a path within a + volume. + properties: + key: + description: key is the key to project. + type: string + mode: + description: |- + mode is Optional: mode bits used to set permissions on this file. + Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. + YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. 
+ If not specified, the volume defaultMode will be used. + This might be in conflict with other options that affect the file + mode, like fsGroup, and the result can be other mode bits set. + format: int32 + type: integer + path: + description: |- + path is the relative path of the file to map the key to. + May not be an absolute path. + May not contain the path element '..'. + May not start with the string '..'. + type: string + required: + - key + - path + type: object + type: array + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + optional: + description: optional specify whether the ConfigMap + or its keys must be defined + type: boolean + type: object + x-kubernetes-map-type: atomic + csi: + description: csi (Container Storage Interface) represents + ephemeral storage that is handled by certain external + CSI drivers (Beta feature). + properties: + driver: + description: |- + driver is the name of the CSI driver that handles this volume. + Consult with your admin for the correct name as registered in the cluster. + type: string + fsType: + description: |- + fsType to mount. Ex. "ext4", "xfs", "ntfs". + If not provided, the empty value is passed to the associated CSI driver + which will determine the default filesystem to apply. + type: string + nodePublishSecretRef: + description: |- + nodePublishSecretRef is a reference to the secret object containing + sensitive information to pass to the CSI driver to complete the CSI + NodePublishVolume and NodeUnpublishVolume calls. + This field is optional, and may be empty if no secret is required. If the + secret object contains more than one secret, all secret references are passed. + properties: + name: + description: |- + Name of the referent. 
+ More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + type: object + x-kubernetes-map-type: atomic + readOnly: + description: |- + readOnly specifies a read-only configuration for the volume. + Defaults to false (read/write). + type: boolean + volumeAttributes: + additionalProperties: + type: string + description: |- + volumeAttributes stores driver-specific properties that are passed to the CSI + driver. Consult your driver's documentation for supported values. + type: object + required: + - driver + type: object + downwardAPI: + description: downwardAPI represents downward API about the + pod that should populate this volume + properties: + defaultMode: + description: |- + Optional: mode bits to use on created files by default. Must be a + Optional: mode bits used to set permissions on created files by default. + Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. + YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. + Defaults to 0644. + Directories within the path are not affected by this setting. + This might be in conflict with other options that affect the file + mode, like fsGroup, and the result can be other mode bits set. + format: int32 + type: integer + items: + description: Items is a list of downward API volume + file + items: + description: DownwardAPIVolumeFile represents information + to create the file containing the pod field + properties: + fieldRef: + description: 'Required: Selects a field of the + pod: only annotations, labels, name and namespace + are supported.' + properties: + apiVersion: + description: Version of the schema the FieldPath + is written in terms of, defaults to "v1". + type: string + fieldPath: + description: Path of the field to select in + the specified API version. 
+ type: string + required: + - fieldPath + type: object + x-kubernetes-map-type: atomic + mode: + description: |- + Optional: mode bits used to set permissions on this file, must be an octal value + between 0000 and 0777 or a decimal value between 0 and 511. + YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. + If not specified, the volume defaultMode will be used. + This might be in conflict with other options that affect the file + mode, like fsGroup, and the result can be other mode bits set. + format: int32 + type: integer + path: + description: 'Required: Path is the relative + path name of the file to be created. Must not + be absolute or contain the ''..'' path. Must + be utf-8 encoded. The first item of the relative + path must not start with ''..''' + type: string + resourceFieldRef: + description: |- + Selects a resource of the container: only resources limits and requests + (limits.cpu, limits.memory, requests.cpu and requests.memory) are currently supported. + properties: + containerName: + description: 'Container name: required for + volumes, optional for env vars' + type: string + divisor: + anyOf: + - type: integer + - type: string + description: Specifies the output format of + the exposed resources, defaults to "1" + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + resource: + description: 'Required: resource to select' + type: string + required: + - resource + type: object + x-kubernetes-map-type: atomic + required: + - path + type: object + type: array + type: object + emptyDir: + description: |- + emptyDir represents a temporary directory that shares a pod's lifetime. + More info: https://kubernetes.io/docs/concepts/storage/volumes#emptydir + properties: + medium: + description: |- + medium represents what type of storage medium should back this directory. 
+ The default is "" which means to use the node's default medium. + Must be an empty string (default) or Memory. + More info: https://kubernetes.io/docs/concepts/storage/volumes#emptydir + type: string + sizeLimit: + anyOf: + - type: integer + - type: string + description: |- + sizeLimit is the total amount of local storage required for this EmptyDir volume. + The size limit is also applicable for memory medium. + The maximum usage on memory medium EmptyDir would be the minimum value between + the SizeLimit specified here and the sum of memory limits of all containers in a pod. + The default is nil which means that the limit is undefined. + More info: http://kubernetes.io/docs/user-guide/volumes#emptydir + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + type: object + ephemeral: + description: |- + ephemeral represents a volume that is handled by a cluster storage driver. + The volume's lifecycle is tied to the pod that defines it - it will be created before the pod starts, + and deleted when the pod is removed. + + Use this if: + a) the volume is only needed while the pod runs, + b) features of normal volumes like restoring from snapshot or capacity + tracking are needed, + c) the storage driver is specified through a storage class, and + d) the storage driver supports dynamic volume provisioning through + a PersistentVolumeClaim (see EphemeralVolumeSource for more + information on the connection between this volume type + and PersistentVolumeClaim). + + Use PersistentVolumeClaim or one of the vendor-specific + APIs for volumes that persist for longer than the lifecycle + of an individual pod. + + Use CSI for light-weight local ephemeral volumes if the CSI driver is meant to + be used that way - see the documentation of the driver for + more information. + + A pod can use both types of ephemeral volumes and + persistent volumes at the same time. 
+ properties: + volumeClaimTemplate: + description: |- + Will be used to create a stand-alone PVC to provision the volume. + The pod in which this EphemeralVolumeSource is embedded will be the + owner of the PVC, i.e. the PVC will be deleted together with the + pod. The name of the PVC will be `<pod name>-<volume name>` where + `<volume name>` is the name from the `PodSpec.Volumes` array + entry. Pod validation will reject the pod if the concatenated name + is not valid for a PVC (for example, too long). + + An existing PVC with that name that is not owned by the pod + will *not* be used for the pod to avoid using an unrelated + volume by mistake. Starting the pod is then blocked until + the unrelated PVC is removed. If such a pre-created PVC is + meant to be used by the pod, the PVC has to updated with an + owner reference to the pod once the pod exists. Normally + this should not be necessary, but it may be useful when + manually reconstructing a broken cluster. + + This field is read-only and no changes will be made by Kubernetes + to the PVC after it has been created. + + Required, must not be nil. + properties: + metadata: + description: |- + May contain labels and annotations that will be copied into the PVC + when creating it. No other fields are allowed and will be rejected during + validation. + type: object + spec: + description: |- + The specification for the PersistentVolumeClaim. The entire content is + copied unchanged into the PVC that gets created from this + template. The same fields as in a PersistentVolumeClaim + are also valid here. + properties: + accessModes: + description: |- + accessModes contains the desired access modes the volume should have.
+ More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#access-modes-1 + items: + type: string + type: array + dataSource: + description: |- + dataSource field can be used to specify either: + * An existing VolumeSnapshot object (snapshot.storage.k8s.io/VolumeSnapshot) + * An existing PVC (PersistentVolumeClaim) + If the provisioner or an external controller can support the specified data source, + it will create a new volume based on the contents of the specified data source. + When the AnyVolumeDataSource feature gate is enabled, dataSource contents will be copied to dataSourceRef, + and dataSourceRef contents will be copied to dataSource when dataSourceRef.namespace is not specified. + If the namespace is specified, then dataSourceRef will not be copied to dataSource. + properties: + apiGroup: + description: |- + APIGroup is the group for the resource being referenced. + If APIGroup is not specified, the specified Kind must be in the core API group. + For any other third-party types, APIGroup is required. + type: string + kind: + description: Kind is the type of resource + being referenced + type: string + name: + description: Name is the name of resource + being referenced + type: string + required: + - kind + - name + type: object + x-kubernetes-map-type: atomic + dataSourceRef: + description: |- + dataSourceRef specifies the object from which to populate the volume with data, if a non-empty + volume is desired. This may be any object from a non-empty API group (non + core object) or a PersistentVolumeClaim object. + When this field is specified, volume binding will only succeed if the type of + the specified object matches some installed volume populator or dynamic + provisioner. + This field will replace the functionality of the dataSource field and as such + if both fields are non-empty, they must have the same value. 
For backwards + compatibility, when namespace isn't specified in dataSourceRef, + both fields (dataSource and dataSourceRef) will be set to the same + value automatically if one of them is empty and the other is non-empty. + When namespace is specified in dataSourceRef, + dataSource isn't set to the same value and must be empty. + There are three important differences between dataSource and dataSourceRef: + * While dataSource only allows two specific types of objects, dataSourceRef + allows any non-core object, as well as PersistentVolumeClaim objects. + * While dataSource ignores disallowed values (dropping them), dataSourceRef + preserves all values, and generates an error if a disallowed value is + specified. + * While dataSource only allows local objects, dataSourceRef allows objects + in any namespaces. + (Beta) Using this field requires the AnyVolumeDataSource feature gate to be enabled. + (Alpha) Using the namespace field of dataSourceRef requires the CrossNamespaceVolumeDataSource feature gate to be enabled. + properties: + apiGroup: + description: |- + APIGroup is the group for the resource being referenced. + If APIGroup is not specified, the specified Kind must be in the core API group. + For any other third-party types, APIGroup is required. + type: string + kind: + description: Kind is the type of resource + being referenced + type: string + name: + description: Name is the name of resource + being referenced + type: string + namespace: + description: |- + Namespace is the namespace of resource being referenced + Note that when a namespace is specified, a gateway.networking.k8s.io/ReferenceGrant object is required in the referent namespace to allow that namespace's owner to accept the reference. See the ReferenceGrant documentation for details. + (Alpha) This field requires the CrossNamespaceVolumeDataSource feature gate to be enabled. 
+ type: string + required: + - kind + - name + type: object + resources: + description: |- + resources represents the minimum resources the volume should have. + If RecoverVolumeExpansionFailure feature is enabled users are allowed to specify resource requirements + that are lower than previous value but must still be higher than capacity recorded in the + status field of the claim. + More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#resources + properties: + claims: + description: |- + Claims lists the names of resources, defined in spec.resourceClaims, + that are used by this container. + + This is an alpha field and requires enabling the + DynamicResourceAllocation feature gate. + + This field is immutable. It can only be set for containers. + items: + description: ResourceClaim references + one entry in PodSpec.ResourceClaims. + properties: + name: + description: |- + Name must match the name of one entry in pod.spec.resourceClaims of + the Pod where this field is used. It makes that resource available + inside a container. + type: string + required: + - name + type: object + type: array + x-kubernetes-list-map-keys: + - name + x-kubernetes-list-type: map + limits: + additionalProperties: + anyOf: + - type: integer + - type: string + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + description: |- + Limits describes the maximum amount of compute resources allowed. + More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ + type: object + requests: + additionalProperties: + anyOf: + - type: integer + - type: string + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + description: |- + Requests describes the minimum amount of compute resources required. 
+ If Requests is omitted for a container, it defaults to Limits if that is explicitly specified, + otherwise to an implementation-defined value. + More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/ + type: object + type: object + selector: + description: selector is a label query over + volumes to consider for binding. + properties: + matchExpressions: + description: matchExpressions is a list + of label selector requirements. The requirements + are ANDed. + items: + description: |- + A label selector requirement is a selector that contains values, a key, and an operator that + relates the key and values. + properties: + key: + description: key is the label key + that the selector applies to. + type: string + operator: + description: |- + operator represents a key's relationship to a set of values. + Valid operators are In, NotIn, Exists and DoesNotExist. + type: string + values: + description: |- + values is an array of string values. If the operator is In or NotIn, + the values array must be non-empty. If the operator is Exists or DoesNotExist, + the values array must be empty. This array is replaced during a strategic + merge patch. + items: + type: string + type: array + required: + - key + - operator + type: object + type: array + matchLabels: + additionalProperties: + type: string + description: |- + matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels + map is equivalent to an element of matchExpressions, whose key field is "key", the + operator is "In", and the values array contains only "value". The requirements are ANDed. + type: object + type: object + x-kubernetes-map-type: atomic + storageClassName: + description: |- + storageClassName is the name of the StorageClass required by the claim. + More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#class-1 + type: string + volumeMode: + description: |- + volumeMode defines what type of volume is required by the claim. 
+ Value of Filesystem is implied when not included in claim spec. + type: string + volumeName: + description: volumeName is the binding reference + to the PersistentVolume backing this claim. + type: string + type: object + required: + - spec + type: object + type: object + fc: + description: fc represents a Fibre Channel resource that + is attached to a kubelet's host machine and then exposed + to the pod. + properties: + fsType: + description: |- + fsType is the filesystem type to mount. + Must be a filesystem type supported by the host operating system. + Ex. "ext4", "xfs", "ntfs". Implicitly inferred to be "ext4" if unspecified. + type: string + lun: + description: 'lun is Optional: FC target lun number' + format: int32 + type: integer + readOnly: + description: |- + readOnly is Optional: Defaults to false (read/write). ReadOnly here will force + the ReadOnly setting in VolumeMounts. + type: boolean + targetWWNs: + description: 'targetWWNs is Optional: FC target worldwide + names (WWNs)' + items: + type: string + type: array + wwids: + description: |- + wwids Optional: FC volume world wide identifiers (wwids) + Either wwids or combination of targetWWNs and lun must be set, but not both simultaneously. + items: + type: string + type: array + type: object + flexVolume: + description: |- + flexVolume represents a generic volume resource that is + provisioned/attached using an exec based plugin. + properties: + driver: + description: driver is the name of the driver to use + for this volume. + type: string + fsType: + description: |- + fsType is the filesystem type to mount. + Must be a filesystem type supported by the host operating system. + Ex. "ext4", "xfs", "ntfs". The default filesystem depends on FlexVolume script. + type: string + options: + additionalProperties: + type: string + description: 'options is Optional: this field holds + extra command options if any.' 
+ type: object + readOnly: + description: |- + readOnly is Optional: defaults to false (read/write). ReadOnly here will force + the ReadOnly setting in VolumeMounts. + type: boolean + secretRef: + description: |- + secretRef is Optional: secretRef is reference to the secret object containing + sensitive information to pass to the plugin scripts. This may be + empty if no secret object is specified. If the secret object + contains more than one secret, all secrets are passed to the plugin + scripts. + properties: + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + type: object + x-kubernetes-map-type: atomic + required: + - driver + type: object + flocker: + description: flocker represents a Flocker volume attached + to a kubelet's host machine. This depends on the Flocker + control service being running + properties: + datasetName: + description: |- + datasetName is Name of the dataset stored as metadata -> name on the dataset for Flocker + should be considered as deprecated + type: string + datasetUUID: + description: datasetUUID is the UUID of the dataset. + This is unique identifier of a Flocker dataset + type: string + type: object + gcePersistentDisk: + description: |- + gcePersistentDisk represents a GCE Disk resource that is attached to a + kubelet's host machine and then exposed to the pod. + More info: https://kubernetes.io/docs/concepts/storage/volumes#gcepersistentdisk + properties: + fsType: + description: |- + fsType is filesystem type of the volume that you want to mount. + Tip: Ensure that the filesystem type is supported by the host operating system. + Examples: "ext4", "xfs", "ntfs". Implicitly inferred to be "ext4" if unspecified. + More info: https://kubernetes.io/docs/concepts/storage/volumes#gcepersistentdisk + type: string + partition: + description: |- + partition is the partition in the volume that you want to mount. 
+ If omitted, the default is to mount by volume name. + Examples: For volume /dev/sda1, you specify the partition as "1". + Similarly, the volume partition for /dev/sda is "0" (or you can leave the property empty). + More info: https://kubernetes.io/docs/concepts/storage/volumes#gcepersistentdisk + format: int32 + type: integer + pdName: + description: |- + pdName is unique name of the PD resource in GCE. Used to identify the disk in GCE. + More info: https://kubernetes.io/docs/concepts/storage/volumes#gcepersistentdisk + type: string + readOnly: + description: |- + readOnly here will force the ReadOnly setting in VolumeMounts. + Defaults to false. + More info: https://kubernetes.io/docs/concepts/storage/volumes#gcepersistentdisk + type: boolean + required: + - pdName + type: object + gitRepo: + description: |- + gitRepo represents a git repository at a particular revision. + DEPRECATED: GitRepo is deprecated. To provision a container with a git repo, mount an + EmptyDir into an InitContainer that clones the repo using git, then mount the EmptyDir + into the Pod's container. + properties: + directory: + description: |- + directory is the target directory name. + Must not contain or start with '..'. If '.' is supplied, the volume directory will be the + git repository. Otherwise, if specified, the volume will contain the git repository in + the subdirectory with the given name. + type: string + repository: + description: repository is the URL + type: string + revision: + description: revision is the commit hash for the specified + revision. + type: string + required: + - repository + type: object + glusterfs: + description: |- + glusterfs represents a Glusterfs mount on the host that shares a pod's lifetime. + More info: https://examples.k8s.io/volumes/glusterfs/README.md + properties: + endpoints: + description: |- + endpoints is the endpoint name that details Glusterfs topology. 
+ More info: https://examples.k8s.io/volumes/glusterfs/README.md#create-a-pod + type: string + path: + description: |- + path is the Glusterfs volume path. + More info: https://examples.k8s.io/volumes/glusterfs/README.md#create-a-pod + type: string + readOnly: + description: |- + readOnly here will force the Glusterfs volume to be mounted with read-only permissions. + Defaults to false. + More info: https://examples.k8s.io/volumes/glusterfs/README.md#create-a-pod + type: boolean + required: + - endpoints + - path + type: object + hostPath: + description: |- + hostPath represents a pre-existing file or directory on the host + machine that is directly exposed to the container. This is generally + used for system agents or other privileged things that are allowed + to see the host machine. Most containers will NOT need this. + More info: https://kubernetes.io/docs/concepts/storage/volumes#hostpath + properties: + path: + description: |- + path of the directory on the host. + If the path is a symlink, it will follow the link to the real path. + More info: https://kubernetes.io/docs/concepts/storage/volumes#hostpath + type: string + type: + description: |- + type for HostPath Volume + Defaults to "" + More info: https://kubernetes.io/docs/concepts/storage/volumes#hostpath + type: string + required: + - path + type: object + iscsi: + description: |- + iscsi represents an ISCSI Disk resource that is attached to a + kubelet's host machine and then exposed to the pod. + More info: https://examples.k8s.io/volumes/iscsi/README.md + properties: + chapAuthDiscovery: + description: chapAuthDiscovery defines whether support + iSCSI Discovery CHAP authentication + type: boolean + chapAuthSession: + description: chapAuthSession defines whether support + iSCSI Session CHAP authentication + type: boolean + fsType: + description: |- + fsType is the filesystem type of the volume that you want to mount. + Tip: Ensure that the filesystem type is supported by the host operating system. 
+ Examples: "ext4", "xfs", "ntfs". Implicitly inferred to be "ext4" if unspecified. + More info: https://kubernetes.io/docs/concepts/storage/volumes#iscsi + type: string + initiatorName: + description: |- + initiatorName is the custom iSCSI Initiator Name. + If initiatorName is specified with iscsiInterface simultaneously, new iSCSI interface + <target portal>:<volume name> will be created for the connection. + type: string + iqn: + description: iqn is the target iSCSI Qualified Name. + type: string + iscsiInterface: + description: |- + iscsiInterface is the interface Name that uses an iSCSI transport. + Defaults to 'default' (tcp). + type: string + lun: + description: lun represents iSCSI Target Lun number. + format: int32 + type: integer + portals: + description: |- + portals is the iSCSI Target Portal List. The portal is either an IP or ip_addr:port if the port + is other than default (typically TCP ports 860 and 3260). + items: + type: string + type: array + readOnly: + description: |- + readOnly here will force the ReadOnly setting in VolumeMounts. + Defaults to false. + type: boolean + secretRef: + description: secretRef is the CHAP Secret for iSCSI + target and initiator authentication + properties: + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + type: object + x-kubernetes-map-type: atomic + targetPortal: + description: |- + targetPortal is iSCSI Target Portal. The Portal is either an IP or ip_addr:port if the port + is other than default (typically TCP ports 860 and 3260). + type: string + required: + - iqn + - lun + - targetPortal + type: object + name: + description: |- + name of the volume. + Must be a DNS_LABEL and unique within the pod. 
+ More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + nfs: + description: |- + nfs represents an NFS mount on the host that shares a pod's lifetime + More info: https://kubernetes.io/docs/concepts/storage/volumes#nfs + properties: + path: + description: |- + path that is exported by the NFS server. + More info: https://kubernetes.io/docs/concepts/storage/volumes#nfs + type: string + readOnly: + description: |- + readOnly here will force the NFS export to be mounted with read-only permissions. + Defaults to false. + More info: https://kubernetes.io/docs/concepts/storage/volumes#nfs + type: boolean + server: + description: |- + server is the hostname or IP address of the NFS server. + More info: https://kubernetes.io/docs/concepts/storage/volumes#nfs + type: string + required: + - path + - server + type: object + persistentVolumeClaim: + description: |- + persistentVolumeClaimVolumeSource represents a reference to a + PersistentVolumeClaim in the same namespace. + More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#persistentvolumeclaims + properties: + claimName: + description: |- + claimName is the name of a PersistentVolumeClaim in the same namespace as the pod using this volume. + More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#persistentvolumeclaims + type: string + readOnly: + description: |- + readOnly Will force the ReadOnly setting in VolumeMounts. + Default false. + type: boolean + required: + - claimName + type: object + photonPersistentDisk: + description: photonPersistentDisk represents a PhotonController + persistent disk attached and mounted on kubelets host + machine + properties: + fsType: + description: |- + fsType is the filesystem type to mount. + Must be a filesystem type supported by the host operating system. + Ex. "ext4", "xfs", "ntfs". Implicitly inferred to be "ext4" if unspecified. 
+ type: string + pdID: + description: pdID is the ID that identifies Photon Controller + persistent disk + type: string + required: + - pdID + type: object + portworxVolume: + description: portworxVolume represents a portworx volume + attached and mounted on kubelets host machine + properties: + fsType: + description: |- + fSType represents the filesystem type to mount + Must be a filesystem type supported by the host operating system. + Ex. "ext4", "xfs". Implicitly inferred to be "ext4" if unspecified. + type: string + readOnly: + description: |- + readOnly defaults to false (read/write). ReadOnly here will force + the ReadOnly setting in VolumeMounts. + type: boolean + volumeID: + description: volumeID uniquely identifies a Portworx + volume + type: string + required: + - volumeID + type: object + projected: + description: projected items for all in one resources secrets, + configmaps, and downward API + properties: + defaultMode: + description: |- + defaultMode are the mode bits used to set permissions on created files by default. + Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. + YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. + Directories within the path are not affected by this setting. + This might be in conflict with other options that affect the file + mode, like fsGroup, and the result can be other mode bits set. + format: int32 + type: integer + sources: + description: sources is the list of volume projections + items: + description: Projection that may be projected along + with other supported volume types + properties: + configMap: + description: configMap information about the configMap + data to project + properties: + items: + description: |- + items if unspecified, each key-value pair in the Data field of the referenced + ConfigMap will be projected into the volume as a file whose name is the + key and content is the value. 
If specified, the listed keys will be + projected into the specified paths, and unlisted keys will not be + present. If a key is specified which is not present in the ConfigMap, + the volume setup will error unless it is marked optional. Paths must be + relative and may not contain the '..' path or start with '..'. + items: + description: Maps a string key to a path + within a volume. + properties: + key: + description: key is the key to project. + type: string + mode: + description: |- + mode is Optional: mode bits used to set permissions on this file. + Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. + YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. + If not specified, the volume defaultMode will be used. + This might be in conflict with other options that affect the file + mode, like fsGroup, and the result can be other mode bits set. + format: int32 + type: integer + path: + description: |- + path is the relative path of the file to map the key to. + May not be an absolute path. + May not contain the path element '..'. + May not start with the string '..'. + type: string + required: + - key + - path + type: object + type: array + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + optional: + description: optional specify whether the + ConfigMap or its keys must be defined + type: boolean + type: object + x-kubernetes-map-type: atomic + downwardAPI: + description: downwardAPI information about the + downwardAPI data to project + properties: + items: + description: Items is a list of DownwardAPIVolume + file + items: + description: DownwardAPIVolumeFile represents + information to create the file containing + the pod field + properties: + fieldRef: + description: 'Required: Selects a field + of the pod: only annotations, labels, + name and namespace are supported.' 
+ properties: + apiVersion: + description: Version of the schema + the FieldPath is written in terms + of, defaults to "v1". + type: string + fieldPath: + description: Path of the field to + select in the specified API version. + type: string + required: + - fieldPath + type: object + x-kubernetes-map-type: atomic + mode: + description: |- + Optional: mode bits used to set permissions on this file, must be an octal value + between 0000 and 0777 or a decimal value between 0 and 511. + YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. + If not specified, the volume defaultMode will be used. + This might be in conflict with other options that affect the file + mode, like fsGroup, and the result can be other mode bits set. + format: int32 + type: integer + path: + description: 'Required: Path is the + relative path name of the file to + be created. Must not be absolute or + contain the ''..'' path. Must be utf-8 + encoded. The first item of the relative + path must not start with ''..''' + type: string + resourceFieldRef: + description: |- + Selects a resource of the container: only resources limits and requests + (limits.cpu, limits.memory, requests.cpu and requests.memory) are currently supported. 
+ properties: + containerName: + description: 'Container name: required + for volumes, optional for env + vars' + type: string + divisor: + anyOf: + - type: integer + - type: string + description: Specifies the output + format of the exposed resources, + defaults to "1" + pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$ + x-kubernetes-int-or-string: true + resource: + description: 'Required: resource + to select' + type: string + required: + - resource + type: object + x-kubernetes-map-type: atomic + required: + - path + type: object + type: array + type: object + secret: + description: secret information about the secret + data to project + properties: + items: + description: |- + items if unspecified, each key-value pair in the Data field of the referenced + Secret will be projected into the volume as a file whose name is the + key and content is the value. If specified, the listed keys will be + projected into the specified paths, and unlisted keys will not be + present. If a key is specified which is not present in the Secret, + the volume setup will error unless it is marked optional. Paths must be + relative and may not contain the '..' path or start with '..'. + items: + description: Maps a string key to a path + within a volume. + properties: + key: + description: key is the key to project. + type: string + mode: + description: |- + mode is Optional: mode bits used to set permissions on this file. + Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. + YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. + If not specified, the volume defaultMode will be used. + This might be in conflict with other options that affect the file + mode, like fsGroup, and the result can be other mode bits set. + format: int32 + type: integer + path: + description: |- + path is the relative path of the file to map the key to. 
+ May not be an absolute path. + May not contain the path element '..'. + May not start with the string '..'. + type: string + required: + - key + - path + type: object + type: array + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + optional: + description: optional field specify whether + the Secret or its key must be defined + type: boolean + type: object + x-kubernetes-map-type: atomic + serviceAccountToken: + description: serviceAccountToken is information + about the serviceAccountToken data to project + properties: + audience: + description: |- + audience is the intended audience of the token. A recipient of a token + must identify itself with an identifier specified in the audience of the + token, and otherwise should reject the token. The audience defaults to the + identifier of the apiserver. + type: string + expirationSeconds: + description: |- + expirationSeconds is the requested duration of validity of the service + account token. As the token approaches expiration, the kubelet volume + plugin will proactively rotate the service account token. The kubelet will + start trying to rotate the token if the token is older than 80 percent of + its time to live or if the token is older than 24 hours. Defaults to 1 hour + and must be at least 10 minutes. + format: int64 + type: integer + path: + description: |- + path is the path relative to the mount point of the file to project the + token into. + type: string + required: + - path + type: object + type: object + type: array + type: object + quobyte: + description: quobyte represents a Quobyte mount on the host + that shares a pod's lifetime + properties: + group: + description: |- + group to map volume access to + Default is no group + type: string + readOnly: + description: |- + readOnly here will force the Quobyte volume to be mounted with read-only permissions. + Defaults to false. 
+ type: boolean + registry: + description: |- + registry represents a single or multiple Quobyte Registry services + specified as a string as host:port pair (multiple entries are separated with commas) + which acts as the central registry for volumes + type: string + tenant: + description: |- + tenant owning the given Quobyte volume in the Backend + Used with dynamically provisioned Quobyte volumes, value is set by the plugin + type: string + user: + description: |- + user to map volume access to + Defaults to serviceaccount user + type: string + volume: + description: volume is a string that references an already + created Quobyte volume by name. + type: string + required: + - registry + - volume + type: object + rbd: + description: |- + rbd represents a Rados Block Device mount on the host that shares a pod's lifetime. + More info: https://examples.k8s.io/volumes/rbd/README.md + properties: + fsType: + description: |- + fsType is the filesystem type of the volume that you want to mount. + Tip: Ensure that the filesystem type is supported by the host operating system. + Examples: "ext4", "xfs", "ntfs". Implicitly inferred to be "ext4" if unspecified. + More info: https://kubernetes.io/docs/concepts/storage/volumes#rbd + type: string + image: + description: |- + image is the rados image name. + More info: https://examples.k8s.io/volumes/rbd/README.md#how-to-use-it + type: string + keyring: + description: |- + keyring is the path to key ring for RBDUser. + Default is /etc/ceph/keyring. + More info: https://examples.k8s.io/volumes/rbd/README.md#how-to-use-it + type: string + monitors: + description: |- + monitors is a collection of Ceph monitors. + More info: https://examples.k8s.io/volumes/rbd/README.md#how-to-use-it + items: + type: string + type: array + pool: + description: |- + pool is the rados pool name. + Default is rbd. 
+ More info: https://examples.k8s.io/volumes/rbd/README.md#how-to-use-it + type: string + readOnly: + description: |- + readOnly here will force the ReadOnly setting in VolumeMounts. + Defaults to false. + More info: https://examples.k8s.io/volumes/rbd/README.md#how-to-use-it + type: boolean + secretRef: + description: |- + secretRef is name of the authentication secret for RBDUser. If provided + overrides keyring. + Default is nil. + More info: https://examples.k8s.io/volumes/rbd/README.md#how-to-use-it + properties: + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + type: object + x-kubernetes-map-type: atomic + user: + description: |- + user is the rados user name. + Default is admin. + More info: https://examples.k8s.io/volumes/rbd/README.md#how-to-use-it + type: string + required: + - image + - monitors + type: object + scaleIO: + description: scaleIO represents a ScaleIO persistent volume + attached and mounted on Kubernetes nodes. + properties: + fsType: + description: |- + fsType is the filesystem type to mount. + Must be a filesystem type supported by the host operating system. + Ex. "ext4", "xfs", "ntfs". + Default is "xfs". + type: string + gateway: + description: gateway is the host address of the ScaleIO + API Gateway. + type: string + protectionDomain: + description: protectionDomain is the name of the ScaleIO + Protection Domain for the configured storage. + type: string + readOnly: + description: |- + readOnly Defaults to false (read/write). ReadOnly here will force + the ReadOnly setting in VolumeMounts. + type: boolean + secretRef: + description: |- + secretRef references to the secret for ScaleIO user and other + sensitive information. If this is not provided, Login operation will fail. + properties: + name: + description: |- + Name of the referent. 
+ More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + type: object + x-kubernetes-map-type: atomic + sslEnabled: + description: sslEnabled Flag enable/disable SSL communication + with Gateway, default false + type: boolean + storageMode: + description: |- + storageMode indicates whether the storage for a volume should be ThickProvisioned or ThinProvisioned. + Default is ThinProvisioned. + type: string + storagePool: + description: storagePool is the ScaleIO Storage Pool + associated with the protection domain. + type: string + system: + description: system is the name of the storage system + as configured in ScaleIO. + type: string + volumeName: + description: |- + volumeName is the name of a volume already created in the ScaleIO system + that is associated with this volume source. + type: string + required: + - gateway + - secretRef + - system + type: object + secret: + description: |- + secret represents a secret that should populate this volume. + More info: https://kubernetes.io/docs/concepts/storage/volumes#secret + properties: + defaultMode: + description: |- + defaultMode is Optional: mode bits used to set permissions on created files by default. + Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. + YAML accepts both octal and decimal values, JSON requires decimal values + for mode bits. Defaults to 0644. + Directories within the path are not affected by this setting. + This might be in conflict with other options that affect the file + mode, like fsGroup, and the result can be other mode bits set. + format: int32 + type: integer + items: + description: |- + items If unspecified, each key-value pair in the Data field of the referenced + Secret will be projected into the volume as a file whose name is the + key and content is the value. If specified, the listed keys will be + projected into the specified paths, and unlisted keys will not be + present. 
If a key is specified which is not present in the Secret, + the volume setup will error unless it is marked optional. Paths must be + relative and may not contain the '..' path or start with '..'. + items: + description: Maps a string key to a path within a + volume. + properties: + key: + description: key is the key to project. + type: string + mode: + description: |- + mode is Optional: mode bits used to set permissions on this file. + Must be an octal value between 0000 and 0777 or a decimal value between 0 and 511. + YAML accepts both octal and decimal values, JSON requires decimal values for mode bits. + If not specified, the volume defaultMode will be used. + This might be in conflict with other options that affect the file + mode, like fsGroup, and the result can be other mode bits set. + format: int32 + type: integer + path: + description: |- + path is the relative path of the file to map the key to. + May not be an absolute path. + May not contain the path element '..'. + May not start with the string '..'. + type: string + required: + - key + - path + type: object + type: array + optional: + description: optional field specify whether the Secret + or its keys must be defined + type: boolean + secretName: + description: |- + secretName is the name of the secret in the pod's namespace to use. + More info: https://kubernetes.io/docs/concepts/storage/volumes#secret + type: string + type: object + storageos: + description: storageOS represents a StorageOS volume attached + and mounted on Kubernetes nodes. + properties: + fsType: + description: |- + fsType is the filesystem type to mount. + Must be a filesystem type supported by the host operating system. + Ex. "ext4", "xfs", "ntfs". Implicitly inferred to be "ext4" if unspecified. + type: string + readOnly: + description: |- + readOnly defaults to false (read/write). ReadOnly here will force + the ReadOnly setting in VolumeMounts. 
+ type: boolean + secretRef: + description: |- + secretRef specifies the secret to use for obtaining the StorageOS API + credentials. If not specified, default values will be attempted. + properties: + name: + description: |- + Name of the referent. + More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names + type: string + type: object + x-kubernetes-map-type: atomic + volumeName: + description: |- + volumeName is the human-readable name of the StorageOS volume. Volume + names are only unique within a namespace. + type: string + volumeNamespace: + description: |- + volumeNamespace specifies the scope of the volume within StorageOS. If no + namespace is specified then the Pod's namespace will be used. This allows the + Kubernetes name scoping to be mirrored within StorageOS for tighter integration. + Set VolumeName to any name to override the default behaviour. + Set to "default" if you are not using namespaces within StorageOS. + Namespaces that do not pre-exist within StorageOS will be created. + type: string + type: object + vsphereVolume: + description: vsphereVolume represents a vSphere volume attached + and mounted on kubelets host machine + properties: + fsType: + description: |- + fsType is filesystem type to mount. + Must be a filesystem type supported by the host operating system. + Ex. "ext4", "xfs", "ntfs". Implicitly inferred to be "ext4" if unspecified. + type: string + storagePolicyID: + description: storagePolicyID is the storage Policy Based + Management (SPBM) profile ID associated with the StoragePolicyName. + type: string + storagePolicyName: + description: storagePolicyName is the storage Policy + Based Management (SPBM) profile name. 
+ type: string + volumePath: + description: volumePath is the path that identifies + vSphere volume vmdk + type: string + required: + - volumePath + type: object + required: + - name + type: object + type: array + type: object + suspend: + description: Suspend keeps the job but without pods. This is intended + to be used by the Kueue integration + type: boolean + taskList: + description: Evaluation task list + properties: + taskNames: + description: TaskNames from lm-eval's task list + items: + type: string + type: array + taskRecipes: + description: Task Recipes specifically for Unitxt + items: + description: |- + Use a task recipe to form a custom task. It maps to the Unitxt Recipe + Find details of the Unitxt Recipe here: + https://www.unitxt.ai/en/latest/unitxt.standard.html#unitxt.standard.StandardRecipe + properties: + card: + description: The Unitxt dataset card + properties: + custom: + description: |- + A JSON string for a custom unitxt card which contains the custom dataset. + Use the documentation here: https://www.unitxt.ai/en/latest/docs/adding_dataset.html#adding-to-the-catalog + to compose a custom card, store it as a JSON file, and use the JSON content as the value here. 
+ type: string + name: + description: Unitxt card's ID + type: string + type: object + demosPoolSize: + description: The pool size for the fewshot + type: integer + format: + description: The Unitxt format + type: string + loaderLimit: + description: A limit number of records to load + type: integer + metrics: + description: Metrics + items: + type: string + type: array + numDemos: + description: Number of fewshot + type: integer + task: + description: The Unitxt Task + type: string + template: + description: The Unitxt template + type: string + required: + - card + - template + type: object + type: array + type: object + required: + - model + - taskList + type: object + status: + description: LMEvalJobStatus defines the observed state of LMEvalJob + properties: + completeTime: + description: Information when the job's state changes to Complete. + format: date-time + type: string + lastScheduleTime: + description: Information when was the last time the job was successfully + scheduled. + format: date-time + type: string + message: + description: Message about the current/final status + type: string + podName: + description: The name of the Pod that runs the evaluation job + type: string + reason: + description: Final result of the job + enum: + - NoReason + - Succeeded + - Failed + - Cancelled + type: string + results: + description: Evaluation results + type: string + state: + description: State of the job + enum: + - New + - Scheduled + - Running + - Complete + - Cancelled + - Suspended + type: string + type: object + type: object + served: true + storage: true + subresources: + status: {} diff --git a/config/crd/bases/trustyai.opendatahub.io_trustyaiservices.yaml b/config/crd/bases/trustyai.opendatahub.io_trustyaiservices.yaml index 076a8082..6d20eda9 100644 --- a/config/crd/bases/trustyai.opendatahub.io_trustyaiservices.yaml +++ b/config/crd/bases/trustyai.opendatahub.io_trustyaiservices.yaml @@ -3,8 +3,7 @@ apiVersion: apiextensions.k8s.io/v1 kind: 
CustomResourceDefinition metadata: annotations: - controller-gen.kubebuilder.io/version: v0.11.1 - creationTimestamp: null + controller-gen.kubebuilder.io/version: v0.16.3 name: trustyaiservices.trustyai.opendatahub.io spec: group: trustyai.opendatahub.io @@ -21,14 +20,19 @@ spec: description: TrustyAIService is the Schema for the trustyaiservices API properties: apiVersion: - description: 'APIVersion defines the versioned schema of this representation - of an object. Servers should convert recognized schemas to the latest - internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources' + description: |- + APIVersion defines the versioned schema of this representation of an object. + Servers should convert recognized schemas to the latest internal value, and + may reject unrecognized values. + More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources type: string kind: - description: 'Kind is a string value representing the REST resource this - object represents. Servers may infer this from the endpoint the client - submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds' + description: |- + Kind is a string value representing the REST resource this object represents. + Servers may infer this from the endpoint the client submits requests to. + Cannot be updated. + In CamelCase. 
+ More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds type: string metadata: type: object diff --git a/config/crd/kustomization.yaml b/config/crd/kustomization.yaml index b29e3781..4da9af27 100644 --- a/config/crd/kustomization.yaml +++ b/config/crd/kustomization.yaml @@ -1,5 +1,6 @@ resources: - bases/trustyai.opendatahub.io_trustyaiservices.yaml + - bases/trustyai.opendatahub.io_lmevaljobs.yaml #+kubebuilder:scaffold:crdkustomizeresource patchesStrategicMerge: diff --git a/config/manager/manager.yaml b/config/manager/manager.yaml index dfd88a06..b3b31377 100644 --- a/config/manager/manager.yaml +++ b/config/manager/manager.yaml @@ -32,6 +32,8 @@ spec: - /manager args: - --leader-elect + - --enable-services + - "TAS,LMES" image: $(trustyaiOperatorImage) name: manager securityContext: diff --git a/config/overlays/lmes/kustomization.yaml b/config/overlays/lmes/kustomization.yaml new file mode 100644 index 00000000..216c929f --- /dev/null +++ b/config/overlays/lmes/kustomization.yaml @@ -0,0 +1,8 @@ +--- +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization +resources: + - ../../base + +patchesStrategicMerge: + - lmes-only-patch.yaml diff --git a/config/overlays/lmes/lmes-only-patch.yaml b/config/overlays/lmes/lmes-only-patch.yaml new file mode 100644 index 00000000..5b0466ff --- /dev/null +++ b/config/overlays/lmes/lmes-only-patch.yaml @@ -0,0 +1,14 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: controller-manager + namespace: system +spec: + template: + spec: + containers: + - name: manager + args: + - --leader-elect + - --enable-services + - "LMES" diff --git a/config/overlays/odh/params.env b/config/overlays/odh/params.env index 4a9c77e0..f7ff45d0 100644 --- a/config/overlays/odh/params.env +++ b/config/overlays/odh/params.env @@ -2,3 +2,10 @@ trustyaiServiceImage=quay.io/trustyai/trustyai-service:latest trustyaiOperatorImage=quay.io/trustyai/trustyai-service-operator:latest 
oauthProxyImage=quay.io/openshift/origin-oauth-proxy:4.14.0 kServeServerless=enabled +lmes-driver-image=quay.io/trustyai/ta-lmes-driver:latest +lmes-pod-image=quay.io/trustyai/ta-lmes-job:latest +lmes-pod-checking-interval=10s +lmes-image-pull-policy=Always +lmes-max-batch-size=24 +lmes-default-batch-size=8 +lmes-detect-device=true diff --git a/config/overlays/rhoai/kustomization.yaml b/config/overlays/rhoai/kustomization.yaml index d4b4a2a9..27ce6516 100644 --- a/config/overlays/rhoai/kustomization.yaml +++ b/config/overlays/rhoai/kustomization.yaml @@ -3,6 +3,10 @@ apiVersion: kustomize.config.k8s.io/v1beta1 kind: Kustomization resources: - ../../base + +patchesStrategicMerge: + - tas-only-patch.yaml + configMapGenerator: - env: params.env behavior: merge diff --git a/config/overlays/rhoai/params.env b/config/overlays/rhoai/params.env index c9cb9c37..4f99c2a2 100644 --- a/config/overlays/rhoai/params.env +++ b/config/overlays/rhoai/params.env @@ -1,4 +1,11 @@ trustyaiServiceImage=quay.io/trustyai/trustyai-service:latest trustyaiOperatorImage=quay.io/trustyai/trustyai-service-operator:latest oauthProxyImage=registry.redhat.io/openshift4/ose-oauth-proxy@sha256:ab112105ac37352a2a4916a39d6736f5db6ab4c29bad4467de8d613e80e9bb33 -kServeServerless=enabled \ No newline at end of file +kServeServerless=enabled +lmes-driver-image=quay.io/trustyai/ta-lmes-driver:latest +lmes-pod-image=quay.io/trustyai/ta-lmes-job:latest +lmes-pod-checking-interval=10s +lmes-image-pull-policy=Always +lmes-max-batch-size=24 +lmes-default-batch-size=8 +lmes-detect-device=true diff --git a/config/overlays/rhoai/tas-only-patch.yaml b/config/overlays/rhoai/tas-only-patch.yaml new file mode 100644 index 00000000..562b942c --- /dev/null +++ b/config/overlays/rhoai/tas-only-patch.yaml @@ -0,0 +1,14 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: controller-manager + namespace: system +spec: + template: + spec: + containers: + - name: manager + args: + - --leader-elect + - 
--enable-services + - "TAS" diff --git a/config/rbac/role.yaml b/config/rbac/role.yaml index be515044..38d54ff1 100644 --- a/config/rbac/role.yaml +++ b/config/rbac/role.yaml @@ -2,13 +2,16 @@ apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: - creationTimestamp: null name: manager-role rules: - apiGroups: - "" resources: - configmaps + - persistentvolumeclaims + - pods + - secrets + - services verbs: - create - delete @@ -28,26 +31,20 @@ rules: - apiGroups: - "" resources: - - pods + - persistentvolumes verbs: - - create - - delete - get - list - - patch - - update - watch - apiGroups: - "" resources: - - secrets + - pods/exec verbs: - create - delete - get - list - - patch - - update - watch - apiGroups: - "" @@ -102,38 +99,6 @@ rules: - create - get - update -- apiGroups: - - "" - resources: - - persistentvolumeclaims - verbs: - - create - - delete - - get - - list - - patch - - update - - watch -- apiGroups: - - "" - resources: - - persistentvolumes - verbs: - - get - - list - - watch -- apiGroups: - - "" - resources: - - services - verbs: - - create - - delete - - get - - list - - patch - - update - - watch - apiGroups: - monitoring.coreos.com resources: @@ -146,17 +111,6 @@ rules: - networking.istio.io resources: - destinationrules - verbs: - - create - - delete - - get - - list - - patch - - update - - watch -- apiGroups: - - networking.istio.io - resources: - virtualservices verbs: - create @@ -233,6 +187,7 @@ rules: - apiGroups: - trustyai.opendatahub.io resources: + - lmevaljobs - trustyaiservices verbs: - create @@ -245,12 +200,14 @@ rules: - apiGroups: - trustyai.opendatahub.io resources: + - lmevaljobs/finalizers - trustyaiservices/finalizers verbs: - update - apiGroups: - trustyai.opendatahub.io resources: + - lmevaljobs/status - trustyaiservices/status verbs: - get diff --git a/controllers/constants/version.go b/controllers/constants/version.go new file mode 100644 index 00000000..b4646084 --- /dev/null +++ 
b/controllers/constants/version.go @@ -0,0 +1,6 @@ +package constants + +const ( + Version = "1.17.0" + ConfigMap = "trustyai-service-operator-config" +) diff --git a/controllers/controllers.go b/controllers/controllers.go new file mode 100644 index 00000000..c7fee122 --- /dev/null +++ b/controllers/controllers.go @@ -0,0 +1,81 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +package controllers + +import ( + "errors" + "fmt" + "slices" + "strings" + + "k8s.io/client-go/tools/record" + "sigs.k8s.io/controller-runtime/pkg/manager" +) + +// to set up a controller. may include webhook or not +type ControllerSetupFunc func(mgr manager.Manager, ns, configmap string, recorder record.EventRecorder) error + +var ( + // to store all controllers and their set up function + TasServices = map[string]ControllerSetupFunc{} + // convenient list to store all registered services + AllTasServices = []string{} +) + +type EnabledServices []string + +// register a service. it's a private function for now. +// add a file in the same folder to call this function. 
+func registerService(name string, setupf ControllerSetupFunc) { + TasServices[name] = setupf + AllTasServices = append(AllTasServices, name) +} + +func SetupControllers(enabledServices []string, mgr manager.Manager, ns, configmap string, recorder record.EventRecorder) error { + var errs []error + for _, service := range enabledServices { + errs = append(errs, TasServices[service](mgr, ns, configmap, recorder)) + } + return errors.Join(errs...) +} + +func (es *EnabledServices) Set(services string) error { + for _, service := range strings.Split(services, ",") { + if slices.Contains(*es, service) { + return fmt.Errorf("specify the same service twice: %s", service) + } + if _, ok := TasServices[service]; ok { + *es = append(*es, service) + } else { + return fmt.Errorf( + "service %s is not supported. available services: %s", + service, + strings.Join(AllTasServices, ","), + ) + } + } + + return nil +} + +func (es *EnabledServices) Empty() bool { + return len(*es) == 0 +} + +func (es *EnabledServices) String() string { + return strings.Join(*es, ",") +} diff --git a/controllers/lmes.go b/controllers/lmes.go new file mode 100644 index 00000000..a179d74f --- /dev/null +++ b/controllers/lmes.go @@ -0,0 +1,23 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+*/ + +package controllers + +import "github.com/trustyai-explainability/trustyai-service-operator/controllers/lmes" + +func init() { + registerService(lmes.ServiceName, lmes.ControllerSetUp) +} diff --git a/controllers/lmes/config.go b/controllers/lmes/config.go new file mode 100644 index 00000000..53b47002 --- /dev/null +++ b/controllers/lmes/config.go @@ -0,0 +1,108 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +package lmes + +import ( + "fmt" + "reflect" + "strconv" + "strings" + "time" + + "github.com/go-logr/logr" + corev1 "k8s.io/api/core/v1" +) + +var options *serviceOptions = &serviceOptions{ + DriverImage: DefaultDriverImage, + PodImage: DefaultPodImage, + PodCheckingInterval: DefaultPodCheckingInterval, + ImagePullPolicy: DefaultImagePullPolicy, + MaxBatchSize: DefaultMaxBatchSize, + DetectDevice: DefaultDetectDevice, + DefaultBatchSize: DefaultBatchSize, +} + +type serviceOptions struct { + PodImage string + DriverImage string + PodCheckingInterval time.Duration + ImagePullPolicy corev1.PullPolicy + MaxBatchSize int + DefaultBatchSize int + DetectDevice bool +} + +func constructOptionsFromConfigMap(log *logr.Logger, configmap *corev1.ConfigMap) error { + + rv := reflect.ValueOf(options).Elem() + var msgs []string + + for idx, cap := 0, rv.NumField(); idx < cap; idx++ { + frv := rv.Field(idx) + fname := rv.Type().Field(idx).Name + configKey, ok := optionKeys[fname] + if !ok { + continue + } + + if v, found := configmap.Data[configKey]; found { + 
var err error + switch frv.Type().Name() { + case "string": + frv.SetString(v) + case "bool": + val, err := strconv.ParseBool(v) + if err != nil { + val = DefaultDetectDevice + msgs = append(msgs, fmt.Sprintf("invalid setting for %v: %v, use default setting instead", optionKeys[fname], val)) + } + frv.SetBool(val) + case "int": + var intVal int + intVal, err = strconv.Atoi(v) + if err == nil { + frv.SetInt(int64(intVal)) + } + case "Duration": + var d time.Duration + d, err = time.ParseDuration(v) + if err == nil { + frv.Set(reflect.ValueOf(d)) + } + case "PullPolicy": + if p, found := pullPolicyMap[corev1.PullPolicy(v)]; found { + frv.Set(reflect.ValueOf(p)) + } else { + err = fmt.Errorf("invalid PullPolicy") + } + default: + return fmt.Errorf("can not handle the config %v, type: %v", optionKeys[fname], frv.Type().Name()) + } + + if err != nil { + msgs = append(msgs, fmt.Sprintf("invalid setting for %v: %v, use default setting instead", optionKeys[fname], v)) + } + } + } + + if len(msgs) > 0 && log != nil { + log.Error(fmt.Errorf("some settings in the configmap are invalid"), strings.Join(msgs, "\n")) + } + + return nil +} diff --git a/controllers/lmes/constants.go b/controllers/lmes/constants.go new file mode 100644 index 00000000..ec2c5944 --- /dev/null +++ b/controllers/lmes/constants.go @@ -0,0 +1,43 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+*/ + +package lmes + +import ( + "time" + + corev1 "k8s.io/api/core/v1" +) + +const ( + DriverPath = "/bin/driver" + DestDriverPath = "/opt/app-root/src/bin/driver" + PodImageKey = "lmes-pod-image" + DriverImageKey = "lmes-driver-image" + PodCheckingIntervalKey = "lmes-pod-checking-interval" + ImagePullPolicyKey = "lmes-image-pull-policy" + MaxBatchSizeKey = "lmes-max-batch-size" + DefaultBatchSizeKey = "lmes-default-batch-size" + DetectDeviceKey = "lmes-detect-device" + DefaultPodImage = "quay.io/trustyai/ta-lmes-job:latest" + DefaultDriverImage = "quay.io/trustyai/ta-lmes-driver:latest" + DefaultPodCheckingInterval = time.Second * 10 + DefaultImagePullPolicy = corev1.PullAlways + DefaultMaxBatchSize = 24 + DefaultBatchSize = 8 + DefaultDetectDevice = true + ServiceName = "LMES" +) diff --git a/controllers/lmes/driver/driver.go b/controllers/lmes/driver/driver.go new file mode 100644 index 00000000..2c10add8 --- /dev/null +++ b/controllers/lmes/driver/driver.go @@ -0,0 +1,469 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+*/ + +package driver + +import ( + "bufio" + "context" + "encoding/json" + "fmt" + "io" + "io/fs" + "net" + "net/http" + "os" + "os/exec" + "path/filepath" + "regexp" + "strings" + "sync" + "unicode" + + "github.com/go-logr/logr" + lmesv1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/lmes/v1alpha1" +) + +var ( + progressMegPattern = regexp.MustCompile(`^(.*?:\s*?\d*?%)\|`) +) + +const ( + // put the domain socket under /tmp. may move to emptydir to share across containers + socketPath = "/tmp/ta-lmes-driver.sock" + DefaultTaskRecipesPath = "/opt/app-root/src/my_tasks" + DefaultCatalogPath = "/opt/app-root/src/my_catalogs" + TaskRecipePrefix = "tr" + CustomCardPrefix = "custom" + ShutdownURI = "/Shutdown" + GetStatusURI = "/GetStatus" +) + +type DriverOption struct { + Context context.Context + OutputPath string + DetectDevice bool + TaskRecipesPath string + TaskRecipes []string + CatalogPath string + CustomCards []string + Logger logr.Logger + Args []string + SocketPath string +} + +type Driver interface { + Run() error + GetStatus() (*lmesv1alpha1.LMEvalJobStatus, error) + Shutdown() error +} + +// the communication server that is used by the driverImpl to +// send and receive messages using a domain socket +type driverComm struct { + connection chan int + server *http.Server + path string +} + +type driverImpl struct { + Option *DriverOption + lastProgressMsg string + status lmesv1alpha1.LMEvalJobStatus + err error + comm *driverComm +} + +func NewDriver(opt *DriverOption) (Driver, error) { + if opt == nil { + return nil, nil + } + + if opt.Context == nil { + return nil, fmt.Errorf("context is nil") + } + + if opt.TaskRecipesPath == "" { + opt.TaskRecipesPath = DefaultTaskRecipesPath + } + + if opt.CatalogPath == "" { + opt.CatalogPath = DefaultCatalogPath + } + + if opt.SocketPath == "" { + opt.SocketPath = socketPath + } + + return &driverImpl{ + Option: opt, + }, nil +} + +// Run implements Driver. 
+func (d *driverImpl) Run() error { + d.updateStatus(lmesv1alpha1.RunningJobState, lmesv1alpha1.NoReason, "initializing the evaluation job") + + if err := d.setupComm(); err != nil { + d.err = err + return d.err + } + + execErr := d.exec() + + // dump stderr and stdout to the console + var toConsole = func(file string) { + if data, err := os.ReadFile(file); err == nil { + os.Stdout.Write(data) + } + } + toConsole(filepath.Join(d.Option.OutputPath, "stdout.log")) + toConsole(filepath.Join(d.Option.OutputPath, "stderr.log")) + + d.updateCompleteStatus(execErr) + + // wait for shutdown signal then properly clean up the resources + d.comm.wait4Sutdownload() + d.comm.close() + return d.err +} + +func (d *driverImpl) GetStatus() (*lmesv1alpha1.LMEvalJobStatus, error) { + client := createClient(d.Option.SocketPath) + resp, err := client.Get(fmt.Sprintf("http://unix%s", GetStatusURI)) + if err != nil { + return nil, err + } + defer resp.Body.Close() + content, err := io.ReadAll(resp.Body) + if err != nil { + return nil, err + } + + var status lmesv1alpha1.LMEvalJobStatus + err = json.Unmarshal(content, &status) + + return &status, err +} +func (d *driverImpl) Shutdown() error { + client := createClient(d.Option.SocketPath) + resp, err := client.Post(fmt.Sprintf("http://unix%s", ShutdownURI), "application/json", nil) + if err != nil { + return err + } + + defer resp.Body.Close() + _, err = io.ReadAll(resp.Body) + return err +} + +func (d *driverImpl) detectDevice() error { + if d == nil || !d.Option.DetectDevice { + return nil + } + + // assuming python and torch python package are available. 
+ // use torch python API to detect CUDA's availability + out, err := exec.Command( + "python", + "-c", + "import torch; print('=={}:{}=='.format(torch.cuda.is_available(), torch.cuda.device_count()));", + ).Output() + if err != nil { + return fmt.Errorf("failed to detect available device(s): %v", err) + } + + re := regexp.MustCompile(`(?m)^==(True|False):(\d+?)==$`) + matches := re.FindStringSubmatch(string(out)) + if matches == nil { + return fmt.Errorf("failed to find the matched output") + } + + patchDevice(d.Option.Args, matches[1] == "True") + + return nil +} + +func patchDevice(args []string, hasCuda bool) { + var device = "cpu" + if hasCuda { + device = "cuda" + } + // patch the python command in the Option.Arg by adding the `--device cuda` option + // find the string with the `python -m lm_eval` prefix. usually it should be the last one + for idx, arg := range args { + if strings.HasPrefix(arg, "python -m lm_eval") { + if !strings.Contains(arg, "--device") { + args[idx] = fmt.Sprintf("%s --device %s", arg, device) + } + break + } + } +} + +// Create a domain socket and use HTTP protocol to handle communication +func (d *driverImpl) setupComm() error { + + serve := http.NewServeMux() + d.comm = &driverComm{ + server: &http.Server{Handler: serve}, + connection: make(chan int), + path: d.Option.SocketPath, + } + + // handle the `GetStatus` API: return the complete lmesv1alpha1.LMEvalJobStatus + // or error if the JSON marshaling fails. + serve.HandleFunc(GetStatusURI, func(w http.ResponseWriter, _ *http.Request) { + status, err := json.Marshal(d.status) + if err == nil { + w.Write(status) + } else { + w.Write([]byte(fmt.Sprintf(`{"err": "%s"}`, err.Error()))) + } + }) + + // handle the `Shutdown` API: tear down the communication server. 
+ serve.HandleFunc(ShutdownURI, func(w http.ResponseWriter, _ *http.Request) { + w.Write([]byte(`{"msg": "ok"}`)) + d.comm.notifyShutdownWait() + }) + + go func() { + d.comm.serve() + }() + + return nil +} + +func createClient(path string) *http.Client { + return &http.Client{ + Transport: &http.Transport{ + DialContext: func(_ context.Context, _, _ string) (net.Conn, error) { + return net.Dial("unix", path) + }, + }, + } +} + +func (dc *driverComm) wait4Sutdownload() { + <-dc.connection +} + +func (dc *driverComm) serve() error { + socket, err := net.Listen("unix", dc.path) + if err != nil { + return err + } + + return dc.server.Serve(socket) +} + +func (dc *driverComm) close() { + if dc.server != nil && dc.connection != nil { + dc.server.Shutdown(context.Background()) + close(dc.connection) + os.Remove(dc.path) + } +} + +func (dc *driverComm) notifyShutdownWait() { + dc.connection <- 1 +} + +func (d *driverImpl) exec() error { + // create Unitxt task recipes + if err := d.createTaskRecipes(); err != nil { + return fmt.Errorf("failed to create task recipes: %v", err) + } + + if err := d.createCustomCards(); err != nil { + return fmt.Errorf("failed to create custom cards: %v", err) + } + + // Detect available devices if needed + if err := d.detectDevice(); err != nil { + return err + } + + // Run user program. + var args []string + if len(d.Option.Args) > 1 { + args = d.Option.Args[1:] + } + + stdout, err := os.Create(filepath.Join(d.Option.OutputPath, "stdout.log")) + if err != nil { + return err + } + + stderr, err := os.Create(filepath.Join(d.Option.OutputPath, "stderr.log")) + if err != nil { + return err + } + + // have a pipe to check the output and report progress + // lm-eval's outputs are in the stderr + pr, pw := io.Pipe() + mwriter := io.MultiWriter(stderr, pw) + scanner := bufio.NewScanner(pr) + + executor := exec.Command(d.Option.Args[0], args...) 
+ stdin, err := executor.StdinPipe() + if err != nil { + return err + } + executor.Stdout = stdout + executor.Stderr = mwriter + executor.Env = append(os.Environ(), + "UNITXT_ALLOW_UNVERIFIED_CODE=True", + ) + + var freeRes = func() { + stdin.Close() + stdout.Sync() + stdout.Close() + stderr.Sync() + stderr.Close() + pr.Close() + } + + // temporarily fix the trust_remote_code issue + io.WriteString(stdin, "y\n") + if err := executor.Start(); err != nil { + freeRes() + return err + } + + var wg sync.WaitGroup + wg.Add(1) + go func() { + for scanner.Scan() { + msg := scanner.Text() + d.updateProgress(msg) + } + wg.Done() + }() + + finalError := executor.Wait() + freeRes() + wg.Wait() + return finalError +} + +func (d *driverImpl) updateCompleteStatus(err error) { + d.status.State = lmesv1alpha1.CompleteJobState + d.status.Reason = lmesv1alpha1.SucceedReason + d.status.Message = "job completed" + + if err == nil { + var results string + results, err = d.getResults() + d.status.Results = results + } + + if err != nil { + d.status.Reason = lmesv1alpha1.FailedReason + d.status.Message = err.Error() + d.err = err + } + + d.Option.Logger.Info("update status: job completed", "state", d.status) +} + +func (d *driverImpl) updateStatus(state lmesv1alpha1.JobState, reason lmesv1alpha1.Reason, msg string) { + d.status.State = state + d.status.Reason = reason + d.status.Message = msg +} + +func (d *driverImpl) getResults() (string, error) { + var results string + pattern := "*result*.json" + if err := filepath.WalkDir(d.Option.OutputPath, func(path string, dir fs.DirEntry, err error) error { + if err != nil { + return err + } + + if matched, _ := filepath.Match(pattern, filepath.Base(path)); matched { + bytes, err := os.ReadFile(path) + if err != nil { + d.Option.Logger.Error(err, "failed to retrieve the results") + } else { + results = string(bytes) + } + } + return nil + }); err != nil { + return "", err + } + + return results, nil +} + +func (d *driverImpl) updateProgress(msg
string) { + msg = strings.Map(func(r rune) rune { + if unicode.IsPrint(r) { + return r + } + // replace control chars with a newline + if unicode.IsControl(r) { + return 10 + } + return -1 + }, msg) + + // split into lines and only use the last one + msglist := strings.Split(msg, "\n") + + if matches := progressMegPattern.FindStringSubmatch(msglist[len(msglist)-1]); len(matches) == 2 { + if matches[1] != d.lastProgressMsg { + d.lastProgressMsg = strings.Trim(matches[1], " \r") + d.updateStatus(lmesv1alpha1.RunningJobState, lmesv1alpha1.NoReason, d.lastProgressMsg) + } + } +} + +func (d *driverImpl) createTaskRecipes() error { + for i, taskRecipe := range d.Option.TaskRecipes { + err := os.WriteFile( + filepath.Join(d.Option.TaskRecipesPath, fmt.Sprintf("%s_%d.yaml", TaskRecipePrefix, i)), + []byte(fmt.Sprintf( + "task: %s\ninclude: unitxt\nrecipe: %s", + fmt.Sprintf("%s_%d", TaskRecipePrefix, i), + taskRecipe, + )), + 0666, + ) + if err != nil { + return err + } + } + return nil +} + +func (d *driverImpl) createCustomCards() error { + for i, customCard := range d.Option.CustomCards { + err := os.WriteFile( + filepath.Join(d.Option.CatalogPath, "cards", fmt.Sprintf("%s_%d.json", CustomCardPrefix, i)), + []byte(customCard), + 0666, + ) + if err != nil { + return err + } + } + return nil +} diff --git a/controllers/lmes/driver/driver_test.go b/controllers/lmes/driver/driver_test.go new file mode 100644 index 00000000..4d0485f8 --- /dev/null +++ b/controllers/lmes/driver/driver_test.go @@ -0,0 +1,317 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and +limitations under the License. +*/ + +package driver + +import ( + "context" + "crypto/rand" + "flag" + "fmt" + "os" + "testing" + "time" + + "github.com/stretchr/testify/assert" + "github.com/trustyai-explainability/trustyai-service-operator/api/lmes/v1alpha1" + ctrl "sigs.k8s.io/controller-runtime" + "sigs.k8s.io/controller-runtime/pkg/log" + "sigs.k8s.io/controller-runtime/pkg/log/zap" +) + +var ( + driverLog = ctrl.Log.WithName("driver-test") +) + +func TestMain(m *testing.M) { + opts := zap.Options{ + Development: true, + } + opts.BindFlags(flag.CommandLine) + + flag.Parse() + log.SetLogger(zap.New(zap.UseFlagOptions(&opts))) + m.Run() +} + +func genRandomSocketPath() string { + b := make([]byte, 10) + rand.Read(b) + p := fmt.Sprintf("/tmp/ta-lmes-%x.sock", b) + return p +} + +func runDriverAndWait4Complete(t *testing.T, driver Driver, returnError bool) (progressMsgs []string, results string) { + go func() { + if returnError { + assert.NotNil(t, driver.Run()) + } else { + assert.Nil(t, driver.Run()) + } + }() + + for { + time.Sleep(time.Second) + status, err := driver.GetStatus() + assert.Nil(t, err) + if len(progressMsgs) == 0 || progressMsgs[len(progressMsgs)-1] != status.Message { + progressMsgs = append(progressMsgs, status.Message) + } + if status.State == v1alpha1.CompleteJobState { + results = status.Results + break + } + } + return progressMsgs, results +} + +func Test_Driver(t *testing.T) { + driver, err := NewDriver(&DriverOption{ + Context: context.Background(), + OutputPath: ".", + Logger: driverLog, + Args: []string{"sh", "-ec", "echo tttttttttttttttttttt"}, + SocketPath: genRandomSocketPath(), + }) + assert.Nil(t, err) + + runDriverAndWait4Complete(t, driver, false) + + assert.Nil(t, driver.Shutdown()) + assert.Nil(t, os.Remove("./stderr.log")) + assert.Nil(t, os.Remove("./stdout.log")) +} + +func Test_Wait4Shutdown(t *testing.T) { + driver, err := NewDriver(&DriverOption{ + 
Context: context.Background(), + OutputPath: ".", + Logger: driverLog, + Args: []string{"sh", "-ec", "echo test"}, + SocketPath: genRandomSocketPath(), + }) + assert.Nil(t, err) + + runDriverAndWait4Complete(t, driver, false) + + // the status is still available even after the user program finishes + time.Sleep(time.Second * 3) + status, err := driver.GetStatus() + assert.Nil(t, err) + assert.Equal(t, v1alpha1.CompleteJobState, status.State) + + assert.Nil(t, driver.Shutdown()) + + _, err = driver.GetStatus() + assert.ErrorContains(t, err, "no such file or directory") + + assert.Nil(t, os.Remove("./stderr.log")) + assert.Nil(t, os.Remove("./stdout.log")) +} + +func Test_ProgressUpdate(t *testing.T) { + driver, err := NewDriver(&DriverOption{ + Context: context.Background(), + OutputPath: ".", + Logger: driverLog, + Args: []string{"sh", "-ec", "sleep 2; echo 'testing progress: 100%|' >&2; sleep 4"}, + SocketPath: genRandomSocketPath(), + }) + assert.Nil(t, err) + + msgs, _ := runDriverAndWait4Complete(t, driver, false) + + assert.Equal(t, []string{ + "initializing the evaluation job", + "testing progress: 100%", + "job completed", + }, msgs) + + assert.Nil(t, driver.Shutdown()) + assert.Nil(t, os.Remove("./stderr.log")) + assert.Nil(t, os.Remove("./stdout.log")) +} + +func Test_DetectDeviceError(t *testing.T) { + driver, err := NewDriver(&DriverOption{ + Context: context.Background(), + OutputPath: ".", + DetectDevice: true, + Logger: driverLog, + Args: []string{"sh", "-ec", "python -m lm_eval --output_path ./output --model test --model_args arg1=value1 --tasks task1,task2"}, + SocketPath: genRandomSocketPath(), + }) + assert.Nil(t, err) + + msgs, _ := runDriverAndWait4Complete(t, driver, true) + assert.Equal(t, []string{ + "failed to detect available device(s): exit status 1", + }, msgs) + + assert.Nil(t, driver.Shutdown()) + + // the following files don't exist for this case + assert.NotNil(t, os.Remove("./stderr.log")) + assert.NotNil(t, os.Remove("./stdout.log")) +} +
+func Test_PatchDevice(t *testing.T) { + driverOpt := DriverOption{ + Context: context.Background(), + OutputPath: ".", + DetectDevice: true, + Logger: driverLog, + Args: []string{"sh", "-ec", "python -m lm_eval --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2"}, + } + + // append `--device cuda` + patchDevice(driverOpt.Args, true) + assert.Equal(t, + "python -m lm_eval --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2 --device cuda", + driverOpt.Args[2], + ) + + // append `--device cpu` + driverOpt.Args = []string{"sh", "-ec", "python -m lm_eval --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2"} + patchDevice(driverOpt.Args, false) + assert.Equal(t, + "python -m lm_eval --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2 --device cpu", + driverOpt.Args[2], + ) + + // no change because `--device cpu` exists + driverOpt.Args = []string{"sh", "-ec", "python -m lm_eval --device cpu --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2"} + patchDevice(driverOpt.Args, true) + assert.Equal(t, + "python -m lm_eval --device cpu --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2", + driverOpt.Args[2], + ) +} + +func Test_TaskRecipes(t *testing.T) { + driver, err := NewDriver(&DriverOption{ + Context: context.Background(), + OutputPath: ".", + Logger: driverLog, + TaskRecipesPath: "./", + TaskRecipes: []string{ + "card=unitxt.card1,template=unitxt.template,metrics=[unitxt.metric1,unitxt.metric2],format=unitxt.format,num_demos=5,demos_pool_size=10", + "card=unitxt.card2,template=unitxt.template2,metrics=[unitxt.metric3,unitxt.metric4],format=unitxt.format,num_demos=5,demos_pool_size=10", + }, + Args: []string{"sh", "-ec", "sleep 2; echo 'testing progress: 100%|' >&2; sleep 4"}, + SocketPath: 
genRandomSocketPath(), + }) + assert.Nil(t, err) + + msgs, _ := runDriverAndWait4Complete(t, driver, false) + + assert.Equal(t, []string{ + "initializing the evaluation job", + "testing progress: 100%", + "job completed", + }, msgs) + + assert.Nil(t, driver.Shutdown()) + + tr0, err := os.ReadFile("./tr_0.yaml") + assert.Nil(t, err) + assert.Equal(t, + "task: tr_0\ninclude: unitxt\nrecipe: card=unitxt.card1,template=unitxt.template,metrics=[unitxt.metric1,unitxt.metric2],format=unitxt.format,num_demos=5,demos_pool_size=10", + string(tr0), + ) + tr1, err := os.ReadFile("./tr_1.yaml") + assert.Nil(t, err) + assert.Equal(t, + "task: tr_1\ninclude: unitxt\nrecipe: card=unitxt.card2,template=unitxt.template2,metrics=[unitxt.metric3,unitxt.metric4],format=unitxt.format,num_demos=5,demos_pool_size=10", + string(tr1), + ) + assert.Nil(t, os.Remove("./stderr.log")) + assert.Nil(t, os.Remove("./stdout.log")) + assert.Nil(t, os.Remove("./tr_0.yaml")) + assert.Nil(t, os.Remove("./tr_1.yaml")) +} + +func Test_CustomCards(t *testing.T) { + driver, err := NewDriver(&DriverOption{ + Context: context.Background(), + OutputPath: ".", + Logger: driverLog, + TaskRecipesPath: "./", + CatalogPath: "./", + TaskRecipes: []string{ + "card=cards.custom_0,template=unitxt.template,metrics=[unitxt.metric1,unitxt.metric2],format=unitxt.format,num_demos=5,demos_pool_size=10", + }, + CustomCards: []string{ + `{ "__type__": "task_card", "loader": { "__type__": "load_hf", "path": "wmt16", "name": "de-en" }, "preprocess_steps": [ { "__type__": "copy", "field": "translation/en", "to_field": "text" }, { "__type__": "copy", "field": "translation/de", "to_field": "translation" }, { "__type__": "set", "fields": { "source_language": "english", "target_language": "deutch" } } ], "task": "tasks.translation.directed", "templates": "templates.translation.directed.all" }`, + }, + Args: []string{"sh", "-ec", "sleep 1; echo 'testing progress: 100%|' >&2; sleep 3"}, + SocketPath: genRandomSocketPath(), + }) + 
assert.Nil(t, err) + + os.Mkdir("cards", 0750) + + msgs, _ := runDriverAndWait4Complete(t, driver, false) + + assert.Equal(t, []string{ + "initializing the evaluation job", + "testing progress: 100%", + "job completed", + }, msgs) + + assert.Nil(t, driver.Shutdown()) + + tr0, err := os.ReadFile("./tr_0.yaml") + assert.Nil(t, err) + assert.Equal(t, + "task: tr_0\ninclude: unitxt\nrecipe: card=cards.custom_0,template=unitxt.template,metrics=[unitxt.metric1,unitxt.metric2],format=unitxt.format,num_demos=5,demos_pool_size=10", + string(tr0), + ) + custom0, err := os.ReadFile("./cards/custom_0.json") + assert.Nil(t, err) + assert.Equal(t, + `{ "__type__": "task_card", "loader": { "__type__": "load_hf", "path": "wmt16", "name": "de-en" }, "preprocess_steps": [ { "__type__": "copy", "field": "translation/en", "to_field": "text" }, { "__type__": "copy", "field": "translation/de", "to_field": "translation" }, { "__type__": "set", "fields": { "source_language": "english", "target_language": "deutch" } } ], "task": "tasks.translation.directed", "templates": "templates.translation.directed.all" }`, + string(custom0), + ) + assert.Nil(t, os.Remove("./stderr.log")) + assert.Nil(t, os.Remove("./stdout.log")) + assert.Nil(t, os.Remove("./tr_0.yaml")) + assert.Nil(t, os.Remove("./cards/custom_0.json")) + assert.Nil(t, os.Remove("./cards")) +} + +func Test_ProgramError(t *testing.T) { + driver, err := NewDriver(&DriverOption{ + Context: context.Background(), + OutputPath: ".", + Logger: driverLog, + Args: []string{"sh", "-ec", "sleep 1; exit 1"}, + SocketPath: genRandomSocketPath(), + }) + assert.Nil(t, err) + + msgs, _ := runDriverAndWait4Complete(t, driver, true) + + assert.Equal(t, []string{ + "initializing the evaluation job", + "exit status 1", + }, msgs) + + assert.Nil(t, driver.Shutdown()) + + assert.Nil(t, os.Remove("./stderr.log")) + assert.Nil(t, os.Remove("./stdout.log")) +} diff --git a/controllers/lmes/lmevaljob_controller.go b/controllers/lmes/lmevaljob_controller.go 
new file mode 100644 index 00000000..b59db244 --- /dev/null +++ b/controllers/lmes/lmevaljob_controller.go @@ -0,0 +1,901 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +package lmes + +import ( + "bytes" + "context" + "fmt" + "maps" + "slices" + "strings" + "sync" + "time" + + corev1 "k8s.io/api/core/v1" + v1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "k8s.io/apimachinery/pkg/runtime" + "k8s.io/apimachinery/pkg/types" + "k8s.io/apimachinery/pkg/util/json" + "k8s.io/client-go/kubernetes" + "k8s.io/client-go/kubernetes/scheme" + "k8s.io/client-go/rest" + "k8s.io/client-go/tools/record" + "k8s.io/client-go/tools/remotecommand" + ctrl "sigs.k8s.io/controller-runtime" + "sigs.k8s.io/controller-runtime/pkg/builder" + "sigs.k8s.io/controller-runtime/pkg/client" + "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil" + "sigs.k8s.io/controller-runtime/pkg/event" + "sigs.k8s.io/controller-runtime/pkg/handler" + "sigs.k8s.io/controller-runtime/pkg/log" + "sigs.k8s.io/controller-runtime/pkg/manager" + "sigs.k8s.io/controller-runtime/pkg/predicate" + "sigs.k8s.io/controller-runtime/pkg/reconcile" + "sigs.k8s.io/controller-runtime/pkg/source" + + "github.com/go-logr/logr" + lmesv1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/lmes/v1alpha1" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/lmes/driver" +) + +var ( + pullPolicyMap = map[corev1.PullPolicy]corev1.PullPolicy{ + corev1.PullAlways: corev1.PullAlways, + 
corev1.PullNever: corev1.PullNever, + corev1.PullIfNotPresent: corev1.PullIfNotPresent, + } + + optionKeys = map[string]string{ + "PodImage": PodImageKey, + "DriverImage": DriverImageKey, + "PodCheckingInterval": PodCheckingIntervalKey, + "ImagePullPolicy": ImagePullPolicyKey, + "DefaultBatchSize": DefaultBatchSizeKey, + "MaxBatchSize": MaxBatchSizeKey, + "DetectDevice": DetectDeviceKey, + } + + labelFilterPrefixes = []string{} + annotationFilterPrefixes = []string{} +) + +// maintain a list of key-time pair data. +// provide a function to add the key and update the time +// atomically and return a reconcile requeue event +// if needed. +type syncedMap4Reconciler struct { + data map[string]time.Time + mutex sync.Mutex +} + +// LMEvalJobReconciler reconciles a LMEvalJob object +type LMEvalJobReconciler struct { + client.Client + Scheme *runtime.Scheme + Recorder record.EventRecorder + ConfigMap string + Namespace string + restConfig *rest.Config + restClient rest.Interface + pullingJobs *syncedMap4Reconciler +} + +// The registered function to set up LMES controller +func ControllerSetUp(mgr manager.Manager, ns, configmap string, recorder record.EventRecorder) error { + clientset, err := kubernetes.NewForConfig(mgr.GetConfig()) + if err != nil { + return err + } + + return (&LMEvalJobReconciler{ + ConfigMap: configmap, + Namespace: ns, + Client: mgr.GetClient(), + Scheme: mgr.GetScheme(), + Recorder: mgr.GetEventRecorderFor("lm-eval-service-controller"), + restConfig: mgr.GetConfig(), + restClient: clientset.CoreV1().RESTClient(), + pullingJobs: newSyncedMap4Reconciler(), + }).SetupWithManager(mgr) +} + +func newSyncedMap4Reconciler() *syncedMap4Reconciler { + return &syncedMap4Reconciler{data: make(map[string]time.Time)} +} + +// check if the paired time of the key is passed. if yes, update the time and +// return a requeue result. 
otherwise an empty result +func (q *syncedMap4Reconciler) addOrUpdate(key string, after time.Duration) reconcile.Result { + q.mutex.Lock() + defer q.mutex.Unlock() + + v, ok := q.data[key] + if ok && time.Now().Before(v) { + // no need to requeue since there is an existing one + return reconcile.Result{} + } + value := time.Now().Add(after) + q.data[key] = value + return reconcile.Result{Requeue: true, RequeueAfter: after} +} + +// remove the key from the list +func (q *syncedMap4Reconciler) remove(key string) { + q.mutex.Lock() + defer q.mutex.Unlock() + delete(q.data, key) +} + +// +kubebuilder:rbac:groups=trustyai.opendatahub.io,resources=lmevaljobs,verbs=get;list;watch;create;update;patch;delete +// +kubebuilder:rbac:groups=trustyai.opendatahub.io,resources=lmevaljobs/status,verbs=get;update;patch +// +kubebuilder:rbac:groups=trustyai.opendatahub.io,resources=lmevaljobs/finalizers,verbs=update +// +kubebuilder:rbac:groups="",resources=pods,verbs=get;list;watch;create;delete +// +kubebuilder:rbac:groups="",resources=pods/exec,verbs=get;list;watch;create;delete +// +kubebuilder:rbac:groups="",resources=configmaps,verbs=get;watch;list +// +kubebuilder:rbac:groups="",resources=secrets,verbs=get;watch;list + +func (r *LMEvalJobReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) { + log := log.FromContext(ctx) + + job := &lmesv1alpha1.LMEvalJob{} + if err := r.Get(ctx, req.NamespacedName, job); err != nil { + log.Info("unable to fetch LMEvalJob. 
could be from a deletion request") + return ctrl.Result{}, client.IgnoreNotFound(err) + } + + if !job.ObjectMeta.DeletionTimestamp.IsZero() { + // Handle deletion here + return r.handleDeletion(ctx, job, log) + } + + // Treat this as NewJobState + if job.Status.LastScheduleTime == nil { + job.Status.State = lmesv1alpha1.NewJobState + } + + if job.Spec.Suspend { + return r.handleSuspend(ctx, log, job) + } + + // Handle the job based on its state + switch job.Status.State { + case lmesv1alpha1.NewJobState: + // Handle newly created job + return r.handleNewCR(ctx, log, job) + case lmesv1alpha1.ScheduledJobState: + // the job's pod has been created and the driver hasn't updated the state yet + // check the pod status and detect a pod failure if there is one + // TODO: need a timeout/retry mechanism here to transition to other states + return r.checkScheduledPod(ctx, log, job) + case lmesv1alpha1.RunningJobState: + // TODO: need a timeout/retry mechanism here to transition to other states + return r.checkScheduledPod(ctx, log, job) + case lmesv1alpha1.CompleteJobState: + return r.handleComplete(ctx, log, job) + case lmesv1alpha1.CancelledJobState: + return r.handleCancel(ctx, log, job) + case lmesv1alpha1.SuspendedJobState: + if !job.Spec.Suspend { + return r.handleResume(ctx, log, job) + } + } + + return ctrl.Result{}, nil +} + +// SetupWithManager sets up the controller with the Manager.
+func (r *LMEvalJobReconciler) SetupWithManager(mgr ctrl.Manager) error { + + // Add a runnable to retrieve the settings from the specified configmap + if err := mgr.Add(manager.RunnableFunc(func(ctx context.Context) error { + var cm corev1.ConfigMap + log := log.FromContext(ctx) + if err := r.Get( + ctx, + types.NamespacedName{Namespace: r.Namespace, Name: r.ConfigMap}, + &cm); err != nil { + + ctrl.Log.WithName("setup").Error(err, + "failed to get configmap", + "namespace", r.Namespace, + "name", r.ConfigMap) + + return err + } + if err := constructOptionsFromConfigMap(&log, &cm); err != nil { + return err + } + + return nil + })); err != nil { + return err + } + + // watch the pods created by the controller but only for the deletion event + return ctrl.NewControllerManagedBy(mgr). + // since we register the finalizer, no need to monitor deletion events + For(&lmesv1alpha1.LMEvalJob{}, builder.WithPredicates(predicate.Funcs{ + // drop deletion events + DeleteFunc: func(event.DeleteEvent) bool { + return false + }, + })). + Watches( + &source.Kind{Type: &corev1.Pod{}}, + &handler.EnqueueRequestForOwner{ + OwnerType: &lmesv1alpha1.LMEvalJob{}, + IsController: true, + }, + builder.WithPredicates(predicate.Funcs{ + // drop all events except deletion + CreateFunc: func(event.CreateEvent) bool { + return false + }, + UpdateFunc: func(event.UpdateEvent) bool { + return false + }, + GenericFunc: func(event.GenericEvent) bool { + return false + }, + }), + ). 
+ Complete(r) +} + +func (r *LMEvalJobReconciler) updateStatus(ctx context.Context, log logr.Logger, job *lmesv1alpha1.LMEvalJob) error { + stdin, _, err := r.remoteCommand(ctx, job, fmt.Sprintf("%s %s", DestDriverPath, "--get-status")) + if err != nil { + return err + } + newStatus := lmesv1alpha1.LMEvalJobStatus{} + if err = json.Unmarshal(stdin, &newStatus); err != nil { + return err + } + + // driver only provides updates for these fields + if newStatus.State != job.Status.State || + newStatus.Message != job.Status.Message || + newStatus.Reason != job.Status.Reason || + newStatus.Results != job.Status.Results { + + job.Status.State = newStatus.State + job.Status.Message = newStatus.Message + job.Status.Reason = newStatus.Reason + job.Status.Results = newStatus.Results + + err = r.Status().Update(ctx, job) + if err != nil { + log.Error(err, "failed to update status") + } + } + return err +} + +func (r *LMEvalJobReconciler) shutdownDriver(ctx context.Context, job *lmesv1alpha1.LMEvalJob) error { + _, _, err := r.remoteCommand(ctx, job, fmt.Sprintf("%s %s", DestDriverPath, "--shutdown")) + return err +} + +func (r *LMEvalJobReconciler) remoteCommand(ctx context.Context, job *lmesv1alpha1.LMEvalJob, command string) ([]byte, []byte, error) { + request := r.restClient.Post(). + Namespace(job.GetNamespace()). + Resource("pods"). + Name(job.GetName()). + SubResource("exec"). 
+ VersionedParams(&corev1.PodExecOptions{ + Command: []string{"/bin/sh", "-c", command}, + Stdin: false, + Stdout: true, + Stderr: true, + }, scheme.ParameterCodec) + + outBuff := &bytes.Buffer{} + errBuf := &bytes.Buffer{} + exec, err := remotecommand.NewSPDYExecutor(r.restConfig, "POST", request.URL()) + if err != nil { + return nil, nil, err + } + if err = exec.StreamWithContext(ctx, remotecommand.StreamOptions{ + Stdout: outBuff, + Stderr: errBuf, + }); err != nil { + return nil, nil, err + } + return outBuff.Bytes(), errBuf.Bytes(), nil +} + +func (r *LMEvalJobReconciler) handleDeletion(ctx context.Context, job *lmesv1alpha1.LMEvalJob, log logr.Logger) (reconcile.Result, error) { + defer r.pullingJobs.remove(string(job.GetUID())) + + if controllerutil.ContainsFinalizer(job, lmesv1alpha1.FinalizerName) { + // delete the corresponding pod if needed + // remove our finalizer from the list and update it. + if job.Status.State != lmesv1alpha1.CompleteJobState || + job.Status.Reason != lmesv1alpha1.CancelledReason { + + if err := r.deleteJobPod(ctx, job); err != nil && client.IgnoreNotFound(err) != nil { + log.Error(err, "failed to delete pod of the job") + } + } + + controllerutil.RemoveFinalizer(job, lmesv1alpha1.FinalizerName) + if err := r.Update(ctx, job); err != nil { + return ctrl.Result{}, err + } + r.Recorder.Event(job, "Normal", "DetachFinalizer", + fmt.Sprintf("removed finalizer from LMEvalJob %s in namespace %s", + job.Name, + job.Namespace)) + log.Info("Successfully remove the finalizer", "name", job.Name) + } + + return ctrl.Result{}, nil +} + +func (r *LMEvalJobReconciler) handleNewCR(ctx context.Context, log logr.Logger, job *lmesv1alpha1.LMEvalJob) (reconcile.Result, error) { + // If it doesn't contain our finalizer, add it + if !controllerutil.ContainsFinalizer(job, lmesv1alpha1.FinalizerName) { + controllerutil.AddFinalizer(job, lmesv1alpha1.FinalizerName) + if err := r.Update(ctx, job); err != nil { + log.Error(err, "unable to update finalizer") 
+ return ctrl.Result{}, err + } + r.Recorder.Event(job, "Normal", "AttachFinalizer", + fmt.Sprintf("added the finalizer to the LMEvalJob %s in namespace %s", + job.Name, + job.Namespace)) + // Since finalizers were updated. Need to fetch the new LMEvalJob + // End the current reconcile and get revisioned job in next reconcile + return ctrl.Result{}, nil + } + + // Validate the custom card if exists + // FIXME: Move the validation to the webhook once we enable it. + if err := r.validateCustomCard(job, log); err != nil { + // custom card validation failed + job.Status.State = lmesv1alpha1.CompleteJobState + job.Status.Reason = lmesv1alpha1.FailedReason + job.Status.Message = err.Error() + if err := r.Status().Update(ctx, job); err != nil { + log.Error(err, "unable to update LMEvalJob status for custom card validation error") + } + log.Error(err, "Contain invalid custom card in the LMEvalJob", "name", job.Name) + return ctrl.Result{}, err + } + + // construct a new pod and create a pod for the job + currentTime := v1.Now() + pod := createPod(options, job, log) + if err := r.Create(ctx, pod, &client.CreateOptions{}); err != nil { + // Failed to create the pod. Mark the status as complete with failed + job.Status.State = lmesv1alpha1.CompleteJobState + job.Status.Reason = lmesv1alpha1.FailedReason + job.Status.Message = err.Error() + if err := r.Status().Update(ctx, job); err != nil { + log.Error(err, "unable to update LMEvalJob status for pod creation failure") + } + log.Error(err, "Failed to create pod for the LMEvalJob", "name", job.Name) + return ctrl.Result{}, err + } + + // Create the pod successfully. 
Wait for the driver to update the status + job.Status.State = lmesv1alpha1.ScheduledJobState + job.Status.PodName = pod.Name + job.Status.LastScheduleTime = &currentTime + if err := r.Status().Update(ctx, job); err != nil { + log.Error(err, "unable to update LMEvalJob status (pod creation done)") + return ctrl.Result{}, err + } + r.Recorder.Event(job, "Normal", "PodCreation", + fmt.Sprintf("the LMEvalJob %s in namespace %s created a pod", + job.Name, + job.Namespace)) + log.Info("Successfully create a Pod for the Job") + // Check the pod after the config interval + return r.pullingJobs.addOrUpdate(string(job.GetUID()), options.PodCheckingInterval), nil +} + +func (r *LMEvalJobReconciler) checkScheduledPod(ctx context.Context, log logr.Logger, job *lmesv1alpha1.LMEvalJob) (ctrl.Result, error) { + pod, err := r.getPod(ctx, job) + if err != nil { + // a weird state, someone delete the corresponding pod? mark this as CompleteJobState + // with error message + job.Status.State = lmesv1alpha1.CompleteJobState + job.Status.Reason = lmesv1alpha1.FailedReason + job.Status.Message = err.Error() + if err := r.Status().Update(ctx, job); err != nil { + log.Error(err, "unable to update LMEvalJob status", "state", job.Status.State) + return ctrl.Result{}, err + } + r.Recorder.Event(job, "Warning", "PodMissing", + fmt.Sprintf("the pod for the LMEvalJob %s in namespace %s is gone", + job.Name, + job.Namespace)) + log.Error(err, "since the job's pod is gone, mark the job as complete with error result.") + return ctrl.Result{}, err + } + + if mainIdx := getContainerByName(&pod.Status, "main"); mainIdx == -1 { + // waiting for the main container to be up + return r.pullingJobs.addOrUpdate(string(job.GetUID()), options.PodCheckingInterval), nil + } else if podFailed, msg := isContainerFailed(&pod.Status.ContainerStatuses[mainIdx]); podFailed { + job.Status.State = lmesv1alpha1.CompleteJobState + job.Status.Reason = lmesv1alpha1.FailedReason + job.Status.Message = msg + if err := 
r.Status().Update(ctx, job); err != nil { + log.Error(err, "unable to update LMEvalJob status for pod failure") + } + log.Info("detect an error on the job's pod. marked the job as done", "name", job.Name) + return ctrl.Result{}, err + } else if pod.Status.ContainerStatuses[mainIdx].State.Running == nil { + return r.pullingJobs.addOrUpdate(string(job.GetUID()), options.PodCheckingInterval), nil + } + + // pull status from the driver + if err = r.updateStatus(ctx, log, job); err == nil && job.Status.State == lmesv1alpha1.CompleteJobState { + // the update will trigger another reconcile + return ctrl.Result{}, nil + } + if err != nil { + log.Error(err, "unable to retrieve the status from the job's pod. retry after the pulling interval") + } + return r.pullingJobs.addOrUpdate(string(job.GetUID()), options.PodCheckingInterval), nil +} + +func (r *LMEvalJobReconciler) getPod(ctx context.Context, job *lmesv1alpha1.LMEvalJob) (*corev1.Pod, error) { + var pod = corev1.Pod{} + if err := r.Get(ctx, types.NamespacedName{Namespace: job.Namespace, Name: job.Name}, &pod); err != nil { + return nil, err + } + for _, ref := range pod.OwnerReferences { + if ref.APIVersion == job.APIVersion && + ref.Kind == job.Kind && + ref.Name == job.Name { + + return &pod, nil + } + } + return nil, fmt.Errorf("pod doesn't have proper entry in the OwnerReferences") +} + +func (r *LMEvalJobReconciler) deleteJobPod(ctx context.Context, job *lmesv1alpha1.LMEvalJob) error { + pod := corev1.Pod{ + TypeMeta: v1.TypeMeta{ + Kind: "Pod", + APIVersion: "v1", + }, + ObjectMeta: v1.ObjectMeta{ + Name: job.Status.PodName, + Namespace: job.Namespace, + OwnerReferences: []v1.OwnerReference{ + { + APIVersion: job.APIVersion, + Kind: job.Kind, + Name: job.Name, + }, + }, + }, + } + return r.Delete(ctx, &pod, &client.DeleteOptions{}) +} + +func (r *LMEvalJobReconciler) handleComplete(ctx context.Context, log logr.Logger, job *lmesv1alpha1.LMEvalJob) (ctrl.Result, error) { + if job.Status.CompleteTime == nil { + // 
make sure the pod is in the complete state. if not, run the shutdown command + pod, err := r.getPod(ctx, job) + if err == nil { + if getRunningContainerByName(&pod.Status, "main") != -1 { + // send shutdown command if the main container is running + if err := r.shutdownDriver(ctx, job); err != nil { + log.Error(err, "failed to shutdown the job pod. retry after the pulling interval") + return r.pullingJobs.addOrUpdate(string(job.GetUID()), options.PodCheckingInterval), nil + } + } + } else { + // the pod is gone ?? + log.Error(err, "LMEvalJob is marked as Complete but the pod is gone") + } + + r.Recorder.Event(job, "Normal", "JobCompleted", + fmt.Sprintf("The LMEvalJob %s in namespace %s has completed", + job.Name, + job.Namespace)) + + // record the CompleteTime + current := v1.Now() + job.Status.CompleteTime = &current + if err := r.Status().Update(ctx, job); err != nil { + log.Error(err, "failed to update status for completion") + } + } + + // make sure to clean up the pullingJobs + r.pullingJobs.remove(string(job.GetUID())) + return ctrl.Result{}, nil +} + +func (r *LMEvalJobReconciler) handleCancel(ctx context.Context, log logr.Logger, job *lmesv1alpha1.LMEvalJob) (ctrl.Result, error) { + // delete the pod and update the state to complete + if _, err := r.getPod(ctx, job); err != nil { + // pod is gone. update status + job.Status.State = lmesv1alpha1.CompleteJobState + job.Status.Reason = lmesv1alpha1.FailedReason + job.Status.Message = err.Error() + } else { + job.Status.State = lmesv1alpha1.CompleteJobState + job.Status.Reason = lmesv1alpha1.CancelledReason + if err := r.deleteJobPod(ctx, job); err != nil { + // leave the state as is and retry again + log.Error(err, "failed to delete pod.
scheduled a retry", "interval", options.PodCheckingInterval.String()) + return r.pullingJobs.addOrUpdate(string(job.GetUID()), options.PodCheckingInterval), err + } + } + + err := r.Status().Update(ctx, job) + if err != nil { + log.Error(err, "failed to update status for cancellation") + } + r.Recorder.Event(job, "Normal", "Cancelled", + fmt.Sprintf("The LMEvalJob %s in namespace %s has cancelled and changed its state to Complete", + job.Name, + job.Namespace)) + r.pullingJobs.remove(string(job.GetUID())) + return ctrl.Result{}, err +} + +func (r *LMEvalJobReconciler) handleSuspend(ctx context.Context, log logr.Logger, job *lmesv1alpha1.LMEvalJob) (ctrl.Result, error) { + defer r.pullingJobs.remove(string(job.GetUID())) + if job.Status.State != lmesv1alpha1.NewJobState { + log.Info("Suspend job") + if err := r.deleteJobPod(ctx, job); err != nil && client.IgnoreNotFound(err) != nil { + log.Error(err, "failed to delete pod for suspended job") + return r.pullingJobs.addOrUpdate(string(job.GetUID()), options.PodCheckingInterval), nil + } + } else { + log.Info("Create job in suspend state.") + } + job.Status.State = lmesv1alpha1.SuspendedJobState + err := r.Status().Update(ctx, job) + if err != nil { + log.Error(err, "failed to update job status to suspended") + } + + return ctrl.Result{}, err +} + +func (r *LMEvalJobReconciler) handleResume(ctx context.Context, log logr.Logger, job *lmesv1alpha1.LMEvalJob) (ctrl.Result, error) { + log.Info("Resume job") + pod := createPod(options, job, log) + if err := r.Create(ctx, pod); err != nil { + log.Error(err, "failed to create pod to resume job") + return r.pullingJobs.addOrUpdate(string(job.GetUID()), options.PodCheckingInterval), nil + } + job.Status.State = lmesv1alpha1.ScheduledJobState + err := r.Status().Update(ctx, job) + if err != nil { + log.Error(err, "failed to update job status to scheduled") + } + return ctrl.Result{}, err +} + +func (r *LMEvalJobReconciler) validateCustomCard(job *lmesv1alpha1.LMEvalJob, log 
logr.Logger) error { + if job.Spec.TaskList.TaskRecipes == nil { + return nil + } + + for _, taskRecipe := range job.Spec.TaskList.TaskRecipes { + if taskRecipe.Card.Custom != "" { + var card map[string]interface{} + if err := json.Unmarshal([]byte(taskRecipe.Card.Custom), &card); err != nil { + log.Error(err, "failed to parse the custom card") + return fmt.Errorf("custom card is not a valid JSON string, %s", err.Error()) + } + // at least the custom card shall define its loader + if _, ok := card["loader"]; !ok { + missKeyError := fmt.Errorf("no loader definition in the custom card") + log.Error(missKeyError, "failed to parse the custom card") + return missKeyError + } + } + } + + return nil +} + +func createPod(svcOpts *serviceOptions, job *lmesv1alpha1.LMEvalJob, log logr.Logger) *corev1.Pod { + var allowPrivilegeEscalation = false + var runAsNonRootUser = true + var ownerRefController = true + + var envVars = job.Spec.Pod.GetContainer().GetEnv() + + var volumeMounts = []corev1.VolumeMount{ + { + Name: "shared", + MountPath: "/opt/app-root/src/bin", + }, + } + + var volumes = []corev1.Volume{ + { + Name: "shared", VolumeSource: corev1.VolumeSource{ + EmptyDir: &corev1.EmptyDirVolumeSource{}, + }, + }, + } + + volumes = append(volumes, job.Spec.Pod.GetVolumes()...) + volumeMounts = append(volumeMounts, job.Spec.Pod.GetContainer().GetVolumMounts()...) 
+ labels := getPodLabels(job.Labels, log) + annotations := getAnnotations(job.Annotations, log) + resources := getResources(job.Spec.Pod.GetContainer().GetResources()) + + // Then compose the Pod CR + pod := corev1.Pod{ + TypeMeta: v1.TypeMeta{ + Kind: "Pod", + APIVersion: "v1", + }, + ObjectMeta: v1.ObjectMeta{ + Name: job.Name, + Namespace: job.Namespace, + OwnerReferences: []v1.OwnerReference{ + { + APIVersion: job.APIVersion, + Kind: job.Kind, + Name: job.Name, + Controller: &ownerRefController, + UID: job.UID, + }, + }, + Labels: labels, + Annotations: annotations, + }, + Spec: corev1.PodSpec{ + InitContainers: []corev1.Container{ + { + Name: "driver", + Image: svcOpts.DriverImage, + ImagePullPolicy: svcOpts.ImagePullPolicy, + Command: []string{DriverPath, "--copy", DestDriverPath}, + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: &allowPrivilegeEscalation, + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{ + "ALL", + }, + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "shared", + MountPath: "/opt/app-root/src/bin", + }, + }, + }, + }, + Containers: []corev1.Container{ + { + Name: "main", + Image: svcOpts.PodImage, + ImagePullPolicy: svcOpts.ImagePullPolicy, + Env: envVars, + Command: generateCmd(svcOpts, job), + Args: generateArgs(svcOpts, job, log), + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: &allowPrivilegeEscalation, + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{ + "ALL", + }, + }, + }, + VolumeMounts: volumeMounts, + Resources: *resources, + }, + }, + SecurityContext: &corev1.PodSecurityContext{ + RunAsNonRoot: &runAsNonRootUser, + SeccompProfile: &corev1.SeccompProfile{ + Type: corev1.SeccompProfileTypeRuntimeDefault, + }, + }, + Volumes: volumes, + RestartPolicy: corev1.RestartPolicyNever, + }, + } + return &pod +} + +func getPodLabels(src map[string]string, log logr.Logger) map[string]string { + labels := map[string]string{ + "app.kubernetes.io/name": 
"ta-lmes", + } + mergeMapWithFilters(labels, src, labelFilterPrefixes, log) + return labels +} + +func getAnnotations(annotations map[string]string, log logr.Logger) map[string]string { + if len(annotations) == 0 { + return nil + } + dest := map[string]string{} + mergeMapWithFilters(dest, annotations, annotationFilterPrefixes, log) + return dest +} + +func getResources(resources *corev1.ResourceRequirements) *corev1.ResourceRequirements { + if resources == nil { + return &corev1.ResourceRequirements{} + } + return resources +} + +// Merge the map based on the filters. If the names in the `src` map contains any prefixes +// in the prefixFilters list, those KV will be discarded, otherwise, KV will be merge into +// `dest` map. +func mergeMapWithFilters(dest, src map[string]string, prefixFilters []string, log logr.Logger) { + if len(prefixFilters) == 0 { + // Fast path if the labelFilterPrefix is empty. + maps.Copy(dest, src) + } else { + for k, v := range src { + if slices.ContainsFunc(prefixFilters, func(prefix string) bool { + return strings.HasPrefix(k, prefix) + }) { + log.Info("the label is not propagated to the pod", k, v) + } else { + dest[k] = v + } + } + } +} + +func generateArgs(svcOpts *serviceOptions, job *lmesv1alpha1.LMEvalJob, log logr.Logger) []string { + if job == nil { + return nil + } + + cmds := make([]string, 0, 10) + cmds = append(cmds, "python", "-m", "lm_eval", "--output_path", "/opt/app-root/src/output") + // --model + cmds = append(cmds, "--model", job.Spec.Model) + // --model_args + if job.Spec.ModelArgs != nil { + cmds = append(cmds, "--model_args", argsToString(job.Spec.ModelArgs)) + } + // --tasks + cmds = append(cmds, "--tasks", strings.Join(concatTasks(job.Spec.TaskList), ",")) + // --include + cmds = append(cmds, "--include_path", driver.DefaultTaskRecipesPath) + // --num_fewshot + if job.Spec.NumFewShot != nil { + cmds = append(cmds, "--num_fewshot", fmt.Sprintf("%d", *job.Spec.NumFewShot)) + } + // --limit + if job.Spec.Limit != "" 
{ + cmds = append(cmds, "--limit", job.Spec.Limit) + } + // --gen_kwargs + if job.Spec.GenArgs != nil { + cmds = append(cmds, "--gen_kwargs", argsToString(job.Spec.GenArgs)) + } + // --log_samples + if job.Spec.LogSamples != nil && *job.Spec.LogSamples { + cmds = append(cmds, "--log_samples") + } + // --batch_size + var batchSize = svcOpts.DefaultBatchSize + if job.Spec.BatchSize != nil && *job.Spec.BatchSize > 0 { + batchSize = *job.Spec.BatchSize + } + // This could be done in the webhook if it's enabled. + if batchSize > svcOpts.MaxBatchSize { + batchSize = svcOpts.MaxBatchSize + log.Info("batchSize is greater than max-batch-size of the controller's configuration, use the max-batch-size instead") + } + cmds = append(cmds, "--batch_size", fmt.Sprintf("%d", batchSize)) + + return []string{"sh", "-ec", strings.Join(cmds, " ")} +} + +func concatTasks(tasks lmesv1alpha1.TaskList) []string { + if len(tasks.TaskRecipes) == 0 { + return tasks.TaskNames + } + recipesName := make([]string, len(tasks.TaskRecipes)) + for i := range tasks.TaskRecipes { + // assign internal used task name + recipesName[i] = fmt.Sprintf("%s_%d", driver.TaskRecipePrefix, i) + } + return append(tasks.TaskNames, recipesName...) 
+} + +func generateCmd(svcOpts *serviceOptions, job *lmesv1alpha1.LMEvalJob) []string { + if job == nil { + return nil + } + cmds := []string{ + DestDriverPath, + "--output-path", "/opt/app-root/src/output", + } + + if svcOpts.DetectDevice { + cmds = append(cmds, "--detect-device") + } + + cr_idx := 0 + for _, recipe := range job.Spec.TaskList.TaskRecipes { + if recipe.Card.Name != "" { + // built-in card, regular recipe + cmds = append(cmds, "--task-recipe", recipe.String()) + } else if recipe.Card.Custom != "" { + // custom card, need to inject --custom-card arg as well + dupRecipe := recipe.DeepCopy() + // the format of a custom card's name: custom_ + dupRecipe.Card.Name = fmt.Sprintf("cards.%s_%d", driver.CustomCardPrefix, cr_idx) + cmds = append(cmds, "--task-recipe", dupRecipe.String()) + cmds = append(cmds, "--custom-card", dupRecipe.Card.Custom) + cr_idx++ + } + } + + cmds = append(cmds, "--") + return cmds +} + +func argsToString(args []lmesv1alpha1.Arg) string { + if args == nil { + return "" + } + var equalForms []string + for _, arg := range args { + equalForms = append(equalForms, fmt.Sprintf("%s=%s", arg.Name, arg.Value)) + } + return strings.Join(equalForms, ",") +} + +func isContainerFailed(status *corev1.ContainerStatus) (bool, string) { + if status.State.Waiting != nil && + status.State.Waiting.Reason != "PodInitializing" { + return true, status.State.Waiting.Reason + } + if status.State.Terminated != nil && + status.State.Terminated.Reason != "Complete" { + return true, status.State.Terminated.Reason + } + return false, "" +} + +// return the index of the container which is in running state and with the specified name +// otherwise return -1 +func getRunningContainerByName(status *corev1.PodStatus, name string) int { + if idx := getContainerByName(status, name); idx != -1 && status.ContainerStatuses[idx].State.Running != nil { + return idx + } + return -1 +} + +func getContainerByName(status *corev1.PodStatus, name string) int { + if 
status.ContainerStatuses == nil { + return -1 + } + return slices.IndexFunc(status.ContainerStatuses, func(s corev1.ContainerStatus) bool { + return s.Name == name + }) +} diff --git a/controllers/lmes/lmevaljob_controller_test.go b/controllers/lmes/lmevaljob_controller_test.go new file mode 100644 index 00000000..bdb13334 --- /dev/null +++ b/controllers/lmes/lmevaljob_controller_test.go @@ -0,0 +1,957 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +*/ + +package lmes + +import ( + "context" + "strconv" + "testing" + + "github.com/stretchr/testify/assert" + lmesv1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/lmes/v1alpha1" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/lmes/driver" + corev1 "k8s.io/api/core/v1" + "k8s.io/apimachinery/pkg/api/resource" + metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "sigs.k8s.io/controller-runtime/pkg/log" +) + +var ( + isController = true + allowPrivilegeEscalation = false + runAsNonRootUser = true +) + +func Test_SimplePod(t *testing.T) { + log := log.FromContext(context.Background()) + svcOpts := &serviceOptions{ + PodImage: "podimage:latest", + DriverImage: "driver:latest", + ImagePullPolicy: corev1.PullAlways, + } + + var job = &lmesv1alpha1.LMEvalJob{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + UID: "for-testing", + }, + TypeMeta: metav1.TypeMeta{ + Kind: lmesv1alpha1.KindName, + APIVersion: lmesv1alpha1.Version, + }, + Spec: 
lmesv1alpha1.LMEvalJobSpec{ + Model: "test", + ModelArgs: []lmesv1alpha1.Arg{ + {Name: "arg1", Value: "value1"}, + }, + TaskList: lmesv1alpha1.TaskList{ + TaskNames: []string{"task1", "task2"}, + }, + }, + } + + expect := &corev1.Pod{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + Labels: map[string]string{ + "app.kubernetes.io/name": "ta-lmes", + }, + OwnerReferences: []metav1.OwnerReference{ + { + APIVersion: lmesv1alpha1.Version, + Kind: lmesv1alpha1.KindName, + Name: "test", + Controller: &isController, + UID: "for-testing", + }, + }, + }, + TypeMeta: metav1.TypeMeta{ + Kind: "Pod", + APIVersion: "v1", + }, + Spec: corev1.PodSpec{ + InitContainers: []corev1.Container{ + { + Name: "driver", + Image: svcOpts.DriverImage, + ImagePullPolicy: svcOpts.ImagePullPolicy, + Command: []string{DriverPath, "--copy", DestDriverPath}, + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: &allowPrivilegeEscalation, + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{ + "ALL", + }, + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "shared", + MountPath: "/opt/app-root/src/bin", + }, + }, + }, + }, + Containers: []corev1.Container{ + { + Name: "main", + Image: svcOpts.PodImage, + ImagePullPolicy: svcOpts.ImagePullPolicy, + Command: generateCmd(svcOpts, job), + Args: generateArgs(svcOpts, job, log), + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: &allowPrivilegeEscalation, + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{ + "ALL", + }, + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "shared", + MountPath: "/opt/app-root/src/bin", + }, + }, + }, + }, + SecurityContext: &corev1.PodSecurityContext{ + RunAsNonRoot: &runAsNonRootUser, + SeccompProfile: &corev1.SeccompProfile{ + Type: corev1.SeccompProfileTypeRuntimeDefault, + }, + }, + Volumes: []corev1.Volume{ + { + Name: "shared", VolumeSource: corev1.VolumeSource{ + EmptyDir: &corev1.EmptyDirVolumeSource{}, + 
}, + }, + }, + RestartPolicy: corev1.RestartPolicyNever, + }, + } + + newPod := createPod(svcOpts, job, log) + + assert.Equal(t, expect, newPod) +} + +func Test_WithLabelsAnnotationsResourcesVolumes(t *testing.T) { + log := log.FromContext(context.Background()) + svcOpts := &serviceOptions{ + PodImage: "podimage:latest", + DriverImage: "driver:latest", + ImagePullPolicy: corev1.PullAlways, + } + var job = &lmesv1alpha1.LMEvalJob{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + UID: "for-testing", + Labels: map[string]string{ + "custom/label1": "value1", + "custom/label2": "value2", + }, + Annotations: map[string]string{ + "custom/annotation1": "annotation1", + "custom/annotation2": "annotation2", + }, + }, + TypeMeta: metav1.TypeMeta{ + Kind: lmesv1alpha1.KindName, + APIVersion: lmesv1alpha1.Version, + }, + Spec: lmesv1alpha1.LMEvalJobSpec{ + Model: "test", + ModelArgs: []lmesv1alpha1.Arg{ + {Name: "arg1", Value: "value1"}, + }, + TaskList: lmesv1alpha1.TaskList{ + TaskNames: []string{"task1", "task2"}, + }, + Pod: &lmesv1alpha1.LMEvalPodSpec{ + Container: &lmesv1alpha1.LMEvalContainer{ + Resources: &corev1.ResourceRequirements{ + Limits: corev1.ResourceList{ + corev1.ResourceCPU: resource.MustParse("1"), + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "additionalVolume", + MountPath: "/test", + }, + }, + }, + Volumes: []corev1.Volume{ + { + Name: "additionalVolume", + VolumeSource: corev1.VolumeSource{ + PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{ + ClaimName: "mypvc", + ReadOnly: true, + }, + }, + }, + }, + }, + }, + } + + expect := &corev1.Pod{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + Labels: map[string]string{ + "app.kubernetes.io/name": "ta-lmes", + "custom/label1": "value1", + "custom/label2": "value2", + }, + Annotations: map[string]string{ + "custom/annotation1": "annotation1", + "custom/annotation2": "annotation2", + }, + OwnerReferences: []metav1.OwnerReference{ 
+ { + APIVersion: lmesv1alpha1.Version, + Kind: lmesv1alpha1.KindName, + Name: "test", + Controller: &isController, + UID: "for-testing", + }, + }, + }, + TypeMeta: metav1.TypeMeta{ + Kind: "Pod", + APIVersion: "v1", + }, + Spec: corev1.PodSpec{ + InitContainers: []corev1.Container{ + { + Name: "driver", + Image: svcOpts.DriverImage, + ImagePullPolicy: svcOpts.ImagePullPolicy, + Command: []string{DriverPath, "--copy", DestDriverPath}, + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: &allowPrivilegeEscalation, + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{ + "ALL", + }, + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "shared", + MountPath: "/opt/app-root/src/bin", + }, + }, + }, + }, + Containers: []corev1.Container{ + { + Name: "main", + Image: svcOpts.PodImage, + ImagePullPolicy: svcOpts.ImagePullPolicy, + Command: generateCmd(svcOpts, job), + Args: generateArgs(svcOpts, job, log), + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: &allowPrivilegeEscalation, + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{ + "ALL", + }, + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "shared", + MountPath: "/opt/app-root/src/bin", + }, + { + Name: "additionalVolume", + MountPath: "/test", + }, + }, + Resources: corev1.ResourceRequirements{ + Limits: corev1.ResourceList{ + corev1.ResourceCPU: resource.MustParse("1"), + }, + }, + }, + }, + SecurityContext: &corev1.PodSecurityContext{ + RunAsNonRoot: &runAsNonRootUser, + SeccompProfile: &corev1.SeccompProfile{ + Type: corev1.SeccompProfileTypeRuntimeDefault, + }, + }, + Volumes: []corev1.Volume{ + { + Name: "shared", VolumeSource: corev1.VolumeSource{ + EmptyDir: &corev1.EmptyDirVolumeSource{}, + }, + }, + { + Name: "additionalVolume", + VolumeSource: corev1.VolumeSource{ + PersistentVolumeClaim: &corev1.PersistentVolumeClaimVolumeSource{ + ClaimName: "mypvc", + ReadOnly: true, + }, + }, + }, + }, + RestartPolicy: 
corev1.RestartPolicyNever, + }, + } + + newPod := createPod(svcOpts, job, log) + + assert.Equal(t, expect, newPod) + + // with filter + labelFilterPrefixes = append(labelFilterPrefixes, "custom/label1") + annotationFilterPrefixes = append(annotationFilterPrefixes, "custom/annotation2") + expect.Labels = map[string]string{ + "app.kubernetes.io/name": "ta-lmes", + "custom/label2": "value2", + } + expect.Annotations = map[string]string{ + "custom/annotation1": "annotation1", + } + + newPod = createPod(svcOpts, job, log) + assert.Equal(t, expect, newPod) +} + +func Test_EnvSecretsPod(t *testing.T) { + log := log.FromContext(context.Background()) + svcOpts := &serviceOptions{ + PodImage: "podimage:latest", + DriverImage: "driver:latest", + ImagePullPolicy: corev1.PullAlways, + } + var job = &lmesv1alpha1.LMEvalJob{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + UID: "for-testing", + }, + TypeMeta: metav1.TypeMeta{ + Kind: lmesv1alpha1.KindName, + APIVersion: lmesv1alpha1.Version, + }, + Spec: lmesv1alpha1.LMEvalJobSpec{ + Model: "test", + ModelArgs: []lmesv1alpha1.Arg{ + {Name: "arg1", Value: "value1"}, + }, + TaskList: lmesv1alpha1.TaskList{ + TaskNames: []string{"task1", "task2"}, + }, + Pod: &lmesv1alpha1.LMEvalPodSpec{ + Container: &lmesv1alpha1.LMEvalContainer{ + Env: []corev1.EnvVar{ + { + Name: "my_env", + ValueFrom: &corev1.EnvVarSource{ + SecretKeyRef: &corev1.SecretKeySelector{ + Key: "my-key", + LocalObjectReference: corev1.LocalObjectReference{ + Name: "my-secret", + }, + }, + }, + }, + }, + }, + }, + }, + } + + expect := &corev1.Pod{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + Labels: map[string]string{ + "app.kubernetes.io/name": "ta-lmes", + }, + OwnerReferences: []metav1.OwnerReference{ + { + APIVersion: lmesv1alpha1.Version, + Kind: lmesv1alpha1.KindName, + Name: "test", + Controller: &isController, + UID: "for-testing", + }, + }, + }, + TypeMeta: metav1.TypeMeta{ + Kind: "Pod", + APIVersion: 
"v1", + }, + Spec: corev1.PodSpec{ + InitContainers: []corev1.Container{ + { + Name: "driver", + Image: svcOpts.DriverImage, + ImagePullPolicy: svcOpts.ImagePullPolicy, + Command: []string{DriverPath, "--copy", DestDriverPath}, + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: &allowPrivilegeEscalation, + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{ + "ALL", + }, + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "shared", + MountPath: "/opt/app-root/src/bin", + }, + }, + }, + }, + Containers: []corev1.Container{ + { + Name: "main", + Image: svcOpts.PodImage, + ImagePullPolicy: svcOpts.ImagePullPolicy, + Env: []corev1.EnvVar{ + { + Name: "my_env", + ValueFrom: &corev1.EnvVarSource{ + SecretKeyRef: &corev1.SecretKeySelector{ + Key: "my-key", + LocalObjectReference: corev1.LocalObjectReference{ + Name: "my-secret", + }, + }, + }, + }, + }, + Command: generateCmd(svcOpts, job), + Args: generateArgs(svcOpts, job, log), + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: &allowPrivilegeEscalation, + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{ + "ALL", + }, + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "shared", + MountPath: "/opt/app-root/src/bin", + }, + }, + }, + }, + SecurityContext: &corev1.PodSecurityContext{ + RunAsNonRoot: &runAsNonRootUser, + SeccompProfile: &corev1.SeccompProfile{ + Type: corev1.SeccompProfileTypeRuntimeDefault, + }, + }, + Volumes: []corev1.Volume{ + { + Name: "shared", VolumeSource: corev1.VolumeSource{ + EmptyDir: &corev1.EmptyDirVolumeSource{}, + }, + }, + }, + RestartPolicy: corev1.RestartPolicyNever, + }, + } + + newPod := createPod(svcOpts, job, log) + // maybe only verify the envs: Containers[0].Env + assert.Equal(t, expect, newPod) +} + +func Test_FileSecretsPod(t *testing.T) { + log := log.FromContext(context.Background()) + svcOpts := &serviceOptions{ + PodImage: "podimage:latest", + DriverImage: "driver:latest", + 
ImagePullPolicy: corev1.PullAlways, + } + var job = &lmesv1alpha1.LMEvalJob{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + UID: "for-testing", + }, + TypeMeta: metav1.TypeMeta{ + Kind: lmesv1alpha1.KindName, + APIVersion: lmesv1alpha1.Version, + }, + Spec: lmesv1alpha1.LMEvalJobSpec{ + Model: "test", + ModelArgs: []lmesv1alpha1.Arg{ + {Name: "arg1", Value: "value1"}, + }, + TaskList: lmesv1alpha1.TaskList{ + TaskNames: []string{"task1", "task2"}, + }, + Pod: &lmesv1alpha1.LMEvalPodSpec{ + Container: &lmesv1alpha1.LMEvalContainer{ + VolumeMounts: []corev1.VolumeMount{ + { + Name: "secVol1", + MountPath: "the_path", + ReadOnly: true, + }, + }, + }, + Volumes: []corev1.Volume{ + { + Name: "secVol1", + VolumeSource: corev1.VolumeSource{ + Secret: &corev1.SecretVolumeSource{ + SecretName: "my-secret", + Items: []corev1.KeyToPath{ + { + Key: "key1", + Path: "path1", + }, + }, + }, + }, + }, + }, + }, + }, + } + + expect := &corev1.Pod{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + Labels: map[string]string{ + "app.kubernetes.io/name": "ta-lmes", + }, + OwnerReferences: []metav1.OwnerReference{ + { + APIVersion: lmesv1alpha1.Version, + Kind: lmesv1alpha1.KindName, + Name: "test", + Controller: &isController, + UID: "for-testing", + }, + }, + }, + TypeMeta: metav1.TypeMeta{ + Kind: "Pod", + APIVersion: "v1", + }, + Spec: corev1.PodSpec{ + InitContainers: []corev1.Container{ + { + Name: "driver", + Image: svcOpts.DriverImage, + ImagePullPolicy: svcOpts.ImagePullPolicy, + Command: []string{DriverPath, "--copy", DestDriverPath}, + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: &allowPrivilegeEscalation, + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{ + "ALL", + }, + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "shared", + MountPath: "/opt/app-root/src/bin", + }, + }, + }, + }, + Containers: []corev1.Container{ + { + Name: "main", + Image: svcOpts.PodImage, + 
ImagePullPolicy: svcOpts.ImagePullPolicy, + Command: generateCmd(svcOpts, job), + Args: generateArgs(svcOpts, job, log), + SecurityContext: &corev1.SecurityContext{ + AllowPrivilegeEscalation: &allowPrivilegeEscalation, + Capabilities: &corev1.Capabilities{ + Drop: []corev1.Capability{ + "ALL", + }, + }, + }, + VolumeMounts: []corev1.VolumeMount{ + { + Name: "shared", + MountPath: "/opt/app-root/src/bin", + }, + { + Name: "secVol1", + MountPath: "the_path", + ReadOnly: true, + }, + }, + }, + }, + SecurityContext: &corev1.PodSecurityContext{ + RunAsNonRoot: &runAsNonRootUser, + SeccompProfile: &corev1.SeccompProfile{ + Type: corev1.SeccompProfileTypeRuntimeDefault, + }, + }, + Volumes: []corev1.Volume{ + { + Name: "shared", VolumeSource: corev1.VolumeSource{ + EmptyDir: &corev1.EmptyDirVolumeSource{}, + }, + }, + { + Name: "secVol1", + VolumeSource: corev1.VolumeSource{ + Secret: &corev1.SecretVolumeSource{ + SecretName: "my-secret", + Items: []corev1.KeyToPath{ + { + Key: "key1", + Path: "path1", + }, + }, + }, + }, + }, + }, + RestartPolicy: corev1.RestartPolicyNever, + }, + } + + newPod := createPod(svcOpts, job, log) + // maybe only verify the envs: Containers[0].Env + assert.Equal(t, expect, newPod) +} + +func Test_GenerateArgBatchSize(t *testing.T) { + log := log.FromContext(context.Background()) + svcOpts := &serviceOptions{ + PodImage: "podimage:latest", + DriverImage: "driver:latest", + ImagePullPolicy: corev1.PullAlways, + MaxBatchSize: 20, + DefaultBatchSize: 4, + } + var job = &lmesv1alpha1.LMEvalJob{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + UID: "for-testing", + }, + TypeMeta: metav1.TypeMeta{ + Kind: lmesv1alpha1.KindName, + APIVersion: lmesv1alpha1.Version, + }, + Spec: lmesv1alpha1.LMEvalJobSpec{ + Model: "test", + ModelArgs: []lmesv1alpha1.Arg{ + {Name: "arg1", Value: "value1"}, + }, + TaskList: lmesv1alpha1.TaskList{ + TaskNames: []string{"task1", "task2"}, + }, + }, + } + + // no batchSize in the job, use default 
batchSize + assert.Equal(t, []string{ + "sh", "-ec", + "python -m lm_eval --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2 --include_path /opt/app-root/src/my_tasks --batch_size " + strconv.Itoa(svcOpts.DefaultBatchSize), + }, generateArgs(svcOpts, job, log)) + + // exceed the max-batch-size, use max-batch-size + var biggerBatchSize = 30 + job.Spec.BatchSize = &biggerBatchSize + assert.Equal(t, []string{ + "sh", "-ec", + "python -m lm_eval --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2 --include_path /opt/app-root/src/my_tasks --batch_size " + strconv.Itoa(svcOpts.MaxBatchSize), + }, generateArgs(svcOpts, job, log)) + + // normal batchSize + var normalBatchSize = 16 + job.Spec.BatchSize = &normalBatchSize + assert.Equal(t, []string{ + "sh", "-ec", + "python -m lm_eval --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2 --include_path /opt/app-root/src/my_tasks --batch_size 16", + }, generateArgs(svcOpts, job, log)) +} + +func Test_GenerateArgCmdTaskRecipes(t *testing.T) { + log := log.FromContext(context.Background()) + svcOpts := &serviceOptions{ + PodImage: "podimage:latest", + DriverImage: "driver:latest", + ImagePullPolicy: corev1.PullAlways, + MaxBatchSize: options.MaxBatchSize, + DefaultBatchSize: options.DefaultBatchSize, + } + var format = "unitxt.format" + var numDemos = 5 + var demosPoolSize = 10 + var job = &lmesv1alpha1.LMEvalJob{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + UID: "for-testing", + }, + TypeMeta: metav1.TypeMeta{ + Kind: lmesv1alpha1.KindName, + APIVersion: lmesv1alpha1.Version, + }, + Spec: lmesv1alpha1.LMEvalJobSpec{ + Model: "test", + ModelArgs: []lmesv1alpha1.Arg{ + {Name: "arg1", Value: "value1"}, + }, + TaskList: lmesv1alpha1.TaskList{ + TaskNames: []string{"task1", "task2"}, + TaskRecipes: []lmesv1alpha1.TaskRecipe{ + { + Card: lmesv1alpha1.Card{Name: 
"unitxt.card1"}, + Template: "unitxt.template", + Format: &format, + Metrics: []string{"unitxt.metric1", "unitxt.metric2"}, + NumDemos: &numDemos, + DemosPoolSize: &demosPoolSize, + }, + }, + }, + }, + } + + // one TaskRecipe + assert.Equal(t, []string{ + "sh", "-ec", + "python -m lm_eval --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2,tr_0 --include_path /opt/app-root/src/my_tasks --batch_size 8", + }, generateArgs(svcOpts, job, log)) + + assert.Equal(t, []string{ + "/opt/app-root/src/bin/driver", + "--output-path", "/opt/app-root/src/output", + "--task-recipe", "card=unitxt.card1,template=unitxt.template,metrics=[unitxt.metric1,unitxt.metric2],format=unitxt.format,num_demos=5,demos_pool_size=10", + "--", + }, generateCmd(svcOpts, job)) + + job.Spec.TaskList.TaskRecipes = append(job.Spec.TaskList.TaskRecipes, + lmesv1alpha1.TaskRecipe{ + Card: lmesv1alpha1.Card{Name: "unitxt.card2"}, + Template: "unitxt.template2", + Format: &format, + Metrics: []string{"unitxt.metric3", "unitxt.metric4"}, + NumDemos: &numDemos, + DemosPoolSize: &demosPoolSize, + }, + ) + + // two task recipes + // one TaskRecipe + assert.Equal(t, []string{ + "sh", "-ec", + "python -m lm_eval --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2,tr_0,tr_1 --include_path /opt/app-root/src/my_tasks --batch_size 8", + }, generateArgs(svcOpts, job, log)) + + assert.Equal(t, []string{ + "/opt/app-root/src/bin/driver", + "--output-path", "/opt/app-root/src/output", + "--task-recipe", "card=unitxt.card1,template=unitxt.template,metrics=[unitxt.metric1,unitxt.metric2],format=unitxt.format,num_demos=5,demos_pool_size=10", + "--task-recipe", "card=unitxt.card2,template=unitxt.template2,metrics=[unitxt.metric3,unitxt.metric4],format=unitxt.format,num_demos=5,demos_pool_size=10", + "--", + }, generateCmd(svcOpts, job)) +} + +func Test_GenerateArgCmdCustomCard(t *testing.T) { + log := 
log.FromContext(context.Background()) + svcOpts := &serviceOptions{ + PodImage: "podimage:latest", + DriverImage: "driver:latest", + ImagePullPolicy: corev1.PullAlways, + MaxBatchSize: options.MaxBatchSize, + DefaultBatchSize: options.DefaultBatchSize, + } + var format = "unitxt.format" + var numDemos = 5 + var demosPoolSize = 10 + var job = &lmesv1alpha1.LMEvalJob{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + UID: "for-testing", + }, + TypeMeta: metav1.TypeMeta{ + Kind: lmesv1alpha1.KindName, + APIVersion: lmesv1alpha1.Version, + }, + Spec: lmesv1alpha1.LMEvalJobSpec{ + Model: "test", + ModelArgs: []lmesv1alpha1.Arg{ + {Name: "arg1", Value: "value1"}, + }, + TaskList: lmesv1alpha1.TaskList{ + TaskNames: []string{"task1", "task2"}, + TaskRecipes: []lmesv1alpha1.TaskRecipe{ + { + Card: lmesv1alpha1.Card{ + Custom: `{ "__type__": "task_card", "loader": { "__type__": "load_hf", "path": "wmt16", "name": "de-en" }, "preprocess_steps": [ { "__type__": "copy", "field": "translation/en", "to_field": "text" }, { "__type__": "copy", "field": "translation/de", "to_field": "translation" }, { "__type__": "set", "fields": { "source_language": "english", "target_language": "dutch" } } ], "task": "tasks.translation.directed", "templates": "templates.translation.directed.all" }`, + }, + Template: "unitxt.template", + Format: &format, + Metrics: []string{"unitxt.metric1", "unitxt.metric2"}, + NumDemos: &numDemos, + DemosPoolSize: &demosPoolSize, + }, + }, + }, + }, + } + + assert.Equal(t, []string{ + "sh", "-ec", + "python -m lm_eval --output_path /opt/app-root/src/output --model test --model_args arg1=value1 --tasks task1,task2,tr_0 --include_path /opt/app-root/src/my_tasks --batch_size 8", + }, generateArgs(svcOpts, job, log)) + + assert.Equal(t, []string{ + "/opt/app-root/src/bin/driver", + "--output-path", "/opt/app-root/src/output", + "--task-recipe", 
"card=cards.custom_0,template=unitxt.template,metrics=[unitxt.metric1,unitxt.metric2],format=unitxt.format,num_demos=5,demos_pool_size=10", + "--custom-card", `{ "__type__": "task_card", "loader": { "__type__": "load_hf", "path": "wmt16", "name": "de-en" }, "preprocess_steps": [ { "__type__": "copy", "field": "translation/en", "to_field": "text" }, { "__type__": "copy", "field": "translation/de", "to_field": "translation" }, { "__type__": "set", "fields": { "source_language": "english", "target_language": "dutch" } } ], "task": "tasks.translation.directed", "templates": "templates.translation.directed.all" }`, + "--", + }, generateCmd(svcOpts, job)) +} + +func Test_CustomCardValidation(t *testing.T) { + log := log.FromContext(context.Background()) + lmevalRec := LMEvalJobReconciler{ + Namespace: "test", + } + var job = &lmesv1alpha1.LMEvalJob{ + ObjectMeta: metav1.ObjectMeta{ + Name: "test", + Namespace: "default", + UID: "for-testing", + }, + TypeMeta: metav1.TypeMeta{ + Kind: lmesv1alpha1.KindName, + APIVersion: lmesv1alpha1.Version, + }, + Spec: lmesv1alpha1.LMEvalJobSpec{ + Model: "test", + ModelArgs: []lmesv1alpha1.Arg{ + {Name: "arg1", Value: "value1"}, + }, + TaskList: lmesv1alpha1.TaskList{ + TaskRecipes: []lmesv1alpha1.TaskRecipe{ + { + Card: lmesv1alpha1.Card{ + Custom: "invalid JSON", + }, + }, + }, + }, + }, + } + + assert.ErrorContains(t, lmevalRec.validateCustomCard(job, log), "custom card is not a valid JSON string") + + // no loader + job.Spec.TaskList.TaskRecipes[0].Card.Custom = ` + { + "__type__": "task_card", + "preprocess_steps": [ + { + "__type__": "copy", + "field": "translation/en", + "to_field": "text" + }, + { + "__type__": "copy", + "field": "translation/de", + "to_field": "translation" + }, + { + "__type__": "set", + "fields": { + "source_language": "english", + "target_language": "dutch" + } + } + ], + "task": "tasks.translation.directed", + "templates": "templates.translation.directed.all" + }` + assert.ErrorContains(t, 
lmevalRec.validateCustomCard(job, log), "no loader definition in the custom card") + + // ok + job.Spec.TaskList.TaskRecipes[0].Card.Custom = ` + { + "__type__": "task_card", + "loader": { + "__type__": "load_hf", + "path": "wmt16", + "name": "de-en" + }, + "preprocess_steps": [ + { + "__type__": "copy", + "field": "translation/en", + "to_field": "text" + }, + { + "__type__": "copy", + "field": "translation/de", + "to_field": "translation" + }, + { + "__type__": "set", + "fields": { + "source_language": "english", + "target_language": "dutch" + } + } + ], + "task": "tasks.translation.directed", + "templates": "templates.translation.directed.all" + }` + + assert.Nil(t, lmevalRec.validateCustomCard(job, log)) +} + +func Test_ConcatTasks(t *testing.T) { + tasks := concatTasks(lmesv1alpha1.TaskList{ + TaskNames: []string{"task1", "task2"}, + TaskRecipes: []lmesv1alpha1.TaskRecipe{ + {Template: "template3", Card: lmesv1alpha1.Card{Name: "format3"}}, + }, + }) + + assert.Equal(t, []string{"task1", "task2", driver.TaskRecipePrefix + "_0"}, tasks) +} diff --git a/controllers/tas.go b/controllers/tas.go new file mode 100644 index 00000000..d6fd9632 --- /dev/null +++ b/controllers/tas.go @@ -0,0 +1,23 @@ +/* +Copyright 2024. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+*/ + +package controllers + +import "github.com/trustyai-explainability/trustyai-service-operator/controllers/tas" + +func init() { + registerService(tas.ServiceName, tas.ControllerSetUp) +} diff --git a/controllers/certificates.go b/controllers/tas/certificates.go similarity index 97% rename from controllers/certificates.go rename to controllers/tas/certificates.go index 16faa1f9..54e1e1ae 100644 --- a/controllers/certificates.go +++ b/controllers/tas/certificates.go @@ -1,8 +1,9 @@ -package controllers +package tas import ( "context" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" corev1 "k8s.io/api/core/v1" "sigs.k8s.io/controller-runtime/pkg/client" "sigs.k8s.io/controller-runtime/pkg/log" diff --git a/controllers/config_maps.go b/controllers/tas/config_maps.go similarity index 95% rename from controllers/config_maps.go rename to controllers/tas/config_maps.go index a4885ab4..83da00a2 100644 --- a/controllers/config_maps.go +++ b/controllers/tas/config_maps.go @@ -1,8 +1,10 @@ -package controllers +package tas import ( "context" "fmt" + + "github.com/trustyai-explainability/trustyai-service-operator/controllers/constants" corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/errors" "k8s.io/apimachinery/pkg/types" @@ -15,7 +17,7 @@ func (r *TrustyAIServiceReconciler) getImageFromConfigMap(ctx context.Context, k // Define the key for the ConfigMap configMapKey := types.NamespacedName{ Namespace: r.Namespace, - Name: imageConfigMap, + Name: constants.ConfigMap, } // Create an empty ConfigMap object diff --git a/controllers/config_maps_test.go b/controllers/tas/config_maps_test.go similarity index 94% rename from controllers/config_maps_test.go rename to controllers/tas/config_maps_test.go index 6dfee49d..f2d701e6 100644 --- a/controllers/config_maps_test.go +++ 
b/controllers/tas/config_maps_test.go @@ -1,4 +1,4 @@ -package controllers +package tas import ( "context" @@ -6,6 +6,7 @@ import ( . "github.com/onsi/ginkgo/v2" . "github.com/onsi/gomega" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/constants" corev1 "k8s.io/api/core/v1" apierrors "k8s.io/apimachinery/pkg/api/errors" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" @@ -34,7 +35,7 @@ var _ = Describe("ConfigMap tests", func() { configMap := &corev1.ConfigMap{} err := k8sClient.Get(ctx, types.NamespacedName{ Namespace: operatorNamespace, - Name: imageConfigMap, + Name: constants.ConfigMap, }, configMap) // If the ConfigMap exists, delete it @@ -111,7 +112,7 @@ var _ = Describe("ConfigMap tests", func() { WaitFor(func() error { configMap := &corev1.ConfigMap{ ObjectMeta: metav1.ObjectMeta{ - Name: imageConfigMap, + Name: constants.ConfigMap, Namespace: operatorNamespace, }, Data: map[string]string{ @@ -125,7 +126,7 @@ var _ = Describe("ConfigMap tests", func() { var actualOAuthImage string var actualServiceImage string - configMapPath := operatorNamespace + "/" + imageConfigMap + configMapPath := operatorNamespace + "/" + constants.ConfigMap Eventually(func() error { var err error diff --git a/controllers/constants.go b/controllers/tas/constants.go similarity index 98% rename from controllers/constants.go rename to controllers/tas/constants.go index 18fdb4c5..989900f6 100644 --- a/controllers/constants.go +++ b/controllers/tas/constants.go @@ -1,4 +1,4 @@ -package controllers +package tas import "time" @@ -16,6 +16,7 @@ const ( volumeMountName = "volume" defaultRequeueDelay = 30 * time.Second dbCredentialsSuffix = "-db-credentials" + ServiceName = "TAS" ) // Allowed storage formats diff --git a/controllers/database.go b/controllers/tas/database.go similarity index 96% rename from controllers/database.go rename to controllers/tas/database.go index 1c5d26a1..014b55d9 100644 --- a/controllers/database.go +++ b/controllers/tas/database.go @@ 
-1,10 +1,10 @@ -package controllers +package tas import ( "context" "strings" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" appsv1 "k8s.io/api/apps/v1" corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/errors" diff --git a/controllers/deployment.go b/controllers/tas/deployment.go similarity index 97% rename from controllers/deployment.go rename to controllers/tas/deployment.go index 4696b31e..4d95740e 100644 --- a/controllers/deployment.go +++ b/controllers/tas/deployment.go @@ -1,13 +1,13 @@ -package controllers +package tas import ( "context" "reflect" "strconv" - templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/templates" - - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/constants" + templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/tas/templates" appsv1 "k8s.io/api/apps/v1" corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/errors" @@ -69,7 +69,7 @@ func (r *TrustyAIServiceReconciler) createDeploymentObject(ctx context.Context, VolumeMountName: volumeMountName, PVCClaimName: pvcName, CustomCertificatesBundle: caBunble, - Version: Version, + Version: constants.Version, BatchSize: batchSize, } diff --git a/controllers/deployment_test.go b/controllers/tas/deployment_test.go similarity index 99% rename from controllers/deployment_test.go rename to controllers/tas/deployment_test.go index 0aa0f925..ec8d262c 100644 --- a/controllers/deployment_test.go +++ b/controllers/tas/deployment_test.go @@ -1,4 +1,4 @@ -package controllers +package tas import ( "context" @@ 
-7,7 +7,8 @@ import ( . "github.com/onsi/ginkgo/v2" . "github.com/onsi/gomega" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/constants" appsv1 "k8s.io/api/apps/v1" corev1 "k8s.io/api/core/v1" apierrors "k8s.io/apimachinery/pkg/api/errors" @@ -47,7 +48,7 @@ func setupAndTestDeploymentDefault(instance *trustyaiopendatahubiov1alpha1.Trust Expect(deployment.Labels["app.kubernetes.io/name"]).Should(Equal(defaultServiceName)) Expect(deployment.Labels["app.kubernetes.io/instance"]).Should(Equal(defaultServiceName)) Expect(deployment.Labels["app.kubernetes.io/part-of"]).Should(Equal(componentName)) - Expect(deployment.Labels["app.kubernetes.io/version"]).Should(Equal(Version)) + Expect(deployment.Labels["app.kubernetes.io/version"]).Should(Equal(constants.Version)) Expect(len(deployment.Spec.Template.Spec.Containers)).Should(Equal(2)) Expect(deployment.Spec.Template.Spec.Containers[0].Image).Should(Equal("quay.io/trustyai/trustyai-service:latest")) @@ -86,7 +87,7 @@ func setupAndTestDeploymentDefault(instance *trustyaiopendatahubiov1alpha1.Trust Expect(oauthService.Labels["app.kubernetes.io/instance"]).Should(Equal(instance.Name)) Expect(oauthService.Labels["app.kubernetes.io/name"]).Should(Equal(instance.Name)) Expect(oauthService.Labels["app.kubernetes.io/part-of"]).Should(Equal(componentName)) - Expect(oauthService.Labels["app.kubernetes.io/version"]).Should(Equal(Version)) + Expect(oauthService.Labels["app.kubernetes.io/version"]).Should(Equal(constants.Version)) Expect(oauthService.Labels["trustyai-service-name"]).Should(Equal(instance.Name)) } @@ -122,7 +123,7 @@ func setupAndTestDeploymentConfigMap(instance *trustyaiopendatahubiov1alpha1.Tru 
Expect(deployment.Labels["app.kubernetes.io/name"]).Should(Equal(defaultServiceName)) Expect(deployment.Labels["app.kubernetes.io/instance"]).Should(Equal(defaultServiceName)) Expect(deployment.Labels["app.kubernetes.io/part-of"]).Should(Equal(componentName)) - Expect(deployment.Labels["app.kubernetes.io/version"]).Should(Equal(Version)) + Expect(deployment.Labels["app.kubernetes.io/version"]).Should(Equal(constants.Version)) Expect(len(deployment.Spec.Template.Spec.Containers)).Should(Equal(2)) Expect(deployment.Spec.Template.Spec.Containers[0].Image).Should(Equal(serviceImage)) @@ -161,7 +162,7 @@ func setupAndTestDeploymentConfigMap(instance *trustyaiopendatahubiov1alpha1.Tru Expect(oauthService.Labels["app.kubernetes.io/instance"]).Should(Equal(instance.Name)) Expect(oauthService.Labels["app.kubernetes.io/name"]).Should(Equal(instance.Name)) Expect(oauthService.Labels["app.kubernetes.io/part-of"]).Should(Equal(componentName)) - Expect(oauthService.Labels["app.kubernetes.io/version"]).Should(Equal(Version)) + Expect(oauthService.Labels["app.kubernetes.io/version"]).Should(Equal(constants.Version)) Expect(oauthService.Labels["trustyai-service-name"]).Should(Equal(instance.Name)) } @@ -293,7 +294,7 @@ func setupAndTestDeploymentInferenceService(instance *trustyaiopendatahubiov1alp Expect(deployment.Labels["app.kubernetes.io/name"]).Should(Equal(defaultServiceName)) Expect(deployment.Labels["app.kubernetes.io/instance"]).Should(Equal(defaultServiceName)) Expect(deployment.Labels["app.kubernetes.io/part-of"]).Should(Equal(componentName)) - Expect(deployment.Labels["app.kubernetes.io/version"]).Should(Equal(Version)) + Expect(deployment.Labels["app.kubernetes.io/version"]).Should(Equal(constants.Version)) WaitFor(func() error { err := reconciler.reconcileOAuthService(ctx, instance, caBundle) @@ -313,7 +314,7 @@ func setupAndTestDeploymentInferenceService(instance *trustyaiopendatahubiov1alp 
Expect(oauthService.Labels["app.kubernetes.io/instance"]).Should(Equal(instance.Name)) Expect(oauthService.Labels["app.kubernetes.io/name"]).Should(Equal(instance.Name)) Expect(oauthService.Labels["app.kubernetes.io/part-of"]).Should(Equal(componentName)) - Expect(oauthService.Labels["app.kubernetes.io/version"]).Should(Equal(Version)) + Expect(oauthService.Labels["app.kubernetes.io/version"]).Should(Equal(constants.Version)) Expect(oauthService.Labels["trustyai-service-name"]).Should(Equal(instance.Name)) } @@ -888,7 +889,7 @@ var _ = Describe("TrustyAI operator", func() { Expect(deployment.Labels["app.kubernetes.io/name"]).Should(Equal(defaultServiceName)) Expect(deployment.Labels["app.kubernetes.io/instance"]).Should(Equal(defaultServiceName)) Expect(deployment.Labels["app.kubernetes.io/part-of"]).Should(Equal(componentName)) - Expect(deployment.Labels["app.kubernetes.io/version"]).Should(Equal(Version)) + Expect(deployment.Labels["app.kubernetes.io/version"]).Should(Equal(constants.Version)) Expect(len(deployment.Spec.Template.Spec.Containers)).Should(Equal(2)) Expect(deployment.Spec.Template.Spec.Containers[0].Image).Should(Equal("quay.io/trustyai/trustyai-service:latest")) @@ -912,7 +913,7 @@ var _ = Describe("TrustyAI operator", func() { Expect(oauthService.Labels["app.kubernetes.io/instance"]).Should(Equal(instance.Name)) Expect(oauthService.Labels["app.kubernetes.io/name"]).Should(Equal(instance.Name)) Expect(oauthService.Labels["app.kubernetes.io/part-of"]).Should(Equal(componentName)) - Expect(oauthService.Labels["app.kubernetes.io/version"]).Should(Equal(Version)) + Expect(oauthService.Labels["app.kubernetes.io/version"]).Should(Equal(constants.Version)) Expect(oauthService.Labels["trustyai-service-name"]).Should(Equal(instance.Name)) } diff --git a/controllers/destination_rule.go b/controllers/tas/destination_rule.go similarity index 96% rename from controllers/destination_rule.go rename to controllers/tas/destination_rule.go index ce17552f..caf32336 
100644 --- a/controllers/destination_rule.go +++ b/controllers/tas/destination_rule.go @@ -1,12 +1,12 @@ -package controllers +package tas import ( "context" "fmt" "reflect" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" - templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/templates" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/tas/templates" apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1" "k8s.io/apimachinery/pkg/api/errors" "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" diff --git a/controllers/events.go b/controllers/tas/events.go similarity index 94% rename from controllers/events.go rename to controllers/tas/events.go index c93af02b..4288a957 100644 --- a/controllers/events.go +++ b/controllers/tas/events.go @@ -1,7 +1,7 @@ -package controllers +package tas import ( - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" corev1 "k8s.io/api/core/v1" ) diff --git a/controllers/finalizers.go b/controllers/tas/finalizers.go similarity index 93% rename from controllers/finalizers.go rename to controllers/tas/finalizers.go index 432abb3e..b3cac73b 100644 --- a/controllers/finalizers.go +++ b/controllers/tas/finalizers.go @@ -1,8 +1,9 @@ -package controllers +package tas import ( "context" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" "sigs.k8s.io/controller-runtime/pkg/log" ) diff --git a/controllers/inference_services.go 
b/controllers/tas/inference_services.go similarity index 91% rename from controllers/inference_services.go rename to controllers/tas/inference_services.go index 827ed4f4..950f4fde 100644 --- a/controllers/inference_services.go +++ b/controllers/tas/inference_services.go @@ -1,4 +1,4 @@ -package controllers +package tas import ( "context" @@ -6,9 +6,11 @@ import ( "strings" kservev1beta1 "github.com/kserve/kserve/pkg/apis/serving/v1beta1" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/utils" appsv1 "k8s.io/api/apps/v1" corev1 "k8s.io/api/core/v1" + "k8s.io/apimachinery/pkg/labels" "sigs.k8s.io/controller-runtime/pkg/client" "sigs.k8s.io/controller-runtime/pkg/log" ) @@ -19,6 +21,23 @@ const ( DEPLOYMENT_MODE_SERVERLESS = "Serverless" ) +// GetDeploymentsByLabel returns a list of Deployments that match a label key-value pair +func (r *TrustyAIServiceReconciler) GetDeploymentsByLabel(ctx context.Context, namespace string, labelKey string, labelValue string) ([]appsv1.Deployment, error) { + // Prepare a DeploymentList object + deployments := &appsv1.DeploymentList{} + + // Define the selector based on the provided label key-value pair + selector := labels.Set{labelKey: labelValue}.AsSelector() + + // Fetch the Deployments that match the selector + if err := r.List(ctx, deployments, client.InNamespace(namespace), client.MatchingLabelsSelector{Selector: selector}); err != nil { + log.FromContext(ctx).Error(err, "Could not list Deployments by label.") + return nil, err + } + + return deployments.Items, nil +} + func (r *TrustyAIServiceReconciler) patchEnvVarsForDeployments(ctx context.Context, instance *trustyaiopendatahubiov1alpha1.TrustyAIService, deployments []appsv1.Deployment, envVarName string, url string, remove bool) 
(bool, error) { // Create volume and volume mount for this intance's TLS secrets certVolumes := TLSCertVolumes{} @@ -148,7 +167,7 @@ func (r *TrustyAIServiceReconciler) patchEnvVarsByLabelForDeployments(ctx contex } // Build the payload processor endpoint - url := generateTLSServiceURL(crName, namespace) + "/consumer/kserve/v2" + url := utils.GenerateTLSServiceURL(crName, namespace) + "/consumer/kserve/v2" // Patch environment variables for the Deployments if shouldContinue, err := r.patchEnvVarsForDeployments(ctx, instance, deployments, envVarName, url, remove); err != nil { @@ -241,7 +260,7 @@ func (r *TrustyAIServiceReconciler) handleInferenceServices(ctx context.Context, // patchKServe adds a TrustyAI service as an InferenceLogger to a KServe InferenceService func (r *TrustyAIServiceReconciler) patchKServe(ctx context.Context, instance *trustyaiopendatahubiov1alpha1.TrustyAIService, infService kservev1beta1.InferenceService, namespace string, crName string, remove bool) error { - url := generateKServeLoggerURL(crName, namespace) + url := utils.GenerateKServeLoggerURL(crName, namespace) if remove { if infService.Spec.Predictor.Logger == nil || *infService.Spec.Predictor.Logger.URL != url { diff --git a/controllers/monitor.go b/controllers/tas/monitor.go similarity index 97% rename from controllers/monitor.go rename to controllers/tas/monitor.go index f2d27227..d793f9ad 100644 --- a/controllers/monitor.go +++ b/controllers/tas/monitor.go @@ -1,13 +1,14 @@ -package controllers +package tas import ( "context" + "reflect" + monitoringv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" - templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/templates" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + templateParser 
"github.com/trustyai-explainability/trustyai-service-operator/controllers/tas/templates" "k8s.io/apimachinery/pkg/api/errors" "k8s.io/apimachinery/pkg/types" - "reflect" ctrl "sigs.k8s.io/controller-runtime" "sigs.k8s.io/controller-runtime/pkg/log" ) diff --git a/controllers/monitor_test.go b/controllers/tas/monitor_test.go similarity index 98% rename from controllers/monitor_test.go rename to controllers/tas/monitor_test.go index ce9bc9c9..e1c15214 100644 --- a/controllers/monitor_test.go +++ b/controllers/tas/monitor_test.go @@ -1,11 +1,12 @@ -package controllers +package tas import ( "context" + . "github.com/onsi/ginkgo/v2" . "github.com/onsi/gomega" monitoringv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" "k8s.io/apimachinery/pkg/types" "k8s.io/client-go/kubernetes/scheme" "k8s.io/client-go/tools/record" diff --git a/controllers/oauth.go b/controllers/tas/oauth.go similarity index 92% rename from controllers/oauth.go rename to controllers/tas/oauth.go index fc9e5c8a..3c7f9acc 100644 --- a/controllers/oauth.go +++ b/controllers/tas/oauth.go @@ -1,13 +1,15 @@ -package controllers +package tas import ( "context" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" - templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/templates" + "reflect" + + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/constants" + templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/tas/templates" corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/errors" 
"k8s.io/apimachinery/pkg/types" - "reflect" ctrl "sigs.k8s.io/controller-runtime" "sigs.k8s.io/controller-runtime/pkg/log" ) @@ -32,7 +34,7 @@ func generateTrustyAIOAuthService(ctx context.Context, instance *trustyaiopendat serviceTLSConfig := ServiceTLSConfig{ Instance: instance, CustomCertificatesBundle: caBundle, - Version: Version, + Version: constants.Version, } var serviceTLS *corev1.Service diff --git a/controllers/route.go b/controllers/tas/route.go similarity index 97% rename from controllers/route.go rename to controllers/tas/route.go index 3016a321..ec216d65 100644 --- a/controllers/route.go +++ b/controllers/tas/route.go @@ -1,14 +1,15 @@ -package controllers +package tas import ( "context" + "reflect" + routev1 "github.com/openshift/api/route/v1" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" - templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/templates" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/tas/templates" corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/errors" "k8s.io/apimachinery/pkg/types" - "reflect" ctrl "sigs.k8s.io/controller-runtime" "sigs.k8s.io/controller-runtime/pkg/log" ) diff --git a/controllers/route_test.go b/controllers/tas/route_test.go similarity index 98% rename from controllers/route_test.go rename to controllers/tas/route_test.go index da35a689..946c0e45 100644 --- a/controllers/route_test.go +++ b/controllers/tas/route_test.go @@ -1,11 +1,12 @@ -package controllers +package tas import ( "context" + . "github.com/onsi/ginkgo/v2" . 
"github.com/onsi/gomega" routev1 "github.com/openshift/api/route/v1" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" "k8s.io/apimachinery/pkg/types" "k8s.io/client-go/kubernetes/scheme" "k8s.io/client-go/tools/record" diff --git a/controllers/secrets.go b/controllers/tas/secrets.go similarity index 97% rename from controllers/secrets.go rename to controllers/tas/secrets.go index 499d768b..978f813b 100644 --- a/controllers/secrets.go +++ b/controllers/tas/secrets.go @@ -1,9 +1,10 @@ -package controllers +package tas import ( "context" "fmt" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/errors" "sigs.k8s.io/controller-runtime/pkg/client" diff --git a/controllers/service_accounts.go b/controllers/tas/service_accounts.go similarity index 95% rename from controllers/service_accounts.go rename to controllers/tas/service_accounts.go index 7c9faf63..0cf64784 100644 --- a/controllers/service_accounts.go +++ b/controllers/tas/service_accounts.go @@ -1,8 +1,10 @@ -package controllers +package tas import ( "context" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/constants" "k8s.io/apimachinery/pkg/api/errors" "k8s.io/apimachinery/pkg/types" "k8s.io/apimachinery/pkg/util/json" @@ -64,7 +66,7 @@ func (r *TrustyAIServiceReconciler) createServiceAccount(ctx context.Context, in "app.kubernetes.io/name": serviceAccountName, 
"app.kubernetes.io/instance": instance.Name, "app.kubernetes.io/part-of": componentName, - "app.kubernetes.io/version": Version, + "app.kubernetes.io/version": constants.Version, }, }, } diff --git a/controllers/service_accounts_test.go b/controllers/tas/service_accounts_test.go similarity index 99% rename from controllers/service_accounts_test.go rename to controllers/tas/service_accounts_test.go index 17b9d660..e7ef134a 100644 --- a/controllers/service_accounts_test.go +++ b/controllers/tas/service_accounts_test.go @@ -1,7 +1,8 @@ -package controllers +package tas import ( "context" + . "github.com/onsi/ginkgo/v2" . "github.com/onsi/gomega" corev1 "k8s.io/api/core/v1" diff --git a/controllers/services.go b/controllers/tas/services.go similarity index 83% rename from controllers/services.go rename to controllers/tas/services.go index d2051fa0..4805bd3d 100644 --- a/controllers/services.go +++ b/controllers/tas/services.go @@ -1,11 +1,13 @@ -package controllers +package tas import ( "context" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" - templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/templates" - corev1 "k8s.io/api/core/v1" "reflect" + + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/constants" + templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/tas/templates" + corev1 "k8s.io/api/core/v1" ctrl "sigs.k8s.io/controller-runtime" "sigs.k8s.io/controller-runtime/pkg/log" ) @@ -25,7 +27,7 @@ func (r *TrustyAIServiceReconciler) reconcileService(ctx context.Context, cr *tr serviceConfig := ServiceConfig{ Name: cr.Name, Namespace: cr.Namespace, - Version: Version, + Version: constants.Version, } var service *corev1.Service diff --git a/controllers/statuses.go b/controllers/tas/statuses.go 
similarity index 99% rename from controllers/statuses.go rename to controllers/tas/statuses.go index b43bbbd1..ca4d4402 100644 --- a/controllers/statuses.go +++ b/controllers/tas/statuses.go @@ -1,9 +1,9 @@ -package controllers +package tas import ( "context" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" v1 "k8s.io/api/core/v1" "k8s.io/client-go/util/retry" ctrl "sigs.k8s.io/controller-runtime" diff --git a/controllers/statuses_test.go b/controllers/tas/statuses_test.go similarity index 99% rename from controllers/statuses_test.go rename to controllers/tas/statuses_test.go index f6ad7fe3..b97b156c 100644 --- a/controllers/statuses_test.go +++ b/controllers/tas/statuses_test.go @@ -1,4 +1,4 @@ -package controllers +package tas import ( "context" @@ -6,7 +6,7 @@ import ( . "github.com/onsi/ginkgo/v2" . "github.com/onsi/gomega" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/types" "k8s.io/client-go/kubernetes/scheme" diff --git a/controllers/storage.go b/controllers/tas/storage.go similarity index 97% rename from controllers/storage.go rename to controllers/tas/storage.go index d9378027..762521f1 100644 --- a/controllers/storage.go +++ b/controllers/tas/storage.go @@ -1,8 +1,9 @@ -package controllers +package tas import ( "context" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" corev1 "k8s.io/api/core/v1" apierrors "k8s.io/apimachinery/pkg/api/errors" "k8s.io/apimachinery/pkg/api/resource" diff --git 
a/controllers/storage_test.go b/controllers/tas/storage_test.go similarity index 97% rename from controllers/storage_test.go rename to controllers/tas/storage_test.go index c5799035..f91919af 100644 --- a/controllers/storage_test.go +++ b/controllers/tas/storage_test.go @@ -1,10 +1,11 @@ -package controllers +package tas import ( "context" + . "github.com/onsi/ginkgo/v2" . "github.com/onsi/gomega" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/resource" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" diff --git a/controllers/subreconciler.go b/controllers/tas/subreconciler.go similarity index 95% rename from controllers/subreconciler.go rename to controllers/tas/subreconciler.go index 3748a4d9..43fb689e 100644 --- a/controllers/subreconciler.go +++ b/controllers/tas/subreconciler.go @@ -1,12 +1,13 @@ -package controllers +package tas import ( "context" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + "time" + + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" ctrl "sigs.k8s.io/controller-runtime" "sigs.k8s.io/controller-runtime/pkg/log" "sigs.k8s.io/controller-runtime/pkg/reconcile" - "time" ) func Requeue() (reconcile.Result, error) { return ctrl.Result{Requeue: true}, nil } diff --git a/controllers/suite_test.go b/controllers/tas/suite_test.go similarity index 97% rename from controllers/suite_test.go rename to controllers/tas/suite_test.go index a2f16ba8..9c8f0bda 100644 --- a/controllers/suite_test.go +++ b/controllers/tas/suite_test.go @@ -14,7 +14,7 @@ See the License for the specific language governing permissions and limitations under the License. 
*/ -package controllers +package tas import ( "context" @@ -33,7 +33,8 @@ import ( . "github.com/onsi/gomega" routev1 "github.com/openshift/api/route/v1" monitoringv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/constants" appsv1 "k8s.io/api/apps/v1" corev1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/errors" @@ -188,7 +189,7 @@ func createConfigMap(namespace string, oauthImage string, trustyaiServiceImage s // Define the ConfigMap with the necessary data return &corev1.ConfigMap{ ObjectMeta: metav1.ObjectMeta{ - Name: imageConfigMap, + Name: constants.ConfigMap, Namespace: namespace, }, Data: map[string]string{ @@ -511,9 +512,9 @@ var _ = BeforeSuite(func() { By("bootstrapping test environment") testEnv = &envtest.Environment{ - CRDDirectoryPaths: []string{filepath.Join("..", "config", "crd", "bases"), - filepath.Join("..", "config", "prometheus"), - filepath.Join("..", "tests", "crds")}, + CRDDirectoryPaths: []string{filepath.Join("..", "..", "config", "crd", "bases"), + filepath.Join("..", "..", "config", "prometheus"), + filepath.Join("..", "..", "tests", "crds")}, ErrorIfCRDPathMissing: true, } diff --git a/controllers/templates/parser.go b/controllers/tas/templates/parser.go similarity index 100% rename from controllers/templates/parser.go rename to controllers/tas/templates/parser.go diff --git a/controllers/templates/service/deployment.tmpl.yaml b/controllers/tas/templates/service/deployment.tmpl.yaml similarity index 100% rename from controllers/templates/service/deployment.tmpl.yaml rename to controllers/tas/templates/service/deployment.tmpl.yaml diff --git a/controllers/templates/service/destination-rule.tmpl.yaml 
b/controllers/tas/templates/service/destination-rule.tmpl.yaml similarity index 100% rename from controllers/templates/service/destination-rule.tmpl.yaml rename to controllers/tas/templates/service/destination-rule.tmpl.yaml diff --git a/controllers/templates/service/route.tmpl.yaml b/controllers/tas/templates/service/route.tmpl.yaml similarity index 100% rename from controllers/templates/service/route.tmpl.yaml rename to controllers/tas/templates/service/route.tmpl.yaml diff --git a/controllers/templates/service/service-internal.tmpl.yaml b/controllers/tas/templates/service/service-internal.tmpl.yaml similarity index 100% rename from controllers/templates/service/service-internal.tmpl.yaml rename to controllers/tas/templates/service/service-internal.tmpl.yaml diff --git a/controllers/templates/service/service-monitor-central.tmpl.yaml b/controllers/tas/templates/service/service-monitor-central.tmpl.yaml similarity index 100% rename from controllers/templates/service/service-monitor-central.tmpl.yaml rename to controllers/tas/templates/service/service-monitor-central.tmpl.yaml diff --git a/controllers/templates/service/service-monitor-local.tmpl.yaml b/controllers/tas/templates/service/service-monitor-local.tmpl.yaml similarity index 100% rename from controllers/templates/service/service-monitor-local.tmpl.yaml rename to controllers/tas/templates/service/service-monitor-local.tmpl.yaml diff --git a/controllers/templates/service/service-tls.tmpl.yaml b/controllers/tas/templates/service/service-tls.tmpl.yaml similarity index 100% rename from controllers/templates/service/service-tls.tmpl.yaml rename to controllers/tas/templates/service/service-tls.tmpl.yaml diff --git a/controllers/templates/service/virtual-service.tmpl.yaml b/controllers/tas/templates/service/virtual-service.tmpl.yaml similarity index 100% rename from controllers/templates/service/virtual-service.tmpl.yaml rename to controllers/tas/templates/service/virtual-service.tmpl.yaml diff --git 
a/controllers/trustyaiservice_controller.go b/controllers/tas/trustyaiservice_controller.go similarity index 94% rename from controllers/trustyaiservice_controller.go rename to controllers/tas/trustyaiservice_controller.go index a6ae3770..619ca6ac 100644 --- a/controllers/trustyaiservice_controller.go +++ b/controllers/tas/trustyaiservice_controller.go @@ -14,7 +14,7 @@ See the License for the specific language governing permissions and limitations under the License. */ -package controllers +package tas import ( "context" @@ -23,7 +23,8 @@ import ( kservev1alpha1 "github.com/kserve/kserve/pkg/apis/serving/v1alpha1" kservev1beta1 "github.com/kserve/kserve/pkg/apis/serving/v1beta1" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + "github.com/trustyai-explainability/trustyai-service-operator/controllers/utils" appsv1 "k8s.io/api/apps/v1" v1 "k8s.io/api/core/v1" "k8s.io/apimachinery/pkg/api/errors" @@ -34,11 +35,21 @@ import ( "sigs.k8s.io/controller-runtime/pkg/client" "sigs.k8s.io/controller-runtime/pkg/handler" "sigs.k8s.io/controller-runtime/pkg/log" + "sigs.k8s.io/controller-runtime/pkg/manager" "sigs.k8s.io/controller-runtime/pkg/source" ) var ErrPVCNotReady = goerrors.New("PVC is not ready") +func ControllerSetUp(mgr manager.Manager, ns, configmap string, recorder record.EventRecorder) error { + return (&TrustyAIServiceReconciler{ + Client: mgr.GetClient(), + Scheme: mgr.GetScheme(), + Namespace: ns, + EventRecorder: recorder, + }).SetupWithManager(mgr) +} + // TrustyAIServiceReconciler reconciles a TrustyAIService object type TrustyAIServiceReconciler struct { client.Client @@ -94,7 +105,7 @@ func (r *TrustyAIServiceReconciler) Reconcile(ctx context.Context, req ctrl.Requ // Check if the CR is being deleted if instance.DeletionTimestamp != nil { // CR is being deleted - if 
containsString(instance.Finalizers, finalizerName) { + if utils.ContainsString(instance.Finalizers, finalizerName) { // The finalizer is present, so we handle external dependency deletion if err := r.deleteExternalDependency(req.Name, instance, req.Namespace, ctx); err != nil { // Log the error instead of returning it, so we proceed to remove the finalizer without blocking @@ -102,7 +113,7 @@ func (r *TrustyAIServiceReconciler) Reconcile(ctx context.Context, req ctrl.Requ } // Remove the finalizer from the list and update it. - instance.Finalizers = removeString(instance.Finalizers, finalizerName) + instance.Finalizers = utils.RemoveString(instance.Finalizers, finalizerName) if err := r.Update(ctx, instance); err != nil { return RequeueWithErrorMessage(ctx, err, "Failed to remove the finalizer.") } @@ -111,7 +122,7 @@ func (r *TrustyAIServiceReconciler) Reconcile(ctx context.Context, req ctrl.Requ } // Add the finalizer if it does not exist - if !containsString(instance.Finalizers, finalizerName) { + if !utils.ContainsString(instance.Finalizers, finalizerName) { instance.Finalizers = append(instance.Finalizers, finalizerName) if err := r.Update(ctx, instance); err != nil { return RequeueWithErrorMessage(ctx, err, "Failed to add the finalizer.") diff --git a/controllers/virtual_service.go b/controllers/tas/virtual_service.go similarity index 96% rename from controllers/virtual_service.go rename to controllers/tas/virtual_service.go index f5223c86..188c049c 100644 --- a/controllers/virtual_service.go +++ b/controllers/tas/virtual_service.go @@ -1,12 +1,12 @@ -package controllers +package tas import ( "context" "fmt" "reflect" - trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/v1alpha1" - templateParser "github.com/trustyai-explainability/trustyai-service-operator/controllers/templates" + trustyaiopendatahubiov1alpha1 "github.com/trustyai-explainability/trustyai-service-operator/api/tas/v1alpha1" + templateParser 
"github.com/trustyai-explainability/trustyai-service-operator/controllers/tas/templates" apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1" "k8s.io/apimachinery/pkg/api/errors" "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured" diff --git a/controllers/utils.go b/controllers/utils.go deleted file mode 100644 index 759b3880..00000000 --- a/controllers/utils.go +++ /dev/null @@ -1,78 +0,0 @@ -package controllers - -import ( - "context" - "os" - - appsv1 "k8s.io/api/apps/v1" - "k8s.io/apimachinery/pkg/labels" - "sigs.k8s.io/controller-runtime/pkg/client" - "sigs.k8s.io/controller-runtime/pkg/log" -) - -func isDeploymentReady(deployment *appsv1.Deployment) bool { - return deployment.Status.Replicas == deployment.Status.UpdatedReplicas && - deployment.Status.Replicas == deployment.Status.AvailableReplicas -} - -// containsString checks if a list contains a string -func containsString(list []string, s string) bool { - for _, v := range list { - if v == s { - return true - } - } - return false -} - -// removeString removes a string from a list -func removeString(list []string, s string) []string { - newList := []string{} - for _, v := range list { - if v != s { - newList = append(newList, v) - } - } - return newList -} - -// GetNamespace returns the namespace of a pod -func GetNamespace() (string, error) { - ns, err := os.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/namespace") - if err != nil { - return "", err - } - return string(ns), nil -} - -// GetDeploymentsByLabel returns a list of Deployments that match a label key-value pair -func (r *TrustyAIServiceReconciler) GetDeploymentsByLabel(ctx context.Context, namespace string, labelKey string, labelValue string) ([]appsv1.Deployment, error) { - // Prepare a DeploymentList object - deployments := &appsv1.DeploymentList{} - - // Define the selector based on the provided label key-value pair - selector := labels.Set{labelKey: labelValue}.AsSelector() - - // Fetch the Deployments that 
match the selector - if err := r.List(ctx, deployments, client.InNamespace(namespace), client.MatchingLabelsSelector{Selector: selector}); err != nil { - log.FromContext(ctx).Error(err, "Could not list Deployments by label.") - return nil, err - } - - return deployments.Items, nil -} - -// generateTLSServiceURL generates an internal URL for a TLS-enabled TrustyAI service -func generateTLSServiceURL(crName string, namespace string) string { - return "https://" + crName + "." + namespace + ".svc" -} - -// generateNonTLSServiceURL generates an internal URL for a TrustyAI service -func generateNonTLSServiceURL(crName string, namespace string) string { - return "http://" + crName + "." + namespace + ".svc" -} - -// generateKServeLoggerURL generates an logger url for KServe Inference Loggers -func generateKServeLoggerURL(crName string, namespace string) string { - return "http://" + crName + "." + namespace + ".svc.cluster.local" -} diff --git a/controllers/utils/utils.go b/controllers/utils/utils.go new file mode 100644 index 00000000..419ab3b0 --- /dev/null +++ b/controllers/utils/utils.go @@ -0,0 +1,57 @@ +package utils + +import ( + "os" + + appsv1 "k8s.io/api/apps/v1" +) + +func IsDeploymentReady(deployment *appsv1.Deployment) bool { + return deployment.Status.Replicas == deployment.Status.UpdatedReplicas && + deployment.Status.Replicas == deployment.Status.AvailableReplicas +} + +// containsString checks if a list contains a string +func ContainsString(list []string, s string) bool { + for _, v := range list { + if v == s { + return true + } + } + return false +} + +// removeString removes a string from a list +func RemoveString(list []string, s string) []string { + newList := []string{} + for _, v := range list { + if v != s { + newList = append(newList, v) + } + } + return newList +} + +// GetNamespace returns the namespace of a pod +func GetNamespace() (string, error) { + ns, err := os.ReadFile("/var/run/secrets/kubernetes.io/serviceaccount/namespace") + if err 
!= nil { + return "", err + } + return string(ns), nil +} + +// generateTLSServiceURL generates an internal URL for a TLS-enabled TrustyAI service +func GenerateTLSServiceURL(crName string, namespace string) string { + return "https://" + crName + "." + namespace + ".svc" +} + +// generateNonTLSServiceURL generates an internal URL for a TrustyAI service +func GenerateNonTLSServiceURL(crName string, namespace string) string { + return "http://" + crName + "." + namespace + ".svc" +} + +// generateKServeLoggerURL generates an logger url for KServe Inference Loggers +func GenerateKServeLoggerURL(crName string, namespace string) string { + return "http://" + crName + "." + namespace + ".svc.cluster.local" +} diff --git a/controllers/version.go b/controllers/version.go deleted file mode 100644 index 7633c475..00000000 --- a/controllers/version.go +++ /dev/null @@ -1,5 +0,0 @@ -package controllers - -const ( - Version = "1.17.0" -) diff --git a/go.mod b/go.mod index fcb6b76f..7c98c900 100644 --- a/go.mod +++ b/go.mod @@ -8,6 +8,8 @@ require ( github.com/onsi/gomega v1.27.10 github.com/openshift/api v0.0.0-20200713203337-b2494ecb17dd github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring v0.64.1 + github.com/spf13/viper v1.19.0 + github.com/stretchr/testify v1.9.0 k8s.io/api v0.26.4 k8s.io/apimachinery v0.26.4 k8s.io/client-go v0.26.4 @@ -15,21 +17,48 @@ require ( ) require ( - cloud.google.com/go v0.110.2 // indirect - cloud.google.com/go/compute v1.20.1 // indirect + github.com/felixge/httpsnoop v1.0.4 // indirect + github.com/go-logr/stdr v1.2.2 // indirect + github.com/hashicorp/hcl v1.0.0 // indirect + github.com/magiconair/properties v1.8.7 // indirect + github.com/mitchellh/mapstructure v1.5.0 // indirect + github.com/moby/spdystream v0.2.0 // indirect + github.com/pelletier/go-toml/v2 v2.2.2 // indirect + github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect + github.com/sagikazarmark/locafero v0.4.0 // indirect + 
github.com/sagikazarmark/slog-shim v0.1.0 // indirect + github.com/sourcegraph/conc v0.3.0 // indirect + github.com/spf13/afero v1.11.0 // indirect + github.com/spf13/cast v1.6.0 // indirect + github.com/subosito/gotenv v1.6.0 // indirect + go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.49.0 // indirect + go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.49.0 // indirect + go.opentelemetry.io/otel v1.24.0 // indirect + go.opentelemetry.io/otel/metric v1.24.0 // indirect + go.opentelemetry.io/otel/trace v1.24.0 // indirect + golang.org/x/exp v0.0.0-20230905200255-921286631fa9 // indirect + golang.org/x/sync v0.6.0 // indirect + google.golang.org/grpc v1.62.1 // indirect + google.golang.org/protobuf v1.33.0 // indirect + gopkg.in/ini.v1 v1.67.0 // indirect +) + +require ( + cloud.google.com/go v0.112.1 // indirect + cloud.google.com/go/compute v1.25.1 // indirect cloud.google.com/go/compute/metadata v0.2.3 // indirect - cloud.google.com/go/iam v1.0.1 // indirect - cloud.google.com/go/storage v1.30.1 // indirect + cloud.google.com/go/iam v1.1.6 // indirect + cloud.google.com/go/storage v1.38.0 // indirect github.com/aws/aws-sdk-go v1.44.264 // indirect github.com/beorn7/perks v1.0.1 // indirect github.com/blendle/zapdriver v1.3.1 // indirect github.com/cespare/xxhash/v2 v2.2.0 // indirect - github.com/davecgh/go-spew v1.1.1 // indirect + github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect github.com/emicklei/go-restful/v3 v3.10.2 // indirect github.com/evanphx/json-patch v5.6.0+incompatible // indirect github.com/evanphx/json-patch/v5 v5.6.0 // indirect - github.com/fsnotify/fsnotify v1.6.0 // indirect - github.com/go-logr/logr v1.2.4 // indirect + github.com/fsnotify/fsnotify v1.7.0 // indirect + github.com/go-logr/logr v1.4.1 github.com/go-logr/zapr v1.2.4 // indirect github.com/go-openapi/jsonpointer v0.19.6 // indirect github.com/go-openapi/jsonreference v0.20.2 // indirect @@ -37,16 +66,16 @@ 
require ( github.com/go-task/slim-sprig v0.0.0-20230315185526-52ccab3ef572 // indirect github.com/gogo/protobuf v1.3.2 // indirect github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect - github.com/golang/protobuf v1.5.3 // indirect + github.com/golang/protobuf v1.5.4 // indirect github.com/google/gnostic v0.6.9 // indirect - github.com/google/go-cmp v0.5.9 // indirect + github.com/google/go-cmp v0.6.0 // indirect github.com/google/go-containerregistry v0.15.2 // indirect github.com/google/gofuzz v1.2.0 // indirect github.com/google/pprof v0.0.0-20210720184732-4bb14d4b1be1 // indirect - github.com/google/s2a-go v0.1.4 // indirect - github.com/google/uuid v1.3.0 - github.com/googleapis/enterprise-certificate-proxy v0.2.3 // indirect - github.com/googleapis/gax-go/v2 v2.11.0 // indirect + github.com/google/s2a-go v0.1.7 // indirect + github.com/google/uuid v1.6.0 + github.com/googleapis/enterprise-certificate-proxy v0.3.2 // indirect + github.com/googleapis/gax-go/v2 v2.12.3 // indirect github.com/googleapis/google-cloud-go-testing v0.0.0-20210719221736-1c9a4c676720 // indirect github.com/imdario/mergo v0.3.15 // indirect github.com/jmespath/go-jmespath v0.4.0 // indirect @@ -69,22 +98,19 @@ require ( go.uber.org/multierr v1.11.0 // indirect go.uber.org/zap v1.24.0 // indirect golang.org/x/crypto v0.21.0 // indirect - golang.org/x/net v0.17.0 // indirect - golang.org/x/oauth2 v0.10.0 // indirect + golang.org/x/net v0.23.0 // indirect + golang.org/x/oauth2 v0.18.0 // indirect golang.org/x/sys v0.18.0 // indirect golang.org/x/term v0.18.0 // indirect golang.org/x/text v0.14.0 // indirect - golang.org/x/time v0.3.0 // indirect - golang.org/x/tools v0.9.3 // indirect - golang.org/x/xerrors v0.0.0-20220907171357-04be3eba64a2 // indirect + golang.org/x/time v0.5.0 // indirect + golang.org/x/tools v0.13.0 // indirect gomodules.xyz/jsonpatch/v2 v2.2.0 // indirect - google.golang.org/api v0.126.0 // indirect - google.golang.org/appengine v1.6.7 // 
indirect - google.golang.org/genproto v0.0.0-20230530153820-e85fd2cbaebc // indirect - google.golang.org/genproto/googleapis/api v0.0.0-20230530153820-e85fd2cbaebc // indirect - google.golang.org/genproto/googleapis/rpc v0.0.0-20230530153820-e85fd2cbaebc // indirect - google.golang.org/grpc v1.56.3 // indirect - google.golang.org/protobuf v1.31.0 // indirect + google.golang.org/api v0.171.0 // indirect + google.golang.org/appengine v1.6.8 // indirect + google.golang.org/genproto v0.0.0-20240213162025-012b6fc9bca9 // indirect + google.golang.org/genproto/googleapis/api v0.0.0-20240318140521-94a12d6c2237 // indirect + google.golang.org/genproto/googleapis/rpc v0.0.0-20240318140521-94a12d6c2237 // indirect gopkg.in/inf.v0 v0.9.1 // indirect gopkg.in/yaml.v2 v2.4.0 // indirect gopkg.in/yaml.v3 v3.0.1 // indirect diff --git a/go.sum b/go.sum index 90e2b229..c2add89e 100644 --- a/go.sum +++ b/go.sum @@ -4,18 +4,18 @@ cloud.google.com/go v0.38.0/go.mod h1:990N+gfupTy94rShfmMCWGDn0LpTmnzTp2qbd1dvSR cloud.google.com/go v0.44.1/go.mod h1:iSa0KzasP4Uvy3f1mN/7PiObzGgflwredwwASm/v6AU= cloud.google.com/go v0.44.2/go.mod h1:60680Gw3Yr4ikxnPRS/oxxkBccT6SA1yMk63TGekxKY= cloud.google.com/go v0.44.3/go.mod h1:60680Gw3Yr4ikxnPRS/oxxkBccT6SA1yMk63TGekxKY= -cloud.google.com/go v0.110.2 h1:sdFPBr6xG9/wkBbfhmUz/JmZC7X6LavQgcrVINrKiVA= -cloud.google.com/go v0.110.2/go.mod h1:k04UEeEtb6ZBRTv3dZz4CeJC3jKGxyhl0sAiVVquxiw= +cloud.google.com/go v0.112.1 h1:uJSeirPke5UNZHIb4SxfZklVSiWWVqW4oXlETwZziwM= +cloud.google.com/go v0.112.1/go.mod h1:+Vbu+Y1UU+I1rjmzeMOb/8RfkKJK2Gyxi1X6jJCZLo4= cloud.google.com/go/bigquery v1.0.1/go.mod h1:i/xbL2UlR5RvWAURpBYZTtm/cXjCha9lbfbpx4poX+o= -cloud.google.com/go/compute v1.20.1 h1:6aKEtlUiwEpJzM001l0yFkpXmUVXaN8W+fbkb2AZNbg= -cloud.google.com/go/compute v1.20.1/go.mod h1:4tCnrn48xsqlwSAiLf1HXMQk8CONslYbdiEZc9FEIbM= +cloud.google.com/go/compute v1.25.1 h1:ZRpHJedLtTpKgr3RV1Fx23NuaAEN1Zfx9hw1u4aJdjU= +cloud.google.com/go/compute v1.25.1/go.mod 
h1:oopOIR53ly6viBYxaDhBfJwzUAxf1zE//uf3IB011ls= cloud.google.com/go/compute/metadata v0.2.3 h1:mg4jlk7mCAj6xXp9UJ4fjI9VUI5rubuGBW5aJ7UnBMY= cloud.google.com/go/compute/metadata v0.2.3/go.mod h1:VAV5nSsACxMJvgaAuX6Pk2AawlZn8kiOGuCv6gTkwuA= cloud.google.com/go/datastore v1.0.0/go.mod h1:LXYbyblFSglQ5pkeyhO+Qmw7ukd3C+pD7TKLgZqpHYE= -cloud.google.com/go/iam v1.0.1 h1:lyeCAU6jpnVNrE9zGQkTl3WgNgK/X+uWwaw0kynZJMU= -cloud.google.com/go/iam v1.0.1/go.mod h1:yR3tmSL8BcZB4bxByRv2jkSIahVmCtfKZwLYGBalRE8= -cloud.google.com/go/storage v1.30.1 h1:uOdMxAs8HExqBlnLtnQyP0YkvbiDpdGShGKtx6U/oNM= -cloud.google.com/go/storage v1.30.1/go.mod h1:NfxhC0UJE1aXSx7CIIbCf7y9HKT7BiccwkR7+P7gN8E= +cloud.google.com/go/iam v1.1.6 h1:bEa06k05IO4f4uJonbB5iAgKTPpABy1ayxaIZV/GHVc= +cloud.google.com/go/iam v1.1.6/go.mod h1:O0zxdPeGBoFdWW3HWmBxJsk0pfvNM/p/qa82rWOGTwI= +cloud.google.com/go/storage v1.38.0 h1:Az68ZRGlnNTpIBbLjSMIV2BDcwwXYlRlQzis0llkpJg= +cloud.google.com/go/storage v1.38.0/go.mod h1:tlUADB0mAb9BgYls9lq+8MGkfzOXuLrnHXlpHmvFJoY= contrib.go.opencensus.io/exporter/ocagent v0.7.1-0.20200907061046-05415f1de66d h1:LblfooH1lKOpp1hIhukktmSAxFkqMPFk9KR6iZ0MJNI= contrib.go.opencensus.io/exporter/ocagent v0.7.1-0.20200907061046-05415f1de66d/go.mod h1:IshRmMJBhDfFj5Y67nVhMYTTIze91RUeT73ipWKs/GY= contrib.go.opencensus.io/exporter/prometheus v0.4.2 h1:sqfsYl5GIY/L570iT+l93ehxaWJs2/OwXtiWwew3oAg= @@ -29,6 +29,7 @@ github.com/PuerkitoBio/purell v1.1.1/go.mod h1:c11w/QuzBsJSee3cPx9rAFu61PvFxuPbt github.com/PuerkitoBio/urlesc v0.0.0-20160726150825-5bd2802263f2/go.mod h1:uGdkoq3SwY9Y+13GIhn11/XLaGBb4BfwItxLd5jeuXE= github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578/go.mod h1:uGdkoq3SwY9Y+13GIhn11/XLaGBb4BfwItxLd5jeuXE= github.com/antihax/optional v1.0.0/go.mod h1:uupD/76wgC+ih3iEmQUL+0Ugr19nfwCT1kdvxnR2qWY= +github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5 h1:0CwZNZbxp69SHPdPJAN/hZIm0C4OItdklCFmMRWYpio= github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5/go.mod 
h1:wHh0iHkYZB8zMSxRWpUBQtwG5a7fFgvEO+odwuTv2gs= github.com/asaskevich/govalidator v0.0.0-20190424111038-f61b66f89f4a/go.mod h1:lB+ZfQJz7igIIfQNfa7Ml4HSf2uFQQRzpGGRXenZAgY= github.com/aws/aws-sdk-go v1.44.264 h1:5klL62ebn6uv3oJ0ixF7K12hKItj8lV3QqWeQPlkFSs= @@ -44,7 +45,6 @@ github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA github.com/census-instrumentation/opencensus-proto v0.4.1 h1:iKLQ0xPNFxR/2hzXZMrBo8f1j86j5WHzznCCQxV/b8g= github.com/census-instrumentation/opencensus-proto v0.4.1/go.mod h1:4T9NM4+4Vw91VeyqjLS6ao50K5bOcLKN6Q42XnYaRYw= github.com/cespare/xxhash v1.1.0/go.mod h1:XrSqR1VqqWfGrhpAt58auRo0WTKS1nRRg3ghfAqPWnc= -github.com/cespare/xxhash/v2 v2.1.1/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs= github.com/cespare/xxhash/v2 v2.2.0 h1:DC2CZ1Ep5Y4k3ZQ899DldepgrayRUGE6BBZ/cd9Cj44= github.com/cespare/xxhash/v2 v2.2.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs= github.com/chzyer/logex v1.1.10/go.mod h1:+Ywpsq7O8HXn0nuIou7OrIPyXbp3wmkHB+jjWRnGsAI= @@ -53,15 +53,12 @@ github.com/chzyer/test v0.0.0-20180213035817-a1ea475d72b1/go.mod h1:Q3SI9o4m/ZMn github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw= github.com/cncf/udpa/go v0.0.0-20191209042840-269d4d468f6f/go.mod h1:M8M6+tZqaGXZJjfX53e64911xZQV5JYwmTeXPW+k8Sc= github.com/cncf/udpa/go v0.0.0-20201120205902-5459f2c99403/go.mod h1:WmhPx2Nbnhtbo57+VJT5O0JRkEi1Wbu0z5j0R8u5Hbk= -github.com/cncf/udpa/go v0.0.0-20210930031921-04548b0d99d4/go.mod h1:6pvJx4me5XPnfI9Z40ddWsdw2W/uZgQLFXToKeRcDiI= github.com/cncf/xds/go v0.0.0-20210312221358-fbca930ec8ed/go.mod h1:eXthEFrGJvWHgFFCl3hGmgk+/aYT6PnTQLykKQRLhEs= -github.com/cncf/xds/go v0.0.0-20210805033703-aa0b78936158/go.mod h1:eXthEFrGJvWHgFFCl3hGmgk+/aYT6PnTQLykKQRLhEs= -github.com/cncf/xds/go v0.0.0-20210922020428-25de7278fc84/go.mod h1:eXthEFrGJvWHgFFCl3hGmgk+/aYT6PnTQLykKQRLhEs= -github.com/cncf/xds/go v0.0.0-20211011173535-cb28da3451f1/go.mod 
h1:eXthEFrGJvWHgFFCl3hGmgk+/aYT6PnTQLykKQRLhEs= github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E= github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= -github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= +github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc h1:U9qPSI2PIWSS1VwoXQT9A3Wy9MM3WgvqSxFWenqJduM= +github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= github.com/docopt/docopt-go v0.0.0-20180111231733-ee0de3bc6815/go.mod h1:WwZ+bS3ebgob9U8Nd0kOddGdZWjyMGR8Wziv+TBNwSE= github.com/emicklei/go-restful v0.0.0-20170410110728-ff4f55a20633/go.mod h1:otzb+WCGbkyDHkqmQmT5YD2WR4BBwUdeQoFo8l/7tVs= github.com/emicklei/go-restful v2.9.5+incompatible/go.mod h1:otzb+WCGbkyDHkqmQmT5YD2WR4BBwUdeQoFo8l/7tVs= @@ -73,7 +70,6 @@ github.com/envoyproxy/go-control-plane v0.9.1-0.20191026205805-5f8ba28d4473/go.m github.com/envoyproxy/go-control-plane v0.9.4/go.mod h1:6rpuAdCZL397s3pYoYcLgu1mIlRU8Am5FuJP05cCM98= github.com/envoyproxy/go-control-plane v0.9.9-0.20201210154907-fd9021fe5dad/go.mod h1:cXg6YxExXjJnVBQHBLXeUAgxn2UodCpnH306RInaBQk= github.com/envoyproxy/go-control-plane v0.9.9-0.20210512163311-63b5d3c536b0/go.mod h1:hliV/p42l8fGbc6Y9bQ70uLwIvmJyVE5k4iMKlh8wCQ= -github.com/envoyproxy/go-control-plane v0.9.10-0.20210907150352-cf90f659a021/go.mod h1:AFq3mo9L8Lqqiid3OhADV3RfLJnjiw63cSpi+fDTRC0= github.com/envoyproxy/protoc-gen-validate v0.1.0/go.mod h1:iSmxcyjqTsJpI2R4NaDN7+kN2VEUnK/pcBlmesArF7c= github.com/evanphx/json-patch v0.5.2/go.mod h1:ZWS5hhDbVDyob71nXKNL0+PWn6ToqBHMikGIFbs31qQ= github.com/evanphx/json-patch v4.12.0+incompatible/go.mod h1:50XU6AFN0ol/bzJsmQLiYLvXMP4fmwYFNcr97nuDLSk= @@ -81,11 +77,15 @@ github.com/evanphx/json-patch v5.6.0+incompatible h1:jBYDEEiFBPxA0v50tFdvOzQQTCv github.com/evanphx/json-patch 
v5.6.0+incompatible/go.mod h1:50XU6AFN0ol/bzJsmQLiYLvXMP4fmwYFNcr97nuDLSk= github.com/evanphx/json-patch/v5 v5.6.0 h1:b91NhWfaz02IuVxO9faSllyAtNXHMPkC5J8sJCLunww= github.com/evanphx/json-patch/v5 v5.6.0/go.mod h1:G79N1coSVB93tBe7j6PhzjmR3/2VvlbKOFpnXhI9Bw4= +github.com/felixge/httpsnoop v1.0.4 h1:NFTV2Zj1bL4mc9sqWACXbQFVBBg2W3GPvqp8/ESS2Wg= +github.com/felixge/httpsnoop v1.0.4/go.mod h1:m8KPJKqk1gH5J9DgRY2ASl2lWCfGKXixSwevea8zH2U= github.com/flowstack/go-jsonschema v0.1.1/go.mod h1:yL7fNggx1o8rm9RlgXv7hTBWxdBM0rVwpMwimd3F3N0= +github.com/frankban/quicktest v1.14.6 h1:7Xjx+VpznH+oBnejlPUj8oUpdxnVs4f8XU8WnHkI4W8= +github.com/frankban/quicktest v1.14.6/go.mod h1:4ptaffx2x8+WTWXmUCuVU6aPUX1/Mz7zb5vbUoiM6w0= github.com/fsnotify/fsnotify v1.4.7/go.mod h1:jwhsz4b93w/PPRr/qN1Yymfu8t87LnFCMoQvtojpjFo= github.com/fsnotify/fsnotify v1.4.9/go.mod h1:znqG4EE+3YCdAaPaxE2ZRY/06pZUdp0tY4IgpuI1SZQ= -github.com/fsnotify/fsnotify v1.6.0 h1:n+5WquG0fcWoWp6xPWfHdbskMCQaFnG6PfBrh1Ky4HY= -github.com/fsnotify/fsnotify v1.6.0/go.mod h1:sl3t1tCWJFWoRz9R8WJCbQihKKwmorjAbSClcnxKAGw= +github.com/fsnotify/fsnotify v1.7.0 h1:8JEhPFa5W2WU7YfeZzPNqzMP6Lwt7L2715Ggo0nosvA= +github.com/fsnotify/fsnotify v1.7.0/go.mod h1:40Bi/Hjc2AVfZrqy+aj+yEI+/bRxZnMJyTJwOpGvigM= github.com/ghodss/yaml v0.0.0-20150909031657-73d445a93680/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04= github.com/ghodss/yaml v1.0.0/go.mod h1:4dBDuWmgqj2HViK6kFavaiC9ZROes6MMH2rRYeMEF04= github.com/go-kit/log v0.2.1 h1:MRVx0/zhvdseW+Gza6N9rVzU/IVzaeE1SFI4raAhmBU= @@ -95,9 +95,13 @@ github.com/go-logfmt/logfmt v0.6.0/go.mod h1:WYhtIu8zTZfxdn5+rREduYbwxfcBr/Vr6KE github.com/go-logr/logr v0.1.0/go.mod h1:ixOQHD9gLJUVQQ2ZOR7zLEifBX6tGkNJF4QyIY7sIas= github.com/go-logr/logr v0.2.0/go.mod h1:z6/tIYblkpsD+a4lm/fGIIU9mZ+XfAiaFtq7xTgseGU= github.com/go-logr/logr v1.2.0/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A= +github.com/go-logr/logr v1.2.2/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A= github.com/go-logr/logr 
v1.2.3/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A= -github.com/go-logr/logr v1.2.4 h1:g01GSCwiDw2xSZfjJ2/T9M+S6pFdcNtFYsp+Y43HYDQ= github.com/go-logr/logr v1.2.4/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A= +github.com/go-logr/logr v1.4.1 h1:pKouT5E8xu9zeFC39JXRDukb6JFQPXM5p5I91188VAQ= +github.com/go-logr/logr v1.4.1/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY= +github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag= +github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE= github.com/go-logr/zapr v1.2.4 h1:QHVo+6stLbfJmYGkQ7uGHUCu5hnAFAj6mDe6Ea0SeOo= github.com/go-logr/zapr v1.2.4/go.mod h1:FyHWQIzQORZ0QVE1BtVHv3cKtNLuXsbNLtpuhNapBOA= github.com/go-openapi/jsonpointer v0.0.0-20160704185906-46af16f9f7b1/go.mod h1:+35s3my2LFTysnkMfxsJBAMHj/DoqoB9knIWoYG/Vk0= @@ -146,8 +150,9 @@ github.com/golang/protobuf v1.4.2/go.mod h1:oDoupMAO8OvCJWAcko0GGGIgR6R6ocIYbsSw github.com/golang/protobuf v1.4.3/go.mod h1:oDoupMAO8OvCJWAcko0GGGIgR6R6ocIYbsSw735rRwI= github.com/golang/protobuf v1.5.0/go.mod h1:FsONVRAS9T7sI+LIUmWTfcYkHO4aIWwzhcaSAoJOfIk= github.com/golang/protobuf v1.5.2/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY= -github.com/golang/protobuf v1.5.3 h1:KhyjKVUg7Usr/dYsdSqoFveMYd5ko72D+zANwlG1mmg= github.com/golang/protobuf v1.5.3/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiuN0vRsmY= +github.com/golang/protobuf v1.5.4 h1:i7eJL8qZTpSEXOPTxNKhASYpMn+8e5Q6AdndVa1dWek= +github.com/golang/protobuf v1.5.4/go.mod h1:lnTiLA8Wa4RWRcIUkrtSVa5nRhsEGBg48fD6rSs7xps= github.com/google/btree v0.0.0-20180813153112-4030bb1f1f0c/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ= github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ= github.com/google/gnostic v0.5.7-v3refs/go.mod h1:73MKFl6jIHelAJNaBGFzt3SPtZULs9dYrGFt8OiIsHQ= @@ -161,8 +166,9 @@ github.com/google/go-cmp v0.5.0/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/ github.com/google/go-cmp v0.5.3/go.mod 
h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE= github.com/google/go-cmp v0.5.8/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY= -github.com/google/go-cmp v0.5.9 h1:O2Tfq5qg4qc4AmwVlvv0oLiVAGB7enBSJ2x2DqQFi38= github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY= +github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI= +github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY= github.com/google/go-containerregistry v0.15.2 h1:MMkSh+tjSdnmJZO7ljvEqV1DjfekB6VUEAZgy3a+TQE= github.com/google/go-containerregistry v0.15.2/go.mod h1:wWK+LnOv4jXMM23IT/F1wdYftGWGr47Is8CG+pmHK1Q= github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= @@ -178,17 +184,18 @@ github.com/google/pprof v0.0.0-20190515194954-54271f7e092f/go.mod h1:zfwlbNMJ+OI github.com/google/pprof v0.0.0-20210407192527-94a9f03dee38/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE= github.com/google/pprof v0.0.0-20210720184732-4bb14d4b1be1 h1:K6RDEckDVWvDI9JAJYCmNdQXq6neHJOYx3V6jnqNEec= github.com/google/pprof v0.0.0-20210720184732-4bb14d4b1be1/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE= -github.com/google/s2a-go v0.1.4 h1:1kZ/sQM3srePvKs3tXAvQzo66XfcReoqFpIpIccE7Oc= -github.com/google/s2a-go v0.1.4/go.mod h1:Ej+mSEMGRnqRzjc7VtF+jdBwYG5fuJfiZ8ELkjEwM0A= +github.com/google/s2a-go v0.1.7 h1:60BLSyTrOV4/haCDW4zb1guZItoSq8foHCXrAnjBo/o= +github.com/google/s2a-go v0.1.7/go.mod h1:50CgR4k1jNlWBu4UfS4AcfhVe1r6pdZPygJ3R8F0Qdw= github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= -github.com/google/uuid v1.3.0 h1:t6JiXgmwXMjEs8VusXIJk2BXHsn+wx8BZdTaoZ5fu7I= github.com/google/uuid v1.3.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= -github.com/googleapis/enterprise-certificate-proxy v0.2.3 h1:yk9/cqRKtT9wXZSsRH9aurXEpJX+U6FLtpYTdC3R06k= 
-github.com/googleapis/enterprise-certificate-proxy v0.2.3/go.mod h1:AwSRAtLfXpU5Nm3pW+v7rGDHp09LsPtGY9MduiEsR9k= +github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0= +github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= +github.com/googleapis/enterprise-certificate-proxy v0.3.2 h1:Vie5ybvEvT75RniqhfFxPRy3Bf7vr3h0cechB90XaQs= +github.com/googleapis/enterprise-certificate-proxy v0.3.2/go.mod h1:VLSiSSBs/ksPL8kq3OBOQ6WRI2QnaFynd1DCjZ62+V0= github.com/googleapis/gax-go/v2 v2.0.4/go.mod h1:0Wqv26UfaUD9n4G6kQubkQ+KchISgw+vpHVxEJEs9eg= github.com/googleapis/gax-go/v2 v2.0.5/go.mod h1:DWXyrwAJ9X0FpwwEdw+IPEYBICEFu5mhpdKc/us6bOk= -github.com/googleapis/gax-go/v2 v2.11.0 h1:9V9PWXEsWnPpQhu/PeQIkS4eGzMlTLGgt80cUUI8Ki4= -github.com/googleapis/gax-go/v2 v2.11.0/go.mod h1:DxmR61SGKkGLa2xigwuZIQpkCI2S5iydzRfb3peWZJI= +github.com/googleapis/gax-go/v2 v2.12.3 h1:5/zPPDvw8Q1SuXjrqrZslrqT7dL/uJT2CQii/cLCKqA= +github.com/googleapis/gax-go/v2 v2.12.3/go.mod h1:AKloxT6GtNbaLm8QTNSidHUVsHYcBHwWRvkNFJUQcS4= github.com/googleapis/gnostic v0.0.0-20170729233727-0c5108395e2d/go.mod h1:sJBsCZ4ayReDTBIg8b9dl28c5xFWyhBTVRp3pOg5EKY= github.com/googleapis/google-cloud-go-testing v0.0.0-20210719221736-1c9a4c676720 h1:zC34cGQu69FG7qzJ3WiKW244WfhDC3xxYMeNOX2gtUQ= github.com/googleapis/google-cloud-go-testing v0.0.0-20210719221736-1c9a4c676720/go.mod h1:dvDLG8qkwmyD9a/MJJN3XJcT3xFxOKAvTZGvuZmac9g= @@ -199,6 +206,8 @@ github.com/grpc-ecosystem/grpc-gateway/v2 v2.15.2 h1:gDLXvp5S9izjldquuoAhDzccbsk github.com/grpc-ecosystem/grpc-gateway/v2 v2.15.2/go.mod h1:7pdNwVWBBHGiCxa9lAszqCJMbfTISJ7oMftp8+UGV08= github.com/hashicorp/golang-lru v0.5.0/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8= github.com/hashicorp/golang-lru v0.5.1/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8= +github.com/hashicorp/hcl v1.0.0 h1:0Anlzjpi4vEasTeNFn2mLJgTSwt0+6sfsiTG8qcWGx4= +github.com/hashicorp/hcl v1.0.0/go.mod 
h1:E5yfLk+7swimpb2L/Alb/PJmXilQ/rhwaUYs4T20WEQ= github.com/hpcloud/tail v1.0.0/go.mod h1:ab1qPbhIpdTxEkNHXyeSf5vhxWSCs/tWer42PpOxQnU= github.com/ianlancetaylor/demangle v0.0.0-20200824232613-28f6c0f3b639/go.mod h1:aSSvb/t6k1mPoxDqO4vJh6VOCGPwU4O0C2/Eqndh1Sc= github.com/imdario/mergo v0.3.15 h1:M8XP7IuFNsqUx6VPK2P9OSmsYsI/YFaGil0uD21V3dM= @@ -231,6 +240,8 @@ github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY= github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE= github.com/kserve/kserve v0.11.2 h1:ZYwj3/04JrwIiqIrVjPhcX5umHFj3gHAhbtghfseoPo= github.com/kserve/kserve v0.11.2/go.mod h1:x44/b0J4y8kqNUuxcHP386jN3BAk5DRoLvhebnF+VP8= +github.com/magiconair/properties v1.8.7 h1:IeQXZAiQcpL9mgcAe1Nu6cX9LLw6ExEHKjN0VQdvPDY= +github.com/magiconair/properties v1.8.7/go.mod h1:Dhd985XPs7jluiymwWYZ0G4Z61jb3vdS329zhj2hYo0= github.com/mailru/easyjson v0.0.0-20160728113105-d5b7844b561a/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc= github.com/mailru/easyjson v0.0.0-20190614124828-94de47d64c63/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc= github.com/mailru/easyjson v0.0.0-20190626092158-b2ccc519800e/go.mod h1:C1wdFJiN94OJF2b5HbByQZoLdCWB1Yqtg26g4irojpc= @@ -240,6 +251,9 @@ github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJ github.com/matttproud/golang_protobuf_extensions v1.0.4 h1:mmDVorXM7PCGKw94cs5zkfA9PSy5pEvNWRP0ET0TIVo= github.com/matttproud/golang_protobuf_extensions v1.0.4/go.mod h1:BSXmuO+STAnVfrANrmjBb36TMTDstsz7MSK+HVaYKv4= github.com/mitchellh/mapstructure v1.1.2/go.mod h1:FVVH3fgwuzCH5S8UJGiWEs2h04kUh9fWfEaFds41c1Y= +github.com/mitchellh/mapstructure v1.5.0 h1:jeMsZIYE/09sWLaz43PL7Gy6RuMjD2eJVyuac5Z2hdY= +github.com/mitchellh/mapstructure v1.5.0/go.mod h1:bFUtVrKA4DC2yAKiSyO/QUcy7e+RRV2QTWOzhPopBRo= +github.com/moby/spdystream v0.2.0 h1:cjW1zVyyoiM0T7b6UoySUFqzXMoqRckQtXwGPiBhOM8= github.com/moby/spdystream v0.2.0/go.mod h1:f7i0iNDQJ059oMTcWxx8MA/zKFIuD/lY+0GqbN2Wy8c= 
github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg= @@ -291,11 +305,14 @@ github.com/opencontainers/go-digest v1.0.0/go.mod h1:0JzlMkj0TRzQZfJkVvzbP0HBR3I github.com/openshift/api v0.0.0-20200713203337-b2494ecb17dd h1:MV2FH/cm1wqoVCIL98GT46CMnXZw9faUoIzdZ4nfZw0= github.com/openshift/api v0.0.0-20200713203337-b2494ecb17dd/go.mod h1:vWmWTm4y7XR3wkLR+bDDjRbvkBfx2yP7yve6kfb7+Ts= github.com/openshift/build-machinery-go v0.0.0-20200713135615-1f43d26dccc7/go.mod h1:b1BuldmJlbA/xYtdZvKi+7j5YGB44qJUJDZ9zwiNCfE= +github.com/pelletier/go-toml/v2 v2.2.2 h1:aYUidT7k73Pcl9nb2gScu7NSrKCSHIDE89b3+6Wq+LM= +github.com/pelletier/go-toml/v2 v2.2.2/go.mod h1:1t835xjRzz80PqgE6HHgN2JOsmgYu/h4qDAS4n929Rs= github.com/pkg/errors v0.8.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4= github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0= -github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= +github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 h1:Jamvg5psRIccs7FGNTlIRMkT8wgtp5eCXdBlqhYGL6U= +github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring v0.64.1 h1:bvntWler8vOjDJtxBwGDakGNC6srSZmgawGM9Jf7HC8= github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring v0.64.1/go.mod h1:cfNgxpCPGyIydmt3HcwDqKDt0nYdlGRhzftl+DZH7WA= github.com/prometheus/client_golang v1.15.1 h1:8tXpTmJbyH5lydzFPoxSIJ0J46jdh3tylbvM1xCv0LI= @@ -313,16 +330,29 @@ github.com/rogpeppe/fastuuid v1.2.0/go.mod h1:jVj6XXZzXRy/MSR5jhDC/2q6DgLz+nrA6L 
github.com/rogpeppe/go-internal v1.6.1/go.mod h1:xXDCJY+GAPziupqXw64V24skbSoqbTEfhy4qGm1nDQc= github.com/rogpeppe/go-internal v1.10.0 h1:TMyTOH3F/DB16zRVcYyreMH6GnZZrwQVAoYjRBZyWFQ= github.com/rogpeppe/go-internal v1.10.0/go.mod h1:UQnix2H7Ngw/k4C5ijL5+65zddjncjaFoBhdsK/akog= +github.com/sagikazarmark/locafero v0.4.0 h1:HApY1R9zGo4DBgr7dqsTH/JJxLTTsOt7u6keLGt6kNQ= +github.com/sagikazarmark/locafero v0.4.0/go.mod h1:Pe1W6UlPYUk/+wc/6KFhbORCfqzgYEpgQ3O5fPuL3H4= +github.com/sagikazarmark/slog-shim v0.1.0 h1:diDBnUNK9N/354PgrxMywXnAwEr1QZcOr6gto+ugjYE= +github.com/sagikazarmark/slog-shim v0.1.0/go.mod h1:SrcSrq8aKtyuqEI1uvTDTK1arOWRIczQRv+GVI1AkeQ= +github.com/sourcegraph/conc v0.3.0 h1:OQTbbt6P72L20UqAkXXuLOj79LfEanQ+YQFNpLA9ySo= +github.com/sourcegraph/conc v0.3.0/go.mod h1:Sdozi7LEKbFPqYX2/J+iBAM6HpqSLTASQIKqDmF7Mt0= github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA= github.com/spf13/afero v1.2.2/go.mod h1:9ZxEEn6pIJ8Rxe320qSDBk6AsU0r9pR7Q4OcevTdifk= +github.com/spf13/afero v1.11.0 h1:WJQKhtpdm3v2IzqG8VMqrr6Rf3UYpEF239Jy9wNepM8= +github.com/spf13/afero v1.11.0/go.mod h1:GH9Y3pIexgf1MTIWtNGyogA5MwRIDXGUr+hbWNoBjkY= +github.com/spf13/cast v1.6.0 h1:GEiTHELF+vaR5dhz3VqZfFSzZjYbgeKDpBxQVS4GYJ0= +github.com/spf13/cast v1.6.0/go.mod h1:ancEpBxwJDODSW/UG4rDrAqiKolqNNh2DX3mk86cAdo= github.com/spf13/pflag v0.0.0-20170130214245-9ff6c6923cff/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4= github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA= github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg= +github.com/spf13/viper v1.19.0 h1:RWq5SEjt8o25SROyN3z2OrDB9l7RPd3lwTWU8EcEdcI= +github.com/spf13/viper v1.19.0/go.mod h1:GQUN9bilAbhU/jgc1bKs99f/suXKeUMct8Adx5+Ntkg= github.com/stoewer/go-strcase v1.2.0/go.mod h1:IBiWB2sKIp3wVVQ3Y035++gc+knqhUQag1KpM8ahLw8= github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= 
github.com/stretchr/objx v0.2.0/go.mod h1:qt09Ya8vawLte6SNmTgCsAVtYtaKzEcn8ATUoHMkEqE= github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw= github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo= +github.com/stretchr/objx v0.5.2/go.mod h1:FRsXN1f5AsAjCGJKqEizvkpNtU+EGNCLh3NxZ/8L+MA= github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI= github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4= github.com/stretchr/testify v1.5.1/go.mod h1:5W2xD1RspED5o8YsWQXVCued0rvSQ+mT+I5cxcmMvtA= @@ -330,8 +360,12 @@ github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/ github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU= -github.com/stretchr/testify v1.8.1 h1:w7B6lhMri9wdJUVmEZPGGhZzrYTPvgJArz7wNPgYKsk= github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4= +github.com/stretchr/testify v1.8.4/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo= +github.com/stretchr/testify v1.9.0 h1:HtqpIVDClZ4nwg75+f6Lvsy/wHu+3BoSGCbBAcpTsTg= +github.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY= +github.com/subosito/gotenv v1.6.0 h1:9NlTDc1FTs4qu0DDq7AEtTPNw6SVm7uBMsUCUjABIf8= +github.com/subosito/gotenv v1.6.0/go.mod h1:Dk4QP5c2W3ibzajGcXpNraDfq2IrhjMIvMSWPKKo0FU= github.com/xeipuuv/gojsonpointer v0.0.0-20180127040702-4e3ac2762d5f/go.mod h1:N2zxlSyiKSe5eX1tZViRH5QA0qijqEDrYZiPEAiq3wU= github.com/xeipuuv/gojsonreference v0.0.0-20180127040603-bd5ef7bd5415/go.mod h1:GwrjFmJcFw6At/Gs6z4yjiIwzuJ1/+UwLxMQDVQXShQ= github.com/xeipuuv/gojsonschema v1.2.0/go.mod h1:anYRn/JVcOK2ZgGU+IjEV4nwlhoK5sQluxsYJ78Id3Y= @@ -344,6 +378,18 @@ go.opencensus.io v0.21.0/go.mod 
h1:mSImk1erAIZhrmZN+AvHh14ztQfjbGwt4TtuofqLduU= go.opencensus.io v0.22.0/go.mod h1:+kGneAE2xo2IficOXnaByMWTGM9T73dGwxeWcUqIpI8= go.opencensus.io v0.24.0 h1:y73uSU6J157QMP2kn2r30vwW1A2W2WFwSCGnAVxeaD0= go.opencensus.io v0.24.0/go.mod h1:vNK8G9p7aAivkbmorf4v+7Hgx+Zs0yY+0fOtgBfjQKo= +go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.49.0 h1:4Pp6oUg3+e/6M4C0A/3kJ2VYa++dsWVTtGgLVj5xtHg= +go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.49.0/go.mod h1:Mjt1i1INqiaoZOMGR1RIUJN+i3ChKoFRqzrRQhlkbs0= +go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.44.0 h1:KfYpVmrjI7JuToy5k8XV3nkapjWx48k4E4JOtVstzQI= +go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.44.0/go.mod h1:SeQhzAEccGVZVEy7aH87Nh0km+utSpo1pTv6eMMop48= +go.opentelemetry.io/otel v1.24.0 h1:0LAOdjNmQeSTzGBzduGe/rU4tZhMwL5rWgtp9Ku5Jfo= +go.opentelemetry.io/otel v1.24.0/go.mod h1:W7b9Ozg4nkF5tWI5zsXkaKKDjdVjpD4oAt9Qi/MArHo= +go.opentelemetry.io/otel/metric v1.24.0 h1:6EhoGWWK28x1fbpA4tYTOWBkPefTDQnb8WSGXlc88kI= +go.opentelemetry.io/otel/metric v1.24.0/go.mod h1:VYhLe1rFfxuTXLgj4CBiyz+9WYBA8pNGJgDcSFRKBco= +go.opentelemetry.io/otel/sdk v1.22.0 h1:6coWHw9xw7EfClIC/+O31R8IY3/+EiRFHevmHafB2Gw= +go.opentelemetry.io/otel/sdk v1.22.0/go.mod h1:iu7luyVGYovrRpe2fmj3CVKouQNdTOkxtLzPvPz1DOc= +go.opentelemetry.io/otel/trace v1.24.0 h1:CsKnnL4dUAr/0llH9FKuc698G04IrpWV0MQA/Y1YELI= +go.opentelemetry.io/otel/trace v1.24.0/go.mod h1:HPc3Xr/cOApsBI154IU0OI0HJexz+aw5uPdbs3UCjNU= go.opentelemetry.io/proto/otlp v0.7.0/go.mod h1:PqfVotwruBrMGOCsRd/89rSnXhoiJIqeYNgFYFoEGnI= go.uber.org/atomic v1.4.0/go.mod h1:gD2HeocX3+yG+ygLZcrzQJaqmWj9AIm7n08wl/qW/PE= go.uber.org/atomic v1.7.0/go.mod h1:fEN4uk6kAWBTFdckzkM89CLk9XfWZrxpCo0nPH17wJc= @@ -363,6 +409,8 @@ golang.org/x/crypto v0.17.0 h1:r8bRNjWL3GshPW3gkd+RpvzWrZAwPS49OmTGZ/uhM4k= golang.org/x/crypto v0.17.0/go.mod h1:gCAAfMLgwOJRpTjQ2zCCt2OcSfYMTeZVSRtQlPC7Nq4= golang.org/x/exp 
v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA= golang.org/x/exp v0.0.0-20190510132918-efd6b22b2522/go.mod h1:ZjyILWgesfNpC6sMxTJOJm9Kp84zZh5NQWvqDGG3Qr8= +golang.org/x/exp v0.0.0-20230905200255-921286631fa9 h1:GoHiUyI/Tp2nVkLI2mCxVkOjsbSXD66ic0XW0js0R9g= +golang.org/x/exp v0.0.0-20230905200255-921286631fa9/go.mod h1:S2oDrQGGwySpoQPVqRShND87VCbxmc6bL1Yd2oYrm6k= golang.org/x/image v0.0.0-20190227222117-0694c2d4d067/go.mod h1:kZ7UVZpmo3dzQBMxlp+ypCbDeSB+sBbTgSJuh5dn5js= golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE= golang.org/x/lint v0.0.0-20190227174305-5b3e6a55c961/go.mod h1:wehouNa3lNwaWXcvxsM5YxQ5yQlVC4a0KAMCusXpPoU= @@ -381,16 +429,16 @@ golang.org/x/mod v0.6.0/go.mod h1:4mET923SAdbXp2ki8ey+zGs1SLqsuM2Y0uvdZR/fUNI= golang.org/x/mod v0.7.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= golang.org/x/mod v0.9.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= -golang.org/x/mod v0.10.0 h1:lFO9qtOdlre5W1jxS3r/4szv2/6iXxScdzjoBMXNhYk= -golang.org/x/mod v0.10.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= +golang.org/x/mod v0.12.0 h1:rmsUpXtvNzj340zd98LZ4KntptpfRHwpFOHG188oHXc= +golang.org/x/mod v0.12.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs= golang.org/x/net v0.23.0 h1:7EYJ93RZ9vYSZAIb2x3lnuvqO5zneoD6IvWjuhfxjTs= golang.org/x/net v0.23.0/go.mod h1:JKghWKKOSdJwpW2GEx0Ja7fmaKnMsbu+MWVZTokSYmg= golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U= golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw= golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw= golang.org/x/oauth2 v0.0.0-20200107190931-bf48bf16ab8d/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw= -golang.org/x/oauth2 
v0.10.0 h1:zHCpF2Khkwy4mMB4bv0U37YtJdTGW8jI0glAApi0Kh8= -golang.org/x/oauth2 v0.10.0/go.mod h1:kTpgurOux7LqtuxjuyZa4Gj2gdezIt/jQtGnNFfypQI= +golang.org/x/oauth2 v0.18.0 h1:09qnuIAgzdx1XplqJvW6CQqMCtGZykZWcXzPMPUusvI= +golang.org/x/oauth2 v0.18.0/go.mod h1:Wf7knwG0MPoWIMMBgFlEaSUDaKskp0dCfrlJRJXbBi8= golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20181108010431-42b317875d0f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= @@ -401,8 +449,8 @@ golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJ golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= -golang.org/x/sync v0.2.0 h1:PUR+T4wwASmuSTYdKjYHI5TD22Wy5ogLU5qZCOLxBrI= -golang.org/x/sync v0.2.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= +golang.org/x/sync v0.6.0 h1:5BMeUDZ7vkXGfEr1x9B4bRcTH4lpkTkpdh0T/J+qjbQ= +golang.org/x/sync v0.6.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk= golang.org/x/sys v0.0.0-20170830134202-bb24a47a89ea/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20180909124046-d0be0721c37e/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= @@ -426,7 +474,6 @@ golang.org/x/sys v0.0.0-20211216021012-1d35b9e2eb4e/go.mod h1:oPkhp1MJrh7nUepCBc golang.org/x/sys v0.0.0-20220319134239-a9b59b0215f8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.0.0-20220422013727-9388b58f7150/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys 
v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= -golang.org/x/sys v0.0.0-20220908164124-27713097b956/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.1.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.2.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.3.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= @@ -456,8 +503,8 @@ golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU= golang.org/x/time v0.0.0-20181108054448-85acf8d2951c/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= golang.org/x/time v0.0.0-20220210224613-90d013bbcef8/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= -golang.org/x/time v0.3.0 h1:rg5rLMjNzMS1RkNLzCG38eapWhnYLFYXDXj2gOlr8j4= -golang.org/x/time v0.3.0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ= +golang.org/x/time v0.5.0 h1:o7cqy6amK/52YcAKIPlM3a+Fpj35zvRj2TP+e1xFSfk= +golang.org/x/time v0.5.0/go.mod h1:3BpzKBy/shNhVucY/MWOyx10tF3SFh9QdLuxbVysPQM= golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= golang.org/x/tools v0.0.0-20181011042414-1f849cf54d09/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= golang.org/x/tools v0.0.0-20181030221726-6c7e314b6563/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= @@ -486,28 +533,29 @@ golang.org/x/tools v0.2.0/go.mod h1:y4OqIKeOV/fWJetJ8bXPU1sEVniLMIyDAZWeHdV+NTA= golang.org/x/tools v0.4.0/go.mod h1:UE5sM2OK9E/d67R0ANs2xJizIymRP5gJU295PvKXxjQ= golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU= golang.org/x/tools v0.7.0/go.mod h1:4pg6aUX35JBAogB10C9AtvVL+qowtN4pT3CGSQex14s= -golang.org/x/tools v0.9.3 h1:Gn1I8+64MsuTb/HpH+LmQtNas23LhUVr3rYZ0eKuaMM= -golang.org/x/tools v0.9.3/go.mod h1:owI94Op576fPu3cIGQeHs3joujW/2Oc6MtlxbF5dfNc= 
+golang.org/x/tools v0.13.0 h1:Iey4qkscZuv0VvIt8E0neZjtPVQFSc870HQ448QgEmQ= +golang.org/x/tools v0.13.0/go.mod h1:HvlwmtVNQAhOuCjW7xxvovg8wbNq7LwfXh/k7wXUl58= golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= -golang.org/x/xerrors v0.0.0-20220907171357-04be3eba64a2 h1:H2TDz8ibqkAF6YGhCdN3jS9O0/s90v0rJh3X/OLHEUk= golang.org/x/xerrors v0.0.0-20220907171357-04be3eba64a2/go.mod h1:K8+ghG5WaK9qNqU5K3HdILfMLy1f3aNYFI/wnl100a8= +golang.org/x/xerrors v0.0.0-20231012003039-104605ab7028 h1:+cNy6SZtPcJQH3LJVLOSmiC7MMxXNOb3PU/VUEz+EhU= +golang.org/x/xerrors v0.0.0-20231012003039-104605ab7028/go.mod h1:NDW/Ps6MPRej6fsCIbMTohpP40sJ/P/vI1MoTEGwX90= gomodules.xyz/jsonpatch/v2 v2.2.0 h1:4pT439QV83L+G9FkcCriY6EkpcK6r6bK+A5FBUMI7qY= gomodules.xyz/jsonpatch/v2 v2.2.0/go.mod h1:WXp+iVDkoLQqPudfQ9GBlwB2eZ5DKOnjQZCYdOS8GPY= google.golang.org/api v0.4.0/go.mod h1:8k5glujaEP+g9n7WNsDg8QP6cUVNI86fCNMcbazEtwE= google.golang.org/api v0.7.0/go.mod h1:WtwebWUNSVBH/HAw79HIFXZNqEvBhG+Ra+ax0hx3E3M= google.golang.org/api v0.8.0/go.mod h1:o4eAsZoiT+ibD93RtjEohWalFOjRDx6CVaqeizhEnKg= google.golang.org/api v0.9.0/go.mod h1:o4eAsZoiT+ibD93RtjEohWalFOjRDx6CVaqeizhEnKg= -google.golang.org/api v0.126.0 h1:q4GJq+cAdMAC7XP7njvQ4tvohGLiSlytuL4BQxbIZ+o= -google.golang.org/api v0.126.0/go.mod h1:mBwVAtz+87bEN6CbA1GtZPDOqY2R5ONPqJeIlvyo4Aw= +google.golang.org/api v0.171.0 h1:w174hnBPqut76FzW5Qaupt7zY8Kql6fiVjgys4f58sU= +google.golang.org/api v0.171.0/go.mod h1:Hnq5AHm4OTMt2BUVjael2CWZFD6vksJdWCWiUAmjC9o= google.golang.org/appengine v1.1.0/go.mod h1:EbEs0AVv82hx2wNQdGPgUI5lhzA/G0D9YwlJXL52JkM= google.golang.org/appengine 
v1.4.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4= google.golang.org/appengine v1.5.0/go.mod h1:xpcJRLb0r/rnEns0DIKYYv+WjYCduHsrkT7/EB5XEv4= google.golang.org/appengine v1.6.1/go.mod h1:i06prIuMbXzDqacNJfV5OdTW448YApPu5ww/cMBSeb0= -google.golang.org/appengine v1.6.7 h1:FZR1q0exgwxzPzp/aF+VccGrSfxfPpkBqjIIEq3ru6c= -google.golang.org/appengine v1.6.7/go.mod h1:8WjMMxjGQR8xUklV/ARdw2HLXBOI7O7uCIDZVag1xfc= +google.golang.org/appengine v1.6.8 h1:IhEN5q69dyKagZPYMSdIjS2HqprW324FRQZJcGqPAsM= +google.golang.org/appengine v1.6.8/go.mod h1:1jJ3jBArFh5pcgW8gCtRJnepW8FzD1V44FJffLiz/Ds= google.golang.org/genproto v0.0.0-20180817151627-c66870c02cf8/go.mod h1:JiN7NxoALGmiZfu7CAH4rXhgtRTLTxftemlI0sWmxmc= google.golang.org/genproto v0.0.0-20190307195333-5fe7a883aa19/go.mod h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE= google.golang.org/genproto v0.0.0-20190418145605-e7d98fc518a7/go.mod h1:VzzqZJRnGkLBvHegQrXjBqPurQTc5/KpmUdxsrq26oE= @@ -519,12 +567,12 @@ google.golang.org/genproto v0.0.0-20200513103714-09dca8ec2884/go.mod h1:55QSHmfG google.golang.org/genproto v0.0.0-20200526211855-cb27e3aa2013/go.mod h1:NbSheEEYHJ7i3ixzK3sjbqSGDJWnxyFXZblF3eUsNvo= google.golang.org/genproto v0.0.0-20201019141844-1ed22bb0c154/go.mod h1:FWY/as6DDZQgahTzZj3fqbO1CbirC29ZNUFHwi0/+no= google.golang.org/genproto v0.0.0-20220107163113-42d7afdf6368/go.mod h1:5CzLGKJ67TSI2B9POpiiyGha0AjJvZIUgRMt1dSmuhc= -google.golang.org/genproto v0.0.0-20230530153820-e85fd2cbaebc h1:8DyZCyvI8mE1IdLy/60bS+52xfymkE72wv1asokgtao= -google.golang.org/genproto v0.0.0-20230530153820-e85fd2cbaebc/go.mod h1:xZnkP7mREFX5MORlOPEzLMr+90PPZQ2QWzrVTWfAq64= -google.golang.org/genproto/googleapis/api v0.0.0-20230530153820-e85fd2cbaebc h1:kVKPf/IiYSBWEWtkIn6wZXwWGCnLKcC8oWfZvXjsGnM= -google.golang.org/genproto/googleapis/api v0.0.0-20230530153820-e85fd2cbaebc/go.mod h1:vHYtlOoi6TsQ3Uk2yxR7NI5z8uoV+3pZtR4jmHIkRig= -google.golang.org/genproto/googleapis/rpc v0.0.0-20230530153820-e85fd2cbaebc 
h1:XSJ8Vk1SWuNr8S18z1NZSziL0CPIXLCCMDOEFtHBOFc= -google.golang.org/genproto/googleapis/rpc v0.0.0-20230530153820-e85fd2cbaebc/go.mod h1:66JfowdXAEgad5O9NnYcsNPLCPZJD++2L9X0PCMODrA= +google.golang.org/genproto v0.0.0-20240213162025-012b6fc9bca9 h1:9+tzLLstTlPTRyJTh+ah5wIMsBW5c4tQwGTN3thOW9Y= +google.golang.org/genproto v0.0.0-20240213162025-012b6fc9bca9/go.mod h1:mqHbVIp48Muh7Ywss/AD6I5kNVKZMmAa/QEW58Gxp2s= +google.golang.org/genproto/googleapis/api v0.0.0-20240318140521-94a12d6c2237 h1:RFiFrvy37/mpSpdySBDrUdipW/dHwsRwh3J3+A9VgT4= +google.golang.org/genproto/googleapis/api v0.0.0-20240318140521-94a12d6c2237/go.mod h1:Z5Iiy3jtmioajWHDGFk7CeugTyHtPvMHA4UTmUkyalE= +google.golang.org/genproto/googleapis/rpc v0.0.0-20240318140521-94a12d6c2237 h1:NnYq6UN9ReLM9/Y01KWNOWyI5xQ9kbIms5GGJVwS/Yc= +google.golang.org/genproto/googleapis/rpc v0.0.0-20240318140521-94a12d6c2237/go.mod h1:WtryC6hu0hhx87FDGxWCDptyssuo68sk10vYjF+T9fY= google.golang.org/grpc v1.19.0/go.mod h1:mqu4LbDTu4XGKhr4mRzUsmM4RtVoemTSY81AxZiDr8c= google.golang.org/grpc v1.20.1/go.mod h1:10oTOabMzJvdu6/UiuZezV6QK5dSlG84ov/aaiqXj38= google.golang.org/grpc v1.21.1/go.mod h1:oYelfM1adQP15Ek0mdvEgi9Df8B9CZIaU1084ijfRaM= @@ -535,9 +583,8 @@ google.golang.org/grpc v1.33.1/go.mod h1:fr5YgcSWrqhRRxogOsw7RzIpsmvOZ6IcH4kBYTp google.golang.org/grpc v1.33.2/go.mod h1:JMHMWHQWaTccqQQlmk3MJZS+GWXOdAesneDmEnv2fbc= google.golang.org/grpc v1.36.0/go.mod h1:qjiiYl8FncCW8feJPdyg3v6XW24KsRHe+dy9BAGRRjU= google.golang.org/grpc v1.40.0/go.mod h1:ogyxbiOoUXAkP+4+xa6PZSE9DZgIHtSpzjDTB9KAK34= -google.golang.org/grpc v1.45.0/go.mod h1:lN7owxKUQEqMfSyQikvvk5tf/6zMPsrK+ONuO11+0rQ= -google.golang.org/grpc v1.56.3 h1:8I4C0Yq1EjstUzUJzpcRVbuYA2mODtEmpWiQoN/b2nc= -google.golang.org/grpc v1.56.3/go.mod h1:I9bI3vqKfayGqPUAwGdOSu7kt6oIJLixfffKrpXqQ9s= +google.golang.org/grpc v1.62.1 h1:B4n+nfKzOICUXMgyrNd19h/I9oH0L1pizfk1d4zSgTk= +google.golang.org/grpc v1.62.1/go.mod h1:IWTG0VlJLCh1SkC58F7np9ka9mx/WNkjl4PGJaiq+QE= google.golang.org/protobuf 
v0.0.0-20200109180630-ec00e32a8dfd/go.mod h1:DFci5gLYBciE7Vtevhsrf46CRTquxDuWsQurQQe4oz8= google.golang.org/protobuf v0.0.0-20200221191635-4d8936d0db64/go.mod h1:kwYJMbMJ01Woi6D6+Kah6886xMZcty6N08ah7+eCXa0= google.golang.org/protobuf v0.0.0-20200228230310-ab0ca4ff8a60/go.mod h1:cfTl7dwQJ+fmap5saPgwCLgHXTUD7jkjRqWcaiX5VyM= @@ -553,8 +600,8 @@ google.golang.org/protobuf v1.26.0/go.mod h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQ google.golang.org/protobuf v1.27.1/go.mod h1:9q0QmTI4eRPtz6boOQmLYwt+qCgq0jsYwAQnmE0givc= google.golang.org/protobuf v1.28.0/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I= google.golang.org/protobuf v1.28.1/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I= -google.golang.org/protobuf v1.31.0 h1:g0LDEJHgrBl9N9r17Ru3sqWhkIx2NB67okBHPwC7hs8= -google.golang.org/protobuf v1.31.0/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I= +google.golang.org/protobuf v1.33.0 h1:uNO2rsAINq/JlFpSdYEKIZ0uKD/R9cpdv0T+yoGwGmI= +google.golang.org/protobuf v1.33.0/go.mod h1:c6P6GXX6sHbq/GpV6MGZEdwhWPcYBgnhAHhKbcUYpos= gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= @@ -564,6 +611,8 @@ gopkg.in/errgo.v2 v2.1.0/go.mod h1:hNsd1EY+bozCKY1Ytp96fpM3vjJbqLJn88ws8XvfDNI= gopkg.in/fsnotify.v1 v1.4.7/go.mod h1:Tz8NjZHkW78fSQdbUxIjBTcgA1z1m8ZHf0WmKUhAMys= gopkg.in/inf.v0 v0.9.1 h1:73M5CoZyi3ZLMOyDlQh031Cx6N9NDJ2Vvfl76EDAgDc= gopkg.in/inf.v0 v0.9.1/go.mod h1:cWUDdTG/fYaXco+Dcufb5Vnc6Gp2YChqWtbxRZE0mXw= +gopkg.in/ini.v1 v1.67.0 h1:Dgnx+6+nfE+IfzjUEISNeydPJh9AXNNsWbGP9KzCsOA= +gopkg.in/ini.v1 v1.67.0/go.mod h1:pNLf8WUiyNEtQjuu5G5vTm06TEv9tsIgeAvK8hOrP4k= gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7/go.mod h1:dt/ZhP58zS4L8KSrWDmTeBkI65Dw0HsyUHuEVlX15mw= gopkg.in/yaml.v2 v2.2.1/go.mod 
h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
 gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
diff --git a/patch/lmes/models.patch b/patch/lmes/models.patch
new file mode 100644
index 00000000..bce91337
--- /dev/null
+++ b/patch/lmes/models.patch
@@ -0,0 +1,12 @@
+diff --git a/lm_eval/models/__init__.py b/lm_eval/models/__init__.py
+index 698c912f..f523d46d 100644
+--- a/lm_eval/models/__init__.py
++++ b/lm_eval/models/__init__.py
+@@ -11,6 +11,7 @@ from . import (
+     optimum_lm,
+     textsynth,
+     vllm_causallms,
++    bam,
+ )
+
+