From 101ed59880795f28f61225b91be8606a905bb726 Mon Sep 17 00:00:00 2001
From: Jover
Date: Fri, 19 May 2023 16:41:12 -0700
Subject: [PATCH] Add reusable workflow "run-build"
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add a reusable workflow to run Nextstrain builds so that we can
centralize and standardize how our pathogen repos set up these builds.
Motivated by the latest hiccup with the Snakemake upgrade that required
fixes across multiple pathogen repos.¹

There are some "hacky" steps to get around the limitations of reusable
workflows:

1. The 'ref' input is required to ensure that we check out the matching
   ref within the reusable workflow, since it is currently not possible
   to detect the called workflow's ref.²

2. Since reusable workflows cannot access environment variables set in
   the calling workflow, this uses the yaml-to-envvars workaround with
   an `env` input already used in the pathogen-repo-ci workflow.³

3. We use a json-to-envvars workaround to allow for setting secrets as
   environment variables for the build runtime dynamically.

¹ https://bedfordlab.slack.com/archives/C01LCTT7JNN/p1681328249848399
² https://github.com/actions/toolkit/issues/1264
³ https://github.com/nextstrain/.github/blob/cc6f4385a45bd6ed114ab4840416fd90cc46cd1b/.github/workflows/pathogen-repo-ci.yaml#L25-L45
---
 .github/workflows/run-build.yaml             | 230 +++++++++++++++++++
 README.md                                    |  13 ++
 text-templates/attach-aws-batch.md           |   6 +
 workflow-templates/run-build.properties.json |   4 +
 workflow-templates/run-build.yaml            |  10 +
 5 files changed, 263 insertions(+)
 create mode 100644 .github/workflows/run-build.yaml
 create mode 100644 text-templates/attach-aws-batch.md
 create mode 100644 workflow-templates/run-build.properties.json
 create mode 100644 workflow-templates/run-build.yaml

diff --git a/.github/workflows/run-build.yaml b/.github/workflows/run-build.yaml
new file mode 100644
index 0000000..06c24bc
--- /dev/null
+++ b/.github/workflows/run-build.yaml
@@ -0,0 +1,230 @@
+# This workflow is intended to be called by workflows in our various pathogen
+# build repos. See workflow-templates/run-build.yaml (a "starter"
+# workflow) in this repo for an example of what the caller workflow looks like.
+name: Run build
+
+on:
+  workflow_call:
+    inputs:
+      ref:
+        description: >-
+          The reference to use for this reusable workflow. Please make sure
+          that this input ref matches the `@{ref}` that you use when calling the
+          reusable workflow to ensure that all other local files from the
+          nextstrain/.github repo are from the same ref.
+
+          This input is required as long as we are unable to detect the workflow
+          ref from within the reusable workflow.
+          See https://github.com/actions/toolkit/issues/1264
+        type: string
+        required: true
+
+      runtime:
+        description: >-
+          Nextstrain runtime under which to run the build.
+          Currently only supports docker, conda, and aws-batch.
+          Defaults to "docker".
+        type: string
+        default: docker
+        required: false
+
+      env:
+        description: >-
+          Environment variables to set for this reusable workflow since
+          environment variables in the caller workflow are not propagated to
+          reusable workflows. This is expected to be a string containing YAML.
+
+          This is easily produced, for example, by pretending
+          you're writing normal nested YAML within a literal multi-line block
+          scalar (introduced by "|"):
+
+            with:
+              env: |
+                FOO: bar
+                I_CANT_BELIEVE: "it's not YAML"
+                would_you_believe: |
+                  it's
+                  not
+                  yaml
+
+          Do not use for secrets! Instead, pass them via GitHub Actions'
+          dedicated secrets mechanism.
+        type: string
+        default: ""
+        required: false
+
+      runtime-envvar-names:
+        description: >-
+          A list of environment variables that should be passed to the build's
+          runtime. The environment variables that the Nextstrain CLI automatically
+          forwards to the runtime do not need to be listed here.
+          See the list of environment variables that are auto-forwarded in
+          the Nextstrain CLI repo (https://github.com/nextstrain/cli/blob/master/nextstrain/cli/hostenv.py)
+
+          The environment variables must be passed via the `env` input or
+          via GitHub Actions' secrets that are inherited by using `secrets: inherit`.
+          The environment variable names should be separated by a space,
+          e.g. "SLACK_TOKEN SLACK_CHANNELS".
+        type: string
+        required: false
+
+      cpus:
+        description: >-
+          Number of CPUs/cores/threads/jobs to utilize at once.
+          Will be passed to the `--cpus` option for `nextstrain build` and the
+          `--cores` option for Snakemake. See `nextstrain build` docs for more details:
+          https://docs.nextstrain.org/projects/cli/page/commands/build/
+        type: number
+        default: 4
+        required: false
+
+      memory:
+        description: >-
+          Amount of memory to make available to the build. Passed to the
+          `--memory` option for `nextstrain build`.
+          See `nextstrain build` docs for more details:
+          https://docs.nextstrain.org/projects/cli/page/commands/build/
+        type: string
+        required: false
+
+      build-args:
+        description: >-
+          Additional command-line arguments to pass to `nextstrain build` after
+          the build directory (i.e. additional args for Snakemake).
+        type: string
+        default: ""
+        required: false
+
+      build-output-paths:
+        description: >-
+          List of build output paths to include in the "build-output" uploaded
+          at the end of the workflow, as a string following the format of the
+          `paths` input of the `actions/upload-artifact` action.
+          For example:
+
+            with:
+              build-output-paths: |
+                results/
+                auspice/
+                logs/
+
+          The output will always include a "build.log" file which contains log
+          messages from the `nextstrain build` command. The envdir created to
+          pass environment variables to the build runtime will always be
+          excluded from the output to prevent secrets from being leaked.
+
+          This is not supported for builds on AWS Batch because the workflow
+          detaches from the build. Please use the `nextstrain build` command
+          locally to reattach to AWS Batch builds to download outputs.
+        type: string
+        default: ""
+        required: false
+
+env:
+  BUILD_DIR: build-dir
+  ENVDIR: env.d
+
+jobs:
+  run-build:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout nextstrain/.github (ref ${{ inputs.ref }})
+        uses: actions/checkout@v3
+        with:
+          repository: nextstrain/.github
+          ref: ${{ inputs.ref }}
+
+      - name: Setup runtime ${{ inputs.runtime }}
+        uses: ./actions/setup-nextstrain-cli
+        with:
+          runtime: ${{ inputs.runtime }}
+
+      # envdir is not available in our Conda runtime
+      # (See https://github.com/nextstrain/conda-base/blob/135ec65cc286b6fc22f717a1157667cb9a7e0a55/src/recipe.yaml#L52)
+      # so it needs to be installed separately in the runtime environment.
+      #
+      # Installing directly in the nextstrain shell since pip installed packages
+      # do persist in the Conda runtime (https://github.com/nextstrain/cli/issues/282)
+      # -Jover, 19 May 2023
+      - if: ${{ inputs.runtime == 'conda' }}
+        name: Install envdir for Conda runtime
+        shell: nextstrain shell . {0}
+        run: pip install envdir
+
+      - name: Checkout build repository
+        uses: actions/checkout@v3
+        with:
+          repository: ${{ github.repository }}
+          path: ${{ env.BUILD_DIR }}
+
+      - if: inputs.env
+        name: Set environment variables
+        env:
+          env: ${{ inputs.env }}
+        run: >
+          # shellcheck disable=SC2154
+
+          echo "$env"
+          | ./bin/yaml-to-envvars
+          | tee -a "$GITHUB_ENV"
+
+      - name: Set secret runtime environment variables
+        env:
+          secrets: ${{ toJson(secrets) }}
+          varnames: ${{ inputs.runtime-envvar-names }}
+        run: >
+          # shellcheck disable=SC2154
+
+          echo "$secrets"
+          | ./bin/json-to-envvars --varnames="$varnames"
+          | tee -a "$GITHUB_ENV"
+
+      - name: Set build environment variables
+        shell: bash
+        env:
+          varnames: ${{ inputs.runtime-envvar-names }}
+        run: |
+          # shellcheck disable=SC2154
+
+          read -a varnames -r -d $'\0' <<<"$varnames"
+          ./bin/write-envdir "${{ env.BUILD_DIR }}/${{ env.ENVDIR }}" "${varnames[@]}"
+
+      - name: Run build via ${{ inputs.runtime }}
+        env:
+          memory: ${{ inputs.memory }}
+          NEXTSTRAIN_DOCKER_IMAGE: ${{ env.NEXTSTRAIN_DOCKER_IMAGE }}
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
+        run: |
+          nextstrain build \
+            --cpus ${{ inputs.cpus }} \
+            --memory ${{ env.memory }} \
+            --detach \
+            --exec env \
+            ${{ env.BUILD_DIR }} \
+              envdir ${{ env.ENVDIR }} snakemake \
+                --cores ${{ inputs.cpus }} \
+                ${{ inputs.build-args }} \
+          | tee build.log
+
+      - if: ${{ inputs.runtime == 'aws-batch' }}
+        name: Get AWS Batch job id
+        id: aws-batch
+        run: |
+          echo "AWS_BATCH_JOB_ID=$(tail -n1 build.log | sed -E 's/.+attach ([-a-f0-9]+).+/\1/')" >> "$GITHUB_ENV"
+
+      - if: env.AWS_BATCH_JOB_ID
+        name: Generate AWS Batch summary
+        run: |
+          ./bin/interpolate-text-template \
+            text-templates/attach-aws-batch.md \
+            > "$GITHUB_STEP_SUMMARY"
+
+      - if: always()
+        uses: actions/upload-artifact@v3
+        with:
+          name: build-output
+          path: |
+            build.log
+            ${{ inputs.build-output-paths }}
+            !${{ env.ENVDIR }}
diff --git a/README.md b/README.md
index 40bd820..f785ca5 100644
--- a/README.md
+++ b/README.md
@@ -52,6 +52,9 @@ Invoked by other repos.
 - Sync RTD redirects
   ([workflow](.github/workflows/sync-rtd-redirects.yaml))
 
+- Run Nextstrain build
+  ([workflow](.github/workflows/run-build.yaml))
+
 See also GitHub's [documentation on reusing workflows](https://docs.github.com/en/actions/using-workflows/reusing-workflows).
 
 
@@ -82,6 +85,10 @@ Used to setup other repos.
   ([template](workflow-templates/debugging-runner.yaml),
   [properties](workflow-templates/debugging-runner.properties.json))
 
+- Run Nextstrain build
+  ([template](workflow-templates/run-build.yaml),
+  [properties](workflow-templates/run-build.properties.json))
+
 See also GitHub's [documentation on starter workflows](https://docs.github.com/en/actions/using-workflows/creating-starter-workflows-for-your-organization).
 
 
@@ -99,3 +106,9 @@ Executable scripts that are used in our workflows.
 - [write-envdir](bin/write-envdir)
 - [json-to-envvars](bin/json-to-envvars)
 - [yaml-to-envvars](bin/yaml-to-envvars)
+
+## Workflow text templates
+
+Text templates for messages and summaries in our workflows.
+
+- [attach-aws-batch](text-templates/attach-aws-batch.md)
diff --git a/text-templates/attach-aws-batch.md b/text-templates/attach-aws-batch.md
new file mode 100644
index 0000000..abc9443
--- /dev/null
+++ b/text-templates/attach-aws-batch.md
@@ -0,0 +1,6 @@
+You can run the following command in your local build directory to re-attach to
+this AWS Batch job later to see output and download results:
+
+```
+nextstrain build --aws-batch --attach ${AWS_BATCH_JOB_ID} .
+```
diff --git a/workflow-templates/run-build.properties.json b/workflow-templates/run-build.properties.json
new file mode 100644
index 0000000..b32b500
--- /dev/null
+++ b/workflow-templates/run-build.properties.json
@@ -0,0 +1,4 @@
+{
+    "name": "Run build",
+    "description": "Starter workflow for setting up Nextstrain builds."
+}
diff --git a/workflow-templates/run-build.yaml b/workflow-templates/run-build.yaml
new file mode 100644
index 0000000..40f5acb
--- /dev/null
+++ b/workflow-templates/run-build.yaml
@@ -0,0 +1,10 @@
+name: Run build
+
+on: workflow_dispatch
+
+jobs:
+  run-build:
+    uses: nextstrain/.github/.github/workflows/run-build.yaml@master
+    with:
+      ref: master
+    secrets: inherit
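
Note on the "Get AWS Batch job id" step: the sketch below shows, outside of GitHub Actions, how the `tail | sed` pipeline from the workflow pulls a job ID out of the last line of `build.log`. The log contents and the job ID are made up for illustration, and it assumes the final log line has the shape `... --attach <job-id> .`, which is what the step itself relies on. The function name `extract_job_id` is hypothetical and not part of this patch.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Extract the AWS Batch job ID from the last line of a build log, using the
# same sed expression as the workflow step. (Hypothetical helper name.)
extract_job_id() {
  tail -n1 "$1" | sed -E 's/.+attach ([-a-f0-9]+).+/\1/'
}

# Made-up log contents; only the shape of the last line matters here.
log="$(mktemp)"
cat > "$log" <<'EOF'
Submitting job
Detached from the build.
nextstrain build --aws-batch --attach 1a2b3c4d-0000-4abc-8def-0123456789ab .
EOF

job_id="$(extract_job_id "$log")"
rm -f "$log"

# In the workflow this line is appended to "$GITHUB_ENV" instead of printed.
echo "AWS_BATCH_JOB_ID=$job_id"
```

One design consequence worth keeping in mind: when the last log line does not match the pattern, `sed` prints that line unchanged, so `AWS_BATCH_JOB_ID` would be set to arbitrary log text rather than left empty. The later `if: env.AWS_BATCH_JOB_ID` check would then still pass.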