Automation
There are six CI/CD pipelines:

- `pr-checks.yaml` - runs pre-commit checks and unit tests on the custom KFP components, and checks that the ML pipelines (training and prediction) can compile.
- `trigger-tests.yaml` - runs unit tests for the Cloud Function located in `terraform/modules/cloudfunction`. If you don't need to change this code, you can ignore this CI/CD pipeline.
- `e2e-test.yaml` - runs end-to-end tests of the training and prediction pipeline.
- `release.yaml` - compiles the training and prediction pipelines, then copies the compiled pipelines to the chosen GCS destination (versioned by git tag).
- `terraform-plan.yaml` - checks the Terraform configuration under `terraform/envs/<env>` (e.g. `terraform/envs/test`), and produces a summary of any proposed changes that will be applied on merge to the main branch.
- `terraform-apply.yaml` - applies the Terraform configuration under `terraform/envs/<env>` (e.g. `terraform/envs/test`).
We recommend using a separate `admin` project, since the CI/CD pipelines operate across all the different environments (dev/test/prod).
See the Google Cloud Documentation for details on how to link your repository to Cloud Build, and set up triggers.
Your Cloud Build pipelines will need a service account to use. Create a new service account named `cloud-build` in the `admin` project. Then, give it these permissions in the different Google Cloud projects:

- dev/test/prod projects - `roles/owner`
- admin project - `roles/logging.logWriter`
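As a minimal sketch, the service account and role bindings could be created with `gcloud` as follows (all project IDs are placeholders for your own values):

```bash
# Create the service account in the admin project
gcloud iam service-accounts create cloud-build \
  --project=<admin project ID> \
  --display-name="Cloud Build CI/CD"

# Grant roles/owner on each of the dev/test/prod projects
for PROJECT in <dev project ID> <test project ID> <prod project ID>; do
  gcloud projects add-iam-policy-binding "$PROJECT" \
    --member="serviceAccount:cloud-build@<admin project ID>.iam.gserviceaccount.com" \
    --role="roles/owner"
done

# Grant roles/logging.logWriter on the admin project
gcloud projects add-iam-policy-binding <admin project ID> \
  --member="serviceAccount:cloud-build@<admin project ID>.iam.gserviceaccount.com" \
  --role="roles/logging.logWriter"
```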
Set up a trigger for the `pr-checks.yaml` pipeline.
We recommend adding `make pre-commit` (which is already part of the `Makefile`) to keep your ML use case code clean. By default, pull requests don't execute pre-commit hooks, so that the template stays easy to use for new users.
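For example, a pull-request trigger for `pr-checks.yaml` could be created with `gcloud` like this. This is only a sketch: it assumes a GitHub repository that is already connected to Cloud Build, a recent `gcloud` version, that the build config sits at the repository root, and that pull requests target `main` - adjust the placeholders and paths to match your setup.

```bash
gcloud builds triggers create github \
  --project=<admin project ID> \
  --name=pr-checks \
  --repo-owner=<your GitHub org or user> \
  --repo-name=<your repository name> \
  --pull-request-pattern="^main$" \
  --build-config=pr-checks.yaml \
  --service-account=projects/<admin project ID>/serviceAccounts/cloud-build@<admin project ID>.iam.gserviceaccount.com
```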
Set up a trigger for the `e2e-test.yaml` pipeline, and provide substitution values for the following variables:
| Variable | Description | Suggested value |
|---|---|---|
| `_TEST_VERTEX_CMEK_IDENTIFIER` | Optional. ID of the CMEK (Customer Managed Encryption Key) that you want to use for the ML pipeline runs in the E2E tests as part of the CI/CD pipeline, in the format `projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key`. | Leave blank |
| `_TEST_VERTEX_LOCATION` | The Google Cloud region where you want to run the ML pipelines in the E2E tests as part of the CI/CD pipeline. | Your chosen Google Cloud region |
| `_TEST_VERTEX_NETWORK` | Optional. The full name of the Compute Engine network to which the ML pipelines should be peered during the E2E tests as part of the CI/CD pipeline, in the format `projects/<project number>/global/networks/my-vpc`. | |
| `_TEST_VERTEX_PIPELINE_ROOT` | The GCS folder (i.e. path prefix) that you want to use for the pipeline artifacts and for passing data between stages in the pipeline. Used during the pipeline runs in the E2E tests as part of the CI/CD pipeline. | `gs://<Project ID for dev environment>-pl-root` |
| `_TEST_VERTEX_PROJECT_ID` | Google Cloud project ID in which you want to run the ML pipelines in the E2E tests as part of the CI/CD pipeline. | Project ID for the dev environment |
| `_TEST_VERTEX_SA_EMAIL` | Email address of the service account you want to use to run the ML pipelines in the E2E tests as part of the CI/CD pipeline. | `vertex-pipelines@<Project ID for dev environment>.iam.gserviceaccount.com` |
| `_TEST_ENABLE_PIPELINE_CACHING` | Override the default caching behaviour of the ML pipelines. Leave blank to use the default caching behaviour. | `False` |
| `_TEST_BQ_LOCATION` | The location of the BigQuery datasets used in the training and prediction pipelines. | `US` or `EU` if using multi-region datasets |
We recommend enabling comment control for this trigger (select `Required` under `Comment Control`). This means that the end-to-end tests will only run once a repository collaborator or owner comments `/gcbrun` on the pull request.
This helps to avoid unnecessary runs of the ML pipelines while a pull request is still being worked on, as they can take a long time (and can be expensive to run on every pull request!).
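A sketch of the equivalent `gcloud` command, using a handful of the substitutions from the table above (add the remaining `_TEST_*` variables in the same way). The repository details and values are placeholders, and `--comment-control=COMMENTS_ENABLED` is intended to correspond to the `Required` setting described above.

```bash
gcloud builds triggers create github \
  --project=<admin project ID> \
  --name=e2e-test \
  --repo-owner=<your GitHub org or user> \
  --repo-name=<your repository name> \
  --pull-request-pattern="^main$" \
  --comment-control=COMMENTS_ENABLED \
  --build-config=e2e-test.yaml \
  --service-account=projects/<admin project ID>/serviceAccounts/cloud-build@<admin project ID>.iam.gserviceaccount.com \
  --substitutions='_TEST_VERTEX_LOCATION=<your region>,_TEST_VERTEX_PROJECT_ID=<dev project ID>,_TEST_VERTEX_PIPELINE_ROOT=gs://<dev project ID>-pl-root,_TEST_VERTEX_SA_EMAIL=vertex-pipelines@<dev project ID>.iam.gserviceaccount.com'
```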
Set up three triggers for `terraform-plan.yaml` - one for each of the dev/test/prod environments. Set the Cloud Build substitution variables as follows:
| Environment | Cloud Build substitution variables |
|---|---|
| dev | `_PROJECT_ID=<Google Cloud Project ID for the dev environment>`<br>`_REGION=<Google Cloud region to use for the dev environment>`<br>`_ENV_DIRECTORY=terraform/envs/dev` |
| test | `_PROJECT_ID=<Google Cloud Project ID for the test environment>`<br>`_REGION=<Google Cloud region to use for the test environment>`<br>`_ENV_DIRECTORY=terraform/envs/test` |
| prod | `_PROJECT_ID=<Google Cloud Project ID for the prod environment>`<br>`_REGION=<Google Cloud region to use for the prod environment>`<br>`_ENV_DIRECTORY=terraform/envs/prod` |
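A sketch of the dev trigger (repeat with the test/prod values from the table). Since the plan output summarises changes that would be applied on merge to `main`, a pull-request trigger is one reasonable choice, but the pattern is only an assumption - adapt it to your review workflow.

```bash
gcloud builds triggers create github \
  --project=<admin project ID> \
  --name=terraform-plan-dev \
  --repo-owner=<your GitHub org or user> \
  --repo-name=<your repository name> \
  --pull-request-pattern="^main$" \
  --build-config=terraform-plan.yaml \
  --service-account=projects/<admin project ID>/serviceAccounts/cloud-build@<admin project ID>.iam.gserviceaccount.com \
  --substitutions='_PROJECT_ID=<dev project ID>,_REGION=<dev region>,_ENV_DIRECTORY=terraform/envs/dev'
```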
Set up a trigger for the `release.yaml` pipeline, and provide substitution values for the following variables:
| Variable | Description | Suggested value |
|---|---|---|
| `_PIPELINE_PUBLISH_AR_PATHS` | The (space-separated) Artifact Registry repositories (plural!) where the compiled pipelines will be copied to - one for each environment (dev/test/prod). | `https://europe-west2-kfp.pkg.dev/<Project ID for dev environment>/vertex-pipelines https://europe-west2-kfp.pkg.dev/<Project ID for test environment>/vertex-pipelines https://europe-west2-kfp.pkg.dev/<Project ID for prod environment>/vertex-pipelines` |
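Since the compiled pipelines are versioned by git tag, a tag-based trigger is a natural fit. A sketch, where the tag pattern, repository details and project IDs are all placeholders to adapt:

```bash
gcloud builds triggers create github \
  --project=<admin project ID> \
  --name=release \
  --repo-owner=<your GitHub org or user> \
  --repo-name=<your repository name> \
  --tag-pattern="v.*" \
  --build-config=release.yaml \
  --service-account=projects/<admin project ID>/serviceAccounts/cloud-build@<admin project ID>.iam.gserviceaccount.com \
  --substitutions='_PIPELINE_PUBLISH_AR_PATHS=https://europe-west2-kfp.pkg.dev/<dev project ID>/vertex-pipelines https://europe-west2-kfp.pkg.dev/<test project ID>/vertex-pipelines https://europe-west2-kfp.pkg.dev/<prod project ID>/vertex-pipelines'
  # Note: the space-separated AR paths are passed as a single quoted substitution value.
```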
Set up three triggers for `terraform-apply.yaml` - one for each of the dev/test/prod environments. Set the Cloud Build substitution variables as follows:
| Environment | Cloud Build substitution variables |
|---|---|
| dev | `_PROJECT_ID=<Google Cloud Project ID for the dev environment>`<br>`_REGION=<Google Cloud region to use for the dev environment>`<br>`_ENV_DIRECTORY=terraform/envs/dev` |
| test | `_PROJECT_ID=<Google Cloud Project ID for the test environment>`<br>`_REGION=<Google Cloud region to use for the test environment>`<br>`_ENV_DIRECTORY=terraform/envs/test` |
| prod | `_PROJECT_ID=<Google Cloud Project ID for the prod environment>`<br>`_REGION=<Google Cloud region to use for the prod environment>`<br>`_ENV_DIRECTORY=terraform/envs/prod` |
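Because the Terraform configuration is applied on merge to the main branch, a push trigger on `main` is one reasonable setup. A sketch for the dev environment (repeat for test/prod with the values from the table; repository details are placeholders):

```bash
gcloud builds triggers create github \
  --project=<admin project ID> \
  --name=terraform-apply-dev \
  --repo-owner=<your GitHub org or user> \
  --repo-name=<your repository name> \
  --branch-pattern="^main$" \
  --build-config=terraform-apply.yaml \
  --service-account=projects/<admin project ID>/serviceAccounts/cloud-build@<admin project ID>.iam.gserviceaccount.com \
  --substitutions='_PROJECT_ID=<dev project ID>,_REGION=<dev region>,_ENV_DIRECTORY=terraform/envs/dev'
```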