Serverless & containerized tasks using Google Cloud Run, Pub/Sub, Cloud Storage and Terraform

Overview

(A more detailed version is available on my blog.)

We will build a serverless service that listens for new files in a Cloud Storage bucket via Pub/Sub, runs a small containerized process via Cloud Run once files arrive, and publishes the results of the process to another bucket.

As the process we will run a simple time series forecasting model (Facebook's Prophet). Alternatively, you could run batch predictions using a trained model artifact or an external model service, for example to classify images or videos, or run a simple script to generate some plots.
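
Under the hood, the container is just a small web service that Pub/Sub pushes Cloud Storage notifications to. Below is a minimal, hypothetical sketch of such a handler; the actual app in app/ differs in detail, and the input column names, the output bucket constant and the 30-day horizon are assumptions:

import base64
import json
import logging

import pandas as pd
from flask import Flask, request
from fbprophet import Prophet  # the package is named "prophet" in newer releases

logging.basicConfig(level=logging.INFO)

app = Flask(__name__)

# assumption: the real app would get this from an environment variable
OUTPUT_BUCKET = "my-cloud-runner-output-bucket"

@app.route("/", methods=["POST"])
def handle_message():
    # Pub/Sub push envelope: the base64-encoded "data" field carries the
    # Cloud Storage notification with the bucket and object name
    envelope = request.get_json()
    notification = json.loads(base64.b64decode(envelope["message"]["data"]))
    input_uri = f"gs://{notification['bucket']}/{notification['name']}"
    logging.info("Data received: %s", input_uri)

    # pandas reads and writes gs:// paths directly when gcsfs is installed;
    # the input column names are assumed here
    data = pd.read_csv(input_uri).rename(columns={"date": "ds", "value": "y"})

    # fit a plain Prophet model and forecast 30 days ahead
    model = Prophet()
    model.fit(data)
    future = model.make_future_dataframe(periods=30)
    forecast = model.predict(future)

    output_uri = f"gs://{OUTPUT_BUCKET}/{notification['name']}"
    forecast[["ds", "yhat"]].rename(columns={"ds": "date"}).to_csv(output_uri, index=False)
    logging.info("Output file: %s", output_uri)

    # any 2xx response acknowledges the Pub/Sub message
    return "", 204

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)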

Prerequisites

Authenticate with Google to enable planning and applying with Terraform and to allow updating the image:

gcloud auth application-default login

You can check if you are authenticated with the right user using gcloud auth list.

We run Terraform v0.14.0.

Setup

Run terraform plan & apply using the setup script setup.sh, which contains the following steps:

# build container once to enable caching
(cd app && 
	docker build -t cloud-runner .)

# provision the infrastructure
(cd terraform && 
	terraform init && 
	terraform apply)

Test

Since our prototype wraps Facebook's Prophet package in a very simple way, let's run a time series forecast on a small dataset: the hyndsight dataset. It contains the daily pageviews of Rob J. Hyndman's blog from 2014-04-30 to 2015-04-29. Rob J. Hyndman is the author of many popular R forecasting packages (including forecast), the author of numerous books and research papers, and a forecasting expert. The time series shows a prominent weekly pattern and an upward trend, which we will see below.

We can upload the dataset to GCS with gsutil:

gsutil cp app/data/hyndsight.csv gs://my-cloud-runner-input-bucket/hyndsight.csv

If our infrastructure works properly, we should see the results in the output bucket within a few seconds. In the meantime, we can check the Cloud Run logs in the Google Cloud console, where we should see that the container received some data and stored an output file:

INFO:root:Data received: ...
...
INFO:root:Output file: gs://my-cloud-runner-output-bucket/hyndsight.csv
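
Alternatively, we can poll the output bucket programmatically; a small sketch, assuming the google-cloud-storage client library is installed:

import time

from google.cloud import storage

client = storage.Client()
blob = client.bucket("my-cloud-runner-output-bucket").blob("hyndsight.csv")

# check every few seconds until the forecast file shows up
while not blob.exists():
    time.sleep(5)

print(f"forecast available: gs://{blob.bucket.name}/{blob.name}")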

We can then fetch the output file:

gsutil cp gs://my-cloud-runner-output-bucket/hyndsight.csv app/data/hyndsight_forecast.csv 

Read both datasets and plot them:

import matplotlib.pyplot as plt
import pandas as pd

actual = pd.read_csv("app/data/hyndsight.csv")
forecast = pd.read_csv("app/data/hyndsight_forecast.csv")

# stack actual and forecast and plot both against the date
data = pd.concat([actual, forecast])

data.plot(x="date", figsize=(12, 5))
plt.show()  # needed when running outside a notebook

Deploy new image

Run the simple deploy script deploy.sh, which contains the following steps:

# get project id, image uri and service name from terraform output
PROJECT_ID=$(cd terraform && terraform output -json | jq -r .project_id.value)
IMAGE_URI=$(cd terraform && terraform output -json | jq -r .image_uri.value)
SERVICE_NAME=$(cd terraform && terraform output -json | jq -r .service_name.value)

# build and push image
(cd app && 
	./build.sh && 
	IMAGE_URI=$IMAGE_URI ./push.sh)

# update image
gcloud --project $PROJECT_ID \
	run services update $SERVICE_NAME \
	--image $IMAGE_URI \
	--platform managed \
	--region europe-west3

# send traffic to latest
gcloud --project $PROJECT_ID \
	run services update-traffic $SERVICE_NAME \
	--platform managed \
	--region europe-west3 \
	--to-latest

Destroy

Run the destroy script _destroy.sh to delete(!) the bucket contents and destroy the project, or execute the following steps:

# delete bucket contents
gsutil rm "gs://my-cloud-runner-input-bucket/**"
gsutil rm "gs://my-cloud-runner-output-bucket/**"

# destroy infra
(cd terraform && 
	terraform state rm "google_project_iam_member.project_owner" &&
	terraform destroy)

Remarks

  • Cloud Run has a maximum request timeout of 15 minutes and Pub/Sub a maximum acknowledgement deadline of 10 minutes, which makes this setup unsuitable for more time-consuming tasks. You can assign bigger resources to the container to speed up processing, though.
  • Instead of uploading the data to the buckets directly, we could upload some kind of manifest containing the path to the data and some ids, to make integration into other systems easier (see the sketch after this list).
  • The IAM setup of this project is very naive and should be revisited.
  • Make sure to always use two different buckets for input and output: if output files land in the input bucket, they will trigger the service again and cause an endless loop.
  • The simplest levers to tune and scale the system are container resources and concurrency.
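
To illustrate the manifest idea from the remarks above: such a file could be as small as a bit of JSON with a correlation id and the data locations. All field names here are hypothetical:

import json

manifest = {
    "job_id": "hyndsight-2021-01",  # correlation id for downstream systems
    "input_uri": "gs://some-upstream-bucket/hyndsight.csv",
    "output_uri": "gs://my-cloud-runner-output-bucket/hyndsight.csv",
}

# the service would then read this manifest instead of the raw data file
with open("manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)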
