Getting started

This document explains how to get started with the “Fleet Events” reference solution. It provides step-by-step instructions and technical details to help you understand the components and deploy a working reference solution to your Google Cloud project.

This solution can be beneficial for your near-real-time use cases. It uses near-real-time events produced by Fleet Engine as a basis for generating custom events for your specific requirements. The reference solution in this repository includes code for creating fleet events and automated deployment using Terraform.

The example covers the following use cases:

  • Task Outcome
  • ETA Updates

Overview

To adjust the configuration parameters, you need to have at least a high-level understanding of the building blocks before starting the deployment.

FleetEvents overview

The end state:

  • There will be two projects: one for Fleet Engine and one for the Fleet Events reference solution. The projects are separated so that each can be tied to its own billing account. For example, the Fleet Events project is tied to your GCP billing account, and the Fleet Engine project to your Google Maps Platform Mobility billing account.
  • Logs from the Fleet Engine project will be set up to flow into a Cloud Pub/Sub topic in the “Fleet Events” project.
  • A Cloud Function that processes messages from the topic will be deployed. This function contains the implementation specific to the use case.
  • A Pub/Sub topic to which the generated events are published will be deployed.
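To make the data flow above more concrete, here is a minimal Python sketch of how a log entry delivered through Pub/Sub could be decoded before a handler inspects it. It is illustrative only and is not code from this repository: the names `decode_log_entry` and `make_message` and the sample payload are assumptions. The only things relied on are that a Pub/Sub message carries its body base64-encoded in a `data` field and that a Cloud Logging LogEntry has `logName`, `timestamp`, and `jsonPayload` fields.

```python
import base64
import json


def decode_log_entry(pubsub_message: dict) -> dict:
    """Decode the base64 `data` field of a Pub/Sub message into a LogEntry dict."""
    raw = base64.b64decode(pubsub_message["data"])
    return json.loads(raw.decode("utf-8"))


def make_message(entry: dict) -> dict:
    """Hypothetical helper: wrap a LogEntry dict the way Pub/Sub delivers it."""
    data = base64.b64encode(json.dumps(entry).encode("utf-8")).decode("ascii")
    return {"data": data}


if __name__ == "__main__":
    sample = {
        "logName": "projects/my-fleetengine-project/logs/example",  # placeholder
        "timestamp": "2024-01-01T00:00:00Z",
        "jsonPayload": {"example": True},  # placeholder payload, not a real Fleet Engine log
    }
    entry = decode_log_entry(make_message(sample))
    print(entry["logName"], entry["timestamp"])
```

A real handler would go on to inspect `jsonPayload` and decide whether to publish a custom event to the output topic.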

Deploy the reference solution

Prerequisites

Information about your environment

Note down the following information before getting started.

  • Fleet Engine project (project id): Your active project that is enabled for the “On Demand Rides and Deliveries Solution” (ODRD) or the “Last Mile Fleet Solution” (LMFS). This project is the event source. When handling events, you will often need to access Fleet Engine's APIs to retrieve information, so your code needs to know which project to access.
    • example) my-fleetengine-project
  • Fleet Events project (project id): The project in which this reference solution will be deployed. This project can be pre-created or be generated as part of the automated provisioning. A clean developer sandbox project is highly recommended for safe experiments.
    • example) my-fleetevents-project
  • Billing Account for Google Cloud: This reference solution will make use of several Google Cloud Platform services. Therefore, the Billing Account for GCP usage is required. You may have different billing arrangements between Google Maps Platform (GMP) and Google Cloud Platform (GCP), in which case the Fleet Events project above should be associated with the latter. If you are not sure, please reach out to your Google account manager or Google Maps Platform reseller.
    • example) XXXXXX-XXXXXX-XXXXXX (alphanumeric)
  • Project Folder: If your organization is adopting a folder structure to manage multiple Google Cloud projects, identify the folder and its id (number) under which the Fleet Events projects should reside.
    • example) XXXXXXXXXXXX (all digits)
  • Developer Google account: For simplicity, the deployment automation will run under a developer’s account and its privileges. The developer will be given permissions on the resources created.
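Before moving on, it can help to sanity-check the values you noted down. The following Python sketch is one possible way to do that. The dictionary keys and the function name `check_inputs` are hypothetical, the billing account and folder patterns simply mirror the example formats shown in the list above, and the project id pattern follows the commonly documented rule (6 to 30 characters, lowercase letters, digits, and hyphens, starting with a letter and not ending with a hyphen).

```python
import re

# Project ids: 6-30 chars, lowercase letters, digits, hyphens; starts with a letter, no trailing hyphen.
PROJECT_ID_RE = re.compile(r"^[a-z][a-z0-9-]{4,28}[a-z0-9]$")
# Billing account ids in the XXXXXX-XXXXXX-XXXXXX (alphanumeric) form shown above.
BILLING_ACCOUNT_RE = re.compile(r"^[0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{6}$")
# Folder ids are all digits.
FOLDER_ID_RE = re.compile(r"^[0-9]+$")


def check_inputs(values: dict) -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for key in ("fleetengine_project", "fleetevents_project"):
        if not PROJECT_ID_RE.match(values.get(key, "")):
            problems.append(f"{key}: not a valid project id")
    if not BILLING_ACCOUNT_RE.match(values.get("billing_account", "")):
        problems.append("billing_account: expected XXXXXX-XXXXXX-XXXXXX")
    folder = values.get("folder_id")  # optional: only checked when provided
    if folder is not None and not FOLDER_ID_RE.match(folder):
        problems.append("folder_id: expected digits only")
    return problems


if __name__ == "__main__":
    print(check_inputs({
        "fleetengine_project": "my-fleetengine-project",
        "fleetevents_project": "my-fleetevents-project",
        "billing_account": "ABCDEF-123456-ABC123",  # placeholder value
    }))
```

These checks only catch formatting mistakes; they do not confirm that the projects or billing account actually exist.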

Tools and setup

Check permission to configure projects

Terraform will run under your Google account’s permissions. (You can also run Terraform under service accounts. Resources for learning about such a configuration are linked later in this document.)

For testing, give your developer user account the IAM role roles/editor in both your Fleet Engine project and your Fleet Events project. This gives you most of the permissions required to quickly go through this solution. A few additional IAM roles are also required.

Fleet Engine project

  • roles/logging.configWriter: Contains the permissions required to create logging sinks, which define where log messages are delivered.

Fleet Events project

  • roles/iam.securityAdmin: Includes permission to set IAM policies for various resources such as Pub/Sub Topics or Subscriptions, BigQuery datasets, etc. Required to apply least privilege principles.
  • roles/datastore.owner: Includes the permissions required to create a Firestore database instance.

Here is an example of granting a user account the required IAM roles with the gcloud CLI. The same can be achieved through manual steps in the Cloud Console UI.

# set your email address in a variable
USER_EMAIL="<YOUR EMAIL ADDRESS>"


# set your Fleet Engine project 
FLEETENGINE_PROJECT_ID=XXXXXXXX

# give your account a broad set of permissions by applying the roles/editor role
gcloud projects add-iam-policy-binding $FLEETENGINE_PROJECT_ID \
    --member=user:$USER_EMAIL --role=roles/editor

# give your account the permissions required to configure Logging
gcloud projects add-iam-policy-binding $FLEETENGINE_PROJECT_ID \
    --member=user:$USER_EMAIL --role=roles/logging.configWriter


# set your Fleet Events project
FLEETEVENTS_PROJECT_ID=XXXXXXXX

# give your account a broad set of permissions by applying the roles/editor role
gcloud projects add-iam-policy-binding $FLEETEVENTS_PROJECT_ID \
    --member=user:$USER_EMAIL --role=roles/editor

# give your account the permissions required to configure IAM for various resources
gcloud projects add-iam-policy-binding $FLEETEVENTS_PROJECT_ID \
    --member=user:$USER_EMAIL --role=roles/iam.securityAdmin

# give your account the permissions required to configure Firestore database instances
gcloud projects add-iam-policy-binding $FLEETEVENTS_PROJECT_ID \
    --member=user:$USER_EMAIL --role=roles/datastore.owner

For production deployments, consider tighter access control, including the adoption of Service Accounts to remove dependencies on specific individuals. Also use custom roles to grant only the minimum set of permissions required for setup.
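If you prefer to generate the role bindings rather than type them one by one, the following Python sketch builds one `gcloud projects add-iam-policy-binding` command per role listed in this section. It only prints the commands and does not execute anything. The `ROLES` grouping keys, the function name `binding_commands`, and the sample email are hypothetical.

```python
import shlex

# Roles listed in this section, keyed by a hypothetical label for each project.
ROLES = {
    "fleetengine": ["roles/editor", "roles/logging.configWriter"],
    "fleetevents": ["roles/editor", "roles/iam.securityAdmin", "roles/datastore.owner"],
}


def binding_commands(project_id: str, user_email: str, roles: list) -> list:
    """Build one `gcloud projects add-iam-policy-binding` argument list per role."""
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=user:{user_email}", f"--role={role}",
        ]
        for role in roles
    ]


if __name__ == "__main__":
    # Print the commands for review; nothing is executed here.
    for cmd in binding_commands("my-fleetengine-project", "dev@example.com", ROLES["fleetengine"]):
        print(shlex.join(cmd))
```

Reviewing the printed commands before running them keeps the two projects consistent and makes it easy to swap in narrower roles later.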

Access the code

Clone this repo.

git clone https://github.com/googlemaps/fleet-events-reference-solutions

# change working directory to the cloned repo
cd fleet-events-reference-solutions

Deploy the reference solution

Follow the steps below to deploy the reference solution.

STEP 1 : identify the example to start with

Change the Terraform working directory to one of the examples.

# choosing "with_existing_project" example here,
# which assumes you already have a project for Fleet Events.

cd terraform/modules/fleet-events/examples/with_existing_project

STEP 2 : create a terraform.tfvars file from example

This example is also a terraform module with input variables defined. These variables can be set in a configuration file named terraform.tfvars. Make a copy of the file terraform.tfvars.sample and edit it to match your environment. You can use any text editor; the example below uses vi.

# copy sample file to create your own config file.
cp terraform.tfvars.sample terraform.tfvars

# edit
vi terraform.tfvars

terraform.tfvars can look like this. Adjust the values to match your own environment.

PROJECT_FLEETENGINE      = "<YOUR FLEETENGINE PROJECT>"
PROJECT_FLEETENGINE_LOGS = "<YOUR FLEETEVENTS PROJECT>"
PROJECT_FLEETEVENTS      = "<YOUR FLEETEVENTS PROJECT>"
GCP_REGION               = "us-central1"
GCP_REGION_FIRESTORE     = "nam5"
ME                       = "<Your Google Account>"

The full set of available variables is documented in the module’s README.md.
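If you manage several environments, you may want to generate terraform.tfvars rather than edit it by hand. The Python sketch below renders string variables as aligned `NAME = "value"` lines. The function name `render_tfvars` is hypothetical, and the sketch assumes plain string values only (no lists, maps, or `${...}` sequences).

```python
import json


def render_tfvars(values: dict) -> str:
    """Render string variables as aligned `NAME = "value"` lines."""
    width = max(len(name) for name in values)
    # json.dumps produces a double-quoted, escaped string, which is valid for simple values.
    lines = [f"{name.ljust(width)} = {json.dumps(value)}" for name, value in values.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_tfvars({
        "PROJECT_FLEETENGINE": "my-fleetengine-project",
        "PROJECT_FLEETEVENTS": "my-fleetevents-project",
        "GCP_REGION": "us-central1",
    }), end="")
```

Redirect the output to terraform.tfvars in the example directory once the values look right.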

Naming rules

Google Cloud Platform applies different naming restrictions depending on the service and resource. The “Fleet Events” deployment script generates identifiers and labels by concatenating the given resource names so that they are easily recognizable. However, concatenation can exceed naming limits. As general guidance, keep identifiers short, and if you observe naming errors, reconsider the name and shorten it. Refer to the documented naming rules for each of the major resource types.
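As a rough illustration of how concatenation can exceed a limit, the Python sketch below checks the length of a joined name against a few limits. The limits shown (service account ids of 6 to 30 characters, label values of up to 63 characters, Pub/Sub topic ids of 3 to 255 characters) are believed to match the public documentation but should be verified there. The `concatenated` helper is a hypothetical stand-in and does not reproduce how the deployment script actually builds names.

```python
# Length limits believed to match the public docs; verify them before relying on this.
LIMITS = {
    "service_account_id": (6, 30),
    "label_value": (0, 63),
    "pubsub_topic_id": (3, 255),
}


def concatenated(*parts: str) -> str:
    """Hypothetical stand-in for how a deployment script might join name parts."""
    return "-".join(part for part in parts if part)


def check_length(kind: str, name: str) -> bool:
    """Return True if the name length is within the limit for the given resource kind."""
    low, high = LIMITS[kind]
    return low <= len(name) <= high


if __name__ == "__main__":
    short = concatenated("sa", "fleetevents")
    long_name = concatenated("sa", "fleetevents", "task-outcome-handler-function")
    print(short, check_length("service_account_id", short))
    print(long_name, check_length("service_account_id", long_name))
```

This checks length only; each resource type also restricts which characters are allowed.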

STEP 3 : Initialize terraform

Initialize terraform before first run. Dependencies (modules, etc.) will be fetched to the working directory.

terraform init

STEP 4 : terraform plan

Let terraform compare the current state with the desired state and come up with a plan.

terraform plan

STEP 5 : Apply changes

If the plan is good to go, apply changes so that terraform can create and configure resources in your project accordingly.

terraform apply

Learn more about each terraform command in the Terraform CLI documentation.
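Steps 3 to 5 can also be scripted. The Python sketch below runs init, plan, and apply in order and stops at the first failure. It saves the plan to a file with `-out` and applies that saved plan, which is a slight variation on the interactive commands above. The function name `run_steps` and the plan file name `tfplan` are assumptions.

```python
import subprocess

# Save the plan to a file, then apply exactly that plan.
STEPS = [
    ["terraform", "init"],
    ["terraform", "plan", "-out=tfplan"],
    ["terraform", "apply", "tfplan"],
]


def run_steps(workdir: str, runner=subprocess.run) -> None:
    """Run init, plan, and apply in order; check=True raises on the first failing step."""
    for cmd in STEPS:
        runner(cmd, cwd=workdir, check=True)
```

Call `run_steps("terraform/modules/fleet-events/examples/with_existing_project")` from the repository root once terraform.tfvars is in place. The `runner` parameter exists so the sequence can be exercised without invoking terraform.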

STEP 6 : Verify your deployment

Use the publisher tool to test the TaskOutcome handler.

## install prerequisite python libraries
pip3 install -r ./tools/python/requirements.txt

## run the tool
python3 task_outcome_test.py \
    --project_id "<gcp_project_id>" \
    --input_topic_id "<input_topic_id>" \
    --output_topic_id "<output_topic_id>"

Note: input_topic_id is the topic that your Cloud Function reads from. output_topic_id should point to the Pub/Sub topic where your deployed Cloud Function writes its notifications. Ensure the user running the script can read from and write to all of the topics specified.

This publisher tool can also be used to publish test events to Cloud Functions. Sample itineraries that can be published are in folder ./tools/python/data/. You can follow the steps below:

## install prerequisite python libraries
pip3 install -r ./tools/python/requirements.txt

## run the tool
python3 main.py --project_id "<gcp_project_id>" \
    --plan "<path_to_json>" \
    --topic_id "<input_topic_id>"
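For reference, publishing a single test message from your own script could look like the Python sketch below. It is not the publisher tool from this repository and makes no assumption about the format of the sample itineraries; the function names `publish_json` and `make_publisher` are hypothetical. It relies on the `google-cloud-pubsub` client library (`PublisherClient.topic_path` and `PublisherClient.publish`), which is imported lazily so the sketch can be loaded without the package installed.

```python
import json


def publish_json(publisher, project_id: str, topic_id: str, payload: dict) -> str:
    """Publish one JSON payload and return the message id assigned by Pub/Sub."""
    topic_path = publisher.topic_path(project_id, topic_id)
    future = publisher.publish(topic_path, data=json.dumps(payload).encode("utf-8"))
    return future.result()  # blocks until the publish completes


def make_publisher():
    """Create a real client; requires the google-cloud-pubsub package and credentials."""
    from google.cloud import pubsub_v1  # imported lazily so the sketch loads without it

    return pubsub_v1.PublisherClient()
```

With credentials configured, `publish_json(make_publisher(), "<gcp_project_id>", "<input_topic_id>", {...})` sends one message to the input topic.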

Starting over

When you need to change the configuration, update the terraform.tfvars file and rerun terraform from the same folder. If you want to start over or clean up the environment, you can de-provision everything by running the following command.

terraform destroy

Recovering from errors

Deployment may sometimes fail midway. This can happen when asynchronous tasks such as service enablement or function deployment time out, or when terraform returns after initiating a task without waiting long enough for the resource to become available.

Service not available

If the dependent service was not fully available, you may see error messages like this:

Error: Error creating Dataset: googleapi: Error 400: The project
   <PROJECT_FLEETEVENTS> has not enabled BigQuery.

Run the following after a few minutes to confirm the state of the service.

gcloud --project <PROJECT_FLEETEVENTS> services list --enabled
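The check above can also be automated. The Python sketch below takes the text output of the services listing, one service name per line, and reports which required services are still missing. It assumes the listing was produced with an output format such as `--format="value(config.name)"`. The function name `missing_services` and the `REQUIRED` list are illustrative only; the authoritative list of services is whatever the terraform module enables.

```python
# Illustrative list only; the services the terraform module actually needs may differ.
REQUIRED = [
    "bigquery.googleapis.com",
    "pubsub.googleapis.com",
    "cloudfunctions.googleapis.com",
    "firestore.googleapis.com",
]


def missing_services(enabled_output: str, required: list) -> list:
    """Return the required services that do not appear in the enabled-services output."""
    enabled = {line.strip() for line in enabled_output.splitlines() if line.strip()}
    return [name for name in required if name not in enabled]


if __name__ == "__main__":
    sample_output = "pubsub.googleapis.com\nfirestore.googleapis.com\n"  # made-up sample
    print(missing_services(sample_output, REQUIRED))
```

An empty result suggests that the services are enabled and that terraform can be rerun.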

Once confirmed, rerun terraform:

# re-run terraform
terraform apply --auto-approve

If you don’t see any logs

Check whether the _Default log sink is disabled. Because Cloud Functions writes logs to the resource type cloud_run_revision in Cloud Logging, it’s possible that the logs aren’t being routed at all.

Handling inconsistency

Inconsistency between the actual project state and the state terraform locally caches can be caused by failures. In such a case, a re-run can cause terraform to try to create a resource that already exists and fail again with a message like this:

│ Error: Error creating Database: googleapi: Error 409: Database already exists. Please use another database_id
│ Details:
│ [
│  {
│ "@type": "type.googleapis.com/google.rpc.DebugInfo",
│ "detail": "Database '' already exists for project <PROJECT_FLEETEVENTS>. Please use another database_id"
│  }
│ ]
│
│ with module.fleet-events-with-existing-project.module.fleet-events.google_firestore_database.database,
│ on ../../main.tf line 59, in resource "google_firestore_database" "database":
│ 59: resource "google_firestore_database" "database" {

To recover, import the current project state into terraform by running the “terraform import” command. The error above concerns the google_firestore_database resource, which is documented in the Terraform Google provider documentation.

As described at the bottom of that resource's documentation page, terraform import can be run in one of the following supported forms.

  • terraform import google_firestore_database.default projects/{{project}}/databases/{{name}}
  • terraform import google_firestore_database.default {{project}}/{{name}}
  • terraform import google_firestore_database.default {{name}}

In the case of the error above:

  • Database resource identifier within Terraform (as in the error message): module.fleet-events-with-existing-project.module.fleet-events.google_firestore_database.database
  • Database resource path: <PROJECT_FLEETEVENTS>/(default). Note that Firestore database instance name defaults to (default)

Therefore, the import command can be run as follows:

# terraform import "<TERRAFORM RESOURCE ID from error message>" "<PROJECT_FLEETEVENTS>/(default)"

terraform import \
module.fleet-events-with-existing-project.module.fleet-events.google_firestore_database.database \
"<PROJECT_FLEETEVENTS>/(default)"

If both identifiers are recognized, you will see a message like this.

...
module.fleet-events-with-existing-project.module.fleet-events.google_firestore_database.database: Import prepared!
Prepared google_firestore_database for import
...
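When this kind of error comes up repeatedly, extracting the resource address from the error text by hand gets tedious. The Python sketch below pulls the address from the `with <address>,` line of a terraform error and builds the matching `terraform import` argument list. The function name `import_command` and the regular expression are assumptions based on the error format shown above; other terraform versions may format errors differently.

```python
import re

# Matches lines such as "│ with module.x.google_firestore_database.database,"
ADDRESS_RE = re.compile(r"^\W*with\s+(\S+),\s*$", re.MULTILINE)


def import_command(error_text: str, resource_id: str) -> list:
    """Build a `terraform import` argument list from a terraform error message."""
    match = ADDRESS_RE.search(error_text)
    if match is None:
        raise ValueError("no resource address found in error text")
    return ["terraform", "import", match.group(1), resource_id]


if __name__ == "__main__":
    error = (
        "│ Error: Error creating Database: googleapi: Error 409\n"
        "│ with module.fleet-events-with-existing-project.module.fleet-events"
        ".google_firestore_database.database,\n"
        "│ on ../../main.tf line 59\n"
    )
    print(import_command(error, "<PROJECT_FLEETEVENTS>/(default)"))
```

The resource id (the second argument) still has to be worked out from the resource type's documentation, as in the Firestore example above.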

After successful import, rerun the deployment.

# re-run terraform
terraform apply --auto-approve

For resources other than the Firestore database in the example above, the commands for importing the state of each GCP resource are documented in that resource type’s documentation.

Also reference this guide from Google Cloud.

Advanced use of terraform

More advanced use of terraform is not the focus of this document, but here are some pointers that will help you in case it is required for your environment.

Run terraform under dedicated Service Account

This is recommended, especially when setting up CI/CD processes.

Externalizing state to Cloud Storage

In case there are multiple users responsible for deployment, you need to share terraform state by externalizing it to a Cloud Storage backend.

For developers considering wider adoption of terraform, below is a recommended read that will help you plan ahead.

Limitations

PubSub Triggers deliver fleet logs to Cloud Functions. These triggers do not guarantee ordering. Out of order eventing can originate from:

  • Logs delivered out of order by Cloud Logging (if used)
  • Logs delivered out of order by PubSub (see Cloud PubSub: Ordering Messages)
  • Logs processed out of order by Cloud Functions

The volume of out of order logs increases when a function does not have enough compute resources. For example, if memory consumption is reaching capacity (see Monitoring Cloud Functions) for the deployed Cloud Function, we recommend re-deploying the function with more memory. To detect out of order events, enable the feature described in Monitoring out of order events.

By default, Fleet Events does not enable retries. This may cause events to be dropped if logs fail to be delivered or processed by Cloud Functions.

Monitoring out of order events

Fleet Events can detect out of order events by setting a watermark during event processing. This functionality is disabled by default to reduce Firestore costs, but can be enabled by setting the runtime environment variable MEASURE_OUT_OF_ORDER = true in your Cloud Function. This functionality can be used to determine if larger instance sizes need to be configured to reduce out of order events.

Upon re-deployment, your function will track a watermark based on the event time of each fleet log (per entity id) and output a warning whenever an out of order event is detected. The watermark is persisted in Firestore, which incurs Firestore read/write costs. See UpdateWatermarkTransaction for more information.
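The watermark idea can be illustrated with a small in-memory Python sketch. This is not the repository's UpdateWatermarkTransaction, which persists the watermark in Firestore; the class name `WatermarkTracker` and its method are hypothetical and only show the comparison logic: an event is out of order when its event time is older than the latest time already seen for the same entity id.

```python
from datetime import datetime


class WatermarkTracker:
    """Track the latest event time seen per entity id (in memory only)."""

    def __init__(self):
        self._watermarks = {}

    def observe(self, entity_id: str, event_time: datetime) -> bool:
        """Return True if the event is out of order, i.e. older than the watermark."""
        current = self._watermarks.get(entity_id)
        if current is not None and event_time < current:
            return True  # out of order: the watermark is left unchanged
        self._watermarks[entity_id] = event_time
        return False


if __name__ == "__main__":
    tracker = WatermarkTracker()
    print(tracker.observe("vehicle-1", datetime(2024, 1, 1, 0, 0, 10)))
    print(tracker.observe("vehicle-1", datetime(2024, 1, 1, 0, 0, 5)))
```

In a deployed function the watermark has to live in shared storage, because separate function instances do not share memory.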

You can set up logs-based monitoring to alert on these out of order events. See logs based metrics to configure an end-to-end alert flow.

Conclusion

You now have a working environment for the “Fleet Events” reference solution, which takes near-real-time events from Fleet Engine and turns them into actionable custom events, along with handlers that act on them.

Contributors

Google maintains this article. The following contributors originally wrote it.

Principal authors:

  • Ethel Bao | Software Engineer, Google Maps Platform
  • Mohanad Almiski | Software Engineer, Google Maps Platform
  • Naoya Moritani | Solutions Engineer, Google Maps Platform