Dev CDP Infrastructure

Full infrastructure setup for whole system / integration level testing.


Initial Comments

This is primarily used for developer stack creation and management. We have an example stack, infrastructure, pipeline, and web app available for demonstration and example data usage in our example repo.

If you are just trying to process example CDP data (or for front-end: visualize example CDP data) and not upload data, it is recommended to simply point your requests at the example stack.

For in-depth details on infrastructure terminology and usage, refer to the documentation in our cookiecutter-cdp-deployment repository.


Dependencies

Deploying the CDP infrastructure requires having cdp-backend installed.

pip install -e ../[dev]

For more detailed information please see the project installation details.

Account Setup

  1. Create (or sign in to) a Google Cloud Platform (GCP) account. (Google Cloud Console Home)
  2. Create (or re-use) a billing account and attach it to your GCP account.
  3. Create (or sign in to) a Pulumi account.

Environment Setup

The only environment setup needed to run this deployment is to make sure pulumi itself and the gcloud SDK are both installed.

If this is the first time you have installed either of those packages, it is recommended to restart your terminal after installation.
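One quick way to confirm that both tools are visible on your PATH is sketched below. The helper name `missing_tools` is made up for illustration and is not part of this repository's Makefile.

```shell
# Print the name of every tool that is not on PATH, one per line.
# (missing_tools is a hypothetical helper, not part of this repo.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools pulumi gcloud)
if [ -n "$missing" ]; then
    echo "Install these before running make login:" $missing
else
    echo "pulumi and gcloud are both installed"
fi
```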

After pulumi and gcloud have both been installed and the terminal has been restarted, run the following commands to set up your local machine with credentials for both services.

Note: Pulumi only supports Python 3.7. When creating dev infrastructure, you need to run these scripts in a Python 3.7 environment.

cd cdp-backend/dev-infrastructure
make login
make init project={project-name}
make gen-key project={project-name}
make build
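Given the Python 3.7 note above, it may help to check which interpreter is active before running these commands. The following is a minimal sketch; `is_py37` is a hypothetical helper, and it assumes the interpreter you intend to use is the one invoked as `python`.

```shell
# Succeed only when the given version string reports Python 3.7.x.
# (is_py37 is a hypothetical helper, not part of this repo.)
is_py37() {
    case "$1" in
        "Python 3.7."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumes the interpreter of interest is invoked as "python".
if is_py37 "$(python --version 2>&1)"; then
    echo "Python 3.7 detected"
else
    echo "Activate a Python 3.7 environment before running make init or make build"
fi
```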

After generating the key, rename your key file in cdp-backend/.keys to cdp-dev.json. If you have many keys, note that the random and minimal event pipelines use the key named cdp-dev.json by default.
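The rename could be scripted roughly as follows. This sketch assumes the generated key is written as `{project-name}-sa-dev.json` in the `.keys` directory, which is the file name used in the `export GOOGLE_CREDENTIALS` example in this README. The function name `rename_dev_key` and the project name in the usage comment are hypothetical.

```shell
# Rename the generated service-account key to the default name cdp-dev.json.
# Arguments: project name, then the .keys directory (defaults to ../.keys,
# i.e. relative to cdp-backend/dev-infrastructure).
# (rename_dev_key is a hypothetical helper, not part of this repo.)
rename_dev_key() {
    project="$1"
    keys_dir="${2:-../.keys}"
    src="$keys_dir/${project}-sa-dev.json"
    if [ ! -f "$src" ]; then
        echo "No key found at $src; run make gen-key first" >&2
        return 1
    fi
    mv "$src" "$keys_dir/cdp-dev.json"
}

# Usage (project name is a placeholder):
# rename_dev_key my-cdp-project
```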

Infrastructure Management Commands

All of these commands should be run from within the cdp-backend/dev-infrastructure directory.

  • To log in to GCloud and Pulumi:

    make login
  • To create a new service account JSON key:

    make gen-key project={project-name}
  • To create a new dev infrastructure:

    make init project={project-name}

    Then follow the link printed in the log to attach a billing account to the created project.

    Note: This will create a new GCP project.

  • To set up infrastructure:

    make build

    Note: Run make gen-key before make build, and make sure GOOGLE_CREDENTIALS is set in your environment:

    export GOOGLE_CREDENTIALS=$(cat ../.keys/{project-name}-sa-dev.json)

    replacing {project-name} with your project name. If you have already renamed your key, use instead:

    export GOOGLE_CREDENTIALS=$(cat ../.keys/cdp-dev.json)
  • To clean and remove all database documents and file store objects:

    make clean key={path-to-key}

    Note: Cleaning infrastructure is best practice when you are comparing pipeline outputs and the database models (specifically database indices) are not changing.

  • To reset infrastructure but reuse the same Google project:

    make reset

    Note: Resetting infrastructure is likely required when iterating on database models (specifically database indices). Always try cleaning the infrastructure before a reset or destroy, because make clean does not use any extra Google Cloud (or Firebase) projects or applications.

  • To delete all Pulumi and GCloud resources entirely:

    make destroy project={project-name}

    Note: This will delete the GCP project.

Try to use the same project and infrastructure as much as possible, because there are limits on how many projects and Firestore applications a single user can have.
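For the make build step above, choosing between the two key file names can be sketched as a small helper. It prefers cdp-dev.json and falls back to `{project-name}-sa-dev.json`. The function name `find_dev_key` and the project name in the usage comment are hypothetical.

```shell
# Echo the path of the key to use: cdp-dev.json if it exists,
# otherwise {project-name}-sa-dev.json. Fail if neither is present.
# Arguments: project name, then the .keys directory (defaults to ../.keys).
# (find_dev_key is a hypothetical helper, not part of this repo.)
find_dev_key() {
    project="$1"
    keys_dir="${2:-../.keys}"
    for candidate in "$keys_dir/cdp-dev.json" "$keys_dir/${project}-sa-dev.json"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    echo "No key found in $keys_dir; run make gen-key first" >&2
    return 1
}

# Usage, from cdp-backend/dev-infrastructure (project name is a placeholder):
# key=$(find_dev_key my-cdp-project) && export GOOGLE_CREDENTIALS="$(cat "$key")" && make build
```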

All Commands

  • See all Makefile commands with make help, or simply open the Makefile. The commands are fairly easy to follow.
  • See Pulumi CLI documentation for all Pulumi commands.

Non-Default Parameters

If you want to change the default behavior of the __main__.py file or its parameters, please see the documentation for the CDPStack object.

Running Pipelines Against Dev Infra

Once you have a dev infrastructure set up and a key downloaded (make gen-key), you can run pipelines and store data in the infrastructure by moving up to the base directory of this repository (cdp-backend/) and running one of the following:

  • To run a semi-random (large permutation) event pipeline:

    make run-rand-event-pipeline
  • To run a minimal (by definition) event pipeline:

    make run-min-event-pipeline
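Because both pipelines use the key named cdp-dev.json by default, a small wrapper can check for that key before calling make. This is only a sketch: `pipeline_target` and `run_dev_pipeline` are hypothetical helpers, and the default `.keys` path assumes you are in cdp-backend/.

```shell
# Map a short pipeline kind to the Makefile target named in this README.
# (pipeline_target and run_dev_pipeline are hypothetical helpers.)
pipeline_target() {
    case "$1" in
        rand) echo "run-rand-event-pipeline" ;;
        min)  echo "run-min-event-pipeline" ;;
        *)    echo "unknown pipeline kind: $1" >&2; return 1 ;;
    esac
}

# Run a pipeline only if the default key is present.
# Arguments: pipeline kind (rand or min), then the .keys directory
# (defaults to .keys, i.e. relative to cdp-backend/).
run_dev_pipeline() {
    target=$(pipeline_target "$1") || return 1
    keys_dir="${2:-.keys}"
    if [ ! -f "$keys_dir/cdp-dev.json" ]; then
        echo "Missing $keys_dir/cdp-dev.json; generate and rename a key first" >&2
        return 1
    fi
    make "$target"
}

# Usage, from cdp-backend/:
# run_dev_pipeline min
```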