This is a reference architecture for deploying API nodes for Polkadot. Users can deploy infrastructure on one of several supported clouds and can customize the network topology to their needs. This work was done under the Load Balanced Endpoints grant proposal and is intended as a long-term development project in which new features and optimizations will be added over time.
Currently the API nodes themselves run on VMs, with the supporting infrastructure running on Kubernetes. In the future, options will be exposed to run the nodes on VMs, Kubernetes, or some combination of both, depending on your infrastructure needs.
The process involves three steps:
- Set up accounts, projects, and API keys on each provider.
- Run the CLI to configure the necessary files and SSH keys.
- Run the deployment.
Before running on any cloud, sign up and provide payment details to create an active account (and, for GCP, a project). You will need API keys for any provider you intend to run on. For a walkthrough of setting up your cloud accounts, see the following links for each provider.
- AWS
- GCP
- Azure
- DigitalOcean
- Cloudflare - Only needed if using geo routing
All cloud providers are at feature parity except for DigitalOcean, which does not have native autoscaling capabilities. For now we provide a Kubernetes deployment there that runs a Helm chart with cluster autoscaling capability.
To get started with an interactive CLI to configure node deployments:
```bash
git clone https://github.com/insight-w3f/terragrunt-polkadot
cd terragrunt-polkadot
pip3 install nukikata # A tool we designed to do interactive code templating
nukikata .
```
By walking through the steps in the CLI, users can fully customize the deployment of the cluster on any supported cloud provider. There are three key steps: installing prerequisites, configuring SSH keys, and setting up the stack. Each step can be done in the CLI.
To run the deployment, you will need the following tools installed.
- Terraform
- Terragrunt
- Ansible (Not supported on windows without WSL)
- Packer
- kubectl
- helm
- aws-iam-authenticator
- awscli - AWS only
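A quick way to confirm everything is installed and on your PATH before starting:

```bash
# Verify the prerequisite tooling; version output will vary by installation
terraform version
terragrunt --version
ansible --version        # not supported on Windows without WSL
packer version
kubectl version --client
helm version
aws-iam-authenticator version
aws --version            # AWS only
```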
To keep governance around these sensitive items simple, SSH keys are set up through an SSH key profile in the deployment process: you either generate new keys or link to existing ones, and they are then written to file. The CLI walks you through the process, but all it is doing is entering the profile into a file called secrets.yml, which is ignored in version control. The file will end up looking something like this:
```yaml
ssh_profiles:
  - name: kusama-dev
    private_key_path: ~/.ssh/kusama-dev
    public_key_path: ~/.ssh/kusama-dev.pub
  - name: kusama-prod
    private_key_path: ~/.ssh/kusama-prod
    public_key_path: ~/.ssh/kusama-prod.pub
```
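If you need to generate a fresh keypair for a profile (using the kusama-dev paths above as an example), a standard ssh-keygen invocation is enough:

```bash
# Generate a new keypair at the paths referenced by the kusama-dev profile
ssh-keygen -t ed25519 -f ~/.ssh/kusama-dev -C "kusama-dev"
```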
Configuration settings are bespoke to each cloud provider but generally involve prompting the user for various options, conditional on the type of network topology being deployed. There are three general options:
- No DNS
- Single region / single domain
- Cloudflare-based geo routing (WIP)
For any kind of DNS routing, the user needs to buy a domain. For multi-environment deployments, it is recommended to get multiple domains. For single region deployments, the domain needs to reside with the cloud provider's registrar. For Cloudflare deployments, the user needs to transfer the domain to Cloudflare and enable "Load Balancing" for geo routing. The user first deploys all their clusters and then applies the Cloudflare configuration. Each time a cluster is added, the Cloudflare module needs to be reapplied. To remove a cluster, there is a health check setup that prevents traffic from being routed to it, so the Cloudflare module doesn't necessarily need to be reapplied.
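As a rough sketch of that ordering (the Cloudflare module path below is hypothetical; the CLI drives the real commands):

```bash
# 1. Deploy each regional cluster first (repeat per region / provider)
nukikata .

# 2. Apply the Cloudflare geo-routing configuration last, and re-apply it
#    whenever a new cluster is added (path is illustrative only)
terragrunt apply --terragrunt-working-dir deployments/cloudflare
```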
The process is self-documented in the CLI or can be done manually by editing the deployment files per the architecture described below. Note that any values can be changed in the deployment files and reapplied to take effect.
- Set deployment variables, i.e. namespace, network name, etc.
- Set the region per the cloud provider
- Configure stack level parameters:
  - Each associated terraform module is cloned
  - Relevant parameters are prompted per the `nuki.yaml` file in the module
  - Versions of each module are pulled from a `versions.yaml` file in each stack
- The deployment file and `run.yml` file are written to the `deployments` directory and the root
- A `terragrunt apply-all` is run, which traverses across all the modules:
  - The logic for this call is routed through a `variables.hcl` file to set all the parameters
  - The `terragrunt.hcl` file then assembles the remote state path for each deployment
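For reference, the manual equivalent of the final step is roughly the following (the CLI normally runs it for you, and the exact invocation can vary between terragrunt versions):

```bash
# From the repo root, after the CLI has written run.yml and the deployment file
cd terragrunt-polkadot
terragrunt apply-all   # traverses all the modules in the stack
```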
We order the deployment file names and remote state path per the following convention.
Num | Name | Description | Example |
---|---|---|---|
1 | Namespace | The namespace, ie the chain | polkadot |
2 | Network Name | The name of the network | kusama |
3 | Environment | The environment of deployment | prod |
4 | Provider | The cloud provider | aws |
5 | Region | Region to deploy into | us-east-1 |
6 | Stack | The type of stack to deploy | validator |
7 | Deployment ID | Identifier for rolling / canary deployments | 1 |
We then rely on this hierarchy in the remote state path and the deployment file name.
Run File: `run.yaml`, an inherited file closest to the stack being deployed:
```yaml
namespace: "polkadot"
network_name: "kusama"
environment: "dev"
provider: "aws"
region: "us-east-1"
stack: "validator-simple"
deployment_id: 1 # Something to discriminate between deployments - ie blue/green
```
Deployment File: `terragrunt-polkadot/deployments/polkadot.mainnet.prod.aws.us-east-1.validator.1.yaml`

Deployment files are created locally by the nukikata CLI in the `deployments` directory and are referenced in each deployment run via the `run.yaml` file.
Remote State: `s3://.../<bucket>/polkadot/mainnet/prod/aws/us-east-1/validator/1/terraform.tfstate`
The remote state bucket and path are created and managed for you by terragrunt. This is where the state of all the deployments is kept and can be referenced in subsequent deployments.
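If you want to inspect what has been written, the state objects can be listed directly; for example, on AWS (replace `<bucket>` with your actual state bucket):

```bash
# List every state file terragrunt has written under the polkadot namespace
aws s3 ls s3://<bucket>/polkadot/ --recursive
```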
This reference architecture is built with terragrunt, a wrapper around terraform, which under the hood calls Ansible and Packer to configure VMs and Helm to configure Kubernetes clusters. All aspects of the deployment are immutable, and thus the main challenge in using all of these tools in combination is exposing the right options to the user to allow customization of the deployment. For that, we have built our own declarative CLI codenamed nukikata, Japanese for cookie cutter, which is a fork of the most popular code templating tool, cookiecutter. With this tool, we prompt the user to fill out the appropriate config files and then run the underlying terragrunt commands to deploy the stack.
A critical element in understanding the deployment methodology is understanding how parameters are handled within the scope of a deployment to a provider. Normally with terragrunt, modules are structured in a hierarchical folder format per the conventions of various reference implementations recommended by industry experts. When running nodes in many regions across many providers, this convention has the drawback of leaving many files and folders to keep track of. To simplify this, we take a so-called "deployment centric" approach where each deployment consists of a file per namespace, stack, network name, environment, and cloud provider region that holds all the parameters needed to inform a properly running stack. These files are currently stored locally in the `deployments` folder within each provider; soon, users will have the option of storing the files and running the stack remotely. To run the deployment, we write a new `run.yml` file that points to a deployments file. Currently deployments are executed sequentially; in the future the user will be able to deploy to multiple regions in parallel.
To manage this complex process, we developed nukikata, as we felt that maintaining a declarative CLI in this context would be more manageable as an organization. We also want features to be reusable across multiple implementations, and we see this approach as more maintainable in the long term as we build in new features and expose decision-tree-like configuration options that let users easily navigate complex deployments. We see many applications of nukikata and are excited to have this project be the initial proving ground of a process we hope to expand on for years to come.
The current architecture is based on a hybrid VM and Kubernetes setup; options are exposed to adopt either methodology.
VMs are used for the fully archived nodes and are deployed in autoscaling groups behind a network load balancer. To optimize the syncing of nodes, a source-of-truth node architecture is implemented in which a single node continuously syncs the latest chain data to a CDN that subsequent nodes sync from directly. This reduces scaling time to roughly 5 minutes, a major improvement over the normal scaling time.
Kubernetes is used for monitoring with Prometheus, logging with Elasticsearch, and an NGINX reverse proxy layer for routing down to the archival nodes. Further optimizations are planned for the reverse proxy layer to support caching of near-head queries and other types of pre-indexed query optimizations. We will also soon support a Kubernetes-only deployment architecture.
At this time, only kubernetes is supported for running logging and monitoring systems. Options will be exposed for a VM based monitoring solution in the future.
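For orientation, standing up a comparable Prometheus stack on a cluster by hand usually looks like the Helm invocation below; this is illustrative only, and the chart and release names are not necessarily what this repo's automation uses.

```bash
# Hypothetical example: install an upstream Prometheus monitoring stack via Helm
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace
```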
module | AWS | GCP | Azure | DigitalOcean | Packet |
---|---|---|---|---|---|
network | n/a | ||||
api-lb | n/a | n/a | |||
asg | n/a | n/a | |||
node | |||||
k8s-cluster | n/a |
- terraform-polkadot-aws-network
- terraform-polkadot-aws-node
- terraform-polkadot-aws-asg
- terraform-polkadot-aws-api-lb
- terraform-polkadot-aws-k8s-cluster
- terraform-polkadot-gcp-network
- terraform-polkadot-gcp-node
- terraform-polkadot-gcp-asg
- terraform-polkadot-gcp-api-lb
- terraform-polkadot-gcp-k8s-cluster
- terraform-polkadot-azure-network
- terraform-polkadot-azure-node
- terraform-polkadot-azure-asg
- terraform-polkadot-azure-api-lb
- terraform-polkadot-azure-k8s-cluster
This repo is actually a meta repo constructed from some 25+ other repos. To work with this stack, install meta with `npm i -g meta` and run `meta git clone .` from the base of this repo. All the modules will then be in the `modules` directory.
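In other words, a minimal sketch of that workflow:

```bash
# Install the meta CLI and check out every module repo referenced by this meta repo
npm i -g meta
meta git clone .
ls modules/   # the cloned module repos land here
```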