[u R] Create AnVIL deployments of Azul (#4122, #1528, #2489, #4370, PR
achave11-ucsc committed Aug 9, 2022
2 parents 500e5c5 + 2641805 commit 11d2bb0
Showing 37 changed files with 1,270 additions and 484 deletions.
10 changes: 7 additions & 3 deletions .gitlab-ci.yml
@@ -32,10 +32,11 @@ build_image:
- cp -vR /etc/gitlab/azul/* . # Copy files like environment.local into the build directory.
- source /build/.venv/bin/activate
- pip list
- deployment=$(PYTHONPATH=src python scripts/check_branch.py --print || echo sandbox)
- source environment # load global defaults
- deployment=$(PYTHONPATH=src python scripts/check_branch.py --print)
- (cd deployments && ln -snf ${deployment} .active)
- source environment
- status_context="gitlab/${GITLAB_INSTANCE_NAME}/${AZUL_DEPLOYMENT_STAGE}"
- status_context="gitlab/${azul_gitlab_instance_name}/${AZUL_DEPLOYMENT_STAGE}"
- make clean
dependencies:
- build_image
@@ -53,7 +54,7 @@ test:
stage: test
script:
- make format # Any ill-formatted sources, ...
- test "$deployment" != sandbox || make requirements_update # ... stale transitive dependencies ...
- test "$azul_is_sandbox" = 1 && make requirements_update # ... stale transitive dependencies ...
- make openapi # ... or changes to the canned OpenAPI definition document ...
- make check_clean # would dirty up the working copy and fail the build.
- make pep8
@@ -70,6 +71,9 @@ deploy:
- make create
except:
- schedules
artifacts:
paths:
- terraform/plan.json

integration_test:
extends: .base
2 changes: 1 addition & 1 deletion Makefile
@@ -83,7 +83,7 @@ deploy: check_env

.PHONY: auto_deploy
auto_deploy: check_env
$(MAKE) -C terraform plan auto_apply
$(MAKE) -C terraform auto_apply
$(MAKE) post_deploy

.PHONY: post_deploy
229 changes: 127 additions & 102 deletions README.md
@@ -427,117 +427,125 @@ module is part of a rarely used feature that can be disabled by unchecking
# 3. Deployment
## 3.1 One-time provisioning of shared cloud resources
Most of the cloud resources used by a particular deployment (personal or shared)
are provisioned automatically by `make deploy`. A handful of resources must be
created manually before invoking these Makefile targets for the first time in a
particular AWS account. This only needs to be done once per AWS account, before
the first Azul deployment is created in that account. Additional deployments do
not require this step.
Create an S3 bucket for shared Terraform and Chalice state. That bucket should
have object versioning enabled and must not be publicly accessible since
Terraform state may include secrets. If your developers assume a role via
Amazon STS, the bucket should reside in the same region as the Azul deployment.
This is because temporary STS AssumeRole credentials are specific to a region
and won't be recognized by an S3 region that's different from the one the
temporary credentials were issued in. To account for the region specificity of
the bucket, you may want to include the region name at the end of the bucket
name. That way you can have consistent bucket names across regions.
Next, create a lifecycle policy for the bucket. This rule governs the deletion
of ephemeral object versions that are generated in large numbers by integration
tests on personal and sandbox deployments. Name the rule `expire-tag`; enter
`expires` for the Object tags Key and `true` for the Object tags Value; select
the *Permanently delete previous versions of objects* checkbox, and enter *30*
for *Number of days after objects become previous versions*. Then click *Create
rule.*
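The same rule can also be created with the AWS CLI. A hedged equivalent of the
console steps above, assuming the bucket name is in `AZUL_VERSIONED_BUCKET`:
```
aws s3api put-bucket-lifecycle-configuration \
    --bucket "$AZUL_VERSIONED_BUCKET" \
    --lifecycle-configuration '{
        "Rules": [{
            "ID": "expire-tag",
            "Status": "Enabled",
            "Filter": {"Tag": {"Key": "expires", "Value": "true"}},
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30}
        }]
    }'
```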
### 3.1.1 CloudTrail
The CloudTrail resources for each of the AWS accounts hosting Azul deployments
are provisioned through Terraform. The corresponding resource definitions reside
in a separate *Terraform component*.
A Terraform component is a set of related resources. It is our own bastardized
form of Terraform's *module* concept, aimed at facilitating encapsulation and
reuse. Each deployment has at least a main component and zero or more child
components. The main component is identified by the empty string for a name;
child components have a non-empty name. The `dev` component has a child
component `dev.shared`. To deploy the main component of the `dev` deployment, one
selects the `dev` deployment and runs `make apply` from
`${project_root}/terraform` (or `make deploy` from the project root). To deploy
the `shared` child component of the `dev` deployment, one selects `dev.shared`
and runs `make apply` from `${project_root}/terraform/shared`. In other words,
there is one generic set of resource definitions for a child component, but
multiple concrete deployment directories.
### 3.1.2 API Gateway logs
To enable CloudWatch logs for API Gateway, an IAM role must be created and
configured in the API Gateway console. This must be done for each AWS account
and region, after at least one Azul deployment has been created in that account
and region. Once these steps have been completed, all API Gateway instances in a
region assume the resulting IAM role, allowing those instances to write to
CloudWatch Logs in that region.
1. Navigate to the *IAM Management Console*
2. Click *Roles* under *Access management*
3. Click *Create role*
4. Click *AWS service* for *Select type of trusted entity*
5. Click *API Gateway* for *Choose a use case*, click *Next: Permissions*
## 3.1 One-time provisioning of shared cloud resources
6. Ensure *AmazonAPIGatewayPushToCloudWatchLogs* is listed exclusively as the
policy and click *Next: Tags*
Most of the cloud resources used by a particular deployment (personal or main
ones alike) are provisioned automatically by `make deploy`. A handful of
resources must be created manually before invoking this Makefile target for
the first time in a particular AWS account. This only needs to be done once
per AWS account, before the first Azul deployment is created in that account.
Additional deployments do not require this step.
7. Add *name: azul-api_gateway* as a tag and click *Next: Review*
### 3.1.1 Versioned bucket for shared state
8. For *Role name* enter `azul-api_gateway`, ensure *Policies* are as prescribed
in step 6 and click *Create role*
Create an S3 bucket for shared Terraform and Chalice state. The bucket must
not be publicly accessible since Terraform state may include secrets. If your
developers assume a role via Amazon STS, the bucket should reside in the same
region as the Azul deployment. This is because temporary STS AssumeRole
credentials are specific to a region and won't be recognized by an S3 region
that's different from the one the temporary credentials were issued in. To
account for the region specificity of the bucket, you may want to include the
region name at the end of the bucket name. That way you can have consistent
bucket names across regions. Modify the ``environment.py`` of a main
deployment to be created in the AWS account owning the bucket (typically `dev`
or `prod`) to set `AZUL_VERSIONED_BUCKET` to the name of the bucket. Or,
inversely, name the bucket using the current value of that variable.
9. Copy the *Role ARN* for the newly created `azul-api_gateway` role
and go to the *API Gateway* console
```
aws s3api create-bucket --bucket $AZUL_VERSIONED_BUCKET
```
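Note that `aws s3api create-bucket` targets `us-east-1` unless a location
constraint is given. A hedged variant for other regions, assuming the target
region is in `AWS_DEFAULT_REGION`:
```
aws s3api create-bucket \
    --bucket "$AZUL_VERSIONED_BUCKET" \
    --create-bucket-configuration LocationConstraint="$AWS_DEFAULT_REGION"
```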
10. Select any available API Gateway resource (it doesn't matter which because
this is a region wide configuration) e.g. `azul-{lambda}-{stage}`, click on
*Settings* at the bottom of the left menu, paste the copied *Role ARN* into the
*CloudWatch log role ARN* and click *Save*
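The console steps above can also be scripted. A hedged sketch using the AWS
CLI — the role name and tag mirror the steps above, everything else is
standard IAM and API Gateway plumbing:
```
# Trust policy letting API Gateway assume the role
cat > trust.json <<'EOF'
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "apigateway.amazonaws.com"},
        "Action": "sts:AssumeRole"
    }]
}
EOF
aws iam create-role --role-name azul-api_gateway \
    --assume-role-policy-document file://trust.json \
    --tags Key=name,Value=azul-api_gateway
aws iam attach-role-policy --role-name azul-api_gateway \
    --policy-arn arn:aws:iam::aws:policy/service-role/AmazonAPIGatewayPushToCloudWatchLogs
role_arn=$(aws iam get-role --role-name azul-api_gateway \
    --query Role.Arn --output text)
# Region-wide setting, equivalent to pasting the ARN in the console
aws apigateway update-account \
    --patch-operations op=replace,path=/cloudwatchRoleArn,value="$role_arn"
```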
### 3.1.2 Route 53 hosted zones
### 3.1.3 Route 53 hosted zones
Azul uses Route 53 to provide user-friendly domain names for its services. The
DNS setup for Azul deployments has historically been varied and rather
protracted. Azul's infrastructure code will typically manage Route 53 records
but the zones have to be created manually.
Create a Route 53 hosted zone for the Azul service and indexer. Multiple
deployments can share a hosted zone, but they don't have to. The name of the
hosted zone is configured with `AZUL_DOMAIN_NAME`. `make deploy` will
automatically provision record sets in the configured zone, but it will not
create the zone itself or register the domain name it is associated with.
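Creating the zone itself is a one-time manual step. A hedged sketch with the
AWS CLI, assuming the selected deployment's `AZUL_DOMAIN_NAME` is set:
```
aws route53 create-hosted-zone \
    --name "$AZUL_DOMAIN_NAME" \
    --caller-reference "azul-$(date +%s)"  # must be unique per invocation
```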
Optionally create another hosted zone for the URL shortener. The URLs produced
Optionally, create another hosted zone for the URL shortener. The URLs produced
by the Azul service's URL shortening endpoint will refer to this zone. The name
of this zone is configured in `AZUL_URL_REDIRECT_BASE_DOMAIN_NAME`. Using the
same zone for both `AZUL_URL_REDIRECT_BASE_DOMAIN_NAME` and `AZUL_DOMAIN_NAME`
should work, but this has not been tested. The shortener zone can be a
subdomain of the main Azul zone, but it doesn't have to be.
Optionally, create a hosted zone for the DRS domain alias of the Azul service.
The corresponding environment variable is `AZUL_DRS_DOMAIN_NAME`. This feature
has not been used since 2020 when Azul stopped offering DRS for HCA.
The hosted zone(s) should be configured with tags for cost tracking. A list of
tags that should be provisioned is noted in
[src/azul/deployment.py:tags](src/azul/deployment.py).
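A hedged sketch of tagging an existing zone with the AWS CLI; the tag keys and
values below are placeholders, the authoritative list is in
[src/azul/deployment.py](src/azul/deployment.py):
```
zone_id=$(aws route53 list-hosted-zones-by-name \
    --dns-name "$AZUL_DOMAIN_NAME" \
    --query 'HostedZones[0].Id' --output text | cut -d/ -f3)
aws route53 change-tags-for-resource \
    --resource-type hostedzone \
    --resource-id "$zone_id" \
    --add-tags Key=project,Value=example Key=owner,Value="$USER"
```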
### 3.1.4 EBS volume for Gitlab
### 3.1.3 Shared resources managed by Terraform
The remaining resources for each of the AWS accounts hosting Azul deployments
are provisioned through Terraform. The corresponding resource definitions reside
in a separate *Terraform component*.
A Terraform component is a set of related resources. It is our own bastardized
form of Terraform's *module* concept, aimed at facilitating encapsulation and
reuse. Each deployment has at least a main component and zero or more child
components. The main component is identified by the empty string for a name;
child components have a non-empty name. The `dev` component has a child
component `dev.shared`. To deploy the main component of the `dev` deployment,
one selects the `dev` deployment and runs `make apply` from
`${project_root}/terraform` (or `make deploy` from the project root). To deploy
the `shared` child component of the `dev` deployment, one selects `dev.shared`
and runs `make apply` from `${project_root}/terraform/shared`. In other words,
there is one generic set of resource definitions for a child component, but
multiple concrete deployment directories.
There are currently two Terraform components: `shared` and `gitlab`.
Interestingly, not every deployment uses these components. Typically, only the
`dev` and `prod` deployments use them. The other deployments share them with
`dev` or `prod`, depending on which of those deployments they are colocated
with. Two deployments are colocated if they use the same AWS account. The
`shared` component contains the resources shared by all deployments in an AWS
account.
If you intend to set up a Gitlab instance for CI/CD of your Azul deployments, an
EBS volume needs to be created as well. See [gitlab.tf.json.template.py] and the
[section on CI/CD](#9-continuous-deployment-and-integration) for details.
To deploy the remaining shared resources, run:
### 3.1.5 Certificate authority for VPN access to Gitlab
```
_select dev.shared # or prod.shared
cd terraform/shared
make validate
terraform import aws_s3_bucket.versioned $AZUL_VERSIONED_BUCKET
make
```
The invocation of `terraform import` puts the bucket we created
[earlier](#311-versioned-bucket-for-shared-state) under management by Terraform.
### 3.1.4 GitLab
A self-hosted GitLab instance is provided by the `gitlab` Terraform component.
It provides the necessary CI/CD infrastructure for one or more Azul deployments
and protects access to that infrastructure through a VPN. That same VPN is also
used to access Azul deployments with private APIs (see `AZUL_PRIVATE_API` in
[environment.py]). Like the `shared` component, the `gitlab` component belongs
to one main deployment in an AWS account (typically `dev` or `prod`) and is
shared by the other deployments colocated with that deployment. Unlike the
`shared` component, the `gitlab` component is optional.
[environment.py]: /environment.py
The following resources must be created manually before deploying the `gitlab`
component:
- An EBS volume needs to be created. See [gitlab.tf.json.template.py] and the
[section on CI/CD](#95-storage) for details.
- A certificate authority must be set up for VPN access. For details refer to
[section on GitLab CA](#912-setting-up-the-certificate-authority).
If you intend to set up a Gitlab instance for CI/CD of your Azul deployments,
a certificate authority must be set up. See the
[section on GitLab CA](#912-setting-up-the-certificate-authority) for details.
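For the EBS volume, a hedged sketch with the AWS CLI — size, volume type,
availability zone and the `Name` tag are assumptions here; the values expected
by [gitlab.tf.json.template.py] are authoritative:
```
aws ec2 create-volume \
    --size 100 \
    --volume-type gp3 \
    --encrypted \
    --availability-zone us-east-1a \
    --tag-specifications \
        'ResourceType=volume,Tags=[{Key=Name,Value=azul-gitlab}]'
```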
## 3.2 One-time manual configuration of deployments
@@ -1671,11 +1679,11 @@ currently one such instance for the `sandbox` and `dev` deployments and another
one for `prod`.

The GitLab instances are provisioned through the `gitlab` *Terraform component*.
For more information about *Terraform components*, refer to the example of the
`shared` component in [Cloudtrail event recording](#311-cloudtrail).
For more information about *Terraform components*, refer to the [section on shared
resources managed by Terraform](#313-shared-resources-managed-by-terraform).
Within the `gitlab` component, the `dev.gitlab` child component provides a
single Gitlab EC2 instance that serves our CI/CD needs not only for `dev` but
for `integration` and `staging` as well. The `prod.gitlab` child component
provides the Gitlab EC2 instance for `prod`.

To access the web UI of the Gitlab instance for `dev`, visit
@@ -1966,24 +1974,41 @@ The runner is the container that performs the builds. The instance is configured
to automatically start that container. The primary configuration for the runner
is in `/mnt/gitlab/runner/config/config.toml`. There is one catch: on a fresh
EBS volume that has just been initialized, this file is missing, so the container
starts but doesn't advertise itself to Gitlab. The easiest way to create the
file is to kill the `gitlab-runner` container and then run it manually using
the `docker run` command from the instance user data in
[gitlab.tf.json.template.py], but replacing `--detach` with `-it` and adding
`register` at the end of the command. You will be prompted to supply a URL and
a token as [documented here](https://docs.gitlab.com/runner/register/). Specify
`docker` as the runner type and `docker:18.03.1-ce` as the image. Once the
container exits, `config.toml` should have been created. Edit it and adjust the
`volumes` setting to read
starts but doesn't advertise any runners to Gitlab.

The easiest way to create the file is to kill the `gitlab-runner` container and
then run it manually using the `docker run` command from `/etc/rc.local`, but
replacing `--detach` with `-it` and adding `register` at the end of the
command. You will be prompted to supply a URL and a registration token as
[documented here](https://docs.gitlab.com/runner/register/).
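
A hedged sketch of what that interactive invocation might look like — the
authoritative command, including the exact image tag and volume mappings, is
in `/etc/rc.local` on the instance:

```
docker run -it --rm \
    --volume /mnt/gitlab/runner/config:/etc/gitlab-runner \
    gitlab/gitlab-runner register
```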

Note that since version 15.0.0 of GitLab, there is no way to convert a runner
from shared to project-specific or vice versa. If you want to register a runner
reserved to a specific group, you must get the registration token from
the *CI/CD* → *Runners* page of the respective group. Runners reserved to a
project must be registered from the project's *Settings* → *CI/CD* → *Runners*
page. Shared runners are registered via *Admin* → *Overview* → *Runners*.

Specify `docker` as the runner type and `docker:18.03.1-ce` as the image. Once
the container exits, `config.toml` should have been created. Edit it and adjust
the `volumes` setting to read

```
volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache", "/etc/gitlab-runner/etc:/etc/gitlab"]
```

Comparing `config.toml` between an existing instance and the new one doesn't
hurt either. Finally, reboot the instance or manually start the container using
the command from [gitlab.tf.json.template.py] verbatim. The Gitlab UI should
now show the runner.
If you already have a GitLab instance to copy `config.toml` from, do that and
register the runners as described above. Copy the runner tokens from the newly
added runners at the end of `config.toml` to the preexisting runners. Then
discard the newly added runners from the file. For another instance's
`config.toml` to work on a new instance, the only piece of information that
needs to be updated is the runner token. That's because the runner token is
derived from the registration token which is different between the two
instances.
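
For orientation, a hedged sketch of the relevant `config.toml` stanza — the
name and URL are placeholders; only the `token` field is instance-specific:

```
[[runners]]
  name = "azul-runner"                     # placeholder
  url = "https://gitlab.dev.example.org/"  # this instance's URL
  token = "REDACTED"                       # runner token, unique per registration
  executor = "docker"
```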

Finally, reboot the instance. Alternatively, manually start the container using
the command from `/etc/rc.local` verbatim. Either way, the Gitlab UI should now
show the runners.


## 9.8 The Gitlab runner image for Azul
42 changes: 42 additions & 0 deletions UPGRADING.rst
@@ -10,6 +10,48 @@ branch that does not have the listed changes, the steps would need to be
reverted. This is all fairly informal and loosely defined. Hopefully we won't
have too many entries in this file.


#4122 Create AnVIL deployments of Azul and Data Browser
=======================================================

Everyone
~~~~~~~~

In personal deployments dedicated to AnVIL, set ``AZUL_BILLING`` to ``'anvil'``;
set it to ``'hca'`` in all other personal deployments.

In personal deployments, set ``AZUL_VERSIONED_BUCKET`` and ``AZUL_S3_BUCKET`` to
the same value as in the ``sandbox`` deployment.

In personal deployments, remove ``AZUL_URL_REDIRECT_FULL_DOMAIN_NAME`` if its
value is ``'{AZUL_DEPLOYMENT_STAGE}.{AZUL_URL_REDIRECT_BASE_DOMAIN_NAME}'``.

In ``environment.py`` for personal deployments, initialize the ``is_sandbox``
variable to ``False``, replacing the dynamic initializer, and copy the
definition of the ``AZUL_IS_SANDBOX`` environment variable from the sandbox's
``environment.py``. This will make it easier in the future to synchronize your
deployments' ``environment.py`` with that of the sandbox.
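
For illustration, a hedged sketch of the resulting ``environment.py`` fragment;
the authoritative definition of ``AZUL_IS_SANDBOX`` is in the sandbox's
``environment.py``::

    from typing import Mapping, Optional

    is_sandbox = False  # replaces the dynamic initializer

    def env() -> Mapping[str, Optional[str]]:
        return {
            # ... other variables ...
            'AZUL_IS_SANDBOX': str(int(is_sandbox)),
        }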

Operator
~~~~~~~~

Run ::

_select dev.shared # or prod.shared
cd terraform/shared
make validate
terraform import aws_s3_bucket.versioned $AZUL_VERSIONED_BUCKET
terraform import aws_s3_bucket_versioning.versioned $AZUL_VERSIONED_BUCKET
terraform import aws_s3_bucket_lifecycle_configuration.versioned $AZUL_VERSIONED_BUCKET
terraform import aws_api_gateway_account.shared api-gateway-account
terraform import aws_iam_role.api_gateway azul-api_gateway

Repeat for ``prod.shared``.

Redeploy the ``dev.shared``, ``dev.gitlab``, ``prod.shared``, and ``prod.gitlab``
components to apply the needed changes to any resources.


#4224 Index ENCODE snapshot as PoC
==================================

4 changes: 0 additions & 4 deletions common.mk
@@ -78,10 +78,6 @@ check_aws: check_python
check_branch: check_python
python $(project_root)/scripts/check_branch.py

.PHONY: check_branch_personal
check_branch_personal: check_python
python $(project_root)/scripts/check_branch.py --personal

%.json: %.json.template.py check_python .FORCE
python $< $@
.FORCE:
1 change: 1 addition & 0 deletions deployments/anvilbox/.example.environment.local.py
