deploy: make deployment example work out of the box
mmalenic committed Sep 6, 2024
1 parent 59a4aa3 commit c45b092
Showing 6 changed files with 102 additions and 41 deletions.
30 changes: 21 additions & 9 deletions deploy/README.md
@@ -16,15 +16,17 @@ The CDK code in this directory constructs a CDK app from [`HtsgetLambdaStack`][h

These are general settings for the CDK deployment.

| Name | Description | Type |
| ------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------- |
| <span id="config">`config`</span> | The location of the htsget-rs server config. This must be specified. This config file configures the htsget-rs server. See [htsget-config] for a list of available server configuration options. | `string` |
| <span id="domain">`domain`</span> | The domain name for the Route53 Hosted Zone that the htsget-rs server will be under. This must be specified. A hosted zone with this name will either be looked up or created depending on the value of [`lookupHostedZone?`](#lookupHostedZone). | `string` |
| <span id="authorizer">`authorizer`</span> | Deployment options related to the authorizer. Note that this option allows specifying an AWS [JWT authorizer][jwt-authorizer]. The JWT authorizer automatically verifies tokens issued by a Cognito user pool. | [`HtsgetJwtAuthSettings`](#htsgetjwtauthsettings) |
| <span id="subDomain">`subDomain?`</span> | The domain name prefix to use for the htsget-rs server. Together with the [`domain`](#domain), this specifies url that the htsget-rs server will be reachable under. Defaults to `"htsget"`. | `string` |
| <span id="s3BucketResources">`s3BucketResources`</span> | The buckets to serve data from. If this is not specified, this defaults to `[]`. This affects which buckets are allowed to be accessed by the policy actions which are `["s3:List*", "s3:Get*"]`. Note that this option alone does not create buckets, it only gives permission to access them, see the `createS3Buckets` option. This option must be specified to allow `htsget-rs` to access data in the buckets. | `string[]` |
| <span id="lookupHostedZone">`lookupHostedZone?`</span> | Whether to lookup the hosted zone with the domain name. Defaults to `true`. If `true`, attempts to lookup an existing hosted zone using the domain name. Set this to `false` if you want to create a new hosted zone with the domain name. | `boolean` |
| <span id="lookupHostedZone">`createS3Buckets?`</span> | A list of buckets to create. Defaults to no buckets. Buckets are created with [`RemovalPolicy.RETAIN`](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.RemovalPolicy.html). This also copies the example data under the `data` directory to those buckets. | `string[]` |
| Name | Description | Type |
|--------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------|
| <span id="config">`config`</span> | The location of the htsget-rs server config. This must be specified. This config file configures the htsget-rs server. See [htsget-config] for a list of available server configuration options. | `string` |
| <span id="domain">`domain`</span> | The domain name for the Route53 Hosted Zone that the htsget-rs server will be under. This must be specified. A hosted zone with this name will either be looked up or created depending on the value of [`lookupHostedZone?`](#lookupHostedZone). | `string` |
| <span id="authorizer">`authorizer`</span> | Deployment options related to the authorizer. Note that this option allows specifying an AWS [JWT authorizer][jwt-authorizer]. The JWT authorizer automatically verifies tokens issued by a Cognito user pool. | [`HtsgetJwtAuthSettings`](#htsgetjwtauthsettings) |
| <span id="subDomain">`subDomain?`</span> | The domain name prefix to use for the htsget-rs server. Together with the [`domain`](#domain), this specifies url that the htsget-rs server will be reachable under. Defaults to `"htsget"`. | `string` |
| <span id="s3BucketResources">`s3BucketResources`</span> | The buckets to serve data from. If this is not specified, this defaults to `[]`. This affects which buckets are allowed to be accessed by the policy actions which are `["s3:List*", "s3:Get*"]`. Note that this option does not create buckets, it only gives permission to access them, see the `createS3Buckets` option. This option must be specified to allow `htsget-rs` to access data in buckets that are not created in this stack. | `string[]` |
| <span id="lookupHostedZone">`lookupHostedZone?`</span> | Whether to lookup the hosted zone with the domain name. Defaults to `true`. If `true`, attempts to lookup an existing hosted zone using the domain name. Set this to `false` if you want to create a new hosted zone with the domain name. | `boolean` |
| <span id="createS3Bucket">`createS3Bucket?`</span> | Whether to create a test bucket. Defaults to true. Buckets are created with [`RemovalPolicy.RETAIN`](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.RemovalPolicy.html). The correct access permissions are automatically added. | `boolean` |
| <span id="bucketName">`bucketName?`</span> | The name of the bucket created using `createS3Bucket`. The name defaults to an automatically generated CDK name, use this option to override that. This option only has an affect is `createS3Buckets` is true. | `string` |
| <span id="copyTestData">`copyTestData?`</span> | Whether to copy test data into the bucket. Defaults to true. This copies the example data under the `data` directory to those buckets. This option only has an affect is `createS3Buckets` is true. | `boolean` |

#### HtsgetJwtAuthSettings

@@ -70,6 +72,11 @@ npm install

### Deploy to AWS

+> [!IMPORTANT]
+> The default deployment is designed to work out of the box. A bucket with a CDK-generated name is created with test
+> data from the [`data`][data] directory. All deployment settings can be tweaked in [`settings.ts`][htsget-settings].
+> The only option that must be specified is the `domain`, which determines the domain name to serve htsget-rs at.
+
CDK should be bootstrapped once, if this hasn't been done before:

```sh
@@ -82,6 +89,10 @@ Then to deploy the stack, run:
npx cdk deploy
```

+> [!WARNING]
+> By default this deployment will create a public instance of htsget-rs. Anyone will be able to query the server
+> without authorization unless you modify the `HtsgetJwtAuthSettings` settings.
+
### Testing the endpoint

When the deployment is finished, the htsget endpoint can be tested by querying it. If a JWT authorizer is configured,
@@ -176,3 +187,4 @@ and a [MinIO][minio] deployment.
[rust]: https://www.rust-lang.org/tools/install
[zig]: https://ziglang.org/
[zig-getting-started]: https://ziglang.org/learn/getting-started/
+[data]: ../data
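
The `domain` and `subDomain?` rows in the README table describe how the address of the deployed server is formed. A minimal TypeScript sketch of that composition is given here. `DomainSettings` and `htsgetServiceInfoUrl` are hypothetical names that are not part of this repository, and both the `https` scheme and the `/reads/service-info` route are assumptions about how the deployed endpoint would be queried.

```typescript
// Hypothetical subset of the deployment settings; only the fields used here.
type DomainSettings = {
  domain: string;
  subDomain?: string;
};

// Hypothetical helper: builds the address of the service-info endpoint.
// Assumes HTTPS and the `/reads/service-info` route.
function htsgetServiceInfoUrl(settings: DomainSettings): string {
  // The README documents "htsget" as the default prefix.
  const subDomain = settings.subDomain ?? "htsget";
  return `https://${subDomain}.${settings.domain}/reads/service-info`;
}

console.log(htsgetServiceInfoUrl({ domain: "example.com" }));
```

With only `domain` set, the sketch falls back to the documented `"htsget"` prefix.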
17 changes: 9 additions & 8 deletions deploy/bin/settings.ts
@@ -4,17 +4,18 @@ import { HtsgetSettings } from "../lib/htsget-lambda-stack";
  * Settings to use for the htsget deployment.
  */
 export const SETTINGS: HtsgetSettings = {
-  config: "config/dev_umccr.toml",
+  config: "config/example_deploy.toml",
+  // Specify the domain to serve htsget-rs under.
   domain: "dev.umccr.org",
   subDomain: "htsget",
-  s3BucketResources: [
-    "arn:aws:s3:::org.umccr.demo.sbeacon-data/*",
-    "arn:aws:s3:::org.umccr.demo.htsget-rs-data/*",
-  ],
-  lookupHostedZone: true,
-  createS3Buckets: [],
+  s3BucketResources: [],
+  lookupHostedZone: false,
+  createS3Bucket: true,
+  copyTestData: true,
+  // Override the bucket name.
+  // bucketName: "bucket",
   jwtAuthorizer: {
-    // Set this to true if you want a public instance.
+    // Set this to false if you want a private instance.
     public: false,
     // jwtAudience: ["audience"],
     // cogUserPoolId: "user-pool-id",
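
The README and the doc comments in this commit describe `createS3Bucket?` and `copyTestData?` as defaulting to true, while the stack code in this commit tests them with a plain `if (settings.createS3Bucket)`, so an omitted value would behave as false as far as the hunks shown here go. The settings above set both options explicitly, which sidesteps the question. If the documented defaults were applied in code, one possible sketch is the following; `BucketSettings`, `ResolvedBucketSettings` and `resolveBucketSettings` are hypothetical names, not part of the repository.

```typescript
// Hypothetical subset of `HtsgetSettings` covering only the bucket options.
type BucketSettings = {
  createS3Bucket?: boolean;
  bucketName?: string;
  copyTestData?: boolean;
};

// The same options with the documented defaults filled in.
type ResolvedBucketSettings = {
  createS3Bucket: boolean;
  bucketName: string | undefined;
  copyTestData: boolean;
};

function resolveBucketSettings(settings: BucketSettings): ResolvedBucketSettings {
  // An omitted option falls back to the documented default of true.
  const createS3Bucket = settings.createS3Bucket ?? true;
  return {
    createS3Bucket,
    // Left undefined so that CDK generates a name.
    bucketName: settings.bucketName,
    // Copying test data only has an effect when a bucket is created.
    copyTestData: createS3Bucket && (settings.copyTestData ?? true),
  };
}

console.log(resolveBucketSettings({}));
console.log(resolveBucketSettings({ createS3Bucket: false }));
```

Using `??` rather than `||` keeps an explicit `false` intact and only replaces values that were left out.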
20 changes: 20 additions & 0 deletions deploy/config/example_deploy.toml
@@ -0,0 +1,20 @@
+ticket_server_cors_allow_headers = "All"
+ticket_server_cors_allow_origins = []
+ticket_server_cors_allow_methods = "All"
+ticket_server_cors_allow_credentials = true
+ticket_server_cors_max_age = 300
+
+data_server_enabled = false
+
+name = "umccr-htsget-rs"
+version = "0.1"
+organization_name = "UMCCR"
+organization_url = "https://umccr.org/"
+contact_url = "https://umccr.org/"
+documentation_url = "https://github.com/umccr/htsget-rs"
+environment = "dev"
+
+[[resolvers]]
+regex = '^(?P<bucket>.*?)/(?P<key>.*)$'
+substitution_string = '$key'
+storage = 'S3'
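
The `[[resolvers]]` entry above splits a query id into a bucket and a key. The sketch below mimics that mapping in TypeScript to show what the regex captures. It is an illustration rather than htsget-rs code: `resolveId` and `S3Location` are hypothetical names, numbered groups stand in for the Rust-style named groups `(?P<bucket>...)` and `(?P<key>...)`, and it assumes that with `storage = 'S3'` the bucket is taken from the first capture group.

```typescript
// Same pattern as the config, with numbered instead of named groups:
// group 1 is the bucket (everything before the first "/"), group 2 is the key.
const RESOLVER_REGEX = /^(.*?)\/(.*)$/;

type S3Location = { bucket: string; key: string };

// Hypothetical helper: returns undefined when the id contains no "/" and so does not match.
function resolveId(id: string): S3Location | undefined {
  const match = RESOLVER_REGEX.exec(id);
  if (match === null) {
    return undefined;
  }
  // The substitution string `$key` keeps only the key part of the id.
  return { bucket: match[1] ?? "", key: match[2] ?? "" };
}

console.log(resolveId("my-bucket/bam/htsnexus_test_NA12878"));
```

Because the first group is lazy, only the text before the first `/` is treated as the bucket, and any further slashes stay in the key.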
1 change: 0 additions & 1 deletion deploy/examples/local_storage/README.md
@@ -17,7 +17,6 @@ curl http://127.0.0.1:8080/reads/data/bam/htsnexus_test_NA12878
```

Which outputs:
-
```sh
{
"htsget": {
13 changes: 7 additions & 6 deletions deploy/examples/minio/README.md
@@ -3,12 +3,13 @@
[MinIO][minio] can be used with htsget-rs by configuring the [storage type][storage] as `S3` and setting the `endpoint` to the MinIO server.
There are a few specific configuration options that need to be considered to use MinIO with htsget-rs, and those include:

-- The standard [AWS environment variables][env-variables] for connecting to AWS services must be set, and configured to match those
-  used by MinIO.
-  _ This means that htsget-rs expects an `AWS_DEFAULT_REGION` to be set, which must match the region used by MinIO (by default us-east-1).
-  _ It also means that the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` must be set to match the credentials used by MinIO.
-- If using virtual-hosted style [addressing][virtual-addressing] instead of path style [addressing][path-addressing], `MINIO_DOMAIN` must be
-  set on the MinIO server and DNS resolution must allow accessing the MinIO server using `bucket.<MINIO_DOMAIN>`. \* Path style addressing can be used instead by setting `path_style = true` under the htsget-rs resolvers storage type.
+* The standard [AWS environment variables][env-variables] for connecting to AWS services must be set, and configured to match those
+  used by MinIO.
+  * This means that htsget-rs expects an `AWS_DEFAULT_REGION` to be set, which must match the region used by MinIO (by default us-east-1).
+  * It also means that the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` must be set to match the credentials used by MinIO.
+* If using virtual-hosted style [addressing][virtual-addressing] instead of path style [addressing][path-addressing], `MINIO_DOMAIN` must be
+  set on the MinIO server and DNS resolution must allow accessing the MinIO server using `bucket.<MINIO_DOMAIN>`.
+  * Path style addressing can be used instead by setting `path_style = true` under the htsget-rs resolvers storage type.

The caveats around the addressing style occur because there are two different addressing styles for S3 buckets, path style, e.g.
`http://minio:9000/bucket`, and virtual-hosted style, e.g. `http://bucket.minio:9000`. AWS has declared path style addressing
62 changes: 45 additions & 17 deletions deploy/lib/htsget-lambda-stack.ts
@@ -2,7 +2,14 @@ import { STACK_NAME } from "../bin/htsget-lambda";
 import * as TOML from "@iarna/toml";
 import { readFileSync } from "fs";

-import { Duration, RemovalPolicy, Stack, StackProps, Tags } from "aws-cdk-lib";
+import {
+  CfnOutput,
+  Duration,
+  RemovalPolicy,
+  Stack,
+  StackProps,
+  Tags,
+} from "aws-cdk-lib";
 import { Construct } from "constructs";

 import { UserPool } from "aws-cdk-lib/aws-cognito";
@@ -56,10 +63,11 @@ export type HtsgetSettings = {
   subDomain?: string;

   /**
-   * The buckets to serve data from. If this is not specified, this defaults to `[]`. This affects which buckets are
-   * allowed to be accessed by the policy actions which are `["s3:List*", "s3:Get*"]`. Note that this option alone
-   * does not create buckets, it only gives permission to access them, see the `createS3Buckets` option.
-   * This option must be specified to allow `htsget-rs` to access data in the buckets.
+   * The buckets to serve data from. If this is not specified, this defaults to `[]`.
+   * This affects which buckets are allowed to be accessed by the policy actions which are `["s3:List*", "s3:Get*"]`.
+   * Note that this option does not create buckets; it only gives permission to access them. See the `createS3Bucket`
+   * option. This option must be specified to allow `htsget-rs` to access data in buckets that are not created in
+   * this stack.
    */
   s3BucketResources: string[];

@@ -76,11 +84,23 @@ export type HtsgetSettings = {
   lookupHostedZone?: boolean;

   /**
-   * A list of buckets to create. Defaults to no buckets. Buckets are created with
+   * Whether to create a test bucket. Defaults to true. Buckets are created with
    * [`RemovalPolicy.RETAIN`](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.RemovalPolicy.html).
-   * This also copies the example data under the `data` directory to those buckets.
+   * The correct access permissions are automatically added.
+   */
+  createS3Bucket?: boolean;
+
+  /**
+   * The name of the bucket created using `createS3Bucket`. The name defaults to an automatically generated CDK name,
+   * use this option to override that. This option only has an effect if `createS3Bucket` is true.
+   */
+  bucketName?: string;
+
+  /**
+   * Whether to copy test data into the bucket. Defaults to true. This copies the example data under the `data`
+   * directory to the created bucket. This option only has an effect if `createS3Bucket` is true.
    */
-  createS3Buckets?: string[];
+  copyTestData?: boolean;
 };

 /**
@@ -169,22 +189,26 @@ export class HtsgetLambdaStack extends Stack {
       resources: settings.s3BucketResources ?? [],
     });

-    if (settings.createS3Buckets) {
-      for (const name of settings.createS3Buckets ?? []) {
-        const bucket = new Bucket(this, "Bucket", {
-          blockPublicAccess: BlockPublicAccess.BLOCK_ALL,
-          encryption: BucketEncryption.S3_MANAGED,
-          enforceSSL: true,
-          removalPolicy: RemovalPolicy.RETAIN,
-          bucketName: name,
-        });
+    if (settings.createS3Bucket) {
+      const bucket = new Bucket(this, "Bucket", {
+        blockPublicAccess: BlockPublicAccess.BLOCK_ALL,
+        encryption: BucketEncryption.S3_MANAGED,
+        enforceSSL: true,
+        removalPolicy: RemovalPolicy.RETAIN,
+        bucketName: settings.bucketName,
+      });

+      if (settings.copyTestData) {
         const dataDir = path.join(__dirname, "..", "..", "data");
         new BucketDeployment(this, "DeployFiles", {
           sources: [Source.asset(dataDir)],
           destinationBucket: bucket,
         });
       }
+
+      s3BucketPolicy.addResources(`arn:aws:s3:::${bucket.bucketName}/*`);
+
+      new CfnOutput(this, "HtsgetBucketName", { value: bucket.bucketName });
     }

     lambdaRole.addManagedPolicy(
@@ -240,6 +264,10 @@ export class HtsgetLambdaStack extends Stack {
           jwtAudience: settings.jwtAuthorizer.jwtAudience ?? [],
         },
       );
+    } else {
+      console.warn(
+        "This will create an instance of htsget-rs that is public! Anyone will be able to query the server without authorization.",
+      );
     }

     let hostedZone;
