Harmony 571 daac regression tests #2

Merged 19 commits on Mar 8, 2021
2 changes: 2 additions & 0 deletions .gitignore
@@ -6,3 +6,5 @@ output/*
.env
.deployenv
.identity
.netrc

156 changes: 124 additions & 32 deletions README.md
@@ -2,58 +2,150 @@

# Running the Tests

Each test suite is run in a separate Docker container using a temporary image built at test time.
`conda` is used for dependency management. The two steps for each test suite are building and
running the associated image.

## Install Prerequisites

* [Docker](https://www.docker.com/get-started)

## Build the Image & Run the Container
## Set Up Authentication
Create a `.netrc` file in the `test` directory of this repository. It must contain
credentials for logging into [Earthdata Login production](https://urs.earthdata.nasa.gov)
for tests run against Harmony production, or
[Earthdata Login UAT](https://uat.urs.earthdata.nasa.gov) for the Harmony SIT and
UAT environments.

Example `test/.netrc` that can log into both environments:

machine urs.earthdata.nasa.gov login SOME_USER password SOME_PASSWORD
machine uat.urs.earthdata.nasa.gov login SOME_UAT_USER password SOME_UAT_PASSWORD

This file will be copied into the Docker images and used when running the
notebooks. The `.gitignore` file prevents it from being committed to the
git project, but we recommend using a user account with minimal privileges
for test purposes.
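
As a minimal sketch of the step above, the file could be generated by a small shell helper. `write_netrc` is a hypothetical function, not part of this repository, and the owner-only permissions are a suggested precaution rather than a stated requirement of the tests:

```shell
# Hypothetical helper (not part of this repository): write a .netrc that can
# log into both Earthdata Login hosts.
# Arguments: target file, production user, production password,
#            UAT user, UAT password.
write_netrc() {
  local target="$1" prod_user="$2" prod_pass="$3" uat_user="$4" uat_pass="$5"
  # Use a restrictive umask in a subshell so the file is never group- or
  # world-readable, even briefly.
  (
    umask 077
    printf 'machine urs.earthdata.nasa.gov login %s password %s\n' \
      "$prod_user" "$prod_pass" > "$target"
    printf 'machine uat.urs.earthdata.nasa.gov login %s password %s\n' \
      "$uat_user" "$uat_pass" >> "$target"
  )
  # Also tighten permissions in case the file already existed.
  chmod 600 "$target"
}
```

For example, `write_netrc test/.netrc SOME_USER SOME_PASSWORD SOME_UAT_USER SOME_UAT_PASSWORD` would produce the two-line file shown above.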

## Build the Images

$ cd test
$ make image
$ make run
$ make images

By default this will run the tests against the UAT environment. To run
against a specific environment:
`make -j images` can be used to build the images in parallel (faster), although this may make
Docker Desktop unstable.

$ make run environment=prod
## Run the Notebooks

Valid environment values are: sbx, sit, uat, prod.
$ cd test
$ export HARMONY_HOST_URL=<url of Harmony in the target environment>
$ ./run_notebooks.sh

# Notebook Development
Outputs will be in the `output` directory.
For SIT, `HARMONY_HOST_URL` would be `https://harmony.sit.earthdata.nasa.gov`.
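
To avoid retyping URLs, the mapping from environment name to `HARMONY_HOST_URL` could be wrapped in a small function. This is only a sketch: `harmony_host_url_for` is a hypothetical name, and it covers just the SIT and UAT URLs that appear in this change.

```shell
# Hypothetical helper (not part of this repository): print the Harmony URL
# for an environment name. Only the SIT and UAT URLs shown in this change
# are included; anything else is reported as an error.
harmony_host_url_for() {
  case "$1" in
    sit) echo "https://harmony.sit.earthdata.nasa.gov" ;;
    uat) echo "https://harmony.uat.earthdata.nasa.gov" ;;
    *)
      echo "unknown environment: $1" >&2
      return 1
      ;;
  esac
}

# Usage (from the test directory):
#   export HARMONY_HOST_URL="$(harmony_host_url_for sit)"
#   ./run_notebooks.sh
```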

# Running the Tests in AWS for Same-Region Data Access

Harmony runs in the AWS us-west-2 region and offers additional access methods for
clients running in the same region. We have provided a Terraform deployment to
ease test execution in this region.

**Note** - this section applies to the contents of the `test` directory
## Create Terraform Autovars File
In the `terraform` directory create a file called `key.auto.tfvars` and
add a single line indicating the name of the ssh public key file that
should be used for the EC2 instance that runs the notebooks.

These prerequisites and steps are only needed if you want to do local
development on the project.
This file name is the name of the S3 file created in the Harmony ssh key bucket as described in the Harmony project README.md.

## Prerequisites
Example:
```
key_name = "harmony-sit-my-key-name"
```

* [pyenv](https://github.com/pyenv/pyenv)
* [pyenv-virtualenv](https://github.com/pyenv/pyenv-virtualenv)
* [poetry](https://python-poetry.org/)
## Execute the Tests

## Install Python 3.8 (if needed)
**Important**: The following steps allocate resources in AWS. To ease repeated
tests and troubleshooting, they also don't automatically clean up the instance
they create. See "Clean Up Test Resources" to ensure you are minimizing costs
by cleaning up resources.

$ pyenv install 3.8.5
First create a `.env` file in the top-level directory by copying the `dot_env` file and filling
in the proper values. Then execute the following:

## Install dependencies
$ cd script
$ export HARMONY_ENVIRONMENT=<uat|sit|sandbox|prod>
$ ./test.sh

$ pyenv virtualenv 3.8.5 harmony-rt
$ pyenv local harmony-rt
$ pyenv activate harmony-rt
$ poetry install
$ pyenv rehash
Output will be in the bucket specified with the `REGRESSION_TEST_OUTPUT_BUCKET`
environment variable with a folder for each notebook.
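
Before running `test.sh`, it can help to confirm that the `.env` file no longer contains unfilled placeholders. The following is a sketch under assumptions: `missing_env_vars` is a hypothetical helper, and it treats any value that is empty or still starts with `<` (as in `dot_env`) as not filled in.

```shell
# Hypothetical helper (not part of this repository): print the names of the
# variables from dot_env that are absent, empty, or still set to a
# "<placeholder>" value in the given file.
missing_env_vars() {
  local env_file="$1" name value
  for name in REGRESSION_TEST_OUTPUT_BUCKET AWS_ACCESS_KEY_ID \
      AWS_SECRET_ACCESS_KEY EDL_USER EDL_PASSWORD SECRET_KEY_FILE; do
    # Take everything after "NAME=" on the first matching line, if any.
    value="$(sed -n "s/^${name}=//p" "$env_file" | head -n 1)"
    case "$value" in
      '' | '<'*) echo "$name" ;;
    esac
  done
}

# Usage (from the top-level directory):
#   missing_env_vars .env
```

The file is parsed as text rather than sourced because the placeholder values in `dot_env` contain `<` and `>`, which a shell would interpret as redirections.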

## Run the notebooks
## Clean Up Test Resources

$ ./run_notebooks harmony_host_url=<url of Harmony in the target environment>

e.g.,

$ ./run_notebooks harmony_host_url="https://harmony.sit.earthdata.nasa.gov"
The prior scripts do not clean up allocated resources. To remove the resources
used to run the test, run:

Outputs will be in the `output` directory
$ terraform destroy

## Start JupyterLab
Test outputs are not automatically deleted.

# Notebook Development

$ jupyter-lab
Notebooks and support files should be placed in a subdirectory of the `test` directory.

For example, in the `harmony` directory we have

```
├── Harmony.ipynb
├── __init__.py
├── environment.yaml
└── util.py
```

Notebook dependencies should be listed in a file named `environment.yaml` at the top level of the
subdirectory. The `name` field in the file should be `papermill`. For example:

```yaml
name: papermill
channels:
- conda-forge
- defaults
dependencies:
- python=3.7
- jupyter
- requests
- netcdf4
- matplotlib
- papermill
- pytest
- ipytest
```

## Generating a Dependency Lockfile
To increase runtime efficiency, the build relies on [conda-lock](https://pypi.org/project/conda-lock/). This is used to create a dependency lockfile that can be used
by conda to more efficiently load dependencies. The Docker build expects a lockfile
named `conda-linux-64.lock` to exist at the top level of a notebook directory (next to
the `environment.yaml` file).

To build the lockfile, install `conda-lock` by following the directions provided on its website. Then generate the lockfile for your notebook by running the following:
```
conda-lock -f environment.yaml -p linux-64
```
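
Because the Docker build expects the lockfile to sit next to each `environment.yaml`, a quick check before building can catch a forgotten lockfile. This is a sketch only; `missing_lockfiles` is a hypothetical helper and assumes the one-subdirectory-per-suite layout described above.

```shell
# Hypothetical helper (not part of this repository): print every notebook
# directory under the given test directory that has an environment.yaml but
# no conda-linux-64.lock next to it.
missing_lockfiles() {
  local test_dir="${1:-.}" env_file suite_dir
  for env_file in "$test_dir"/*/environment.yaml; do
    # If the glob matched nothing, the pattern itself is returned; skip it.
    [ -e "$env_file" ] || continue
    suite_dir="$(dirname "$env_file")"
    [ -e "$suite_dir/conda-linux-64.lock" ] || echo "$suite_dir"
  done
}

# Usage (from the test directory):
#   missing_lockfiles .
```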

Test notebooks should not rely on other forms of dependency management or expect user input.
They _should_ use the `harmony_host_url` global variable to communicate with Harmony
or to determine the Harmony environment. This variable is set by `papermill`; see
`Harmony.ipynb` for how to make use of it. More information can be found
in the [papermill](https://papermill.readthedocs.io/en/latest/usage-parameterize.html)
documentation on setting parameters.

New test suites must be added to the `Makefile`. A new `name-image` target (where name is the name of
the test suite) should be added (see the `harmony-image` example), and the new image target
should be added as a dependency of the `images` target. The docker image should have a name like
`harmony/regression-tests-<base_name>`, where `base_name` is the name of the test suite.

Finally, add the image base name to the `images` array on line 6 of the `run_notebooks.sh` file.
For instance, if the image is named `harmony/regression-tests-foo`, then we would add `foo` to the
array.
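
The naming convention can be illustrated with a short sketch. `image_name_for` is a hypothetical function, and the list of base names below is illustrative; the actual contents of the `images` array in `run_notebooks.sh` are not shown in this change.

```shell
# Hypothetical helper (not part of this repository): derive the Docker image
# name from a test suite's base name, following the convention above.
image_name_for() {
  printf 'harmony/regression-tests-%s\n' "$1"
}

# Example: print the image name for a few illustrative base names.
for base_name in harmony harmony-regression foo; do
  image_name_for "$base_name"
done
```

With this convention, adding `foo` to the array corresponds to an image named `harmony/regression-tests-foo`.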

The `run_notebooks.sh` file can be used as described above to run the test suite. Notebooks are
expected to exit with a non-zero exit code on failure when run from `papermill`.
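
The exit-code convention matters because a runner can only report overall failure if each suite signals it. The sketch below shows one way such a runner might aggregate results; `run_suites` is a hypothetical function and is not taken from `run_notebooks.sh`, whose contents are not shown here.

```shell
# Hypothetical sketch (not the actual run_notebooks.sh logic): run a command
# once per suite and succeed only if every suite succeeded.
# Arguments: the command (or function) to run, followed by the suite names.
run_suites() {
  local runner="$1" failures=0 suite
  shift
  for suite in "$@"; do
    if "$runner" "$suite"; then
      echo "PASS $suite"
    else
      echo "FAIL $suite"
      failures=$((failures + 1))
    fi
  done
  # The function's exit status is non-zero when any suite failed.
  [ "$failures" -eq 0 ]
}
```

Here `runner` would be whatever command starts one suite's container; every suite is run even after a failure, so all results are available afterwards.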
6 changes: 6 additions & 0 deletions dot_env
@@ -0,0 +1,6 @@
REGRESSION_TEST_OUTPUT_BUCKET=<some bucket you have created, (aws s3 mb ____)>
AWS_ACCESS_KEY_ID=<KEY ID>
AWS_SECRET_ACCESS_KEY=<SECRET KEY>
EDL_USER=harmony_dev_user
EDL_PASSWORD=<HARMONY DEV USER PASSWORD>
SECRET_KEY_FILE=<path to unencrypted (no passphrase) private key file>
8 changes: 6 additions & 2 deletions script/deploy-from-docker.sh
@@ -30,11 +30,15 @@ function retry {
return 0
}

# copy the test directory to the EC2 instance
retry 5 scp -v -F sshconfig -i .identity -r test "ec2-user@${INSTANCE_ID}:"
# create a .netrc file on the EC2 instance
netrc_default="machine urs.earthdata.nasa.gov login ${EDL_USER} password ${EDL_PASSWORD}\nmachine uat.urs.earthdata.nasa.gov login ${EDL_USER} password ${EDL_PASSWORD}"
retry 5 ssh -F sshconfig -i .identity "ec2-user@${INSTANCE_ID}" "echo -e \"${netrc_default}\" > ./test/.netrc"
# It can take a couple minutes for docker to be available on the instance
retry 10 ssh -F sshconfig -i .identity "ec2-user@${INSTANCE_ID}" "cd test && make image"
retry 10 ssh -F sshconfig -i .identity "ec2-user@${INSTANCE_ID}" "cd test && make -j images"
set +e
ssh -F sshconfig -i .identity "ec2-user@${INSTANCE_ID}" "cd test && make run HARMONY_HOST_URL=${HARMONY_HOST_URL}"
ssh -v -F sshconfig -i .identity "ec2-user@${INSTANCE_ID}" "cd test && export HARMONY_HOST_URL=${HARMONY_HOST_URL} && ./run_notebooks.sh"
exit_code=$?
set -e
# copy the output to here
59 changes: 33 additions & 26 deletions script/test.sh
@@ -12,6 +12,36 @@ function get_elb {
echo $(aws elbv2 describe-load-balancers | jq --arg host "harmony-$HARMONY_ENVIRONMENT-frontend" '.LoadBalancers[] | select(.LoadBalancerName == $host) | .DNSName' | tr -d '"')
}

cd ..

deployenv='.deployenv'
if [ -e $deployenv ]; then
rm $deployenv
fi

if [ -e .env ]; then
echo "Using .env file"
set -o allexport
source .env
set +o allexport
cp .env $deployenv
else
echo "AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}" >> $deployenv
echo "AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}" >> $deployenv
echo "REGRESSION_TEST_OUTPUT_BUCKET=${REGRESSION_TEST_OUTPUT_BUCKET}" >> $deployenv
fi

export AWS_DEFAULT_REGION="${AWS_DEFAULT_REGION:-us-west-2}"
echo "AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}" >> $deployenv

# create the test environment
cd ./terraform
terraform init
terraform apply -auto-approve -var "environment_name=${HARMONY_ENVIRONMENT}"
instance_id=$(terraform output -json harmony_regression_test_instance_id | jq -r .id)

echo "instance_id = ${instance_id}"

case $HARMONY_ENVIRONMENT in
uat)
harmony_host_url="https://harmony.uat.earthdata.nasa.gov"
@@ -28,13 +58,7 @@ sit|sandbox)
;;
esac

output_bucket="${REGRESSION_TEST_OUTPUT_BUCKET}"

# create the test environment
cd ../terraform
terraform init
terraform apply -auto-approve -var "environment_name=${HARMONY_ENVIRONMENT}"
instance_id=$(terraform output -json harmony_regression_test_instance_id | jq -r .id)
echo "harmony host url: ${harmony_host_url}"

cd ..

@@ -48,32 +72,15 @@ else
fi
chmod 0600 $identity

AWS_DEFAULT_REGION="${AWS_DEFAULT_REGION:-us-west-2}"

deployenv='.deployenv'
if [ -e $deployenv ]; then
rm $deployenv
fi

if [ -e .env ]; then
set -o allexport
source .env
set +o allexport
cp .env $deployenv
else
echo "AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}" >> $deployenv
echo "AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}" >> $deployenv
fi

echo "INSTANCE_ID=${instance_id}" >> $deployenv
echo "HARMONY_HOST_URL=${harmony_host_url}" >> $deployenv
echo "AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}" >> $deployenv
echo "REGRESSION_TEST_OUTPUT_BUCKET=${output_bucket}" >> $deployenv

./script/build-image.sh

docker run --rm \
-v $(pwd):/tmp \
-e EDL_USERNAME \
-e EDL_PASSWORD \
harmony/regression-tests \
'./script/deploy-from-docker.sh'

1 change: 1 addition & 0 deletions sshconfig
@@ -4,3 +4,4 @@ Host i-*
StrictHostKeyChecking no
UserKnownHostsFile /dev/null
LogLevel ERROR
ServerAliveInterval 60
6 changes: 3 additions & 3 deletions terraform/inputs.tf
@@ -4,12 +4,12 @@ variable "aws_region" {
}

variable "instance_type" {
description = "EC2 instance type for the harmony application"
default = "t2.medium"
description = "EC2 instance type for the regression test runner"
default = "t2.xlarge"
}

variable "key_name" {
description = "Key pair name to use for the harmony EC2 instance."
description = "Key pair name to use for the harmony regression test instance."
default = "bamboo"
}

4 changes: 4 additions & 0 deletions terraform/main.tf
@@ -45,6 +45,10 @@ resource "aws_instance" "harmony_regression_test" {

user_data = file("${path.module}/harmony-user-data.tmpl")

root_block_device {
volume_size = 256
}

vpc_security_group_ids = [aws_security_group.harmony_regression_test.id]
tags = {
Name = "harmony-regression-test-${var.environment_name}"
26 changes: 15 additions & 11 deletions test/Dockerfile
@@ -1,16 +1,20 @@
FROM python:3.8.7-buster
FROM continuumio/miniconda3:latest

WORKDIR /opt/harmony
ARG sub_dir
ARG notebook
ENV env_sub_dir=$sub_dir
ENV env_notebook=$notebook

RUN pip install poetry
RUN mkdir -p ./output
WORKDIR /root

COPY pyproject.toml .
COPY poetry.lock .
RUN poetry install
RUN conda config --add channels conda-forge
RUN pip install conda-lock
RUN conda install conda-lock

COPY notebooks ./notebooks
COPY harmony ./harmony
COPY run_notebooks.sh .
COPY .netrc .netrc
RUN mkdir ./${sub_dir}
COPY ${sub_dir}/conda-linux-64.lock ./${sub_dir}

ENTRYPOINT ./run_notebooks.sh -p harmony_host_url $harmony_host_url
RUN conda create --name papermill --file ./${sub_dir}/conda-linux-64.lock

ENTRYPOINT export PATH=/opt/conda/envs/papermill/bin:$PATH; mkdir /root/output/${env_sub_dir}; conda activate papermill; papermill --cwd ${env_sub_dir} ${env_sub_dir}/${env_notebook} /root/output/${env_sub_dir}/Results.ipynb -p harmony_host_url $harmony_host_url
12 changes: 7 additions & 5 deletions test/Makefile
@@ -1,8 +1,10 @@
.PHONY: run
harmony-image: Dockerfile harmony/environment.yaml
docker build -t harmony/regression-tests-harmony:latest -f ./Dockerfile --build-arg notebook=Harmony.ipynb --build-arg sub_dir=harmony .

image: pyproject.toml poetry.lock Dockerfile
docker build -t harmony/regression-tests:latest .
# asf-gdal-image: Dockerfile gdal_subsetter/environment.yaml
# docker build -t harmony/regression-tests-asf-gdal:latest -f ./Dockerfile --build-arg notebook=GDAL_Subsetter_Regression.ipynb --build-arg sub_dir=gdal_subsetter .

run:
docker run -v ${PWD}/output:/opt/harmony/output --env harmony_host_url="${HARMONY_HOST_URL}" harmony/regression-tests:latest
harmony-regression-image: Dockerfile harmony-regression/environment.yaml
docker build -t harmony/regression-tests-harmony-regression:latest -f ./Dockerfile --build-arg notebook=HarmonyRegression.ipynb --build-arg sub_dir=harmony-regression .

images: harmony-image harmony-regression-image