Deployment Instructions Scalable Deployment
These instructions assume that your computer runs Linux, macOS, the Windows Subsystem for Linux, or the Cygwin POSIX environment on Windows. If you have the expertise to run the Beiwe Backend directly on Windows and encounter what may be a compatibility issue, please report it as a bug with the appropriate caveat. We will try to assist you.
There are many reasons to run a Beiwe cluster instead of an individual server, but the chief one is this: Onnela Lab runs a Beiwe cluster. All work is tested on a cluster deployment, and all application updates are handled by a script maintained in the beiwe-backend repository. By taking on the extra up-front work to set up a cluster you will receive first class support and, inevitably, be saved from a couple of unintended bugs.
- If you don't have an Amazon Web Services account, go to https://aws.amazon.com and click "Create an AWS account".
- Choose an AWS region that offers the services you need: S3, EC2, RDS, and Elastic Beanstalk. (Comparison table: https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) As of 2020 this appears to be all regions, so pick one that is relatively close to the participants in your studies.
You need to run your Beiwe deployment with a domain name; you can either buy one (like example.com) or obtain one from your organization (e.g., beiwe.myresearchcenter.myuniversity.edu). You need the domain name because Beiwe requires an SSL certificate or mobile devices will not be able to upload data.
Before you can do anything with the AWS Command-Line Interface (CLI), you need to use the AWS web interface to create an Identity and Access Management (IAM) user with the AdministratorAccess policy attached. Here's how to do that:
- Log in to the AWS web GUI (https://signin.aws.amazon.com/console), and go to "Services" > "Security, Identity, & Compliance" > "IAM"
- On the left sidebar, click "Users" and then the "Add User" button. Fill in "User name" with a value of your choice, select "Programmatic access", and click "Next". On the next screen, select "Attach existing policies directly", and check the box next to AdministratorAccess. Click "Next", then review, and click "Create User".
- Download or copy the generated credentials ("User", "Access key ID", and "Secret access key") and save them; you will need them in a later step!
You need a deployment key so that you can SSH from your local machine into the AWS servers you're setting up. Here's how to generate one:
- Go to "Services" > "Compute" > "EC2". Then in the left sidebar, under "Network & Security", click "Key Pairs".
- Click the "Create Key Pair" button, give it a name of your choice, and click "Create". Your browser should automatically download the private key as a .PEM file.
- Store the private key in an appropriate location on your local computer; on a Linux or macOS machine a good location is the ~/.ssh/ directory.
- Restrict the permissions on the key so that only your user can read and write it:
$ chmod 600 path/to/your/key.pem
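If you would rather set and verify the key permissions from Python, the following is a minimal sketch. The helper name `restrict_key_permissions` is hypothetical and not part of Beiwe, and the sketch assumes a POSIX filesystem.

```python
import os
import stat


def restrict_key_permissions(key_path: str) -> int:
    """Apply owner-only read/write (0o600) to key_path.

    Returns the permission bits read back from the file, so the caller
    can confirm that the change took effect.
    """
    os.chmod(key_path, 0o600)
    return stat.S_IMODE(os.stat(key_path).st_mode)
```

Calling `restrict_key_permissions("path/to/your/key.pem")` has the same effect as the `chmod 600` command.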
Sentry.io is a platform for monitoring errors and providing developers with runtime information. The Sentry credentials are optional, but we strongly recommend including them. We can offer much more support if you have Sentry configured.
- Create an account on Sentry.io. You can use the free tier; the primary limitation of that tier is that it only lets you create one login account.
- Create a project
- Once you have created or selected a project, click "Project Settings" in the top right corner, then in the left side menu, under "Data", click "Client Keys (DSN)".
- Save the "DSN" and "DSN (Public)" values; you'll need them in a later step.
- git clone the beiwe-backend repository to your local computer, git checkout the main branch, and navigate to the cluster_management/general_configuration directory.
- Make sure you are using Python 3.8 inside the beiwe-backend directory.
  - Note: Python 3.8 goes out of support in October 2024. We will upgrade before then and provide guides, as we did for previous platform version upgrades. (Please file a bug report if this documentation is out of date!)
  - Most OSes have fully retired Python 2, but if you are stuck on an old platform you may need to use python3 as your Python executable, and possibly the pip3 command to install software dependencies.
  - We very strongly recommend that you use a modern platform and a Python virtual environment to install packages. This guide does not cover virtual environments in detail; instructions are easy to find online.
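As a quick sanity check before installing anything, the sketch below reports whether the running interpreter matches the version named in this guide and whether it is running inside a virtual environment. The function names and the `REQUIRED` constant are illustrative assumptions, not part of Beiwe.

```python
import sys

# Version named in this guide; adjust it if the guide changes.
REQUIRED = (3, 8)


def is_required_python(version_info=sys.version_info, required=REQUIRED) -> bool:
    """True when the major.minor version equals the required one."""
    return tuple(version_info[:2]) == tuple(required)


def in_virtual_environment() -> bool:
    """True when the interpreter runs inside a venv-style environment.

    In a virtual environment, sys.prefix points at the environment while
    sys.base_prefix points at the underlying Python installation.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version matches:", is_required_python())
    print("Virtual environment:", in_virtual_environment())
```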
- Copy the file aws_credentials.example.json to create a new file called aws_credentials.json. Open the new file aws_credentials.json and fill in the appropriate values:
  - AWS_ACCESS_KEY_ID is the "Access key ID" from the IAM user you created in the step above.
  - AWS_SECRET_ACCESS_KEY is the "Secret access key" from the IAM user you created in the step above.
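A small check like the one below can catch a credentials file that was copied but never filled in. It is a sketch that assumes aws_credentials.json is a flat JSON object containing the two keys described above; the helper name `missing_credentials` is hypothetical.

```python
import json

REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_credentials(json_text: str) -> list:
    """Return the required keys that are absent, non-string, or blank."""
    data = json.loads(json_text)
    missing = []
    for key in REQUIRED_KEYS:
        value = data.get(key)
        if not isinstance(value, str) or not value.strip():
            missing.append(key)
    return missing
```

For example, pass it the result of `open("aws_credentials.json").read()`; an empty list means both values are present.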
- Copy the file global_configuration.example.json to create a new file called global_configuration.json. Open the new global_configuration.json file and fill in the appropriate values:
  - DEPLOYMENT_KEY_NAME is the name of the deployment key you generated above (don't include the ".pem" filename extension).
  - DEPLOYMENT_KEY_FILE_PATH is the absolute filepath (including the filename and ".pem" extension) where the deployment key is stored on your local computer.
  - VPC_ID: when you select an AWS region, AWS automatically creates one Virtual Private Cloud (VPC) for you. To find its ID, go to "Services" > "Networking & Content Delivery" > "VPC". Under "Resources", you should see a link for "1 VPC" (or a higher number, if you have manually created additional VPCs); click on that link, and then get your VPC ID from the table; it should be formatted like vpc-6789abcd.
  - AWS_REGION is Amazon's lowercase, hyphenated name for the region you're using. For example, if your region is "US East (Ohio)", this value should be us-east-2; if your region is "Asia Pacific (Sydney)", this value should be ap-southeast-2. You can look up the lowercase, hyphenated names here: http://docs.aws.amazon.com/general/latest/gr/rande.html#elasticbeanstalk_region
  - SYSTEM_ADMINISTRATOR_EMAIL is the email address to which AWS (not Sentry) will send administrative alerts, performance alarm notifications, deployment operation events, etc. This can be whatever you choose, but it should be an email address that the system administrator checks regularly.
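The formatting rules above can be checked mechanically before running the launch script. The sketch below is an illustration only: the function `check_global_configuration` is hypothetical, and the regular expressions are loose assumptions based on the examples in this guide (`vpc-6789abcd`, `us-east-2`), not an official AWS specification.

```python
import os
import re

# Loose patterns inferred from the examples in this guide.
VPC_ID_PATTERN = re.compile(r"^vpc-[0-9a-f]+$")
REGION_PATTERN = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")


def check_global_configuration(config: dict) -> list:
    """Return a list of human-readable problems; empty means no issue found."""

    def text(key: str) -> str:
        # Treat missing or null values as empty strings.
        return str(config.get(key) or "")

    problems = []
    if not text("DEPLOYMENT_KEY_NAME") or text("DEPLOYMENT_KEY_NAME").endswith(".pem"):
        problems.append("DEPLOYMENT_KEY_NAME must be set, without the .pem extension")
    key_path = text("DEPLOYMENT_KEY_FILE_PATH")
    if not (os.path.isabs(key_path) and key_path.endswith(".pem")):
        problems.append("DEPLOYMENT_KEY_FILE_PATH must be an absolute path ending in .pem")
    if not VPC_ID_PATTERN.match(text("VPC_ID")):
        problems.append("VPC_ID should look like vpc-6789abcd")
    if not REGION_PATTERN.match(text("AWS_REGION")):
        problems.append("AWS_REGION should look like us-east-2")
    if "@" not in text("SYSTEM_ADMINISTRATOR_EMAIL"):
        problems.append("SYSTEM_ADMINISTRATOR_EMAIL should be an email address")
    return problems
```

To use it, load global_configuration.json with `json.load` and pass the resulting dictionary to the function.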
- Make sure you have pip, the CLI Python package manager (https://pip.pypa.io/en/stable/), available on your computer. Then, in your local copy of the beiwe-backend repo, cd into the cluster_management/ directory and run the command below.
  - If you are using an OS-provided Python 3 installation and are not using a virtual environment for package installation, add the --user flag to this command. Installing packages directly into your operating system requires sudo, may simply not work, will cause pip to complain, and can break other software on your computer that has expectations about system software packages.
  - To reiterate, we recommend that you use a modern platform and a Python virtual environment to install packages.
$ pip install -r launch_requirements.txt
- Still in the cluster_management/ directory, run:
$ python launch_script.py -help-setup-new-environment
Give a name for your environment when prompted to do so; from here on, we'll refer to the name you provided as [YOUR-ENV-NAME]. The script will then automatically create two more configuration files in cluster_management/environment_configuration.
- In the directory cluster_management/environment_configuration, edit both files:
  - Beiwe currently uses a very old format for Sentry DSNs. Until Beiwe is updated, any DSNs you provide may not work. The deployment script should tell you that your new-style DSN does not match the old DSN format, so for now just run without those credentials. We will be updating the Sentry support to use the new DSN format.
  - In the file [YOUR-ENV-NAME]_beiwe_environment_variables.json:
    - SENTRY_JAVASCRIPT_DSN: this should be a DSN (Public) value from your Sentry.io account; you can leave it empty or provide a dummy value if you do not have one.
    - SENTRY_ANDROID_DSN, SENTRY_ELASTIC_BEANSTALK_DSN, SENTRY_DATA_PROCESSING_DSN: these should all be non-public DSN values from your Sentry.io account. A DSN identifies which Sentry project errors get submitted to. If you want Android errors, Elastic Beanstalk errors, and Data Processing errors all lumped together in the same project, you can provide the same DSN for each of those. If you want the errors grouped in different projects, you can give each one a DSN for a different Sentry project.
    - DOMAIN: this is your server's domain name, including subdomains. (Don't include the http:// or https:// prefix.)
  - The file [YOUR-ENV-NAME]_server_settings.json contains default server types for your Beiwe deployment. You don't need to edit this in any way, but you are welcome to do so.
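Because the DOMAIN value must not carry a scheme prefix, it can help to normalize whatever was pasted from a browser. The sketch below shows one way to do that with the standard library; `bare_domain` is a hypothetical helper, not something the launch script provides.

```python
from urllib.parse import urlsplit


def bare_domain(value: str) -> str:
    """Strip any scheme and path, leaving only the lowercase host name."""
    value = value.strip()
    if "://" in value:
        # urlsplit separates "https://host/path" into scheme, netloc, path.
        value = urlsplit(value).netloc
    # Drop anything after the first slash when no scheme was given.
    return value.split("/", 1)[0].lower()
```

For instance, `bare_domain("https://beiwe.example.com/")` yields `beiwe.example.com`, which is the form the DOMAIN setting expects.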
- cd back into cluster_management/ and enter the commands below. Each command will prompt you to "Enter the name of the Elastic Beanstalk Environment you want to run this operation on"; provide exactly the same environment name you used in the previous step.
$ python launch_script.py -create-environment
$ python launch_script.py -create-manager
$ python launch_script.py -create-worker # (only necessary on clusters with large numbers of users.)
All three of these commands take several minutes to run because they all involve waiting for AWS to spin up new servers. A command can take more than 10 minutes if you have chosen very small server sizes or receive a dud server instance. For more information about what each of these commands does, run:
$ python launch_script.py --help
If any of the above commands fails with error messages about "Instance Profiles" (or any other IAM entities), you may have extra AWS Instance Profiles hanging around, usually because a deployment operation was interrupted. You can delete all Instance Profiles by running python launch_script.py -purge-instance-profiles. DO NOT RUN THIS COMMAND if you have a functional Elastic Beanstalk Beiwe cluster running; it will probably break it.
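The order of the launch commands matters, so a wrapper that builds them in sequence can reduce mistakes. The following is only a sketch: `launch_commands` is a hypothetical helper, and because each command prompts interactively for the environment name, you would still need to answer that prompt for every step.

```python
import sys


def launch_commands(include_worker: bool = False, python: str = sys.executable) -> list:
    """Build the launch_script.py invocations in the order given in this guide."""
    flags = ["-create-environment", "-create-manager"]
    if include_worker:
        # Only necessary on clusters with large numbers of users.
        flags.append("-create-worker")
    return [[python, "launch_script.py", flag] for flag in flags]
```

Each returned list can be passed to `subprocess.run(command, check=True, cwd="cluster_management")`, so that a failing step stops the sequence instead of continuing to the next command.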
- Install the EB CLI by following the instructions on the AWS page "Install the Elastic Beanstalk Command Line Interface (EB CLI)".
- In the beiwe-backend repo, configure the file .elasticbeanstalk/config.yml:

    branch-defaults:
      master:  # You can change "master" to any branch name
        environment: [YOUR-ENV-NAME]
    global:
      application_name: beiwe-application
      default_ec2_keyname: [DEPLOYMENT_KEY_NAME]  # Same as in global_configuration.json
      default_platform: 64bit Amazon Linux 2018.03 v2.9.4 running Python 3.6
      default_region: [AWS_REGION]  # Same as in global_configuration.json
      profile: eb-cli  # this name just needs to match a profile declared in ~/.aws/
      sc: git
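Since two of the values in config.yml repeat entries from global_configuration.json, the file can be rendered from them rather than typed by hand. The sketch below assumes the layout shown in this guide; `render_eb_config` is a hypothetical helper, and the platform string is passed in by the caller because it changes between platform versions.

```python
CONFIG_TEMPLATE = """\
branch-defaults:
  {branch}:
    environment: {env_name}
global:
  application_name: beiwe-application
  default_ec2_keyname: {key_name}
  default_platform: {platform}
  default_region: {region}
  profile: eb-cli
  sc: git
"""


def render_eb_config(env_name: str, key_name: str, region: str,
                     platform: str, branch: str = "master") -> str:
    """Fill the config.yml layout from this guide with deployment values."""
    return CONFIG_TEMPLATE.format(
        branch=branch,
        env_name=env_name,
        key_name=key_name,
        platform=platform,
        region=region,
    )
```

Write the returned text to .elasticbeanstalk/config.yml, using the DEPLOYMENT_KEY_NAME and AWS_REGION values from global_configuration.json for `key_name` and `region`.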
- cd into the root directory of the beiwe-backend repo, and run
$ eb init
When asked to provide your credentials, use the same IAM user credentials you pasted into cluster_management/general_configuration/aws_credentials.json. If it prompts you to add AWS CodeCommit, you can answer "n".
- From the root directory of the beiwe-backend repo, run the command below.
WARNING: do not run this command before configuring your HTTPS certificate in the EC2 console for the Load Balancer. If you do, you may find yourself repeatedly locked out of all operations for 15+ minutes at a time. This is caused by Elastic Beanstalk erroring when it attempts a default health check using HTTPS. See the Configuring SSL section below.
$ eb deploy
Any time you want to update your deployment to the current version of the code, you can just run:
$ git pull
$ eb deploy
For more information on using Elastic Beanstalk with Git, read Using the EB CLI with Git.
A Beiwe cluster isn't served from a single server; it runs behind a Load Balancer (a kind of intelligent router) that distributes incoming requests to multiple servers. You need to point your domain name to the address of your deployment's Load Balancer. Go to "Services" > "EC2" and click "Load Balancers" in the left column. From the table of Load Balancers, copy the "DNS name" of your Load Balancer; this is the address you need to point your domain name at.
Note: a Load Balancer's IP address is not stable. You need to use its DNS name, which is formatted like this: awseb-AWSEBLoa-ABCDEF123456-1234567890.us-east-1.elb.amazonaws.com. Because of that:
- If you want to use a root domain like example.com rather than beiwe.example.com, you may need to use Amazon Route 53 for your DNS. DNS specifications do not allow a CNAME record at the root of a domain, so a root domain normally needs A records (IP addresses), and most DNS providers follow this. Route 53 is unusual in that it lets you point an A record at a DNS name; it calls this feature an "Alias". We've heard that some other DNS providers will let you do the same, but we don't know which providers those are; we do know that GoDaddy doesn't.
- If you want to use a subdomain like beiwe.example.com, you can use a DNS provider of your choice and create a CNAME record that points from your subdomain directly to the Load Balancer's DNS name.
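The two cases above reduce to a simple rule, sketched here for illustration. The function `dns_record_for` is hypothetical, and it assumes you already know the root (apex) of your DNS zone, since that cannot be derived reliably from a host name alone.

```python
def dns_record_for(hostname: str, zone_apex: str) -> str:
    """Pick the record type for pointing hostname at a Load Balancer.

    Returns "ALIAS" for the root domain (Route 53's alias A record) and
    "CNAME" for any subdomain of it.
    """
    host = hostname.rstrip(".").lower()
    apex = zone_apex.rstrip(".").lower()
    if host == apex:
        return "ALIAS"
    if host.endswith("." + apex):
        return "CNAME"
    raise ValueError(f"{hostname!r} is not inside the zone {zone_apex!r}")
```

So `dns_record_for("beiwe.example.com", "example.com")` selects a CNAME, while `dns_record_for("example.com", "example.com")` selects an Alias record.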
Because Beiwe often deals with sensitive data potentially covered under HIPAA regulations it is important to add an SSL certificate so that web traffic is encrypted with HTTPS. This is so important that Beiwe will not run without one except inside development environments.
You can use your own SSL certificate, one provided by your organization, or a free SSL certificate from AWS Certificate Manager (ACM). If using ACM, use the web interface to request an SSL certificate for your deployment (see the ACM documentation). ACM will check that you control the domain by sending verification emails to the email addresses in the domain's WHOIS listing, so make sure you can receive emails at at least one of those addresses.
Once you have configured your SSL certificate, navigate to the EC2 service in the AWS online management console. On the left, find the Load Balancers section, and select your load balancer.
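Its listener configuration looks like this, except your 443 port forwarding will be missing or disabled:
[Screenshot: existing load balancer listeners]
Add a forwarding rule to your load balancer like this one:
[Screenshot: port 443 forwarding rule]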
Provided your server configuration's DOMAIN_NAME matches the address entered in the browser, the website should just work and you will be presented with the login screen.
If it is your first time logging in, you will need to enter the default admin username default_admin and the default password abcABC123!@#. You should then, of course, immediately change your password.