- General information about Alembic migrations can be found here.
- For database make recipes
- See Environment variables for usage explanation of `DEPLOYMENT_STAGE`, `AWS_PROFILE` and `CORPORA_LOCAL_DEV`
- `$REPO_ROOT` - root directory where the `single-cell-data-portal` project is cloned (e.g. `~/PyCharmProjects/single-cell-data-portal`)
```shell
cd $REPO_ROOT/backend
```

- Run `make db/new_migration MESSAGE="purpose_of_migration"` where `purpose_of_migration` is a short phrase describing why the database migration is occurring. This generates a file in `$REPO_ROOT/backend/database/versions/xxxxxxxxxxxx_purpose_of_migration.py`
- In the generated file, edit the `upgrade()` and `downgrade()` functions such that `upgrade()` contains the Alembic DDL commands to perform the migration you would like and `downgrade()` contains the commands to undo it.
- Rename the generated file by prepending the migration count to the filename (`xxx_purpose_of_migration.py` -> `03_xxx_purpose_of_migration.py`)
- In the generated file, update the `Revision ID` and the `revision` (used by Alembic) to include the migration count. For example, `Revision ID: a8cd0dc08805` becomes `Revision ID: 18_a8cd0dc08805` and `revision = "a8cd0dc08805"` becomes `revision = "18_a8cd0dc08805"`
- Test your migration
- Check that orm.py matches up with your changes.
- Once you've completed the changes, create a PR to get the functions reviewed.
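The rename-and-reprefix steps above are mechanical, so they can be scripted. The helper below is a hypothetical sketch (not part of the repo) that prepends the migration count to a generated migration's filename, its `Revision ID` docstring line, and its `revision` variable, while leaving `down_revision` untouched:

```python
import re
from pathlib import Path

def prefix_migration(path: Path, count: int) -> Path:
    """Prepend the migration count to a generated Alembic migration's
    filename and revision identifiers (hypothetical helper, not in the repo)."""
    text = path.read_text()
    # "Revision ID: a8cd0dc08805" -> "Revision ID: 18_a8cd0dc08805"
    text = re.sub(r"(Revision ID: )(\w+)", rf"\g<1>{count}_\g<2>", text)
    # revision = "a8cd0dc08805" -> revision = "18_a8cd0dc08805"
    # (anchored so down_revision, which points at the prior migration, is untouched)
    text = re.sub(r'^(revision = ")(\w+)', rf"\g<1>{count}_\g<2>", text, flags=re.M)
    renamed = path.with_name(f"{count:02d}_{path.name}")
    renamed.write_text(text)
    path.unlink()
    return renamed
```

For example, `prefix_migration(Path("a8cd0dc08805_add_col.py"), 18)` would produce `18_a8cd0dc08805_add_col.py` with the identifiers rewritten accordingly.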
- Rdev compatibility:
  - If your migration requires changes to the seed data file for rdev, make those changes in a new seed data file with your migration hash in the name, e.g. `s3://env-rdev-dataportal/database/seed_data_##_a1b2c3d4e5f6.sql`, where `##` is the migration number and `a1b2c3d4e5f6` is the migration hash, both taken from the migration file under `./versions/`. Do not delete older seed files from s3; the most recent version(s) will still be in active use by other developers until they incorporate your migration into their feature branches.
  - In your PR, change the `data_load_path` local variable in the ecs-stack module to reflect the new dump file above that you have written to s3.
- Once the PR is merged, migrations will be run as part of the deployment process to each env.
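The seed-file naming convention described above can be captured in a one-line helper. This is an illustrative sketch only; the function name and key prefix are assumptions based on the example path in this doc:

```python
def seed_data_key(migration_number: int, migration_hash: str) -> str:
    """Build the s3 object key for a new rdev seed data file, following the
    seed_data_##_<hash>.sql convention (hypothetical helper)."""
    return f"database/seed_data_{migration_number:02d}_{migration_hash}.sql"
```

For example, `seed_data_key(18, "a1b2c3d4e5f6")` yields `database/seed_data_18_a1b2c3d4e5f6.sql`, matching the `s3://env-rdev-dataportal/` layout shown above.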
While migrations should run automatically as part of our deployment process, if a manual migration is required:
- Connect to Remote RDS to single-cell-dev
- In a new terminal, complete the migration in the single-cell-dev test env by running:

```shell
cd $REPO_ROOT/backend
DEPLOYMENT_STAGE=test make db/migrate
```
- Make changes to the ORM class(es) in `orm.py`
- Connect to Remote RDS. Note: generally you would connect to prod (`AWS_PROFILE=single-cell-prod DEPLOYMENT_STAGE=prod`), since we want to generate a migration from the database schema currently deployed in prod. However, if there are migrations that haven't been deployed to prod yet, connect to staging instead.
- Autogenerate the migration using the steps below. `AWS_PROFILE` and `DEPLOYMENT_STAGE` should be the same values used in the previous Connect to Remote RDS step. For details about Alembic's migration autogeneration, see What does Autogenerate Detect (and what does it not detect?)

```shell
cd $REPO_ROOT/backend
AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE={dev,staging,prod} CORPORA_LOCAL_DEV=1 make db/new_migration_auto MESSAGE="purpose_of_migration"
```
- Follow How to perform a database migration starting from step 3 (i.e. editing the `upgrade()` and `downgrade()` functions).
The following steps will test that a migration script works on a local database using data downloaded from a deployed database.
- Open a new terminal and, using the same values for `AWS_PROFILE` and `DEPLOYMENT_STAGE`, download the remote dev database schema:

```shell
cd $REPO_ROOT/backend
AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE={dev,staging,prod} make db/dump OUTFILE=corpora_dev.sqlc
```

This will download the database to `$REPO_ROOT/backend/corpora_dev.sqlc`.
- The tunnel to dev should close automatically (but it is worth verifying with `ps ax | grep ssh`)
- Start the local database environment:

```shell
cd $REPO_ROOT
make local-start
```
- Import the remote database schema into your local database:

```shell
cd $REPO_ROOT/backend
make db/local/load-schema INFILE=corpora_dev.sqlc
```

where the `INFILE` parameter is the base name of the `.sqlc` file downloaded in the `make db/dump` step above. You may need to run this a few times, until there are no significant errors.
  - Note: `pg_restore: error: could not execute query: ERROR: role "rdsadmin" does not exist` is not a significant error
- Run the migration test:

```shell
AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE=test make db/test_migration
```

This test will:
- Dump the current schema (before)
- Apply the migration (upgrade)
- Roll back the migration (downgrade)
- Dump the schema (after)
- Compare the before vs after schemas. These should be identical if the database migration's `upgrade()` and `downgrade()` functions were implemented correctly.

If there are no differences, the test passed. If the test didn't pass, make adjustments to your migration script and restart from step 4. Repeat until there are no errors.
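The before/after comparison at the heart of this test can be illustrated as a plain text diff of the two schema dumps. This is a minimal standard-library sketch of the idea, not the actual `db/test_migration` recipe:

```python
import difflib

def schema_diff(before: str, after: str) -> list[str]:
    """Return unified-diff lines between two schema dumps; an empty list means
    upgrade() followed by downgrade() restored the schema exactly."""
    return list(difflib.unified_diff(
        before.splitlines(), after.splitlines(),
        fromfile="schema_before.sql", tofile="schema_after.sql", lineterm="",
    ))

# Identical dumps produce no diff lines: the migration round-trips cleanly.
assert schema_diff("CREATE TABLE a (id int);", "CREATE TABLE a (id int);") == []
```

Any non-empty result would indicate that `downgrade()` does not fully undo `upgrade()`.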
Enable local connection to the private RDS instance:

```shell
cd $REPO_ROOT/backend
AWS_PROFILE=single-cell-{dev,prod} DEPLOYMENT_STAGE={dev,staging,prod} make db/tunnel
```

This command opens an SSH tunnel from `localhost:5432` to the RDS connection endpoint via the bastion server. The local port `5432` is fixed and encoded in the DB connection string stored in AWS Secrets Manager in the secret named `corpora/backend/${DEPLOYMENT_STAGE}/database`.

- Note: Since the default PostgreSQL port is `5432`, the above command will conflict with a local PostgreSQL instance. To stop the local instance, run `make local-stop` from the `$REPO_ROOT` directory.
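Before opening the tunnel, you can check whether something (e.g. the local PostgreSQL container from `make local-start`) is already bound to port 5432. A minimal standard-library sketch:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP listener is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0

if port_in_use(5432):
    print("Port 5432 is taken; run `make local-stop` before `make db/tunnel`.")
```

This only checks whether the port answers; it cannot tell a local PostgreSQL apart from an already-open tunnel.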