Script-based Migrations #101
Labels: tech-debt (related to future development costs)
Comments
enobayram added a commit that referenced this issue on Dec 5, 2022:
This commit extends the current schema migration with the transition release phase of the upcoming script-based schema migrations. The full transition plan can be found at: #101
enobayram added a commit that referenced this issue on Dec 7, 2022:
* Prepare for the script-based migrations

  This commit extends the current schema migration with the transition release phase of the upcoming script-based schema migrations. The full transition plan can be found at: #101

* Trim trailing whitespace
* Use System.Exit.die for migration failure
emmanueldenloye pushed a commit that referenced this issue on Jan 4, 2023:
This PR is a part of the incremental script-based migration transition (#101). It adds a `migrate` subcommand to `chainweb-data` that can be used to just run migrations and exit. This will be useful for the post-script-based-migration version: when that version encounters a database that's too old, it'll ask the user to download the latest `beam-automigrate`-based CW-D version (the upcoming release) and run it in this PR's `migrate` mode, so that the database can be used with future versions that know how to migrate it incrementally from the last `beam-automigrate`-based database state.
enobayram added a commit that referenced this issue on Mar 24, 2023:
This PR implements the second and last step of the transition to script-based migrations (#101). After this PR, new CW-D database migrations can be implemented by creating new scripts in the `haskell-src/db-schema/migrations` folder. The scripts in that folder must be named like `1.2.3.4_somename.sql`. This file name corresponds to the version components `[1,2,3,4]` along with the step name `somename.sql`. The version components can contain an arbitrary number of elements, so `1.2.3_othername.sql` is also valid.

The migration logic implemented by this PR aims to be fairly conservative in that it expects the existing migrations to be a perfect prefix of the incoming migrations, in the correct order. The order of the migrations is defined by the version components. The condition that the existing migrations need to be a prefix of all the desired migrations means that once a set of migrations has been run, we can only append new migrations, and those migrations have to have version components that are bigger than those of the existing migrations. The reason we're being this conservative is to avoid the very subtle issues that occasionally arise from migrations running in different orders in different deployments.

It's also worth noting that the `--migrations-folder` argument introduced by this PR is optional; when it is not provided, CW-D uses the set of migrations that gets embedded into the binary from the repository during compilation. The purpose is to avoid increasing the operational complexity of running CW-D from a compiled binary, since the set of migrations associated with a CW-D release is tightly coupled with the Haskell code that comes with it anyway.

Another point worth noting is that this migrations workflow also allows CW-D operators to interleave their own migrations with the official migrations that come with CW-D.
If the operator of a particular CW-D node wants to include additional migrations, they can do so by maintaining a `migrations` folder of their own and including the official CW-D migrations side by side with their own. In this setup, they need to name their own migration scripts with version numbers that are compatible with this migration workflow.

Resolves #101

* Implement script-based migrations
* Return the migration steps in matchRecursive
* Preorder steps and detect duplicates
* Fix warnings
* Base64 encode the migration checksum
* Simplify matchSteps
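The naming and ordering rules above can be pictured as a small parser plus a prefix check. This is a minimal sketch, not the actual chainweb-data code: `parseMigrationFile` and `orderMigrations` are hypothetical helper names (a `matchSteps` function is mentioned in the commit, but this is not its real implementation).

```haskell
import Data.Char (isDigit)
import Data.List (sortOn, stripPrefix)

-- Parse a script name like "1.2.3.4_somename.sql" into its version
-- components [1,2,3,4] and its step name "somename.sql".
parseMigrationFile :: String -> Maybe ([Int], String)
parseMigrationFile = go []
  where
    go acc s = case span isDigit s of
      ([], _)          -> Nothing                          -- must start with a digit
      (ds, '.' : rest) -> go (read ds : acc) rest          -- another version component follows
      (ds, '_' : rest) -> Just (reverse (read ds : acc), rest) -- rest is the step name
      _                -> Nothing                          -- malformed name

-- The desired migrations are ordered by their version components
-- (lists compare lexicographically, so [1,2,3] < [1,2,3,4]).
orderMigrations :: [([Int], String)] -> [([Int], String)]
orderMigrations = sortOn fst

-- The already-run migrations must be a perfect, in-order prefix of the
-- desired ones; the result is the list of steps that still need to run.
matchSteps :: Eq a => [a] -> [a] -> Maybe [a]
matchSteps existing desired = stripPrefix existing desired
```

Under this scheme, appending a script with a higher version is the only change `matchSteps` accepts; reordering or editing an already-run script makes the prefix check fail.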
enobayram added a commit that referenced this issue on Apr 12, 2023:
We've recently (#101) moved away from `beam-automigrate` for handling our DB migrations, but we've mostly kept the DB schema compatibility checks that come with it as a safety measure. One of the important benefits of #101 was the leniency it brought towards external DB schema manipulations that might be useful to `chainweb-data` operators. To that end, our DB migration check was already ignoring `TableRemoved` diffs so that new tables can be created in the DB without causing issues. This PR extends that leniency towards other types of objects `beam-automigrate` looks for. Their omission was an oversight.
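The leniency described can be pictured as filtering the schema diff before failing. This is a sketch only: `SchemaDiff` and its constructors here are a hypothetical stand-in, not beam-automigrate's actual diff type (the source only confirms that `TableRemoved`-style diffs are ignored).

```haskell
-- Hypothetical diff type: objects the operator added to the DB show up
-- as "removed" edits when diffing toward the schema the code expects.
data SchemaDiff
  = TableRemoved String          -- extra table in the DB
  | SequenceRemoved String       -- extra sequence in the DB
  | ColumnAdded String String    -- column the code expects but the DB lacks
  deriving (Eq, Show)

-- Extra objects in the DB are benign; operators may maintain their own
-- tables and sequences alongside the official schema.
isBenign :: SchemaDiff -> Bool
isBenign (TableRemoved _)    = True
isBenign (SequenceRemoved _) = True
isBenign _                   = False

-- Fail the compatibility check only on diffs that are not benign.
checkSchema :: [SchemaDiff] -> Either [SchemaDiff] ()
checkSchema diffs = case filter (not . isBenign) diffs of
  []  -> Right ()
  bad -> Left bad
```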
We're currently using `beam-automigrate` + a number of custom migration commands at `chainweb-data` start up for managing our Postgres schema. This combination has served us well so far, but going forward we're planning to use more and more advanced features of Postgres, like partitioned tables, triggers, etc., so it seems like we'd be better served by converting to the more conventional migration scheme of expressing the schema migrations as a sequence of SQL scripts to be executed at start up.

As part of this transition to the script-based migrations, we also need to consider the case of an existing `chainweb-data` deployment using a very old version of the `chainweb-data` binary. Here's how we're planning to perform this transition in a way that works for the existing `chainweb-data` deployments too:

- The transition release runs `beam-automigrate` the same as how we've been doing it so far, and then runs the `MigrationInitialization` command of postgresql-simple-migration (which seems like a good library to support this alternative way of managing schema migrations), marking the deployment as having gone through the "transition migration". I.e. when this command is run, we know that the DB schema is in the known transition state.
- We use `pg_dump` to take a snapshot of the Postgres DB schema as it appears at the transition version and manually go through the dump to make sure we don't have anything deployment-specific in there, so that it can be run on any empty Postgres database. We call the resulting SQL dump `init.sql`.
- After the transition release, `chainweb-data` performs the following logic:
  - (The `blocks` table doesn't exist) This is an entirely new deployment, so we run `init.sql` followed by the incremental migration scripts.
  - (The `blocks` table exists) We check whether `MigrationInitialization` has already been performed, i.e. we run the `MigrationValidation MigrationInitialization` `MigrationCommand`:
    - (`MigrationInitialization` was run) This is a properly transitioned deployment, so we just apply any missing migration scripts.
    - (`MigrationInitialization` was not run) This is a pre-transition deployment, so we exit with a message instructing the user to run the transition release.
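The branching logic above can be condensed into a pure decision function. This is an illustrative sketch, not the actual chainweb-data implementation: the two `Bool` inputs stand in for the live-database probes (the `blocks` table existence check, and the `MigrationValidation MigrationInitialization` run from postgresql-simple-migration), and all names here are hypothetical.

```haskell
-- What chainweb-data should do at start up, given two probes of the DB.
data StartupAction
  = RunInitSqlThenMigrations   -- fresh DB: run init.sql, then the incremental scripts
  | ApplyMissingMigrations     -- transitioned DB: apply any missing migration scripts
  | AskToRunTransitionRelease  -- pre-transition DB: exit, pointing at the transition release
  deriving (Eq, Show)

decideStartup
  :: Bool -- does the blocks table exist?
  -> Bool -- did MigrationValidation MigrationInitialization succeed?
  -> StartupAction
decideStartup False _     = RunInitSqlThenMigrations
decideStartup True  True  = ApplyMissingMigrations
decideStartup True  False = AskToRunTransitionRelease
```

Keeping the decision pure like this makes the start-up policy easy to test without a database; only the two probes touch Postgres.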