Dry run migrations #55404
Labels
Feature:Saved Objects
impact:high
Addressing this issue will have a high level of impact on the quality/strength of our product.
loe:x-large
Extra Large Level of Effort
project:ResilientSavedObjectMigrations
Reduce Kibana upgrade failures by making saved object migrations more resilient
Team:Core
Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc
Introduce the option to perform a dry run migration to allow administrators to locate and fix potential migration failures without taking their existing Kibana node(s) offline.
In #58470 we tried to simulate the complete migration (including creating indices and aliases) to catch potential problems such as an oversharded cluster. However, that approach adds a lot of complexity, and it is hard to guarantee that it introduces no unwanted side effects.
A much simpler approach would be to perform only the read and transform steps of the migration. This makes no changes to aliases, indices or documents, so it's completely safe to run while an older version is still up and serving traffic. It doesn't guarantee that the full upgrade will succeed, but it will catch corrupt documents and transform function bugs ahead of time.
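As a rough illustration of the read-and-transform dry run, here is a minimal sketch. It is not Kibana's actual migration code: `SavedObjectDoc`, `TransformFn`, `DryRunFailure` and `dryRunTransforms` are hypothetical names, and the documents are assumed to have already been read into memory (a real implementation would presumably page through the index). Each transform runs on a copy and its result is discarded, so nothing is written back.

```typescript
// Hypothetical shapes for illustration only; not Kibana's real types.
interface SavedObjectDoc {
  id: string;
  type: string;
  attributes: Record<string, unknown>;
}

type TransformFn = (doc: SavedObjectDoc) => SavedObjectDoc;

interface DryRunFailure {
  id: string;
  type: string;
  reason: string;
}

// Runs each document through the transform registered for its type and
// records failures. It only reads: transformed results are thrown away.
function dryRunTransforms(
  docs: readonly SavedObjectDoc[],
  transforms: Record<string, TransformFn>
): DryRunFailure[] {
  const failures: DryRunFailure[] = [];
  for (const doc of docs) {
    const transform = transforms[doc.type];
    if (!transform) {
      continue; // no transform registered for this type, nothing to check
    }
    try {
      // Work on a JSON copy so a mutating transform cannot alter the input.
      const copy = JSON.parse(JSON.stringify(doc)) as SavedObjectDoc;
      transform(copy);
    } catch (err) {
      failures.push({
        id: doc.id,
        type: doc.type,
        reason: err instanceof Error ? err.message : String(err),
      });
    }
  }
  return failures;
}
```

An empty result means every document passed its transform. It says nothing about index or alias problems, which this kind of dry run does not exercise.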
We should also make this transform dry run part of the normal upgrade process. This way, before a new version adds any write blocks to old indices, it first verifies that all documents are valid and can be migrated. If a corrupt document or transform bug is encountered, a user can simply spin up their old Kibana version to roll back; no further action would be required.
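The ordering proposed here (dry run first, write blocks only afterwards) could be sketched roughly as follows. The `UpgradeSteps` interface and `runUpgrade` function are hypothetical and synchronous for brevity; the real steps would be asynchronous calls against Elasticsearch.

```typescript
// Hypothetical step callbacks; stand-ins for the real migration phases.
interface UpgradeSteps {
  dryRun: () => string[]; // returns one description per failing document
  addWriteBlock: () => void; // first step that changes the old index
  migrate: () => void; // the actual write phase
}

type UpgradeResult =
  | { status: "migrated" }
  | { status: "aborted"; failures: string[] };

// Runs the dry run before anything is modified. On failure it returns
// early, so the old index is left untouched and the old version can
// simply be started again.
function runUpgrade(steps: UpgradeSteps): UpgradeResult {
  const failures = steps.dryRun();
  if (failures.length > 0) {
    return { status: "aborted", failures };
  }
  steps.addWriteBlock();
  steps.migrate();
  return { status: "migrated" };
}
```

Because the early return happens before `addWriteBlock`, an aborted upgrade in this sketch leaves nothing to undo.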
Part of #52202