diff --git a/_config.yml b/_config.yml index 68b4b1395f..6b6821a907 100644 --- a/_config.yml +++ b/_config.yml @@ -31,9 +31,6 @@ collections: install-and-configure: permalink: /:collection/:path/ output: true - upgrade-to: - permalink: /:collection/:path/ - output: true im-plugin: permalink: /:collection/:path/ output: true @@ -94,6 +91,9 @@ collections: data-prepper: permalink: /:collection/:path/ output: true + migration-assistant: + permalink: /:collection/:path/ + output: true tools: permalink: /:collection/:path/ output: true @@ -137,11 +137,6 @@ opensearch_collection: install-and-configure: name: Install and upgrade nav_fold: true - upgrade-to: - name: Migrate to OpenSearch - # nav_exclude: true - nav_fold: true - # search_exclude: true im-plugin: name: Managing Indexes nav_fold: true @@ -213,6 +208,12 @@ clients_collection: name: Clients nav_fold: true +migration_assistant_collection: + collections: + migration-assistant: + name: Migration Assistant + nav_fold: true + benchmark_collection: collections: benchmark: @@ -252,6 +253,12 @@ defaults: values: section: "benchmark" section-name: "Benchmark" + - + scope: + path: "_migration-assistant" + values: + section: "migration-assistant" + section-name: "Migration Assistant" # Enable or disable the site search # By default, just-the-docs enables its JSON file-based search. We also have an OpenSearch-driven search functionality. diff --git a/_data/top_nav.yml b/_data/top_nav.yml index 51d8138680..6552d90359 100644 --- a/_data/top_nav.yml +++ b/_data/top_nav.yml @@ -63,6 +63,8 @@ items: url: /docs/latest/clients/ - label: Benchmark url: /docs/latest/benchmark/ + - label: Migration Assistant + url: /docs/latest/migration-assistant/ - label: Platform url: /platform/index.html children: diff --git a/_includes/cards.html b/_includes/cards.html index 6d958e61a5..5ab37b8c27 100644 --- a/_includes/cards.html +++ b/_includes/cards.html @@ -30,8 +30,14 @@

Measure performance metrics for your OpenSearch cluster

+ +
+ +

Migration Assistant

+

Migrate to OpenSearch from other platforms

+ +
- diff --git a/_layouts/default.html b/_layouts/default.html index d4d40d8cc4..7f2bf0a2a8 100755 --- a/_layouts/default.html +++ b/_layouts/default.html @@ -87,6 +87,8 @@ {% assign section = site.clients_collection.collections %} {% elsif page.section == "benchmark" %} {% assign section = site.benchmark_collection.collections %} + {% elsif page.section == "migration-assistant" %} + {% assign section = site.migration_assistant_collection.collections %} {% endif %} {% if section %} diff --git a/_migrations/deploying-migration-assistant/configuration-options.md b/_migration-assistant/deploying-migration-assistant/configuration-options.md similarity index 97% rename from _migrations/deploying-migration-assistant/configuration-options.md rename to _migration-assistant/deploying-migration-assistant/configuration-options.md index 2e7f43e1b5..f8bcf39ab6 100644 --- a/_migrations/deploying-migration-assistant/configuration-options.md +++ b/_migration-assistant/deploying-migration-assistant/configuration-options.md @@ -2,7 +2,7 @@ layout: default title: Configuration options nav_order: 15 -parent: Deploying migration assistant +parent: Deploying Migration Assistant --- # Configuration options @@ -61,6 +61,7 @@ The following CDK performs a backfill migrations using RFS: } } ``` +{% include copy.html %} Performing an RFS backfill migration requires an existing snapshot. @@ -104,6 +105,7 @@ The following sample CDK performs a live capture migration with C&R: } } ``` +{% include copy.html %} Performing a live capture migration requires that a Capture Proxy be configured to capture incoming traffic and send it to the target cluster using the Traffic Replayer service. For arguments available in `captureProxyExtraArgs`, refer to the `@Parameter` fields [here](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficCaptureProxyServer/src/main/java/org/opensearch/migrations/trafficcapture/proxyserver/CaptureProxy.java). 
For `trafficReplayerExtraArgs`, refer to the `@Parameter` fields [here](https://github.com/opensearch-project/opensearch-migrations/blob/main/TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/replay/TrafficReplayer.java). At a minimum, no extra arguments may be needed. @@ -125,17 +127,18 @@ Both the source and target cluster can use no authentication, authentication lim ### No authentication -``` +```json "sourceCluster": { "endpoint": , "version": "ES 7.10", "auth": {"type": "none"} } ``` +{% include copy.html %} ### Basic authentication -``` +```json "sourceCluster": { "endpoint": , "version": "ES 7.10", @@ -146,10 +149,11 @@ Both the source and target cluster can use no authentication, authentication lim } } ``` +{% include copy.html %} ### Signature Version 4 authentication -``` +```json "sourceCluster": { "endpoint": , "version": "ES 7.10", @@ -160,6 +164,7 @@ Both the source and target cluster can use no authentication, authentication lim } } ``` +{% include copy.html %} The `serviceSigningName` can be `es` for an Elasticsearch or OpenSearch domain, or `aoss` for an OpenSearch Serverless collection. @@ -167,4 +172,4 @@ All of these authentication options apply to both source and target clusters. ## Network configuration -The migration tooling expects the source cluster, target cluster, and migration resources to exist in the same VPC. If this is not the case, manual networking setup outside of this documentation is likely required. +The migration tooling expects the source cluster, target cluster, and migration resources to exist in the same VPC. If this is not the case, manual networking setup outside of this documentation is likely required. 
\ No newline at end of file diff --git a/_migrations/deploying-migration-assistant/iam-and-security-groups-for-existing-clusters.md b/_migration-assistant/deploying-migration-assistant/iam-and-security-groups-for-existing-clusters.md similarity index 91% rename from _migrations/deploying-migration-assistant/iam-and-security-groups-for-existing-clusters.md rename to _migration-assistant/deploying-migration-assistant/iam-and-security-groups-for-existing-clusters.md index 808de79689..331b99e1fa 100644 --- a/_migrations/deploying-migration-assistant/iam-and-security-groups-for-existing-clusters.md +++ b/_migration-assistant/deploying-migration-assistant/iam-and-security-groups-for-existing-clusters.md @@ -2,7 +2,7 @@ layout: default title: IAM and security groups for existing clusters nav_order: 20 -parent: Deploying migration assistant +parent: Deploying Migration Assistant --- # IAM and security groups for existing clusters @@ -33,7 +33,7 @@ For an OpenSearch Serverless Collection, you will need to configure both network The Collection should have a network policy that uses the `VPC` access type. This requires creating a VPC endpoint on the VPC used for the solution. The VPC endpoint should be configured for the private subnets of the VPC and should attach the `osClusterAccessSG` security group. 2. **Data Access Policy Configuration**: - The data access policy should grant permission to perform all [index operations](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-data-access.html#serverless-data-supported-permissions) ↗ (`aoss:*`) for all indexes in the Collection. The IAM task roles of the applicable Migration services (Traffic Replayer, Migration Console, Reindex-from-Snapshot) should be used as the principals for this data access policy. 
+ The data access policy should grant permission to perform all [index operations](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-data-access.html#serverless-data-supported-permissions) (`aoss:*`) for all indexes in the Collection. The IAM task roles of the applicable Migration services (Traffic Replayer, migration console, `Reindex-from-Snapshot`) should be used as the principals for this data access policy. ## Capture Proxy on Coordinator Nodes of Source Cluster @@ -68,7 +68,3 @@ Before [setting up Capture Proxy instances](https://github.com/opensearch-projec ] } ``` - -## Related Links - -- [OpenSearch traffic capture setup] \ No newline at end of file diff --git a/_migrations/deploying-migration-assistant/index.md b/_migration-assistant/deploying-migration-assistant/index.md similarity index 58% rename from _migrations/deploying-migration-assistant/index.md rename to _migration-assistant/deploying-migration-assistant/index.md index 6e245aa5da..1c559a81b1 100644 --- a/_migrations/deploying-migration-assistant/index.md +++ b/_migration-assistant/deploying-migration-assistant/index.md @@ -1,7 +1,11 @@ --- layout: default title: Deploying Migration Assistant -nav_order: 10 +nav_order: 15 +has_children: true +permalink: /deploying-migration-assistant/ +redirect_from: + - /deploying-migration-assistant/index/ --- # Deploying Migration Assistant diff --git a/_migrations/getting-started-data-migration.md b/_migration-assistant/getting-started-data-migration.md similarity index 86% rename from _migrations/getting-started-data-migration.md rename to _migration-assistant/getting-started-data-migration.md index 035ddae323..4110f29edf 100644 --- a/_migrations/getting-started-data-migration.md +++ b/_migration-assistant/getting-started-data-migration.md @@ -2,18 +2,20 @@ layout: default title: Getting started with data migration nav_order: 10 +redirect_from: + - /upgrade-to/upgrade-to/ + - /upgrade-to/snapshot-migrate/ --- # Getting started 
with data migration This quickstart outlines how to deploy Migration Assistant for OpenSearch and execute an existing data migration using `Reindex-from-Snapshot` (RFS). It uses AWS for illustrative purposes. However, the steps can be modified for use with other cloud providers. - ## Prerequisites and assumptions Before using this quickstart, make sure you fulfill the following prerequisites: -* Verify that your migration path [is supported](https://opensearch.org/docs/latest/migrations/is-migration-assistant-right-for-you/#supported-migration-paths). Note that we test with the exact versions specified, but you should be able to migrate data on alternative minor versions as long as the major version is supported. +* Verify that your migration path [is supported]({{site.url}}{{site.baseurl}}/migration-assistant/is-migration-assistant-right-for-you/#migration-paths). Note that we test with the exact versions specified, but you should be able to migrate data on alternative minor versions as long as the major version is supported. * The source cluster must be deployed Amazon Simple Storage Service (Amazon S3) plugin. * The target cluster must be deployed. @@ -27,7 +29,7 @@ The steps in this guide assume the following: --- -## Step 1: Installing Bootstrap on an Amazon EC2 instance (~10 minutes) +## Step 1: Install Bootstrap on an Amazon EC2 instance (~10 minutes) To begin your migration, use the following steps to install a `bootstrap` box on an Amazon Elastic Compute Cloud (Amazon EC2) instance. The instance uses AWS CloudFormation to create and manage the stack. @@ -41,7 +43,7 @@ To begin your migration, use the following steps to install a `bootstrap` box on --- -## Step 2: Setting up Bootstrap instance access (~5 minutes) +## Step 2: Set up Bootstrap instance access (~5 minutes) Use the following steps to set up Bootstrap instance access: @@ -63,12 +65,13 @@ Use the following steps to set up Bootstrap instance access: ] } ``` + {% include copy.html %} 3. 
Name the policy, for example, `SSM-OSMigrationBootstrapAccess`, and then create the policy by selecting **Create policy**. --- -## Step 3: Logging in to Bootstrap and building Migration Assistant (~15 minutes) +## Step 3: Log in to Bootstrap and build Migration Assistant (~15 minutes) Next, log in to Bootstrap and build Migration Assistant using the following steps. @@ -87,18 +90,20 @@ To use these steps, make sure you fulfill the following prerequisites: ```bash aws ssm start-session --document-name BootstrapShellDoc-- --target --region [--profile ] ``` + {% include copy.html %} 3. Once logged in, run the following command from the shell of the Bootstrap instance in the `/opensearch-migrations` directory: ```bash ./initBootstrap.sh && cd deployment/cdk/opensearch-service-migration ``` + {% include copy.html %} 4. After a successful build, note the path for infrastructure deployment, which will be used in the next step. --- -## Step 4: Configuring and deploying RFS (~20 minutes) +## Step 4: Configure and deploy RFS (~20 minutes) Use the following steps to configure and deploy RFS: @@ -134,6 +139,7 @@ Use the following steps to configure and deploy RFS: } } ``` + {% include copy.html %} The source and target cluster authorization can be configured to have no authorization, `basic` with a username and password, or `sigv4`. @@ -142,12 +148,14 @@ Use the following steps to configure and deploy RFS: ```bash cdk bootstrap --c contextId=migration-assistant --require-approval never ``` + {% include copy.html %} 4. Deploy the stacks: ```bash cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 5 ``` + {% include copy.html %} 5. Verify that all CloudFormation stacks were installed successfully. 
@@ -163,7 +171,7 @@ You will also need to give the `migrationconsole` and `reindexFromSnapshot` Task --- -## Step 5: Deploying Migration Assistant +## Step 5: Deploy Migration Assistant To deploy Migration Assistant, use the following steps: @@ -172,11 +180,14 @@ To deploy Migration Assistant, use the following steps: ```bash cdk bootstrap --c contextId=migration-assistant --require-approval never --concurrency 5 ``` + {% include copy.html %} + 2. Deploy the stacks when `cdk.context.json` is fully configured: ```bash cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 3 ``` + {% include copy.html %} These commands deploy the following stacks: @@ -186,13 +197,14 @@ These commands deploy the following stacks: --- -## Step 6: Accessing the migration console +## Step 6: Access the migration console Run the following command to access the migration console: ```bash ./accessContainer.sh migration-console dev ``` +{% include copy.html %} `accessContainer.sh` is located in `/opensearch-migrations/deployment/cdk/opensearch-service-migration/` on the Bootstrap instance. To learn more, see [Accessing the migration console]. @@ -200,17 +212,18 @@ Run the following command to access the migration console: --- -## Step 7: Verifying the connection to the source and target clusters +## Step 7: Verify the connection to the source and target clusters To verify the connection to the clusters, run the following command: ```bash console clusters connection-check ``` +{% include copy.html %} You should receive the following output: -``` +```bash * **Source Cluster:** Successfully connected! * **Target Cluster:** Successfully connected! ``` @@ -219,25 +232,28 @@ To learn more about migration console commands, see [Migration commands]. --- -## Step 8: Snapshot creation +## Step 8: Create a snapshot Run the following command to initiate snapshot creation from the source cluster: ```bash console snapshot create [...] 
``` +{% include copy.html %} To check the snapshot creation status, run the following command: ```bash console snapshot status [...] ``` +{% include copy.html %} To learn more information about the snapshot, run the following command: ```bash console snapshot status --deep-check [...] ``` +{% include copy.html %} Wait for snapshot creation to complete before moving to step 9. @@ -245,19 +261,20 @@ To learn more about snapshot creation, see [Snapshot Creation]. --- -## Step 9: Metadata migration +## Step 9: Migrate metadata Run the following command to migrate metadata: ```bash console metadata migrate [...] ``` +{% include copy.html %} -For more information, see [Migrating metadata]({{site.url}}{{site.baseurl}}/migrations/migration-phases/migrating-metadata/). +For more information, see [Migrating metadata]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrating-metadata/). --- -## Step 10: RFS document migration +## Step 10: Migrate documents with RFS You can now use RFS to migrate documents from your original cluster: @@ -266,26 +283,30 @@ You can now use RFS to migrate documents from your original cluster: ```bash console backfill start ``` + {% include copy.html %} 2. _(Optional)_ To speed up the migration, increase the number of documents processed at a simultaneously by using the following command: ```bash console backfill scale ``` + {% include copy.html %} 3. To check the status of the documentation backfill, use the following command: ```bash console backfill status ``` + {% include copy.html %} 4. If you need to stop the backfill process, use the following command: ```bash console backfill stop ``` + {% include copy.html %} -For more information, see [Backfill]({{site.url}}{{site.baseurl}}/migrations/migration-phases/backfill/). +For more information, see [Backfill]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/backfill/). 
--- @@ -296,6 +317,7 @@ Use the following command for detailed monitoring of the backfill process: ```bash console backfill status --deep-check ``` +{% include copy.html %} You should receive the following output: @@ -325,6 +347,7 @@ fields @message | sort @timestamp desc | limit 10000 ``` +{% include copy.html %} If any failed documents are identified, you can index the failed documents directly as opposed to using RFS. diff --git a/_migrations/index.md b/_migration-assistant/index.md similarity index 95% rename from _migrations/index.md rename to _migration-assistant/index.md index e3b2657e1a..f024fdb69c 100644 --- a/_migrations/index.md +++ b/_migration-assistant/index.md @@ -5,6 +5,11 @@ nav_order: 1 has_children: false nav_exclude: true has_toc: false +permalink: /migration-assistant/ +redirect_from: + - /migration-assistant/index/ + - /upgrade-to/index/ + - /upgrade-to/ --- # Migration Assistant for OpenSearch @@ -53,13 +58,13 @@ The Metadata migration tool integrated into the Migration CLI can be used indepe The destination cluster for migration or comparison in an A/B test. -### Architecture overview +## Architecture overview The Migration Assistant architecture is based on the use of an AWS Cloud infrastructure, but most tools are designed to be cloud independent. A local containerized version of this solution is also available. The design deployed in AWS is as follows: -![Migration architecture overview]({{site.url}}{{site.baseurl}}/images/migrations/migration-architecture-overview.svg) +![Migration architecture overview]({{site.url}}{{site.baseurl}}/images/migrations/migrations-architecture-overview.png) 1. Client traffic is directed to the existing cluster. 2. An Application Load Balancer with capture proxies relays traffic to a source while replicating data to Amazon Managed Streaming for Apache Kafka (Amazon MSK). 
diff --git a/_migrations/is-migration-assistant-right-for-you.md b/_migration-assistant/is-migration-assistant-right-for-you.md similarity index 98% rename from _migrations/is-migration-assistant-right-for-you.md rename to _migration-assistant/is-migration-assistant-right-for-you.md index 6a09e44206..e9c48e353d 100644 --- a/_migrations/is-migration-assistant-right-for-you.md +++ b/_migration-assistant/is-migration-assistant-right-for-you.md @@ -30,6 +30,7 @@ There are also tools available for migrating cluster configuration, templates, a {: .note} ### Supported source and target platforms + * Self-managed (hosted by cloud provider or on-premises) * AWS OpenSearch @@ -54,4 +55,4 @@ Before starting a migration, consider the scope of the components involved. The | **Index State Management (ISM) policies** | Expected in 2025 | Manually migrate using an API. | | **Elasticsearch Kibana dashboards** | Expected in 2025 | This tool is only needed when used to migrate Elasticsearch Kibana Dashboards to OpenSearch Dashboards. To start, export JSON files from Kibana and import them into OpenSearch Dashboards; before importing, use the [`dashboardsSanitizer`](https://github.com/opensearch-project/opensearch-migrations/tree/main/dashboardsSanitizer) tool on X-Pack visualizations like Canvas and Lens in Kibana Dashboards, as they may require recreation for compatibility with OpenSearch. | | **Security constructs** | No | Configure roles and permissions based on cloud provider recommendations. For example, if using AWS, leverage AWS Identity and Access Management (IAM) for enhanced security management. | -| **Plugins** | No | Check plugin compatibility; some Elasticsearch plugins may not have direct equivalents in OpenSearch. | +| **Plugins** | No | Check plugin compatibility; some Elasticsearch plugins may not have direct equivalents in OpenSearch. 
| \ No newline at end of file diff --git a/_migrations/migration-console/accessing-the-migration-console.md b/_migration-assistant/migration-console/accessing-the-migration-console.md similarity index 97% rename from _migrations/migration-console/accessing-the-migration-console.md rename to _migration-assistant/migration-console/accessing-the-migration-console.md index d6cf9ec150..ea66f5c04c 100644 --- a/_migrations/migration-console/accessing-the-migration-console.md +++ b/_migration-assistant/migration-console/accessing-the-migration-console.md @@ -16,6 +16,7 @@ export STAGE=dev export AWS_REGION=us-west-2 /opensearch-migrations/deployment/cdk/opensearch-service-migration/accessContainer.sh migration-console ${STAGE} ${AWS_REGION} ``` +{% include copy.html %} When opening the console a message will appear above the command prompt, `Welcome to the Migration Assistant Console`. @@ -29,6 +30,6 @@ export SERVICE_NAME=migration-console export TASK_ARN=$(aws ecs list-tasks --cluster migration-${STAGE}-ecs-cluster --family "migration-${STAGE}-${SERVICE_NAME}" | jq --raw-output '.taskArns[0]') aws ecs execute-command --cluster "migration-${STAGE}-ecs-cluster" --task "${TASK_ARN}" --container "${SERVICE_NAME}" --interactive --command "/bin/bash" ``` - +{% include copy.html %} Typically, `STAGE` is equivalent to a standard `dev` environment, but this may vary based on what the user specified during deployment. 
\ No newline at end of file diff --git a/_migrations/migration-console/index.md b/_migration-assistant/migration-console/index.md similarity index 85% rename from _migrations/migration-console/index.md rename to _migration-assistant/migration-console/index.md index 7ebac65836..3e08e72c5c 100644 --- a/_migrations/migration-console/index.md +++ b/_migration-assistant/migration-console/index.md @@ -3,8 +3,13 @@ layout: default title: Migration console nav_order: 30 has_children: true +permalink: /migration-console/ +redirect_from: + - /migration-console/index/ --- +# Migration console + The Migrations Assistant deployment includes an Amazon Elastic Container Service (Amazon ECS) task that hosts tools that run different phases of the migration and check the progress or results of the migration. This ECS task is called the **migration console**. The migration console is a command line interface used to interact with the deployed components of the solution. This section provides information about how to access the migration console and what commands are supported. diff --git a/_migrations/migration-console/migration-console-commands-references.md b/_migration-assistant/migration-console/migration-console-commands-references.md similarity index 93% rename from _migrations/migration-console/migration-console-commands-references.md rename to _migration-assistant/migration-console/migration-console-commands-references.md index 55731229e0..21d793b3f3 100644 --- a/_migrations/migration-console/migration-console-commands-references.md +++ b/_migration-assistant/migration-console/migration-console-commands-references.md @@ -1,11 +1,10 @@ --- layout: default title: Command reference -nav_order: 35 +nav_order: 40 parent: Migration console --- - # Migration console command reference Migration console commands follow this syntax: `console [component] [action]`. The components include `clusters`, `backfill`, `snapshot`, `metadata`, and `replay`. 
The console is configured with a registry of the deployed services and the source and target cluster, generated from the `cdk.context.json` values. @@ -21,6 +20,7 @@ Reports whether both the source and target clusters can be reached and provides ```sh console clusters connection-check ``` +{% include copy.html %} ### Run `cat-indices` @@ -29,6 +29,7 @@ Runs the `cat-indices` API on the cluster. ```sh console clusters cat-indices ``` +{% include copy.html %} ### Create a snapshot @@ -37,6 +38,7 @@ Creates a snapshot of the source cluster and stores it in a preconfigured Amazon ```sh console snapshot create ``` +{% include copy.html %} ## Check snapshot status @@ -45,6 +47,7 @@ Runs a detailed check on the snapshot creation status, including estimated compl ```sh console snapshot status --deep-check ``` +{% include copy.html %} ## Evaluate metadata @@ -53,6 +56,7 @@ Performs a dry run of metadata migration, showing which indexes, templates, and ```sh console metadata evaluate ``` +{% include copy.html %} ## Migrate metadata @@ -61,6 +65,7 @@ Migrates the metadata from the source cluster to the target cluster. ```sh console metadata migrate ``` +{% include copy.html %} ## Start a backfill @@ -72,6 +77,7 @@ There are similar `scale UNITS` and `stop` commands to change the number of acti ```sh console backfill start ``` +{% include copy.html %} ## Check backfill status @@ -86,6 +92,7 @@ The `stop` command stops all active instances. ```sh console replay start ``` +{% include copy.html %} ## Read logs @@ -94,9 +101,9 @@ Reads any logs that exist when running Traffic Replayer. 
Use tab completion on t ```sh console tuples show --in /shared-logs-output/traffic-replayer-default/[NODE_ID]/tuples/console.log | jq > readable_tuples.json ``` +{% include copy.html %} - -## Help command +## Help option All commands and options can be explored within the tool itself by using the `--help` option, either for the entire `console` application or for individual components (for example, `console backfill --help`). For example: @@ -121,4 +128,4 @@ Commands: replay Commands related to controlling the replayer. snapshot Commands to create and check status of snapshots of the... tuples All commands related to tuples. -``` \ No newline at end of file +``` diff --git a/_migration-assistant/migration-phases/assessing-your-cluster-for-migration.md b/_migration-assistant/migration-phases/assessing-your-cluster-for-migration.md new file mode 100644 index 0000000000..5ded49eb59 --- /dev/null +++ b/_migration-assistant/migration-phases/assessing-your-cluster-for-migration.md @@ -0,0 +1,48 @@ +--- +layout: default +title: Assessing your cluster for migration +nav_order: 60 +parent: Migration phases +--- + +# Assessing your cluster for migration + + +The goal of the Migration Assistant is to streamline the process of migrating from one location or version of Elasticsearch/OpenSearch to another. However, completing a migration sometimes requires resolving client compatibility issues before they can communicate directly with the target cluster. + +## Understanding breaking changes + +Before performing any upgrade or migration, you should review any documentation of breaking changes. 
Even if the cluster is migrated, there might be changes required for clients to connect to the new cluster. + + ## Upgrade and breaking changes guides + +For migration paths between Elasticsearch 6.8 and OpenSearch 2.x, users should be familiar with the documentation in the links below that applies to their specific case: + +* [Upgrading Amazon Service Domains](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/version-migration.html). + +* [Changes from Elasticsearch to OpenSearch fork](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/rename.html). + +* [OpenSearch Breaking Changes](https://opensearch.org/docs/latest/breaking-changes/). + +The next step is to set up a proper test bed to verify that your applications will work as expected on the target version. + +## Impact of data transformations + +You might apply transformations to your data, such as: + +- Changing index names +- Modifying field names or field mappings +- Splitting indices with type mappings + +These changes might need to be reflected in your client configurations. For example, if your clients are reliant on specific index or field names, you must ensure that their queries are updated accordingly. + +We recommend running production-like queries against the target cluster before switching over actual production traffic. This helps verify that the client can: + +- Communicate with the target cluster +- Locate the necessary indices and fields +- Retrieve the expected results + +For complex migrations involving multiple transformations or breaking changes, we highly recommend performing a trial migration with representative, non-production data (for example, in a staging environment) to fully test client compatibility with the target cluster. 
+ + + diff --git a/_migrations/migration-phases/backfill.md b/_migration-assistant/migration-phases/backfill.md similarity index 93% rename from _migrations/migration-phases/backfill.md rename to _migration-assistant/migration-phases/backfill.md index ccdbadd042..d2ff7cd873 100644 --- a/_migrations/migration-phases/backfill.md +++ b/_migration-assistant/migration-phases/backfill.md @@ -7,7 +7,7 @@ parent: Migration phases # Backfill -After the [metadata]({{site.url}}{{site.baseurl}}/migrations/migration-phases/migrating-metadata/) for your cluster has been migrated, you can use capture proxy data replication and snapshots to backfill your data into the next cluster. +After the [metadata]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrating-metadata/) for your cluster has been migrated, you can use capture proxy data replication and snapshots to backfill your data into the next cluster. ## Capture proxy data replication @@ -48,15 +48,16 @@ If you're only using backfill as your migration technique, make a client/DNS cha After you have routed the client based on your use case, test adding records against HTTP requests using the following steps: -1. In the migration console, run the following command: +In the migration console, run the following command: - ```shell + ```bash console kafka describe-topic-records ``` + {% include copy.html %} Note the records in the logging topic. -2. After a short period, execute the same command again and compare the increased number of records against the expected HTTP requests. +After a short period, execute the same command again and compare the increased number of records against the expected HTTP requests. 
## Creating a snapshot @@ -66,12 +67,14 @@ Create a snapshot for your backfill using the following command: ```bash console snapshot create ``` +{% include copy.html %} To check the progress of your snapshot, use the following command: ```bash console snapshot status --deep-check ``` +{% include copy.html %} Depending on the size of the data in the source cluster and the bandwidth allocated for snapshots, the process can take some time. Adjust the maximum rate at which the source cluster's nodes create the snapshot using the `--max-snapshot-rate-mb-per-node` option. Increasing the snapshot rate will consume more node resources, which may affect the cluster's ability to handle normal traffic. @@ -86,6 +89,7 @@ You can check the indexes and document counts of the source and target clusters ```shell console clusters cat-indices ``` +{% include copy.html %} You should receive the following response: @@ -106,6 +110,7 @@ Use the following command to start the backfill and deploy the workers: ```shell console backfill start ``` +{% include copy.html %} You should receive a response similar to the following: @@ -130,6 +135,7 @@ To speed up the transfer, you can scale the number of workers. It may take a few ```shell console backfill scale 5 ``` +{% include copy.html %} We recommend slowly scaling up the fleet while monitoring the health metrics of the target cluster to avoid over-saturating it. [Amazon OpenSearch Service domains](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/monitoring.html) provide a number of metrics and logs that can provide this insight. 
@@ -154,6 +160,7 @@ After the backfill is complete and the workers have stopped, examine the content ```shell console clusters cat-indices --refresh ``` +{% include copy.html %} This will display the number of documents in each of the indexes in the target cluster, as shown in the following example response: diff --git a/_migrations/migration-phases/index.md b/_migration-assistant/migration-phases/index.md similarity index 57% rename from _migrations/migration-phases/index.md rename to _migration-assistant/migration-phases/index.md index b637d4a28d..c3c6c14b07 100644 --- a/_migrations/migration-phases/index.md +++ b/_migration-assistant/migration-phases/index.md @@ -3,11 +3,14 @@ layout: default title: Migration phases nav_order: 50 has_children: true +permalink: /migration-phases/ +redirect_from: + - /migration-phases/index/ --- This page details how to conduct a migration with Migration Assistant. It encompasses a variety of scenarios including: -- [**Metadata migration**]({{site.url}}{{site.baseurl}}/migrations/migration-phases/migrating-metadata/): Migrating cluster metadata, such as index settings, aliases, and templates. -- [**Backfill migration**]({{site.url}}{{site.baseurl}}/migrations/migration-phases/backfill/): Migrating existing or historical data from a source to a target cluster. +- [**Metadata migration**]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/migrating-metadata/): Migrating cluster metadata, such as index settings, aliases, and templates. +- [**Backfill migration**]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/backfill/): Migrating existing or historical data from a source to a target cluster. - **Live traffic migration**: Replicating live ongoing traffic from a source to a target cluster. 
diff --git a/_migrations/migration-phases/migrating-metadata.md b/_migration-assistant/migration-phases/migrating-metadata.md similarity index 95% rename from _migrations/migration-phases/migrating-metadata.md rename to _migration-assistant/migration-phases/migrating-metadata.md index 2a4079ca3f..249a2ca4d0 100644 --- a/_migrations/migration-phases/migrating-metadata.md +++ b/_migration-assistant/migration-phases/migrating-metadata.md @@ -22,12 +22,14 @@ Create the initial snapshot of the source cluster using the following command: ```shell console snapshot create ``` +{% include copy.html %} To check the progress of the snapshot in real time, use the following command: ```shell console snapshot status --deep-check ``` +{% include copy.html %} You should receive the following response when the snapshot is created: @@ -49,28 +51,28 @@ Throughput: 38.13 MiB/sec Depending on the size of the data in the source cluster and the bandwidth allocated for snapshots, the process can take some time. Adjust the maximum rate at which the source cluster's nodes create the snapshot using the `--max-snapshot-rate-mb-per-node` option. Increasing the snapshot rate will consume more node resources, which may affect the cluster's ability to handle normal traffic. -## Command Arguments +## Command arguments For the following commands, to identify all valid arguments, please run with `--help`. ```shell console metadata evaluate --help ``` +{% include copy.html %} ```shell console metadata migrate --help ``` +{% include copy.html %} Based on the migration console deployment options, a number of commands will be pre-populated. To view them, run console with verbosity: ```shell console -v metadata migrate --help ``` +{% include copy.html %} -
- -Example "console -v metadata migrate --help" command output - +You should receive a response similar to the following: ```shell (.venv) bash-5.2# console -v metadata migrate --help @@ -92,6 +94,7 @@ By scanning the contents of the source cluster, applying filtering, and applying ```shell console metadata evaluate [...] ``` +{% include copy.html %} You should receive a response similar to the following: @@ -131,6 +134,7 @@ Running through the same data as the evaluate command all of the migrated items ```shell console metadata migrate [...] ``` +{% include copy.html %} You should receive a response similar to the following: @@ -162,7 +166,7 @@ Migrated Items: Results: 0 issue(s) detected ``` -
+ ## Metadata verification process @@ -172,19 +176,21 @@ Before moving on to additional migration steps, it is recommended to confirm det Use these instructions to help troubleshoot the following issues. -### Access detailed logs +### Accessing detailed logs Metadata migration creates a detailed log file that includes low level tracing information for troubleshooting. For each execution of the program a log file is created inside a shared volume on the migration console named `shared-logs-output` the following command will list all log files, one for each run of the command. ```shell ls -al /shared-logs-output/migration-console-default/*/metadata/ ``` +{% include copy.html %} To inspect the file within the console `cat`, `tail` and `grep` commands line tools. By looking for warnings, errors and exceptions in this log file can help understand the source of failures, or at the very least be useful for creating issues in this project. ```shell tail /shared-logs-output/migration-console-default/*/metadata/*.log ``` +{% include copy.html %} ### Warnings and errors @@ -207,6 +213,7 @@ As Metadata migration supports migrating from ES 6.8 on to the latest versions o **Example starting state with mapping type foo (ES 6):** + ```json { "mappings": [ @@ -221,8 +228,10 @@ As Metadata migration supports migrating from ES 6.8 on to the latest versions o ] } ``` +{% include copy.html %} **Example ending state with foo removed (ES 7):** + ```json { "mappings": { @@ -233,5 +242,6 @@ As Metadata migration supports migrating from ES 6.8 on to the latest versions o } } ``` +{% include copy.html %} For additional technical details, [view the mapping type removal source code](https://github.com/opensearch-project/opensearch-migrations/blob/main/transformation/src/main/java/org/opensearch/migrations/transformation/rules/IndexMappingTypeRemoval.java). 
diff --git a/_migrations/migration-phases/removing-migration-infrastructure.md b/_migration-assistant/migration-phases/removing-migration-infrastructure.md similarity index 92% rename from _migrations/migration-phases/removing-migration-infrastructure.md rename to _migration-assistant/migration-phases/removing-migration-infrastructure.md index 75413f25f0..656a8e1998 100644 --- a/_migrations/migration-phases/removing-migration-infrastructure.md +++ b/_migration-assistant/migration-phases/removing-migration-infrastructure.md @@ -7,13 +7,14 @@ parent: Migration phases # Removing migration infrastructure -After a migration is complete all resources should be removed except for the target cluster, and optionally your Cloudwatch Logs, and Replayer logs. +After a migration is complete all resources should be removed except for the target cluster, and optionally your Cloudwatch Logs, and Traffic Replayer logs. To remove all the CDK stack(s) which get created during a deployment you can execute a command similar to below within the CDK directory ```bash cdk destroy "*" --c contextId= ``` +{% include copy.html %} Follow the instructions on the command-line to remove the deployed resources from the AWS account. 
diff --git a/_migrations/migration-phases/switching-traffic-from-the-source-cluster.md b/_migration-assistant/migration-phases/switching-traffic-from-the-source-cluster.md similarity index 97% rename from _migrations/migration-phases/switching-traffic-from-the-source-cluster.md rename to _migration-assistant/migration-phases/switching-traffic-from-the-source-cluster.md index c0fe834943..c43580eef9 100644 --- a/_migrations/migration-phases/switching-traffic-from-the-source-cluster.md +++ b/_migration-assistant/migration-phases/switching-traffic-from-the-source-cluster.md @@ -13,7 +13,7 @@ After the source and target clusters are synchronized, traffic needs to be switc This page assumes that the following has occurred before making the switch: -- All client traffic is being routed through a switchover listener in the [MigrationAssistant Application Load Balancer]({{site.url}}{{site.baseurl}}/migrations/migration-phases/backfill/). +- All client traffic is being routed through a switchover listener in the [MigrationAssistant Application Load Balancer]({{site.url}}{{site.baseurl}}/migration-assistant/migration-phases/backfill/). - Client traffic has been verified as compatible with the target cluster. - The target cluster is in a good state to accept client traffic. - The target proxy service is deployed. 
diff --git a/_migrations/migration-phases/using-traffic-replayer.md b/_migration-assistant/migration-phases/using-traffic-replayer.md similarity index 97% rename from _migrations/migration-phases/using-traffic-replayer.md rename to _migration-assistant/migration-phases/using-traffic-replayer.md index 1c812be211..5b7af3c3f7 100644 --- a/_migrations/migration-phases/using-traffic-replayer.md +++ b/_migration-assistant/migration-phases/using-traffic-replayer.md @@ -17,7 +17,7 @@ For example, if a document was deleted after a snapshot was taken, starting Traf ## Configuration options -[Traffic Replayer settings]({{site.url}}{{site.baseurl}}/migrations/deploying-migration-assisstant/configuation-options/) are configured during the deployment of Migration Assistant. Make sure to set the authentication mode for Traffic Replayer so that it can properly communicate with the target cluster. For more information about different types of traffic that are handled by Traffic Replayer, see [limitations](#limitations). +[Traffic Replayer settings]({{site.url}}{{site.baseurl}}/migration-assistant/deploying-migration-assistant/configuration-options/) are configured during the deployment of Migration Assistant. Make sure to set the authentication mode for Traffic Replayer so that it can properly communicate with the target cluster. ## Using Traffic Replayer @@ -152,12 +152,13 @@ Suppose that a source request contains a `tagToExcise` element that needs to be The resulting request sent to the target will appear similar to the following: -```http +```bash PUT /oldStyleIndex/moreStuff HTTP/1.0 host: testhostname {"top":{"properties":{"field1":{"type":"text"},"field2":{"type":"keyword"}}}} ``` +{% include copy.html %} You can pass Base64-encoded transformation scripts using `--transformer-config-base64`. 
@@ -220,6 +221,7 @@ The following example log entry shows a `/_cat/indices?v` request sent to both t "numErrors": 0 } ``` +{% include copy.html %} ### Decoding log content diff --git a/_migrations/migration-phases/verifying-migration-tools.md b/_migration-assistant/migration-phases/verifying-migration-tools.md similarity index 96% rename from _migrations/migration-phases/verifying-migration-tools.md rename to _migration-assistant/migration-phases/verifying-migration-tools.md index 498ed50feb..77df2b4280 100644 --- a/_migrations/migration-phases/verifying-migration-tools.md +++ b/_migration-assistant/migration-phases/verifying-migration-tools.md @@ -9,7 +9,7 @@ parent: Migration phases Before using the Migration Assistant, take the following steps to verify that your cluster is ready for migration. -## Snapshot creation verification +## Verifying snapshot creation Verify that a snapshot can be created of your source cluster and used for metadata and backfill scenarios. @@ -28,6 +28,7 @@ Create an S3 bucket for the snapshot using the following AWS Command Line Interf ```shell aws s3api create-bucket --bucket --region ``` +{% include copy.html %} Register a new S3 snapshot repository on your source cluster using the following cURL command: @@ -40,6 +41,7 @@ curl -X PUT "http://:9200/_snapshot/test_s3_repository" -H } }' ``` +{% include copy.html %} Next, create a test snapshot that captures only the cluster's metadata: @@ -50,6 +52,7 @@ curl -X PUT "http://:9200/_snapshot/test_s3_repository/test "include_global_state": true }' ``` +{% include copy.html %} Check the AWS Management Console to confirm that your bucket contains the snapshot. 
@@ -62,12 +65,14 @@ To remove the resources created during verification, you can use the following d ```shell curl -X DELETE "http://:9200/_snapshot/test_s3_repository/test_snapshot_1?pretty" ``` +{% include copy.html %} **Test snapshot repository** ```shell curl -X DELETE "http://:9200/_snapshot/test_s3_repository?pretty" ``` +{% include copy.html %} **S3 bucket** @@ -75,6 +80,7 @@ curl -X DELETE "http://:9200/_snapshot/test_s3_repository?p aws s3 rm s3:// --recursive aws s3api delete-bucket --bucket --region ``` +{% include copy.html %} ### Troubleshooting @@ -162,7 +168,7 @@ Look for failing tasks by navigating to **Traffic Capture Proxy ECS**. Change ** After all verifications are complete, reset all resources before using Migration Assistant for an actual migration. -The following steps outline how to reset resources with Migration Assistant before executing the actual migration. At this point all verifications are expected to have been completed. These steps can be performed after [Accessing the Migration Console]({{site.url}}{{site.baseurl}}/migrations/migration-console/accessing-the-migration-console/). +The following steps outline how to reset resources with Migration Assistant before executing the actual migration. At this point all verifications are expected to have been completed. These steps can be performed after [Accessing the Migration Console]({{site.url}}{{site.baseurl}}/migration-assistant/migration-console/accessing-the-migration-console/). 
### Traffic Replayer @@ -171,6 +177,7 @@ To stop running Traffic Replayer, use the following command: ```bash console replay stop ``` +{% include copy.html %} ### Kafka @@ -182,6 +189,7 @@ This command will result in the loss of any traffic data captured by the capture ```bash console kafka delete-topic ``` +{% include copy.html %} ### Target cluster @@ -193,4 +201,5 @@ This command will result in the loss of all data in the target cluster and shoul ```bash console clusters clear-indices --cluster target ``` +{% include copy.html %} diff --git a/_migrations/migration-phases/assessing-your-cluster-for-migration.md b/_migrations/migration-phases/assessing-your-cluster-for-migration.md deleted file mode 100644 index d056754555..0000000000 --- a/_migrations/migration-phases/assessing-your-cluster-for-migration.md +++ /dev/null @@ -1,44 +0,0 @@ ---- -layout: default -title: Assessing your cluster for migration -nav_order: 60 -has_children: true -parent: Migration phases ---- - -# Assessing your cluster for migration - -The goal of Migration Assistant is to streamline the process of migrating from one location or version of Elasticsearch/OpenSearch to another. However, completing a migration sometimes requires resolving client compatibility issues before they can communicate directly with the target cluster. - -## Understanding breaking changes - -Before performing any upgrade or migration, you should review any breaking changes documentation. Even if the cluster is migrated, there may be changes required in order for clients to connect to the new cluster. - -## Upgrade and breaking changes guides - -For migration paths between Elasticsearch 6.8 and OpenSearch 2.x, you should be familiar with the following documentation, depending on your specific use case: - -* [Upgrading Amazon OpenSearch Service domains](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/version-migration.html). 
- -* [Amazon OpenSearch Service rename - Summary of changes](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/rename.html). - -* [OpenSearch breaking changes](https://opensearch.org/docs/latest/breaking-changes/). - -The next step is to set up a proper test bed to verify that your applications will work as expected on the target version. - -## Impact of data transformations - -Any time you apply a transformation to your data, such as changing index names, modifying field names or field mappings, or splitting indexes with type mappings, these changes may need to be reflected in your client configurations. For example, if your clients are reliant on specific index or field names, you must ensure that their queries are updated accordingly. - - - -We recommend running production-like queries against the target cluster before switching to actual production traffic. This helps verify that the client can: - -- Communicate with the target cluster. -- Locate the necessary indexes and fields. -- Retrieve the expected results. - -For complex migrations involving multiple transformations or breaking changes, we highly recommend performing a trial migration with representative, non-production data (for example, in a staging environment) to fully test client compatibility with the target cluster. - - - diff --git a/_migrations/other-helpful-pages/load-sample-data-into-cluster.md b/_migrations/other-helpful-pages/load-sample-data-into-cluster.md deleted file mode 100644 index 6dbd9fdcb2..0000000000 --- a/_migrations/other-helpful-pages/load-sample-data-into-cluster.md +++ /dev/null @@ -1,144 +0,0 @@ - -This guide demonstrates how to quickly load test data into an Elasticsearch or OpenSearch source cluster using AWS Glue and the AWS Open Dataset library. We'll walk through indexing Bitcoin transaction data on the source cluster. 
For more details, refer to [the official AWS documentation on setting up Glue connections to OpenSearch](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect-opensearch-home.html) ↗. - -## 1. Create Your Source Cluster - -Create your source cluster using the method of your choice, keeping in mind the following requirements: - -* We will use basic authentication (username/password) for access control. In this example, we are using Elasticsearch 7.10, but earlier versions of Elasticsearch and OpenSearch 1.X and 2.X are also supported. -* The source cluster must be in a VPC you control and have access to, enabling AWS Glue to send data to it. - -## 2. Create the Access Secret in Secrets Manager - -Create a secret in AWS Secrets Manager that provides AWS Glue with access to the source cluster’s basic authentication credentials: - -1. Navigate to the AWS Secrets Manager Console. -2. Create a generic secret. Name it as you prefer and configure replication/rotation as needed. - -Key fields: - -* `opensearch.net.http.auth.user`: The username for accessing the source cluster. -* `opensearch.net.http.auth.pass`: The password for accessing the source cluster. - -
- -Example Access Secrets - - -![Screenshot](https://github.com/user-attachments/assets/dde7e343-4a9c-4f0b-af6d-e7048ecd1b14) -
- -## 3. Create an AWS Glue IAM Role - -Create an IAM Role that grants AWS Glue the necessary permissions: - -1. Navigate to the IAM Console and create a new IAM Role. -2. Use the following trust policy to allow the AWS Glue service to assume the role: - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Principal": { - "Service": "glue.amazonaws.com" - }, - "Action": "sts:AssumeRole" - } - ] -} -``` - -3. Attach the following permission sets: - * `AWSGlueServiceRole` - * `AmazonS3ReadOnlyAccess` - -4. Grant the role access to the secret created in step 2 by adding an inline policy like this: - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "secretsmanager:GetSecretValue", - "Resource": "arn:aws:secretsmanager:us-east-1:XXXXXXXXXXXX:secret:migration-assistant-source-cluster-creds-YDtnmx" - } - ] -} -``` - -## 4. Create an AWS Glue Connection - -Create a Glue Connection to provide AWS Glue with access to the source cluster: - -1. Navigate to the AWS Glue Console. -2. Create a new connection of the **Amazon OpenSearch Service** type. -3. Fill in your source cluster’s details (including VPC, subnet, and security group) and use the secret created earlier. - -
- -Example Glue Connection Configuration - - -![Screenshot](https://github.com/user-attachments/assets/b5978b2e-de58-4d46-ad47-ac960e729b89) -
- -## 5. (Optional) Examine the Source Dataset - -The sample dataset we'll use is the [AWS Public Blockchain dataset](https://registry.opendata.aws/aws-public-blockchain/) ↗, which is available for free. More information can be found [in this blog post](https://aws.amazon.com/blogs/database/access-bitcoin-and-ethereum-open-datasets-for-cross-chain-analytics/) ↗, and you can browse its contents in S3 [here](https://us-east-2.console.aws.amazon.com/s3/buckets/aws-public-blockchain) ↗. - -The Bitcoin transaction data we'll load into our source cluster is located at the S3 URI: `s3://aws-public-blockchain/v1.0/btc/transactions/`. - -## 6. Create the AWS Glue Job - -Now, create a Glue Job in the AWS Glue Console using the connection you created earlier. - -### S3 Source - -1. Set the S3 URI to `s3://aws-public-blockchain/v1.0/btc/transactions`. -2. Enable recursive reading of the bucket's contents. -3. The data format is Parquet. - -
- -Example Glue Connection - - -![Screenshot](https://github.com/user-attachments/assets/6fc4c0da-45b9-4c09-ba73-1619f59c9dd3) -
- -### OpenSearch Target - -1. Select the AWS Glue Connection you created. -2. Specify the index name where the Bitcoin transaction data will be stored. - -
- -Example Data Sink - - -![Screenshot](https://github.com/user-attachments/assets/264d0d17-f7f4-4c07-8567-6cae47c3ccd1) -
- -### Pre-Configure the Index Settings - -This is an optional step. By default, the Glue Job creates a single-shard index. Since the dataset is approximately 1 TB in size, it's recommended to pre-create the index with multiple shards. Follow this example to create an index with 40 shards: - -```bash -curl -u : -X PUT "http://:9200/bitcoin-data" -H 'Content-Type: application/json' -d' -{ - "settings": { - "number_of_shards": 40, - "number_of_replicas": 1 - } -} -' -``` - -You can also adjust any additional index settings at this time. - -## 7. Run the Glue Job - -Once the Glue source and target are configured, run the job in the AWS Console by clicking the **Run** button. You can monitor the job’s progress under the **Runs** tab in the console. \ No newline at end of file diff --git a/_migrations/other-helpful-pages/migration-timelines.md b/_migrations/other-helpful-pages/migration-timelines.md deleted file mode 100644 index 51d4302904..0000000000 --- a/_migrations/other-helpful-pages/migration-timelines.md +++ /dev/null @@ -1,105 +0,0 @@ -There is no *one-size-fits-most* migration strategy, this guide seeks to describe possible sample scenario(s) with the goal of helping customers plan their own migration strategy and estimate costs accordingly. - -## 15 Day Historical and Live Migration - -Key phases: - -1. Setup, Planning, and Verification (Days 1-5) -1. Historical backfill, Catchup, and Validation (Days 6-10) -1. 
Final Validation, Traffic Switchover, and Teardown (Days 11-15) - -### Timeline - -```mermaid -%%{ - init: { - "gantt": { - "fontSize": 20, - "barHeight": 40, - "sectionFontSize": 24, - "leftPadding": 175 - } - } -}%% -gantt - dateFormat D HH - axisFormat Day %d - todayMarker off - tickInterval 1day - - section Steps - Setup and Verification : prep, 1 00, 5d - Clear Test Environment : milestone, clear, after prep, 0d - Traffic Capture : traffic_capture, after clear, 6d - Snapshot : snapshot, after clear, 1d - Scale Up Target Cluster for Backfill : backfill_scale, 6 22, 2h - Metadata Migration : metadata, after snapshot, 1h - Reindex from Snapshot : rfs, after metadata, 71h - Scale Down Target Cluster for Replay : replay_scale, after rfs, 2h - Traffic Replay: replay, after replay_scale, 46h - Traffic Switchover : milestone, switchover, after replay, 0d - Validation : validation, after snapshot, 7d - Scale Down Target Cluster : 11 00, 2h - Teardown : teardown, 14 00, 2d -``` - -#### Explanation of Scaling Operations - -This section assumes a customer chooses to deliberatly scale their target cluster for backfill and/or replay to enable a faster and/or cheaper overall migration. In the absence of this, backfill and replay steps may take much longer (likely increasing overall cost). - -This plan assumes we can replay 6 days of captured data in under 2 days in order for the source and target clusters to be in sync. Take an example of a source cluster operating at avg. 90% CPU utilization to handle reads/writes from application code, it's improbable that a target cluster with the same scale and configuration will be able to support a request throughput of at least 3x in order to catchup in the given time. The same holds for backfill for write-heavy clusters or clusters where data has accumulated for a long time period, to follow this plan, the target cluster should be scaled such that it can ingest/index all the source data in under 3 days. - - -1. 
**Scale Up Target Cluster for Backfill**: Occurs after metadata migration and before reindexing. The target cluster is scaled up to handle the resource-intensive reindexing process faster. - - -2. **Scale Down Target Cluster for Replay**: Once the reindexing is complete, the target cluster is scaled down to a more appropriate size for the traffic replay phase. While still provisioned higher than normal production workloads, given replayer has a >1 speedup factor. - -3. **Scale Down Target Cluster**: After the validation phase, the target cluster is scaled down to its final operational size. This step ensures that the cluster is rightsized for normal production workloads, balancing performance needs with cost-efficiency. - -### Component Durations - -This component duration breakdown is useful for identifying the cost of resources deployed during the migration process. It provides a clear overview of how long each component is active or retained, which directly impacts resource utilization and associated costs. - -Note: Duration excludes weekends. If actual timeline extends over weekends, duration (and potentially costs) will increase. 
- -```mermaid -%%{ - init: { - "gantt": { - "fontSize": 20, - "barHeight": 40, - "sectionFontSize": 24, - "leftPadding": 175 - } - } -}%% -gantt - dateFormat D HH - axisFormat Day %d - todayMarker off - tickInterval 1day - - section Services - Core Services Runtime (15d) : active, 1 00, 15d - Capture Proxy Runtime (6d) : active, capture_active, 6 00, 6d - Capture Data Retention (4d) : after capture_active, 4d - Snapshot Runtime (1d) : active, snapshot_active, 6 00, 1d - Snapshot Retention (9d) : after snapshot_active, 9d - Reindex from Snapshot Runtime (3d) : active, historic_active, 7 01, 71h - Replayer Runtime (2d) : active, replayer_active, after historic_active, 2d - Replayer Data Retention (4d) : after replayer_active, 4d - Target Proxy Runtime (4d) : active, after replayer_active, 4d -``` - -| Component | Duration | -|-----------------------------------|----------| -| Core Services Runtime | 15d | -| Capture Proxy Runtime | 6d | -| Capture Data Retention | 4d | -| Snapshot Runtime | 1d | -| Snapshot Retention | 9d | -| Reindex from Snapshot Runtime | 3d | -| Replayer Runtime | 2d | -| Replayer Data Retention | 4d | -| Target Proxy Runtime | 4d | diff --git a/_migrations/other-helpful-pages/provisioning-source-cluster-for-testing.md b/_migrations/other-helpful-pages/provisioning-source-cluster-for-testing.md deleted file mode 100644 index 69b4e930e6..0000000000 --- a/_migrations/other-helpful-pages/provisioning-source-cluster-for-testing.md +++ /dev/null @@ -1,91 +0,0 @@ - -This guide walks you through the steps to provision an Elasticsearch cluster on EC2 using AWS CDK. The CDK that provisions this cluster can be found on the `migration-es` branch of the `opensearch-cluster-cdk` GitHub [forked repository](https://github.com/lewijacn/opensearch-cluster-cdk/tree/migration-es). - -TODO ^ lewijacn seems like it should be updated? - -## 1. 
Clone the Repository for Source Cluster CDK - -```bash -git clone https://github.com/lewijacn/opensearch-cluster-cdk.git -cd opensearch-cluster-cdk -git checkout migration-es -``` - -## 2. Install NPM Dependencies - -```bash -npm install -``` - -## 3. Configure AWS Credentials - -Configure the desired [AWS credentials](https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html#getting_started_prerequisites) ↗ for the environment, as these will dictate the region and account used for deployment. - -## 4. Configure Cluster Options - -The configuration below sets up a single-node Elasticsearch 7.10.2 cluster on EC2 and a VPC to host the cluster. Alternatively, you can specify an existing VPC by providing the `vpcId` parameter. The setup includes an internal load balancer, which should be used when interacting with the cluster. - -Copy and paste the following configuration into a `cdk.context.json` file at the root of the repository. Replace the `` placeholders with the desired deployment stage, e.g., `dev`. - -```json -{ - "source-single-node-ec2": { - "suffix": "ec2-source-", - "networkStackSuffix": "ec2-source-", - "distVersion": "7.10.2", - "cidr": "12.0.0.0/16", - "distributionUrl": "https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-oss-7.10.2-linux-x86_64.tar.gz", - "captureProxyEnabled": false, - "securityDisabled": true, - "minDistribution": false, - "cpuArch": "x64", - "isInternal": true, - "singleNodeCluster": true, - "networkAvailabilityZones": 2, - "dataNodeCount": 1, - "managerNodeCount": 0, - "serverAccessType": "ipv4", - "restrictServerAccessTo": "0.0.0.0/0" - } -} -``` - -> **Note:** You can specify other versions of Elasticsearch or OpenSearch by modifying the `distributionUrl` parameter. - -## 5. Bootstrap CDK in the Region (If Needed) - -If this is the first time you're deploying CDK in the region, you'll need to run the following command. **Note:** This only needs to be done once per region. 
- -```bash -cdk bootstrap --c contextId=source-single-node-ec2 --c contextFile=cdk.context.json -``` - -## 6. Deploy CloudFormation Stacks with CDK - -Deploy the infrastructure using the following command: - -```bash -cdk deploy "*" --c contextId=source-single-node-ec2 --c contextFile=cdk.context.json -``` - -Once the deployment is complete, the CDK will output the internal load balancer endpoint, which can be used within the VPC to interact with the Elasticsearch cluster: - -```bash -# Stack output -opensearch-infra-stack-ec2-source-dev.loadbalancerurl = opense-clust-owiejfo2345-sdfljsd.elb.us-east-1.amazonaws.com - -# Example curl command within the VPC -curl http://opense-clust-owiejfo2345-sdfljsd.elb.us-east-1.amazonaws.com:9200 -``` - -## 7. Clean Up Resources - -When you are done using the provisioned source cluster, you can delete the resources by running the following command: - -```bash -cdk destroy "*" --c contextId=source-single-node-ec2 --c contextFile=cdk.context.json -``` - -For a full list of options, refer to the CDK options in the [repository documentation](https://github.com/lewijacn/opensearch-cluster-cdk/tree/migration-es?tab=readme-ov-file#required-context-parameters). - -^ TODO: Are we advertising a fork? Seems like this should be fixed up \ No newline at end of file diff --git a/_migrations/quick-start-data-migration.md b/_migrations/quick-start-data-migration.md deleted file mode 100644 index 62b13292e7..0000000000 --- a/_migrations/quick-start-data-migration.md +++ /dev/null @@ -1,262 +0,0 @@ ---- -layout: default -title: Quickstart - Data migration -nav_order: 10 ---- - -# Quickstart - Data migration - -This document outlines how to deploy the Migration Assistant and execute an existing data migration using Reindex-from-Snapshot (RFS). Note that this does not include steps for deploying and capturing live traffic, which is necessary for a zero-downtime migration. 
Please refer to the "Phases of a Migration" section in the wiki navigation bar for a complete end-to-end migration process, including metadata migration, live capture, Reindex-from-Snapshot, and replay. - -## Prerequisites and Assumptions -* Verify your migration path [is supported](https://github.com/opensearch-project/opensearch-migrations/wiki/Is-Migration-Assistant-Right-for-You%3F#supported-migration-paths). Note that we test with the exact versions specified, but you should be able to migrate data on alternative minor versions as long as the major version is supported. -* Source cluster must be deployed with the S3 plugin. -* Target cluster must be deployed. -* A snapshot will be taken and stored in S3 in this guide, and the following assumptions are made about this snapshot: - * The `_source` flag is enabled on all indices to be migrated. - * The snapshot includes the global cluster state (`include_global_state` is `true`). - * Shard sizes up to approximately 80GB are supported. Larger shards will not be able to migrate. If this is a blocker, please consult the migrations team. -* Migration Assistant will be installed in the same region and have access to both the source snapshot and target cluster. - ---- - -## Step 1 - Installing Bootstrap EC2 Instance (~10 mins) -1. Log into the target AWS account where you want to deploy the Migration Assistant. -2. From the browser where you are logged into your target AWS account right-click [here](https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?templateURL=https://solutions-reference.s3.amazonaws.com/migration-assistant-for-amazon-opensearch-service/latest/migration-assistant-for-amazon-opensearch-service.template&redirectId=SolutionWeb) ↗ to load the CloudFormation (Cfn) template from a new browser tab. -3. 
Follow the CloudFormation stack wizard: - * **Stack Name:** `MigrationBootstrap` - * **Stage Name:** `dev` - * Hit **Next** on each step, acknowledge on the fourth screen, and hit **Submit**. -4. Verify that the bootstrap stack exists and is set to `CREATE_COMPLETE`. This process takes around 10 minutes. - ---- - -## Step 2 - Setup Bootstrap Instance Access (~5 mins) -1. After deployment, find the EC2 instance ID for the `bootstrap-dev-instance`. -2. Create an IAM policy using the snippet below, replacing ``, ``, ``, and ``: - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "ssm:StartSession", - "Resource": [ - "arn:aws:ec2:::instance/", - "arn:aws:ssm:::document/SSM--BootstrapShell" - ] - } - ] -} -``` - -3. Name the policy, e.g., `SSM-OSMigrationBootstrapAccess`, and create the policy. - ---- - -## Step 3 - Login to Bootstrap and Build (~15 mins) -### Prerequisites: -* AWS CLI and AWS Session Manager Plugin installed. -* AWS credentials configured (`aws configure`). - -1. Load AWS credentials into your terminal. -2. Login to the instance using the command below, replacing `` and ``: -```bash -aws ssm start-session --document-name SSM-dev-BootstrapShell --target --region [--profile ] -``` -3. Once logged in, run the following command from the shell of the bootstrap instance (within the /opensearch-migrations directory): -```bash -./initBootstrap.sh && cd deployment/cdk/opensearch-service-migration -``` -4. After a successful build, remember the path for infrastructure deployment in the next step. - ---- - -## Step 4 - Configuring and Deploying for RFS Use Case (~20 mins) -1. Add the target cluster password to AWS Secrets Manager as an unstructured string. Be sure to copy the secret ARN for use during deployment. -2. 
From the same shell on the bootstrap instance, modify the cdk.context.json file located in the `/opensearch-migrations/deployment/cdk/opensearch-service-migration` directory: - -```json -{ - "migration-assistant": { - "vpcId": "", - "targetCluster": { - "endpoint": "", - "auth": { - "type": "basic", - "username": "", - "passwordFromSecretArn": "" - } - }, - "sourceCluster": { - "endpoint": "", - "auth": { - "type": "basic", - "username": "", - "passwordFromSecretArn": "" - } - }, - "reindexFromSnapshotExtraArgs": "", - "stage": "dev", - "otelCollectorEnabled": true, - "migrationConsoleServiceEnabled": true, - "reindexFromSnapshotServiceEnabled": true, - "migrationAssistanceEnabled": true - } -} -``` - -The source and target cluster authorization can be configured to have none, `basic` with a username and password, or `sigv4`. There are examples of each available [here](https://github.com/opensearch-project/opensearch-migrations/wiki/Configuration-Options#cluster-authentication-options). - -3. Bootstrap the account with the following command: -```bash -cdk bootstrap --c contextId=migration-assistant --require-approval never -``` -4. Deploy the stacks: -```bash -cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 5 -``` -5. Verify that all CloudFormation stacks were installed successfully. - -#### ReindexFromSnapshot Parameters -* If you're creating a snapshot using migration tooling, these parameters are auto-configured. If you're using an existing snapshot, modify `reindexFromSnapshotExtraArgs` with the following values: -```bash ---s3-repo-uri s3:/// --s3-region --snapshot-name -``` -Note, you will also need to give access to the migrationconsole and reindexFromSnapshot taskRole permissions to the bucket - ---- - -## Step 5 - Deploying the Migration Assistant -1. Bootstrap the account: -```bash -cdk bootstrap --c contextId=migration-assistant --require-approval never --concurrency 5 -``` -2. 
Deploy the stacks when `cdk.context.json` is fully configured: -```bash -cdk deploy "*" --c contextId=migration-assistant --require-approval never --concurrency 3 -``` - -### Stacks Deployed: -* Migration Assistant Network stack -* Reindex From Snapshot stack -* Migration Console stack - ---- - -## Step 6 - Accessing the Migration Console -Run the following command to access the migration console: -```bash -./accessContainer.sh migration-console dev -``` ->[!NOTE] ->`accessContainer.sh` is located in `/opensearch-migrations/deployment/cdk/opensearch-service-migration/` on the bootstrap instance. - -_Learn more [[Accessing the Migration Console]]_ - ---- - -## Step 7 - Checking Connection to Source & Target Clusters -To verify the connection to the clusters, run: -```bash -console clusters connection-check -``` - -### Expected Output: -* **Source Cluster:** Successfully connected! -* **Target Cluster:** Successfully connected! - -_Learn more [[Console commands reference|Migration-Console-commands-references]]_ - ---- - -## Step 8 - Snapshot Creation -Run the following to initiate creating a snapshot from the source cluster -``` -console snapshot create [...] -``` - -To check on the progress, -``` -console snapshot status [...] -``` -or, for more detail, -``` -console snapshot status --deep-check [...] -``` - -Wait for the snapshot to complete before moving to the next step. - -_Learn more [[Snapshot Creation Verification]] [[Snapshot Creation]]_ - ---- - -## Step 9 - Metadata Migration -Run the following command to migrate metadata: -```bash -console metadata migrate [...] 
-``` - -_Learn more [[Metadata Migration]]_ - ---- - -## Step 10 - RFS Document Migration -Start the backfill process: -```bash -console backfill start -``` - -Scale up the number of workers: -```bash -console backfill scale -``` - -Check the status: -```bash -console backfill status -``` - -To stop the workers: -```bash -console backfill stop -``` - -_Learn more [[Backfill Execution]]_ - ---- - -## Step 11 - Monitoring -Use the following command for detailed monitoring: -```bash -console backfill status --deep-check -``` - -### Example Output: -```text -BackfillStatus.RUNNING -Running=9 -Pending=1 -Desired=10 -Shards total: 62 -Shards completed: 46 -Shards incomplete: 16 -Shards in progress: 11 -Shards unclaimed: 5 -``` - -Logs and metrics are available in CloudWatch in the OpenSearchMigrations log group. - ---- - -## Step 12 - Verify all documents were migrated -Use the following query in CloudWatch Logs Insights to identify failed documents: -```bash -fields @message -| filter @message like "Bulk request succeeded, but some operations failed." -| sort @timestamp desc -| limit 10000 -``` - -_Learn more [[Backfill Result Validation]]_ \ No newline at end of file diff --git a/_upgrade-to/index.md b/_upgrade-to/index.md index 0eea3d6209..696be88c21 100644 --- a/_upgrade-to/index.md +++ b/_upgrade-to/index.md @@ -1,6 +1,6 @@ --- layout: default -title: About the migration process +title: Upgrading OpenSearch nav_order: 1 nav_exclude: true permalink: /upgrade-to/ @@ -8,15 +8,14 @@ redirect_from: - /upgrade-to/index/ --- -# About the migration process +# Upgrading OpenSearch -The process of migrating from Elasticsearch OSS to OpenSearch varies depending on your current version of Elasticsearch OSS, installation type, tolerance for downtime, and cost-sensitivity. Rather than concrete steps to cover every situation, we have general guidance for the process. 
+The process of upgrading your OpenSearch version varies depending on your current version of OpenSearch, installation type, tolerance for downtime, and cost-sensitivity. For migrating to OpenSearch, we provide a [Migration Assistant]({{site.url}}{{site.baseurl}}/migration-assistant/). -Three approaches exist: +Two upgrade approaches exist: -- Use a snapshot to [migrate your Elasticsearch OSS data]({{site.url}}{{site.baseurl}}/upgrade-to/snapshot-migrate/) to a new OpenSearch cluster. This method may incur downtime. -- Perform a [restart upgrade or a rolling upgrade]({{site.url}}{{site.baseurl}}/upgrade-to/upgrade-to/) on your existing nodes. A restart upgrade involves upgrading the entire cluster and restarting it, whereas a rolling upgrade requires upgrading and restarting nodes in the cluster one by one. -- Replace existing Elasticsearch OSS nodes with new OpenSearch nodes. Node replacement is most popular when upgrading [Docker clusters]({{site.url}}{{site.baseurl}}/upgrade-to/docker-upgrade-to/). +- Perform a [restart upgrade or a rolling upgrade]({{site.url}}{{site.baseurl}}/upgrade-to/snapshot-migrate/) on your existing nodes. A restart upgrade involves upgrading the entire cluster and restarting it, whereas a rolling upgrade requires upgrading and restarting nodes in the cluster one by one. +- Replace existing OpenSearch nodes with new OpenSearch nodes. Node replacement is most popular when upgrading [Docker clusters]({{site.url}}{{site.baseurl}}/upgrade-to/docker-upgrade-to/). Regardless of your approach, to safeguard against data loss, we recommend that you take a [snapshot]({{site.url}}{{site.baseurl}}/opensearch/snapshots/snapshot-restore) of all indexes prior to any migration. 
diff --git a/_upgrade-to/upgrade-to.md b/_upgrade-to/upgrade-to.md index 340055b214..00950687a5 100644 --- a/_upgrade-to/upgrade-to.md +++ b/_upgrade-to/upgrade-to.md @@ -6,6 +6,10 @@ nav_order: 15 # Migrating from Elasticsearch OSS to OpenSearch + +OpenSearch provides a [Migration Assistant]({{site.url}}{{site.baseurl}}/migration-assistant/) to help you migrate from other search solutions. +{: .warning} + If you want to migrate from an existing Elasticsearch OSS cluster to OpenSearch and find the [snapshot approach]({{site.url}}{{site.baseurl}}/upgrade-to/snapshot-migrate/) unappealing, you can migrate your existing nodes from Elasticsearch OSS to OpenSearch. If your existing cluster runs an older version of Elasticsearch OSS, the first step is to upgrade to version 6.x or 7.x. diff --git a/images/migrations/migrations-architecture-overview.png b/images/migrations/migrations-architecture-overview.png new file mode 100644 index 0000000000..3002da3a87 Binary files /dev/null and b/images/migrations/migrations-architecture-overview.png differ