Added migration of tx manager coordinator in recovery mode #15121

mmaslankaprv · 2023-11-24T09:32:18Z

Added a service that handles tx manager topic migration (rehashing tx
manager stm updates to a new partition count). The service is a simple
state machine that works in the following steps:

1) Create a temporary topic for the new partitioning scheme
2) For each partition of the current tx manager topic, read all the data,
   assign a new partition, and replicate according to the new scheme
3) Delete the current tx manager topic
4) Create the tx manager topic with the new number of partitions
5) Replicate data from the temporary topic to the new tx manager topic
6) Delete the temporary topic
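
The rehashing step listed above (read every record, assign a new partition, replicate under the new scheme) can be sketched as follows. This is a minimal illustration, not the code in `tx_manager_migrator.cc`: the hash function (`zlib.crc32`), the `partition_for` helper, and the in-memory dicts standing in for topic partitions are all assumptions made for the example.

```python
import zlib
from collections import defaultdict

Record = tuple[str, bytes]  # (transactional id, serialized update); assumed shape


def partition_for(tx_id: str, partition_count: int) -> int:
    """Map a transactional id to a partition. The hash is an assumption."""
    return zlib.crc32(tx_id.encode("utf-8")) % partition_count


def rehash(current: dict[int, list[Record]],
           new_partition_count: int) -> dict[int, list[Record]]:
    """Read every record and place it according to the new partition count."""
    rehashed: dict[int, list[Record]] = defaultdict(list)
    for partition in sorted(current):
        for tx_id, update in current[partition]:
            target = partition_for(tx_id, new_partition_count)
            rehashed[target].append((tx_id, update))
    return dict(rehashed)


records = [("tx-a", b"begin"), ("tx-b", b"begin"),
           ("tx-a", b"commit"), ("tx-c", b"begin")]
old = rehash({0: records}, 2)          # lay the records out over 2 partitions
new = rehash(old, new_partition_count=4)
```

Because every transactional id lives in exactly one source partition, the updates for a given id keep their relative order in the target partition in this sketch.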

If the process is interrupted by a failure, it may be retried: the
logic in the migrator service determines the starting condition and
adjusts the initial step accordingly.

Error handling is very simple: an operation that failed must be
retried.

If any operation fails, it is always restarted from scratch to avoid
leaving intermediate state behind.
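
One way to picture how a retried migration could pick its initial step is the sketch below. Only the step name `rehash_tx_manager_topic` appears in the reviewed diff; the other step names, the `ClusterView` fields, and the decision rules are assumptions for illustration, and the real service may inspect different state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Step(Enum):
    CREATE_TEMPORARY_TOPIC = 1
    REHASH_TX_MANAGER_TOPIC = 2
    DELETE_TX_MANAGER_TOPIC = 3
    CREATE_NEW_TX_MANAGER_TOPIC = 4
    COPY_FROM_TEMPORARY_TOPIC = 5
    DELETE_TEMPORARY_TOPIC = 6


@dataclass
class ClusterView:
    tx_topic_partitions: Optional[int]  # None when the tx manager topic is absent
    temporary_topic_exists: bool


def initial_step(view: ClusterView, new_partition_count: int) -> Optional[Step]:
    """Pick the step to (re)start from; None means nothing is left to do."""
    if view.tx_topic_partitions is None:
        if not view.temporary_topic_exists:
            raise RuntimeError("neither topic exists; nothing to migrate from")
        # The old topic is gone, so the temporary topic holds the only copy.
        return Step.CREATE_NEW_TX_MANAGER_TOPIC
    if view.tx_topic_partitions == new_partition_count:
        # The new topic exists; a leftover temporary topic means the copy
        # may not have finished (assumed rule).
        return Step.COPY_FROM_TEMPORARY_TOPIC if view.temporary_topic_exists else None
    # The old layout is still in place: start over from the first step
    # (assumed to recreate any leftover temporary topic).
    return Step.CREATE_TEMPORARY_TOPIC


def remaining_steps(view: ClusterView, new_partition_count: int) -> list[Step]:
    first = initial_step(view, new_partition_count)
    if first is None:
        return []
    return [step for step in Step if step.value >= first.value]
```

For example, `remaining_steps(ClusterView(None, True), 32)` yields the last three steps, while a cluster still on the old partition count runs all six.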

Backports Required

Release Notes

Features

Ability to change the number of partitions in the tx manager topic

vbotbuildovich · 2023-11-24T11:38:01Z

new failures in https://buildkite.com/redpanda/redpanda/builds/41701#018c00da-046d-49ad-8532-e225a273895b:

"rptest.tests.cluster_recovery_test.ClusterRecoveryTest.test_basic_controller_snapshot_restore"

new failures in https://buildkite.com/redpanda/redpanda/builds/41701#018c00ea-2242-4aaa-bc0c-a59cf82382cd:

"rptest.tests.tx_coordinator_migration_test.TxCoordinatorMigrationTest.test_migrating_tx_manager_coordinator.with_failures=True"

new failures in https://buildkite.com/redpanda/redpanda/builds/41701#018c00ea-223d-476c-a7cc-992d72f9a7e3:

"rptest.tests.transactions_test.TxUpgradeTest.upgrade_does_not_change_tx_coordinator_assignment_test"

new failures in https://buildkite.com/redpanda/redpanda/builds/41701#018c00ea-223f-46b3-bb50-09ec92ea04bb:

"rptest.tests.cluster_recovery_test.ClusterRecoveryTest.test_basic_controller_snapshot_restore"
"rptest.tests.tx_coordinator_migration_test.TxCoordinatorMigrationTest.test_migrating_tx_manager_coordinator.with_failures=False"

new failures in https://buildkite.com/redpanda/redpanda/builds/41749#018c101b-f865-4d19-ac62-8a21acd01abb:

"rptest.tests.tx_coordinator_migration_test.TxCoordinatorMigrationTest.test_migrating_tx_manager_coordinator.with_failures=True"

new failures in https://buildkite.com/redpanda/redpanda/builds/41753#018c10b2-31d5-4759-a037-b48ca6fd969e:

"rptest.tests.tx_coordinator_migration_test.TxCoordinatorMigrationTest.test_migrating_tx_manager_coordinator.with_failures=True"

new failures in https://buildkite.com/redpanda/redpanda/builds/41753#018c10b2-31d1-4c5e-b3d7-1032baa57b2d:

"rptest.tests.tx_coordinator_migration_test.TxCoordinatorMigrationTest.test_migrating_tx_manager_coordinator.with_failures=False"

new failures in https://buildkite.com/redpanda/redpanda/builds/41757#018c114b-48eb-4c8d-99ff-6e5fcd7c0d24:

"rptest.tests.tx_coordinator_migration_test.TxCoordinatorMigrationTest.test_migrating_tx_manager_coordinator.with_failures=True"

new failures in https://buildkite.com/redpanda/redpanda/builds/41872#018c1705-ddbd-48db-98b6-b713fe9c2ad1:

"rptest.tests.tx_coordinator_migration_test.TxCoordinatorMigrationTest.test_migrating_tx_manager_coordinator.with_failures=True"

new failures in https://buildkite.com/redpanda/redpanda/builds/42108#018c248a-7577-4009-bcd1-cf91f0779559:

"rptest.tests.consumer_group_test.ConsumerGroupTest.test_consumer_is_removed_when_timedout.static_members=True"
"rptest.tests.consumer_group_recovery_tool_test.ConsumerOffsetsRecoveryToolTest.test_consumer_offsets_partition_count_change"
"rptest.tests.archival_test.ArchivalTest.test_all_partitions_leadership_transfer.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.archival_test.ArchivalTest.test_timeboxed_uploads.acks=0.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.prefix_truncate_recovery_test.PrefixTruncateRecoveryTest.test_prefix_truncate_recovery.acks=-1.start_empty=False"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/42115#018c25f7-5c17-4b64-8710-8445ee583552:

"rptest.tests.tx_coordinator_migration_test.TxCoordinatorMigrationTest.test_migrating_tx_manager_coordinator.with_failures=True.upgrade=True"

vbotbuildovich · 2023-11-24T11:52:20Z

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/41701#018c00ea-223a-4e80-8017-97f3957dc780

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/42026#018c1f70-ff16-471d-b6b5-927208bffbab

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/42054#018c215e-f0a7-48f5-b1db-4ddf481f959b

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/42115#018c25f7-5c21-480a-a1e8-972ba5f7a986

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/42209#018c3451-c8c8-401c-b80a-10c9a98da1f4

bharathv

looks great to me overall.

src/v/cluster/tm_stm_cache_manager.h

src/v/cluster/topics_frontend.cc

src/v/cluster/migrations/tx_manager_migrator.cc

bharathv · 2023-11-30T07:49:44Z

src/v/cluster/migrations/tx_manager_migrator.cc

+            break;
+        }
+        case migration_step::rehash_tx_manager_topic: {
+            auto results = co_await ssx::parallel_transform(


nit: do we need to wrap it in an abortable timeout?

we already have a timeout in transform batches, maybe that is enough?

bharathv · 2023-11-30T07:51:45Z

src/v/cluster/migrations/tx_manager_migrator.cc

+              != new_partition_count) {
+                co_return errc::topic_invalid_partitions;
+            }
+            auto results = co_await ssx::parallel_transform(


nit: same comment as above, maybe an abortable timeout to avoid a hang?
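
For context on the concern raised in these two comments: when many per-partition operations run in parallel, one that never completes can hang the whole batch unless each operation has its own timeout or the batch is wrapped in one. The sketch below shows the batch-level variant using Python's `asyncio`. It is an analogy only, not Seastar code, and `transform_partition` and the timings are made up for the example.

```python
import asyncio


async def transform_partition(partition: int, delay: float) -> int:
    """Stand-in for a per-partition operation; the delay simulates its work."""
    await asyncio.sleep(delay)
    return partition


async def transform_all(delays: list[float], timeout: float) -> list[int]:
    """Run all partitions concurrently and bound the whole batch by a timeout."""
    batch = asyncio.gather(
        *(transform_partition(p, d) for p, d in enumerate(delays)))
    # On timeout, wait_for cancels the batch, which cancels every pending task.
    return await asyncio.wait_for(batch, timeout=timeout)


async def main() -> tuple[list[int], bool]:
    ok = await transform_all([0.0, 0.01], timeout=1.0)
    try:
        await transform_all([0.0, 5.0], timeout=0.05)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return ok, timed_out


ok, timed_out = asyncio.run(main())
```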

tests/rptest/tests/tx_coordinator_migration_test.py

When migrating the tx manager topic to a new partition count, one of the cluster nodes has to read all the data from the current tx manager topic and write it back to the new tx manager topic with the changed partition count. Added an RPC service definition that allows the migrator service to read and write tx manager topic partitions on remote nodes. Signed-off-by: Michal Maslanka <[email protected]>

Previously, tx_cache instances were initialized on every node for all the partitions. This is not necessary, as not all partitions are instantiated on every shard. Changed the logic in tm_cache manager to create a cache instance on demand. Signed-off-by: Michal Maslanka <[email protected]>
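
The move from eager to on-demand cache creation can be pictured with the sketch below. The class and method names (`TmStmCacheManager`, `get`, `instantiated`) are hypothetical Python stand-ins, not the C++ interface in `tm_stm_cache_manager.h`.

```python
class TmStmCache:
    """Per-partition cache of transaction state (hypothetical stand-in)."""

    def __init__(self, partition: int) -> None:
        self.partition = partition
        self.entries: dict[str, bytes] = {}


class TmStmCacheManager:
    """Creates a cache only when a partition first asks for one."""

    def __init__(self) -> None:
        self._caches: dict[int, TmStmCache] = {}

    def get(self, partition: int) -> TmStmCache:
        cache = self._caches.get(partition)
        if cache is None:
            # First request for this partition: create the cache lazily.
            cache = TmStmCache(partition)
            self._caches[partition] = cache
        return cache

    def instantiated(self) -> list[int]:
        return sorted(self._caches)


manager = TmStmCacheManager()
cache = manager.get(3)
```

With this shape, a shard that hosts only partition 3 holds a single cache instead of one per partition in the topic.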

As tm_stm owns the tm_stm_cache, it should clear it every time we start a new instance. This way we guarantee that the state is managed solely by the tm_stm instance that is currently running. For now we do not want to clear the state on shutdown, as we may start seeing UNKNOWN_SERVER_ERRORS caused by partition movements (right now a node can always ask others about the previous tx state). Signed-off-by: Michal Maslanka <[email protected]>

Signed-off-by: Michal Maslanka <[email protected]>

Sometimes it may be useful to execute topic deletion from a node that is not the controller leader. Added a `cluster::topics_frontend` API allowing topic deletion from any Redpanda node. Signed-off-by: Michal Maslanka <[email protected]>

Added a service that handles tx manager topic migration (rehashing tx manager stm updates to a new partition count). The service is a simple state machine that works in the following steps:

1) Create a temporary topic for the new partitioning scheme
2) For each partition of the current tx manager topic, read all the data, assign a new partition, and replicate according to the new scheme
3) Delete the current tx manager topic
4) Create the tx manager topic with the new number of partitions
5) Replicate data from the temporary topic to the new tx manager topic
6) Delete the temporary topic

If the process is interrupted by a failure, it may be retried: the logic in the migrator service determines the starting condition and adjusts the initial step accordingly. Error handling is very simple: an operation that failed must be retried. If any operation fails, it is always restarted from scratch to avoid leaving intermediate state behind.

Signed-off-by: Michal Maslanka <[email protected]>

Since tx migration may only happen when Redpanda is in recovery mode, we instantiate the migration service only when recovery mode is enabled. Signed-off-by: Michal Maslanka <[email protected]>

Added handling of special REST APIs that are only enabled in recovery mode. The first API exposed in this way is an endpoint triggering tx manager migration. Signed-off-by: Michal Maslanka <[email protected]>
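
The idea of exposing certain endpoints only in recovery mode can be sketched as below. The route paths, handler names, and the `build_routes` helper are all hypothetical and chosen for illustration; they are not the actual admin API paths or registration code.

```python
from typing import Callable

Handler = Callable[[], str]


def status_handler() -> str:
    return "ok"


def migrate_tx_manager_handler() -> str:
    # Stand-in for triggering the tx manager migration service.
    return "migration started"


def build_routes(recovery_mode_enabled: bool) -> dict[str, Handler]:
    """Register regular routes always, recovery-only routes conditionally."""
    routes: dict[str, Handler] = {"/hypothetical/status": status_handler}
    if recovery_mode_enabled:
        # Only reachable when the node was started in recovery mode.
        routes["/hypothetical/recovery/migrate_tx_manager"] = (
            migrate_tx_manager_handler)
    return routes


normal_routes = build_routes(recovery_mode_enabled=False)
recovery_routes = build_routes(recovery_mode_enabled=True)
```

Gating at registration time means a request for the migration endpoint on a normally started node simply finds no route, rather than reaching a handler that has to reject it.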

Signed-off-by: Michal Maslanka <[email protected]>

bharathv

think only the test changed between force pushes, difficult to tell from diffs, lgtm.

github-actions bot added the area/redpanda label Nov 24, 2023

mmaslankaprv force-pushed the coorinator-migration branch from bbe5625 to c36bac4 Compare November 24, 2023 09:32

mmaslankaprv marked this pull request as draft November 24, 2023 09:32

mmaslankaprv force-pushed the coorinator-migration branch 5 times, most recently from 0f6bbf6 to 77ff61a Compare November 29, 2023 08:37

mmaslankaprv marked this pull request as ready for review November 29, 2023 15:06

mmaslankaprv requested review from bharathv, rystsov and ztlpn November 29, 2023 15:06

mmaslankaprv force-pushed the coorinator-migration branch from 77ff61a to 1e6dea7 Compare November 30, 2023 07:47

bharathv reviewed Nov 30, 2023

mmaslankaprv force-pushed the coorinator-migration branch from 1e6dea7 to f5a670d Compare November 30, 2023 11:36

mmaslankaprv requested a review from bharathv November 30, 2023 12:00

bharathv previously approved these changes Nov 30, 2023

mmaslankaprv dismissed bharathv’s stale review via 5290680 November 30, 2023 16:43

mmaslankaprv force-pushed the coorinator-migration branch from f5a670d to 5290680 Compare November 30, 2023 16:43

mmaslankaprv requested a review from bharathv November 30, 2023 17:20

mmaslankaprv added 3 commits December 1, 2023 08:34

mmaslankaprv force-pushed the coorinator-migration branch from 5290680 to 4aab45b Compare December 1, 2023 07:34

bharathv previously approved these changes Dec 1, 2023

mmaslankaprv dismissed bharathv’s stale review via 9d0fca4 December 1, 2023 14:12

mmaslankaprv force-pushed the coorinator-migration branch from 4aab45b to 9d0fca4 Compare December 1, 2023 14:12

c/service: added rpc to forward delete topics request

7faa9dc

Signed-off-by: Michal Maslanka <[email protected]>

mmaslankaprv added 6 commits December 4, 2023 09:40

c/utils: renamed method building a vector of topic results

d9cd712

Signed-off-by: Michal Maslanka <[email protected]>

app: instantiate tx_migrator service in recovery mode

4f93042

Since tx migration may only happen when Redpanda is in recovery mode, we instantiate the migration service only when recovery mode is enabled. Signed-off-by: Michal Maslanka <[email protected]>

admin: added recovery mode only admin api for tx manager migration

b5ede4a

Added handling of special REST APIs that are only enabled in recovery mode. The first API exposed in this way is an endpoint triggering tx manager migration. Signed-off-by: Michal Maslanka <[email protected]>

tests: added tx manager migration test

7226b9a

Signed-off-by: Michal Maslanka <[email protected]>

mmaslankaprv force-pushed the coorinator-migration branch from 9d0fca4 to 7226b9a Compare December 4, 2023 08:40

mmaslankaprv requested a review from bharathv December 4, 2023 08:51

bharathv approved these changes Dec 4, 2023

mmaslankaprv merged commit ea656d6 into redpanda-data:dev Dec 4, 2023
20 checks passed

mmaslankaprv deleted the coorinator-migration branch December 4, 2023 12:58

github-actions bot mentioned this pull request Dec 22, 2023

update redpanda appVersion from v23.2.21 to v23.3.1 redpanda-data/helm-charts#950

Merged
