add doc of cdc and mo-active-standby #1153

Merged · 1 commit · Oct 30, 2024
docs/MatrixOne/Maintain/backup-restore/active-standby.md (186 additions)
# MatrixOne Active-Standby Disaster Recovery

MatrixOne supports cold standby of an active-standby cluster pair based on log replication: the primary database's transaction logs are synchronized to the standby in real time, keeping the two sides consistent and highly available. If the primary fails, the standby can quickly take over the workload without interruption; once the fault is repaired, the standby's data can be synchronized back to the primary for a seamless switchback. This significantly reduces downtime and improves data reliability and service continuity, making it suitable for scenarios with strict high-availability requirements such as finance and e-commerce.

## Steps

### Primary cluster configuration

#### Modify the configuration files

Go to the /your_matrixone_path/etc/launch directory and add the standby fileservice configuration. Both the normal replica and the non-voting replica need it.

- Add log1.toml for the log node hosting the normal replica:

```toml
service-type = "LOG"
data-dir = "./mo-data"

[log]
level = "info"

[malloc]
check-fraction = 65536
enable-metrics = true

[[fileservice]]
name = "STANDBY"
backend = "DISK"
data-dir = "mo-data/standby"

# For the normal logservice replica, add one item to the bootstrap config group to start synchronization to the standby cluster.
[logservice.BootstrapConfig]
standby-enabled = true
```

- Add log2.toml for the log node hosting the non-voting replica:

```toml
# Start at least one more logservice instance as the node that runs the non-voting replica, and configure a locality for it. (Replace every 127.0.0.1 below with the actual IP address.)
service-type = "LOG"
data-dir = "./mo-data"

[log]
level = "info"

[malloc]
check-fraction = 65536
enable-metrics = true

[[fileservice]]
name = "STANDBY"
backend = "DISK"
data-dir = "mo-data/standby"

[logservice]
deployment-id = 1
uuid = "4c4dccb4-4d3c-41f8-b482-5251dc7a41bd" # UUID of the new node
raft-address = "127.0.0.1:32010" # address of the raft service
logservice-address = "127.0.0.1:32011" # address of the logservice service
logservice-listen-address = "0.0.0.0:32011"
gossip-address = "127.0.0.1:32012" # address of the gossip service
gossip-seed-addresses = [ # gossip seed addresses of the normal replicas
  "127.0.0.1:32002",
  "127.0.0.1:32012"
]
locality = "region:west" # configure a locality to run the non-voting replica

[logservice.BootstrapConfig]
bootstrap-cluster = false # skip the bootstrap operation
standby-enabled = true # enable the standby feature

[hakeeper-client]
discovery-address="127.0.0.1:32001" # 32001 is the default HAKeeper address
```

- Modify launch.toml:

```toml
logservices = [
"./etc/launch/log1.toml",
"./etc/launch/log2.toml"
]

tnservices = [
"./etc/launch/tn.toml"
]

cnservices = [
"./etc/launch/cn.toml"
]
```

#### Start the non-voting replica

After the MO cluster starts, run SQL commands to add or remove non-voting replicas dynamically:

- Set the non-voting locality; it must match the locality in the configuration file:

```sql
mysql> set logservice settings non_voting_locality="region:west";
Query OK, 0 rows affected (0.01 sec)
```

- Set the non-voting replica count; the following command sets the number of non-voting replicas to 1:

```sql
mysql> set logservice settings non_voting_replica_num=1;
Query OK, 0 rows affected (0.02 sec)
```

After running these commands, wait a moment, then check the replica status with two SQL statements:

- `show logservice stores;`

```sql
mysql> show logservice stores;
+--------------------------------------+------+-------------+----------------------------+-------------+-----------------+-----------------+-----------------+
| store_id | tick | replica_num | replicas | locality | raft_address | service_address | gossip_address |
+--------------------------------------+------+-------------+----------------------------+-------------+-----------------+-----------------+-----------------+
| 4c4dccb4-4d3c-41f8-b482-5251dc7a41bd | 975 | 3 | 0:262148;1:262149;3:262150 | region:west | 127.0.0.1:32010 | 127.0.0.1:32011 | 127.0.0.1:32012 |
| 7c4dccb4-4d3c-41f8-b482-5251dc7a41bf | 974 | 3 | 0:131072;1:262145;3:262146 | | 0.0.0.0:32000 | 0.0.0.0:32001 | 0.0.0.0:32002 |
+--------------------------------------+------+-------------+----------------------------+-------------+-----------------+-----------------+-----------------+
2 rows in set (0.02 sec)
```

- `show logservice replicas;`

```sql
mysql> show logservice replicas;
+----------+------------+--------------+-----------------------+------+-------+--------------------------------------+
| shard_id | replica_id | replica_role | replica_is_non_voting | term | epoch | store_info |
+----------+------------+--------------+-----------------------+------+-------+--------------------------------------+
| 0 | 131072 | Leader | false | 2 | 1059 | 7c4dccb4-4d3c-41f8-b482-5251dc7a41bf |
| 0 | 262148 | Follower | true | 2 | 1059 | 4c4dccb4-4d3c-41f8-b482-5251dc7a41bd |
| 1 | 262145 | Leader | false | 2 | 120 | 7c4dccb4-4d3c-41f8-b482-5251dc7a41bf |
| 1 | 262149 | Follower | true | 2 | 120 | 4c4dccb4-4d3c-41f8-b482-5251dc7a41bd |
| 3 | 262146 | Leader | false | 2 | 12 | 7c4dccb4-4d3c-41f8-b482-5251dc7a41bf |
| 3 | 262150 | Follower | true | 2 | 12 | 4c4dccb4-4d3c-41f8-b482-5251dc7a41bd |
+----------+------------+--------------+-----------------------+------+-------+--------------------------------------+
6 rows in set (0.01 sec)
```
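Once every shard reports a non-voting follower, synchronization to the standby is in effect. A small helper can count them; this is a sketch where the tab-separated rows stand in for the output of `mysql -N -e "show logservice replicas;"`:

```python
# Count non-voting replicas in tab-separated `show logservice replicas`
# output (column 4 is replica_is_non_voting).
def count_non_voting(rows: str) -> int:
    return sum(
        1
        for line in rows.strip().splitlines()
        if line.split("\t")[3] == "true"
    )

# Rows mirroring the table above (store ids shortened):
sample = (
    "0\t131072\tLeader\tfalse\t2\t1059\t7c4d...\n"
    "0\t262148\tFollower\ttrue\t2\t1059\t4c4d...\n"
    "1\t262145\tLeader\tfalse\t2\t120\t7c4d...\n"
    "1\t262149\tFollower\ttrue\t2\t120\t4c4d...\n"
)
print(count_non_voting(sample))  # 2
```

With `non_voting_replica_num=1` and three shards, the full output above yields a count of 3 non-voting replicas, one per shard.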

### Copy data files to the standby cluster

1. Stop the primary cluster, including its non-voting logservice instance.

2. Copy the standby files to the mo-data/shared directory on the standby machine (`cp` cannot write to a remote `<username>@<ip>:` target, so use `scp`):

    ```shell
    scp -r /your_matrixone_path/mo-data/standby/* <username>@<ip>:/your_matrixone_path/mo-data/shared/
    ```

3. Copy the data of the primary cluster's non-voting replica to the logservice-data/ directory on the standby machine:

    ```shell
    scp -r /your_matrixone_path/mo-data/logservice-data/<uuid>/ <username>@<ip>:/your_matrixone_path/mo-data/logservice-data/
    ```

where `<uuid>` is the uuid configured in the configuration file of the non-voting logservice instance.

### Start the standby cluster

#### Synchronize the data

On the standby cluster, use the logtail data synchronization tool `mo_ctl` to sync the data of the primary cluster's non-voting replica into the standby cluster's logservice:

```shell
mo_ctl data-sync start --logservice-address=127.0.0.1:32001 --log-data-dir=/your_matrixone_path/mo-data/logservice-data/<uuid>/<primary-cluster-hostname>/00000000000000000001/tandb
```

where `<uuid>` is the uuid configured in the configuration file of the non-voting logservice instance.

!!! note
    mo_ctl is MatrixOne's enterprise-grade distributed management tool; contact your account manager to obtain it.

#### Stop the logservice service

```shell
kill `ps -ef | grep mo-service | grep -v grep | awk '{print $2}'`
```

#### Start the TN, CN, and Log services

```shell
nohup ./mo-service -launch etc/launch/launch.toml >mo.log 2>&1 &
```
docs/MatrixOne/Maintain/cdc/mo-mysql.md (201 additions)
# MatrixOne to MySQL CDC

## Scenario

An online retailer uses MatrixOne as the production database of its order management system, storing order data. To support real-time business analytics (order counts, sales trends, customer behavior, and so on), order data must be synchronized in real time from MatrixOne to a MySQL analytics database for the data analytics team and business systems. The `mo_cdc` tool makes this real-time synchronization efficient, so the analytics systems always have the latest order information.

- Source database (production): the `orders` table in MatrixOne, which records the details of each order: order ID, customer ID, order time, amount, and status.
- Target database (analytics): the `orders_backup` table in MySQL, used for real-time order statistics and analysis so the business team can track sales as they happen.
- Synchronization requirement: use `mo_cdc` to sync the MatrixOne `orders` table to the MySQL `orders_backup` table in real time, keeping the analytics data consistent with production.

## Procedure

### Create the table schemas

Make sure the table schemas in the source database (MatrixOne) and the target database (MySQL) are identical, so that data can be synchronized seamlessly.

- The `orders` table in MatrixOne:

```sql
CREATE TABLE source_db.orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATETIME,
amount DECIMAL(10, 2),
status VARCHAR(20)
);
INSERT INTO source_db.orders (order_id, customer_id, order_date, amount, status) VALUES
(1, 101, '2024-01-15 14:30:00', 99.99, 'Shipped'),
(2, 102, '2024-02-10 10:00:00', 149.50, 'Delivered'),
(3, 103, '2024-03-05 16:45:00', 75.00, 'Processing'),
(4, 104, '2024-04-20 09:15:00', 200.00, 'Shipped'),
(5, 105, '2024-05-12 14:00:00', 49.99, 'Delivered');
```

- The `orders_backup` table in MySQL:

```sql
CREATE TABLE analytics_db.orders_backup (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATETIME,
amount DECIMAL(10, 2),
status VARCHAR(20)
);
```
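The schemas must match on both sides before the task is created. A quick pre-check is to compare the two definitions column by column; this is a sketch with the column specs transcribed from the DDL above:

```python
# Column name -> type, transcribed from the two CREATE TABLE statements.
source_columns = {
    "order_id": "INT",
    "customer_id": "INT",
    "order_date": "DATETIME",
    "amount": "DECIMAL(10, 2)",
    "status": "VARCHAR(20)",
}
sink_columns = dict(source_columns)  # orders_backup uses the same schema

# Report any column whose definition differs between source and sink.
diff = {
    name: (source_columns.get(name), sink_columns.get(name))
    for name in source_columns.keys() | sink_columns.keys()
    if source_columns.get(name) != sink_columns.get(name)
}
print(diff)  # {} when the schemas match
```

In practice the two dicts would be filled from `information_schema.columns` on each side rather than hard-coded.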

### Create the `mo_cdc` synchronization task

Create a synchronization task with the `mo_cdc` tool to push MatrixOne order data to MySQL in real time.

```bash
> ./mo_cdc task create \
--task-name "task1" \
--source-uri "mysql://root:[email protected]:6001" \
--sink-type "mysql" \
--sink-uri "mysql://root:[email protected]:3306" \
--tables "source_db.orders:analytics_db.orders_backup" \
--level "account" \
--account "sys"
```

Check the task status:

```bash
> ./mo_cdc task show \
--task-name "task1" \
--source-uri "mysql://root:[email protected]:6001"
[
{
"task-id": "0192d76f-d89a-70b3-a60d-615c5f2fd33d",
"task-name": "task1",
"source-uri": "mysql://root:******@127.0.0.1:6001",
"sink-uri": "mysql://root:******@127.0.0.1:3306",
"state": "running",
"checkpoint": "{\n \"source_db.orders\": 2024-10-29 16:43:00.318404 +0800 CST,\n}",
"timestamp": "2024-10-29 16:43:01.299298 +0800 CST"
}
]
```
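The `task show` output is a JSON array, so monitoring scripts can check it programmatically. A sketch, where the sample mirrors (and abbreviates) the output above:

```python
import json

# Sample mirroring the `mo_cdc task show` output above (checkpoint omitted).
output = """
[
  {
    "task-id": "0192d76f-d89a-70b3-a60d-615c5f2fd33d",
    "task-name": "task1",
    "state": "running",
    "timestamp": "2024-10-29 16:43:01.299298 +0800 CST"
  }
]
"""

tasks = json.loads(output)
running = [t["task-name"] for t in tasks if t["state"] == "running"]
print(running)  # ['task1']
```

An empty `running` list would mean the task has stopped and needs attention before relying on the downstream data.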

Connect to the downstream MySQL to verify the full (snapshot) synchronization:

```sql
mysql> select * from analytics_db.orders_backup;
+----------+-------------+---------------------+--------+------------+
| order_id | customer_id | order_date | amount | status |
+----------+-------------+---------------------+--------+------------+
| 1 | 101 | 2024-01-15 14:30:00 | 99.99 | Shipped |
| 2 | 102 | 2024-02-10 10:00:00 | 149.50 | Delivered |
| 3 | 103 | 2024-03-05 16:45:00 | 75.00 | Processing |
| 4 | 104 | 2024-04-20 09:15:00 | 200.00 | Shipped |
| 5 | 105 | 2024-05-12 14:00:00 | 49.99 | Delivered |
+----------+-------------+---------------------+--------+------------+
5 rows in set (0.01 sec)
```

### Incremental synchronization

With the task in place, make data changes on the upstream MatrixOne:

```sql
INSERT INTO source_db.orders (order_id, customer_id, order_date, amount, status) VALUES
(6, 106, '2024-10-29 12:00:00', 150.00, 'New');
DELETE FROM source_db.orders WHERE order_id = 6;
UPDATE source_db.orders SET status = 'Delivered' WHERE order_id = 4;

mysql> select * from source_db.orders;
+----------+-------------+---------------------+--------+------------+
| order_id | customer_id | order_date | amount | status |
+----------+-------------+---------------------+--------+------------+
| 4 | 104 | 2024-04-20 09:15:00 | 200.00 | Delivered |
| 1 | 101 | 2024-01-15 14:30:00 | 99.99 | Shipped |
| 2 | 102 | 2024-02-10 10:00:00 | 149.50 | Delivered |
| 3 | 103 | 2024-03-05 16:45:00 | 75.00 | Processing |
| 5 | 105 | 2024-05-12 14:00:00 | 49.99 | Delivered |
+----------+-------------+---------------------+--------+------------+
5 rows in set (0.00 sec)
```

Connect to the downstream MySQL to verify the incremental synchronization:

```sql
mysql> select * from analytics_db.orders_backup;
+----------+-------------+---------------------+--------+------------+
| order_id | customer_id | order_date | amount | status |
+----------+-------------+---------------------+--------+------------+
| 1 | 101 | 2024-01-15 14:30:00 | 99.99 | Shipped |
| 2 | 102 | 2024-02-10 10:00:00 | 149.50 | Delivered |
| 3 | 103 | 2024-03-05 16:45:00 | 75.00 | Processing |
| 4 | 104 | 2024-04-20 09:15:00 | 200.00 | Delivered |
| 5 | 105 | 2024-05-12 14:00:00 | 49.99 | Delivered |
+----------+-------------+---------------------+--------+------------+
5 rows in set (0.00 sec)
```

### Resuming from a checkpoint

Now the task is interrupted unexpectedly (simulated here by pausing it):

```bash
> ./mo_cdc task pause \
--task-name "task1" \
--source-uri "mysql://root:[email protected]:6001"
```

While the task is paused, continue inserting data into the upstream MatrixOne:

```sql
INSERT INTO source_db.orders (order_id, customer_id, order_date, amount, status) VALUES
(11, 111, '2024-06-15 08:30:00', 250.75, 'Processing');
INSERT INTO source_db.orders (order_id, customer_id, order_date, amount, status) VALUES
(12, 112, '2024-07-22 15:45:00', 399.99, 'Shipped');
INSERT INTO source_db.orders (order_id, customer_id, order_date, amount, status) VALUES
(13, 113, '2024-08-30 10:20:00', 599.99, 'Delivered');

mysql> select * from source_db.orders;
+----------+-------------+---------------------+--------+------------+
| order_id | customer_id | order_date | amount | status |
+----------+-------------+---------------------+--------+------------+
| 1 | 101 | 2024-01-15 14:30:00 | 99.99 | Shipped |
| 2 | 102 | 2024-02-10 10:00:00 | 149.50 | Delivered |
| 3 | 103 | 2024-03-05 16:45:00 | 75.00 | Processing |
| 4 | 104 | 2024-04-20 09:15:00 | 200.00 | Delivered |
| 5 | 105 | 2024-05-12 14:00:00 | 49.99 | Delivered |
| 11 | 111 | 2024-06-15 08:30:00 | 250.75 | Processing |
| 12 | 112 | 2024-07-22 15:45:00 | 399.99 | Shipped |
| 13 | 113 | 2024-08-30 10:20:00 | 599.99 | Delivered |
+----------+-------------+---------------------+--------+------------+
8 rows in set (0.01 sec)
```

Manually resume the task:

```bash
> ./mo_cdc task resume \
--task-name "task1" \
--source-uri "mysql://root:[email protected]:6001"
```

Connect to the downstream MySQL to verify that the task resumed from its checkpoint:

```sql
mysql> select * from analytics_db.orders_backup;
+----------+-------------+---------------------+--------+------------+
| order_id | customer_id | order_date | amount | status |
+----------+-------------+---------------------+--------+------------+
| 1 | 101 | 2024-01-15 14:30:00 | 99.99 | Shipped |
| 2 | 102 | 2024-02-10 10:00:00 | 149.50 | Delivered |
| 3 | 103 | 2024-03-05 16:45:00 | 75.00 | Processing |
| 4 | 104 | 2024-04-20 09:15:00 | 200.00 | Delivered |
| 5 | 105 | 2024-05-12 14:00:00 | 49.99 | Delivered |
| 11 | 111 | 2024-06-15 08:30:00 | 250.75 | Processing |
| 12 | 112 | 2024-07-22 15:45:00 | 399.99 | Shipped |
| 13 | 113 | 2024-08-30 10:20:00 | 599.99 | Delivered |
+----------+-------------+---------------------+--------+------------+
8 rows in set (0.00 sec)
```
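After resuming, a final consistency check is to compare row sets from the two tables. A sketch, where the tuples stand in for `SELECT order_id, status` results fetched from each side:

```python
# Row tuples standing in for SELECT order_id, status FROM each table.
source_rows = {(1, "Shipped"), (2, "Delivered"), (3, "Processing"),
               (4, "Delivered"), (5, "Delivered"),
               (11, "Processing"), (12, "Shipped"), (13, "Delivered")}
sink_rows = set(source_rows)  # orders_backup after catching up

missing = source_rows - sink_rows  # rows not yet replicated to the sink
extra = sink_rows - source_rows    # rows in the sink that are not in the source
print(missing, extra)  # set() set()
```

Both sets being empty confirms the sink caught up on all rows inserted while the task was paused.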

## Results

With this solution, the retailer synchronizes order data to the analytics database in real time, enabling order statistics, sales-trend analysis, customer-behavior insight, and other applications that support business decisions. Checkpoint-based resumption keeps data consistent through network delays or task interruptions, so the analytics system always works from an accurate, reliable data source.