Skip to content

Commit

Permalink
This is an automated cherry-pick of pingcap#5719
Browse files Browse the repository at this point in the history
Signed-off-by: ti-chi-bot <[email protected]>
  • Loading branch information
Joyinqin authored and ti-chi-bot committed May 31, 2021
1 parent 65fd84a commit 9806acb
Show file tree
Hide file tree
Showing 2 changed files with 41 additions and 3 deletions.
7 changes: 7 additions & 0 deletions ticdc/deploy-ticdc.md
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,8 @@ cdc server --pd=http://10.0.10.25:2379 --log-file=ticdc_2.log --addr=0.0.0.0:830
cdc server --pd=http://10.0.10.25:2379 --log-file=ticdc_3.log --addr=0.0.0.0:8303 --advertise-addr=127.0.0.1:8303
```

## Description of TiCDC `cdc server` command-line parameters

The following are descriptions of options available in the `cdc server` command:

- `gc-ttl`: The TTL (Time To Live) of the service level `GC safepoint` in PD set by TiCDC, in seconds. The default value is `86400`, which means 24 hours.
Expand All @@ -58,4 +60,9 @@ The following are descriptions of options available in the `cdc server` command:
- `ca`: The path of the CA certificate file used by TiCDC, in the PEM format (optional).
- `cert`: The path of the certificate file used by TiCDC, in the PEM format (optional).
- `key`: The path of the certificate key file used by TiCDC, in the PEM format (optional).
<<<<<<< HEAD
- `config`: The address of the configuration file that TiCDC uses (optional). This option is supported since TiCDC v4.0.13. This option can be used in the TiCDC deployment since TiUP v1.4.0.
=======
- `config`: The address of the configuration file that TiCDC uses (optional). This option is supported since TiCDC v5.0.0. This option can be used in the TiCDC deployment since TiUP v1.4.0.
- `sort-dir`: Specifies the temporary file directory of the sorting engine. The default value of this configuration item is `/tmp/cdc_sort`. When Unified Sorter is enabled, if this directory on the server is not writable or the available space is insufficient, you need to manually specify a directory in `sort-dir`. Make sure that TiCDC can read and write data in the `sort-dir` path.
>>>>>>> 73471fea6 (ticdc: add more docs about unified sorter (#5719))
37 changes: 34 additions & 3 deletions ticdc/manage-ticdc.md
Original file line number Diff line number Diff line change
Expand Up @@ -80,13 +80,13 @@ Execute the following commands to create a replication task:
{{< copyable "shell-regular" >}}

```shell
cdc cli changefeed create --pd=http://10.0.10.25:2379 --sink-uri="mysql://root:[email protected]:3306/" --changefeed-id="simple-replication-task"
cdc cli changefeed create --pd=http://10.0.10.25:2379 --sink-uri="mysql://root:[email protected]:3306/" --changefeed-id="simple-replication-task" --sort-engine="unified"
```

```shell
Create changefeed successfully!
ID: simple-replication-task
Info: {"sink-uri":"mysql://root:[email protected]:3306/","opts":{},"create-time":"2020-03-12T22:04:08.103600025+08:00","start-ts":415241823337054209,"target-ts":0,"admin-job-type":0,"sort-engine":"memory","sort-dir":".","config":{"case-sensitive":true,"filter":{"rules":["*.*"],"ignore-txn-start-ts":null,"ddl-allow-list":null},"mounter":{"worker-num":16},"sink":{"dispatchers":null,"protocol":"default"},"cyclic-replication":{"enable":false,"replica-id":0,"filter-replica-ids":null,"id-buckets":0,"sync-ddl":false},"scheduler":{"type":"table-number","polling-time":-1}},"state":"normal","history":null,"error":null}
Info: {"sink-uri":"mysql://root:[email protected]:3306/","opts":{},"create-time":"2020-03-12T22:04:08.103600025+08:00","start-ts":415241823337054209,"target-ts":0,"admin-job-type":0,"sort-engine":"unified","sort-dir":".","config":{"case-sensitive":true,"filter":{"rules":["*.*"],"ignore-txn-start-ts":null,"ddl-allow-list":null},"mounter":{"worker-num":16},"sink":{"dispatchers":null,"protocol":"default"},"cyclic-replication":{"enable":false,"replica-id":0,"filter-replica-ids":null,"id-buckets":0,"sync-ddl":false},"scheduler":{"type":"table-number","polling-time":-1}},"state":"normal","history":null,"error":null}
```

- `--changefeed-id`: The ID of the replication task. The format must match the `^[a-zA-Z0-9]+(\-[a-zA-Z0-9]+)*$` regular expression. If this ID is not specified, TiCDC automatically generates a UUID (the version 4 format) as the ID.
Expand All @@ -102,13 +102,23 @@ Info: {"sink-uri":"mysql://root:[email protected]:3306/","opts":{},"create-time":

- `--start-ts`: Specifies the starting TSO of the `changefeed`. From this TSO, the TiCDC cluster starts pulling data. The default value is the current time.
- `--target-ts`: Specifies the ending TSO of the `changefeed`. To this TSO, the TiCDC cluster stops pulling data. The default value is empty, which means that TiCDC does not automatically stop pulling data.
<<<<<<< HEAD
- `--sort-engine`: Specifies the sorting engine for the `changefeed`. Because TiDB and TiKV adopt distributed architectures, TiCDC must sort the data changes before writing them to the sink. This option supports `memory`/`unified`/`file`.

- `memory`: Sorts data changes in memory.
- `unified`: This feature is introduced since v4.0.9 and becomes a stable feature since v4.0.11. When `unified` is used, TiCDC prefers data sorting in memory. If the memory is insufficient, TiCDC automatically uses the disk to store the temporary data. It is recommended to enable this feature when there is a risk of memory shortage, but **NOT recommended** to enable it in v4.0.9 and v4.0.10.
- `file`: Entirely uses the disk to store the temporary data. This feature is **deprecated**. It is **NOT recommended** to use it in **any** situation.

- `--sort-dir`: Specifies the temporary file directory of the sorting engine. For TiDB v4.0.12 and later versions, it is **NOT recommended** to use this option in the command `cdc cli changefeed create`. You are recommended to use this option in the command `cdc server` to set the temporary file directory. The default value of this option is `/tmp/cdc_sort`. When the unified sorter is enabled, if the default directory `/tmp/cdc_sort` on the sever is not writable or there is not enough space, you need to manually specify a directory in `sort-dir`. If the directory specified in `sort-dir` is not writable, `changefeed` stops automatically.
=======
- `--sort-engine`: Specifies the sorting engine for the `changefeed`. Because TiDB and TiKV adopt distributed architectures, TiCDC must sort the data changes before writing them to the sink. This option supports `unified` (by default)/`memory`/`file`.

- `unified`: When `unified` is used, TiCDC prefers data sorting in memory. If the memory is insufficient, TiCDC automatically uses the disk to store the temporary data. This is the default value of `--sort-engine`.
- `memory`: Sorts data changes in memory. It is **NOT recommended** to use this sorting engine, because OOM is easily triggered when you replicate a large amount of data.
- `file`: Entirely uses the disk to store the temporary data. This feature is **deprecated**. It is **NOT recommended** to use it in **any** situation.

- `--sort-dir`: Specifies the temporary file directory of the sorting engine. It is **NOT recommended** to use this option in the command `cdc cli changefeed create`. You are recommended to use this option [in the command `cdc server` to set the temporary file directory](/ticdc/deploy-ticdc.md#description-of-ticdc-cdc-server-command-line-parameters). The default value of this option is `/tmp/cdc_sort`. When the unified sorter is enabled, if the default directory `/tmp/cdc_sort` on the sever is not writable or there is not enough space, you need to manually specify a directory in `sort-dir`. If the directory specified in `sort-dir` is not writable, `changefeed` stops automatically.
>>>>>>> 73471fea6 (ticdc: add more docs about unified sorter (#5719))

- `--config`: Specifies the configuration file of the `changefeed`.

Expand Down Expand Up @@ -304,7 +314,7 @@ cdc cli changefeed query --pd=http://10.0.10.25:2379 --changefeed-id=simple-repl
"start-ts": 419036036249681921,
"target-ts": 0,
"admin-job-type": 0,
"sort-engine": "memory",
"sort-engine": "unified",
"sort-dir": ".",
"config": {
"case-sensitive": true,
Expand Down Expand Up @@ -790,13 +800,34 @@ force-replicate = true
## Unified Sorter
<<<<<<< HEAD
Unified sorter is the sorting engine in TiCDC. This feature is introduced since v4.0.9. It can mitigate OOM problems caused by the following scenarios:
=======
Unified sorter is the sorting engine in TiCDC. It can mitigate OOM problems caused by the following scenarios:
>>>>>>> 73471fea6 (ticdc: add more docs about unified sorter (#5719))
+ The data replication task in TiCDC is paused for a long time, during which a large amount of incremental data is accumulated and needs to be replicated.
+ The data replication task is started from an early timestamp so it becomes necessary to replicate a large amount of incremental data.
For the changefeeds created using `cdc cli` after v4.0.13, Unified Sorter is enabled by default; for the changefeeds that have existed before v4.0.13, the previous configuration is used.
To check whether or not the Unified Sorter feature is enabled on a changefeed, you can execute the following example command (assuming the IP address of the PD instance is `http://10.0.10.25:2379`):
{{< copyable "shell-regular" >}}
```shell
cdc cli --pd="http://10.0.10.25:2379" changefeed query --changefeed-id=simple-replication-task | grep 'sort-engine'
```
In the output of the above command, if the value of `sort-engine` is "unified", it means that Unified Sorter is enabled on the changefeed.
> **Note:**
>
> + If your servers use mechanical hard drives or other storage devices that have high latency or limited bandwidth, use the unified sorter with caution.
> + The total free capacity of hard drives must be greater than or equal to 128G. If you need to replicate a large amount of historical data, make sure that the free capacity on each node is greater than or equal to the size of the incremental data that needs to be replicated.
<<<<<<< HEAD
> + If your servers do not match the above requirements and you want to disable the unified sorter, you need to manually set `sort-engine` to `memory` for the changefeed.
=======
> + Unified sorter is enabled by default. If your servers do not match the above requirements and you want to disable the unified sorter, you need to manually set `sort-engine` to `memory` for the changefeed.
> + To enable Unified Sorter on an existing changefeed, see the methods provided in [How do I handle the OOM that occurs after TiCDC is restarted after a task interruption?](/ticdc/troubleshoot-ticdc.md#how-do-i-handle-the-oom-that-occurs-after-ticdc-is-restarted-after-a-task-interruption).
>>>>>>> 73471fea6 (ticdc: add more docs about unified sorter (#5719))

0 comments on commit 9806acb

Please sign in to comment.