Another round
Signed-off-by: Matt Lord <[email protected]>
mattlord committed Dec 6, 2022
1 parent 4ca6b92 commit 826186c
Showing 5 changed files with 53 additions and 58 deletions.
13 changes: 7 additions & 6 deletions content/en/docs/16.0/reference/vreplication/faq.md
@@ -16,13 +16,14 @@ GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, PROCESS, FILE,

{{< expand `Why am I seeing io.EOF errors in my workflow?`>}}
<p>
<code>io.EOF</code> errors can be difficult to track down. These are usually caused by an issue at the mysql server. Here are some possible reasons:
<code>io.EOF</code> errors can be difficult to track down. These are usually caused by an issue at the mysql server layer. You will need to consult
the source and target vttablet logs in order to know for sure in each case. Here are some possible reasons:
</p>

<ul>
<li>GTID is not enabled on the server. VReplication requires <code>GTID=on</code>
(the <code>permissive</code> modes are <b>not</b> supported)</li>
<li>Permissions are not setup correctly for the vreplication mysql user</li>
<li>Permissions are not set up correctly for the VReplication-related mysql users (in particular the <code>vt_filtered</code> user by default).</li>
<li>Row-based replication (RBR) <code>binlog_format=row</code> is not enabled. Statement-based replication (SBR) is <b>not</b> supported by VReplication</li>
<li>The mysql server is down or not reachable</li>
</ul>
@@ -36,8 +37,8 @@ binlog_row_image=full
</pre>
{{< /expand >}}
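
A quick way to rule out several of the causes above is to check the relevant settings directly on the source MySQL server. This is only a sketch; the host name is a placeholder and it assumes you can connect as the default `vt_filtered` VReplication user (or any account with sufficient privileges):

```sh
# Verify the source MySQL server meets VReplication's requirements:
# gtid_mode=ON, binlog_format=ROW, and binlog_row_image=FULL.
# A successful connection also rules out reachability and permission problems.
mysql -h <source-host> -u vt_filtered -p -e \
  "SELECT @@global.gtid_mode, @@global.binlog_format, @@global.binlog_row_image;"
```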

<!--
{{< expand `If I can't turn GTID on, can I run a VReplication workflow using FilePos instead?`>}}
To be done
{{< expand `If I can't turn GTIDs on, can I run a VReplication workflow using --db_flavor=FilePos instead?`>}}
Yes, you can run VReplication workflows using the pre-MySQL 5.6 file and position based method, but this should only be used as a last resort when it's not possible
to modify the configuration of the source. This is because the file and position method is not fault tolerant; if any error, failure, or failover is encountered
you will need to throw away the existing workflow and start a new one.
{{< /expand >}}
-->
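
If you do have to fall back to this mode, the flag named in the question above is set on the vttablet side. The following is only an illustrative sketch, not a complete invocation; it assumes the flag is added to the vttablet fronting the non-GTID source MySQL server while all other flags stay as in your existing configuration:

```sh
# Illustrative only: switch VReplication to file/position based tracking by
# adding the db_flavor flag to the vttablet fronting the non-GTID source MySQL.
vttablet --db_flavor=FilePos # ...plus your existing vttablet flags
```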
46 changes: 22 additions & 24 deletions content/en/docs/16.0/reference/vreplication/migrate.md
@@ -4,10 +4,6 @@ description: Move tables from an external cluster
weight: 85
---

{{< info >}}
This documentation is for a new (v2) set of vtctld commands that start in Vitess 11.0. See [RFC](https://github.com/vitessio/vitess/issues/7225) for more details.
{{< /info >}}

### Command

```
@@ -17,8 +13,8 @@ Migrate -- <options> <action> <workflow identifier>

### Description

Migrate is used to start and manage vReplication workflows for copying keyspaces and/or tables from a source Vitess cluster, to a target Vitess cluster.
This command is built off of [MoveTables](../movetables) but has been extended to work with source and target topology services. It should be
Migrate is used to start and manage VReplication workflows for copying keyspaces and/or tables from a source Vitess cluster to a target Vitess cluster.
This command is built on top of [MoveTables](../movetables) but has been extended to work with independent source and target topology services. It should be
utilized when moving keyspaces or tables between two separate Vitess environments. Migrate is an advantageous strategy for large sharded environments
for a few reasons:

@@ -30,20 +26,20 @@ for a few reasons:
* Could be used for configuring lower environments with production data.

Please note the Migrate command works with an externally mounted source cluster. See the related [Mount command](../mount) for more information
on external Vitess clusters.
on working with external Vitess clusters.

#### Differences between Migrate and MoveTables
#### Differences Between Migrate and MoveTables

Migrate has separate semantics and behaviors from MoveTables:
`Migrate` has separate semantics and behaviors from `MoveTables`:

* MoveTables migrates data from one keyspace to another, within the same Vitess cluster; Migrate functions between two separated Vitess clusters.
* MoveTables erases the source data upon completion by default; Migrate keeps the source data intact.
* `MoveTables` migrates data from one keyspace to another, within the same Vitess cluster; `Migrate` functions between two separated Vitess clusters.
* `MoveTables` erases the source data upon completion by default; `Migrate` keeps the source data intact.
* There are flags available in `MoveTables` to change the default behavior regarding the source data.
* MoveTables sets up routing rules and reverse replication, allowing for rollback prior to completion.
* Switching read/write traffic is not meaningful in the case of Migrate, as the Source is in a different cluster.
* Switching traffic requires the Target to have the ability to create vreplication streams (in the _vt database) on the Source;
* `MoveTables` sets up routing rules and reverse replication, allowing for rollback prior to completion.
* Switching read/write traffic is not meaningful in the case of `Migrate`, as the Source is in a different cluster.
* Switching traffic requires the Target to have the ability to create vreplication streams (in the `_vt` database) on the Source;
this may not always be possible on production systems.
* Not all MoveTables options work with Migrate; for example [Progress](../progress) is unavailable with Migrate.
* Not all `MoveTables` options work with `Migrate`; for example [`Progress`](../progress) is unavailable with `Migrate`.


### Parameters
@@ -66,7 +62,9 @@ If needed, you can rename the keyspace while migrating, simply provide a differe

Each `action` has additional options/parameters that can be used to modify its behavior.

The options for the supported commands are the same as [MoveTables](../movetables), with the exception of `reverse_replication`.
The options for the supported commands are the same as [MoveTables](../movetables), with the exception of `--reverse_replication`. Setting up
the reverse VReplication streams would require modifying the source cluster's `_vt` sidecar database, which we cannot do because that database is
specific to a single Vitess cluster and these streams belong to a different one (the target cluster).

A common option to use when migrating all of the tables from a source keyspace is `--all`.
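
For example, a hypothetical `Create` invocation (the mount name `ext1`, the keyspace `commerce`, and the workflow name `wf1` are placeholders):

```sh
# Migrate every table from the mounted ext1 cluster's commerce keyspace into
# the local commerce keyspace, as a workflow named wf1 (all names placeholders).
Migrate -- --all --source ext1.commerce Create commerce.wf1
```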

@@ -80,7 +78,7 @@ All workflows are identified by `targetKeyspace.workflow` where `targetKeyspace`
### A Migrate Workflow lifecycle

{{< info >}}
NOTE: there is no reverse replication flow with Migrate. After the `Migrate Complete` command is given; no writes will be replicated between the Source and Target Vitess clusters. They are essentially two identical Vitess clusters running in two different environments. Once writing resumes on one of the clusters they will begin to drift apart.
NOTE: there is no reverse VReplication flow with `Migrate`. After the `Migrate Complete` command is given, no writes will be replicated between the Source and Target Vitess clusters. They are essentially two identical Vitess clusters running in two different environments. Once writing resumes on one of the clusters, they will begin to drift apart.
{{< /info >}}

1. Mount the source Vitess cluster using [Mount](../mount).<br/>
@@ -114,29 +112,29 @@ For Migrate to function properly, you will need to ensure communication is possi
If you're migrating a keyspace from a production system, you may want to target a replica to reduce the load on the primary vttablets. This also reduces the number of network considerations you need to make.

```
Migrate -- --all --tablet_types "REPLICA" --source <mount name>.<source keyspace> Create <workflow identifier>
Migrate -- --all --tablet_types REPLICA --source <mount name>.<source keyspace> Create <workflow identifier>
```

To verify the migration, you can also run VDiff with the `--tablet_types` option:

```
VDiff -- --tablet_types "REPLICA" <target keyspace>.<workflow identifier>
VDiff -- --tablet_types REPLICA <target keyspace>.<workflow identifier>
```

### Troubleshooting Errors

Migrate fails right away with error:
`Migrate` fails right away with error:

```sh
E0224 23:51:45.312536 138 main.go:76] remote error: rpc error: code = Unknown desc = table table1 not found in vschema for keyspace sharded
```
<br />Solution:
* The target table has a vSchema which does not match the source vSchema
* Upload the source vSchema to the target vSchema and try the migrate again
* The target table has a VSchema which does not match the source VSchema
* Apply the source keyspace's VSchema to the target keyspace and then retry the `Migrate` (see the sketch below)
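
A minimal sketch of doing that with the standard vtctl commands, assuming you can reach the vtctld endpoints of both clusters (the server addresses and the `commerce` keyspace name are placeholders):

```sh
# Copy the VSchema from the source cluster's keyspace to the target keyspace,
# then retry the Migrate. Addresses and keyspace names are placeholders.
vtctlclient --server <source-vtctld>:15999 GetVSchema commerce > /tmp/commerce-vschema.json
vtctlclient --server <target-vtctld>:15999 ApplyVSchema -- --vschema_file=/tmp/commerce-vschema.json commerce
```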

---

Migrate fails right away with error:
`Migrate` fails right away with error:

```sh
E0224 18:55:29.275019 578 main.go:76] remote error: rpc error: code = Unknown desc = node doesn't exist
@@ -148,7 +146,7 @@ E0224 18:55:29.275019  578 main.go:76] remote error: rpc error: code = Unknow
---
After issuing Migrate command everything is stuck at 0% progress
After issuing a `Migrate` command, everything is stuck at 0% progress
with errors found in target vttablet logs:
```sh
# (vttablet log output truncated in this view)
```
24 changes: 10 additions & 14 deletions content/en/docs/16.0/reference/vreplication/mount.md
@@ -4,10 +4,6 @@ description: Link an external cluster to the current one
weight: 90
---

{{< info >}}
This documentation is for a new (v2) set of vtctld commands that start in Vitess 11.0. See [RFC](https://github.com/vitessio/vitess/issues/7225) for more details.
{{< /info >}}

### Command

```
@@ -17,12 +13,12 @@ Mount -- [--type vitess] [--topo_type=etcd2|consul|zookeeper] [--topo_server=top

### Description

Mount is used to link external Vitess clusters to the current cluster. (In the future we will also support mounting external MySQL servers.)
Mount is used to link external Vitess clusters to the current cluster.

Mounting Vitess clusters requires the topology information of the external cluster to be specified. Used in conjunction with [the Migrate command](../migrate).
Mounting Vitess clusters requires the topology information of the external cluster to be specified. It is used in conjunction with [the `Migrate` command](../migrate).

{{< info >}}
No validation is performed when using the Mount command. You must ensure your values are correct, or you may get errors when initializing a migration.
No validation is performed when using the `Mount` command. You must ensure your values are correct, or you may get errors when initializing a migration.
{{< /info >}}


@@ -36,18 +32,18 @@ The name that will be used in VReplication workflows to refer to the mounted clu

Unmount an already mounted cluster. Requires `cluster_name` to be specified.

#### show
#### --show

Show details of an already mounted cluster. Requires `cluster_name` to be specified.

#### list
#### --list

List all mounted clusters
List all mounted clusters.

### Topo parameters
### Topo Parameters

##### topo_type=[etcd2|consul|zookeeper]
##### topo_server=<topo_url>
##### topo_root=<root_topo_node>
##### --topo_type=[etcd2|consul|zookeeper]
##### --topo_server=<topo_url>
##### --topo_root=<root_topo_node>

Mandatory (and only specified) while mounting a Vitess cluster. These should specify the topology parameters of the cluster being mounted.
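
For example, a hypothetical invocation that mounts an external cluster under the name `ext1`, assuming its topology service is an etcd2 instance (the address and root path are placeholders):

```sh
# Mount an external Vitess cluster as ext1 (topo address and root are placeholders).
Mount -- --type vitess --topo_type=etcd2 \
  --topo_server=localhost:12379 --topo_root=/vitess/global ext1

# Inspect what is currently mounted.
Mount -- --list
Mount -- --show ext1
```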
@@ -8,14 +8,14 @@ weight: 300
For [VReplication streams](../../../concepts/vstream/), we must choose a tablet to serve the role of source (vstreamer) and target (vapplier) in the replication stream and this is done automatically.

To select the tablets we get a set of viable -- healthy and serving -- candidates for the source and target of the stream:
* **Source**: a random tablet is selected from the viable candidates of the specified types (see [tablet types](./#tablet-types))
* **Source**: a random tablet is selected from the viable candidates of the specified tablet types in the given cells
* **Target**: a viable primary tablet is chosen, as we need to do writes that are then replicated within the target shard

### Cell considerations
### Cells

VReplication will only look for tablet pairings within the same cell. If you want to have cross-cell streams then you will need to [create a CellAlias](https://vitess.io/docs/reference/programs/vtctl/cell-aliases/) that contains the list of potential cells and specify that using the `--cell` flag in your VReplication workflow commands.
VReplication will only look for source and target tablet pairings within the same cell by default, so if the target primary is in the `zone1` cell it will only look for source tablets in `zone1`. If you want cross-cell streams then you will need to specify the list of cells, or a [CellAlias](https://vitess.io/docs/reference/programs/vtctl/cell-aliases/) that contains the potential cells, using the `--cell` flag in your VReplication workflow commands.

### Tablet types
### Tablet Types

The server-side default, which determines the candidate tablet types made available for potential selection in a stream, is set using the [vttablet's `--vreplication_tablet_type` flag](../flags/#vreplication_tablet_type) (default value: `in_order:REPLICA,PRIMARY`). The target tablet will use this when finding the viable source tablet candidates.
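
For instance, a sketch of overriding that server-side default on a target-side vttablet so that only replicas are considered as stream sources (illustrative only; the remaining vttablet flags are assumed to stay as in your existing configuration):

```sh
# Only consider REPLICA tablets as VReplication stream sources for this target.
vttablet --vreplication_tablet_type "REPLICA" # ...plus your existing vttablet flags
```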

20 changes: 10 additions & 10 deletions content/en/docs/16.0/reference/vreplication/throttling.md
@@ -5,28 +5,28 @@ weight: 300

### Introduction

VReplication moves potentially massive amounts of data from one place to another, whether within the same keyspace and shard or across keyspaces. It copies data of entire tables and follows up to apply ongoing changes on those tables by reading the binary logs (aka the changelog).
VReplication moves potentially massive amounts of data from one place to another, whether within the same keyspace and shard or across keyspaces. It copies tables and follows up to apply ongoing changes on those tables by reading the binary logs (aka the changelog).

This places load on both the source side (where VReplication reads data from) as well as on target side (where VReplication writes data to).

On the source side, VReplication reads the full content of tables. This typically means loading pages from disk contending for disk IO, and "polluting" the MySQL buffer pool. The operation competes with normal production traffic for both IO and memory resources. If the source is a replica, the operation may lead to replication lag. If the source is a primary, this may lead to write contention.
On the source side, VReplication reads the content of tables. This typically means loading pages from disk contending for disk IO, and "polluting" the MySQL buffer pool. The operation competes with normal production traffic for both IO and memory resources. If the source is a replica, the operation may lead to replication lag. If the source is a primary, this may lead to write contention.

On the target side, VReplication writes massive amounts of data. If the target server is a primary with replicas, then the replicas may incur replication lag.

To address the above issues, VReplication uses the [tablet throttler](../../features/tablet-throttler/) mechanism to push back both reads and writes.
To help address the above issues, VReplication uses the [tablet throttler](../../features/tablet-throttler/) mechanism to push back both reads and writes.

### Target throttling
### Target Throttling

On the target side, VReplication wishes to consult the overall health of the target shard (there can be multiple shards to a VReplication workflow, and here we discuss the single shard at the end of a single VReplication stream). That shard may serve production traffic unrelated to VReplication. VReplication therefore consults the internal equivalent of `/throttler/check` when writing data to the shard's primary. This checks the replication lag on relevant replicas in the shard. The throttler will push back VReplication writes of both table-copy and changelog events.
On the target side, VReplication wishes to consult the overall health of the target shard (there can be multiple shards to a VReplication workflow, and here we discuss the single shard at the end of a single VReplication stream). That shard may serve production traffic unrelated to VReplication. VReplication therefore consults the internal equivalent of `/throttler/check` when writing data to the shard's primary. This checks the MySQL replication lag on relevant replicas in the shard. The throttler will delay the VReplication writes of both table-copy and changelog events until the shard's replication lag is under the defined threshold (1s by default).
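
You can inspect the same signal manually on the target shard's primary tablet. This is only a sketch; the host and web port are placeholders (15100 is the port used in the local examples):

```sh
# Ask the target primary's tablet throttler whether writes would currently be
# allowed; a 200 status code means writes may proceed, while other codes mean
# the throttler is pushing back or cannot evaluate the check.
curl -s -w "%{http_code}\n" "http://<target-primary-host>:15100/throttler/check"
```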

### Source throttling
### Source Throttling

On the source side, VReplication only affects the single MySQL server it reads from, and has no impact on the overall shard. VStreamer, the source endpoint of VReplication, consults the equivalent of `/throttler/check-self`, which looks for replication lag on the source host.

As long as `check-self` fails, VStreamer will not read table data, nor will it pull events from the changelog.
As long as `check-self` fails — meaning that the replication lag is not within the defined threshold (1s by default) — VStreamer will not read table data, nor will it pull events from the changelog.

### Impact of throttling
### Impact of Throttling

VReplication throttling is designed to give way to normal production traffic while operating in the background. Production traffic will see less contention. The downside is that VReplication can take longer to operate. Under high load in production VReplication may altogether stall, to resume when the load subsides.
VReplication throttling is designed to give preference to normal production traffic while operating in the background. Production traffic will see less contention. The downside is that VReplication can take longer to operate. Under high load in production VReplication may altogether stall, to resume later when the load subsides.

Throttling will push back VReplication on replication lag. On systems where replication lag is normally high this can bring VReplication down to a grinding halt. In such systems consider configuring `--throttle_threshold` to a value that agrees with your constraints. The default throttling threshold is at `1` second replication lag.
Throttling will push back VReplication based on replication lag. On systems where replication lag is normally high, this can prevent VReplication from being able to operate normally. In such systems, consider configuring `--throttle_threshold` to a value that agrees with your constraints. The default throttling threshold is `1` second of replication lag.
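
For example, a sketch of raising that threshold on the relevant vttablets (illustrative only; whether a higher value such as 5 seconds is acceptable depends entirely on your environment):

```sh
# Let VReplication keep working until replica lag exceeds 5s instead of the
# default 1s; the remaining vttablet flags are assumed unchanged.
vttablet --throttle_threshold 5s # ...plus your existing vttablet flags
```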
