MyRocks Replication Optimizations

title: MySQL · myrocks · myrocks replication optimizations author: 张远

Overview

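MyRocks still uses MySQL's original binlog-based replication. Because MyRocks does not currently support gap locks, replicating with statement-format binlog can leave the master and slave inconsistent, so MyRocks recommends setting the binlog format to row for replication. MyRocks also adds several effective replication optimizations at the RocksDB engine layer, such as skip unique check and read free replication.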
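As a minimal sketch of the row-format recommendation, the statements below check and change the binlog format on the master. They use only the standard MySQL `binlog_format` variable. To make the setting survive a restart, it would also need to go into the server configuration file.

```sql
-- Check which binlog format the master currently uses.
SHOW VARIABLES LIKE 'binlog_format';

-- Switch to row-based logging. Sessions opened after this
-- statement pick up the new global value.
SET GLOBAL binlog_format = 'ROW';
```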

skip unique check

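Skip unique check skips uniqueness checking. When it is enabled, you must make sure the data being written cannot violate a unique constraint. In a normal master-slave setup the slave is read-only, and every write on the master has already passed the uniqueness check before reaching the binlog, so in theory the slave does not need to check uniqueness again when it applies those binlog events. Under that assumption, enabling skip unique check on the slave removes the cost of the uniqueness check while still keeping master and slave data consistent.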

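The following parameters control skip unique check:

rocksdb_skip_unique_check
Controls whether RocksDB skips the uniqueness check. It applies to both the replication SQL thread and ordinary user connections. Enabling it is generally not recommended.
rocksdb_skip_unique_check_tables
Specifies the tables for which the uniqueness check is skipped. It applies only to the replication SQL thread.
unique_check_lag_threshold
The uniqueness check is skipped only once slave lag exceeds this value.
unique_check_lag_reset_threshold
The uniqueness check is no longer skipped once slave lag drops below this value.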

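On a slave, we normally set only the following three parameters (once rocksdb_skip_unique_check is set to true, the uniqueness check is skipped regardless of how these three are set):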

rocksdb_skip_unique_check_tables
unique_check_lag_threshold
unique_check_lag_reset_threshold
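A hedged sketch of how these three parameters might be set on a slave. The variable names come from the list above. The table pattern and both threshold values are hypothetical placeholders, and the sketch assumes that your build lets these variables be changed at runtime and that the thresholds are measured in seconds of lag. If either assumption does not hold, set them in the server configuration file instead.

```sql
-- Run on the slave. All values are illustrative, not recommendations.

-- Hypothetical pattern: skip the check only for tables whose names match it.
SET GLOBAL rocksdb_skip_unique_check_tables = 'orders.*';

-- Start skipping the uniqueness check once lag exceeds this value (assumed seconds).
SET GLOBAL unique_check_lag_threshold = 10;

-- Resume the uniqueness check once lag falls below this value (assumed seconds).
SET GLOBAL unique_check_lag_reset_threshold = 5;

-- Confirm what the server actually accepted.
SHOW GLOBAL VARIABLES LIKE '%unique_check%';
```

Keeping the reset threshold below the lag threshold gives the switch some hysteresis, so the slave does not flip between the two modes when lag hovers around a single value.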

With skip unique check enabled on the slave, there is one more optimization: writes do not need to take locks, which removes the locking overhead (get_blind_write_batch).

read free replication

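The idea behind read free replication comes from TokuDB. TokuDB is built on Fractal Trees: data is first written to the message buffers of internal nodes and only later applied to the leaf nodes. This deferred-write behavior lends itself to read free replication.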

Read free replication requires row-format binlog, whose events carry the before image and after image of each row. Read free replication uses the before image to update data directly, which saves one row read.

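Before read free replication, the slave's replication thread worked like this:

delete
Look up the row using the row image. If the row does not exist, replication stops with an error; if it exists, the delete proceeds.
update
The update binlog event carries a before image and an after image. The row is first looked up using the before image. If it does not exist, replication stops with an error; if it exists, the row is updated from the after image.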

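With read free replication, the slave's replication thread works like this:

delete
Delete directly using the row image, without checking whether the row exists.
update
Update the data directly from the after image in the update binlog event, without checking whether the row exists. If the update changes a unique column, however, the row still has to be read to check uniqueness.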

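For insert, read free replication has no practical effect.

insert
An insert still has to check uniqueness.

So for replication to be truly read free, meaning the replication SQL thread only writes and never reads rows, read free replication has to be used together with skip unique check.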
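A rough sketch of enabling both features together on a slave. Note that `rocksdb_read_free_rpl_tables` is not named in this article; it is the variable MyRocks builds commonly use to select tables for read free replication, and its name and accepted values differ between versions, so treat it as an assumption and check your server's variable list first. The table pattern is a hypothetical placeholder.

```sql
-- Run on the slave. Verify that both variables exist in your build first.
SHOW GLOBAL VARIABLES LIKE 'rocksdb_read_free_rpl%';
SHOW GLOBAL VARIABLES LIKE 'rocksdb_skip_unique_check%';

-- Assumed variable name (not from the article): tables whose row events
-- are applied without first reading the existing row.
SET GLOBAL rocksdb_read_free_rpl_tables = 'orders.*';

-- Skip the uniqueness check for the same (hypothetical) tables, so that
-- inserts and unique-column updates do not trigger reads either.
SET GLOBAL rocksdb_skip_unique_check_tables = 'orders.*';
```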

This raises a question: can InnoDB do read free replication?

Unlike TokuDB and RocksDB, InnoDB does not defer writes, and an InnoDB update must read the old row. InnoDB therefore cannot do read free replication.

Risks of read free replication

Read free replication can only be used safely under certain preconditions:

The binlog format is row.
The slave that replication runs on must be read-only.
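One way to enforce the read-only precondition is sketched below with standard MySQL variables. `read_only` does not block the replication SQL thread, and it also does not block accounts with the SUPER privilege, which is why the sketch additionally sets `super_read_only`. That second variable is assumed to be available in your build.

```sql
-- Run on the slave.

-- Reject writes from ordinary client connections.
-- The replication SQL thread is not affected by this setting.
SET GLOBAL read_only = ON;

-- Also reject writes from accounts with SUPER (assumes the build has this variable).
SET GLOBAL super_read_only = ON;

-- Confirm both settings.
SHOW GLOBAL VARIABLES LIKE '%read_only';
```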

Two examples of problems caused by using read free replication in violation of these rules are reproduced below.

Secondary index is missing rows

m:
create table t (id int primary key, i1 int, i2 int, value int, index (i1), index (i2)) engine=rocksdb;
insert into t values (1,1,1,1),(2,2,2,2),(3,3,3,3);

s:
delete from t where id <= 2;

m:
update t set i2=100, value=100 where id=1;

s:
mysql> select count(*) from t force index(primary);
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from t force index(i1);
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from t force index(i2);
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql> select * from t where id=1;
+----+------+------+-------+
| id | i1   | i2   | value |
+----+------+------+-------+
|  1 |    1 |  100 |   100 |
+----+------+------+-------+
1 row in set (0.00 sec)

mysql> select i1 from t where i1=1;
Empty set (0.00 sec)

mysql> select i2 from t where i2=100;
+------+
| i2   |
+------+
|  100 |
+------+
1 row in set (0.00 sec)

Secondary index has extra rows

M:
create table t (id int primary key, i1 int, i2 int, value int, index (i1), index (i2)) engine=rocksdb;
insert into t values (1,1,1,1),(2,2,2,2),(3,3,3,3);

S:
update t set i1=100 where id=1;

M:
delete from t where id=1;

S:
mysql> select count(*) from t force index(primary);
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from t force index(i1);
+----------+
| count(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from t force index(i2);
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)

mysql> select i1 from t where i1=100;
+------+
| i1   |
+------+
|  100 |
+------+
1 row in set (0.00 sec)
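Both examples are detected the same way: by counting rows through each index and comparing the results. A minimal sketch that gathers the three counts for table `t` into a single row is shown below. The column aliases are illustrative.

```sql
-- Run on the slave. Each subquery forces a different index,
-- so a mismatch means the indexes have diverged.
SELECT
  (SELECT COUNT(*) FROM t FORCE INDEX (PRIMARY)) AS pk_rows,
  (SELECT COUNT(*) FROM t FORCE INDEX (i1))      AS i1_rows,
  (SELECT COUNT(*) FROM t FORCE INDEX (i2))      AS i2_rows;
```

Matching counts do not prove that the indexes agree, because a missing entry and an extra entry can cancel out, so this is only a quick first check.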

Applications of read free replication

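This article describes the use cases for TokuDB read free replication, and they apply equally to RocksDB read free replication. In short, read free replication greatly improves replication efficiency. Combined with RocksDB's efficient compression and low write amplification, it makes MyRocks well suited to scaling out read-only replicas, or to serving as a standby instance for MySQL instances running other engines.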

Summary

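MyRocks has made useful optimizations to replication, but they are not a silver bullet. While reaping their benefits, we must also understand their risks and strictly observe their preconditions, so that we get both safety and high performance.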
