feat(dup): implement pegasus_mutation_duplicator #399
Conversation
The replication flow in this PR starts from pegasus_mutation_duplicator's duplicate function: it processes each mutation_tuple_set, stores each mutation's dup RPC in the _inflight map, then calls send for each in turn, which goes through the pegasus server client's async_duplicate and the pegasus client's duplicate to ship the write to the remote cluster; the reply after sending is handled by on_duplicate_reply. Is my understanding correct?
Correct.
if (tc == dsn::apps::RPC_RRDB_RRDB_REMOVE) {
    dsn::blob raw_key;
    dsn::from_blob_to_thrift(data, raw_key);
    return pegasus_key_hash(raw_key);
Why do different operations use different hash functions here?
Because pegasus_key_hash takes hashkey + sortkey as its argument, while pegasus_hash_key_hash takes only the hashkey.
What problem does this PR solve?
Implement the dsn::replication::mutation_duplicator interface to support sending mutations to the remote Pegasus cluster.
Our goal is to simplify the "mutation sending" process: reuse the existing client lib, duplicate each mutation independently, and leave optimization to the future. The performance may therefore be poor, but the overall cost is acceptable as long as the design stays easy to understand.
include/dsn/dist/replication/mutation_duplicator.h
Docs here might be helpful to understand this PR: https://pegasus-kv.github.io/2019/06/09/duplication-design.html#%E6%B5%81%E7%A8%8B
What is changed and how it works?
The design inside the box is: pegasus_mutation_duplicator receives a batch of mutations to be duplicated to the remote endpoint. After sending the entire batch, it calls cb to request the next batch. This batch-by-batch design preserves the data consistency of both clusters.
private: std::deque<duplicate_rpc> _inflights;
We can improve this design, with slightly increased complexity, by dividing the mutations into groups isolated by their key hash. This increases concurrency without compromising consistency.
duplicate_rpc: this design is extensible for adding and removing write RPCs in the future, because the real content is hidden under task_code and raw_message; the duplicator is unaware of what the write is. One corner case: when the write type is RPC_RRDB_RRDB_DUPLICATE, the duplicator should ignore the write, since it was duplicated from another cluster. Duplication is a one-way edge between two nodes; if the administrator wants a two-way duplication between A and B, they should add_dup on A->B as well as B->A.
Check List
Tests
New perf-counters
replica*app.pegasus*dup_shipped_ops@<gpid>
replica*app.pegasus*dup_failed_shipping_ops@<gpid>