
feat(split): add splitting_replicas while on_config_sync #653

Merged
merged 18 commits into from
Nov 25, 2020

Conversation

hycdong
Contributor

@hycdong hycdong commented Oct 28, 2020

Simple partition split process

  1. meta receives the client partition split request and changes the partition count (split: meta start partition split #286)
  2. replica notices the partition count change during on_config_sync
  3. parent partition creates the child partition (split: parent replica create child replica #291)
  4. parent prepares states for the child to learn (feat(split): parent replica prepare states #299)
  5. child partition asynchronously learns states from the parent (feat(split): child replica learn parent prepare list and checkpoint #309, feat(split): child replica apply private logs, in-memory mutations and catch up parent #319)
  6. child notifies the parent of catch-up (feat(split): child notify parent catch up #390)
  7. update the child group partition count (feat(split): add update_child_group_partition_count #645)
  8. register the child partition (feat(split): register child partition #391)
  9. update the parent group partition count (feat(split): parent group update partition count #654)

More partition split discussion can be found in issue #69 and the partition split design doc.
This PR implements the second step of the process above.

What this PR solves

The replica server periodically sends the on_config_sync RPC to the meta server to fetch the newest configuration; we use this RPC to transfer the split status from meta to replica. Simplified sketches of both sides are included below.

  • On the meta server, if any partition on this node is splitting, it is added to splitting_replicas in the response (function on_config_sync in server_state.cpp).
  • On the replica server, split_status is parsed from the response (function on_node_query_reply in replica_stub.cpp), and the possible split actions are then performed (function on_config_sync in replica_config.cpp and function control_split in replica_split_manager.cpp). This PR only adds the primary starting to split; more split actions will be added in further pull requests.
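
To make the meta-server side concrete, here is a minimal sketch of the idea behind filling splitting_replicas during config sync. All type and field names below are simplified stand-ins, not the actual rDSN definitions in server_state.cpp or the thrift IDL.

```cpp
// Hypothetical, simplified sketch of the meta-server side.
#include <cstdint>
#include <map>
#include <set>

enum class split_status { NOT_SPLIT, SPLITTING };

struct gpid
{
    int32_t app_id;
    int32_t partition_index;
    bool operator<(const gpid &r) const
    {
        return app_id != r.app_id ? app_id < r.app_id
                                  : partition_index < r.partition_index;
    }
};

struct app_state
{
    int32_t app_id;
    std::set<int32_t> splitting_partitions; // partition indices being split
};

struct configuration_query_by_node_response
{
    // only filled when at least one partition hosted on the node is splitting
    std::map<gpid, split_status> splitting_replicas;
};

// While building the config_sync response for one node, record every
// splitting partition hosted on that node, so the replica server can
// trigger its own split actions when it receives the response.
void fill_splitting_replicas(const std::map<int32_t, app_state> &apps,
                             const std::set<gpid> &partitions_on_node,
                             configuration_query_by_node_response &resp)
{
    for (const gpid &pid : partitions_on_node) {
        auto it = apps.find(pid.app_id);
        if (it == apps.end() || it->second.splitting_partitions.empty()) {
            continue; // app is not splitting: leave splitting_replicas unset
        }
        if (it->second.splitting_partitions.count(pid.partition_index) > 0) {
            resp.splitting_replicas[pid] = split_status::SPLITTING;
        }
    }
}
```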

This PR also adds related unit tests.
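
On the replica-stub side, the change has roughly the following shape: default each partition to NOT_SPLIT, look it up in splitting_replicas, and pass the result down to the replica's on_config_sync. This is a simplified stand-in, not the real signatures in replica_stub.cpp; the review thread below also discusses carrying the status inside the request struct instead of as an extra argument.

```cpp
// Hypothetical, simplified sketch of the replica-stub side; the real code is
// in replica_stub::on_node_query_reply and replica::on_config_sync.
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

enum class split_status { NOT_SPLIT, SPLITTING };
using gpid = std::pair<int32_t, int32_t>; // (app_id, partition_index)

struct partition_configuration { gpid pid; };

struct replica
{
    // In this PR only the primary reacts by starting a split; later PRs add
    // the remaining actions (see control_split in replica_split_manager.cpp).
    void on_config_sync(const partition_configuration &config, split_status meta_split_status)
    {
        (void)config; // unused in this sketch
        if (meta_split_status == split_status::SPLITTING) {
            // trigger split handling for this partition
        }
    }
};

// Scatter one partition of the config_sync response to its local replica.
void on_node_query_reply_scatter(const std::map<gpid, split_status> &splitting_replicas,
                                 const partition_configuration &config,
                                 const std::map<gpid, std::shared_ptr<replica>> &replicas)
{
    // default to NOT_SPLIT; only partitions listed by the meta server are splitting
    split_status meta_split_status = split_status::NOT_SPLIT;
    auto sit = splitting_replicas.find(config.pid);
    if (sit != splitting_replicas.end()) {
        meta_split_status = sit->second;
    }

    auto rit = replicas.find(config.pid);
    if (rit != replicas.end()) {
        rit->second->on_config_sync(config, meta_split_status);
    }
}
```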

@hycdong hycdong marked this pull request as ready for review October 29, 2020 01:26
@@ -1331,11 +1334,20 @@ void replica_stub::on_node_query_reply(error_code err,
     }

     for (auto it = resp.partitions.begin(); it != resp.partitions.end(); ++it) {
+        auto meta_split_status = split_status::NOT_SPLIT;
Contributor

Maybe you can calculate meta_split_status on the meta server, so we don't need to calculate it on each replica server.

Contributor Author

If an app is not splitting, splitting_replicas in configuration_query_by_node_response won't be set. The config_sync RPC happens every 30 seconds for all replica servers, and partition split is not a frequent action, so in most cases leaving the field unset reduces the network cost.
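
In other words, the response field is only attached when there is something to report, roughly like this (assuming thrift-generated optional-field setters; the exact code may differ):

```cpp
// Hypothetical fragment: only set splitting_replicas when non-empty, so the
// periodic config_sync responses stay small in the common (non-splitting) case.
if (!splitting_replicas.empty()) {
    response.__set_splitting_replicas(splitting_replicas);
}
```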

acelyc111
acelyc111 previously approved these changes Nov 18, 2020
 {
     replica_ptr replica = get_replica(req.config.pid);
     if (replica != nullptr) {
-        replica->on_config_sync(req.info, req.config);
+        replica->on_config_sync(req.info, req.config, meta_split_status);
Contributor

@neverchanje neverchanje Nov 19, 2020

It's ugly to add an argument just for one requirement; it makes the function's responsibility obscure. I'd prefer adding meta_split_status to configuration_update_request, even though that's ugly too.

Think about it: if we keep doing this, we'll end up with dozens of parameters in a function. And I admit this code is already a mess, so we shouldn't make it worse.
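
For illustration, the two shapes under discussion look roughly like this (the field placement is only indicative; the exact struct members differ):

```cpp
// Shape in this diff: pass the split status as an extra argument.
replica->on_config_sync(req.info, req.config, meta_split_status);

// Suggested shape (indicative only): put the status into the request struct
// instead of growing the parameter list, e.g.
//   struct configuration_update_request {
//       ... existing fields ...
//       split_status meta_split_status; // new field
//   };
// so callers hand over the request without an extra argument.
```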

Contributor Author

Done
