Skip to content
This repository has been archived by the owner on Jun 23, 2022. It is now read-only.

feat: restrict the replication factor while creating app #963

Merged
merged 9 commits into from
Dec 15, 2021

Conversation

empiredan
Copy link
Contributor

@empiredan empiredan commented Nov 10, 2021

While creating app on the side of the meta server, currently the replication factor(i.e. replica count) is simply checked to just keep it positive. However, this is not enough since a wrong replication factor will result in later failures.

For example, as we've supported single-replica in #932, creating an app with 3 replicas by shell in a cluster that just includes a single node will be blocked for long since there're not enough nodes for the extra replicas.

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. Later then we decide to upgrade this cluster by HA(i.e. high-availability, upgraded without stopping service of the cluster) to a higher version. The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Also, If there are 2 alive replica servers (with 1 dead), creating an app with 3 replicas will end up with failure. Creating an app in a 3-node cluster with 5 replicas is nonsense. An excessive replication factor is unreasonable for a storage system, too.

Based on the problems illustrated above, I think the replica factor should be restricted further by the following rules.

Replication factor should not exceed the number of alive replica servers

It's obvious that there should be enough replica servers to put the replicas. Once there're extra replicas that cannot be placed anywhere, creating app will fail.

Replication factor has a minimum

Actually creating an app with a single replica in a 3-node cluster is eccentric. Therefore, A minimum value should be set for the replication factor. For example, the minimum for a 3-node cluster can be set to 3. This minimum replication factor will be a configuration for the cluster which is named as min_allowed_replica_count.

Replication factor has a maximum

Similarly, an excessive replication factor is invalid for a cluster. It'll be restricted by another configuration, named max_allowed_replica_count. Any attempt that sets an app with a replication factor exceeding max_allowed_replica_count will be rejected. Typically, this value is set to 3, since 4 or more replicas are unusual.

@empiredan empiredan changed the title fix: restrict the replica count while creating app fix: restrict the replication factor while creating app Nov 10, 2021
@hycdong
Copy link
Contributor

hycdong commented Nov 11, 2021

Thanks for your pull request for adding replica count check while creating table~
In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0.
Besides, I add some comments for this pr.

src/meta/server_state.cpp Outdated Show resolved Hide resolved
src/meta/server_state.cpp Outdated Show resolved Hide resolved
src/meta/server_state.cpp Outdated Show resolved Hide resolved
src/meta/server_state.cpp Show resolved Hide resolved
std::string hint_message;

zauto_read_lock l(_lock);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
    zauto_read_lock l(_lock);

    auto alive_node_count = get_alive_node_count_unlocked();
}

return is_valid;
}

int32_t server_state::get_alive_node_count_unlocked()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uint32_t server_state::get_alive_node_count_unlocked() const

@empiredan
Copy link
Contributor Author

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.

Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

@hycdong
Copy link
Contributor

hycdong commented Nov 12, 2021

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.

Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

@empiredan
Copy link
Contributor Author

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.
Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

A good idea ! How about set max_allowed_replica_count as 3 ? Can this value satisfy all of your scenarios ?

@hycdong
Copy link
Contributor

hycdong commented Nov 15, 2021

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.
Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

A good idea ! How about set max_allowed_replica_count as 3 ? Can this value satisfy all of your scenarios ?

In most our scenarios, replica_count is 3. As our discussion above, max_allowed_replica_count should be greater than the normal replica_count. If we set max_allowed_replica_count as 3, then if we would like to provide 5 replica, which is also a reasonable replica count, we should update code. As a result, I suggest seting max_allowed_replica_count as 10, or even greater than 10.

@empiredan
Copy link
Contributor Author

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.
Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

A good idea ! How about set max_allowed_replica_count as 3 ? Can this value satisfy all of your scenarios ?

In most our scenarios, replica_count is 3. As our discussion above, max_allowed_replica_count should be greater than the normal replica_count. If we set max_allowed_replica_count as 3, then if we would like to provide 5 replica, which is also a reasonable replica count, we should update code. As a result, I suggest seting max_allowed_replica_count as 10, or even greater than 10.

Ok, how about setting max_allowed_replica_count as 15 (an odd number, though there's no particular reason) ?

@hycdong
Copy link
Contributor

hycdong commented Nov 16, 2021

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.
Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

A good idea ! How about set max_allowed_replica_count as 3 ? Can this value satisfy all of your scenarios ?

In most our scenarios, replica_count is 3. As our discussion above, max_allowed_replica_count should be greater than the normal replica_count. If we set max_allowed_replica_count as 3, then if we would like to provide 5 replica, which is also a reasonable replica count, we should update code. As a result, I suggest seting max_allowed_replica_count as 10, or even greater than 10.

Ok, how about setting max_allowed_replica_count as 15 (an odd number, though there's no particular reason) ?

I think it's okay to set max_allowed_replica_count as 15~

src/meta/test/meta_app_operation_test.cpp Show resolved Hide resolved
}

FLAGS_min_allowed_replica_count = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FLAGS_min_allowed_replica_count = reserved_min_allowed_replica_count;

levy5307
levy5307 previously approved these changes Nov 23, 2021
@Smityz
Copy link
Contributor

Smityz commented Nov 24, 2021

LGTM

src/meta/server_state.cpp Outdated Show resolved Hide resolved
src/meta/server_state.cpp Outdated Show resolved Hide resolved
src/meta/test/meta_app_operation_test.cpp Show resolved Hide resolved
hycdong
hycdong previously approved these changes Nov 26, 2021
Copy link
Contributor

@hycdong hycdong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM~ By the way, I suggest using 'feat' rather than 'fix' to describe this pull request, because this pull request actually provides feature~

@empiredan
Copy link
Contributor Author

LGTM~ By the way, I suggest using 'feat' rather than 'fix' to describe this pull request, because this pull request actually provides feature~

Sure, I'll change the type

@empiredan empiredan changed the title fix: restrict the replication factor while creating app feat: restrict the replication factor while creating app Nov 26, 2021
FLAGS_min_allowed_replica_count);
return false;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may don't need to split two if to log the hit_message, consider the following whether more simple:

 if (max_replica_count > max_allowed_replica_count || max_replica_count < FLAGS_min_allowed_replica_count) {
        hint_message = fmt::format("requested replica count({}) must be range within [min={} max={}] ",
                                   max_replica_count,
                                   FLAGS_min_allowed_replica_count
                                   max_allowed_replica_count);
        return false;
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll merge the both conditions. A range seems more specific.

@@ -56,6 +56,17 @@ using namespace dsn;
namespace dsn {
namespace replication {

const int32_t max_allowed_replica_count = 15;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the max is not set dynamic variable like min?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially the max_allowed_replica_count was set configurable. Then as discussed with @hycdong above, max_allowed_replica_count has been changed as a const value, not a configurable one, since in normal cases it will not be updated.

Copy link
Contributor

@Smityz Smityz Nov 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, the const int should be named like k... refer to google code style

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I'd considered following google code style such as the camel-case naming of constant which begins with 'k'. On the other hand, however I'd found many constants are named in upper or lower snake case before I turned to follow the existing custom. I think we should gradually unify the naming style.

@acelyc111
Copy link
Member

acelyc111 commented Nov 27, 2021

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.
Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

A good idea ! How about set max_allowed_replica_count as 3 ? Can this value satisfy all of your scenarios ?

In most our scenarios, replica_count is 3. As our discussion above, max_allowed_replica_count should be greater than the normal replica_count. If we set max_allowed_replica_count as 3, then if we would like to provide 5 replica, which is also a reasonable replica count, we should update code. As a result, I suggest seting max_allowed_replica_count as 10, or even greater than 10.

Ok, how about setting max_allowed_replica_count as 15 (an odd number, though there's no particular reason) ?

I think it's okay to set max_allowed_replica_count as 15~

@hycdong @empiredan I do consider a const and unoptional value of 15 is too large, any new user could create a table with 15 replicas at most without any limitation, it's harmful for the system. If most cases are 3 replicas, why not make it optional, in the rare case if any body want to create a table with some larger replication factor, they can set a larger RF and reboot the metaserver.

@hycdong
Copy link
Contributor

hycdong commented Nov 29, 2021

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.
Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

A good idea ! How about set max_allowed_replica_count as 3 ? Can this value satisfy all of your scenarios ?

In most our scenarios, replica_count is 3. As our discussion above, max_allowed_replica_count should be greater than the normal replica_count. If we set max_allowed_replica_count as 3, then if we would like to provide 5 replica, which is also a reasonable replica count, we should update code. As a result, I suggest seting max_allowed_replica_count as 10, or even greater than 10.

Ok, how about setting max_allowed_replica_count as 15 (an odd number, though there's no particular reason) ?

I think it's okay to set max_allowed_replica_count as 15~

@hycdong @empiredan I do consider a const and unoptional value of 15 is too large, any new user could create a table with 15 replicas at most without any limitation, it's harmful for the system. If most cases are 3 replicas, why not make it optional, in the rare case if any body want to create a table with some larger replication factor, they can set a larger RF and reboot the metaserver.

As I mentioned before, I think system should not restrict its table max replica count for scalability, but as empiredan's option, it can avoid user unreasonable replica count. If we set max_allowed_replica_count as 3, it somehow affects scalability. It is okay for user to create table with 4 replicas and 5 replicas and so on. Besides, 4 and 5 are not user unreasonable replica count. So I prefer seting it as a bigger value.
And there are two reasons I recommend it as a const. Firstly, it is easiler to check validation max_allowed_replica_count and min_allowed_replica_count in validation if one of them is a const value. Secondly, as the max_allowed_replica_count is used to avoid unreasonable replica count, we will hardly update it, it is okay to be a const value, because its value is unreasonable enough.
That is my opinion above, I think we can disucuss the usage of max_allowed_replica_count to determine seting it as an optional value or a const value.

@empiredan
Copy link
Contributor Author

empiredan commented Nov 29, 2021

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.
Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

A good idea ! How about set max_allowed_replica_count as 3 ? Can this value satisfy all of your scenarios ?

In most our scenarios, replica_count is 3. As our discussion above, max_allowed_replica_count should be greater than the normal replica_count. If we set max_allowed_replica_count as 3, then if we would like to provide 5 replica, which is also a reasonable replica count, we should update code. As a result, I suggest seting max_allowed_replica_count as 10, or even greater than 10.

Ok, how about setting max_allowed_replica_count as 15 (an odd number, though there's no particular reason) ?

I think it's okay to set max_allowed_replica_count as 15~

@hycdong @empiredan I do consider a const and unoptional value of 15 is too large, any new user could create a table with 15 replicas at most without any limitation, it's harmful for the system. If most cases are 3 replicas, why not make it optional, in the rare case if any body want to create a table with some larger replication factor, they can set a larger RF and reboot the metaserver.

As I mentioned before, I think system should not restrict its table max replica count for scalability, but as empiredan's option, it can avoid user unreasonable replica count. If we set max_allowed_replica_count as 3, it somehow affects scalability. It is okay for user to create table with 4 replicas and 5 replicas and so on. Besides, 4 and 5 are not user unreasonable replica count. So I prefer seting it as a bigger value. And there are two reasons I recommend it as a const. Firstly, it is easiler to check validation max_allowed_replica_count and min_allowed_replica_count in validation if one of them is a const value. Secondly, as the max_allowed_replica_count is used to avoid unreasonable replica count, we will hardly update it, it is okay to be a const value, because its value is unreasonable enough. That is my opinion above, I think we can disucuss the usage of max_allowed_replica_count to determine seting it as an optional value or a const value.

@hycdong @acelyc111 Or, can we turn to refer to the implementation of the system that has the similar problem with us ?

In apache/kudu@b39491d, Todd has explained why kudu uses max_num_replicas to restrict the replication factor. Now max_num_replicas is set to 7, since higher RF hasn't been proved to be effective by some test cases. A well-intentioned user may try to set replication factor to a very high number to get a table replicated onto every server in their cluster.

Therefore, should we make max_allowed_replica_count configurable and its default value a little higher than 3, say, 5 ?

And also, RF is restricted by min_num_repilcas . See apache/kudu@6cb1548 for details.

As for the related code, please see the following quotes:

DEFINE_int32(max_num_replicas, 7,
             "Maximum number of replicas that may be specified for a table.");
// Tag as unsafe since we have done very limited testing of higher than 5 replicas.
TAG_FLAG(max_num_replicas, unsafe);
TAG_FLAG(max_num_replicas, runtime);

DEFINE_int32(min_num_replicas, 1,
             "Minimum number of replicas that may be specified when creating "
             "a table: this is to enforce the minimum replication factor for "
             "tables created in a Kudu cluster. For example, setting this flag "
             "to 3 enforces every new table to have at least 3 replicas for "
             "each of its tablets, so there cannot be a data loss when a "
             "single tablet server fails irrecoverably.");
TAG_FLAG(min_num_replicas, advanced);
TAG_FLAG(min_num_replicas, runtime);

...

template<typename RespClass>
Status CatalogManager::ValidateNumberReplicas(const std::string& normalized_table_name,
                                              RespClass* resp, ValidateType type,
                                              const boost::optional<int>& partitions_count,
                                              int num_replicas) {
  if (num_replicas > FLAGS_max_num_replicas) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: maximum allowed replication "
                   "factor is $1 (controlled by --max_num_replicas)",
                   num_replicas, FLAGS_max_num_replicas)),
        resp, MasterErrorPB::REPLICATION_FACTOR_TOO_HIGH);
  }
  if (num_replicas < FLAGS_min_num_replicas) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: minimum allowed replication "
                   "factor is $1 (controlled by --min_num_replicas)",
            num_replicas, FLAGS_min_num_replicas)),
        resp, MasterErrorPB::ILLEGAL_REPLICATION_FACTOR);
  }
  // Reject create/alter table with even replication factors, unless master flag
  // allow_unsafe_replication_factor is on.
  if (num_replicas % 2 == 0 && !FLAGS_allow_unsafe_replication_factor) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: replication factor must be odd",
                   num_replicas)),
        resp, MasterErrorPB::EVEN_REPLICATION_FACTOR);
  }
...
}

@hycdong
Copy link
Contributor

hycdong commented Dec 2, 2021

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.
Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

A good idea ! How about set max_allowed_replica_count as 3 ? Can this value satisfy all of your scenarios ?

In most our scenarios, replica_count is 3. As our discussion above, max_allowed_replica_count should be greater than the normal replica_count. If we set max_allowed_replica_count as 3, then if we would like to provide 5 replica, which is also a reasonable replica count, we should update code. As a result, I suggest seting max_allowed_replica_count as 10, or even greater than 10.

Ok, how about setting max_allowed_replica_count as 15 (an odd number, though there's no particular reason) ?

I think it's okay to set max_allowed_replica_count as 15~

@hycdong @empiredan I do consider a const and unoptional value of 15 is too large, any new user could create a table with 15 replicas at most without any limitation, it's harmful for the system. If most cases are 3 replicas, why not make it optional, in the rare case if any body want to create a table with some larger replication factor, they can set a larger RF and reboot the metaserver.

As I mentioned before, I think system should not restrict its table max replica count for scalability, but as empiredan's option, it can avoid user unreasonable replica count. If we set max_allowed_replica_count as 3, it somehow affects scalability. It is okay for user to create table with 4 replicas and 5 replicas and so on. Besides, 4 and 5 are not user unreasonable replica count. So I prefer seting it as a bigger value. And there are two reasons I recommend it as a const. Firstly, it is easiler to check validation max_allowed_replica_count and min_allowed_replica_count in validation if one of them is a const value. Secondly, as the max_allowed_replica_count is used to avoid unreasonable replica count, we will hardly update it, it is okay to be a const value, because its value is unreasonable enough. That is my opinion above, I think we can disucuss the usage of max_allowed_replica_count to determine seting it as an optional value or a const value.

@hycdong @acelyc111 Or, can we turn to refer to the implementation of the system that has the similar problem with us ?

In apache/kudu@b39491d, Todd has explained why kudu uses max_num_replicas to restrict the replication factor. Now max_num_replicas is set to 7, since higher RF hasn't been proved to be effective by some test cases. A well-intentioned user may try to set replication factor to a very high number to get a table replicated onto every server in their cluster.

Therefore, should we make max_allowed_replica_count configurable and its default value a little higher than 3, say, 5 ?

And also, RF is restricted by min_num_repilcas . See apache/kudu@6cb1548 for details.

As for the related code, please see the following quotes:

DEFINE_int32(max_num_replicas, 7,
             "Maximum number of replicas that may be specified for a table.");
// Tag as unsafe since we have done very limited testing of higher than 5 replicas.
TAG_FLAG(max_num_replicas, unsafe);
TAG_FLAG(max_num_replicas, runtime);

DEFINE_int32(min_num_replicas, 1,
             "Minimum number of replicas that may be specified when creating "
             "a table: this is to enforce the minimum replication factor for "
             "tables created in a Kudu cluster. For example, setting this flag "
             "to 3 enforces every new table to have at least 3 replicas for "
             "each of its tablets, so there cannot be a data loss when a "
             "single tablet server fails irrecoverably.");
TAG_FLAG(min_num_replicas, advanced);
TAG_FLAG(min_num_replicas, runtime);

...

template<typename RespClass>
Status CatalogManager::ValidateNumberReplicas(const std::string& normalized_table_name,
                                              RespClass* resp, ValidateType type,
                                              const boost::optional<int>& partitions_count,
                                              int num_replicas) {
  if (num_replicas > FLAGS_max_num_replicas) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: maximum allowed replication "
                   "factor is $1 (controlled by --max_num_replicas)",
                   num_replicas, FLAGS_max_num_replicas)),
        resp, MasterErrorPB::REPLICATION_FACTOR_TOO_HIGH);
  }
  if (num_replicas < FLAGS_min_num_replicas) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: minimum allowed replication "
                   "factor is $1 (controlled by --min_num_replicas)",
            num_replicas, FLAGS_min_num_replicas)),
        resp, MasterErrorPB::ILLEGAL_REPLICATION_FACTOR);
  }
  // Reject create/alter table with even replication factors, unless master flag
  // allow_unsafe_replication_factor is on.
  if (num_replicas % 2 == 0 && !FLAGS_allow_unsafe_replication_factor) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: replication factor must be odd",
                   num_replicas)),
        resp, MasterErrorPB::EVEN_REPLICATION_FACTOR);
  }
...
}

Okay, I think if we set max_num_replica as 5, we should add an option for it, user can update it through config. In this case, it is the unreasonable replica count, it restirct the max replica count through a cluster-level configuration. If user would like to create table with higher replica_count, it should update config.

@acelyc111
Copy link
Member

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.
Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

A good idea ! How about set max_allowed_replica_count as 3 ? Can this value satisfy all of your scenarios ?

In most our scenarios, replica_count is 3. As our discussion above, max_allowed_replica_count should be greater than the normal replica_count. If we set max_allowed_replica_count as 3, then if we would like to provide 5 replica, which is also a reasonable replica count, we should update code. As a result, I suggest seting max_allowed_replica_count as 10, or even greater than 10.

Ok, how about setting max_allowed_replica_count as 15 (an odd number, though there's no particular reason) ?

I think it's okay to set max_allowed_replica_count as 15~

@hycdong @empiredan I do consider a const and unoptional value of 15 is too large, any new user could create a table with 15 replicas at most without any limitation, it's harmful for the system. If most cases are 3 replicas, why not make it optional, in the rare case if any body want to create a table with some larger replication factor, they can set a larger RF and reboot the metaserver.

As I mentioned before, I think system should not restrict its table max replica count for scalability, but as empiredan's option, it can avoid user unreasonable replica count. If we set max_allowed_replica_count as 3, it somehow affects scalability. It is okay for user to create table with 4 replicas and 5 replicas and so on. Besides, 4 and 5 are not user unreasonable replica count. So I prefer seting it as a bigger value. And there are two reasons I recommend it as a const. Firstly, it is easiler to check validation max_allowed_replica_count and min_allowed_replica_count in validation if one of them is a const value. Secondly, as the max_allowed_replica_count is used to avoid unreasonable replica count, we will hardly update it, it is okay to be a const value, because its value is unreasonable enough. That is my opinion above, I think we can disucuss the usage of max_allowed_replica_count to determine seting it as an optional value or a const value.

@hycdong @acelyc111 Or, can we turn to refer to the implementation of the system that has the similar problem with us ?
In apache/kudu@b39491d, Todd has explained why kudu uses max_num_replicas to restrict the replication factor. Now max_num_replicas is set to 7, since higher RF hasn't been proved to be effective by some test cases. A well-intentioned user may try to set replication factor to a very high number to get a table replicated onto every server in their cluster.
Therefore, should we make max_allowed_replica_count configurable and its default value a little higher than 3, say, 5 ?
And also, RF is restricted by min_num_repilcas . See apache/kudu@6cb1548 for details.
As for the related code, please see the following quotes:

DEFINE_int32(max_num_replicas, 7,
             "Maximum number of replicas that may be specified for a table.");
// Tag as unsafe since we have done very limited testing of higher than 5 replicas.
TAG_FLAG(max_num_replicas, unsafe);
TAG_FLAG(max_num_replicas, runtime);

DEFINE_int32(min_num_replicas, 1,
             "Minimum number of replicas that may be specified when creating "
             "a table: this is to enforce the minimum replication factor for "
             "tables created in a Kudu cluster. For example, setting this flag "
             "to 3 enforces every new table to have at least 3 replicas for "
             "each of its tablets, so there cannot be a data loss when a "
             "single tablet server fails irrecoverably.");
TAG_FLAG(min_num_replicas, advanced);
TAG_FLAG(min_num_replicas, runtime);

...

template<typename RespClass>
Status CatalogManager::ValidateNumberReplicas(const std::string& normalized_table_name,
                                              RespClass* resp, ValidateType type,
                                              const boost::optional<int>& partitions_count,
                                              int num_replicas) {
  if (num_replicas > FLAGS_max_num_replicas) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: maximum allowed replication "
                   "factor is $1 (controlled by --max_num_replicas)",
                   num_replicas, FLAGS_max_num_replicas)),
        resp, MasterErrorPB::REPLICATION_FACTOR_TOO_HIGH);
  }
  if (num_replicas < FLAGS_min_num_replicas) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: minimum allowed replication "
                   "factor is $1 (controlled by --min_num_replicas)",
            num_replicas, FLAGS_min_num_replicas)),
        resp, MasterErrorPB::ILLEGAL_REPLICATION_FACTOR);
  }
  // Reject create/alter table with even replication factors, unless master flag
  // allow_unsafe_replication_factor is on.
  if (num_replicas % 2 == 0 && !FLAGS_allow_unsafe_replication_factor) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: replication factor must be odd",
                   num_replicas)),
        resp, MasterErrorPB::EVEN_REPLICATION_FACTOR);
  }
...
}

Okay, I think if we set max_num_replica as 5, we should add an option for it, user can update it through config. In this case, it is the unreasonable replica count, it restirct the max replica count through a cluster-level configuration. If user would like to create table with higher replica_count, it should update config.

+1

@empiredan
Copy link
Contributor Author

Thanks for your pull request for adding replica count check while creating table~ In this pull request, you adds three rules for replica count:

  1. replica count should not exceed alive node count
  2. replica count should not below min_replica_count
  3. replica count should not exceed max_replica_count

I have a different option with your third rule, I think system should not restrict its table max replica count for scalability. Besides, I think it's unnecessary to make min_replica_count as an option, in my view, min_replica_count will always be 0. Besides, I add some comments for this pr.

Yeah, I agree that we should provide scalability and flexibility for the system. The purpose of introducing max_allowed_replica_count is to avoid unreasonable replication factor. A big replication factor may lead to poor behavior of the system. We can increase this maximum, but I think we'd better give an upper limit to prevent users from creating an app with 10 replicas.
Introducing min_allowed_replica_count is motivated from the example illustrated in the description above:

Another example will be creating an app with a single replica in a cluster that includes 3 nodes. 

Later then, we decide to make a high-availability upgrade (without stopping service of the cluster) for this cluster. 

The upgrade process will be blocked in the phase of moving the only replica of that app out to another node, since the node that deployed the only replica is stopped.

Considering this example, I think we can set the minimum replication factor as 3 for a cluster that has 3 or more replica servers.

Okay, your comment persuades me~ min_replica_count is used to set the cluster table min replica count, max_replica_count is used to avoid user unreasonable replica count. How about make max_replica_count as a const value, not a configurable option. For one hand, this value won't be updated in normal cases, for other hands, it will simply the DSN_DEFINE_validator check for min_replica_count.

A good idea ! How about set max_allowed_replica_count as 3 ? Can this value satisfy all of your scenarios ?

In most our scenarios, replica_count is 3. As our discussion above, max_allowed_replica_count should be greater than the normal replica_count. If we set max_allowed_replica_count as 3, then if we would like to provide 5 replica, which is also a reasonable replica count, we should update code. As a result, I suggest seting max_allowed_replica_count as 10, or even greater than 10.

Ok, how about setting max_allowed_replica_count as 15 (an odd number, though there's no particular reason) ?

I think it's okay to set max_allowed_replica_count as 15~

@hycdong @empiredan I do consider a const and unoptional value of 15 is too large, any new user could create a table with 15 replicas at most without any limitation, it's harmful for the system. If most cases are 3 replicas, why not make it optional, in the rare case if any body want to create a table with some larger replication factor, they can set a larger RF and reboot the metaserver.

As I mentioned before, I think system should not restrict its table max replica count for scalability, but as empiredan's option, it can avoid user unreasonable replica count. If we set max_allowed_replica_count as 3, it somehow affects scalability. It is okay for user to create table with 4 replicas and 5 replicas and so on. Besides, 4 and 5 are not user unreasonable replica count. So I prefer seting it as a bigger value. And there are two reasons I recommend it as a const. Firstly, it is easiler to check validation max_allowed_replica_count and min_allowed_replica_count in validation if one of them is a const value. Secondly, as the max_allowed_replica_count is used to avoid unreasonable replica count, we will hardly update it, it is okay to be a const value, because its value is unreasonable enough. That is my opinion above, I think we can disucuss the usage of max_allowed_replica_count to determine seting it as an optional value or a const value.

@hycdong @acelyc111 Or, can we turn to refer to the implementation of the system that has the similar problem with us ?
In apache/kudu@b39491d, Todd has explained why kudu uses max_num_replicas to restrict the replication factor. Now max_num_replicas is set to 7, since higher RF hasn't been proved to be effective by some test cases. A well-intentioned user may try to set replication factor to a very high number to get a table replicated onto every server in their cluster.
Therefore, should we make max_allowed_replica_count configurable and its default value a little higher than 3, say, 5 ?
And also, RF is restricted by min_num_repilcas . See apache/kudu@6cb1548 for details.
As for the related code, please see the following quotes:

DEFINE_int32(max_num_replicas, 7,
             "Maximum number of replicas that may be specified for a table.");
// Tag as unsafe since we have done very limited testing of higher than 5 replicas.
TAG_FLAG(max_num_replicas, unsafe);
TAG_FLAG(max_num_replicas, runtime);

DEFINE_int32(min_num_replicas, 1,
             "Minimum number of replicas that may be specified when creating "
             "a table: this is to enforce the minimum replication factor for "
             "tables created in a Kudu cluster. For example, setting this flag "
             "to 3 enforces every new table to have at least 3 replicas for "
             "each of its tablets, so there cannot be a data loss when a "
             "single tablet server fails irrecoverably.");
TAG_FLAG(min_num_replicas, advanced);
TAG_FLAG(min_num_replicas, runtime);

...

template<typename RespClass>
Status CatalogManager::ValidateNumberReplicas(const std::string& normalized_table_name,
                                              RespClass* resp, ValidateType type,
                                              const boost::optional<int>& partitions_count,
                                              int num_replicas) {
  if (num_replicas > FLAGS_max_num_replicas) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: maximum allowed replication "
                   "factor is $1 (controlled by --max_num_replicas)",
                   num_replicas, FLAGS_max_num_replicas)),
        resp, MasterErrorPB::REPLICATION_FACTOR_TOO_HIGH);
  }
  if (num_replicas < FLAGS_min_num_replicas) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: minimum allowed replication "
                   "factor is $1 (controlled by --min_num_replicas)",
            num_replicas, FLAGS_min_num_replicas)),
        resp, MasterErrorPB::ILLEGAL_REPLICATION_FACTOR);
  }
  // Reject create/alter table with even replication factors, unless master flag
  // allow_unsafe_replication_factor is on.
  if (num_replicas % 2 == 0 && !FLAGS_allow_unsafe_replication_factor) {
    return SetupError(Status::InvalidArgument(
        Substitute("illegal replication factor $0: replication factor must be odd",
                   num_replicas)),
        resp, MasterErrorPB::EVEN_REPLICATION_FACTOR);
  }
...
}

Okay, I think if we set max_num_replica as 5, we should add an option for it, user can update it through config. In this case, it is the unreasonable replica count, it restirct the max replica count through a cluster-level configuration. If user would like to create table with higher replica_count, it should update config.

+1

Ok, I'll make max_allowed_replica_count configurable, with default value set as 5 ~

@empiredan
Copy link
Contributor Author

As has been mentioned by @hycdong , it's difficult to guarantee that FLAGS_min_allowed_replica_count
should never be greater than FLAGS_max_allowed_replica_count in flag_validator for the purpose of validation for individual flag. Thus I've committed #978 to introduce grouped flag validator to detect the inconsistency between 2 or more flags. Therefore, #978 is the prerequisite for this PR.

src/meta/server_state.cpp Outdated Show resolved Hide resolved
src/meta/server_state.h Show resolved Hide resolved
src/meta/test/meta_app_operation_test.cpp Outdated Show resolved Hide resolved
src/meta/server_state.cpp Outdated Show resolved Hide resolved
Copy link
Member

@acelyc111 acelyc111 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/meta/server_state.h Show resolved Hide resolved
@levy5307 levy5307 merged commit f845d8c into XiaoMi:master Dec 15, 2021
Smityz added a commit to Smityz/rdsn that referenced this pull request Dec 16, 2021
feat: restrict the replication factor while creating app (XiaoMi#963)

feat(manual_compaction): meta server support querying compaction status (XiaoMi#987)

manual test
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants