[Question] "in_order" and "is_leader" settings correlation #2600

Closed
igor-sh8 opened this issue Jul 6, 2018 · 2 comments

Comments

@igor-sh8

igor-sh8 commented Jul 6, 2018

Hello.
I have a small question about how the config setting "in_order" and the replicated table state "is_leader" correlate.
Example config for one shard:

<replica>
<host>ch1</host>
<port>9000</port>
</replica>
<replica>
<host>ch2</host>
<port>9000</port>
</replica>

As I understand it, the "in_order" setting lets us use the first server (ch1) as the main one for "SELECT" queries.
When I run the query "SELECT * FROM system.replicas WHERE table = 'table1'" on a table on this shard, I see that "is_leader" is set on the ch2 server.
So the question is: how do "in_order" and "is_leader" correlate? Is this normal, or is "in_order" perhaps counted from the end?

@filimonov
Contributor

in_order defines the strategy for choosing which replica to query. That is, when you send a SELECT to a Distributed table, it has to choose one replica from each shard to ask for data. in_order says that it should always prefer the first replica listed in the config.

So it defines how the Distributed table works.
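
As an illustration, here is a minimal sketch of where this strategy is usually selected. load_balancing is a user-level setting, so it can be placed in a settings profile. The file name (users.xml), the profile name (default), and the root element are assumptions for this sketch and are not taken from the question.

```xml
<!-- users.xml (assumed file name) -->
<!-- Root element is <yandex> in older releases; newer ones use <clickhouse>. -->
<yandex>
    <profiles>
        <!-- "default" is an assumed profile name -->
        <default>
            <!-- Prefer replicas in the order they are listed in the shard config -->
            <load_balancing>in_order</load_balancing>
        </default>
    </profiles>
</yandex>
```

With the shard config from the question, this would make a Distributed table prefer ch1 and fall back to ch2 only when ch1 is unavailable. It has no effect on which replica is the leader.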

The leader in ClickHouse is related to how Replicated*MergeTree tables work. It is one of the replicas, chosen with the help of ZooKeeper, that decides which parts should be merged. You can forbid a certain replica from becoming the leader with the replicated_can_become_leader setting. Choosing which parts should be merged is the only thing the leader is responsible for.
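
A minimal sketch of that setting, assuming it is applied server-wide through the merge_tree section of the server config on the replica that should never be elected. The file name (config.xml) and the root element are assumptions for this sketch.

```xml
<!-- config.xml (assumed file name) on the replica that must not become leader -->
<!-- Root element is <yandex> in older releases; newer ones use <clickhouse>. -->
<yandex>
    <merge_tree>
        <!-- 0 = this replica never takes part in leader election -->
        <replicated_can_become_leader>0</replicated_can_become_leader>
    </merge_tree>
</yandex>
```

This is rarely needed, since being the leader brings no extra load worth avoiding in most setups.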

So once again:

  1. Replication can work without a Distributed table.
  2. If you have multiple replicas of your table, one replica should be the leader, but you don't need to care about that: the leader doesn't do anything CPU-intensive and has no advantages for the end user. The leader only chooses which parts should be merged.
  3. You can select from or insert into any replica. It doesn't matter which one is the leader.
  4. If the leader replica goes away, another replica becomes the leader automatically.
  5. A Distributed table can work as a load balancer between multiple replicas and can decide which replica to query. There are different strategies for that; one of them is 'in_order'.

@igor-sh8
Author

igor-sh8 commented Jul 6, 2018

Great.
Thanks for the detailed explanation.

igor-sh8 closed this as completed Jul 6, 2018