Replica (re)creation process ordering and completion #1567
Do you mean RBAC SQL users? To replicate SQL-created RBAC users over ZooKeeper, use the following approach:
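The approach referenced above is presumably ClickHouse's replicated access storage, which keeps SQL-created users, roles, and grants in ZooKeeper so every replica sees them. A hedged config sketch (the ZooKeeper path is a placeholder):

```xml
<!-- config.xml fragment: store SQL-created access entities (users, roles,
     grants, row policies) in ZooKeeper so they replicate to every node.
     /clickhouse/access is a placeholder path. -->
<clickhouse>
    <user_directories>
        <!-- Keep the default XML-defined users alongside the replicated ones. -->
        <users_xml>
            <path>users.xml</path>
        </users_xml>
        <replicated>
            <zookeeper_path>/clickhouse/access</zookeeper_path>
        </replicated>
    </user_directories>
</clickhouse>
```

With this in place, a freshly (re)created replica picks up all SQL-created users from ZooKeeper on startup, instead of coming up with only the XML-defined ones.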
I thought RBAC was just not replicated at all in ClickHouse, and you needed ON CLUSTER shenanigans when working with it… huh…
The operator does not know passwords, for example. That's why it is hard to replicate users without ZooKeeper.
That much I figured, yeah, but I expected the operator could pull the hashed password and import the users plus the DB/table RBAC onto the new replicas that way? Not that I was able to find at a glance where in ClickHouse they are stored. I suppose the solution in #1567 (comment) is what makes the most sense... 🤔 Well, if that's not possible/desirable, what do you think about the other part: keeping the new replica marked as unhealthy (from Kubernetes' point of view) until it's in sync?
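For reference, the hash of a SQL-created user's password is visible in its definition, so it could in principle be exported and replayed on a new replica. A sketch, assuming a hypothetical user named `app_user` and a session with the ACCESS MANAGEMENT privilege:

```sql
-- Show the full definition of a SQL-created user, including the
-- password hash clause (e.g. IDENTIFIED WITH sha256_hash BY '...').
-- 'app_user' is a hypothetical name.
SHOW CREATE USER app_user;

-- The associated grants can be listed and replayed similarly.
SHOW GRANTS FOR app_user;
```

Replaying these statements on a new replica would recreate the user without ever knowing the plaintext password, though the replicated access storage above makes this unnecessary.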
The thing is that it may take a long time for a replica to get in sync, especially on big clusters. But we may add a switch for that; it is not that difficult to do. As a workaround, you may use a Distributed table (even with a single shard) together with `max_replica_delay_for_distributed_queries`.
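A hedged sketch of that workaround; the cluster, database, and table names are placeholders:

```sql
-- Route queries through a Distributed table so lagging replicas are skipped.
-- 'my_cluster', 'db', and 'events_local' are hypothetical names.
CREATE TABLE db.events_dist AS db.events_local
ENGINE = Distributed(my_cluster, db, events_local);

-- Replicas whose replication delay exceeds 300 seconds are considered
-- stale, and the query is routed to an up-to-date replica instead.
SELECT count()
FROM db.events_dist
SETTINGS max_replica_delay_for_distributed_queries = 300;
```

With a single shard this does not change data layout at all; the Distributed engine just acts as a delay-aware router in front of the replicas.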
That is exactly what brought me to ask this, yes 🙂
Hmmm, that does seem a bit cumbersome, but it also seems like the right approach long-term indeed... Well, if the switch is easy to add, it would be neat, and if not, I can live without it. I figured since it's something the operator tracks already anyway, it would be trivial-ish (ie:
Hi,
I was following https://kb.altinity.com/altinity-kb-setup-and-maintenance/altinity-kb-data-migration/add_remove_replica/#using-altinity-operator with operator v0.23.7, after losing a replica to an upstream bug (the details are not very relevant to this issue).
Anyway, trialing the process in our dev environment, I was surprised by two specific aspects:
This causes a couple of issues if the initialization process is not extremely fast (in my case, with 400+ GB of data, it's far from immediate...), as clients end up routed to an incomplete replica against which they can't even authenticate, while live replicas still exist.
Maybe this is expected behaviour, but it seems like it would be an improvement to address these two concerns, or at least document them in some way. Especially if there are other non-schema elements besides users that I'm not thinking of but that matter.
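One hedged way to keep a still-syncing replica out of traffic today, assuming the operator's podTemplate mechanism and ClickHouse's `/replicas_status` HTTP endpoint (which returns an error while replica delay exceeds the configured threshold):

```yaml
# Hypothetical ClickHouseInstallation fragment: point the readiness probe
# at /replicas_status so Kubernetes keeps the pod out of Service endpoints
# until replication has caught up. Port and thresholds are illustrative.
spec:
  templates:
    podTemplates:
      - name: replica-sync-aware
        spec:
          containers:
            - name: clickhouse
              readinessProbe:
                httpGet:
                  path: /replicas_status
                  port: 8123
                initialDelaySeconds: 10
                periodSeconds: 10
                failureThreshold: 3
```

This is a sketch, not the operator's documented behaviour; whether the operator's own probes can be overridden this way in v0.23.7 should be verified against its CRD reference.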
Thanks