This repository has been archived by the owner on Jan 8, 2019. It is now read-only.
If one is running a large cluster and does not scale node[:bcpc][:hadoop][:zookeeper][:maxClientCnxns], one gets the joys of ZooKeeper unavailability and a log spewing entries like:
2014-12-16 17:11:38,921 [myid:12] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@193] - Too many connections from /1.2.3.4 - max is 500
Ideally, we can create monitoring in Zabbix from the provided JMX metrics:

org.apache.ZooKeeperService -> ReplicatedServer_id<nodeNumber> -> replica.<nodeNumber> -> Attributes -> Follower ->
    Attributes ->
        PendingRevalidationCount
        AvgRequestLatency
        MaxRequestLatency
        MinRequestLatency
        NumAliveConnections
        OutstandingRequests
        PacketsReceived
        PacketsSent
    Connections ->
        client IP ->
            connection ptr ->
                OutstandingRequest
                PacketsReceived
                PacketsSent
                MinRequestLatency
                MaxRequestLatency
                AvgRequestLatency
                LastLatency
                ...
    InMemoryDataTree ->
        Attributes ->
            NodeCount
            WatchCount
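
As a rough complement to the JMX tree, several of the same server-level numbers can also be read without a JMX bridge by sending ZooKeeper's `mntr` four-letter command (available since 3.4.0) to the client port. This is a different technique from the JMX approach proposed in this issue, shown only as a sketch: the helper names `parse_mntr` and `fetch_mntr` are hypothetical, port 2181 is taken from the log line quoted earlier, and newer ZooKeeper releases may require the command to be whitelisted before it answers.

```python
import socket


def parse_mntr(text):
    """Parse tab-separated `mntr` output into a dict of metric name -> value."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # skip blank or malformed lines
        value = value.strip()
        try:
            # Counters and latencies are integers
            metrics[key.strip()] = int(value)
        except ValueError:
            # Non-numeric fields (version, server state) stay as strings
            metrics[key.strip()] = value
    return metrics


def fetch_mntr(host, port=2181, timeout=5.0):
    """Send `mntr` to a ZooKeeper server and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the socket after replying
                break
            chunks.append(data)
    return parse_mntr(b"".join(chunks).decode("utf-8", errors="replace"))
```

Something like `fetch_mntr("zk-host")["zk_num_alive_connections"]` could then feed a Zabbix item, where `zk-host` is a placeholder hostname. Note that this gives server-wide totals only, not the per-connection detail in the `Connections` branch.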
The question is how to see rejected connections, which I'm not seeing in this JMX tree. Regardless, I think a lot of useful cluster monitoring can be done with these metrics.
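
Until a proper metric for rejections turns up, one stopgap is to count them from the server log. The sketch below assumes the WARN line keeps the exact wording quoted in this issue ("Too many connections from /IP - max is N"). The function name `count_rejections` is hypothetical.

```python
import re
from collections import Counter

# Pattern copied from the WARN line quoted in this issue.
REJECT_RE = re.compile(r"Too many connections from /(\S+) - max is (\d+)")


def count_rejections(lines):
    """Return a Counter mapping client IP -> number of rejected connections."""
    rejected = Counter()
    for line in lines:
        match = REJECT_RE.search(line)
        if match:
            rejected[match.group(1)] += 1
    return rejected
```

Passing an open log file (any iterable of lines works) and calling `.most_common()` on the result lists the noisiest clients first, which could feed a Zabbix item or a simple alert.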
cbaenziger changed the title from "Zookeeper connections are precious" to "Zookeeper connections are precious and other ZK administrivia" on Dec 17, 2014
autopurge.snapRetainCount
(No Java system property)
New in 3.4.0: When enabled, ZooKeeper auto purge feature retains the autopurge.snapRetainCount most recent snapshots and the corresponding transaction logs in the dataDir and dataLogDir respectively and deletes the rest. Defaults to 3. Minimum value is 3.
autopurge.purgeInterval
(No Java system property)
New in 3.4.0: The time interval in hours for which the purge task has to be triggered. Set to a positive integer (1 and above) to enable the auto purging. Defaults to 0.
Leader only Coordinates
If we have more than three nodes in the quorum and are running ZooKeeper for Kafka, we probably want this.
leaderServes
(Java system property: zookeeper.leaderServes)
Leader accepts client connections. Default value is "yes". The leader machine coordinates updates. For higher update throughput at the slight expense of read throughput the leader can be configured to not accept clients and focus on coordination. The default to this option is yes, which means that a leader will accept client connections.
Note
Turning on leader selection is highly recommended when you have more than three ZooKeeper servers in an ensemble.
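
Pulled together, the settings discussed in this issue might look like the following zoo.cfg fragment. The values are illustrative only: 500 is the limit seen in the quoted log line, the purge interval of 1 hour is an arbitrary choice, and how the cookbook renders these keys from node attributes is not shown here.

```properties
# Per-client-IP connection limit; 500 is the value seen in the quoted log line
maxClientCnxns=500
# Keep the 3 most recent snapshots (the documented default and minimum)
autopurge.snapRetainCount=3
# Run the purge task every hour; the default of 0 leaves auto purge disabled
autopurge.purgeInterval=1
# Leader only coordinates updates and does not accept client connections
leaderServes=no
```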