What is Burrow

Todd Palino edited this page Jun 2, 2015 · 8 revisions

# Overview

Burrow is a monitoring tool for keeping track of consumer lag in Apache Kafka. It is designed to monitor every consumer group that commits offsets to Kafka (consumers that use ZooKeeper offsets are not currently tracked), and to monitor every topic and partition consumed by those groups. This provides a comprehensive view of consumer status.

Burrow also provides several HTTP request endpoints for getting information about the Kafka cluster and consumers, separate from the lag status. This can be very useful for creating applications that assist with managing your Kafka clusters when it is not convenient (or possible) to run a Java Kafka client.
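As a rough illustration of using these endpoints from a non-Java application, the sketch below builds a status URL for one consumer group and fetches it as JSON. The host, port, and the `/v2/kafka/<cluster>/consumer/<group>/status` path layout are assumptions made for this example, and the helper names are hypothetical; check the HTTP endpoint documentation for the paths your Burrow version actually exposes.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed address of a Burrow instance; adjust to your deployment.
BURROW_URL = "http://localhost:8000"


def consumer_status_url(base_url, cluster, group):
    """Build the status URL for one consumer group (assumed path layout)."""
    return "{}/v2/kafka/{}/consumer/{}/status".format(
        base_url.rstrip("/"),
        quote(cluster, safe=""),  # escape characters that are unsafe in a path
        quote(group, safe=""),
    )


def fetch_json(url, timeout=5.0):
    """Send a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a running instance, `fetch_json(consumer_status_url(BURROW_URL, "local", "my-group"))` would return the decoded response; the cluster and group names here are placeholders.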

# Why Not MaxLag?

The standard Kafka consumer has a built-in MaxLag metric. While this can be convenient, it has several flaws:

  • MaxLag must be monitored on every consumer: The metric must be collected from each consumer individually, and the results have to be collated and interpreted separately.
  • MaxLag is only valid when the consumer is live: The metric is reported by the consumer itself. If the consumer is not running, no metric is available.
  • MaxLag is not objective: Because the consumer itself reports the metric, MaxLag cannot be an objective measure of consumer lag. The consumer measures it after fetching messages, so if there is any problem with consuming, an incorrect value can be reported.
  • MaxLag is only provided by the Java client: The only official Kafka client is the Java client, and it is the only client that has the metric available. The metric can certainly be worked into other clients, but then you have to worry about subtle differences in how it is measured and collected.

# How Does It Work?

Burrow runs a Kafka client that consumes the __consumer_offsets topic, a special internal topic where all consumer offset commits are stored. This provides a stream of every consumer's status. Additionally, Burrow periodically queries all the brokers in the cluster for the current HEAD offset (the most recent offset) of each partition. On demand via the HTTP endpoint, or on a configured interval, the status of each consumer group is evaluated to determine whether it is OK or having problems.
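The relationship between the two data sources can be sketched as follows: lag for a partition is the broker's HEAD offset minus the group's last committed offset. This is a minimal illustration and not Burrow's actual code. The function name, the dictionary layout, and the choice to clamp negative values to zero are all assumptions made for the example.

```python
def partition_lag(committed, head):
    """Compute lag per partition for one consumer group.

    Both arguments map (topic, partition) tuples to offsets:
    `committed` holds the group's last committed offsets and
    `head` holds the most recent offsets reported by the brokers.
    """
    lag = {}
    for topic_partition, committed_offset in committed.items():
        if topic_partition not in head:
            # No broker offset seen yet for this partition; skip it.
            continue
        # HEAD offsets are polled periodically, so a fresh commit can be
        # ahead of the last HEAD value seen. Clamp to zero in that case
        # (an assumption of this sketch).
        lag[topic_partition] = max(head[topic_partition] - committed_offset, 0)
    return lag
```

For example, a group that has committed offset 90 on a partition whose HEAD offset is 100 has a lag of 10 messages on that partition.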

For more information about the specific rules on evaluation, review the Consumer Lag Evaluation Rules page.