[Metricbeat] Kafka module refactoring #14852
Comments
The source code looks complex in terms of error handling, retries and waiting for results, so let me rephrase this description to fully understand what should be done/achieved:
It looks like a bigger refactoring (incl. tests), so I'm wondering whether it wouldn't be easier to rewrite the code.
I would say yes, this is the ideal situation: to make use of the "cluster-wide" client if possible. This would mostly cover points 1 and 2 from the initial description.
I guess that any kind of optimisation is more than welcome :).
Maybe https://github.com/Shopify/sarama/blob/master/admin.go#L85 ?
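For reference, a minimal sketch of what using that admin API could look like (the broker address and config version are placeholder assumptions, not the module's actual settings):

```go
package main

import (
	"fmt"
	"log"

	"github.com/Shopify/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	cfg.Version = sarama.V1_0_0_0 // the admin API needs a reasonably recent protocol version

	// One bootstrap broker is enough; the admin client can answer
	// cluster-wide questions through it.
	admin, err := sarama.NewClusterAdmin([]string{"localhost:9092"}, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer admin.Close()

	// Topics across the whole cluster, not only the local broker.
	topics, err := admin.ListTopics()
	if err != nil {
		log.Fatal(err)
	}
	for name, detail := range topics {
		fmt.Printf("%s: partitions=%d replication=%d\n",
			name, detail.NumPartitions, detail.ReplicationFactor)
	}
}
```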
I would agree with a rewrite. If I'm not mistaken, the key question to answer first is whether we can replace the old client with the "cluster-wide" one. If yes, then all the functionality could be rewritten on top of this concept.
@ChrsMark Thank you for the answers!
+1 to rewrite these metricsets if we can simplify the code by using more features of the sarama client and less custom code.
Lessons learnt here: I tried to solve the first and second issues (use the cluster-wide client, fix "broker is not leader"), but grouping requests made the codebase relatively complex, and the original issue ("broker is not leader" due to stale metadata) might still occur. After a discussion with Jaime, the way to go might be to put some retries around fetching topic metadata, but this can also hurt readability. I will leave this issue open, unless there are no easy steps to solve it (compared to the total gain).
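A rough sketch of the retry idea mentioned above, assuming a plain sarama Client (the package name, function name, retry count, and backoff are illustrative, not what the module actually does):

```go
package kafkautil

import (
	"fmt"
	"time"

	"github.com/Shopify/sarama"
)

// fetchLeaderWithRetry looks up the leader broker for a topic/partition,
// refreshing metadata and retrying when the cached metadata is stale
// (the situation behind the "broker is not leader" errors).
func fetchLeaderWithRetry(client sarama.Client, topic string, partition int32, maxRetries int) (*sarama.Broker, error) {
	var lastErr error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		broker, err := client.Leader(topic, partition)
		if err == nil {
			return broker, nil
		}
		lastErr = err

		// Force a metadata refresh for this topic before the next attempt.
		if rerr := client.RefreshMetadata(topic); rerr != nil {
			lastErr = rerr
		}
		time.Sleep(time.Duration(attempt+1) * 250 * time.Millisecond)
	}
	return nil, fmt.Errorf("no leader for %s/%d after %d retries: %v", topic, partition, maxRetries, lastErr)
}
```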
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Backlog grooming: Closing for now.
Current situation
Currently the Kafka module connects to the local broker and retrieves information only from there. Given the distributed nature of Kafka, this can be problematic in cases where we need "cluster-wide" metrics. In addition, problems have been reported with topics that have many partitions.
Goal of this issue
In #14822 we introduced the use of a "cluster-wide" client, which connects to a given broker but can retrieve information for the whole cluster through it.
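To illustrate the concept (a sketch only; the broker address is a placeholder, and this is not the module's actual code), a single sarama client bootstrapped from one broker already exposes cluster-level metadata:

```go
package main

import (
	"fmt"
	"log"

	"github.com/Shopify/sarama"
)

func main() {
	client, err := sarama.NewClient([]string{"localhost:9092"}, sarama.NewConfig())
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Metadata below covers the whole cluster, not only the broker we dialed.
	brokers := client.Brokers() // every broker in the cluster
	topics, err := client.Topics()
	if err != nil {
		log.Fatal(err)
	}
	for _, topic := range topics {
		partitions, err := client.Partitions(topic)
		if err != nil {
			continue
		}
		fmt.Printf("%s: %d partitions, cluster of %d brokers\n", topic, len(partitions), len(brokers))
	}
}
```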
This issue aims to refactor the module so as to:
consumer_lag calculation as described on this comment (see the sketch below)
@jsoriano @exekias @mtojek feel free to add/comment anything!
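As for the consumer_lag point, a minimal sketch of how lag could be derived with the cluster-wide client (the group and topic names are placeholders, and the exact calculation in the linked comment may differ):

```go
package main

import (
	"fmt"
	"log"

	"github.com/Shopify/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	cfg.Version = sarama.V1_0_0_0
	brokers := []string{"localhost:9092"}  // placeholder bootstrap broker
	group, topic := "my-group", "my-topic" // hypothetical names

	client, err := sarama.NewClient(brokers, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	admin, err := sarama.NewClusterAdmin(brokers, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer admin.Close()

	partitions, err := client.Partitions(topic)
	if err != nil {
		log.Fatal(err)
	}

	// Offsets committed by the consumer group, fetched cluster-wide.
	committed, err := admin.ListConsumerGroupOffsets(group, map[string][]int32{topic: partitions})
	if err != nil {
		log.Fatal(err)
	}

	for _, p := range partitions {
		// Log-end (newest) offset for the partition.
		latest, err := client.GetOffset(topic, p, sarama.OffsetNewest)
		if err != nil {
			continue
		}
		block := committed.GetBlock(topic, p)
		if block == nil || block.Offset < 0 {
			continue // the group has no committed offset for this partition
		}
		// consumer_lag = log-end offset - committed group offset
		fmt.Printf("%s/%d lag=%d\n", topic, p, latest-block.Offset)
	}
}
```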