[Monitoring] Parity between usage data collection #34940
Comments
Pinging @elastic/stack-monitoring
I don't know the details or complexities of the implementation, but conceptually it would be nice to have a common piece of code responsible for collecting monitoring data, including formatting it correctly. The API endpoint code and the Elasticsearch bulk shipping code could then both call this common collection code. That would ensure that these parity bugs go away.
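To make the idea concrete, here is a minimal sketch of what a shared collection step could look like. Every name in it (`Collector`, `StatsEntry`, `collectStats`, `toApiResponse`, `toBulkPayload`) is hypothetical and not Kibana's actual collector API, and `fetch` is synchronous only to keep the sketch short; real collectors would most likely be async.

```typescript
// Hypothetical types and names for illustration only.
type Stats = Record<string, unknown>;

interface Collector {
  type: string;
  fetch(): Stats; // assumed synchronous here for brevity
}

interface StatsEntry {
  type: string;
  result: Stats;
}

// The single place where stats are collected and given a common shape.
function collectStats(collectors: Collector[]): StatsEntry[] {
  return collectors.map((c) => ({ type: c.type, result: c.fetch() }));
}

// Consumer 1: the body an API endpoint could return, keyed by collector type.
function toApiResponse(entries: StatsEntry[]): Record<string, Stats> {
  const body: Record<string, Stats> = {};
  for (const entry of entries) {
    body[entry.type] = entry.result;
  }
  return body;
}

// Consumer 2: an action/document pair per entry, as a bulk shipper might send.
// The action shape is an assumption, not the real monitoring document format.
function toBulkPayload(entries: StatsEntry[]): unknown[] {
  const payload: unknown[] = [];
  for (const entry of entries) {
    payload.push({ index: { type: entry.type } });
    payload.push(entry.result);
  }
  return payload;
}
```

Because both consumers start from the same `StatsEntry[]`, they can only differ in how they wrap the data, not in what was collected or how it was formatted.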
Another approach would be to decouple when stats are collected (by the various collectors within Kibana) from when the collected stats are used (either pulled via the

To make this work, the Kibana server would keep collected stats in memory. The collectors would run whenever they are configured to and update their section of the in-memory collected stats. The

The nice thing about this decoupling is that the collectors can each run at whatever frequency makes sense to them. This might be especially beneficial when it comes to Kibana telemetry collection, which we might want to run rather infrequently. Similarly, the bulk uploader could run at whatever frequency it wants to or be entirely disabled w/o affecting collection in any way. This could be useful when we want users to migrate to using Metricbeat for collection.
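A rough sketch of this decoupling, under the assumption that each collector owns one section of an in-memory store and that consumers only ever read a snapshot. `StatsStore` and `scheduleCollector` are made-up names for illustration, not existing Kibana code.

```typescript
// Hypothetical in-memory store; names are illustrative only.
type Stats = Record<string, unknown>;

class StatsStore {
  private sections: Record<string, { collectedAt: number; data: Stats }> = {};

  // Called by a collector whenever it has fresh data for its own section.
  update(type: string, data: Stats): void {
    this.sections[type] = { collectedAt: Date.now(), data };
  }

  // Called by any consumer (API endpoint, bulk uploader) at its own pace.
  snapshot(): Record<string, Stats> {
    const out: Record<string, Stats> = {};
    for (const type of Object.keys(this.sections)) {
      out[type] = this.sections[type].data;
    }
    return out;
  }
}

// Runs one collector immediately and then on its own interval.
// Returns a function that stops the collector.
function scheduleCollector(
  store: StatsStore,
  type: string,
  fetchStats: () => Stats,
  intervalMs: number
): () => void {
  const run = (): void => store.update(type, fetchStats());
  run();
  const timer = setInterval(run, intervalMs);
  return () => clearInterval(timer);
}
```

With this shape, a telemetry collector could be scheduled with a long interval and an ops collector with a short one, while the endpoint and the bulk uploader both read `store.snapshot()`. Disabling the uploader would not touch collection at all.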
To add more information, here is a bit of a difference between how we poll data from the collectors.

GET /api/stats
This is an endpoint used by telemetry and MB monitoring collection. By default, it returns the result of this collector set. If you provide an optional

Monitoring Polling
This is how internal monitoring works within Kibana. At the configured interval (default is 10s), we fetch all collectors (except for the duplicate ops collector from OSS). That list is:
They both utilize methods off the OSS collector set class. It makes sense to put the consolidated logic here as both already have access to and are currently using it.
@chrisronline As far as achieving parity goes, what you're proposing above will work, as long as all collection happens synchronously with either the

However, we will still need to address the issue of separating the telemetry collection interval from the rest-of-kibana-stats collection interval and making this separation work while keeping parity between
@chrisronline, with the latest split between Telemetry and Monitoring, do you think this issue is still valid?
Yes, this is all set. Thanks @afharo!
Currently, we have two separate pieces of code that handle collecting usage data. This is because these pieces of code do something different with the data: one returns it from an API endpoint and the other ships it off to Elasticsearch through monitoring documents.
However, this isn't scalable. Metricbeat now collects usage data (using the API endpoint from the first piece of code above) and ships it to monitoring documents (like the second piece of code), so we need to ensure parity or bugs start to crop up.
It will be hard to maintain this parity if the two pieces of code remain separate. We should unify them so it's not possible for them to deviate.
cc @tsullivan