Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Create a plugin for metric reporting #8625

Open
chishui opened this issue Oct 17, 2024 · 5 comments
Open

[RFC] Create a plugin for metric reporting #8625

chishui opened this issue Oct 17, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@chishui
Copy link
Contributor

chishui commented Oct 17, 2024

Is your feature request related to a problem? Please describe.

  1. We want to create a plugin to record metrics such as number of click event on an UI element, number of times a UI element is visible to user, the success rate of an API call etc.
  2. The metrics are better not be accumulated as we want to keep track of each reported metric.
  3. The metric storing solution can be customized so that we can publish them to AWS CloudWatch, or Kinesis, or simply output a log.
  4. We don't vend a stat API in OSD as metric data could be stored to all kinds of platform and handled/analyzed/visualized from other platforms.
  5. We don't want to store metrics to .kibana index through savedobject by default as it could overextend .kibana index

Describe the solution you'd like

A clear and concise description of what you want to happen.

  1. Create a core plugin: MetricsRecorder
  2. Other plugins can depend on this plugin and use it to record a metric.
  3. The recorded metrics are batched locally and sent to node API to process every 1 minute (can be configured)
  4. The plugin vends a function to register a MetricsRecorderFactory so that users can customize the metric storing strategy.
  5. The node API will call MetricsRecorderFactory to process the metrics.

Describe alternatives you've considered
We've considered using usage_collection plugin, it's a legacy plugin marked as deprecated for quite a long time for some performance issues. Here are some other obvious downsides make it not a good solution:

  1. It relies on saved_object. By default, all metrics are stored to .kibana index which as mentioned above is already overextended. Mixing metric data and other user's configuration such as index pattern, dashboard, visualization etc could bring down the performance. Switching to a new storage solution could cause missing data for existing user.
  2. People can provide a customized implementation of saved_object, but the interface of it is both complex and not quite relevant to metrics reporting.
  3. usage_collection has many logic built to provide the stat() API which is unnecessary for metric collection purpose and add additional burden to customization.

telemetry is a plugin replies on usage_collection and it doesn't store metrics, only fetch.

Additional context

Add any other context or screenshots about the feature request here.

@ashwin-pc
Copy link
Member

  1. It relies on saved_object. By default, all metrics are stored to .kibana index which as mentioned above is already overextended. Mixing metric data and other user's configuration such as index pattern, dashboard, visualization etc could bring down the performance.

If the storage medium is the issue lets change that to something else. The saved object client should now support adapters to different storage mediums. Lets repurpose that logic to write to a non opensearch index if thats the concern. SQLite is a good option. But open to other ideas.

  1. People need to extend saved_object interface to provide a customized storing solution which is both complex and confusing.

Can you elaborate a little more about this? Im not sure i unerstand this limitation. We were able to add many new fields without an issue to the useage collection report and didnt have to make any new mapping changes. e.g. #8345

  1. usage_collection has many logic built to provide the stat() API which is unnecessary for metric collection purpose and add additional burden to customization.

Why is this an additional burden? If anything this simplifies the usage. We get all metric data from a single API. This is much better for collectors since we can get all the data in a single request. Today many plugins unfortunately expose their own stats API's flooding the request logs because they arent using this common API. I see having a single API as a benefit, not a drawback

Also what you are proposeing is exactly what im calling for in the first point. You want to be able to store the metrics in a place thats not the Kibana index. Just for that we dont need to reinvent the wheel. Just make the sotrage medium for the useage collector configurable. .kiaban is only a place to persist the data. We can persist the data to file or SQLite or any other persistent storage medium on the server side and the saved object service should already have the capability now.

@chishui
Copy link
Contributor Author

chishui commented Oct 18, 2024

If the storage medium is the issue lets change that to something else. The saved object client should now support adapters to different storage mediums

I'm afraid we can't simply change the storage to somewhere else as if users are already using this feature and storing their usage data in .kinana, they will find their data are missing with new update.

Can you elaborate a little more about this?

Sorry, my bad. The saved_object allows us to customize an internal repository implementation and usage_collection uses that internal repository to retrieve/store metric data. I meant the interface of repository is complex and not quite relevant to a metric collection. Another issue is the customized internal repository is not only used by usage_collection but other plugins as well for different purposes.

Why is this an additional burden?

I fully understood the merits of the API. I mean if uses want to use another platform to store and analyze their metrics, they do not need to be exposed in OSD through stats(). The contract of usage_collection is that it stores through report and fetch through stats. People either break the contract or spend additional efforts for implement.

@ruanyl
Copy link
Member

ruanyl commented Oct 18, 2024

If the storage medium is the issue lets change that to something else. The saved object client should now support adapters to different storage mediums

I'm afraid we can't simply change the storage to somewhere else as if users are already using this feature and storing their usage data in .kinana, they will find their data are missing with new update.

For usage_collection, currently we're using SavedObjects for persistence. Similarly, we can have a API client, something called CloudWatchPublisher to publish metrics to CloudWatch, and user can choose which persistence layer to use.

Just an idea:

interface MetricPublisher {
  publish(metrics: Metric | Metric[]): Promise<void>  
}

class OpenSearchPublisher implements MetricPublisher {
  consctructor(private repository: ISavedObjectsRepository) {}
  publish(metrics: Metric | Metric[]) {
    // implement something like the current storeReport function
    this.repository.create(...)
    this.repository.bulkCreate(...)
    this.repository.incrementCounter(...)
  }
}

class CloudWatchPublisher implements MetricPublisher {
  consctructor(private repository: ISavedObjectsRepository) {}
  publish(metrics: Metric | Metric[]) {
    // call cloud watch API to store metrics
  }
}

Then we don't need to abandon usage_collection plugin, but refactor it so that it can support different persistence layers. @chishui @ashwin-pc

@ashwin-pc
Copy link
Member

I'm afraid we can't simply change the storage to somewhere else as if users are already using this feature and storing their usage data in .kinana, they will find their data are missing with new update.

We are not using this today. And if we suspect someone might be. we can make it dynamic so that the default storagestill remains to be saved objects, but can be changed to something else

I fully understood the merits of the API. I mean if uses want to use another platform to store and analyze their metrics, they do not need to be exposed in OSD through stats(). The contract of usage_collection is that it stores through report and fetch through stats. People either break the contract or spend additional efforts for implement.

Thats fair. But thats not a reason to introduce a new way to track metrics without a plan for the old one. We cant have 2 different ways to track metrics in the platform. Thats a duplicative effort. If we want a new way to collect metrics when an existing method exists, we need a much better reasonsing. I havent seen a clear reason here on why the problems you called out in the usage collector is not fixable. @ruanyl's suggestion is precicely what i excpected we would be using to get around the limitations of storing using saved objects.

As for vending the data, again, while the stats command can vend the data, it needs url params to expose that information. So there are ways there too to fetch the data in a similar fashion.

@ashwin-pc
Copy link
Member

@chishui can you convert this into an RFC so that we can invite the rest of the community to weigh in on your proposal. I dont think this qualifies as a feature request given the existence of the existing usage collector

@chishui chishui changed the title [feat] Create a plugin for metric reporting [RFC] Create a plugin for metric reporting Oct 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants