[Logs UI] Include the dataset information in categorization warning message #60392
Comments
Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui) |
The question of how to determine the categories responsible for the warning is still not resolved, so I would dispute this being … As @sophiec20 helpfully suggested in #59005 (comment), this might be best achieved by enhancing the stats collected by the ML functionality while processing the documents.
A couple things to update this for future prioritization:
We should make this ticket dependent on a real ticket that exists for the ML side, or if one doesn't exist, re-think this ticket in light of what's possible.
Have we done this already / are we interested in any of these improvements while we wait for the ML-side improvements?
OK, I just heard back from ML about this (thanks @droberts195) and there is a new value available in a job module's … as a sibling to …
So here are the decisions we still need to make, I think:
For 7.10 you can just check the … You won't need to do separate calculations to work out which dataset is responsible, as ML will tell you. If ML's current definition of …
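To illustrate the idea of letting ML report the responsible datasets, here is a minimal sketch that filters per-partition categorizer stats down to the partitions in the warn state. The `PartitionCategorizerStats` shape, its field names, and the `findWarnedDatasets` helper are assumptions made for this example; the thread does not confirm how the stats are actually exposed to the Logs UI.

```typescript
// Hypothetical status values and document shape, assumed for illustration only.
type CategorizationStatus = 'ok' | 'warn';

interface PartitionCategorizerStats {
  // Assumed to hold the partition value, i.e. the dataset name.
  partitionFieldValue: string;
  // Assumed to hold the status reported by ML for that partition.
  categorizationStatus: CategorizationStatus;
}

// Returns the datasets whose categorizer reported a warning,
// deduplicated and sorted alphabetically.
function findWarnedDatasets(stats: PartitionCategorizerStats[]): string[] {
  const warned = new Set<string>();
  for (const entry of stats) {
    if (entry.categorizationStatus === 'warn') {
      warned.add(entry.partitionFieldValue);
    }
  }
  return Array.from(warned).sort();
}
```

With this approach the UI would not need its own per-partition calculations; it would only read the status that ML has already written for each partition.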
@droberts195, the new categorizer stats look awesome and should help us a lot. 🤯
As to what we've done so far that didn't depend on the ML changes, we already implemented a setup enhancement that enables the user to (de)select specific datasets on job (re)creation.
We have a mechanism in place that informs the user about job definition changes in the UI and prompts for re-creation of the job. To me it sounds like this is what we would have to do in order to take advantage of the new per-partition warnings:
The new stop-on-warn parameter also looks extremely useful, and if we include it in our job config we would have to adapt the warning messages accordingly.
ℹ️ This has been split out of #59005.
UPDATE: ML has made it possible to return per-partition errors for problematic partitions, see: #60392 (comment)
Summary
We'd like to show a more meaningful warning message, with a call to action that addresses the root cause of the warning, when a dataset categorization job returns categorization_status = warn.
If the status is warn, we will perform per-partition queries to determine which partitions are likely causing a high rare-category count or a high category count relative to the overall count. We will then display a warning message at the top, calling out the specific datasets that have categorization_status = warn. The message will also include a link to the job configuration which, when clicked, will show a warning indicator alongside the index that contains the problematic dataset. The warning message UI will be
and the job configuration UI will be
Display a warning that summarizes the results.
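As a rough sketch of what "a warning that summarizes the results" could look like in code, the helper below turns a list of affected dataset names into a single message. The function name `buildCategorizationWarning` and the message wording are hypothetical and only meant to illustrate calling out the specific datasets; the real copy and UI are up to the design.

```typescript
// Hypothetical helper: builds the text for the warning shown at the top.
// Returns null when no dataset is affected, so the caller can skip rendering.
function buildCategorizationWarning(warnedDatasets: string[]): string | null {
  if (warnedDatasets.length === 0) {
    return null;
  }
  const noun = warnedDatasets.length === 1 ? 'dataset' : 'datasets';
  const list = warnedDatasets.join(', ');
  // Placeholder wording, assumed for illustration only.
  return (
    `Categorization quality is degraded for ${warnedDatasets.length} ${noun}: ${list}. ` +
    'Review the job configuration to adjust the selected datasets.'
  );
}
```

The link to the job configuration would be rendered alongside this text by the UI component; it is left out here to keep the sketch independent of any UI framework.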
ℹ️ Implementation hints
categorization_status of the job is a summary of the individual categorizers' statuses. But because these are written independently, they might be temporarily inconsistent.
Use-case description
TODO