
[Logs UI] categorisation setup screen #59005

Closed
katrin-freihofner opened this issue Mar 2, 2020 · 9 comments
Labels
Feature:Logs UI · Team:Infra Monitoring UI - DEPRECATED (use Team:obs-ux-infra_services)

Comments

@katrin-freihofner
Contributor

Describe the feature
There are two cases where we need to improve the categorization UX:

  1. if the data is not suitable for categorization
  2. if there is too little training data available to show meaningful results

For both scenarios, we want to display a warning to the user. The warning callout contains a button that links to the ML job setup so the configuration can be updated.

Warning message (EuiCallOut - warning)

[Screenshot: warning callout mockup (2020-03-02)]

Too little training data

A single dataset

Title
[dataset.name] does not provide enough training data
Message
Longer periods of time will improve the categorization results for [dataset.name]. Update the configuration to improve your results. Learn more
Button
Update configuration -> Links to setup screen

Multiple datasets

Title
Multiple datasets do not provide enough training data
Message
We have too little training data for the following datasets: [dataset.name], [dataset.name]. Longer periods of time will improve the categorization results. Learn more
Button
Update configuration -> Links to setup screen

Data is not suitable for categorization

A single dataset

Title
[dataset.name] does not provide data for meaningful categorization
Message
Because of the structure of the log messages in [dataset.name], they cannot be categorized in a meaningful way. Update your job configuration to improve the results. Learn more.
Button
Update configuration -> Links to setup screen

Multiple datasets

Title
Multiple datasets do not provide data for meaningful categorization
Message
Because of the structure of the log messages in [dataset.name], they cannot be categorized in a meaningful way. Update your job configuration to improve the results. Learn more.
Button
Update configuration -> Links to setup screen

Too little training data and not suitable

Title
Multiple datasets don’t provide data for meaningful categorization or provide too little training data
Message
Because of the structure of the log messages in [dataset.name], [dataset.name] and [dataset.name], they cannot be categorized in a meaningful way, or there is too little training data. Learn more.
Button
Update configuration -> Links to setup screen

-> The "Learn more" links should point to a docs page. @mukeshelastic, would you please provide the link?
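
To make the callout concrete, here is a minimal sketch of the single-dataset warning, assuming hypothetical props (datasetName, learnMoreUrl, onUpdateConfiguration), a placeholder icon, and plain strings instead of the final i18n-ready copy:

```tsx
import React from 'react';
import { EuiButton, EuiCallOut, EuiLink, EuiSpacer } from '@elastic/eui';

// Hypothetical props; the real component would wire these up to the
// categorization results view and the ML job setup screen.
interface CategorizationWarningProps {
  datasetName: string;
  learnMoreUrl: string;
  onUpdateConfiguration: () => void;
}

export const TooLittleTrainingDataCallout: React.FC<CategorizationWarningProps> = ({
  datasetName,
  learnMoreUrl,
  onUpdateConfiguration,
}) => (
  <EuiCallOut
    color="warning"
    iconType="alert"
    title={`${datasetName} does not provide enough training data`}
  >
    <p>
      Longer periods of time will improve the categorization results for {datasetName}. Update
      the configuration to improve your results.{' '}
      <EuiLink href={learnMoreUrl}>Learn more</EuiLink>
    </p>
    <EuiSpacer size="s" />
    <EuiButton color="warning" onClick={onUpdateConfiguration}>
      Update configuration
    </EuiButton>
  </EuiCallOut>
);
```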

Setup screen

[Screenshot: setup screen mockup (2020-03-02)]

The changes in the setup screen affect the index selection. With the new version it should be possible to select/deselect an entire index, but also the individual datasets within it.

Default selection

The default state will not change. When a user first enters the setup, all indices and their datasets are selected.

Warning message (left column)

The warning message should be the same as described above for the categorization view. If there is too little training data for a dataset, the warning message appears.

Additionally, the alert icon shows which of the indices/datasets has problems (see screenshot above). Hovering the icon shows a tooltip explaining the warning.

Index

Too little training data
One or more datasets in this index do not provide enough training data.

Data not suitable
One or more datasets in this index cannot be categorized in a meaningful way.

Dataset

Too little training data
The dataset does not provide enough training data.

Data not suitable
The data in this dataset cannot be categorized in a meaningful way.
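
As a rough illustration of the icon-plus-tooltip behaviour described above (the component name, reason keys, and copy are placeholders, not the final implementation):

```tsx
import React from 'react';
import { EuiIconTip } from '@elastic/eui';

// Hypothetical helper that renders the warning icon next to a dataset row in
// the setup screen's index selection; hovering shows the explanatory tooltip.
type WarningReason = 'tooLittleTrainingData' | 'dataNotSuitable';

const datasetTooltipText: Record<WarningReason, string> = {
  tooLittleTrainingData: 'The dataset does not provide enough training data.',
  dataNotSuitable: 'The data in this dataset cannot be categorized in a meaningful way.',
};

export const DatasetWarningIcon: React.FC<{ reason: WarningReason }> = ({ reason }) => (
  <EuiIconTip
    aria-label="Categorization warning"
    color="warning"
    content={datasetTooltipText[reason]}
    type="alert"
  />
);
```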


Design issue
Figma file

katrin-freihofner added the Team:Infra Monitoring UI - DEPRECATED label on Mar 2, 2020
@elasticmachine
Contributor

Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui)

@weltenwort
Member

Thank you for providing so many details, this looks great. A few thoughts come to mind:

Data quality criteria: It would probably be good to write down the specific criteria we want to use to determine whether there is "too little training data" or whether a dataset is "not suitable for categorization". In both cases I assume it would be evaluations of the count or cardinality?

  • count_docs(dataset) < min_training_count ➡️ "too little training data"?
  • count_categories(index, dataset) > max_category_count ➡️ "not suitable for categorization"?
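
A minimal sketch of what such checks could look like, assuming made-up threshold values that would still need to be agreed with the ML team:

```ts
// Illustrative only: these thresholds are made up, not values agreed with the ML team.
const MIN_TRAINING_COUNT = 10_000; // minimum documents per dataset
const MAX_CATEGORY_COUNT = 1_000; // maximum distinct categories per dataset

const hasTooLittleTrainingData = (docCount: number): boolean =>
  docCount < MIN_TRAINING_COUNT;

const isUnsuitableForCategorization = (categoryCount: number): boolean =>
  categoryCount > MAX_CATEGORY_COUNT;
```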

@sophiec20, can you provide any guidance on such quality criteria? IIRC you were considering emitting such warnings while running the ML jobs? Can we access these?

Well-known datasets: And what about the well-known filebeat module datasets, which we already know to be unsuitable? Do we want to hard-code a warning list for those?

Combination of warnings: From the UI perspective I wonder if the combined case "Too little training data and not suitable" should be displayed as two separate warnings? Otherwise the user might not be able to tell which is which and the combinatorial complexity in the implementation grows - especially if we possibly add more warnings in the future.

sgrodzicki added the Feature:Logs UI label on Mar 2, 2020
@katrin-freihofner
Contributor Author

> Combination of warnings: From the UI perspective I wonder if the combined case "Too little training data and not suitable" should be displayed as two separate warnings? Otherwise, the user might not be able to tell which is which and the combinatorial complexity in the implementation grows - especially if we possibly add more warnings in the future.

We discussed this and decided to have a single callout box but adjust the text accordingly. There is also a special case where one dataset has both problems. I think if it does not provide useful data for categorization we should not even mention too little training data - it won't be useful no matter how much data we have.

@mukeshelastic will help with the wording (so the text in the issue description is likely to change).

@mukeshelastic

@katrin-freihofner @weltenwort when we detect a lack of sufficient training data, we lack confidence in the displayed anomaly score. I wonder whether the appropriate user feedback is to:

  1. Show N/A or something similar in the anomaly score column for each category of the dataset where we detect this case.
  2. Show a warning message at the top, exactly as Katrin suggested, but tweak the message to communicate the lack of confidence in the anomaly score and hence why it is not displayed in the anomaly score column for the detected datasets.

@sophiec20
Contributor

sophiec20 commented Mar 13, 2020

> Specifies which field will be categorized. Using text data types is recommended. Categorization works best on machine written log messages, typically logging written by a developer for the purpose of system troubleshooting.

In ML, we have the helper text (above) which is aimed at helping users understand what categorization is designed for. It would be good to align on this if possible - the final sentence anyway.

// too little training data

Anomaly detection learns from training data. The probability of anomalies has already been adjusted according to the amount of training data seen. So I advise against the Logs UI picking an arbitrary value that defines whether enough training data has been seen. It depends on the data.

It is not the case that we lack confidence in the displayed anomaly score, because the model has already built this in.

The proposal above links to the Update Configuration page. I would have thought that the answer is usually to wait a bit longer, rather than to update the configuration. Perhaps I am missing something here, but it seems to me that this check can be avoided.

// not suitable for categorization

In 7.7 we now have categorization stats. The categorization_status can be warn or ok. If you query this for a running job, then this is our indicator of whether the data is suitable for categorization. elastic/elasticsearch#51879

Unfortunately, because categorization is not yet done on a per-partition basis, this status is also not yet partition aware. It gives a view of the overall job. It will be set to warn by a single dataset that is not suited to categorization, even if there are many other datasets that are all categorizing nicely.

In 7.6 we had a basic log category check, which would raise an ML job message if 1000 or more categories existed for a job before 100 buckets of results had been created. Because the Logs UI job is partitioned and has model_plot enabled, I believe it is possible to query this per partition, and that this will be a good indicator of a dataset with a message field that does not categorize well.
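
For reference, a rough sketch of how the Logs UI backend might read that status through the ML job stats API, using the JavaScript Elasticsearch client; the location of the new fields under model_size_stats follows elastic/elasticsearch#51879, and the final 7.7 response shape may differ:

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Returns true if the job-level categorization stats indicate the input data
// is not well suited to categorization. jobId is the Logs UI categories job.
async function isCategorizationInWarnState(jobId: string): Promise<boolean> {
  const { body } = await client.ml.getJobStats({ job_id: jobId });
  const jobStats = body.jobs?.[0];
  // categorization_status is 'ok' or 'warn' (added to model_size_stats in 7.7).
  return jobStats?.model_size_stats?.categorization_status === 'warn';
}
```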

// well-known datasets

In the end, we did not add this into ML. We did not feel that the business logic ought to be written into the back-end APIs. However, I would still think that it has value in the Logs UI application, which already has in-built logic to handle different dataset types. This hard-coded list could be extended over time based on telemetry. It would help with web access log data, which I suspect might be used with categorization but is actually structured data.
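
A hard-coded list in the Logs UI could be as simple as the sketch below; the dataset names are illustrative guesses for structured web access logs, not a vetted list:

```ts
// Illustrative, not a vetted list: filebeat module datasets whose messages are
// structured (e.g. web access logs) and therefore likely to categorize poorly.
const DATASETS_KNOWN_UNSUITABLE_FOR_CATEGORIZATION = new Set([
  'nginx.access',
  'apache.access',
  'iis.access',
]);

const isKnownUnsuitableDataset = (datasetName: string): boolean =>
  DATASETS_KNOWN_UNSUITABLE_FOR_CATEGORIZATION.has(datasetName);
```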

// combination of warnings

I do not believe that the "too little data" message should be a warning.

@sophiec20
Contributor

sophiec20 commented Mar 16, 2020

One thing I forgot about: we do have a job validation check in the ML UI which pertains to too little data. If there are fewer than 25 buckets or 2 hrs of data (whichever is greater), then we warn prior to job creation that there is too little data for the model to be initialized, and therefore no anomalies will be written until sufficient data has been seen. Meaning, there is no historical data to analyse and you'll have to wait for it to continue in real time until you start to see anomalies. I assumed the comments above were about the early lifetime of the job, which comes after the model initialisation, but that may not have been the case.
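
For context, that pre-creation check boils down to something like the following sketch of the rule as described (not the actual ML UI validation code):

```ts
// Warn before job creation if the selected time range covers less data than
// the greater of 25 bucket spans or 2 hours, as described above.
const TWO_HOURS_MS = 2 * 60 * 60 * 1000;

function hasTooLittleDataForModelInit(timeRangeMs: number, bucketSpanMs: number): boolean {
  const minimumMs = Math.max(25 * bucketSpanMs, TWO_HOURS_MS);
  return timeRangeMs < minimumMs;
}
```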

@weltenwort
Member

Thank you for the detailed response, @sophiec20!

With the awesome new model stats it sounds like we could do something like the following:

  1. Check the categorization_status stat.
  2. If the status is warn, perform per-partition queries to determine which partitions likely cause the high rare-categories count or a high category count with respect to the overall count.
  3. Display a warning that summarizes the results.
  4. Also annotate the dataset filter on the reconfiguration screen with those results.
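
Put together, the flow could look roughly like this sketch; every helper name here is hypothetical and only meant to show how the steps would hang together:

```ts
// Hypothetical orchestration of steps 1-4 above; none of these helpers exist yet.
interface DatasetQualityWarning {
  datasetName: string;
  reason: 'highCategoryCount' | 'manyRareCategories';
}

declare function fetchCategorizationStatus(jobId: string): Promise<'ok' | 'warn'>;
declare function findSuspectDatasets(jobId: string): Promise<DatasetQualityWarning[]>;

async function checkCategorizationQuality(jobId: string): Promise<DatasetQualityWarning[]> {
  // 1. Check the categorization_status stat for the job.
  const status = await fetchCategorizationStatus(jobId);
  if (status !== 'warn') {
    return [];
  }
  // 2. Narrow the job-level warning down to the partitions (datasets) that
  //    likely cause it, e.g. via per-partition category counts.
  return findSuspectDatasets(jobId);
}

// 3./4. The UI would then render a warning callout summarizing these results
// and annotate the affected datasets in the setup screen's index selection.
```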

Does that make sense?

I think the idea behind warning about "too little data" would be to indicate that some datasets might never have enough documents for training due to their rare occurrence. But maybe that detail is not useful enough to warrant confusing the user with it?

@sophiec20
Contributor

sophiec20 commented Mar 18, 2020

@weltenwort the steps 1-4 above sound good. In addition, due to categorization_status: warn not yet being partition aware, I'd suggest a 4b which would be to provide a useful message if the status was set to warn but all partitions looked good according to the basic count check.

There are other reasons that can indicate that the messages are not suited to categorization:

> that suggests the input data is inappropriate for categorization. Problems could be that there is only one category, more than 90% of categories are rare, the number of categories is greater than 50% of the number of categorized documents, there are no frequently matched categories, or more than 50% of categories are dead.

These cannot be assessed using elasticsearch queries, as they are metrics captured as we model. The ML UI categorization wizard does do some pre-flight data validations using _analyze. These seem to me to be too big a lift to include in the Logs UI onboarding workflow, but I wanted to share for visibility. #60502

For 7.7, so that our end-users can get the most benefit from categorizing data that is categorizable, I think a pragmatic approach would be to:

  • Identify datasets where the category count is very high (likely common) and allow the end-user to de-select these.
  • Identify well-known datasets that are not suited to categorization (likely common) and allow the end-user to de-select these.
  • Educate the end-user (via on-screen help) on what type of data is best suited to categorization and allow them to use their judgement to exclude datasets from being analyzed. Categorization works best on machine-written log messages, typically logging written by a developer for the purpose of system troubleshooting.
  • Educate the end-user (via on-screen help) on what other reasons may have caused the job to be in a warn status and allow them to use their judgement to exclude datasets.

Beyond 7.7, there are options for a smoother experience from the ML side, such as making categorization_status partition aware, having some self-correcting logic in the job to exclude partitions that are not suited, or having a data validation endpoint. There are also options from the Logs UI side, such as allowing multiple categorization jobs, which would potentially give better results when looking at datasets with very different data rates (especially if the logs do not belong to related systems), or incorporating something similar to the current ML categorization wizard checks. From the ML side, we will have these discussions soon.

@mukeshelastic

We split this issue into many sub-issues.
Done: #60385
To be completed: #60392, #60390
so closing this issue.
