[ML] AIOps: Use ml_standard tokenizer for log rate analysis. (#176587)
## Summary

Fixes #176387.

The `standard` analyzer for log pattern analysis introduced in #172188
can return patterns that interfere with identifying significant
patterns across time ranges, for example when a pattern matches
different parts of a date or time. This change allows log rate analysis
to use the `ml_standard` analyzer while log pattern analysis keeps
`standard`.
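
The mechanism can be sketched in isolation. The following is a minimal, simplified illustration and not the actual Kibana code: `CategorizeTextAgg`, `customAnalyzer`, and `buildCategorizeText` are hypothetical names, and the sketch assumes that omitting `categorization_analyzer` leaves Elasticsearch to apply its default analyzer, which per the summary above results in `ml_standard`.

```typescript
// Hypothetical, simplified types for illustration only; not Kibana's actual definitions.
interface CategorizationAnalyzer {
  tokenizer: string;
  char_filter?: string[];
}

interface CategorizeTextAgg {
  field: string;
  size: number;
  categorization_analyzer?: CategorizationAnalyzer;
}

// Stand-in for the custom analyzer that log pattern analysis keeps using.
const customAnalyzer: CategorizationAnalyzer = {
  tokenizer: 'standard',
  char_filter: ['first_line_with_letters'],
};

function buildCategorizeText(
  field: string,
  size: number,
  useStandardTokenizer: boolean = true
): CategorizeTextAgg {
  return {
    field,
    size,
    // Conditional spread: the key is only added when the flag is true.
    // When the flag is false the key is absent from the request entirely.
    ...(useStandardTokenizer ? { categorization_analyzer: customAnalyzer } : {}),
  };
}

// Log pattern analysis: default flag, custom analyzer is included.
const patternAnalysisAgg = buildCategorizeText('message', 1000);

// Log rate analysis: flag set to false, analyzer key is omitted.
const logRateAnalysisAgg = buildCategorizeText('message', 1000, false);
```

Defaulting the flag to `true` keeps existing callers unchanged; only the log rate analysis call site needs to opt out.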

### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [x] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [x] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
walterra authored Feb 12, 2024
1 parent 459dff1 commit 4566ef7
Showing 4 changed files with 8 additions and 60 deletions.
@@ -31,15 +31,16 @@ export function createCategoryRequest(
queryIn: QueryDslQueryContainer,
wrap: ReturnType<typeof createRandomSamplerWrapper>['wrap'],
intervalMs?: number,
- additionalFilter?: CategorizationAdditionalFilter
+ additionalFilter?: CategorizationAdditionalFilter,
+ useStandardTokenizer: boolean = true
) {
const query = createCategorizeQuery(queryIn, timeField, timeRange);
const aggs = {
categories: {
categorize_text: {
field,
size: CATEGORY_LIMIT,
- categorization_analyzer: categorizationAnalyzer,
+ ...(useStandardTokenizer ? { categorization_analyzer: categorizationAnalyzer } : {}),
},
aggs: {
examples: {
@@ -85,61 +85,6 @@ describe('getCategoryRequest', () => {
aggs: {
categories: {
categorize_text: {
- categorization_analyzer: {
- char_filter: ['first_line_with_letters'],
- tokenizer: 'standard',
- filter: [
- {
- type: 'stop',
- stopwords: [
- 'Monday',
- 'Tuesday',
- 'Wednesday',
- 'Thursday',
- 'Friday',
- 'Saturday',
- 'Sunday',
- 'Mon',
- 'Tue',
- 'Wed',
- 'Thu',
- 'Fri',
- 'Sat',
- 'Sun',
- 'January',
- 'February',
- 'March',
- 'April',
- 'May',
- 'June',
- 'July',
- 'August',
- 'September',
- 'October',
- 'November',
- 'December',
- 'Jan',
- 'Feb',
- 'Mar',
- 'Apr',
- 'May',
- 'Jun',
- 'Jul',
- 'Aug',
- 'Sep',
- 'Oct',
- 'Nov',
- 'Dec',
- 'GMT',
- 'UTC',
- ],
- },
- {
- type: 'limit',
- max_token_count: '100',
- },
- ],
- },
field: 'the-field-name',
size: 1000,
},
@@ -76,7 +76,10 @@ export const getCategoryRequest = (
timeFieldName,
undefined,
query,
- wrap
+ wrap,
+ undefined,
+ undefined,
+ false
);

// In this case we're only interested in the aggregation which
3 changes: 1 addition & 2 deletions x-pack/test/functional/apps/aiops/log_rate_analysis.ts
@@ -36,8 +36,7 @@ export default function ({ getPageObjects, getService }: FtrProviderContext) {
await ml.jobSourceSelection.selectSourceForLogRateAnalysis(testData.sourceIndexOrSavedSearch);
});

- // FLAKY: https://github.com/elastic/kibana/issues/176387
- it.skip(`${testData.suiteTitle} displays index details`, async () => {
+ it(`${testData.suiteTitle} displays index details`, async () => {
await ml.testExecution.logTestStep(`${testData.suiteTitle} displays the time range step`);
await aiops.logRateAnalysisPage.assertTimeRangeSelectorSectionExists();
