-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Visualizations] Expensive queries are causing unnecessary load and delays on Elasticsearch #99031
Changes from 2 commits
57dcc8b
be17145
480e392
94898df
2e1509c
390b601
a8918ef
2684af9
29f119c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,12 +6,11 @@ | |
* Side Public License, v 1. | ||
*/ | ||
|
||
import { countBy, get, groupBy, mapValues, max, min, values } from 'lodash'; | ||
import { ElasticsearchClient } from 'kibana/server'; | ||
import type { estypes } from '@elastic/elasticsearch'; | ||
|
||
import { countBy, groupBy, mapValues, max, min, values } from 'lodash'; | ||
import { getPastDays } from './get_past_days'; | ||
type ESResponse = estypes.SearchResponse<{ visualization: { visState: string } }>; | ||
|
||
import type { SavedObjectsClientContract, SavedObjectsFindResult } from '../../../../core/server'; | ||
import type { SavedVisState } from '../../../visualizations/common'; | ||
|
||
interface VisSummary { | ||
type: string; | ||
|
@@ -35,61 +34,52 @@ export interface VisualizationUsage { | |
* Parse the response data into telemetry payload | ||
*/ | ||
export async function getStats( | ||
esClient: ElasticsearchClient, | ||
index: string | ||
soClient: SavedObjectsClientContract | ||
): Promise<VisualizationUsage | undefined> { | ||
const searchParams = { | ||
size: 10000, // elasticsearch index.max_result_window default value | ||
index, | ||
ignoreUnavailable: true, | ||
filterPath: [ | ||
'hits.hits._id', | ||
'hits.hits._source.visualization', | ||
'hits.hits._source.updated_at', | ||
], | ||
body: { | ||
query: { | ||
bool: { filter: { term: { type: 'visualization' } } }, | ||
}, | ||
}, | ||
}; | ||
const { body: esResponse } = await esClient.search<ESResponse>(searchParams); | ||
const size = get(esResponse, 'hits.hits.length', 0); | ||
if (size < 1) { | ||
return; | ||
} | ||
|
||
// `map` to get the raw types | ||
const visSummaries: VisSummary[] = esResponse.hits.hits.map((hit) => { | ||
const spacePhrases = hit._id.toString().split(':'); | ||
const lastUpdated: string = get(hit, '_source.updated_at'); | ||
const space = spacePhrases.length === 3 ? spacePhrases[0] : 'default'; // if in a custom space, the format of a saved object ID is space:type:id | ||
const visualization = get(hit, '_source.visualization', { visState: '{}' }); | ||
const visState: { type?: string } = JSON.parse(visualization.visState); | ||
return { | ||
type: visState.type || '_na_', | ||
space, | ||
past_days: getPastDays(lastUpdated), | ||
}; | ||
const finder = await soClient.createPointInTimeFinder({ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. my question applies to all refactored telemetries (#98903, #98914, #99023) where we replaced I've tried to fix that by adding @lukeelmers @afharo could you please help me a little cause we have a recommendation to use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, yeah we specifically decided against adding namespace support when using PIT until we had a concrete need for it. I guess we do now 🙂 I've created an issue to track this enhancement: #99044 @afharo WDYT, do we push forward on this work without There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I vote to wait, because the code with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
IMO, it feels like we keep coming back and forth with this conversation. And, usually, AFAIK, the general recommendation is to use the SO Client whenever possible because using plain ES client requests would break if/when we change the internal implementations of SOs (underlying indices, ways to store them, ...). I'm not fully familiar with the limitations of allowing So ++ to wait :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've been playing around with this PR locally, and I think these changes would solve the issue: alexwizp#6 I tested it locally and it works as expected to me :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nevermind, I don't know how I thought it worked when I tried... It's def not working 🤷 Sorry about that! Then, we need to rely on the outcome of #99044 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So I guess the question remains, do we adjust the scope of this PR as follows?
Basically, since @alexwizp has already started the work, is it worth merging with the same 10k size, only switching to SO client instead? Or do we close this PR and consider it blocked by #99044? WDYT @alexwizp @afharo (Either way it doesn't actually resolve #93770, it would just be a small refactor if we moved ahead) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't see a lot of sense just to switch There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I will add #99044 to our list of items to discuss for our next sprint. I don't think it would be a particularly large amount of work, it's mostly about deciding on a strategy with the security team & ensuring there's adequate test coverage. |
||
type: 'visualization', | ||
perPage: 1000, | ||
}); | ||
|
||
// organize stats per type | ||
const visTypes = groupBy(visSummaries, 'type'); | ||
const visSummaries: VisSummary[] = []; | ||
|
||
// get the final result | ||
return mapValues(visTypes, (curr) => { | ||
const total = curr.length; | ||
const spacesBreakdown = countBy(curr, 'space'); | ||
const spaceCounts: number[] = values(spacesBreakdown); | ||
for await (const response of finder.find()) { | ||
(response.saved_objects || []).forEach((so: SavedObjectsFindResult<any>) => { | ||
if (so.attributes?.visState) { | ||
const visState: SavedVisState = JSON.parse(so.attributes.visState); | ||
const spacePhrases = so.id.toString().split(':'); | ||
// if in a custom space, the format of a saved object ID is space:type:id | ||
const space = spacePhrases.length === 3 ? spacePhrases[0] : 'default'; | ||
|
||
return { | ||
total, | ||
spaces_min: min(spaceCounts), | ||
spaces_max: max(spaceCounts), | ||
spaces_avg: total / spaceCounts.length, | ||
saved_7_days_total: curr.filter((c) => c.past_days <= 7).length, | ||
saved_30_days_total: curr.filter((c) => c.past_days <= 30).length, | ||
saved_90_days_total: curr.filter((c) => c.past_days <= 90).length, | ||
}; | ||
}); | ||
visSummaries.push({ | ||
type: visState.type || '_na_', | ||
space, | ||
past_days: getPastDays(so.updated_at!), | ||
}); | ||
} | ||
}); | ||
} | ||
await finder.close(); | ||
|
||
if (visSummaries.length) { | ||
// organize stats per type | ||
const visTypes = groupBy(visSummaries, 'type'); | ||
|
||
// get the final result | ||
return mapValues(visTypes, (curr) => { | ||
const total = curr.length; | ||
const spacesBreakdown = countBy(curr, 'space'); | ||
const spaceCounts: number[] = values(spacesBreakdown); | ||
|
||
return { | ||
total, | ||
spaces_min: min(spaceCounts), | ||
spaces_max: max(spaceCounts), | ||
spaces_avg: total / spaceCounts.length, | ||
saved_7_days_total: curr.filter((c) => c.past_days <= 7).length, | ||
saved_30_days_total: curr.filter((c) => c.past_days <= 30).length, | ||
saved_90_days_total: curr.filter((c) => c.past_days <= 90).length, | ||
}; | ||
}); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code was removed because we can get space in more legal way from
so.namespaces