[ML] Improve loading time for populated fields in Transform creation wizard (elastic#164790)

## Summary

Follow-up to elastic#163371. This PR removes an extra call in Transform,
because the same request to fetch 500 sample docs has already been made
via the Field Stats provider. It also reduces the number of docs the
Field Stats provider fetches from 1000 to 500 for consistency.

If the performance journey improves, we can make the same change to Data
Frame Analytics.
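
To illustrate what "populated fields" means in this change, here is a minimal sketch of deriving them from a sample of search hits. It assumes a simplified hit shape; `SampleHit` and `getPopulatedFields` are hypothetical names used for illustration only and are not part of Kibana.

```typescript
// Hypothetical, simplified shape of a search hit; real responses carry more properties.
interface SampleHit {
  fields?: Record<string, unknown[]>;
}

// Collect the unique field names that appear in at least one sampled document.
function getPopulatedFields(hits: SampleHit[]): Set<string> {
  const populated = new Set<string>();
  for (const hit of hits) {
    for (const fieldName of Object.keys(hit.fields ?? {})) {
      populated.add(fieldName);
    }
  }
  return populated;
}

// Made-up sample data: the third hit has no `fields` at all.
const sampleHits: SampleHit[] = [
  { fields: { 'host.name': ['a'], '@timestamp': ['2023-08-31T00:00:00Z'] } },
  { fields: { 'host.name': ['b'], 'event.code': [42] } },
  {},
];

const populatedFieldsExample = getPopulatedFields(sampleHits);
console.log(Array.from(populatedFieldsExample).sort());
```

Once one component has computed such a set, any other consumer that needs the same information can reuse it instead of sampling the index again, which is the idea behind this PR.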
### Checklist

Delete any items that are not applicable to this PR.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)


### Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to
identify risks that should be tested prior to the change/feature
release.

When forming the risk matrix, consider some of the following examples
and how they may potentially impact the change:

| Risk | Probability | Severity | Mitigation/Notes |
|------|-------------|----------|------------------|
| Multiple Spaces: unexpected behavior in non-default Kibana Space. | Low | High | Integration tests will verify that all features are still supported in non-default Kibana Space and when user switches between spaces. |
| Multiple nodes: Elasticsearch polling might have race conditions when multiple Kibana nodes are polling for the same tasks. | High | Low | Tasks are idempotent, so executing them multiple times will not result in logical error, but will degrade performance. To test for this case we add plenty of unit tests around this logic and document manual testing procedure. |
| Code should gracefully handle cases when feature X or plugin Y are disabled. | Medium | High | Unit tests will verify that any feature flag or plugin combination still results in our service operational. |
| [See more potential risk examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx) | | | |


### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

---------

Co-authored-by: kibanamachine <[email protected]>
2 people authored and eokoneyo committed Aug 31, 2023
1 parent 0ed80b0 commit 966c91c
Showing 5 changed files with 56 additions and 32 deletions.
@@ -76,7 +76,7 @@ export const FieldStatsFlyoutProvider: FC<{
     fields: ['*'],
     _source: false,
     ...queryAndRunTimeMappings,
-    size: 1000,
+    size: 500,
   },
 };
 const cacheKey = stringHash(JSON.stringify(esSearchRequestParams)).toString();
2 changes: 2 additions & 0 deletions x-pack/plugins/ml/public/shared.ts
@@ -16,3 +16,5 @@ export * from '../common/util/validators';
 export * from './application/formatters/metric_change_description';
 export * from './application/components/field_stats_flyout';
 export * from './application/data_frame_analytics/common';
+
+export { useFieldStatsFlyoutContext } from './application/components/field_stats_flyout/use_field_stats_flytout_context';
63 changes: 35 additions & 28 deletions x-pack/plugins/transform/public/app/hooks/use_index_data.ts
@@ -51,7 +51,8 @@ export const useIndexData = (
   dataView: SearchItems['dataView'],
   query: TransformConfigQuery,
   combinedRuntimeMappings?: StepDefineExposedState['runtimeMappings'],
-  timeRangeMs?: TimeRangeMs
+  timeRangeMs?: TimeRangeMs,
+  populatedFields?: Set<string> | null
 ): UseIndexDataReturnType => {
   const { analytics } = useAppDependencies();

@@ -96,47 +97,53 @@ export const useIndexData = (
     // (for example, as part of filebeat/metricbeat/ECS based indices)
     // to the data grid component which would significantly slow down the page.
     const fetchDataGridSampleDocuments = async function () {
-      setErrorMessage('');
-      setStatus(INDEX_STATUS.LOADING);
-
-      const esSearchRequest = {
-        index: indexPattern,
-        body: {
-          fields: ['*'],
-          _source: false,
-          query: {
-            function_score: {
-              query: defaultQuery,
-              random_score: {},
+      let populatedDataViewFields = populatedFields ? [...populatedFields] : [];
+      let isMissingFields = populatedDataViewFields.length === 0;
+
+      // If populatedFields are not provided, make own request to calculate
+      if (populatedFields === undefined) {
+        setErrorMessage('');
+        setStatus(INDEX_STATUS.LOADING);
+
+        const esSearchRequest = {
+          index: indexPattern,
+          body: {
+            fields: ['*'],
+            _source: false,
+            query: {
+              function_score: {
+                query: defaultQuery,
+                random_score: {},
+              },
             },
+            size: 500,
           },
-          size: 500,
-        },
-      };
+        };

-      const resp = await dataSearch(esSearchRequest, abortController.signal);
+        const resp = await dataSearch(esSearchRequest, abortController.signal);

-      if (!isEsSearchResponse(resp)) {
-        setErrorMessage(getErrorMessage(resp));
-        setStatus(INDEX_STATUS.ERROR);
-        return;
-      }
+        if (!isEsSearchResponse(resp)) {
+          setErrorMessage(getErrorMessage(resp));
+          setStatus(INDEX_STATUS.ERROR);
+          return;
+        }
+        const docs = resp.hits.hits.map((d) => getProcessedFields(d.fields ?? {}));
+        isMissingFields = resp.hits.hits.every((d) => typeof d.fields === 'undefined');
+
+        populatedDataViewFields = [...new Set(docs.map(Object.keys).flat(1))];
+      }
       const isCrossClusterSearch = indexPattern.includes(':');
-      const isMissingFields = resp.hits.hits.every((d) => typeof d.fields === 'undefined');
-
-      const docs = resp.hits.hits.map((d) => getProcessedFields(d.fields ?? {}));

       // Get all field names for each returned doc and flatten it
       // to a list of unique field names used across all docs.
       const allDataViewFields = getFieldsFromKibanaIndexPattern(dataView);
-      const populatedFields = [...new Set(docs.map(Object.keys).flat(1))]
+      const filteredDataViewFields = populatedDataViewFields
         .filter((d) => allDataViewFields.includes(d))
         .sort();

       setCcsWarning(isCrossClusterSearch && isMissingFields);
       setStatus(INDEX_STATUS.LOADED);
-      setDataViewFields(populatedFields);
+      setDataViewFields(filteredDataViewFields);
     };

     fetchDataGridSampleDocuments();
@@ -145,7 +152,7 @@ export const useIndexData = (
       abortController.abort();
     };
     // eslint-disable-next-line react-hooks/exhaustive-deps
-  }, [timeRangeMs]);
+  }, [timeRangeMs, populatedFields?.size]);

   const columns: EuiDataGridColumn[] = useMemo(() => {
     if (typeof dataViewFields === 'undefined') {
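
The final step in `fetchDataGridSampleDocuments` keeps only populated fields that the data view also knows about, then sorts them. A minimal standalone sketch of that step follows. `filterPopulatedFields` is a hypothetical helper name, and it uses a `Set` lookup where the diff uses `allDataViewFields.includes(d)`; the result should be the same, but this is an illustration rather than the code in the PR.

```typescript
// Hypothetical helper: intersect the populated field names with the
// field names defined on the data view, and return them sorted.
function filterPopulatedFields(
  populatedFieldNames: Set<string>,
  allDataViewFields: string[]
): string[] {
  // Swapped-in technique: Set membership instead of Array.prototype.includes.
  const known = new Set(allDataViewFields);
  return Array.from(populatedFieldNames)
    .filter((name) => known.has(name))
    .sort();
}

// Made-up inputs for illustration.
const populatedNames = new Set<string>(['host.name', 'unmapped.field', '@timestamp']);
const dataViewFieldNames = ['@timestamp', 'host.name', 'event.code'];

const gridFields = filterPopulatedFields(populatedNames, dataViewFieldNames);
console.log(gridFields);
```

Fields that are populated but absent from the data view (here `unmapped.field`) are dropped, as are data view fields with no values in the sample (here `event.code`).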
@@ -39,7 +39,10 @@ export const useSearchItems = (defaultSavedObjectId: string | undefined) => {
     }

     try {
-      fetchedSavedSearch = await appDeps.savedSearch.get(id);
+      // If data view already found, no need to get saved search
+      if (!fetchedDataView) {
+        fetchedSavedSearch = await appDeps.savedSearch.get(id);
+      }
     } catch (e) {
       // Just let fetchedSavedSearch stay undefined in case it doesn't exist.
     }
@@ -58,7 +58,7 @@ import {
 import { useDocumentationLinks } from '../../../../hooks/use_documentation_links';
 import { useIndexData } from '../../../../hooks/use_index_data';
 import { useTransformConfigData } from '../../../../hooks/use_transform_config_data';
-import { useToastNotifications } from '../../../../app_dependencies';
+import { useAppDependencies, useToastNotifications } from '../../../../app_dependencies';
 import { SearchItems } from '../../../../hooks/use_search_items';
 import { getAggConfigFromEsAgg } from '../../../../common/pivot_aggs';

@@ -120,8 +120,20 @@ export const StepDefineForm: FC<StepDefineFormProps> = React.memo((props) => {
   const { transformConfigQuery } = stepDefineForm.searchBar.state;
   const { runtimeMappings } = stepDefineForm.runtimeMappingsEditor.state;

+  const appDependencies = useAppDependencies();
+  const {
+    ml: { useFieldStatsFlyoutContext },
+  } = appDependencies;
+
+  const fieldStatsContext = useFieldStatsFlyoutContext();
   const indexPreviewProps = {
-    ...useIndexData(dataView, transformConfigQuery, runtimeMappings, timeRangeMs),
+    ...useIndexData(
+      dataView,
+      transformConfigQuery,
+      runtimeMappings,
+      timeRangeMs,
+      fieldStatsContext?.populatedFields ?? null
+    ),
     dataTestSubj: 'transformIndexPreview',
     toastNotifications,
   };
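
This call site passes `fieldStatsContext?.populatedFields ?? null`, so `useIndexData` receives either a `Set` or `null` here, never `undefined`. In the hook, only `undefined` triggers its own sample request. The sketch below models that three-way distinction in isolation; the type alias and both function names are hypothetical and exist only to illustrate the branching shown in the diff.

```typescript
// Hypothetical alias for the three states the new parameter can take.
type PopulatedFieldsArg = Set<string> | null | undefined;

// Mirrors the `populatedFields === undefined` check in the diff:
// only an omitted argument makes the hook fetch its own sample docs.
function shouldFetchOwnSample(populatedFields: PopulatedFieldsArg): boolean {
  return populatedFields === undefined;
}

// Mirrors the initial values computed before the optional fetch.
function resolveInitialFields(populatedFields: PopulatedFieldsArg): {
  fields: string[];
  isMissingFields: boolean;
} {
  const fields = populatedFields ? Array.from(populatedFields) : [];
  return { fields, isMissingFields: fields.length === 0 };
}

console.log(shouldFetchOwnSample(undefined)); // argument omitted: hook samples the index itself
console.log(shouldFetchOwnSample(null)); // provider present but no result yet: no extra request
console.log(resolveInitialFields(new Set(['host.name'])));
```

Under this reading, `null` acts as "a provider exists, wait for it", and the `populatedFields?.size` entry in the effect's dependency array re-runs the effect when the provider's set changes size.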