[ML] Use random sampler for field statistics table in Discover and Data visualizer #138953
…/kibana into ml-dv-random-sampler-table
Pinging @elastic/ml-ui (:ml)
💛 Build succeeded, but was flaky
@@ -154,9 +147,9 @@ export const TopValues: FC<Props> = ({ stats, fieldFormat, barColor, compressed,
<EuiText size="xs" textAlign={'center'}>
<FormattedMessage
id="xpack.dataVisualizer.dataGrid.field.topValues.calculatedFromSampleDescription"
- defaultMessage="Calculated from sample of {topValuesSamplerShardSize} documents per shard"
+ defaultMessage="Calculated from sample of {topValuesSampleSize} documents"
@@ -141,6 +172,7 @@ export const EmbeddableWrapper = ({
showPreviewByDefault={input?.showPreviewByDefault}
onChange={onOutputChange}
loading={progress < 100}
+ totalCount={overallStats?.documentCountStats?.totalCount ?? 0}
The types suggest overallStats will always be defined.
]
);

const { overallStats, progress: overallStatsProgress } = useOverallStats(
fieldStatsRequest,
lastRefresh,
browserSessionSeed,
- dataVisualizerListState.probability
+ input?.samplingMode === 'autoRandomSampler' ? null : dataVisualizerListState.probability
The types suggest input will always be defined.
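The ternary in the hunk above can be read as a small rule for choosing the probability sent with the request. The sketch below is only an illustration: the function names are hypothetical, the reading of null as "let the request pick the probability" is an assumption based on the diff, and the fallback to a probability reported in the response is inferred from the neighbouring hunk rather than confirmed.

```typescript
/**
 * Probability to send with the overall stats request.
 * Assumption: null means the request should choose the probability itself.
 * Hypothetical helper, not part of the PR.
 */
function resolveRequestedProbability(
  samplingMode: string | undefined,
  userProbability: number | null
): number | null {
  return samplingMode === 'autoRandomSampler' ? null : userProbability;
}

/**
 * Probability to display: the user's explicit choice when there is one,
 * otherwise whatever the response reported (assumed behaviour).
 */
function resolveDisplayedProbability(
  userProbability: number | null,
  reportedProbability: number | undefined
): number | undefined {
  return userProbability === null ? reportedProbability : userProbability;
}
```

Keeping both decisions in named helpers would make the null convention visible in one place instead of being repeated inline at each call site.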
- dataVisualizerListState
+ dataVisualizerListState,
(dataVisualizerListState.probability === null
? overallStats?.documentCountStats?.probability
The types suggest overallStats will always be defined.
* The preferred mode for sampling data for the field statistics
* default as 'autoRandomSampler'
*/
samplingMode?: string;
Do we know the supported values for the sampling mode? autoRandomSampler and what else?
This could be a union of all allowed types.
At the moment the code suggests it could just be:
samplingMode?: 'autoRandomSampler';
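One way to act on this suggestion is to derive the union from a single list of allowed values, so the type and any runtime check stay in sync. This is a minimal sketch: only 'autoRandomSampler' comes from the PR, and SAMPLING_MODES, SamplingMode, FieldStatisticsInput and isSamplingMode are hypothetical names.

```typescript
// Only 'autoRandomSampler' appears in the PR; further modes would be appended here.
const SAMPLING_MODES = ['autoRandomSampler'] as const;

// Union of all allowed values, derived from the list above.
type SamplingMode = (typeof SAMPLING_MODES)[number];

// Hypothetical input shape, standing in for the embeddable input in the diff.
interface FieldStatisticsInput {
  /** The preferred mode for sampling data for the field statistics. */
  samplingMode?: SamplingMode;
}

// Runtime guard for values arriving from untyped sources (saved state, URL).
function isSamplingMode(value: unknown): value is SamplingMode {
  return (
    typeof value === 'string' &&
    (SAMPLING_MODES as readonly string[]).indexOf(value) !== -1
  );
}
```

With the union in place, a typo such as 'autoRandomSample' in a comparison becomes a compile-time error instead of a silently false condition.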
aggs: any,
probability: number | null,
seed: number
): Record<string, estypes.AggregationsAggregationContainer> {
It looks like there are a couple of versions of this type:
https://github.com/qn895/kibana/blob/75b2944216cda3f30dffc048972401a5e65e0af2/x-pack/plugins/data_visualizer/common/types/field_stats.ts#L238
IMO the type Aggregation = Record<string, estypes.AggregationsAggregationContainer>; is better and matches how aggregations are described in the es client types.
If we clean these up and choose one type, this function could return Aggregation.
* Wraps the supplied aggregations in a random sampler aggregation.
*/
export function buildRandomSamplerAggregation(
aggs: any,
Can this any be replaced with the correct type? Looks like the same type Aggregation = Record<string, estypes.AggregationsAggregationContainer>; as in the previous comment.
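A typed version of the helper might look roughly like the sketch below. To keep it self-contained, AggregationContainer is a loose local stand-in for estypes.AggregationsAggregationContainer, and the guard that skips sampling for a null or out-of-range probability is an assumption, since the corresponding lines of the PR are not shown in full here.

```typescript
// Loose stand-in for estypes.AggregationsAggregationContainer (assumption, so the
// sketch compiles without the Elasticsearch client installed).
type AggregationContainer = Record<string, unknown>;

// Single shared alias, as proposed in the review comments.
type Aggregation = Record<string, AggregationContainer>;

/**
 * Wraps the supplied aggregations in a random sampler aggregation.
 * Assumption: a null probability, or one outside (0, 1), means "do not sample",
 * in which case the aggregations are returned unchanged.
 */
function buildRandomSamplerAggregation(
  aggs: Aggregation,
  probability: number | null,
  seed: number
): Aggregation {
  if (probability === null || probability <= 0 || probability >= 1) {
    return aggs;
  }

  return {
    sample: {
      random_sampler: { probability, seed },
      aggs,
    },
  };
}
```

Typing the parameter as Aggregation rather than any lets the compiler reject a caller that passes a whole search body or an array by mistake.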
return {
sample: {
aggs,
// @ts-expect-error AggregationsAggregationContainer needs to be updated with random_sampler
Has an issue been raised in the elasticsearch-specification repo to correct these types?
}

return {
sample: {
Looks like sample could be pulled out here and made into a constant. This could then be used throughout the code where things like const aggsPath = ['sample']; are used.
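The constant suggested here could be shared between the code that builds the request and the code that reads the response. The following is a sketch under assumptions: RANDOM_SAMPLER_AGG_NAME, getSamplerAggsPath and getAtPath are hypothetical names, and the response shape in the usage note is made up for illustration.

```typescript
// Hypothetical constant replacing the repeated 'sample' string literal.
const RANDOM_SAMPLER_AGG_NAME = 'sample';

/** Path prefix under which wrapped results live; empty when no sampling was applied. */
function getSamplerAggsPath(isSampled: boolean): string[] {
  return isSampled ? [RANDOM_SAMPLER_AGG_NAME] : [];
}

/** Walks an object along the given keys, returning undefined if a step is missing. */
function getAtPath(source: unknown, path: string[]): unknown {
  let current: unknown = source;
  for (const key of path) {
    if (typeof current !== 'object' || current === null) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}
```

For example, getAtPath(aggregations, getSamplerAggsPath(true).concat(['max_price', 'value'])) reads the value of a max_price aggregation from inside the sampler wrapper, and renaming the wrapper later only requires changing the constant.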
@@ -221,7 +223,7 @@ export const DataVisualizerTable = <T extends DataVisualizerTableItem>({
defaultMessage: 'Documents (%)',
}),
render: (value: number | undefined, item: DataVisualizerTableItem) => (
- <DocumentStat config={item} showIcon={dimensions.showIcon} />
+ <DocumentStat config={item} showIcon={dimensions.showIcon} totalCount={totalCount} />
Regarding the console error Pete was seeing, we have some reference code in explain log rate spikes that shows how to handle this: https://github.com/elastic/kibana/blob/main/x-pack/plugins/aiops/public/hooks/use_document_count_stats.ts#L134
Closing as replaced by #144646.
## Summary
This PR removes the beta badge for the Field statistics table.
<img width="1791" alt="Screen Shot 2022-09-19 at 12 22 30" src="https://user-images.githubusercontent.com/43350163/191076625-9489eaa0-2488-4a5a-b737-e32724d3bffc.png">
Points of consideration for keeping the beta badge:
- Easier for us to keep collecting more user feedback.
- Potentially switching to [using the new random sampler aggregation for the field statistics table](#138953) in the next release. Currently, we are pausing this work to match up with the popover (#139072 and #140667) and to fine-tune the user experience/performance.
Points of consideration for removing the beta badge:
- The field stats table has been available to users since 8.1, and has been in use within ML since 7.x. We should define clear criteria for when it can be moved to GA.
Summary
Follow-up to #136150. This PR replaces the previously used sampling aggregation with the new random sampler in the field statistics table. Changes include:
samplerShardSize