-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Adds sampled % of documents & cardinality for text fields for Data visualizer/Field stats & fix missing bucket in doc count chart #172378
Conversation
Pinging @elastic/ml-ui (:ml) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left two minor suggestions.
type === SUPPORTED_FIELD_TYPES.TEXT ? ( | ||
<FormattedMessage | ||
id="xpack.dataVisualizer.sampledPercentageForTextFieldsMsg" | ||
defaultMessage="The % of documents for text fields is sampled and calculated from {sampledDocumentsFormatted} sample {sampledDocuments, plural, one {record} other {records}}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could be a bit simpler.
defaultMessage="The % of documents for text fields is sampled and calculated from {sampledDocumentsFormatted} sample {sampledDocuments, plural, one {record} other {records}}." | |
defaultMessage="The % of documents for text fields is calculated from a sample of {sampledDocumentsFormatted} {sampledDocuments, plural, one {record} other {records}}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here e6facd0
type === SUPPORTED_FIELD_TYPES.TEXT ? ( | ||
<FormattedMessage | ||
id="xpack.dataVisualizer.sampledCardinalityForTextFieldsMsg" | ||
defaultMessage="The cardinality for text fields is sampled and calculated from {sampledDocumentsFormatted} sample {sampledDocuments, plural, one {record} other {records}}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above.
defaultMessage="The cardinality for text fields is sampled and calculated from {sampledDocumentsFormatted} sample {sampledDocuments, plural, one {record} other {records}}." | |
defaultMessage="The cardinality for text fields is calculated from a sample of {sampledDocumentsFormatted} {sampledDocuments, plural, one {record} other {records}}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here e6facd0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UI text LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some minor comments, but overall LGTM
import { ES_FIELD_TYPES, KBN_FIELD_TYPES } from '@kbn/field-types'; | ||
import { SUPPORTED_FIELD_TYPES } from '../../../../../../../common/constants'; | ||
import { useDataVisualizerKibana } from '../../../../../kibana_context'; | ||
import { FieldDataRowProps } from '../../types'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
import { FieldDataRowProps } from '../../types'; | |
import type { FieldDataRowProps } from '../../types'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here 6e88086
}, | ||
} = useDataVisualizerKibana(); | ||
|
||
const cardinality = config?.stats?.cardinality; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const cardinality = config?.stats?.cardinality; | |
const cardinality = stats?.cardinality; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here 6e88086
<EuiText size={'xs'}> | ||
{fieldFormats | ||
.getDefaultInstance(KBN_FIELD_TYPES.NUMBER, [ES_FIELD_TYPES.INTEGER]) | ||
.convert(valueCount)}{' '} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the space char {' '}
needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this one, yes, since it's separating two different values
value.forEach((resp, idx) => { | ||
if (idx === 0 && isNonAggregatableSampledDocs(resp)) { | ||
const docs = resp.rawResponse.hits.hits.map((d) => | ||
getProcessedFields(d.fields ?? {}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than calling getProcessedFields
with an empty object, it might be neater to make the check beforehand.
e.g.
d.fields ? getProcessedFields(d.fields) : {}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated here 6e88086
@elasticmachine merge upstream |
💚 Build Succeeded
Metrics [docs]Async chunks
History
To update your PR or re-run it, just comment with: cc @qn895 |
Summary
This PR fetches a sample of 1000 documents in Elasticsearch, and compute the approximate count % and cardinality based on that sample.
It also shows a tooltip message indicating that text fields are using a much smaller sample:
For example, selecting date to be 18:04 will wipe out the first bucket at 18:00.
Before:
After:
Checklist
Delete any items that are not applicable to this PR.
Risk Matrix
Delete this section if it is not applicable to this PR.
Before closing this PR, invite QA, stakeholders, and other developers to identify risks that should be tested prior to the change/feature release.
When forming the risk matrix, consider some of the following examples and how they may potentially impact the change:
For maintainers