-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML-DataFrame] add support for max, value_count, cardinality and sum #38569
[ML-DataFrame] add support for max, value_count, cardinality and sum #38569
Conversation
Pinging @elastic/ml-core |
@@ -44,17 +44,15 @@ protected void createReviewsIndex() throws IOException { | |||
builder.startObject(); | |||
{ | |||
builder.startObject("mappings") | |||
.startObject("_doc") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change is due to type removal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Left a couple of comments.
@@ -25,7 +25,11 @@ private Aggregations() {} | |||
*/ | |||
enum AggregationType { | |||
AVG("avg", "double"), | |||
MAX("max", null); | |||
CARDINALITY("cardinality", "long"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The aggregation names are redundant here. getName()
could return name().toLowerCase(Locale.ROOT)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, thought the same. But did not change it, because that's not part of this PR.
I think I keep this for a later re-factoring, while this is redundant right now, we likely need more meta information for other purposes. One argument to keep this: if we want to have "median" we need to map to the "percentiles" aggregation.
MAX("max", null); | ||
CARDINALITY("cardinality", "long"), | ||
VALUE_COUNT("value_count", "long"), | ||
MAX("max", null), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't those 3 get mapped as double
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both are counts and therefore not floating point types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant about max
, min
& sum
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
max
, min
and sum
are mapped according to the mapping in the source, not visible in the diff as this hasn't been changed, if you expand the diff you will see:
targetMapping - the field type for the output, if null, the source type should be used
With other words: null
is a magic value, telling map deduction to use the source mapping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! Cool, thanks for clarifying :-)
f2a7fc4
to
c883d2a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
c883d2a
to
d0d0856
Compare
run elasticsearch-ci/2 |
add support for max, value_count, cardinality and sum