Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML-DataFrame] add support for max, value_count, cardinality and sum #38569

Merged

Conversation

hendrikmuhs
Copy link

add support for max, value_count, cardinality and sum

@hendrikmuhs hendrikmuhs added >feature :ml Machine learning labels Feb 7, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

@@ -44,17 +44,15 @@ protected void createReviewsIndex() throws IOException {
builder.startObject();
{
builder.startObject("mappings")
.startObject("_doc")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change is due to type removal

Copy link
Contributor

@dimitris-athanasiou dimitris-athanasiou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Left a couple of comments.

@@ -25,7 +25,11 @@ private Aggregations() {}
*/
enum AggregationType {
AVG("avg", "double"),
MAX("max", null);
CARDINALITY("cardinality", "long"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The aggregation names are redundant here. getName() could return name().toLowerCase(Locale.ROOT).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, thought the same. But did not change it, because that's not part of this PR.

I think I keep this for a later re-factoring, while this is redundant right now, we likely need more meta information for other purposes. One argument to keep this: if we want to have "median" we need to map to the "percentiles" aggregation.

MAX("max", null);
CARDINALITY("cardinality", "long"),
VALUE_COUNT("value_count", "long"),
MAX("max", null),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't those 3 get mapped as double?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both are counts and therefore not floating point types

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant about max, min & sum.

Copy link
Author

@hendrikmuhs hendrikmuhs Feb 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

max, min and sum are mapped according to the mapping in the source, not visible in the diff as this hasn't been changed, if you expand the diff you will see:

targetMapping - the field type for the output, if null, the source type should be used

With other words: null is a magic value, telling map deduction to use the source mapping.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! Cool, thanks for clarifying :-)

@hendrikmuhs hendrikmuhs force-pushed the feature/fib-add-aggregations branch from f2a7fc4 to c883d2a Compare February 8, 2019 07:59
Copy link
Contributor

@dimitris-athanasiou dimitris-athanasiou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hendrikmuhs hendrikmuhs force-pushed the feature/fib-add-aggregations branch from c883d2a to d0d0856 Compare February 8, 2019 15:53
@hendrikmuhs
Copy link
Author

run elasticsearch-ci/2

@hendrikmuhs hendrikmuhs merged commit cd7292c into elastic:feature/fib Feb 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>feature :ml Machine learning
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants