Add Normalize Pipeline Aggregation #56399
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)
This aggregation normalizes metrics for a given series of data in the form of bucket values.

The aggregation supports the following normalizations:

- rescale 0-1
- rescale 0-100
- percentage of sum
- mean normalization
- z-score normalization
- softmax normalization

To select a normalization, set the normalize agg's `normalizer` field. For example:

```
{
  "normalize": {
    "buckets_path": <>,
    "normalizer": "percent"
  }
}
```

Closes elastic#51005.
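For orientation, here is a minimal sketch of the six normalizations using their common textbook definitions. The class and method names are hypothetical and the formulas are assumptions (for instance, percentage of sum returned as a fraction and z-score using the population standard deviation); the PR's actual implementation may differ.

```java
import java.util.Arrays;

/** Illustrative only: textbook versions of the normalizations listed in the description. */
final class NormalizeSketch {
    private NormalizeSketch() {}

    /** Linear rescale so the minimum maps to 0 and the maximum to 1. */
    static double[] rescaleZeroToOne(double[] values) {
        double min = Arrays.stream(values).min().orElse(Double.NaN);
        double max = Arrays.stream(values).max().orElse(Double.NaN);
        return Arrays.stream(values).map(v -> (v - min) / (max - min)).toArray();
    }

    /** Same as rescaleZeroToOne, scaled to the range 0-100. */
    static double[] rescaleZeroToOneHundred(double[] values) {
        return Arrays.stream(rescaleZeroToOne(values)).map(v -> 100.0 * v).toArray();
    }

    /** Each value divided by the sum of all values (assumed to be a fraction, not multiplied by 100). */
    static double[] percentOfSum(double[] values) {
        double sum = Arrays.stream(values).sum();
        return Arrays.stream(values).map(v -> v / sum).toArray();
    }

    /** Distance from the mean, divided by the range (max - min). */
    static double[] meanNormalization(double[] values) {
        double mean = Arrays.stream(values).average().orElse(Double.NaN);
        double min = Arrays.stream(values).min().orElse(Double.NaN);
        double max = Arrays.stream(values).max().orElse(Double.NaN);
        return Arrays.stream(values).map(v -> (v - mean) / (max - min)).toArray();
    }

    /** Distance from the mean in units of the population standard deviation (an assumption). */
    static double[] zScore(double[] values) {
        double mean = Arrays.stream(values).average().orElse(Double.NaN);
        double variance = Arrays.stream(values).map(v -> (v - mean) * (v - mean)).average().orElse(Double.NaN);
        double stdDev = Math.sqrt(variance);
        return Arrays.stream(values).map(v -> (v - mean) / stdDev).toArray();
    }

    /** exp(value) divided by the sum of exp over all values, so the outputs sum to 1. */
    static double[] softmax(double[] values) {
        double sumExp = Arrays.stream(values).map(Math::exp).sum();
        return Arrays.stream(values).map(v -> Math.exp(v) / sumExp).toArray();
    }
}
```

Every method needs a statistic over the whole series (min, max, sum, mean), which is why this has to run as a pipeline aggregation over sibling buckets rather than per bucket.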
I left some comments because I'm excited about this!
static final ParseField NORMALIZER_FIELD = new ParseField("normalizer");

@SuppressWarnings("unchecked")
public static final ConstructingObjectParser<NormalizePipelineAggregationBuilder, String> PARSER = new ConstructingObjectParser<>(
Check out `InstantiatingObjectParser`!
So, I tried changing that parser to work here, but I think it deserves its own change. `InstantiatingObjectParser` does not expose the Context in such a way that more constructor arguments can be passed in. I believe this can change, but I'd rather not do that here.
👍
normalizedBucketValue = normalizer.normalize(thisBucketValue);
}

List<InternalAggregation> aggs = StreamSupport.stream(bucket.getAggregations().spliterator(), false)
`bucket.getAggregations().copyResults()` does this without so much boilerplate.
unfortunately, that method does not work in this context. I think a more dedicated cleanup for this boilerplate can be tackled outside of this PR
👍
LGTM. Merge whenever you are happy with the docs!
Left a few comments. I think the only notable one is about handling terms agg as the parent bucket :)
Looks good!
--------------------------------------------------
// NOTCONSOLE

[[normalizer_pipeline-params]]
Should we make a note somewhere that this pipeline always uses a `skip` gap policy?
good call!
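To make the gap-policy point concrete, here is a minimal sketch of one plausible reading of "skip": buckets whose value is missing (represented here as NaN, an assumption) are left out of the series statistics and stay unnormalized. The class and method names are hypothetical and this is not the PR's code.

```java
import java.util.Arrays;

/** Illustrative only: percent-of-sum with missing bucket values skipped. */
final class SkipGapSketch {
    private SkipGapSketch() {}

    static double[] percentOfSumSkippingGaps(double[] values) {
        // Missing values (NaN) do not contribute to the sum.
        double sum = Arrays.stream(values).filter(v -> !Double.isNaN(v)).sum();
        // Missing values stay missing; all others are divided by the sum.
        return Arrays.stream(values)
            .map(v -> Double.isNaN(v) ? Double.NaN : v / sum)
            .toArray();
    }
}
```

With input `{1, NaN, 3}` the gap is ignored when summing, so the remaining buckets are normalized against a total of 4.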
public class NormalizePipelineAggregationBuilder extends AbstractPipelineAggregationBuilder<NormalizePipelineAggregationBuilder> {
public static final String NAME = "normalize";
static final ParseField NORMALIZER_FIELD = new ParseField("normalizer");
Fine with `normalizer`, but wanted to also suggest `method` as a potential param name. No strong opinion though :)
I was wishy-washy on the naming here as well and decided not to fret, but I too leaned towards `method` earlier, so I am happy to use it here, especially given how overloaded the term "normalizer" is across the stack.
I've updated the naming to be `method`.
if (bucketsPaths.length != 1) {
    context.addBucketPathValidationError("must contain a single entry for aggregation [" + name + "]");
}
}
Should we also check `context.validateHasParent()` to make sure this isn't at the top level?
ah, yes. I wasn't aware of this. thanks for bringing it up
added a check and a test for this!
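The two checks discussed in this thread can be sketched in isolation as follows. This is a self-contained illustration, not the builder's code: `ValidationSketch`, the `hasParent` flag, and the error strings are hypothetical stand-ins for the validation context used in the PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: the two validation rules from this thread, collected as error strings. */
final class ValidationSketch {
    private ValidationSketch() {}

    static List<String> validate(String name, String[] bucketsPaths, boolean hasParent) {
        List<String> errors = new ArrayList<>();
        // Rule 1: exactly one buckets_path entry.
        if (bucketsPaths.length != 1) {
            errors.add("buckets_path must contain a single entry for aggregation [" + name + "]");
        }
        // Rule 2: the aggregation must be nested under a parent, not declared at the top level.
        if (!hasParent) {
            errors.add("aggregation [" + name + "] must be declared inside of another aggregation");
        }
        return errors;
    }
}
```

Collecting both errors rather than failing on the first lets a request with several mistakes report them together.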
histo = (InternalMultiBucketAggregation<? extends InternalMultiBucketAggregation, ? extends
    InternalMultiBucketAggregation.InternalBucket>) aggregation;
List<? extends InternalMultiBucketAggregation.InternalBucket> buckets = histo.getBuckets();
HistogramFactory factory = (HistogramFactory) histo;
Do we know if this works with a `terms` agg as the parent? It feels like it should (e.g. it doesn't require any specific ordering of the buckets, unlike something like a moving avg which needs an ordering).
If we think it should work with `terms`, we should tweak this to not use a `HistogramFactory` directly. `BucketScriptPipelineAggregator` has an example of how to generically build buckets from any `InternalMultiBucketAggregation` (the internal agg can create buckets too, not just the factory).
Thanks! I was slightly loose in my interpretation of the `HistogramFactory` comment:
/** Implemented by histogram aggregations and used by pipeline aggregations to insert buckets. */
Will look at how `BucketScript` does things and add a test for terms agg!
Yikes! I'm sorry I didn't notice this one!
thanks, I've updated to include a test for terms and use a more generic way to make new buckets
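The idea behind the generic approach can be sketched without any Elasticsearch types: the parent aggregation itself creates the new buckets, so the same code serves terms-like and histogram-like parents. Everything below (`MultiBucketAgg`, `TermBucket`, `TermsLikeAgg`, and their methods) is a hypothetical stand-in, not the real `InternalMultiBucketAggregation` API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleUnaryOperator;

/** Illustrative only: rebuilding buckets through the parent agg instead of a histogram-only factory. */
final class GenericBucketSketch {
    private GenericBucketSketch() {}

    /** Hypothetical stand-in for any multi-bucket aggregation (terms, histogram, ...). */
    interface MultiBucketAgg<B> {
        List<B> getBuckets();
        double valueOf(B bucket);
        B createBucket(B prototype, double normalized); // copy of prototype plus the new value
        MultiBucketAgg<B> create(List<B> newBuckets);   // same agg type with replaced buckets
    }

    /** Toy terms-like bucket keyed by a string; normalized is NaN until set. */
    static final class TermBucket {
        final String key;
        final double value;
        final double normalized;

        TermBucket(String key, double value, double normalized) {
            this.key = key;
            this.value = value;
            this.normalized = normalized;
        }
    }

    /** Toy terms-like parent that knows how to copy its own buckets. */
    static final class TermsLikeAgg implements MultiBucketAgg<TermBucket> {
        private final List<TermBucket> buckets;

        TermsLikeAgg(List<TermBucket> buckets) { this.buckets = buckets; }

        public List<TermBucket> getBuckets() { return buckets; }
        public double valueOf(TermBucket bucket) { return bucket.value; }
        public TermBucket createBucket(TermBucket prototype, double normalized) {
            return new TermBucket(prototype.key, prototype.value, normalized);
        }
        public MultiBucketAgg<TermBucket> create(List<TermBucket> newBuckets) {
            return new TermsLikeAgg(newBuckets);
        }
    }

    /** Works for any parent type because it only uses the generic interface. */
    static <B> MultiBucketAgg<B> normalize(MultiBucketAgg<B> agg, DoubleUnaryOperator method) {
        List<B> newBuckets = new ArrayList<>();
        for (B bucket : agg.getBuckets()) {
            double normalized = method.applyAsDouble(agg.valueOf(bucket));
            newBuckets.add(agg.createBucket(bucket, normalized));
        }
        return agg.create(newBuckets);
    }
}
```

Because `normalize` never casts to a histogram-specific type, adding another parent type only requires implementing the interface.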
LGTM!
Relates: elastic/elasticsearch#56399 This commit adds the normalize aggregation to the high level client.
Relates: elastic/elasticsearch#56399 This commit adds the normalize aggregation to the high level client. Co-authored-by: Russ Cam <[email protected]>