Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds a minimum interval to auto_date_histogram. #42814

Merged

Conversation

pcsanwald
Copy link
Contributor

Closes #41757 by adding a new, optional parameter, minimum_interval, to auto_date_histogram.

note: I need to add docs, but the work should be functionally complete. Here's what I did to test:

autohisto

restricting the roundings passed into to the aggregator.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo

}

private static List<String> allowedIntervals = Arrays.asList("second", "minute", "hour", "day", "month", "year");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought about using an enum here, didn't seem to buy too much.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could possibly re-use Rounding.DateTimeUnit enum, although that also includes quarter. And would have to add string representations to the enum since it doesn't right now. Might not be worth the hassle. Another candidate is DateHistogramAggregationBuilder.DATE_FIELD_UNITS, although that also includes 1w, etc.

Although it does make me wonder, was quarter skipped on purpose and/or some reason to not have it?

EDIT: Might be moot given Colin's comment below

Copy link
Contributor

@colings86 colings86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pcsanwald I left some comments, thanks for opening the PR

RoundingInfo[] roundingInfos,
SearchContext context,
AggregatorFactory<?> parent,
AggregatorFactories.Builder subFactoriesBuilder,
Map<String, Object> metaData) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Indent isn't inline with the rest of the constructor args here

List<PipelineAggregator> pipelineAggregators,
Map<String, Object> metaData) throws IOException {

super(minimumIntervalExpression, factories, aggregationContext, parent, pipelineAggregators, metaData);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The aggregator name seems to have been replaced by the minimumIntervalExpression, the name arg seems to have been removed entirely.

Also, it's not clear to me why the minimumIntervalExpression needs to be passed to the AutoDateHistogramAggregator?

Copy link
Contributor

@polyfractal polyfractal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor notes on my end :)

We are planning to backport this to 7.x is, right?

}

private static List<String> allowedIntervals = Arrays.asList("second", "minute", "hour", "day", "month", "year");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could possibly re-use Rounding.DateTimeUnit enum, although that also includes quarter. And would have to add string representations to the enum since it doesn't right now. Might not be worth the hassle. Another candidate is DateHistogramAggregationBuilder.DATE_FIELD_UNITS, although that also includes 1w, etc.

Although it does make me wonder, was quarter skipped on purpose and/or some reason to not have it?

EDIT: Might be moot given Colin's comment below

@@ -101,12 +128,16 @@ public AutoDateHistogramAggregationBuilder(String name) {
public AutoDateHistogramAggregationBuilder(StreamInput in) throws IOException {
super(in, ValuesSourceType.NUMERIC, ValueType.DATE);
numBuckets = in.readVInt();
if (in.getVersion().onOrAfter(Version.V_8_0_0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a procedural note: if we are backporting this to 7.x, the backport will need to change the version, and then we'll need to push a followup commit to master which change the version appropriately. I like to leave a #TODO change version after backport so I don't forget, but that's just personal preference.

Just an FYI since version checks still confuse me from time to time :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

believe I addressed this comment

Copy link
Contributor

@colings86 colings86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pcsanwald Thanks for making the changes. I left a couple of comments and a suggestion for adding even more security around the RoundingInfo objects

this.rounding = rounding;
this.roughEstimateDurationMillis = roughEstimateDurationMillis;
this.unitAbbreviation = unitAbbreviation;
this.innerIntervals = innerIntervals;
this.dateTimeUnit = dateTimeUnit;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we assert dateTimeUnit != null; here so we definitely catch the case where the allowedIntervals map doesn't contain an entry for the date time unit used in the rounding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 95ee1cf

}

public static final Map<Rounding.DateTimeUnit, String> allowedIntervals = Map.ofEntries(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since this is a static constant we should give it an uppercase name

Copy link
Contributor Author

@pcsanwald pcsanwald Jun 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 95ee1cf

365 * 24 * 60 * 60 * 1000L, "y", 1, 5, 10, 20, 50, 100);
return roundings;
365 * 24 * 60 * 60 * 1000L, "y",
allowedIntervals.get(Rounding.DateTimeUnit.YEAR_OF_CENTURY), 1, 5, 10, 20, 50, 100);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: instead of having to pass in the same DateTimeUnit twice and relying on them being the same we could change RoundingInfo so it's constructor has the signature:

public RoundingInfo(DateTimeUnit dateTimeUnit,
                            ZoneId timeZone,
                            long roughEstimateDurationMillis,
                            String unitAbbreviation,
                            int... innerIntervals)

The RoundingInfo constructor could then call createRounding() and allowedIntervals.get() and this would ensure we can never have a RoundingInfo where the dateTimeUnit String and the Rounding don't match the same DateTimeUnit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great suggestion! this simplified the calling logic a lot. addressed in 95ee1cf

@pcsanwald pcsanwald requested a review from colings86 June 9, 2019 22:55
@pcsanwald
Copy link
Contributor Author

@elasticmachine update branch

long roughEstimateDurationMillis,
String unitAbbreviation,
int... innerIntervals) {
this.rounding = createRounding(Rounding.DateTimeUnit.SECOND_OF_MINUTE, timeZone);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems wrong, the rounding should be created with the dateTimeUnit no?

this.roughEstimateDurationMillis = roughEstimateDurationMillis;
this.unitAbbreviation = unitAbbreviation;
this.innerIntervals = innerIntervals;
assert dateTimeUnit != null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to check that the dateTimeUnit is not null, but with Objects.RequireNonNull(). The assert I suggested was meant for this.dateTimeUnit to ensure we never ask for an entry from the map that doesn't exist

assertThat(roundings.length, equalTo(6));
}

}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make these tests check the roundings that are created to make sure they contain the correct rounding levels, not just the number of roundings returned?

@colings86
Copy link
Contributor

I just realised that we don't have any documentation for this parameter yet. Coudl we add that?

@pcsanwald
Copy link
Contributor Author

Screen Shot 2019-06-10 at 11 53 09 AM

@pcsanwald
Copy link
Contributor Author

@elasticmachine test this please

1 similar comment
@pcsanwald
Copy link
Contributor Author

@elasticmachine test this please

@@ -258,6 +258,28 @@ Like with the normal <<search-aggregations-bucket-datehistogram-aggregation, `da
scripts and value level scripts are supported. This aggregation does not however, support the `min_doc_count`,
`extended_bounds` and `order` parameters.

==== Minimum Interval parameter

The `minimum_interval` allows the caller to specify the minimum rounding interval that should be used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a bit more detail, for example, to explain that the aggregation will start from the specified interval and that it will make the collection more efficient (faster) since it won't waste time collected at lower intervals.

Also we should list the available values for this parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in 388127c

@pcsanwald pcsanwald merged commit 6357857 into elastic:master Jun 11, 2019
pcsanwald pushed a commit to pcsanwald/elasticsearch that referenced this pull request Jun 17, 2019
Adds a minimum interval to `auto_date_histogram`. We do this by
restricting the roundings passed into to the aggregator.
pcsanwald pushed a commit that referenced this pull request Jun 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add minimum_interval setting to auto_date_histogram aggregation
6 participants