-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds a minimum interval to auto_date_histogram
.
#42814
Adds a minimum interval to auto_date_histogram
.
#42814
Conversation
restricting the roundings passed into to the aggregator.
Pinging @elastic/es-analytics-geo |
} | ||
|
||
private static List<String> allowedIntervals = Arrays.asList("second", "minute", "hour", "day", "month", "year"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought about using an enum here, didn't seem to buy too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could possibly re-use Rounding.DateTimeUnit
enum, although that also includes quarter
. And would have to add string representations to the enum since it doesn't right now. Might not be worth the hassle. Another candidate is DateHistogramAggregationBuilder.DATE_FIELD_UNITS
, although that also includes 1w
, etc.
Although it does make me wonder, was quarter
skipped on purpose and/or some reason to not have it?
EDIT: Might be moot given Colin's comment below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pcsanwald I left some comments, thanks for opening the PR
RoundingInfo[] roundingInfos, | ||
SearchContext context, | ||
AggregatorFactory<?> parent, | ||
AggregatorFactories.Builder subFactoriesBuilder, | ||
Map<String, Object> metaData) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Indent isn't inline with the rest of the constructor args here
List<PipelineAggregator> pipelineAggregators, | ||
Map<String, Object> metaData) throws IOException { | ||
|
||
super(minimumIntervalExpression, factories, aggregationContext, parent, pipelineAggregators, metaData); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The aggregator name seems to have been replaced by the minimumIntervalExpression
, the name
arg seems to have been removed entirely.
Also, it's not clear to me why the minimumIntervalExpression
needs to be passed to the AutoDateHistogramAggregator
?
.../elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregationBuilder.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few minor notes on my end :)
We are planning to backport this to 7.x is, right?
} | ||
|
||
private static List<String> allowedIntervals = Arrays.asList("second", "minute", "hour", "day", "month", "year"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could possibly re-use Rounding.DateTimeUnit
enum, although that also includes quarter
. And would have to add string representations to the enum since it doesn't right now. Might not be worth the hassle. Another candidate is DateHistogramAggregationBuilder.DATE_FIELD_UNITS
, although that also includes 1w
, etc.
Although it does make me wonder, was quarter
skipped on purpose and/or some reason to not have it?
EDIT: Might be moot given Colin's comment below
.../elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregationBuilder.java
Outdated
Show resolved
Hide resolved
.../elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregationBuilder.java
Show resolved
Hide resolved
.../elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregationBuilder.java
Outdated
Show resolved
Hide resolved
@@ -101,12 +128,16 @@ public AutoDateHistogramAggregationBuilder(String name) { | |||
public AutoDateHistogramAggregationBuilder(StreamInput in) throws IOException { | |||
super(in, ValuesSourceType.NUMERIC, ValueType.DATE); | |||
numBuckets = in.readVInt(); | |||
if (in.getVersion().onOrAfter(Version.V_8_0_0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a procedural note: if we are backporting this to 7.x, the backport will need to change the version, and then we'll need to push a followup commit to master which change the version appropriately. I like to leave a #TODO change version after backport
so I don't forget, but that's just personal preference.
Just an FYI since version checks still confuse me from time to time :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
believe I addressed this comment
…ket/histogram/AutoDateHistogramAggregationBuilder.java Co-Authored-By: Zachary Tong <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pcsanwald Thanks for making the changes. I left a couple of comments and a suggestion for adding even more security around the RoundingInfo
objects
this.rounding = rounding; | ||
this.roughEstimateDurationMillis = roughEstimateDurationMillis; | ||
this.unitAbbreviation = unitAbbreviation; | ||
this.innerIntervals = innerIntervals; | ||
this.dateTimeUnit = dateTimeUnit; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we assert dateTimeUnit != null;
here so we definitely catch the case where the allowedIntervals
map doesn't contain an entry for the date time unit used in the rounding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 95ee1cf
} | ||
|
||
public static final Map<Rounding.DateTimeUnit, String> allowedIntervals = Map.ofEntries( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: since this is a static constant we should give it an uppercase name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 95ee1cf
365 * 24 * 60 * 60 * 1000L, "y", 1, 5, 10, 20, 50, 100); | ||
return roundings; | ||
365 * 24 * 60 * 60 * 1000L, "y", | ||
allowedIntervals.get(Rounding.DateTimeUnit.YEAR_OF_CENTURY), 1, 5, 10, 20, 50, 100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: instead of having to pass in the same DateTimeUnit
twice and relying on them being the same we could change RoundingInfo
so it's constructor has the signature:
public RoundingInfo(DateTimeUnit dateTimeUnit,
ZoneId timeZone,
long roughEstimateDurationMillis,
String unitAbbreviation,
int... innerIntervals)
The RoundingInfo
constructor could then call createRounding()
and allowedIntervals.get()
and this would ensure we can never have a RoundingInfo
where the dateTimeUnit
String and the Rounding don't match the same DateTimeUnit
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great suggestion! this simplified the calling logic a lot. addressed in 95ee1cf
@elasticmachine update branch |
long roughEstimateDurationMillis, | ||
String unitAbbreviation, | ||
int... innerIntervals) { | ||
this.rounding = createRounding(Rounding.DateTimeUnit.SECOND_OF_MINUTE, timeZone); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong, the rounding should be created with the dateTimeUnit
no?
this.roughEstimateDurationMillis = roughEstimateDurationMillis; | ||
this.unitAbbreviation = unitAbbreviation; | ||
this.innerIntervals = innerIntervals; | ||
assert dateTimeUnit != null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to check that the dateTimeUnit
is not null, but with Objects.RequireNonNull()
. The assert I suggested was meant for this.dateTimeUnit
to ensure we never ask for an entry from the map that doesn't exist
assertThat(roundings.length, equalTo(6)); | ||
} | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make these tests check the roundings that are created to make sure they contain the correct rounding levels, not just the number of roundings returned?
…anwald/elasticsearch into auto-interval-datehisto-min-interval
I just realised that we don't have any documentation for this parameter yet. Coudl we add that? |
@elasticmachine test this please |
1 similar comment
@elasticmachine test this please |
@@ -258,6 +258,28 @@ Like with the normal <<search-aggregations-bucket-datehistogram-aggregation, `da | |||
scripts and value level scripts are supported. This aggregation does not however, support the `min_doc_count`, | |||
`extended_bounds` and `order` parameters. | |||
|
|||
==== Minimum Interval parameter | |||
|
|||
The `minimum_interval` allows the caller to specify the minimum rounding interval that should be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a bit more detail, for example, to explain that the aggregation will start from the specified interval and that it will make the collection more efficient (faster) since it won't waste time collected at lower intervals.
Also we should list the available values for this parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addressed in 388127c
Adds a minimum interval to `auto_date_histogram`. We do this by restricting the roundings passed into to the aggregator.
…#43285) Backports minimum interval to date histogram
Closes #41757 by adding a new, optional parameter,
minimum_interval
, toauto_date_histogram
.note: I need to add docs, but the work should be functionally complete. Here's what I did to test: