[ML] Automatically map interval to fixed/calendar interval in datafeed aggs #51606

droberts195 · 2020-01-29T12:27:31Z

Since #33727 went into 7.2 interval is deprecated in date histograms. New date histograms should use fixed_interval or calendar_interval instead. In 8.x it is likely that date histograms containing interval will be considered invalid.

ML datafeed configs are persisted and are likely to contain interval rather than fixed_interval or calendar_interval; they definitely will if they were first created in a version earlier than 7.2. Such datafeed configs will be invalid in 8.0, so prior to 8.0 we must do something about this, but there is no reason we cannot start sooner.

The best course of action would seem to be to silently rewrite datafeed configs that contain a date_histogram aggregation with an interval setting to have whichever of fixed_interval or calendar_interval preserves the current behaviour.

We can do this unconditionally providing the oldest node in the cluster is on version 7.2 or above.

Rollups already does this silent behaviour-preserving rewriting, although rollup configs are stored in cluster state which makes this considerably easier than for the ML configs that are stored in an index.

The code that does the translation for rollups is in:

elasticsearch/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/rollup/job/DateHistogramGroupConfig.java

Lines 65 to 85 in bd01cce

    
           PARSER = new ConstructingObjectParser<>(NAME, a -> { 
        
               DateHistogramInterval oldInterval = (DateHistogramInterval) a[1]; 
        
               DateHistogramInterval calendarInterval = (DateHistogramInterval) a[2]; 
        
               DateHistogramInterval fixedInterval = (DateHistogramInterval) a[3]; 
        
               if (oldInterval != null) { 
        
                   if  (calendarInterval != null || fixedInterval != null) { 
        
                       throw new IllegalArgumentException("Cannot use [interval] with [fixed_interval] or [calendar_interval] " + 
        
                           "configuration options."); 
        
                   } 
        
                   return fromUnknownTimeUnit((String) a[0], oldInterval, (DateHistogramInterval) a[4], (String) a[5]); 
        
               } else if (calendarInterval != null && fixedInterval == null) { 
        
                   return new CalendarInterval((String) a[0], calendarInterval, (DateHistogramInterval) a[4], (String) a[5]); 
        
               } else if (calendarInterval == null && fixedInterval != null) { 
        
                   return new FixedInterval((String) a[0], fixedInterval, (DateHistogramInterval) a[4], (String) a[5]); 
        
               } else if (calendarInterval != null && fixedInterval != null) { 
        
                   throw new IllegalArgumentException("Cannot set both [fixed_interval] and [calendar_interval] at the same time"); 
        
               } else { 
        
                   throw new IllegalArgumentException("An interval is required.  Use [fixed_interval] or [calendar_interval]."); 
        
               } 
        
           });

and:

elasticsearch/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/rollup/job/DateHistogramGroupConfig.java

Lines 167 to 174 in bd01cce

    
           private static DateHistogramGroupConfig fromUnknownTimeUnit(String field, DateHistogramInterval interval, 
        
                                                                       DateHistogramInterval delay, String timeZone) { 
        
               if (DateHistogramAggregationBuilder.DATE_FIELD_UNITS.get(interval.toString()) != null) { 
        
                   return new CalendarInterval(field, interval, delay, timeZone); 
        
               } else { 
        
                   return new FixedInterval(field, interval, delay, timeZone); 
        
               } 
        
           }

That should be considered as the spec for how to do a silent behaviour-preserving rewrite.

The extra complexity in ML is deciding where to do the migration for datafeed configs that wouldn't otherwise be rewritten. If we do it in the datafeed config parser then that ensures that newly submitted configs and datafeed configs that are updated will get translated. Then we would just need to find an appropriate place to do it for existing datafeed configs - maybe on master node election providing the oldest node in the cluster is version 7.2 or above?

There are also some currently muted tests that should be reenabled when the work is done:

elasticsearch/x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/ml/datafeed/DatafeedConfigTests.java

Line 545 in be849b2

@AwaitsFix(bugUrl = "Needs ML to look at and fix. Unclear how this should be handled, interval is not an optional param")
elasticsearch/x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/ml/datafeed/DatafeedUpdateTests.java

Line 60 in be849b2

@AwaitsFix(bugUrl = "Tests need to be updated to use calendar/fixed interval explicitly")

The text was updated successfully, but these errors were encountered:

elasticmachine · 2020-01-29T12:27:34Z

Pinging @elastic/ml-core (:ml)

benwtrent · 2020-02-14T19:11:47Z

We can do this unconditionally providing the oldest node in the cluster is on version 7.2 or above.

I think we can do this unconditionally regardless of minimum node version. If we have a mixed cluster version, the date_histogram aggregation must follow the same BWC guarantees that we do. I think we would have a bigger problem if a date_histogram that specifies a fixed/calendar interval is not BWC with older nodes.

droberts195 · 2020-02-17T09:48:03Z

I think we can do this unconditionally regardless of minimum node version. If we have a mixed cluster version, the date_histogram aggregation must follow the same BWC guarantees that we do. I think we would have a bigger problem if a date_histogram that specifies a fixed/calendar interval is not BWC with older nodes.

For wire serialization this is true, but the extra problem ML is adding is to store the aggregations in a document in an index that might be read by an older node in a mixed version cluster. Usually aggregation DSL is a transient thing that only exists for the duration of one search so the wire serialization mappings are enough. We have lenient parsing when loading configs from indices, so parsing the config on load will not cause an error. But the older node will then see a date_histogram aggregation that doesn't contain an interval, and error at the point of trying to run it within the datafeed. This is why I think we should only do the rewrites when the minimum node version is high enough for all nodes in the cluster to understand fixed_interval and calendar_interval.

benwtrent · 2020-02-18T13:12:18Z

@droberts195

For wire serialization this is true, but the extra problem ML is adding is to store the aggregations in a document in an index that might be read by an older node in a mixed version cluster.

Isn't this already a problem then?

Somebody PUT or _update a datafeed and specifically sets the histogram setting to fixed|calendar
the doc is updated
the doc is read by an older node

I thought this type of situation was why our PUT and _update requests are master node actions. This way the request that gets parsed from XContent is the same version of as the master node. Which should be upgraded last anyways.

NOTE: With the automatic re-write, this will probably be exacerbated. To your point.

droberts195 · 2020-02-18T13:24:31Z

the master node. Which should be upgraded last anyways.

This is not a hard requirement. The docs just say:

It is best to upgrade the master-eligible nodes in your cluster after all of the other nodes.

Also, even if that advice is followed, when there are multiple master-eligible nodes the actual master node at some time during the process might be on a version ahead of some other master-eligible nodes.

Somebody PUT or _update a datafeed and specifically sets the histogram setting to fixed|calendar

There's an assumption that users don't use new features of a particular version until the whole cluster is on that version. I agree there could be problems if this assumption isn't followed.

Basically what I am saying is that our internal upgrade code should follow that same assumption - don't start adding new syntax into configs until the whole cluster is on the version that supports that syntax.

If a user got themselves into a mess by trying to use new version functionality in a mixed version cluster we could justifiably push back and say they should have waited until the whole cluster is upgraded, and that they need to complete their upgrade before our functionality will work again. But code we write that runs automatically breaks their mixed version cluster then they have a more justified grievance against us.

benwtrent · 2020-02-18T16:26:17Z

Current plan:

Have _update automatically fix the interval if the min node version of the cluster is > 7.2.0
The datafeed job builder should update the lazily parsed map before validating or running the datafeed. This way we handle the situation where a datafeed has continued to run, but has the old interval settings

As for updating existing datafeeds that are not in use, i am not 100% the best way to go about this.

The options I can think of are:

provide an easy way for folks to update them (via an empty call to _update) if they want to use the datafeed again later. This lines up with the "upgrade assistant" format.
always update the interval when we parse the XContent from the index. v8 is BWC only with 7.last. So, in v8.0.0 we can do this without worry. But, this would mean the stored document could always have the old interval. Might cause issues if somebody decides to delete line of code that does the rewrite...
Have some service that reads and rewrites them (I dislike this option...)

…terval fields (#52538) `interval` was deprecated back in 7.2.0. It was replaced with `calendar_interval` and `fixed_interval`. This change automatically rewrites datafeed configurations that contain the old `interval` field. The rewrite occurs when the configuration is read from the index. It is NOT then written back to the index. This PR also re-enables the long disabled BWC tests for datafeeds. Partially addresses: #51606

This creates an auto update service. The first automatic update is rewriting datafeed aggregations if they exist. This is necessary as `date_histogram` is deprecating and removing the `interval` key. To aid users who have the `interval` key defined, we can automatically upgrade these aggregation definitions to the appropriate `fixed_interval` or `calendar_interval`. closes #51606

droberts195 added >upgrade :ml Machine learning labels Jan 29, 2020

benwtrent self-assigned this Feb 10, 2020

benwtrent mentioned this issue Feb 19, 2020

[ML] rewrite deprecated datafeed configs to use new date_histogram interval fields #52538

Merged

benwtrent mentioned this issue Apr 23, 2020

[ML] automatically update deprecated datafeed aggregations #55678

Merged

benwtrent closed this as completed in #55678 Apr 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ML] Automatically map interval to fixed/calendar interval in datafeed aggs #51606

[ML] Automatically map interval to fixed/calendar interval in datafeed aggs #51606

droberts195 commented Jan 29, 2020

elasticmachine commented Jan 29, 2020

benwtrent commented Feb 14, 2020

droberts195 commented Feb 17, 2020

benwtrent commented Feb 18, 2020 •

edited

Loading

droberts195 commented Feb 18, 2020

benwtrent commented Feb 18, 2020 •

edited

Loading

[ML] Automatically map interval to fixed/calendar interval in datafeed aggs #51606

[ML] Automatically map interval to fixed/calendar interval in datafeed aggs #51606

Comments

droberts195 commented Jan 29, 2020

elasticmachine commented Jan 29, 2020

benwtrent commented Feb 14, 2020

droberts195 commented Feb 17, 2020

benwtrent commented Feb 18, 2020 • edited Loading

droberts195 commented Feb 18, 2020

benwtrent commented Feb 18, 2020 • edited Loading

benwtrent commented Feb 18, 2020 •

edited

Loading

benwtrent commented Feb 18, 2020 •

edited

Loading