-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid deprecation warning when running the ML datafeed extractor. #31463
Conversation
In elastic#29639 we added a `format` option to doc-value fields and deprecated usage of doc-value fields without a format so that we could migrate doc-value fields to use the format that comes with the mappings by default. However I missed to fix the machine-learning datafeed extractor.
Pinging @elastic/ml-core |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking this up Adrien! Left a few comments.
if (value[0] instanceof BaseDateTime) { // script field | ||
value[0] = ((BaseDateTime) value[0]).getMillis(); | ||
} else if (value[0] instanceof String) { // doc_value field with the epoch_millis format | ||
value[0] = Long.parseLong((String) value[0]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the value going to be a String
rather than a long
here since we're explicitly setting the format to epoch_millis
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct.
@@ -103,7 +103,13 @@ public static ExtractedField newField(String alias, String name, ExtractionMetho | |||
if (value.length != 1) { | |||
return value; | |||
} | |||
value[0] = ((BaseDateTime) value[0]).getMillis(); | |||
if (value[0] instanceof BaseDateTime) { // script field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we swap the order of those instanceof
checks? The time value will be coming from a doc_value way more often than from a script field, so I'd rather do the most useful check first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
if (docValueField.equals(context.extractedFields.timeField())) { | ||
searchRequestBuilder.addDocValueField(docValueField, EPOCH_MILLIS_FORMAT); | ||
} else { | ||
searchRequestBuilder.addDocValueField(docValueField, DocValueFieldsContext.USE_DEFAULT_FORMAT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the default format for date
fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On 6.x, the default is to keep returning a date time object for backward compatibility. If you use format=use_field_mapping
then date fields will use the format that is configured in the mappings. The default format is strict_date_optional_time||epoch_millis
. All formats may be used for parsing but only the first one is used for rendering, so this would give something like 2015-01-01T12:10:30Z
.
In 7.x I plan to make use_field_mapping
the default via #30831 which is how I found this bug. So ML needs to explicitly request that dates are formatted with epoch_millis
in order to get millis since Epoch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* master: Add get field mappings to High Level REST API Client (#31423) [DOCS] Updates Watcher examples for code testing (#31152) TEST: Add bwc recovery tests with synced-flush index [DOCS] Move sql to docs (#31474) [DOCS] Move monitoring to docs folder (#31477) Core: Combine doExecute methods in TransportAction (#31517) IndexShard should not return null stats (#31528) fix repository update with the same settings but different type (#31458) Fix Mockito trying to mock IOException that isn't thrown by method (#31433) (#31527) Node selector per client rather than per request (#31471) Core: Combine messageRecieved methods in TransportRequestHandler (#31519) Upgrade to Lucene 7.4.0. (#31529) [ML] Add ML filter update API (#31437) Allow multiple unicast host providers (#31509) Avoid deprecation warning when running the ML datafeed extractor. (#31463) REST high-level client: add simulate pipeline API (#31158) Get Mapping API to honour allow_no_indices and ignore_unavailable (#31507) [PkiRealm] Invalidate cache on role mappings change (#31510) [Security] Check auth scheme case insensitively (#31490) In NumberFieldType equals and hashCode, make sure that NumberType is taken into account. (#31514) [DOCS] Fix REST tests in SQL docs [DOCS] Add code snippet testing in more ML APIs (#31339) Core: Remove ThreadPool from base TransportAction (#31492) [DOCS] Remove fixed file from build.gradle Rename createNewTranslog to fileBasedRecovery (#31508) Test: Skip assertion on windows [DOCS] Creates field and document level security overview (#30937) [DOCS] Significantly improve SQL docs [DOCS] Move migration APIs to docs (#31473) Core: Convert TransportAction.execute uses to client calls (#31487) Return transport addresses from UnicastHostsProvider (#31426) Ensure local addresses aren't null (#31440) Remove unused generic type for client execute method (#31444) Introduce http and tcp server channels (#31446)
* 6.x: Avoid sending duplicate remote failed shard requests (#31313) Add get field mappings to High Level REST API Client Relates to #27205 [DOCS] Updates Watcher examples for code testing (#31152) [DOCS] Move monitoring to docs folder (#31477) [DOCS] Fixes SQL docs in nav [DOCS] Move sql to docs IndexShard should not return null stats - empty stats or AlreadyCloseException if it's closed is better Clarify that IP range data can be specified in CIDR notation. (#31374) Remove some cases in FieldTypeLookupTests that are no longer relevant. (#31381) In NumberFieldType equals and hashCode, make sure that NumberType is taken into account. (#31514) fix repository update with the same settings but different type Revert "AwaitsFix FullClusterRestartIT#testRecovery" Upgrade to Lucene 7.4.0. (#31529) Avoid deprecation warning when running the ML datafeed extractor. (#31463) Retry synced-flush in FullClusterRestartIT#testRecovery Allow multiple unicast host providers (#31509) [ML] Add ML filter update API (#31437) AwaitsFix FullClusterRestartIT#testRecovery Fix missing historyUUID in peer recovery when rolling upgrade 5.x to 6.3 (#31506) Remove QueryCachingPolicy#ALWAYS_CACHE (#31451) Rename createNewTranslog to fileBasedRecovery (#31508) [DOCS] Add code snippet testing in more ML APIs (#31339) [DOCS] Remove fixed file from build.gradle [DOCS] Creates field and document level security overview (#30937) Test: Skip assertion on windows [DOCS] Move migration APIs to docs (#31473) Add a known issue for upgrading from 5.x to 6.3.0 (#31501) Return transport addresses from UnicastHostsProvider (#31426) Add Delete Snapshot High Level REST API Reload secure settings for plugins (#31481) [DOCS] Fix JDBC Maven client group/artifact ID
In #29639 we added a
format
option to doc-value fields and deprecated usageof doc-value fields without a format so that we could migrate doc-value fields
to use the format that comes with the mappings by default. However I missed to
fix the machine-learning datafeed extractor.
I'm setting the
non-issue
label since this bug is not released.@dimitris-athanasiou Could you have a look?