Script: Ingest Metadata and CtxMap #88458
…return vt validator, doc matcher only checks metadata map
Adds a FieldProperties record that configures how to validate fields. Fields have a type, are writeable or read-only, are nullable or not, and may have an additional validation useful for Set/Enum validation. Splits IngestMetadata from Metadata in preparation for new Metadata subclasses.
… into ingest_ctx_map_field_prop
… into ingest_sm_to_ctx_map
… into ingest_sm_to_ctx_map
…search into ingest_ctx_map_field_prop
Pinging @elastic/es-core-infra (Team:Core/Infra)
Pinging @elastic/es-data-management (Team:Data Management)
@elasticsearchmachine run elasticsearch-ci/part-2
This looks fine. I don't quite see yet how this will get to what we discussed before (subclasses of Metadata passing the operations they support for keys, i.e. read-only vs. read-write), but I know there are more followups. This PR on its own looks like an improvement.
        return null;
    }

    static class IngestMetadata extends Metadata {
Since this is static, let's move it out to a top-level class. It can still be package-private.
There is already an IngestMetadata that "Holds the ingest pipelines that are available in the cluster".
I'm going to rename this to IngestDocMetadata when I make it top-level.
public static Tuple<Map<String, Object>, Map<String, Object>> splitSourceAndMetadata(Map<String, Object> sourceAndMetadata) {
    if (sourceAndMetadata instanceof IngestSourceAndMetadata ingestSourceAndMetadata) {
        return new Tuple<>(new HashMap<>(ingestSourceAndMetadata.source), new HashMap<>(ingestSourceAndMetadata.metadata.getMap()));
public static Tuple<Map<String, Object>, Map<String, Object>> splitSourceAndMetadata(
Why is this method necessary when the source and metadata are already split internally? Couldn't this be two separate calls to member methods, to get the source map and the Metadata object?
Removed.
    }
    return metadata;
}

/**
 * Check that all metadata map contains only valid metadata and no extraneous keys and source map contains no metadata
nit: I think this comment is off? there is no source map
Updated.
@@ -172,7 +119,7 @@ public void setVersion(long version) {
    }

    public ZonedDateTime getTimestamp() {
This can move completely to IngestMetadata right? We shouldn't need it to exist at all on other Metadata subclasses.
It is available in 2 of the 4 write contexts: ingest and update. It is missing in reindex and update by query.
It's here because that allows scripts to do Metadata m = metadata(); m.timestamp. Otherwise, the script would have to name the subtype.
It is possible to add timestamp to reindex and update by query; both have access to the thread pool and pull a long supplier from it.
We could duck-type the Metadata subclasses just for timestamp.
If it’s relevant to all the other contexts, then it makes sense to stay completely in this class. But I don’t think we should have it defined here but implemented in subclasses.
I'll change the implementation here in the Update PR.
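To make the trade-off in this thread concrete, here is a minimal sketch of the "defined and implemented in the base class" option. All class names (SketchMetadata, SketchIngestMetadata, SketchUpdateMetadata, TimestampReader) are hypothetical stand-ins, not the actual Elasticsearch classes, and the sketch assumes every context can supply a timestamp.

```java
import java.time.ZonedDateTime;

// Hypothetical stand-in for the PR's Metadata base class (an assumption for illustration).
class SketchMetadata {
    private final ZonedDateTime timestamp;

    SketchMetadata(ZonedDateTime timestamp) {
        this.timestamp = timestamp;
    }

    // Defined AND implemented here, so subclasses inherit it unchanged.
    ZonedDateTime getTimestamp() {
        return timestamp;
    }
}

// Hypothetical per-context subclasses; neither one overrides getTimestamp().
class SketchIngestMetadata extends SketchMetadata {
    SketchIngestMetadata(ZonedDateTime timestamp) {
        super(timestamp);
    }
}

class SketchUpdateMetadata extends SketchMetadata {
    SketchUpdateMetadata(ZonedDateTime timestamp) {
        super(timestamp);
    }
}

class TimestampReader {
    // Caller code only names the base type, mirroring "Metadata m = metadata(); m.timestamp".
    static long epochMillis(SketchMetadata metadata) {
        return metadata.getTimestamp().toInstant().toEpochMilli();
    }
}
```

With this shape, a caller never needs an instanceof check or a cast to reach the timestamp, which is the property the script-facing argument above relies on.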
…x validate javadoc
LGTM. A few minor comments.
    ZonedDateTime timestamp,
    Map<String, Object> source
) {
    super(new HashMap<>(source), new IngestDocMetadata(index, id, version, routing, versionType, timestamp));
Since we do a deepcopy for source when calling this constructor, do we then again need to wrap source in a new HashMap?
A bunch of tests call into IngestDocument, which calls this constructor. Those tests all pass Map.of or SingletonMap and fail if we don't perform this copy.
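The failure mode described here comes from `Map.of` returning an unmodifiable map, so any later `put` throws `UnsupportedOperationException`. A minimal sketch of the behaviour follows; the class and method names (MutableCopyExample, mutableCopy, canPut) are hypothetical helpers, not part of the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class MutableCopyExample {

    // Shallow copy into a HashMap so that top-level put/remove calls are allowed.
    // Nested maps are NOT copied and keep whatever mutability they had.
    static Map<String, Object> mutableCopy(Map<String, Object> source) {
        return new HashMap<>(source);
    }

    // Reports whether the map accepts writes; note that it mutates the map when it can.
    static boolean canPut(Map<String, Object> map) {
        try {
            map.put("probe", 1);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }
}
```

Wrapping in `new HashMap<>(source)` costs one extra shallow copy per construction, which is the overhead the reviewer is asking about.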
@@ -763,7 +751,7 @@ public static Object deepCopy(Object value) {
     for (Map.Entry<?, ?> entry : mapValue.entrySet()) {
         copy.put(entry.getKey(), deepCopy(entry.getValue()));
     }
-    // TODO(stu): should this check for IngestSourceAndMetadata in addition to Map?
+    // TODO(stu): should this check for IngestCtxMap in addition to Map?
Did you mean to leave this as part of the PR?
It shows up because of the rename; it's still a relevant question for now.
This reverts commit a08afb8.
Create a Metadata superclass for ingest and update contexts.
Create a CtxMap superclass for ctx backwards compatibility in ingest and update contexts. script.CtxMap was moved from ingest.IngestSourceAndMetadata.
CtxMap takes a Metadata subclass and validates updates via the FieldPropertys passed in. Metadata provides typed getters and setters and implements a Map-like interface, making it easy for a class containing CtxMap to implement the full Map interface.
The FieldProperty record configures how to validate fields. Fields have a type, are writeable or read-only, are nullable or not, and may have an additional validation useful for Set/Enum validation.
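As a rough illustration of the description above, a FieldProperty-style record could look like the following. This is a sketch under assumptions: the component names, the `check` method, and the `oneOf` helper are hypothetical and are not taken from the PR's actual code.

```java
import java.util.Set;
import java.util.function.BiConsumer;

// Hypothetical sketch: component and method names are assumptions, not the PR's real API.
record FieldProperty<T>(Class<T> type, boolean nullable, boolean writeable, BiConsumer<String, T> extendedValidation) {

    // Validates an attempted write of value to the metadata key.
    void check(String key, Object value) {
        if (writeable == false) {
            throw new IllegalArgumentException(key + " is read-only");
        }
        if (value == null) {
            if (nullable == false) {
                throw new IllegalArgumentException(key + " cannot be null");
            }
            return; // null is allowed; no type or extended validation to run
        }
        if (type.isInstance(value) == false) {
            throw new IllegalArgumentException(
                key + " must be of type " + type.getSimpleName() + " but was " + value.getClass().getSimpleName()
            );
        }
        if (extendedValidation != null) {
            extendedValidation.accept(key, type.cast(value));
        }
    }

    // Extra validator that restricts a String field to a fixed set of values (Set/Enum-style).
    static BiConsumer<String, String> oneOf(Set<String> allowed) {
        return (key, value) -> {
            if (allowed.contains(value) == false) {
                throw new IllegalArgumentException(key + " must be one of " + allowed + " but was [" + value + "]");
            }
        };
    }
}
```

A map wrapper in the style of CtxMap could then look up the FieldProperty for a key on every put and call `check` before storing the value, rejecting unknown keys outright.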