-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ingest: Start separating Metadata from IngestSourceAndMetadata #88401
Conversation
…return vt validator, doc matcher only checks metadata map
Pinging @elastic/es-core-infra (Team:Core/Infra) |
Pinging @elastic/es-data-management (Team:Data Management) |
@elasticmachine run elasticsearch-ci/part-1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting closer to what we discussed! I left some comments.
protected final Map<String, Validator> validators; | ||
|
||
// timestamp is new to ingest metadata, so it doesn't need to be backed by the map for back compat | ||
protected final ZonedDateTime timestamp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is going to be more troublesome as more contexts are added. Timestamp is specific to ingest. Yet all contexts are going to have this extra member. I suggest (in a followup) to consider either putting this in the map (it's not the worst thing to have it available in ctx), and/or to allow subclasses to provide additional keys/validators (but validators can still be protected). I think this will be necessary anyways to add metadata support in watcher, which has some additional keys that we would not add here in server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest (in a followup) to consider either putting this in the map (it's not the worst thing to have it available in ctx), and/or to allow subclasses to provide additional keys/validators (but validators can still be protected).
I moved this to it's own subclass when implementing Update
. This PR is just moving the code over.
The Update context has this map backed but it is backwards incompatible to add it to a ctx
for ingest unless we say underscore prefixed values in ctx
are reserved.
// timestamp is new to ingest metadata, so it doesn't need to be backed by the map for back compat | ||
protected final ZonedDateTime timestamp; | ||
|
||
public Metadata(String index, String id, long version, String routing, VersionType versionType, ZonedDateTime timestamp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ctor is basically Ingest specific. Can we replace this ctor with calls to the relevant setters in ingest when constructing the metadata?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the next PR, Metadata
takes the fully constructed Map
. I kept this here to have a minimal code change in this PR.
Can we replace this ctor with calls to the relevant setters in ingest when constructing the metadata?
Then we'd have to manually call the initial validation, I think it works better to validate the full map on construction.
if (value instanceof Number number) { | ||
return number; | ||
} | ||
throw new IllegalArgumentException( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be an illegal state or assertion error? It got past the validators; this is a coding error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to IllegalStateException
.
* Get the {@link Number} associated with key, or null | ||
* @throws IllegalArgumentException if the value is not a {@link Number} | ||
*/ | ||
public Number getNumber(String key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these helper methods should be public, they can be protected, just used by the validators right? The only public members (aside from general map stuff necessary for ctx) should be the strongly keyed/typed methods meant to be exposed in scripts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we implement getOp/setOp
in the CtxMap
for Update
, it will reuse getString
. I'll make these protected
now and we can discuss it then.
* Is this key a Metadata key? A {@link #remove}d key would return false for {@link #containsKey(String)} but true for | ||
* this call. | ||
*/ | ||
public boolean isMetadata(String key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the naming is confusing, since this is in a class called Metadata. How about isAvailable
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to isAvailable
.
this(map, timestamp, VALIDATORS); | ||
} | ||
|
||
Metadata(Map<String, Object> map, ZonedDateTime timestamp, Map<String, Validator> validators) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think initial values for the map are really necessary (use the setters, see my other comment). But I think what is missing is the user (ie ingest) specifying which keys should be available. What is here isn't all the possible keys right? And other contexts wont' want all of these? What I was trying to suggest in our previous conversations was to have the ctor take in a map from key to allowed access (read only vs read/write). Perhaps it could be two lists. Then the validators are grabbed for each one of those, internally within the ctor, to choose which validators are set on the instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I think what is missing is the user (ie ingest) specifying which keys should be available.
Those changes are coming in the third PR in this series. This PR separates the code, the next PR will create a CtxMap
from IngestSourceAndMetadata
and the final will allow configuration based validation.
I'm happy to roll those two changes into this one, if that's desired.
*/ | ||
String getIndex(); | ||
public List<String> getKeys() { | ||
return new ArrayList<>(map.keySet()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of copying, this could be wrapped with Collections.unmodifiableSet? I would also name this keySet (and return a Set), no reason not to use the standard collection type for keys?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice, done.
/** | ||
* Get the metadata keys inside the {@param other} set | ||
*/ | ||
public Set<String> metadataKeys(Set<String> other) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like the intersection of the keyset and the keys passed in. Instead of another public method, I think the getKeys (or keySet name as I suggested above) can be called with Sets.intersection
in the one place in ingest using this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, done.
* Allow Numbers that can be represented as longs without loss of precision or null | ||
* @throws IllegalArgumentException if the value cannot be represented as a long | ||
*/ | ||
public static void longValidator(MapOperation op, String key, Object value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be protected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to protected.
* Allow maps. | ||
* @throws IllegalArgumentException if {@param value} is not a {@link Map} | ||
*/ | ||
public static void mapValidator(MapOperation op, String key, Object value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be protected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to protected.
…adataTests.testGetNumber
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one additional question following up on @rjernst's review.
for (Map.Entry<String, Object> metadata : metadataMap.entrySet()) { | ||
if (metadata.getValue() != null) { | ||
builder.field(metadata.getKey(), metadata.getValue().toString()); | ||
org.elasticsearch.script.Metadata metadata = ingestDocument.getMetadata(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should Metadata be able to read/write it's own XContent stream? Or maybe Metadata should have something to return a Map of its keys to values? I'm considering the case where in the future Metadata may not necessarily be backed by a Map so it won't have something like keySet and get naturally available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting thoughts but I don't think having Metadata own it's XContent makes sense right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LGTM too! |
* upstream/master: (38 commits) Simplify map copying (elastic#88432) Make DiffableUtils.diff implementation agnostic (elastic#88403) Ingest: Start separating Metadata from IngestSourceAndMetadata (elastic#88401) Move runtime fields base scripts out of scripting fields api package. (elastic#88488) Enable TRACE Logging for test and increase timeout (elastic#88477) Mute ReactiveStorageIT#testScaleDuringSplitOrClone (elastic#88480) Track the count of failed invocations since last successful policy snapshot (elastic#88398) Avoid noisy exceptions on data nodes when aborting snapshots (elastic#88476) Fix ReactiveStorageDeciderServiceTests testNodeSizeForDataBelowLowWatermark (elastic#88452) INFO logging of snapshot restore and completion (elastic#88257) unmute test (elastic#88454) Updatable API keys - noop check (elastic#88346) Corrected an incomplete sentence. (elastic#86542) Use consistent shard map type in IndexService (elastic#88465) Stop registering TestGeoShapeFieldMapperPlugin in ESIntegTestCase (elastic#88460) TSDB: RollupShardIndexer logging improvements (elastic#88416) Audit API key ID when create or grant API keys (elastic#88456) Bound random negative size test in SearchSourceBuilderTests#testNegativeSizeErrors (elastic#88457) Updatable API keys - logging audit trail event (elastic#88276) Polish reworked LoggedExec task (elastic#88424) ... # Conflicts: # x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/v2/RollupShardIndexer.java
Pull out the implementation of
Metadata
fromIngestSourceAndMetadata
.Metadata
will become a base class extended by the update contexts: ingest, update, update by query and reindex.Metadata
implements a map-like interface, making it easy for a class containingMetadata
to implement the fullMap
interface.