-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add immutable 'operator' metadata classes for cluster state #87763
Conversation
Pinging @elastic/es-core-infra (Team:Core/Infra) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for splitting this out. Overall the use of records is nice here, and I appreciate all the javadocs! I left some questions. The downside of the way the PR is split is I can't tell how any of these classes are intended to be used, so it is difficult to comment on particular methods and whether they could be simplified. I'm not sure how to remedy that...
* @param namespace the namespace to lookup operator metadata for | ||
* @return {@link OperatorMetadata} or null if not found | ||
*/ | ||
public OperatorMetadata operatorMetadata(String namespace) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have this method and also the one above returning the entire map? If the one above is needed, then it could also be used in place of this method?
@@ -1146,6 +1181,7 @@ public Metadata apply(Metadata part) { | |||
builder.indices(indices.apply(part.indices)); | |||
builder.templates(templates.apply(part.templates)); | |||
builder.customs(customs.apply(part.customs)); | |||
builder.put(operatorMetadata.apply(part.operatorMetadata)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the jdk map diff return an immutable/unmodifiable map? We could either wrap or copy this here to make it immutable, or even better open a separate issue to make the jdk map diff's apply return an immutable map (I've had the need for this in recent hppc PRs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made an issue here #88019. In the interim until that's fixed I'll wrap this in an unmodifiable map.
* @param operatorMetadata a map of namespace to {@link OperatorMetadata} | ||
* @return {@link Builder} | ||
*/ | ||
public Builder put(Map<String, OperatorMetadata> operatorMetadata) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, I used to call this method putOperatorState, but it was too long and it didn't match the naming of other methods we have in Metadata, we simply call .put
with different types. Here's the usage here:
*/ | ||
public record OperatorErrorMetadata( | ||
Long version, | ||
org.elasticsearch.cluster.metadata.OperatorErrorMetadata.ErrorKind errorKind, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: static import ErrorKind?
* @throws IOException | ||
*/ | ||
public static OperatorErrorMetadata readFrom(StreamInput in) throws IOException { | ||
Builder builder = new Builder().version(in.readLong()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why create a builder, these can be passed to the canonical ctor?
* @param metadata {@link OperatorErrorMetadata} | ||
* @param builder {@link XContentBuilder} | ||
*/ | ||
public static void toXContent(OperatorErrorMetadata metadata, XContentBuilder builder) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a member method of the record?
* @return {@link OperatorErrorMetadata} | ||
* @throws IOException | ||
*/ | ||
public static OperatorErrorMetadata fromXContent(XContentParser parser) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be on the record, not the builder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually for this one is a lot more convenient to be a builder method, we are iterating over the parser elements and we fill in the pieces of data we parse inside the builder. I had made the toXContent part of the builder too since they share the static strings, but I can remove that wrapping.
* @param metadata {@link OperatorHandlerMetadata} | ||
* @param builder {@link XContentBuilder} | ||
*/ | ||
public static void toXContent(OperatorHandlerMetadata metadata, XContentBuilder builder) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is toXContent needed on both the builder (as a static that takes the record?) and on the record itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll move it to the record only, similar to other places, I'm sharing the string constants for to and from xcontent.
|
||
String currentFieldName = skipNamespace(parser); | ||
XContentParser.Token token; | ||
while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could a ConstructingObjectParser be used instead of this loop (same goes for other fromXContent in this PR)? If using that, would a builder even be necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll look into that, it would be great if we can.
Yeah, I figured it will be super hard without context. I'll start adding comments on various PR lines with context from the original PR in how it's used. I hope that will help. |
@@ -942,6 +947,15 @@ public Map<String, Custom> customs() { | |||
return this.customs; | |||
} | |||
|
|||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's an example in the related PR of how this will be used:
The related PR is out of date, it calls the API operatorState and it has the extra helper method to perform the additional .get.
Instead of:
OperatorMetadata existingMetadata = state.metadata().operatorState(namespace);
we'll be calling:
OperatorMetadata existingMetadata = state.metadata().operatorMetadata.get(namespace);
* This information is held by the {@link OperatorMetadata} class. | ||
* </p> | ||
*/ | ||
public record OperatorErrorMetadata(Long version, ErrorKind errorKind, List<String> errors) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error metadata is set when a file fails to get processed for some reason. In this case we launch a cluster state update task that will update this metadata. Example here:
The builder code is now removed, so the OperatorErrorMetadata will be directly created.
* multiple namespaces. See {@link OperatorMetadata} and {@link Metadata}. | ||
* </p> | ||
*/ | ||
public record OperatorHandlerMetadata(String name, Set<String> keys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The OperatorHandlerMetadata is stored when we successfully process a new operator cluster state. Here's an example usage from the task that does the final update of the cluster state:
specifically here:
The only difference is that I renamed the API so instead of:
Metadata.Builder metadataBuilder = Metadata.builder(state.metadata()).putOperatorState(operatorMetadataBuilder.build());
we'll do:
Metadata.Builder metadataBuilder = Metadata.builder(state.metadata()).put(operatorMetadataBuilder.build());
* and cannot be modified by the end user. | ||
* </p> | ||
*/ | ||
public record OperatorMetadata( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is always used in the prior examples I posted for the OperatorErrorMetadata and OperatorHandlerMetadata. It's the top level objects that contains both, either or neither can be set.
I think I applied all suggestions and I linked with the relevant code in the WIP PR. I hope it's sufficiently described for a review. |
@elasticmachine update branch |
… into operator/metadata
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just a couple suggestions for avoiding TreeSets
@Override | ||
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException { | ||
builder.startObject(name()); | ||
builder.stringListField(KEYS.getPreferredName(), new TreeSet<>(keys())); // ordered set here to ensure output consistency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Building a TreeSet would seem more complicated and costly (lots of object creations) than simply sorting the keys?
builder.startObject(namespace()); | ||
builder.field(VERSION.getPreferredName(), version); | ||
builder.startObject(HANDLERS.getPreferredName()); | ||
var sortedKeys = new TreeSet<>(handlers.keySet()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using a TreeSet here to do the sorting, we can use the stream api?
handlers.entrySet().stream().sorted(Map.Entry.comparingByKey()).forEach(e -> {
e.getValue().toXContent(builder, params);
}
* | ||
* <p> | ||
* These settings cannot be updated by the end user and are set outside of the | ||
* REST layer, e.g. through file based settings or by plugin/modules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to mention this naming issue. Since all of these classes are about more than operator settings, I think we need better terminology that applies to file based settings and programmatic fixed settings. I'm not sure what that naming should be, but as it is the "operator" terminology will be confusing when we start using this in plugins and modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a name referencing cluster state. Some ideas:
unmodifiable cluster state
immutable cluster state
....
So then this class might be ImmutableClusterStateMetadata
. It's not the best, but it demonstrates the idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, I've been struggling with terminology on this. I like the immutable prefix, it's really what it is. Operator is just the mode we set them in initially.
@elasticmachine update branch |
This PR relates to #86224 (Operator/File cluster settings) where we need to store various information in the Metadata about the most recent processing of the operator cluster state update.
Since the original PR is too large on its own, I'm separating the individual components of the PR for easier review.