Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add immutable 'operator' metadata classes for cluster state #87763

Merged
merged 14 commits into from
Jun 29, 2022

Conversation

grcevski
Copy link
Contributor

This PR relates to #86224 (Operator/File cluster settings) where we need to store various information in the Metadata about the most recent processing of the operator cluster state update.

Since the original PR is too large on its own, I'm separating the individual components of the PR for easier review.

@grcevski grcevski added >non-issue :Core/Infra/Core Core issues without another label Team:Core/Infra Meta label for core/infra team v8.4.0 labels Jun 16, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting this out. Overall the use of records is nice here, and I appreciate all the javadocs! I left some questions. The downside of the way the PR is split is I can't tell how any of these classes are intended to be used, so it is difficult to comment on particular methods and whether they could be simplified. I'm not sure how to remedy that...

* @param namespace the namespace to lookup operator metadata for
* @return {@link OperatorMetadata} or null if not found
*/
public OperatorMetadata operatorMetadata(String namespace) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have this method and also the one above returning the entire map? If the one above is needed, then it could also be used in place of this method?

@@ -1146,6 +1181,7 @@ public Metadata apply(Metadata part) {
builder.indices(indices.apply(part.indices));
builder.templates(templates.apply(part.templates));
builder.customs(customs.apply(part.customs));
builder.put(operatorMetadata.apply(part.operatorMetadata));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the jdk map diff return an immutable/unmodifiable map? We could either wrap or copy this here to make it immutable, or even better open a separate issue to make the jdk map diff's apply return an immutable map (I've had the need for this in recent hppc PRs)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made an issue here #88019. In the interim until that's fixed I'll wrap this in an unmodifiable map.

* @param operatorMetadata a map of namespace to {@link OperatorMetadata}
* @return {@link Builder}
*/
public Builder put(Map<String, OperatorMetadata> operatorMetadata) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, I used to call this method putOperatorState, but it was too long and it didn't match the naming of other methods we have in Metadata, we simply call .put with different types. Here's the usage here:

https://github.com/elastic/elasticsearch/pull/86224/files#diff-8ca6818d706ec1720e045ea8a1f4f8e4b5dda47c5610aed2bf8e7d55121d8391R129

*/
public record OperatorErrorMetadata(
Long version,
org.elasticsearch.cluster.metadata.OperatorErrorMetadata.ErrorKind errorKind,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: static import ErrorKind?

* @throws IOException
*/
public static OperatorErrorMetadata readFrom(StreamInput in) throws IOException {
Builder builder = new Builder().version(in.readLong())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create a builder, these can be passed to the canonical ctor?

* @param metadata {@link OperatorErrorMetadata}
* @param builder {@link XContentBuilder}
*/
public static void toXContent(OperatorErrorMetadata metadata, XContentBuilder builder) throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a member method of the record?

* @return {@link OperatorErrorMetadata}
* @throws IOException
*/
public static OperatorErrorMetadata fromXContent(XContentParser parser) throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be on the record, not the builder?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually for this one is a lot more convenient to be a builder method, we are iterating over the parser elements and we fill in the pieces of data we parse inside the builder. I had made the toXContent part of the builder too since they share the static strings, but I can remove that wrapping.

* @param metadata {@link OperatorHandlerMetadata}
* @param builder {@link XContentBuilder}
*/
public static void toXContent(OperatorHandlerMetadata metadata, XContentBuilder builder) throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is toXContent needed on both the builder (as a static that takes the record?) and on the record itself?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move it to the record only, similar to other places, I'm sharing the string constants for to and from xcontent.


String currentFieldName = skipNamespace(parser);
XContentParser.Token token;
while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could a ConstructingObjectParser be used instead of this loop (same goes for other fromXContent in this PR)? If using that, would a builder even be necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into that, it would be great if we can.

@grcevski
Copy link
Contributor Author

Thanks for splitting this out. Overall the use of records is nice here, and I appreciate all the javadocs! I left some questions. The downside of the way the PR is split is I can't tell how any of these classes are intended to be used, so it is difficult to comment on particular methods and whether they could be simplified. I'm not sure how to remedy that...

Yeah, I figured it will be super hard without context. I'll start adding comments on various PR lines with context from the original PR in how it's used. I hope that will help.

@@ -942,6 +947,15 @@ public Map<String, Custom> customs() {
return this.customs;
}

/**
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an example in the related PR of how this will be used:

https://github.com/elastic/elasticsearch/pull/86224/files#diff-d017ebae07494c580da12a539a666fe1d87537f167e71761cf60451b8913d554R127

The related PR is out of date, it calls the API operatorState and it has the extra helper method to perform the additional .get.

Instead of:

OperatorMetadata existingMetadata = state.metadata().operatorState(namespace);

we'll be calling:

OperatorMetadata existingMetadata = state.metadata().operatorMetadata.get(namespace);

* This information is held by the {@link OperatorMetadata} class.
* </p>
*/
public record OperatorErrorMetadata(Long version, ErrorKind errorKind, List<String> errors)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error metadata is set when a file fails to get processed for some reason. In this case we launch a cluster state update task that will update this metadata. Example here:

https://github.com/elastic/elasticsearch/pull/86224/files#diff-a7da9bd5e6ce66769b5113745baa42c88b7c21b916a8ea290434048fc5af2c5fR29

The builder code is now removed, so the OperatorErrorMetadata will be directly created.

* multiple namespaces. See {@link OperatorMetadata} and {@link Metadata}.
* </p>
*/
public record OperatorHandlerMetadata(String name, Set<String> keys)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OperatorHandlerMetadata is stored when we successfully process a new operator cluster state. Here's an example usage from the task that does the final update of the cluster state:

https://github.com/elastic/elasticsearch/pull/86224/files#diff-8ca6818d706ec1720e045ea8a1f4f8e4b5dda47c5610aed2bf8e7d55121d8391R40

specifically here:

https://github.com/elastic/elasticsearch/pull/86224/files#diff-8ca6818d706ec1720e045ea8a1f4f8e4b5dda47c5610aed2bf8e7d55121d8391R128

The only difference is that I renamed the API so instead of:

Metadata.Builder metadataBuilder = Metadata.builder(state.metadata()).putOperatorState(operatorMetadataBuilder.build());

we'll do:

Metadata.Builder metadataBuilder = Metadata.builder(state.metadata()).put(operatorMetadataBuilder.build());

* and cannot be modified by the end user.
* </p>
*/
public record OperatorMetadata(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is always used in the prior examples I posted for the OperatorErrorMetadata and OperatorHandlerMetadata. It's the top level objects that contains both, either or neither can be set.

@grcevski
Copy link
Contributor Author

I think I applied all suggestions and I linked with the relevant code in the WIP PR. I hope it's sufficiently described for a review.

@grcevski grcevski requested a review from rjernst June 24, 2022 20:42
@grcevski
Copy link
Contributor Author

@elasticmachine update branch

Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just a couple suggestions for avoiding TreeSets

@Override
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
builder.startObject(name());
builder.stringListField(KEYS.getPreferredName(), new TreeSet<>(keys())); // ordered set here to ensure output consistency
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Building a TreeSet would seem more complicated and costly (lots of object creations) than simply sorting the keys?

builder.startObject(namespace());
builder.field(VERSION.getPreferredName(), version);
builder.startObject(HANDLERS.getPreferredName());
var sortedKeys = new TreeSet<>(handlers.keySet());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using a TreeSet here to do the sorting, we can use the stream api?

handlers.entrySet().stream().sorted(Map.Entry.comparingByKey()).forEach(e -> {
    e.getValue().toXContent(builder, params);
}

*
* <p>
* These settings cannot be updated by the end user and are set outside of the
* REST layer, e.g. through file based settings or by plugin/modules.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to mention this naming issue. Since all of these classes are about more than operator settings, I think we need better terminology that applies to file based settings and programmatic fixed settings. I'm not sure what that naming should be, but as it is the "operator" terminology will be confusing when we start using this in plugins and modules.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a name referencing cluster state. Some ideas:

unmodifiable cluster state
immutable cluster state
....

So then this class might be ImmutableClusterStateMetadata. It's not the best, but it demonstrates the idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I've been struggling with terminology on this. I like the immutable prefix, it's really what it is. Operator is just the mode we set them in initially.

@grcevski
Copy link
Contributor Author

@elasticmachine update branch

@grcevski grcevski changed the title Add operator metadata classes for cluster state Add immutable 'operator' metadata classes for cluster state Jun 29, 2022
@grcevski grcevski merged commit e1d03ef into elastic:master Jun 29, 2022
@grcevski grcevski deleted the operator/metadata branch June 29, 2022 22:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Core Core issues without another label >non-issue Team:Core/Infra Meta label for core/infra team v8.4.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants