Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Settings Diffable #88815

Merged
merged 7 commits into from
Jul 29, 2022

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Jul 26, 2022

Making settings diffable so that index metadata diffs
are smaller whenever the metadata changes without a setting change
as well as to make index setting updates over a wider number of
indices faster.

This saves about 3% of CPU time on master and about half that on data nodes
that is just burnt for writing setting strings (mainly a result of the extremely heavy default query fields list in the Beats template I think) when bootstrapping many shards benchmarks benchmarks to 50k indices.

relates #77466

Making settings diffable so that index metadata diffs
are smaller whenever the metadata changes without a setting change
as well as to make index setting updates over a wider number of
indices faster.

This saves about 3% of CPU time on master and about half that on data nodes
that is just burnt for writing setting strings
when bootstrapping many shards benchmarks benchmarks to 50k indices.
@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Jul 26, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@@ -402,6 +402,25 @@ public void writeTo(StreamOutput out) throws IOException {
}
}

private static <K, T> Map.Entry<K, T> mapEntry(K key, T newValue) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a quick fix here to enable null values as we can have them for settings. The default map entries don't allow null values.

@@ -1420,7 +1433,13 @@ public void writeTo(StreamOutput out) throws IOException {
out.writeVLong(aliasesVersion);
}
out.writeByte(state.id);
Settings.writeSettingsToStream(settings, out);
assert settings != null
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the nicest way to go about BwC but I couldn't see a neater solution.
As we never write this over cross-cluster connections this should be safe as far as I can tell.

@original-brownbear
Copy link
Member Author

Adding WIP label here, as I just realized I forgot dedicated test for the diffing. Will add shortly, please hold off on reviewing.

@original-brownbear original-brownbear marked this pull request as draft July 26, 2022 15:56
@original-brownbear original-brownbear marked this pull request as ready for review July 27, 2022 12:35
@mark-vieira mark-vieira added v8.5.0 and removed v8.4.0 labels Jul 27, 2022
Copy link
Member

@rjernst rjernst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a couple minor comments.

@@ -402,6 +402,25 @@ public void writeTo(StreamOutput out) throws IOException {
}
}

private static <K, T> Map.Entry<K, T> mapEntry(K key, T newValue) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the special map entry? These will be recreated by any map impls right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, we really didn't need Map.Entry for the sites that we use it to begin with, Tuple would've been just fine but I didn't want to mix more refactoring than necessary into this PR so I went with this since weirdly enough I couldn't find a built-in JDK way of creating a Map.Entry that allows me a null value ...

out.writeMap(settings, StreamOutput::writeString, Settings::writeSettingValue);
}

private static void writeSettingValue(StreamOutput streamOutput, Object value) throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this was already here, and I think it was from a previous PR, but why aren't we using writeGenericValue?I do notice that we use the fully generic read above. If it's for performance, can you add a comment so that me or anyone else that wonders this in the future remembers. :)

Copy link
Member Author

@original-brownbear original-brownbear Jul 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's for performance

yes that's what it's for, the difference to the generic method is massive here. Will add a comment

@@ -441,6 +442,17 @@ public void testWriteSettingsToStream() throws IOException {
assertEquals(Arrays.asList("1", "2"), settings.getAsList("test.key4.foo"));
}

public void testDiff() throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have some more tests? eg a noop test, a test with more than one setting, etc. Basically, can we have something to exercise each path in the setting specific diff code you added (maybe the noop is the only additional thing needed).

@original-brownbear
Copy link
Member Author

Jenkins run elasticsearch-ci/packaging-tests-unix-sample

@original-brownbear
Copy link
Member Author

Thanks Ryan! Added a noop test and a list test case to cover all possible paths.
Merging despite the red unix-sample since it's just a random distribution failing on repeats because of Jenkins issues unrelated to this change.

@original-brownbear original-brownbear merged commit a3a5332 into elastic:main Jul 29, 2022
@original-brownbear original-brownbear deleted the diffable-settings branch July 29, 2022 16:30
rjernst pushed a commit to rjernst/elasticsearch that referenced this pull request Jul 29, 2022
Making settings diffable so that index metadata diffs
are smaller whenever the metadata changes without a setting change
as well as to make index setting updates over a wider number of
indices faster.

This saves about 3% of CPU time on master and about half that on data nodes
that is just burnt for writing setting strings
when bootstrapping many shards benchmarks benchmarks to 50k indices.
pull bot pushed a commit to NOUIY/elasticsearch that referenced this pull request Aug 8, 2022
Follow-up to elastic#88815.
No need to have two equivalent implementations here.
@original-brownbear original-brownbear restored the diffable-settings branch April 18, 2023 20:36
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Apr 18, 2023
Follow-up to elastic#88815.
No need to have two equivalent implementations here.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Core/Infra/Settings Settings infrastructure and APIs >non-issue Team:Core/Infra Meta label for core/infra team v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants