
[Profiling] Query in parallel only if beneficial #103061

Merged

Conversation

danielmitterdorfer
Member

With this commit we check index allocation before we do key-value lookups. To reduce latency, key-value lookups are done in parallel for multiple slices of data. However, on nodes with spinning disks, parallel accesses are harmful. Therefore, we check whether any index is allocated either to the warm or cold tier (which are usually on spinning disks) and disable parallel key-value lookups. This has improved latency on the warm tier by about 10% in our experiments.

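Conceptually, the change boils down to collapsing the slice count to one whenever the data may sit on spinning disks. A minimal sketch in plain Java of that idea (the `fetchSlice` and `partition` helpers are illustrative stand-ins, not the actual Elasticsearch code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.function.Function;

// Illustrative sketch of "query in parallel only if beneficial"; not the actual Elasticsearch code.
final class AdaptiveSlicedLookup {

    static Map<String, String> lookup(
        List<String> keys,
        int desiredSlices,
        boolean anyIndexOnWarmOrColdTier, // warm/cold tiers often sit on spinning disks
        ExecutorService executor,
        Function<List<String>, Map<String, String>> fetchSlice // stands in for an mget-style bulk lookup
    ) {
        // Parallel reads hurt on spinning disks, so fall back to a single slice there.
        int sliceCount = anyIndexOnWarmOrColdTier ? 1 : desiredSlices;

        Map<String, String> results = new ConcurrentHashMap<>();
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (List<String> slice : partition(keys, sliceCount)) {
            futures.add(CompletableFuture.runAsync(() -> results.putAll(fetchSlice.apply(slice)), executor));
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        return results;
    }

    // Split the keys into at most sliceCount contiguous slices.
    private static List<List<String>> partition(List<String> keys, int sliceCount) {
        List<List<String>> slices = new ArrayList<>();
        int sliceSize = Math.max(1, (keys.size() + sliceCount - 1) / sliceCount);
        for (int from = 0; from < keys.size(); from += sliceSize) {
            slices.add(keys.subList(from, Math.min(keys.size(), from + sliceSize)));
        }
        return slices;
    }
}
```

With a slice count of 1 the loop degenerates to a single sequential bulk lookup, which is the behavior behind the roughly 10% warm-tier latency improvement reported above.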
@danielmitterdorfer added the >bug, auto-backport-and-merge, :UniversalProfiling/Application (Elastic Universal Profiling REST APIs and infrastructure), and v8.12.0 labels on Dec 6, 2023
@elasticsearchmachine added the Team:obs-knowledge (Meta label for Observability Knowledge team) label on Dec 6, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/obs-knowledge-team (Team:obs-knowledge)

@elasticsearchmachine
Collaborator

Hi @danielmitterdorfer, I've created a changelog YAML for you.

@danielmitterdorfer
Member Author

@elasticmachine merge upstream

ClusterState clusterState = clusterService.state();
List<Index> indices = resolver.resolve(clusterState, "profiling-stacktraces", responseBuilder.getStart(), responseBuilder.getEnd());
// Avoid parallelism if there is potential we are on spinning disks (frozen tier uses searchable snapshots)
int sliceCount = IndexAllocation.isAnyOnWarmOrColdTier(clusterState, indices) ? 1 : desiredSlices;
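For context on the `IndexAllocation.isAnyOnWarmOrColdTier` call above: the question it answers is roughly "do any of these indices live on a node with a warm or cold data role?". A self-contained conceptual sketch of that idea, using plain role-name strings rather than the actual cluster-state types (the real check takes the `ClusterState` and resolved indices, as in the excerpt above):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conceptual sketch only; not the actual IndexAllocation implementation.
final class TierCheck {

    // rolesByIndex: index name -> node role names of the nodes holding that index's shards
    static boolean isAnyOnWarmOrColdTier(Map<String, Set<String>> rolesByIndex, List<String> indices) {
        for (String index : indices) {
            Set<String> roles = rolesByIndex.getOrDefault(index, Set.of());
            // data_warm and data_cold nodes back the warm/cold tiers, which often use spinning disks
            if (roles.contains("data_warm") || roles.contains("data_cold")) {
                return true;
            }
        }
        return false;
    }
}
```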
Contributor


Just a question: did you test with 2 slices on the warm tier as well? If so, what was the approximate impact?

Member Author


I just dug through my notes. I believe I did run such a test, but the superior alternative in all cases was a single slice (apparently I did not even bother to write down the results for anything except the default value of 16 slices and a single slice...).

@rockdaboot
Contributor


👍

@danielmitterdorfer danielmitterdorfer merged commit a721ed8 into elastic:main Dec 7, 2023
15 checks passed
@danielmitterdorfer danielmitterdorfer deleted the adaptive-slicing branch December 7, 2023 15:45
@elasticsearchmachine
Collaborator

💔 Backport failed

The backport operation could not be completed due to the following error:

There are no branches to backport to. Aborting.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 103061

danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Dec 7, 2023
With this commit we check index allocation before we do key-value lookups. To
reduce latency, key-value lookups are done in parallel for multiple slices of
data. However, on nodes with spinning disks, parallel accesses are harmful.
Therefore, we check whether any index is allocated either to the warm or
cold tier (which are usually on spinning disks) and disable parallel
key-value lookups. This has improved latency on the warm tier by about
10% in our experiments.
@danielmitterdorfer
Member Author

I'm manually backporting this in #103144.

elasticsearchmachine pushed a commit that referenced this pull request Dec 8, 2023
With this commit we check index allocation before we do key-value lookups. To
reduce latency, key-value lookups are done in parallel for multiple slices of
data. However, on nodes with spinning disks, parallel accesses are harmful.
Therefore, we check whether any index is allocated either to the warm or
cold tier (which are usually on spinning disks) and disable parallel
key-value lookups. This has improved latency on the warm tier by about
10% in our experiments.

Co-authored-by: Elastic Machine <[email protected]>
danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Jan 22, 2024
In order to take advantage of inherent parallelism of modern SSDs, we
slice keys and issue multiple mgets concurrently. In elastic#103061 we have
introduced an additional heuristic to disable that behavior on the warm
and cold tier which usually use spinning disks. We have unintentionally
also disabled the behavior on content nodes, i.e. on any clusters which
do not use data tiers. With this commit we explicitly exclude content
nodes from the heuristic so they can benefit from speedups due to
concurrent mgets.

Relates elastic#103061
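A conceptual sketch of the refined heuristic described here, again using plain role-name strings rather than the actual Elasticsearch implementation: a node counts toward the "possibly on spinning disks" signal only if it carries a warm or cold data role, while content nodes are explicitly excluded so that tierless clusters keep the concurrent mgets.

```java
import java.util.Set;

// Conceptual sketch of the follow-up heuristic; not the actual Elasticsearch implementation.
final class RefinedTierCheck {

    // A node suggests spinning disks only if it is a warm/cold node and not a content node,
    // so content-only (tierless) deployments keep the concurrent mget path.
    static boolean suggestsSpinningDisks(Set<String> nodeRoles) {
        boolean warmOrCold = nodeRoles.contains("data_warm") || nodeRoles.contains("data_cold");
        boolean content = nodeRoles.contains("data_content");
        return warmOrCold && !content;
    }
}
```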
danielmitterdorfer added a commit that referenced this pull request Jan 22, 2024
In order to take advantage of inherent parallelism of modern SSDs, we
slice keys and issue multiple mgets concurrently. In #103061 we have
introduced an additional heuristic to disable that behavior on the warm
and cold tier which usually use spinning disks. We have unintentionally
also disabled the behavior on content nodes, i.e. on any clusters which
do not use data tiers. With this commit we explicitly exclude content
nodes from the heuristic so they can benefit from speedups due to
concurrent mgets.

Relates #103061
danielmitterdorfer added a commit to danielmitterdorfer/elasticsearch that referenced this pull request Jan 22, 2024
In order to take advantage of inherent parallelism of modern SSDs, we
slice keys and issue multiple mgets concurrently. In elastic#103061 we have
introduced an additional heuristic to disable that behavior on the warm
and cold tier which usually use spinning disks. We have unintentionally
also disabled the behavior on content nodes, i.e. on any clusters which
do not use data tiers. With this commit we explicitly exclude content
nodes from the heuristic so they can benefit from speedups due to
concurrent mgets.

Relates elastic#103061
elasticsearchmachine pushed a commit that referenced this pull request Jan 22, 2024
In order to take advantage of inherent parallelism of modern SSDs, we
slice keys and issue multiple mgets concurrently. In #103061 we have
introduced an additional heuristic to disable that behavior on the warm
and cold tier which usually use spinning disks. We have unintentionally
also disabled the behavior on content nodes, i.e. on any clusters which
do not use data tiers. With this commit we explicitly exclude content
nodes from the heuristic so they can benefit from speedups due to
concurrent mgets.

Relates #103061
Labels: >bug, Team:obs-knowledge (Meta label for Observability Knowledge team), :UniversalProfiling/Application (Elastic Universal Profiling REST APIs and infrastructure), v8.12.0, v8.13.0