ILM execution order on phase rollover #61014

Open
rsdrakh opened this issue Aug 12, 2020 · 5 comments
Labels
:Data Management/ILM+SLM Index and Snapshot lifecycle management >enhancement Team:Data Management Meta label for data/management team

Comments

@rsdrakh

rsdrakh commented Aug 12, 2020

I experienced non-optimal behaviour on ILM rollover from hot to warm phase.

My general pattern is

  • hot phase: 2 or more shards, 0 replicas
  • warm phase: shrink to 1 shard, forcemerge to 1 segment, create 1 replica

When executing the ILM policy, Elasticsearch first creates the replicas for the existing primaries, then allocates one copy of each shard to the node that will perform the shrink, then executes the shrink, and finally creates a replica of the now-shrunk single primary shard.
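
The pattern described above might correspond to a policy along the lines of the sketch below, sent as the body of a `PUT _ilm/policy/<name>` request. The thread does not show the actual policy, so this is an assumption: the rollover condition is illustrative, and the shard and replica counts of the hot index (2 or more shards, 0 replicas) would come from the index template rather than from the policy itself.

```json
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50gb" }
        }
      },
      "warm": {
        "actions": {
          "allocate": { "number_of_replicas": 1 },
          "shrink": { "number_of_shards": 1 },
          "forcemerge": { "max_num_segments": 1 }
        }
      }
    }
  }
}
```

Within a phase, ILM runs `allocate` before `shrink` regardless of the order in which the actions are written, which is why the replicas are created before the shrink takes place.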

This adds overhead: the replicas created before the shrink are redundant, because a replica is created again once the primary is in its configured state (shrunk to a single shard and a single segment).

In an environment with 2 nodes and 2 primaries that is no problem, as creating the replica has the same result as allocating all primaries to a single node.
But as soon as there are more primaries or more nodes, there is (in my opinion) unnecessary shard movement, at the recommended 25-50GB each.

Would it be possible to implement logic that steers allocation on rollover and takes such shard movements into account?

rsdrakh added the >enhancement and needs:triage labels Aug 12, 2020
danielmitterdorfer added the :Data Management/ILM+SLM label and removed the needs:triage label Aug 12, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/ILM+SLM)

elasticmachine added the Team:Data Management label Aug 12, 2020
@andreidan
Contributor

@rsdrakh thanks for describing this issue. We talked about it today and came up with a proposal: make the number of replicas configurable in the shrink action.
If the shrink action had an option to specify number_of_replicas, there would be no need to use the allocate action to configure the number of replicas. The shrink action would set index.number_of_replicas on the shrunk index, essentially allowing you to shrink an index with 0 replicas and configure 1 replica on the shrunk index.
Would this reduce the overhead you're describing with regard to replica allocation?
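
As a rough sketch of what this proposal could look like in a policy's warm phase: the `number_of_replicas` key inside `shrink` is the option proposed in this comment and is hypothetical, not an existing setting of the shrink action, so Elasticsearch would reject this body as written. The `allocate` action is left out because the replica count would no longer need to be set before the shrink.

```json
{
  "policy": {
    "phases": {
      "warm": {
        "actions": {
          "shrink": {
            "number_of_shards": 1,
            "number_of_replicas": 1
          },
          "forcemerge": { "max_num_segments": 1 }
        }
      }
    }
  }
}
```

With this shape, the source index would keep 0 replicas until the shrink completes, and the single replica would be created only for the shrunk index.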

@rsdrakh
Author

rsdrakh commented Aug 20, 2020

@andreidan Thanks for considering this issue!
As you describe it, it would reduce the overhead in my specific scenario.
Still, in case there is no "shrink" but only a change in "number_of_replicas", it might get tricky.
I am thinking of a hot warm scenario where a "hot tier" index with 1 replica is allocated to a "warm tier" and given 2 replicas. Is the shrink action involved as well in this case?
And when reducing the replicas from 2 to 1, the reduction should take place before allocating from hot to warm, shouldn't it?

So as I understood, on rollover there is an allocation action followed by a shrink action.

I am not familiar with the code, but IMHO a separate "replicas" action would make most sense, executed before allocation/shrink in case of reduced replicas, and post allocation/shrink in case of increased replicas.
Does that make sense?

@andreidan
Contributor

@rsdrakh you make a great point: the optimal execution path depends on whether the number of replicas is being increased or decreased.
Currently, the number of replicas is changed using the allocate action, which is also responsible for changing the index allocation filters (i.e. relocating the index based on the filters). As both operations are controlled via index settings, an allocate action that modifies both the number of replicas and the allocation rules will change the number of replicas before the index shards are rerouted to match the defined filtering rules.

> And when reducing the replicas from 2 to 1, the reduction should take place before allocating from hot to warm, shouldn't it?

A warm phase that specifies an allocate action like the one below would currently achieve this.

"warm": {
        "actions": {
          "allocate" : {
            "number_of_replicas": 1,
            "include" : {
              "box_type": "warm"
            }
          }
        }
}

> So as I understood, on rollover there is an allocation action followed by a shrink action.

Yes. With our proposal, the allocate action in the warm phase would no longer specify number_of_replicas, so ILM would first allocate the index to the warm boxes, and the number of replicas would then be increased through the configuration in the shrink action.

> I am thinking of a hot warm scenario where a "hot tier" index with 1 replica is allocated to a "warm tier" and given 2 replicas. Is the shrink action involved as well in this case?

No, the shrink action would not be involved here. This is one use case that our proposal would not tackle.

> but IMHO a separate "replicas" action would make most sense, executed before allocation/shrink in case of reduced replicas, and post allocation/shrink in case of increased replicas.

It does, and I agree that a replicas action would be a nice separation of concerns (the allocate action currently handles more than the allocation of the index). However, the ILM action execution order is currently deterministic and static, and that is by design. This change would be high-hanging fruit: the order of execution would need runtime information about the managed index. To support executing before allocation/shrink when replicas are reduced, and after allocation/shrink when replicas are increased, we'd need to make decisions on the fly whenever ILM executes for an index.

@joegallo
Contributor

See also #73499, which isn't the same as this issue, but does touch on some additional flexibility around the shrink action and juggling the number of replicas while executing that action.

6 participants