Elasticsearch version: 6.3.2

Description of the problem including expected versus actual behavior:
After upgrading from 5.6.7 to 6.3.2, we've noticed that relocating a primary shard takes longer.
It seems that in 6.3.2, when a primary shard is relocated to a different node, translog operations are replayed. This happens even if the shard on the source node was successfully flushed, so the translog contains no operations that are not already in the files being copied to the target node.
In 5.6.7 the translog was emptied when a flush took place, so translog operations were not replayed during relocation.
The expected behavior is that recovery of a flushed shard to an empty target node does not entail translog replay, only file copying.
Steps to reproduce:
Create an index with a single primary shard and no replicas.
Index 1M documents.
Flush the index.
Relocate the shard to a different node, e.g. by setting "index.routing.allocation.require._name" to a different node name (a sketch of these steps follows the list).
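A minimal sketch of these steps against the REST API, assuming a cluster on localhost:9200, an index named test, and a target node named node_td1 (all of these names are placeholders):

```sh
# Create an index with a single primary shard and no replicas
curl -XPUT 'localhost:9200/test' -H 'Content-Type: application/json' -d '{
  "settings": { "number_of_shards": 1, "number_of_replicas": 0 }
}'

# Index documents (use the bulk API and repeat up to ~1M docs for a realistic reproduction)
curl -XPUT 'localhost:9200/test/doc/1' -H 'Content-Type: application/json' -d '{ "field": "value" }'

# Flush so that all indexed operations are persisted in the segment files
curl -XPOST 'localhost:9200/test/_flush'

# Force the shard onto another node via allocation filtering (placeholder node name)
curl -XPUT 'localhost:9200/test/_settings' -H 'Content-Type: application/json' -d '{
  "index.routing.allocation.require._name": "node_td1"
}'
```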
If you set the org.elasticsearch.indices.recovery logging level to TRACE (a settings call for this is sketched after the log line below), you will see that a file-based recovery takes place: the files are transferred, and then the translog is sent and replayed on the target node:
[2018-08-28T13:10:31,099][TRACE][o.e.i.r.RecoverySourceHandler] [node_td2] [index][0][recover to node_td1] sent batch of [10083][512kb] (total: [1000000]) translog operations
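One way to enable that logging is a transient cluster setting; the logger name matches the package in the message above, and the host is a placeholder:

```sh
# Raise recovery logging to TRACE (transient, so it does not survive a full cluster restart)
curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{
  "transient": { "logger.org.elasticsearch.indices.recovery": "TRACE" }
}'
```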
We replay the translog operations so that the relocated shard has a history of operations in its translog too. This history is important for operations-based recoveries (when a shard temporarily goes offline and only needs to replay some operations to catch up). By default, we now retain 512 MB or twelve hours of translog files for these purposes. We are making some improvements here as we work on relying on the translog less for history; see, for example, #33190. It will be a while, though, until the behavior that PR builds on is the default. For now, you can adjust your translog retention policy (a sketch follows), but the risk is that a shard will not have enough history in its translog for an operations-based recovery and you will have to fall back to file-based recoveries.
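The retention policy mentioned above is controlled by two dynamic index settings; here is a sketch of lowering them on a placeholder index named test (the values shown are illustrative, and the 6.x defaults are 512mb and 12h):

```sh
# Retain less translog history: less replay during relocation, but a higher chance
# that a briefly offline shard copy must fall back to a full file-based recovery
curl -XPUT 'localhost:9200/test/_settings' -H 'Content-Type: application/json' -d '{
  "index.translog.retention.size": "128mb",
  "index.translog.retention.age": "1h"
}'
```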