[STORE] Move to one datapath per shard #9498
Comments
+1 on this change. My suggestion for the upgrade is to do it automatically on node startup: we could scan the data directory, rebalance shard data paths during startup, and only then make the node available. If we can MOVE the files that would be great, since it is more lightweight; if MOVE fails, we can resort to copying them over.
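To make the move-or-copy fallback idea from the comment above concrete, here is a minimal, hypothetical Java sketch (none of these class or method names exist in Elasticsearch): try a cheap rename first, and fall back to a recursive copy plus delete when the source and target live on different disks.

```java
// Hypothetical helper illustrating "MOVE, fall back to copy"; not Elasticsearch code.
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;

public final class ShardRelocator {

    /** Tries a cheap rename first; falls back to copy + delete across disks. */
    public static void moveOrCopy(Path source, Path target) throws IOException {
        try {
            // An atomic move is essentially a rename and does not rewrite the data.
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Different filesystem/disk: copy the whole shard directory instead.
            copyRecursively(source, target);
            deleteRecursively(source);
        }
    }

    private static void copyRecursively(Path source, Path target) throws IOException {
        Files.walkFileTree(source, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
                Files.createDirectories(target.resolve(source.relativize(dir)));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.copy(file, target.resolve(source.relativize(file)), StandardCopyOption.COPY_ATTRIBUTES);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    private static void deleteRecursively(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }
}
```

An atomic move only works within a single filesystem, which is why the copy fallback is needed when shard data has to change disks.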
Moved out to 2.0 for now.
This is a very common scenario when you have very large clusters and multiple disks on each node: some disk is bound to fail every few days. We had a bunch of disk failures over a few days, and it took some time to figure out that the root of the problem was shards being striped across multiple disks. When everything is healthy it works well, since the load is balanced and all the disk spindles are utilized. We had to shut down ES on the affected node, remove the disk from the ES config, and bring ES back up after some time, which forced the node to recover its data from other nodes. With all shard data on one disk, balancing/utilizing all disks will need a good look, i.e. whether some disks become over-utilized because an index living there receives more data. The same goes for read performance, since everything is now served from a single disk.
+1 on doing this.
I'm not sure how this can be nicely automated. I think the optimal process involves slowly moving away from this by moving shards off a node (or a few nodes), taking it out of the cluster, reformatting its multiple disks to a proper RAID, adding it back to the cluster, and letting the cluster heal. I suppose in many respects it's similar to replacing the disks entirely. We did that about a year ago with great success, btw.
Overall +1 to this change for increasing durability of shards. However, once this change goes through, one will need to resort either to RAID of some description or to more shards to make up for the loss of spindles. In light of this change, having multiple shards allocated to the same node would actually be beneficial in terms of IO throughput, but could anyone comment on any drawbacks of this? It seems to be the most elegant solution to restore the lost performance without resorting to external systems. If the increase in shards is indeed feasible, it would also be nice, as a later optimisation, to attempt to allocate shards of the same index to different disks.
This commit moves away from simulating striped RAID-0 across multiple data paths towards using a single path per shard. Multiple data paths are still supported, but a shard and its data are no longer striped across multiple paths / disks. This will, for instance, prevent losing all shards if a single disk is corrupted. Indices that already use this feature will automatically be upgraded to a single data path based on a simple disk-space heuristic. In general there must be enough disk space to move a single shard at any time, otherwise the upgrade will fail. Closes elastic#9498
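As a rough illustration of the disk-space heuristic the commit message mentions, the following hypothetical sketch (not the actual Elasticsearch code) sums a shard's striped fragments and checks whether a single target path can hold the consolidated shard; if no path can, the upgrade would have to fail.

```java
// Hypothetical sketch of the "enough disk space to move a single shard" check;
// class and method names are made up for illustration.
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

public final class UpgradeSpaceCheck {

    /** Total on-disk size of one shard whose files are striped across several paths. */
    public static long stripedShardSize(List<Path> shardFragmentDirs) throws IOException {
        long total = 0;
        for (Path dir : shardFragmentDirs) {
            try (Stream<Path> files = Files.walk(dir)) {
                total += files.filter(Files::isRegularFile)
                              .mapToLong(p -> p.toFile().length())
                              .sum();
            }
        }
        return total;
    }

    /** Returns true if the target path has enough usable space for the consolidated shard. */
    public static boolean hasRoomFor(Path target, long shardSizeInBytes) throws IOException {
        FileStore store = Files.getFileStore(target);
        return store.getUsableSpace() >= shardSizeInBytes;
    }
}
```

This mirrors the constraint stated in the commit message: at any point during the upgrade there must be enough free space on one path to hold a whole shard.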
This commit removes the obsolete distributor settings and updates the documentation on multiple data.path. It also adds an explanation to the migration guide. Relates to elastic#9498 Closes elastic#10770
Today Elasticsearch has the ability to utilize multiple data paths, which is useful to spread load across several disks, both throughput- and space-wise. Yet the way it is utilized is problematic, since it's similar to a striped RAID where each shard gets a dedicated directory on each of the data paths / disks. This method has several problems.
This issue aims to move away from this shard striping towards a data path per shard, such that all files for an Elasticsearch shard (Lucene index) live in the same directory and no special code is needed. This requires some changes to how we balance disk utilization: today we pick a directory each time a file needs to be written, whereas now we need to select the directory ahead of time. While this seems trivial, since we could just pick the least utilized disk, we need to take the expected size of the shard into account and check whether we can allocate the shard on a certain node at all. This can have implications for how we make that decision, since today the space of all data paths is accumulated for it.
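For illustration only, here is a hypothetical sketch of selecting a data path ahead of time, assuming we pick the least utilized path that can still fit the expected shard size (the class and method names are invented, not Elasticsearch APIs):

```java
// Hypothetical ahead-of-time path selection: choose the path with the most usable
// space, but only if the expected shard size fits on it.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

public final class DataPathSelector {

    /** Picks the least-utilized path that can still accommodate the expected shard size. */
    public static Optional<Path> pick(List<Path> dataPaths, long expectedShardSizeInBytes) throws IOException {
        Path best = null;
        long bestUsable = -1;
        for (Path path : dataPaths) {
            long usable = Files.getFileStore(path).getUsableSpace();
            if (usable >= expectedShardSizeInBytes && usable > bestUsable) {
                best = path;
                bestUsable = usable;
            }
        }
        return Optional.ofNullable(best); // empty if no single path can hold the shard
    }
}
```

Note that summing free space across all data paths, as is done for allocation decisions today, would over-estimate what a shard can actually use once it is confined to a single path.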
Rather less trivial is the migration path: we need to move from multiple directories to a single directory, likely in the background, so that the upgrade process is smooth for users. There are several options for this. However, I think the code changes can be done in two steps: first, allocate new shards on a single data path; then, migrate existing striped shards, since the latter is much harder.