Hello!
https://discuss.elastic.co/t/curator-shard-has-exceeded-the-maximum-number-of-retries-1/290059
When Curator tries to allocate a replica shard of a shrunken index, I get this error:
{ "index" : "example-index-2021-09-29-shrink", "shard" : 0, "primary" : false, "current_state" : "unassigned", "unassigned_info" : { "reason" : "ALLOCATION_FAILED", "at" : "2021-11-23T12:26:19.515Z", "failed_allocation_attempts" : 1, "details" : "failed shard on node [8r_zhRD4RDm2peWnDun_3w]: failed recovery, failure RecoveryFailedException[[example-index-2021-09-29-shrink][0]: Recovery failed from {node15}{nWOPSov3TFKUunoiooVxMQ}{PSAfiXvZQx-NLyKpnXGs1A}{192.168.0.164}{192.168.0.164:9300}{ml.machine_memory=135291469824, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {node13}{8r_zhRD4RDm2peWnDun_3w}{KU0HhEPMQ_ilSV3RCe4XNw}{192.168.0.162}{192.168.0.162:9300}{ml.machine_memory=135291469824, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[node15][172.17.0.3:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [85] files with total size of [24.8gb]]; nested: ReceiveTimeoutTransportException[[node13][192.168.0.162:9300][internal:index/shard/recovery/file_chunk] request_id [1586168734] timed out after [899897ms]]; ", "last_allocation_status" : "no_attempt" }, "can_allocate" : "no", "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes", "node_allocation_decisions" : [ { "node_id" : "8r_zhRD4RDm2peWnDun_3w", "node_name" : "node13", "transport_address" : "192.168.0.162:9300", "node_attributes" : { "ml.machine_memory" : "135291469824", "xpack.installed" : "true", "ml.max_open_jobs" : "20", "ml.enabled" : "true" }, "node_decision" : "no", "deciders" : [ { "decider" : "max_retry", "decision" : "NO", "explanation" : "shard has exceeded the maximum number of retries [1] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2021-11-23T12:26:19.515Z], failed_attempts[1], delayed=false, details[failed shard on node [8r_zhRD4RDm2peWnDun_3w]: failed recovery, failure RecoveryFailedException[[example-index-2021-09-29-shrink][0]: Recovery failed from {node15}{nWOPSov3TFKUunoiooVxMQ}{PSAfiXvZQx-NLyKpnXGs1A}{192.168.0.164}{192.168.0.164:9300}{ml.machine_memory=135291469824, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {node13}{8r_zhRD4RDm2peWnDun_3w}{KU0HhEPMQ_ilSV3RCe4XNw}{192.168.0.162}{192.168.0.162:9300}{ml.machine_memory=135291469824, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}]; nested: RemoteTransportException[[node15][172.17.0.3:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [85] files with total size of [24.8gb]]; nested: ReceiveTimeoutTransportException[[node13][192.168.0.162:9300][internal:index/shard/recovery/file_chunk] request_id [1586168734] timed out after [899897ms]]; ], allocation_status[no_attempt]]]" } ]
Is there a way to increase "index.allocation.max_retries" through the Curator settings instead?
Action file:
actions:
  1:
    action: shrink
    description: >-
      Shrink selected indices on the node with the most available space.
      Delete source index after successful shrink, then reroute the shrunk
      index with the provided parameters.
    options:
      ignore_empty_list: True
      shrink_node: DETERMINISTIC
      node_filters:
        permit_masters: True
      number_of_shards: 1
      number_of_replicas: ${REPLICA_COUNT:1}
      shrink_prefix:
      shrink_suffix: '-shrink'
      copy_aliases: True
      delete_after: True
      wait_for_active_shards: 1
      extra_settings:
        settings:
          index.codec: best_compression
      wait_for_completion: True
      wait_for_rebalance: True
      wait_interval: 9
      max_wait: -1
    filters:
      - filtertype: pattern
        kind: prefix
        value: ${INDEX_PREFIX}
      - filtertype: age
        source: name
        direction: older
        timestring: ${TIMESTAMP:'%Y-%m-%d'}
        unit: ${PERIOD:days}
        unit_count: ${PERIOD_COUNT}
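One idea I have not verified yet: since this action already passes index.codec through extra_settings, perhaps the retry limit could be passed the same way, for example:

extra_settings:
  settings:
    index.codec: best_compression
    index.allocation.max_retries: 5

I am not sure whether Curator applies extra_settings early enough to affect the replica allocation of the shrink target, so this is only a guess on my part.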
Curator version: 5.8.4
OS: CentOS 7
I've tried to create a template:
"shrink" : { "order" : 0, "index_patterns" : [ "*-shrink" ], "settings" : { "index" : { "allocation" : { "max_retries" : "5" } }
But it doesn't help. Here are the index settings after a successful shrink:
GET /example-index-shrink/_settings

{
  "example-index-shrink" : {
    "settings" : {
      "index" : {
        "allocation" : {
          "max_retries" : "1"
        },
        "shrink" : {
          "source" : {
            "name" : "example-index",
            "uuid" : "mecKKzDDTzu77ViMv5N3EA"
          }
        },
        "blocks" : {
          "write" : null
        },
        "provided_name" : "example-index-shrink",
        "creation_date" : "1637751350836",
        "number_of_replicas" : "1",
        "uuid" : "MI_wbW35R8ubkYZOySfp1g",
        "version" : {
          "created" : "6080899",
          "upgraded" : "6080899"
        },
        "codec" : "best_compression",
        "routing" : {
          "allocation" : {
            "initial_recovery" : {
              "_id" : "nWOPSov3TFKUunoiooVxMQ"
            },
            "require" : {
              "_name" : null
            }
          }
        },
        "number_of_shards" : "1",
        "routing_partition_size" : "1",
        "resize" : {
          "source" : {
            "name" : "example-index",
            "uuid" : "mecKKzDDTzu77ViMv5N3EA"
          }
        }
      }
    }
  }
}
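Since index.allocation.max_retries is a dynamic index setting, I assume I could raise it on an already-shrunken index by hand, e.g.:

PUT /example-index-shrink/_settings
{
  "index" : {
    "allocation" : {
      "max_retries" : "5"
    }
  }
}

but I would prefer the value to be set automatically at shrink time rather than patching each index afterwards.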
Thanks in advance