updateMappingOnMaster never times out leaving replicas stuck in INITIALIZING #9066

Closed
folke opened this issue Dec 24, 2014 · 3 comments

folke commented Dec 24, 2014

Due to issue #9065, tasks of priority NORMAL never got executed.

A related problem is that a number of our replicas stayed stuck in INITIALIZING, in the TRANSLOG phase.

This happened because the task "recovery_mapping_check" (priority NORMAL) never actually completed. The method updateMappingOnMaster does have a timeout, but that timeout is never reached, because the latch.await() call above the timeout check blocks indefinitely.

Version: 1.4.2
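
The blocking behaviour described above can be illustrated with a minimal Java sketch. This is not the Elasticsearch code: the class name LatchTimeoutSketch and the helper awaitWithTimeout are hypothetical, and the sketch only assumes that the wait is done on a java.util.concurrent.CountDownLatch. It shows that a bounded await returns even when the task that should release the latch never runs, whereas a bare latch.await() would wait forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchTimeoutSketch {

    // Hypothetical helper (not part of Elasticsearch): waits for the latch,
    // but gives up after timeoutMillis. Returns true only if the latch
    // reached zero before the timeout elapsed.
    public static boolean awaitWithTimeout(CountDownLatch latch, long timeoutMillis) {
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the wait as not completed.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Simulates a task that is never executed: nobody calls countDown().
        CountDownLatch neverReleased = new CountDownLatch(1);
        // A plain neverReleased.await() would block forever at this point.
        System.out.println("completed = " + awaitWithTimeout(neverReleased, 200)); // prints "completed = false"

        // Simulates a task that did run and released the latch.
        CountDownLatch released = new CountDownLatch(1);
        released.countDown();
        System.out.println("completed = " + awaitWithTimeout(released, 200)); // prints "completed = true"
    }
}
```

With the bounded wait, the caller can treat a false return as a timeout and fail or retry, rather than leaving the thread parked.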

clintongormley added the >bug, :Distributed Indexing/Recovery, and v1.4.3 labels Dec 29, 2014
bleskes (Contributor) commented Jan 25, 2015

@folke sorry for the very late response. The latch you refer to waits for recovery_mapping_check to execute on the local node. The bug you mentioned concerns (if I understand correctly) refreshing the mapping on the master. Can you confirm that it kept causing cluster state publishing?
I think a timeout in recovery_mapping_check makes sense. I'm waiting for your confirmation to see whether it will actually solve the issue.

martijnvg added the v1.4.4 label and removed the v1.4.3 label Feb 5, 2015
spinscale added the v1.4.5 label and removed the v1.4.4 label Feb 19, 2015
bleskes removed the :Distributed Indexing/Recovery, >bug, and v1.4.5 labels Apr 11, 2015
bleskes (Contributor) commented Apr 11, 2015

This was resolved with #9575; I forgot to close it at the time.

bleskes closed this as completed Apr 11, 2015