Shard history retention leases #37165
Labels
:Data Management/ILM+SLM
Index and Snapshot lifecycle management
:Distributed Indexing/CCR
Issues around the Cross Cluster State Replication features
:Distributed Indexing/Distributed
A catch all label for anything in the Distributed Area. Please avoid if you can.
:Distributed Indexing/Recovery
Anything around constructing a new shard, either from a local or a remote source.
>feature
Meta
Comments
jasontedor added the >feature, :Distributed Indexing/Recovery, :Distributed Indexing/Distributed, :Distributed Indexing/CCR, and :Data Management/ILM+SLM labels on Jan 6, 2019
Pinging @elastic/es-distributed
Pinging @elastic/es-core-features
This was referenced Jan 6, 2019
This was referenced Jan 10, 2019
dnhatn added a commit that referenced this issue on Jan 31, 2019
dnhatn added a commit that referenced this issue on Jan 31, 2019
dnhatn added a commit that referenced this issue on Feb 13, 2019
When a primary shard is recovered from its store, we trim the last commit (when it's unsafe). If that primary crashes before the recovery completes, we will lose the committed retention leases because they are baked in the last commit. With this change, we copy the retention leases from the last commit to the safe commit when trimming unsafe commits. Relates #37165
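A minimal, self-contained sketch of the idea in this change (not the actual Elasticsearch code; the commit representation, the RETENTION_LEASES_KEY name, and the method names are assumptions): before the unsafe commits are discarded, the retention leases recorded in the last commit are copied into the safe commit's user data so they survive a crash during recovery.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: commits are modeled as mutable user-data maps, and
// RETENTION_LEASES_KEY is a hypothetical name for the entry holding the encoded leases.
final class TrimUnsafeCommitsSketch {

    static final String RETENTION_LEASES_KEY = "retention_leases";

    /**
     * Before discarding unsafe commits, carry the retention leases recorded in the
     * last (unsafe) commit over to the safe commit, so they are not lost if the
     * primary crashes before recovery completes.
     */
    static void trimUnsafeCommits(List<Map<String, String>> commits, int safeCommitIndex) {
        Map<String, String> safeCommit = commits.get(safeCommitIndex);
        Map<String, String> lastCommit = commits.get(commits.size() - 1);
        String leases = lastCommit.get(RETENTION_LEASES_KEY);
        if (leases != null) {
            safeCommit.put(RETENTION_LEASES_KEY, leases); // preserve the newest leases
        }
        // drop every commit after the safe commit (the "unsafe" ones)
        commits.subList(safeCommitIndex + 1, commits.size()).clear();
    }

    public static void main(String[] args) {
        Map<String, String> safe = new HashMap<>(Map.of("local_checkpoint", "42"));
        Map<String, String> unsafe = new HashMap<>(Map.of(RETENTION_LEASES_KEY, "peer_recovery/1:40"));
        List<Map<String, String>> commits = new java.util.ArrayList<>(List.of(safe, unsafe));
        trimUnsafeCommits(commits, 0);
        System.out.println(commits); // the surviving safe commit now carries the leases
    }
}
```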
dnhatn added a commit that referenced this issue on Feb 13, 2019
When a primary shard is recovered from its store, we trim the last commit (when it's unsafe). If that primary crashes before the recovery completes, we will lose the committed retention leases because they are baked in the last commit. With this change, we copy the retention leases from the last commit to the safe commit when trimming unsafe commits. Relates #37165
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue on Feb 14, 2019
Today if soft deletes are enabled then we read the operations needed for peer recovery from Lucene. However we do not currently make any attempt to retain history in Lucene specifically for peer recoveries so we may discard it and fall back to a more expensive file-based recovery. Yet we still retain sufficient history in the translog to perform an operations-based peer recovery. In the long run we would like to fix this by retaining more history in Lucene, possibly using shard history retention leases (elastic#37165). For now, however, this commit reverts to performing peer recoveries using the history retained in the translog regardless of whether soft deletes are enabled or not.
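To illustrate the behavioral change this commit describes, here is a hedged sketch (the enum and method names are hypothetical, not the actual recovery code): before the change the history source for peer recovery depended on whether soft deletes were enabled; after it, peer recovery always reads from the translog, whose retention is already guaranteed, until Lucene history retention via leases is in place.

```java
// Illustrative only: the decision of where to read operation history from.
enum HistorySource { TRANSLOG, LUCENE_INDEX }

final class PeerRecoverySourceSketch {

    /** Before this change: prefer Lucene history whenever soft deletes are enabled. */
    static HistorySource beforeChange(boolean softDeletesEnabled) {
        return softDeletesEnabled ? HistorySource.LUCENE_INDEX : HistorySource.TRANSLOG;
    }

    /** After this change: always use the translog, regardless of soft deletes. */
    static HistorySource afterChange(boolean softDeletesEnabled) {
        return HistorySource.TRANSLOG;
    }

    public static void main(String[] args) {
        System.out.println(beforeChange(true)); // LUCENE_INDEX (history may have been merged away)
        System.out.println(afterChange(true));  // TRANSLOG (sufficient history is retained there)
    }
}
```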
DaveCTurner added a commit that referenced this issue on Feb 15, 2019
Today if soft deletes are enabled then we read the operations needed for peer recovery from Lucene. However we do not currently make any attempt to retain history in Lucene specifically for peer recoveries so we may discard it and fall back to a more expensive file-based recovery. Yet we still retain sufficient history in the translog to perform an operations-based peer recovery. In the long run we would like to fix this by retaining more history in Lucene, possibly using shard history retention leases (#37165). For now, however, this commit reverts to performing peer recoveries using the history retained in the translog regardless of whether soft deletes are enabled or not.
DaveCTurner added a commit that referenced this issue on Feb 15, 2019
Today if soft deletes are enabled then we read the operations needed for peer recovery from Lucene. However we do not currently make any attempt to retain history in Lucene specifically for peer recoveries so we may discard it and fall back to a more expensive file-based recovery. Yet we still retain sufficient history in the translog to perform an operations-based peer recovery. In the long run we would like to fix this by retaining more history in Lucene, possibly using shard history retention leases (#37165). For now, however, this commit reverts to performing peer recoveries using the history retained in the translog regardless of whether soft deletes are enabled or not.
DaveCTurner added a commit that referenced this issue on Feb 15, 2019
Today if soft deletes are enabled then we read the operations needed for peer recovery from Lucene. However we do not currently make any attempt to retain history in Lucene specifically for peer recoveries so we may discard it and fall back to a more expensive file-based recovery. Yet we still retain sufficient history in the translog to perform an operations-based peer recovery. In the long run we would like to fix this by retaining more history in Lucene, possibly using shard history retention leases (#37165). For now, however, this commit reverts to performing peer recoveries using the history retained in the translog regardless of whether soft deletes are enabled or not.
This was referenced Feb 16, 2019
dnhatn added a commit that referenced this issue on Feb 20, 2019
This commit introduces retention leases to ESIndexLevelReplicationTestCase, then adds some tests verifying that retention lease replication works correctly in spite of primary failover or out-of-order delivery of retention lease sync requests. Relates #37165
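The ordering guarantee such tests exercise can be sketched as follows (LeaseSyncRequest, applySync, and the field names are illustrative, not the real Elasticsearch API): a replica applies an incoming retention-lease sync only if it supersedes the state it already holds, compared by primary term and then version, so stale or out-of-order syncs are ignored.

```java
// Hedged sketch of a replica applying retention-lease sync requests in a safe order.
final class ReplicaLeaseStateSketch {

    record LeaseSyncRequest(long primaryTerm, long version, String encodedLeases) {}

    private long appliedPrimaryTerm = 0;
    private long appliedVersion = 0;
    private String appliedLeases = "";

    synchronized boolean applySync(LeaseSyncRequest request) {
        boolean supersedes = request.primaryTerm() > appliedPrimaryTerm
                || (request.primaryTerm() == appliedPrimaryTerm && request.version() > appliedVersion);
        if (supersedes == false) {
            return false; // stale or duplicate sync delivered out of order: drop it
        }
        appliedPrimaryTerm = request.primaryTerm();
        appliedVersion = request.version();
        appliedLeases = request.encodedLeases();
        return true;
    }

    public static void main(String[] args) {
        ReplicaLeaseStateSketch replica = new ReplicaLeaseStateSketch();
        System.out.println(replica.applySync(new LeaseSyncRequest(1, 2, "leases-v2")));            // true
        System.out.println(replica.applySync(new LeaseSyncRequest(1, 1, "leases-v1")));            // false: out of order
        System.out.println(replica.applySync(new LeaseSyncRequest(2, 1, "leases-after-failover"))); // true: newer term
    }
}
```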
dnhatn added a commit that referenced this issue on Feb 21, 2019
This commit introduces retention leases to ESIndexLevelReplicationTestCase, then adds some tests verifying that retention lease replication works correctly in spite of primary failover or out-of-order delivery of retention lease sync requests. Relates #37165
dnhatn added a commit that referenced this issue on Feb 21, 2019
This commit introduces retention leases to ESIndexLevelReplicationTestCase, then adds some tests verifying that retention lease replication works correctly in spite of primary failover or out-of-order delivery of retention lease sync requests. Relates #37165
weizijun pushed a commit to weizijun/elasticsearch that referenced this issue on Feb 22, 2019
This commit introduces retention leases to ESIndexLevelReplicationTestCase, then adds some tests verifying that retention lease replication works correctly in spite of primary failover or out-of-order delivery of retention lease sync requests. Relates elastic#37165
weizijun pushed a commit to weizijun/elasticsearch that referenced this issue on Feb 22, 2019
This commit introduces retention leases to ESIndexLevelReplicationTestCase, then adds some tests verifying that retention lease replication works correctly in spite of primary failover or out-of-order delivery of retention lease sync requests. Relates elastic#37165
This was referenced Feb 27, 2019
DaveCTurner added a commit that referenced this issue on Mar 15, 2019
Today we load the shard history retention leases from disk whenever opening the engine, and treat a missing file as an empty set of leases. However in some cases this is inappropriate: we might be restoring from a snapshot (if the target index already exists then there may be leases on disk) or force-allocating a stale primary, and in neither case does it make sense to restore the retention leases from disk. With this change we write an empty retention leases file during recovery, except for the following cases: - During peer recovery the on-disk leases may be accurate and could be needed if the recovery target is made into a primary. - During recovery from an existing store, as long as we are not force-allocating a stale primary. Relates #37165
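A small sketch of the decision this commit describes (the RecoverySource enum and the method name are hypothetical, not the actual Elasticsearch types): the on-disk leases are kept only for peer recovery and for recovery from an existing store that is not a force-allocated stale primary; every other recovery type starts from an empty retention leases file.

```java
// Illustrative recovery types; real Elasticsearch models these differently.
enum RecoverySource { EMPTY_STORE, EXISTING_STORE, PEER, SNAPSHOT, LOCAL_SHARDS }

final class RetentionLeaseRecoverySketch {

    static boolean keepOnDiskRetentionLeases(RecoverySource source, boolean forceAllocatedStalePrimary) {
        switch (source) {
            case PEER:
                return true; // leases may be needed if this copy is later promoted to primary
            case EXISTING_STORE:
                return forceAllocatedStalePrimary == false; // keep, unless force-allocating a stale primary
            default:
                return false; // snapshot restore, empty store, local shards: write an empty leases file
        }
    }

    public static void main(String[] args) {
        System.out.println(keepOnDiskRetentionLeases(RecoverySource.SNAPSHOT, false));       // false
        System.out.println(keepOnDiskRetentionLeases(RecoverySource.EXISTING_STORE, true));  // false
        System.out.println(keepOnDiskRetentionLeases(RecoverySource.EXISTING_STORE, false)); // true
        System.out.println(keepOnDiskRetentionLeases(RecoverySource.PEER, false));           // true
    }
}
```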
DaveCTurner added a commit that referenced this issue on Mar 15, 2019
Today we load the shard history retention leases from disk whenever opening the engine, and treat a missing file as an empty set of leases. However in some cases this is inappropriate: we might be restoring from a snapshot (if the target index already exists then there may be leases on disk) or force-allocating a stale primary, and in neither case does it make sense to restore the retention leases from disk. With this change we write an empty retention leases file during recovery, except for the following cases: - During peer recovery the on-disk leases may be accurate and could be needed if the recovery target is made into a primary. - During recovery from an existing store, as long as we are not force-allocating a stale primary. Relates #37165
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue on Mar 15, 2019
Today we load the shard history retention leases from disk whenever opening the engine, and treat a missing file as an empty set of leases. However in some cases this is inappropriate: we might be restoring from a snapshot (if the target index already exists then there may be leases on disk) or force-allocating a stale primary, and in neither case does it make sense to restore the retention leases from disk. With this change we write an empty retention leases file during recovery, except for the following cases: - During peer recovery the on-disk leases may be accurate and could be needed if the recovery target is made into a primary. - During recovery from an existing store, as long as we are not force-allocating a stale primary. Relates elastic#37165
DaveCTurner added a commit that referenced this issue on Mar 15, 2019
Today we load the shard history retention leases from disk whenever opening the engine, and treat a missing file as an empty set of leases. However in some cases this is inappropriate: we might be restoring from a snapshot (if the target index already exists then there may be leases on disk) or force-allocating a stale primary, and in neither case does it make sense to restore the retention leases from disk. With this change we write an empty retention leases file during recovery, except for the following cases: - During peer recovery the on-disk leases may be accurate and could be needed if the recovery target is made into a primary. - During recovery from an existing store, as long as we are not force-allocating a stale primary. Relates #37165
DaveCTurner added a commit that referenced this issue on Mar 15, 2019
Today we load the shard history retention leases from disk whenever opening the engine, and treat a missing file as an empty set of leases. However in some cases this is inappropriate: we might be restoring from a snapshot (if the target index already exists then there may be leases on disk) or force-allocating a stale primary, and in neither case does it make sense to restore the retention leases from disk. With this change we write an empty retention leases file during recovery, except for the following cases: - During peer recovery the on-disk leases may be accurate and could be needed if the recovery target is made into a primary. - During recovery from an existing store, as long as we are not force-allocating a stale primary. Relates #37165
Closing since the work to integrate shard history retention leases with recovery is tracked in #41536.
When a shard of a follower index is consuming shard history from the corresponding shard of its leader index, it could be that the history operations are no longer available on any of the leader shard copies. This can happen if some operations were soft deleted and subsequently merged away before the shard of the follower index had a chance to replicate them. This has catastrophic consequences for the follower index, as its only remaining option to recover is a full file-based recovery. In the context of cross-cluster replication, this can potentially be over a WAN with limited networking resources. During this file-based recovery, the follower index becomes unavailable, defeating the purpose of being an available copy of the leader index in another cluster.
One idea towards solving this problem is for the shard of a follower index to be able to leave a marker on the corresponding shard of its leader index noting where in the shard history the follower shard is. This marker would prevent any operation with a sequence number at or above that marker from being merged away.
And thus was born the idea of shard history retention leases. Shard history retention leases are aimed at preventing shard history consumers from having to fall back to expensive file-copy operations when shard history is no longer available from a certain point. These consumers include follower indices in cross-cluster replication, and local shard recoveries. A future consumer will be the changes API.
Further, index lifecycle management needs to coordinate with some of these consumers; otherwise it could remove the source before all consumers have finished reading all operations. The notion of shard history retention leases that we are introducing here will also be used to address this problem.
Shard history retention leases are a property of the replication group, managed under the authority of the primary. A shard history retention lease is a combination of an identifier, a retaining sequence number, a timestamp indicating when the lease was acquired or renewed, and a string indicating the source of the lease. Being leases, they have a limited lifespan and will expire if not renewed. The idea of these leases is that all operations above the minimum of all retaining sequence numbers will be retained during merges (which would otherwise clear away operations that are soft deleted). These leases are periodically persisted to a dedicated state file (originally: to Lucene), restored during recovery, and broadcast to replicas under certain circumstances.
This issue is a meta-issue for tracking the progress of implementing shard history retention leases. We will proceed with implementing shard history retention leases along the following rough plan:
- persist retention leases in Lucene (switched to persisting in a dedicated state file: Introduce retention lease state file #39004)
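As a rough illustration of the lease structure described above (a sketch with assumed names, not the actual org.elasticsearch classes): a lease carries an identifier, a retaining sequence number, a timestamp, and a source; leases expire if not renewed, and the minimum retaining sequence number over all unexpired leases determines which operations must survive merges.

```java
import java.util.Collection;
import java.util.List;

// Hedged sketch of a shard history retention lease; field and method names are illustrative.
record RetentionLeaseSketch(String id, long retainingSequenceNumber, long timestampMillis, String source) {

    /** A lease expires if it has not been renewed within the retention period. */
    boolean isExpired(long nowMillis, long retentionPeriodMillis) {
        return nowMillis - timestampMillis > retentionPeriodMillis;
    }

    /** Lowest sequence number that must be retained to satisfy all unexpired leases. */
    static long minRetainedSequenceNumber(Collection<RetentionLeaseSketch> leases,
                                          long nowMillis,
                                          long retentionPeriodMillis,
                                          long fallback) {
        return leases.stream()
            .filter(lease -> lease.isExpired(nowMillis, retentionPeriodMillis) == false)
            .mapToLong(RetentionLeaseSketch::retainingSequenceNumber)
            .min()
            .orElse(fallback); // with no leases, fall back to the existing retention policy
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        List<RetentionLeaseSketch> leases = List.of(
            new RetentionLeaseSketch("follower-shard-0", 1_000, now, "ccr"),
            new RetentionLeaseSketch("peer-recovery-abc", 950, now - 10_000, "peer recovery"));
        // Operations with a sequence number >= 950 remain available to history consumers.
        System.out.println(minRetainedSequenceNumber(leases, now, 12 * 60 * 60 * 1000L, Long.MAX_VALUE));
    }
}
```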