Docs for translog, history retention and flushing (#46245)
This commit updates the docs about translog retention and flushing to reflect recent changes in how peer recoveries work. It also adds some docs to describe how history is retained for replay using soft deletes and shard history retention leases. Relates #45473
1 parent 5e682a0, commit 0e2a53e
Showing 4 changed files with 219 additions and 108 deletions.
@@ -0,0 +1,72 @@
[[index-modules-history-retention]]
== History retention

{es} sometimes needs to replay some of the operations that were performed on a
shard. For instance, if a replica is briefly offline then it may be much more
efficient to replay the few operations it missed while it was offline than to
rebuild it from scratch. Similarly, {ccr} works by performing operations on the
leader cluster and then replaying those operations on the follower cluster.

At the Lucene level there are really only two write operations that {es}
performs on an index: a new document may be indexed, or an existing document may
be deleted. Updates are implemented by atomically deleting the old document and
then indexing the new document. A document indexed into Lucene already contains
all the information needed to replay that indexing operation, but this is not
true of document deletions. To solve this, {es} uses a feature called _soft
deletes_ to preserve recent deletions in the Lucene index so that they can be
replayed.

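The following toy model illustrates why deletions need to be preserved
explicitly. It is only a sketch, not how Lucene or {es} store anything: the
`ToyShard` class and its `live`, `tombstones` and `history_since` names are
hypothetical. Indexed documents carry their own sequence number, so they can be
replayed from the live data, whereas a deletion leaves nothing behind unless a
tombstone is kept for it.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShard:
    """Toy stand-in for a shard; NOT how Lucene stores documents."""
    next_seq_no: int = 0
    live: dict = field(default_factory=dict)        # doc id -> (seq_no, source)
    tombstones: list = field(default_factory=list)  # (seq_no, doc id)

    def _take_seq_no(self):
        seq_no = self.next_seq_no
        self.next_seq_no += 1
        return seq_no

    def index(self, doc_id, source):
        # The stored document itself records everything needed to replay it.
        self.live[doc_id] = (self._take_seq_no(), source)

    def delete(self, doc_id):
        # Removing the document would lose the operation, so keep a tombstone.
        if self.live.pop(doc_id, None) is not None:
            self.tombstones.append((self._take_seq_no(), doc_id))

    def history_since(self, seq_no):
        """Return replayable operations with a sequence number >= seq_no."""
        ops = [(s, "index", d, src) for d, (s, src) in self.live.items() if s >= seq_no]
        ops += [(s, "delete", d, None) for s, d in self.tombstones if s >= seq_no]
        return sorted(ops, key=lambda op: op[0])


shard = ToyShard()
shard.index("a", {"v": 1})   # seq_no 0
shard.index("b", {"v": 2})   # seq_no 1
shard.delete("a")            # seq_no 2, recorded only as a tombstone
print(shard.history_since(1))
```

Without the `tombstones` list, a copy that missed sequence number 2 would have
no way to learn that document `a` was deleted.
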
{es} only preserves certain recently-deleted documents in the index because a
soft-deleted document still takes up some space. Eventually {es} will fully
discard these soft-deleted documents to free up that space so that the index
does not grow larger and larger over time. Fortunately {es} does not need to be
able to replay every operation that has ever been performed on a shard, because
it is always possible to make a full copy of a shard on a remote node. However,
copying the whole shard may take much longer than replaying a few missing
operations, so {es} tries to retain all of the operations it expects to need to
replay in future.

{es} keeps track of the operations it expects to need to replay in future using
a mechanism called _shard history retention leases_. Each shard copy that might
need operations to be replayed must first create a shard history retention lease
for itself. For example, this shard copy might be a replica of a shard or it
might be a shard of a follower index when using {ccr}. Each retention lease
keeps track of the sequence number of the first operation that the corresponding
shard copy has not received. As the shard copy receives new operations, it
increases the sequence number contained in its retention lease to indicate that
it will not need to replay those operations in future. {es} discards
soft-deleted operations once they are not being held by any retention lease.

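The bookkeeping described above can be sketched as follows. This is a
simplified illustration under stated assumptions, not the {es} implementation:
the `ToyRetentionLeases` class, the `split_history` helper and the lease ids are
hypothetical names. Each lease records the first sequence number its holder has
not yet received, and an operation is only discardable once it lies below the
lowest such number across all leases.

```python
class ToyRetentionLeases:
    """Toy model: lease id -> first sequence number the holder has NOT received."""

    def __init__(self):
        self.retaining_seq_no = {}

    def create(self, lease_id, retaining_seq_no):
        self.retaining_seq_no[lease_id] = retaining_seq_no

    def advance(self, lease_id, retaining_seq_no):
        # A lease only moves forward as its holder receives more operations.
        current = self.retaining_seq_no[lease_id]
        self.retaining_seq_no[lease_id] = max(current, retaining_seq_no)

    def min_retained_seq_no(self, next_seq_no):
        # Assumption of this sketch: with no leases at all, nothing is held back.
        return min(self.retaining_seq_no.values(), default=next_seq_no)


def split_history(soft_deleted_seq_nos, leases, next_seq_no):
    """Split soft-deleted operations into (kept, discardable) lists."""
    floor = leases.min_retained_seq_no(next_seq_no)
    kept = [s for s in soft_deleted_seq_nos if s >= floor]
    discardable = [s for s in soft_deleted_seq_nos if s < floor]
    return kept, discardable


leases = ToyRetentionLeases()
leases.create("replica-1", 0)
leases.create("follower-1", 0)
leases.advance("replica-1", 8)   # replica-1 has received operations 0..7
leases.advance("follower-1", 5)  # follower-1 has received operations 0..4

kept, discardable = split_history([2, 4, 6, 9], leases, next_seq_no=10)
print(kept, discardable)
```

In this sketch the slowest holder, `follower-1`, determines what must be kept:
operations 6 and 9 are retained and operations 2 and 4 may be discarded.
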
If a shard copy fails then it stops updating its shard history retention lease,
which means that {es} will preserve all new operations so they can be replayed
when the failed shard copy recovers. However, retention leases only last for a
limited amount of time. If the shard copy does not recover quickly enough then
its retention lease may expire. This protects {es} from retaining history
forever if a shard copy fails permanently, because once a retention lease has
expired {es} can start to discard history again. If a shard copy recovers after
its retention lease has expired then {es} will fall back to copying the whole
index since it can no longer simply replay the missing history. The expiry time
of a retention lease defaults to `12h` which should be long enough for most
reasonable recovery scenarios.

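The effect of lease expiry on recovery can be sketched like this. The sketch
makes several assumptions: `ToyLease`, `drop_expired` and `recovery_plan` are
hypothetical names, time is passed in explicitly as seconds, and the exact
boundary at which a real lease expires is not modelled. Only the `12h` default
comes from the text above.

```python
from dataclasses import dataclass

RETENTION_LEASE_PERIOD_SECONDS = 12 * 60 * 60  # the `12h` default


@dataclass
class ToyLease:
    lease_id: str
    retaining_seq_no: int
    renewed_at: float  # seconds; when the holder last updated the lease


def drop_expired(leases, now, period=RETENTION_LEASE_PERIOD_SECONDS):
    """Keep only leases that were updated within the retention period."""
    return [lease for lease in leases if now - lease.renewed_at <= period]


def recovery_plan(lease_id, leases):
    """Replay history if the lease survived, otherwise fall back to a full copy."""
    for lease in leases:
        if lease.lease_id == lease_id:
            return f"replay operations from seq_no {lease.retaining_seq_no}"
    return "copy the whole index"


leases = [ToyLease("replica-1", retaining_seq_no=42, renewed_at=0.0)]

# The copy comes back after 1 hour: its lease is still there.
print(recovery_plan("replica-1", drop_expired(leases, now=1 * 3600)))

# The copy comes back after 13 hours: its lease has expired.
print(recovery_plan("replica-1", drop_expired(leases, now=13 * 3600)))
```
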
Soft deletes are enabled by default on indices created in recent versions, but
they can be explicitly enabled or disabled at index creation time. If soft
deletes are disabled then peer recoveries can still sometimes take place by
copying just the missing operations from the translog
<<index-modules-translog-retention,as long as those operations are retained
there>>. {ccr-cap} will not function if soft deletes are disabled.

[float]
=== History retention settings

`index.soft_deletes.enabled`::

Whether or not soft deletes are enabled on the index. Soft deletes can only be
configured at index creation and only on indices created on or after 6.5.0.
The default value is `true`.

`index.soft_deletes.retention_lease.period`::

The maximum length of time to retain a shard history retention lease before
it expires and the history that it retains can be discarded. The default
value is `12h`.
