_recovery_source sometimes remains after merge #82595

Comments
Pinging @elastic/es-distributed (Team:Distributed)
I'm seeing the exact same behavior when trying to exclude fields from _source. Upon inspecting the index, and based on the size of […]
@elasticmachine is there any progress on this issue?
You can get information about the retention leases and sequence numbers with the following command:
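A command along these lines should produce it, using the shard-level index stats API; the exact filter_path here is an assumption, chosen to match the shape of the output below:

# shard-level stats, trimmed to the commit, seq_no, and retention-lease sections
GET /<index-name>/_stats?level=shards&filter_path=indices.*.shards.*.commit,indices.*.shards.*.seq_no,indices.*.shards.*.retention_leases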
Can you share the output of that command here? Edit to add: could you also share the full breakdown of disk usage for your index:
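That breakdown comes from the analyze index disk usage API, which requires run_expensive_tasks=true (the index name is a placeholder):

# analyze per-field disk usage for the index
POST /<index-name>/_disk_usage?run_expensive_tasks=true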
David, thank you for your help! Here is the info on the retention leases on the index in question:

{
"indices": {
"proposals.proposals.vector_20221119": {
"shards": {
"0": [
{
"commit": {
"id": "FOt/dGPF1B6NxwFV1EWNlw==",
"generation": 682,
"user_data": {
"local_checkpoint": "14491789",
"es_version": "8.4.1",
"min_retained_seq_no": "14484118",
"max_seq_no": "14491789",
"history_uuid": "8ZXxEfCfRVibbeCI0-hD2Q",
"max_unsafe_auto_id_timestamp": "-1",
"translog_uuid": "FyoeUjQ1T8yBf-fZQ8gqsQ"
},
"num_docs": 3939193
},
"seq_no": {
"max_seq_no": 14491789,
"local_checkpoint": 14491789,
"global_checkpoint": 14491789
},
"retention_leases": {
"primary_term": 1,
"version": 78196,
"leases": [
{
"id": "peer_recovery/vGiOGPHoQnSNmndhy2Np1A",
"retaining_seq_no": 14491790,
"timestamp": 1670374822751,
"source": "peer recovery"
},
{
"id": "peer_recovery/23uocumgThOiaraPXP_JNA",
"retaining_seq_no": 14491790,
"timestamp": 1670374822751,
"source": "peer recovery"
}
]
}
},
{
"commit": {
"id": "0tJKDmvaz43/5YosYvy9nQ==",
"generation": 678,
"user_data": {
"local_checkpoint": "14491789",
"es_version": "8.4.1",
"min_retained_seq_no": "14487455",
"max_seq_no": "14491789",
"history_uuid": "8ZXxEfCfRVibbeCI0-hD2Q",
"max_unsafe_auto_id_timestamp": "-1",
"translog_uuid": "YU3iTTrPSumU5E7tgGwPlw"
},
"num_docs": 3939193
},
"seq_no": {
"max_seq_no": 14491789,
"local_checkpoint": 14491789,
"global_checkpoint": 14491789
},
"retention_leases": {
"primary_term": 1,
"version": 78196,
"leases": [
{
"id": "peer_recovery/vGiOGPHoQnSNmndhy2Np1A",
"retaining_seq_no": 14491790,
"timestamp": 1670374822751,
"source": "peer recovery"
},
{
"id": "peer_recovery/23uocumgThOiaraPXP_JNA",
"retaining_seq_no": 14491790,
"timestamp": 1670374822751,
"source": "peer recovery"
}
]
}
}
],
"1": [
{
"commit": {
"id": "wlSNJOgD2Jm4Ms4eMN8n1w==",
"generation": 682,
"user_data": {
"local_checkpoint": "14526770",
"min_retained_seq_no": "14521481",
"es_version": "8.4.1",
"max_seq_no": "14526770",
"translog_uuid": "7txDNt4ITMarfIm-NbPGZw",
"max_unsafe_auto_id_timestamp": "-1",
"history_uuid": "E-gKvtUtSTS0Ff5ABOv0lQ"
},
"num_docs": 3941107
},
"seq_no": {
"max_seq_no": 14526770,
"local_checkpoint": 14526770,
"global_checkpoint": 14526770
},
"retention_leases": {
"primary_term": 2,
"version": 78123,
"leases": [
{
"id": "peer_recovery/3fQTuJNpQOeCg9k2zQI6Rg",
"retaining_seq_no": 14526771,
"timestamp": 1670374822751,
"source": "peer recovery"
},
{
"id": "peer_recovery/vGiOGPHoQnSNmndhy2Np1A",
"retaining_seq_no": 14526771,
"timestamp": 1670374822751,
"source": "peer recovery"
}
]
}
},
{
"commit": {
"id": "0tJKDmvaz43/5YosYvy9ng==",
"generation": 683,
"user_data": {
"local_checkpoint": "14526770",
"es_version": "8.4.1",
"min_retained_seq_no": "14524098",
"max_seq_no": "14526770",
"history_uuid": "E-gKvtUtSTS0Ff5ABOv0lQ",
"max_unsafe_auto_id_timestamp": "-1",
"translog_uuid": "1rZPK20OQ66Jk7rHtjAoAg"
},
"num_docs": 3941107
},
"seq_no": {
"max_seq_no": 14526770,
"local_checkpoint": 14526770,
"global_checkpoint": 14526770
},
"retention_leases": {
"primary_term": 2,
"version": 78123,
"leases": [
{
"id": "peer_recovery/3fQTuJNpQOeCg9k2zQI6Rg",
"retaining_seq_no": 14526771,
"timestamp": 1670374822751,
"source": "peer recovery"
},
{
"id": "peer_recovery/vGiOGPHoQnSNmndhy2Np1A",
"retaining_seq_no": 14526771,
"timestamp": 1670374822751,
"source": "peer recovery"
}
]
}
}
],
"2": [
{
"commit": {
"id": "wlSNJOgD2Jm4Ms4eMN8nrQ==",
"generation": 678,
"user_data": {
"local_checkpoint": "14375247",
"min_retained_seq_no": "14370437",
"es_version": "8.4.1",
"max_seq_no": "14375247",
"translog_uuid": "wmeJxF9lSOaUJ0hDigqE1g",
"max_unsafe_auto_id_timestamp": "-1",
"history_uuid": "UCaiMhAXR4-zJHNO54tC4A"
},
"num_docs": 3940119
},
"seq_no": {
"max_seq_no": 14375247,
"local_checkpoint": 14375247,
"global_checkpoint": 14375247
},
"retention_leases": {
"primary_term": 3,
"version": 78244,
"leases": [
{
"id": "peer_recovery/3fQTuJNpQOeCg9k2zQI6Rg",
"retaining_seq_no": 14375248,
"timestamp": 1670374825724,
"source": "peer recovery"
},
{
"id": "peer_recovery/23uocumgThOiaraPXP_JNA",
"retaining_seq_no": 14375248,
"timestamp": 1670374825724,
"source": "peer recovery"
}
]
}
},
{
"commit": {
"id": "FOt/dGPF1B6NxwFV1EWNmA==",
"generation": 678,
"user_data": {
"local_checkpoint": "14375247",
"es_version": "8.4.1",
"min_retained_seq_no": "14374549",
"max_seq_no": "14375247",
"history_uuid": "UCaiMhAXR4-zJHNO54tC4A",
"max_unsafe_auto_id_timestamp": "-1",
"translog_uuid": "hefqLf7iQXaY0ryQuLUAGQ"
},
"num_docs": 3940119
},
"seq_no": {
"max_seq_no": 14375247,
"local_checkpoint": 14375247,
"global_checkpoint": 14375247
},
"retention_leases": {
"primary_term": 3,
"version": 78244,
"leases": [
{
"id": "peer_recovery/3fQTuJNpQOeCg9k2zQI6Rg",
"retaining_seq_no": 14375248,
"timestamp": 1670374825724,
"source": "peer recovery"
},
{
"id": "peer_recovery/23uocumgThOiaraPXP_JNA",
"retaining_seq_no": 14375248,
"timestamp": 1670374825724,
"source": "peer recovery"
}
]
}
}
]
}
}
}
}

Unfortunately, the disk analysis command would not complete due to a proxy error. What I can do is create another index with exactly the same mapping, index say 10,000 documents, and then run the disk analysis. I was able to run this command on smaller indexes; actually, that's how I found out about _recovery_source.
Darn proxy.

Watch out - using smaller indices with […]
We haven't seen anything to suggest that there's a problem with the logic to remove the _recovery_source field.
Pinging @elastic/es-search (Team:Search)

Pinging @elastic/es-storage-engine (Team:StorageEngine)
If _source is disabled or filtered in the mappings, we add a _recovery_source field to support shard recoveries and CCR. Once it's no longer needed, future merges will drop the _recovery_source field to reclaim space.
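For illustration, a mapping of the kind that triggers this behavior; the index and field names here are hypothetical, and filtering _source with excludes instead of disabling it entirely has the same effect:

PUT /my-index
{
  "mappings": {
    "_source": { "enabled": false },
    "properties": {
      "emb": {
        "type": "dense_vector",
        "dims": 96,
        "index": true,
        "similarity": "l2_norm"
      }
    }
  }
}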
In certain cases, it appears that _recovery_source can stick around even after a merge. I noticed this issue through the dense vector rally track. The command in question indexes 100,000 documents with _source disabled, then force merges to 1 segment; a sketch of it follows below.
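A sketch of the kind of rally invocation involved; the track name, flags, and target host here are assumptions, not the original command:

# run the dense_vector rally track against an already-running local cluster
esrally race --track=dense_vector --pipeline=benchmark-only --target-hosts=127.0.0.1:9200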
At the end, the shard was larger than expected. Using the disk usage API, we see this is due to recovery source.
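The call behind that check is the analyze index disk usage API (index name is a placeholder); the _recovery_source entry in its per-field breakdown shows how many bytes the field occupies:

# per-field disk usage, including _recovery_source
POST /my-index/_disk_usage?run_expensive_tasks=true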
There are no replicas, so the force merge should have removed recovery source. I can reproduce this with both 1 and 2 shards. I haven't found a small-scale reproduction yet.
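For reference, the force merge down to a single segment looks like this (hypothetical index name):

# merge the index down to one segment per shard
POST /my-index/_forcemerge?max_num_segments=1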