"failed to turn off translog retention" after upgrade #651

Closed
gferrette opened this issue Aug 17, 2020 · 9 comments
Labels
info requested (Further information is requested), question (User requested information)

Comments

@gferrette

Hello,

After upgrading Open Distro from version 1.0.2 to version 1.7.0, the following message appears in the logs on node startup:

[2020-08-14T10:54:21,893][WARN ][o.e.i.s.IndexShard ] [machine] [.tasks][0] failed to turn off translog retention
org.apache.lucene.store.AlreadyClosedException: engine is closed
at org.elasticsearch.index.shard.IndexShard.getEngine(IndexShard.java:2528) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.index.shard.IndexShard.trimTranslog(IndexShard.java:1106) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.index.shard.IndexShard$3.doRun(IndexShard.java:1944) [elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.1.jar:7.6.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]

This message appears for several indices. The indices/shards are not corrupted and are in green state, but the messages show up in the logs on every startup.
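The state can be confirmed with the _cat APIs (shown here in Kibana Dev Tools console syntax; the index name is just one of the affected ones):

```
GET _cat/indices/.tasks?v&h=index,health,status
GET _cat/shards/.tasks?v&h=index,shard,prirep,state
```

Both report green/STARTED despite the warning above.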

Is there any way to solve this issue?

Thanks in advance.

Gabriel.

@peterzhuamazon
Member

Hi @gferrette, I have tried to reproduce the issue on a CentOS 7 server, upgrading the RPM from 1.0.2 to 1.7.0. However, I am not able to reproduce it with simple data on my end.

From the looks of it, this seems like an upstream issue.

We would appreciate it if you could share more information about your setup and logs.

Thanks.

@gferrette
Author

Hello @peterzhuamazon!

Thanks for replying.

This issue seems to be the same as in this thread: https://github.com/opendistro-for-elasticsearch/security/issues/354, but in my case it's happening on several indices, not only on the audit index.
My setup is a test environment with a single node; my elasticsearch.yml is below (certificate info omitted):

#action.destructive_requires_name: true

script.painless.regex.enabled: true

#repositorio snapshot
path.repo: ["/tmp/backup_nodes"]

######## Start OpenDistro for Elasticsearch Security Demo Configuration ########

# WARNING: revise all the lines below before you go into production

opendistro_security.ssl.transport.pemcert_filepath: dummy.pem
opendistro_security.ssl.transport.pemkey_filepath: dummy-key.pem
opendistro_security.ssl.transport.pemtrustedcas_filepath: root-ca.pem
opendistro_security.ssl.transport.enforce_hostname_verification: false
opendistro_security.ssl.http.enabled: false
opendistro_security.ssl.http.pemcert_filepath: dummy.pem
opendistro_security.ssl.http.pemkey_filepath: dummy-key.pem
opendistro_security.ssl.http.pemtrustedcas_filepath: root-ca.pem
#opendistro_security.allow_unsafe_democertificates: true
opendistro_security.allow_default_init_securityindex: true
opendistro_security.authcz.admin_dn:
  - 'DUMMY'
opendistro_security.nodes_dn:
  - 'DUMMY'

opendistro_security.enable_snapshot_restore_privilege: true
opendistro_security.check_snapshot_restore_write_privileges: true
opendistro_security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]
cluster.routing.allocation.disk.threshold_enabled: false
node.max_local_storage_nodes: 3
######## End OpenDistro for Elasticsearch Security Demo Configuration ########

More log info:

[2020-08-17T15:57:03,094][INFO ][o.e.g.GatewayService ] [machine] recovered [27] indices into cluster_state
[2020-08-17T15:57:03,121][INFO ][c.a.o.s.OpenDistroSecurityPlugin] [machine] Node started
[2020-08-17T15:57:03,122][INFO ][c.a.o.s.c.ConfigurationRepository] [machine] Check if .opendistro_security index exists ...
[2020-08-17T15:57:03,122][INFO ][c.a.o.s.c.ConfigurationRepository] [machine] .opendistro_security index does already exist, so we try to load the config from it
[2020-08-17T15:57:03,127][INFO ][c.a.o.s.OpenDistroSecurityPlugin] [machine] 4 Open Distro Security modules loaded so far: [Module [type=REST_MANAGEMENT_API, implementing class=com.amazon.opendistroforelasticsearch.security.dlic.rest.api.OpenDistroSecurityRestApiActions], Module [type=DLSFLS, implementing class=com.amazon.opendistroforelasticsearch.security.configuration.OpenDistroSecurityFlsDlsIndexSearcherWrapper], Module [type=AUDITLOG, implementing class=com.amazon.opendistroforelasticsearch.security.auditlog.impl.AuditLogImpl], Module [type=MULTITENANCY, implementing class=com.amazon.opendistroforelasticsearch.security.configuration.PrivilegesInterceptorImpl]]
[2020-08-17T15:57:03,130][INFO ][c.a.o.s.c.ConfigurationRepository] [machine] Background init thread started. Install default config?: false
[2020-08-17T15:57:04,056][WARN ][o.e.i.s.IndexShard ] [machine] [.tasks][0] failed to turn off translog retention
org.apache.lucene.store.AlreadyClosedException: engine is closed
at org.elasticsearch.index.shard.IndexShard.getEngine(IndexShard.java:2528) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.index.shard.IndexShard.trimTranslog(IndexShard.java:1106) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.index.shard.IndexShard$3.doRun(IndexShard.java:1944) [elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.1.jar:7.6.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]
[2020-08-17T15:57:06,429][WARN ][o.e.i.s.IndexShard ] [machine] [.kibana_-532334581_test_1][0] failed to turn off translog retention
org.apache.lucene.store.AlreadyClosedException: engine is closed
at org.elasticsearch.index.shard.IndexShard.getEngine(IndexShard.java:2528) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.index.shard.IndexShard.trimTranslog(IndexShard.java:1106) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.index.shard.IndexShard$3.doRun(IndexShard.java:1944) [elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.1.jar:7.6.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]

@peterzhuamazon
Member

Hi @gferrette, after discussing with the team, we think this issue is more related to the security repo, as there are already similar issues there. We will transfer this issue to the security repo. Thanks.

@peterzhuamazon peterzhuamazon transferred this issue from opendistro-for-elasticsearch/opendistro-build Aug 17, 2020
@peterzhuamazon peterzhuamazon added the info requested (Further information is requested) and question (User requested information) labels Aug 17, 2020
@dinusX

dinusX commented Aug 17, 2020

Hi @gferrette ,
would you be able to change the log level to DEBUG and paste the stack trace again?
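For example, the relevant logger can be raised to DEBUG at runtime via the cluster settings API (the `o.e.i.s.IndexShard` in the log lines corresponds to `org.elasticsearch.index.shard`; shown in console syntax):

```
PUT _cluster/settings
{
  "transient": {
    "logger.org.elasticsearch.index.shard": "DEBUG"
  }
}
```

A transient setting resets on full cluster restart, so it can be left in place just long enough to capture the trace.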

@gferrette
Author

Hello @dinusX!

Below is the stack trace with the DEBUG log level:

[2020-08-18T10:52:54,720][DEBUG][o.e.c.s.MasterService ] [machine] publishing cluster state version [482]
[2020-08-18T10:52:54,711][DEBUG][o.e.i.t.Translog ] [machine] [.kibana_1298139586_usuarioteste][0] open uncommitted translog checkpoint Checkpoint{offset=55, numOps=0, generation=4, minSeqNo=-1, maxSeqNo=-1, globalCheckpoint=0, minTranslogGeneration=3, trimmedAboveSeqNo=-2}
[2020-08-18T10:52:54,722][DEBUG][o.e.i.t.Translog ] [machine] [.kibana_1298139586_usuarioteste][0] recovered local translog from checkpoint Checkpoint{offset=55, numOps=0, generation=4, minSeqNo=-1, maxSeqNo=-1, globalCheckpoint=0, minTranslogGeneration=3, trimmedAboveSeqNo=-2}
[2020-08-18T10:52:54,707][DEBUG][o.e.i.t.Translog ] [machine] [.kibana_1276559883_testesubidaversao_1][0] recovered local translog from checkpoint Checkpoint{offset=55, numOps=0, generation=7, minSeqNo=-1, maxSeqNo=-1, globalCheckpoint=0, minTranslogGeneration=5, trimmedAboveSeqNo=-2}
[2020-08-18T10:52:54,733][DEBUG][o.e.i.t.Translog ] [machine] [.kibana_1298139586_usuarioteste][0] recovered local translog from checkpoint Checkpoint{offset=55, numOps=0, generation=4, minSeqNo=-1, maxSeqNo=-1, globalCheckpoint=0, minTranslogGeneration=3, trimmedAboveSeqNo=-2}
[2020-08-18T10:52:54,736][DEBUG][o.e.c.c.PublicationTransportHandler] [machine] received diff cluster state version [482] with uuid [kPQDh3sIRgOcMbRvFdobqQ], diff size [252]
[2020-08-18T10:52:54,744][DEBUG][o.e.i.e.Engine ] [machine] [security-auditlog-2020.08.17][0] Safe commit [CommitPoint{segment[segments_4], userData[{history_uuid=6F_7duGVQfWQ09WpZ8xQVw, local_checkpoint=0, max_seq_no=0, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=1, sync_id=KGFfdj5NSmu7Cmu82PshqQ, translog_generation=3, translog_uuid=sE5JbcYESma2dXRitoTphw}]}], last commit [CommitPoint{segment[segments_4], userData[{history_uuid=6F_7duGVQfWQ09WpZ8xQVw, local_checkpoint=0, max_seq_no=0, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=1, sync_id=KGFfdj5NSmu7Cmu82PshqQ, translog_generation=3, translog_uuid=sE5JbcYESma2dXRitoTphw}]}]
[2020-08-18T10:52:54,760][DEBUG][o.e.g.PersistedClusterStateService] [machine] writing cluster state took [0ms]; wrote global metadata [false] and metadata for [0] indices and skipped [27] unchanged indices
[2020-08-18T10:52:54,762][DEBUG][o.e.i.e.Engine ] [machine] [.kibana_1276559883_testesubidaversao_1][0] Safe commit [CommitPoint{segment[segments_4], userData[{history_uuid=FAQMiebKTBiYHhXci0HgUA, local_checkpoint=0, max_seq_no=0, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=1, sync_id=ZBSi3AaQQKScPFa8XIreQw, translog_generation=5, translog_uuid=3Zfcvlm2QjmmJBIeNnzKeQ}]}], last commit [CommitPoint{segment[segments_4], userData[{history_uuid=FAQMiebKTBiYHhXci0HgUA, local_checkpoint=0, max_seq_no=0, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=1, sync_id=ZBSi3AaQQKScPFa8XIreQw, translog_generation=5, translog_uuid=3Zfcvlm2QjmmJBIeNnzKeQ}]}]
[2020-08-18T10:52:54,763][DEBUG][o.e.c.s.ClusterApplierService] [machine] processing [Publication{term=21, version=482}]: execute
[2020-08-18T10:52:54,764][DEBUG][o.e.c.s.ClusterApplierService] [machine] cluster state updated, version [482], source [Publication{term=21, version=482}]
[2020-08-18T10:52:54,764][DEBUG][o.e.c.NodeConnectionsService] [machine] connected to {machine}{PNjawAAZRj-olrAsLoq8TQ}{0VAaApceT7ylZ8IhxY-Bug}{10.0.2.191}{10.0.2.191:9300}{dim}
[2020-08-18T10:52:54,764][DEBUG][o.e.c.s.ClusterApplierService] [machine] apply cluster state with version 482
[2020-08-18T10:52:54,767][DEBUG][o.e.i.s.IndexShard ] [machine] [.tasks][0] turn off the translog retention for the replication group [.tasks][0] as it starts using retention leases exclusively in peer recoveries
[2020-08-18T10:52:54,768][DEBUG][o.e.c.s.ClusterApplierService] [machine] set locally applied cluster state to version 482
[2020-08-18T10:52:54,769][DEBUG][o.e.c.s.ClusterApplierService] [machine] processing [Publication{term=21, version=482}]: took [0s] done applying updated cluster state (version: 482, uuid: kPQDh3sIRgOcMbRvFdobqQ)
[2020-08-18T10:52:54,769][DEBUG][o.e.c.c.C.CoordinatorPublication] [machine] publication ended successfully: Publication{term=21, version=482}
[2020-08-18T10:52:54,769][DEBUG][o.e.c.s.MasterService ] [machine] took [0s] to notify listeners on successful publication of cluster state (version: 482, uuid: kPQDh3sIRgOcMbRvFdobqQ) for [cluster_reroute(async_shard_fetch)]
[2020-08-18T10:52:54,775][DEBUG][o.e.i.e.Engine ] [machine] [.kibana_1298139586_usuarioteste][0] Safe commit [CommitPoint{segment[segments_4], userData[{history_uuid=YBSLGiwlQeiK3RocGmqtwQ, local_checkpoint=0, max_seq_no=0, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=1, sync_id=Tmc9nBuJRaOExp3UzLc3Mg, translog_generation=3, translog_uuid=AJle9vsoRZ-HAKiYeCHd8g}]}], last commit [CommitPoint{segment[segments_4], userData[{history_uuid=YBSLGiwlQeiK3RocGmqtwQ, local_checkpoint=0, max_seq_no=0, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=1, sync_id=Tmc9nBuJRaOExp3UzLc3Mg, translog_generation=3, translog_uuid=AJle9vsoRZ-HAKiYeCHd8g}]}]
[2020-08-18T10:52:54,792][DEBUG][o.e.i.e.Engine ] [machine] [.tasks][0] Safe commit [CommitPoint{segment[segments_a], userData[{history_uuid=FlCD17y_Qeay5P2wxsfYOA, local_checkpoint=3, max_seq_no=3, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=4, sync_id=589jUf1WRcuWeIxbW4Ox7Q, translog_generation=23, translog_uuid=65fd-Z8BT0qPvJa45rf7Tw}]}], last commit [CommitPoint{segment[segments_a], userData[{history_uuid=FlCD17y_Qeay5P2wxsfYOA, local_checkpoint=3, max_seq_no=3, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=4, sync_id=589jUf1WRcuWeIxbW4Ox7Q, translog_generation=23, translog_uuid=65fd-Z8BT0qPvJa45rf7Tw}]}]
[2020-08-18T10:52:54,788][WARN ][o.e.i.s.IndexShard ] [machine] [.tasks][0] failed to turn off translog retention
org.apache.lucene.store.AlreadyClosedException: engine is closed
at org.elasticsearch.index.shard.IndexShard.getEngine(IndexShard.java:2528) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.index.shard.IndexShard.trimTranslog(IndexShard.java:1106) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.index.shard.IndexShard$3.doRun(IndexShard.java:1944) [elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.1.jar:7.6.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]
[2020-08-18T10:52:54,824][DEBUG][o.e.i.s.IndexShard ] [machine] [security-auditlog-2020.08.17][0] state: [RECOVERING]->[POST_RECOVERY], reason [post recovery from shard_store]
[2020-08-18T10:52:54,824][DEBUG][o.e.i.s.IndexShard ] [machine] [security-auditlog-2020.08.17][0] recovery completed from [shard_store], took [287ms]
[2020-08-18T10:52:54,825][DEBUG][o.e.c.a.s.ShardStateAction] [machine] sending [internal:cluster/shard/started] to [PNjawAAZRj-olrAsLoq8TQ] for shard entry [StartedShardEntry{shardId [[security-auditlog-2020.08.17][0]], allocationId [W-rGbGb5TMigtXbkl3t60g], primary term [3], message [after existing store recovery; bootstrap_history_uuid=false]}]
[2020-08-18T10:52:54,826][DEBUG][o.e.c.a.s.ShardStateAction] [machine] [security-auditlog-2020.08.17][0] received shard started for [StartedShardEntry{shardId [[security-auditlog-2020.08.17][0]], allocationId [W-rGbGb5TMigtXbkl3t60g], primary term [3], message [after existing store recovery; bootstrap_history_uuid=false]}]
[2020-08-18T10:52:54,827][DEBUG][o.e.c.s.MasterService ] [machine] executing cluster state update for [shard-started StartedShardEntry{shardId [[security-auditlog-2020.08.17][0]], allocationId [W-rGbGb5TMigtXbkl3t60g], primary term [3], message [after existing store recovery; bootstrap_history_uuid=false]}[StartedShardEntry{shardId [[security-auditlog-2020.08.17][0]], allocationId [W-rGbGb5TMigtXbkl3t60g], primary term [3], message [after existing store recovery; bootstrap_history_uuid=false]}]]

Thanks in advance!

@dinusX

dinusX commented Aug 18, 2020

From the above logs, it seems that you have an index ".tasks" that is failing during ES process boot-up.

@gferrette
Author

Hello @dinusX!

Thanks for replying.

This error occurs on several indices, not only on .tasks. The .tasks index was in green state, but I removed it anyway since ES recreates it when needed. The error continues on other indices, as below:

[2020-08-18T17:53:34,022][DEBUG][o.e.c.s.ClusterApplierService] [machine] processing [Publication{term=24, version=635}]: execute
[2020-08-18T17:53:34,022][DEBUG][o.e.c.s.ClusterApplierService] [machine] cluster state updated, version [635], source [Publication{term=24, version=635}]
[2020-08-18T17:53:34,022][DEBUG][o.e.c.NodeConnectionsService] [machine] connected to {machine}{PNjawAAZRj-olrAsLoq8TQ}{jOVOPUGCT0mHN2QRYj4Y1Q}{10.0.2.191}{10.0.2.191:9300}{dim}
[2020-08-18T17:53:34,022][DEBUG][o.e.c.s.ClusterApplierService] [machine] applying settings from cluster state with version 635
[2020-08-18T17:53:34,022][DEBUG][o.e.c.s.ClusterApplierService] [machine] apply cluster state with version 635
[2020-08-18T17:53:34,023][DEBUG][o.e.i.s.IndexShard ] [machine] [security-auditlog-2020.06.15][0] state: [POST_RECOVERY]->[STARTED], reason [global state is [STARTED]]
[2020-08-18T17:53:34,023][DEBUG][o.e.i.s.IndexShard ] [machine] [.kibana_-532334581_publicoredmine_2][0] turn off the translog retention for the replication group [.kibana_-532334581_publicoredmine_2][0] as it starts using retention leases exclusively in peer recoveries
[2020-08-18T17:53:34,023][DEBUG][o.e.c.a.s.ShardStateAction] [machine] sending [internal:cluster/shard/started] to [PNjawAAZRj-olrAsLoq8TQ] for shard entry [StartedShardEntry{shardId [[.kibana_-532334581_publicoredmine_2][0]], allocationId [2IgravwzSFCNRA-BY4eJBw], primary term [22], message [master {machine}{PNjawAAZRj-olrAsLoq8TQ}{jOVOPUGCT0mHN2QRYj4Y1Q}{10.0.2.191}{10.0.2.191:9300}{dim} marked shard as initializing, but shard state is [POST_RECOVERY], mark shard as started]}]
[2020-08-18T17:53:34,024][DEBUG][o.e.c.a.s.ShardStateAction] [machine] [.kibana_-532334581_publicoredmine_2][0] received shard started for [StartedShardEntry{shardId [[.kibana_-532334581_publicoredmine_2][0]], allocationId [2IgravwzSFCNRA-BY4eJBw], primary term [22], message [master {machine}{PNjawAAZRj-olrAsLoq8TQ}{jOVOPUGCT0mHN2QRYj4Y1Q}{10.0.2.191}{10.0.2.191:9300}{dim} marked shard as initializing, but shard state is [POST_RECOVERY], mark shard as started]}]
[2020-08-18T17:53:34,024][DEBUG][o.e.i.s.IndexShard ] [machine] [.kibana_-532334581_publicoredmine_1][0] turn off the translog retention for the replication group [.kibana_-532334581_publicoredmine_1][0] as it starts using retention leases exclusively in peer recoveries
[2020-08-18T17:53:34,024][DEBUG][o.e.c.a.s.ShardStateAction] [machine] sending [internal:cluster/shard/started] to [PNjawAAZRj-olrAsLoq8TQ] for shard entry [StartedShardEntry{shardId [[.kibana_-532334581_publicoredmine_1][0]], allocationId [-EIVcT1GQEy3cnSYVpCiGA], primary term [22], message [master {machine}{PNjawAAZRj-olrAsLoq8TQ}{jOVOPUGCT0mHN2QRYj4Y1Q}{10.0.2.191}{10.0.2.191:9300}{dim} marked shard as initializing, but shard state is [POST_RECOVERY], mark shard as started]}]
[2020-08-18T17:53:34,024][DEBUG][o.e.c.a.s.ShardStateAction] [machine] [.kibana_-532334581_publicoredmine_1][0] received shard started for [StartedShardEntry{shardId [[.kibana_-532334581_publicoredmine_1][0]], allocationId [-EIVcT1GQEy3cnSYVpCiGA], primary term [22], message [master {machine}{PNjawAAZRj-olrAsLoq8TQ}{jOVOPUGCT0mHN2QRYj4Y1Q}{10.0.2.191}{10.0.2.191:9300}{dim} marked shard as initializing, but shard state is [POST_RECOVERY], mark shard as started]}]
[2020-08-18T17:53:34,025][DEBUG][o.e.i.s.IndexShard ] [machine] [.kibana_92668751_admin_1][0] turn off the translog retention for the replication group [.kibana_92668751_admin_1][0] as it starts using retention leases exclusively in peer recoveries
[2020-08-18T17:53:34,025][DEBUG][o.e.c.s.ClusterApplierService] [machine] set locally applied cluster state to version 635
[2020-08-18T17:53:34,029][WARN ][o.e.i.s.IndexShard ] [machine] [.kibana_92668751_admin_1][0] failed to turn off translog retention
org.apache.lucene.store.AlreadyClosedException: engine is closed
at org.elasticsearch.index.shard.IndexShard.getEngine(IndexShard.java:2528) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.index.shard.IndexShard.trimTranslog(IndexShard.java:1106) ~[elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.index.shard.IndexShard$3.doRun(IndexShard.java:1944) [elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.1.jar:7.6.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.1.jar:7.6.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:830) [?:?]
[2020-08-18T17:53:34,033][DEBUG][o.e.i.e.Engine ] [machine] [.kibana_92668751_admin_1][0] Safe commit [CommitPoint{segment[segments_4], userData[{history_uuid=2dgYcvEuRpSjF2naPHfXyA, local_checkpoint=0, max_seq_no=0, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=1, sync_id=uSgr5MU4QH6yju-GsoF9zA, translog_generation=3, translog_uuid=kyHNsR0nQIGnBo3_O1fAYA}]}], last commit [CommitPoint{segment[segments_4], userData[{history_uuid=2dgYcvEuRpSjF2naPHfXyA, local_checkpoint=0, max_seq_no=0, max_unsafe_auto_id_timestamp=-1, min_retained_seq_no=1, sync_id=uSgr5MU4QH6yju-GsoF9zA, translog_generation=3, translog_uuid=kyHNsR0nQIGnBo3_O1fAYA}]}]

All the indices where this error occurs are in green state.

@dinusX

dinusX commented Aug 18, 2020

If I'm not mistaken, the following commit should fix your warning messages: elastic/elasticsearch#57063

This was fixed in ES 7.7.1+.

From the description it doesn't seem to be a bug, just an unnecessary warning message.
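Until an upgrade to 7.7.1+ is possible, one workaround (an assumption on my part, not something from the linked fix) is to raise that logger's threshold so the warning is not emitted:

```
PUT _cluster/settings
{
  "persistent": {
    "logger.org.elasticsearch.index.shard": "ERROR"
  }
}
```

Note this would also hide other WARN-level messages from the same logger, so it is best removed once the cluster is upgraded.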

@gferrette
Author

Hello @dinusX!

It seems it's only a warning message, according to this thread.

Thanks for your help and for clarifying our questions!
