When stopping via systemd only kill the JVM, not its control group #25195

droberts195 · 2017-06-13T12:50:16Z

This prevents possible race conditions between the Elasticsearch JVM and
plugin native controller processes that can cause the Elasticsearch shutdown
to hang. The problem can happen when the JVM and the controller process
receive a SIGTERM at almost the same time.

(There's an assumption here that Elasticsearch will continue to use other
mechanisms to kill native controller processes.)
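
For readers less familiar with systemd, the change described above can be sketched as a unit-file fragment. This is an illustration only: it assumes the standard systemd `KillMode=` directive is the mechanism used. The `KillSignal` line is taken from the existing unit file; the `KillMode` line is the assumed addition.

```ini
[Service]
# SIGTERM signal is used to stop the Java process
KillSignal=SIGTERM

# Assumption for illustration: with KillMode=process, systemd sends the
# stop signal only to the unit's main process (here the JVM). The default,
# KillMode=control-group, signals every process in the unit's control
# group, which would include any native controller processes.
KillMode=process
```

With a setting like this, the native controller processes no longer receive SIGTERM from systemd at the same moment as the JVM, so their shutdown depends entirely on the JVM stopping them itself, as the parenthetical above notes.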

droberts195 · 2017-06-13T12:55:42Z

I checked whether this also affects the old-style init.d script, and it doesn't. That script already kills only the JVM and relies on the JVM to do whatever other killing is necessary.

droberts195 · 2017-06-13T14:21:53Z

The problem this fixes was reported in this forum thread: https://discuss.elastic.co/t/disabling-machine-learning-does-not-allow-elasticsearch-to-stop/88869

jasontedor

LGTM. I left a comment about the comment; I trust your judgement on whether to address it. I also left you another comment via another channel.

jasontedor · 2017-06-14T01:09:11Z

distribution/src/main/packaging/systemd/elasticsearch.service

@@ -52,6 +52,9 @@ TimeoutStopSec=0
 # SIGTERM signal is used to stop the Java process
 KillSignal=SIGTERM

+# Send the signal only to the JVM rather than its process group


To be pedantic, this should say control group.

droberts195 · 2017-06-14

Thanks, I changed the comment.



droberts195 added the :Delivery/Packaging, >bug, review, v5.4.2, v5.5.0, v5.6.0, v6.0.0 labels Jun 13, 2017

droberts195 requested a review from jasontedor June 13, 2017 12:50

jasontedor approved these changes Jun 14, 2017


droberts195 force-pushed the systemd_shutdown_kill_process_only branch from 1fb3d34 to c497cdf Compare June 14, 2017 08:20

droberts195 changed the title ~~When stopping via systemd only kill the JVM, not its process group~~ When stopping via systemd only kill the JVM, not its control group Jun 14, 2017

droberts195 force-pushed the systemd_shutdown_kill_process_only branch from c497cdf to 5cbd6da Compare June 14, 2017 08:22

droberts195 merged commit a5658c0 into elastic:master Jun 14, 2017

droberts195 deleted the systemd_shutdown_kill_process_only branch June 14, 2017 08:23

clintongormley added v6.0.0-beta1 and removed v6.0.0 labels Jul 25, 2017

mark-vieira added the Team:Delivery label Nov 11, 2020
