Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle throws on tasks submitted to thread pools #28667

Merged
merged 11 commits into from
Feb 15, 2018

Conversation

jasontedor
Copy link
Member

When we submit a task to a thread pool for asynchronous execution, we are returned a future. Since we submitted to go asynchronous, these futures are not inspected for failure (we would have to block a thread to do that). While we have on failure handlers for exceptions that are thrown during execution, we do not handle throwables that are not exceptions and these end up silently lost. This commit adds a check after the runnable returns that inspects the status of the future. If an unhandled throwable occurred during execution, this throwable is propogated out where it will land in the uncaught exception handler.

When we submit a task to a thread pool for asynchronous execution, we
are returned a future. Since we submitted to go asynchronous, these
futures are not inspected for failure (we would have to block a thread
to do that). While we have on failure handlers for exceptions that are
thrown during execution, we do not handle throwables that are not
exceptions and these end up silently lost. This commit adds a check
after the runnable returns that inspects the status of the future. If an
unhandled throwable occurred during execution, this throwable is
propogated out where it will land in the uncaught exception handler.
@jasontedor jasontedor added v6.2.3 and removed v6.2.2 labels Feb 13, 2018
@jasontedor jasontedor requested a review from bleskes February 14, 2018 18:32
* master:
  Backported synced-flush PR to v5.6.8 and v6.2.2
  Move more XContent.createParser calls to non-deprecated version (elastic#28672)
  Move more XContent.createParser calls to non-deprecated version (elastic#28670)
  Build: Group archive and package distribution projects (elastic#28673)
  [DOCS] Add supported token filters
  [TEST] bump timeout in testFetchShardsSkipUnavailable to 5s
  Relax remote check for bwc project checkouts (elastic#28666)
  [TEST] Synchronize searcher list in IndexShardTests
  [TEST] packaging: function to collect debug info (elastic#28608)
  Compute declared versions in a static block
  Docs: Remove references to elasticsearch directory in plugins (elastic#28647)
  Remove snapshot conditional for bwc snapshots (elastic#28657)
  Removed redundant JSON object from Put Mapping docs (elastic#28514)
  Update threadpool.asciidoc target_response_time (elastic#28655)
* master:
  Add a note to the docs that _cat api `help` option cannot be used if an optional url param is used (elastic#28686)
  Lift error finding utility to exceptions helpers
  Change "tweet" type to "_doc" (elastic#28690)
  [Docs] Add missing word in nested.asciidoc (elastic#28507)
  Simplify the Translog constructor by always expecting an existing translog (elastic#28676)
  Upgrade t-digest to 3.2 (elastic#28295) (elastic#28305)
  Add comment explaining lazy declared versions
Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jasontedor jasontedor merged commit 3e846ab into elastic:master Feb 15, 2018
jasontedor added a commit that referenced this pull request Feb 15, 2018
When we submit a task to a thread pool for asynchronous execution, we
are returned a future. Since we submitted to go asynchronous, these
futures are not inspected for failure (we would have to block a thread
to do that). While we have on failure handlers for exceptions that are
thrown during execution, we do not handle throwables that are not
exceptions and these end up silently lost. This commit adds a check
after the runnable returns that inspects the status of the future. If an
unhandled throwable occurred during execution, this throwable is
propogated out where it will land in the uncaught exception handler.

Relates #28667
jasontedor added a commit that referenced this pull request Feb 15, 2018
When we submit a task to a thread pool for asynchronous execution, we
are returned a future. Since we submitted to go asynchronous, these
futures are not inspected for failure (we would have to block a thread
to do that). While we have on failure handlers for exceptions that are
thrown during execution, we do not handle throwables that are not
exceptions and these end up silently lost. This commit adds a check
after the runnable returns that inspects the status of the future. If an
unhandled throwable occurred during execution, this throwable is
propogated out where it will land in the uncaught exception handler.

Relates #28667
jasontedor added a commit that referenced this pull request Feb 15, 2018
When we submit a task to a thread pool for asynchronous execution, we
are returned a future. Since we submitted to go asynchronous, these
futures are not inspected for failure (we would have to block a thread
to do that). While we have on failure handlers for exceptions that are
thrown during execution, we do not handle throwables that are not
exceptions and these end up silently lost. This commit adds a check
after the runnable returns that inspects the status of the future. If an
unhandled throwable occurred during execution, this throwable is
propogated out where it will land in the uncaught exception handler.

Relates #28667
@jasontedor jasontedor deleted the async-errors branch February 15, 2018 17:20
@jasontedor
Copy link
Member Author

Thanks @bleskes.

jasontedor added a commit that referenced this pull request Feb 16, 2018
This test has a race condition. The action listener used to listen for
connections has a guard against being executed twice. However, this
listener can be executed twice. After on success is invoked the test
starts to tear down. At this point, the threads the test forked will
terminate and the remote cluster connection will be closed. However, a
thread forked to the management thread pool by the remote cluster
connection can still be executing and try to continue connecting. This
thread will be cancelled when the remote cluster connection is closed
and this leads to the action listener being invoked again. To address
this, we explicitly check that the reason that on failure was invoked
was cancellation, and we assert that the listener was already previously
invoked. Interestingly, this issue has always been present yet a recent
change (#28667) exposed errors that occur on tasks submitted to the
thread pool and were silently being lost.

Relates #28695
jasontedor added a commit that referenced this pull request Feb 16, 2018
This test has a race condition. The action listener used to listen for
connections has a guard against being executed twice. However, this
listener can be executed twice. After on success is invoked the test
starts to tear down. At this point, the threads the test forked will
terminate and the remote cluster connection will be closed. However, a
thread forked to the management thread pool by the remote cluster
connection can still be executing and try to continue connecting. This
thread will be cancelled when the remote cluster connection is closed
and this leads to the action listener being invoked again. To address
this, we explicitly check that the reason that on failure was invoked
was cancellation, and we assert that the listener was already previously
invoked. Interestingly, this issue has always been present yet a recent
change (#28667) exposed errors that occur on tasks submitted to the
thread pool and were silently being lost.

Relates #28695
jasontedor added a commit that referenced this pull request Feb 16, 2018
This test has a race condition. The action listener used to listen for
connections has a guard against being executed twice. However, this
listener can be executed twice. After on success is invoked the test
starts to tear down. At this point, the threads the test forked will
terminate and the remote cluster connection will be closed. However, a
thread forked to the management thread pool by the remote cluster
connection can still be executing and try to continue connecting. This
thread will be cancelled when the remote cluster connection is closed
and this leads to the action listener being invoked again. To address
this, we explicitly check that the reason that on failure was invoked
was cancellation, and we assert that the listener was already previously
invoked. Interestingly, this issue has always been present yet a recent
change (#28667) exposed errors that occur on tasks submitted to the
thread pool and were silently being lost.

Relates #28695
jasontedor added a commit that referenced this pull request Feb 16, 2018
This test has a race condition. The action listener used to listen for
connections has a guard against being executed twice. However, this
listener can be executed twice. After on success is invoked the test
starts to tear down. At this point, the threads the test forked will
terminate and the remote cluster connection will be closed. However, a
thread forked to the management thread pool by the remote cluster
connection can still be executing and try to continue connecting. This
thread will be cancelled when the remote cluster connection is closed
and this leads to the action listener being invoked again. To address
this, we explicitly check that the reason that on failure was invoked
was cancellation, and we assert that the listener was already previously
invoked. Interestingly, this issue has always been present yet a recent
change (#28667) exposed errors that occur on tasks submitted to the
thread pool and were silently being lost.

Relates #28695
ywelsch added a commit that referenced this pull request Jan 17, 2019
This is a continuation of #28667 and has as goal to convert all executors to propagate errors to the
uncaught exception handler. Notable missing ones were the direct executor and the scheduler. This
commit also makes it the property of the executor, not the runnable, to ensure this property. A big
part of this commit also consists of vastly improving the test coverage in this area.
ywelsch added a commit that referenced this pull request Jan 17, 2019
This is a continuation of #28667 and has as goal to convert all executors to propagate errors to the
uncaught exception handler. Notable missing ones were the direct executor and the scheduler. This
commit also makes it the property of the executor, not the runnable, to ensure this property. A big
part of this commit also consists of vastly improving the test coverage in this area.
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Jan 30, 2019
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

This is a continuation of elastic#28667, elastic#36317 and also fixes elastic#37708.
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Jan 30, 2019
Fixed review comments: removed todo, use FutureUtils.cancel and removed
scheduler task decoration since this adds more complexity than it
benefits.

This is a continuation of elastic#28667, elastic#36137 and also fixes elastic#37708.
henningandersen added a commit that referenced this pull request Jan 31, 2019
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

This is a continuation of #28667, #36137 and also fixes #37708.
henningandersen added a commit to henningandersen/elasticsearch that referenced this pull request Feb 1, 2019
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

In this backport to 6.x, source backwards compatibility is maintained
and some of the changes has therefore not been carried out (notably
the signature change on Processor.Parameters.scheduler).

This is a continuation of elastic#28667, elastic#36137 and also fixes elastic#37708.
henningandersen added a commit that referenced this pull request Feb 4, 2019
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

In this backport to 6.x, source backwards compatibility is maintained
and some of the changes has therefore not been carried out (notably
the signature change on Processor.Parameters.scheduler).

This is a continuation of #28667, #36137 and also fixes #37708.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants