Improve handling of poolboy timeouts during ping requests #763
The riak_cs_wm_ping webmachine resource is designed to block for a timeout
period while attempting to check out a Riak connection from the poolboy
request_pool. If the timeout expires before a pool connection is checked out,
the resource is meant to establish a direct connection to Riak and attempt the
ping over that connection.
In practice, the process handling the ping request crashes when the timeout
expires. poolboy:checkout calls gen_fsm:sync_send_event with a timeout
parameter, and when that timeout expires the call ends in exit(timeout). The
exit is not caught, so the request process crashes and a 500 error is returned
to the client.
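
The crash mechanism can be reproduced without poolboy. The following is a minimal sketch, not code from this change: the module name `timeout_exit_demo` and the `silent_server` process are hypothetical, and `gen_server:call/3` stands in for `gen_fsm:sync_send_event/3` on the assumption that both exit the caller in the same way when the timeout expires.

```erlang
-module(timeout_exit_demo).
-export([run/0]).

%% Hypothetical stand-in for a pool that never answers a checkout request.
%% It ignores everything except the atom 'stop'.
silent_server() ->
    receive
        stop -> ok
    end.

%% Issue a call with a 100 ms timeout from a monitored process and
%% return the reason that process exited with.
run() ->
    Server = spawn(fun silent_server/0),
    {Pid, Ref} = spawn_monitor(fun() ->
        %% No reply ever arrives, so this call exits the calling process.
        gen_server:call(Server, checkout, 100)
    end),
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            Server ! stop,
            Reason
    end.
```

The reason returned by `timeout_exit_demo:run()` should be a tuple whose first element is the atom `timeout`, which is the kind of uncaught exit that takes down the request process described above.
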
This change modifies the riak_cs_wm_ping resource to catch the exit on timeout
and return the atom full. The direct connection logic can then execute and
either return success to the user or a 503 error indicating that the system is
too heavily loaded.
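
The catch-and-return-full approach described above might look roughly like the following. This is a sketch under stated assumptions rather than the code in this change: the module `ping_checkout_sketch`, the function `safe_checkout/1`, and the `CheckoutFun` wrapper are hypothetical names, and both a bare `timeout` and a `{timeout, _}` exit reason are matched because the exact shape of the reason is an assumption here.

```erlang
-module(ping_checkout_sketch).
-export([safe_checkout/1, demo/0]).

%% CheckoutFun is a hypothetical zero-arity fun wrapping the real pool call,
%% for example: fun() -> poolboy:checkout(request_pool, true, Timeout) end.
%% A timeout exit is translated into the atom 'full' so the caller can fall
%% back to a direct connection instead of crashing.
safe_checkout(CheckoutFun) ->
    try
        CheckoutFun()
    catch
        exit:timeout -> full;
        exit:{timeout, _} -> full
    end.

%% Exercise the wrapper with simulated checkouts; no pool is required.
demo() ->
    %% Simulated timeouts are converted to 'full'.
    full = safe_checkout(fun() -> exit(timeout) end),
    full = safe_checkout(fun() -> exit({timeout, simulated_call}) end),
    %% A successful checkout is passed through unchanged.
    Pid = self(),
    Pid = safe_checkout(fun() -> Pid end),
    ok.
```

Only timeout exits are caught, so any other failure in the checkout still propagates and is not mistaken for a full pool.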