Update HealthCheckedChannelPool to check KEEP_ALIVE attribute #1476
…f a channel before picking it up in the pool. See #1380
Why does KEEP-ALIVE being false indicate the channel is unhealthy?
Basically, there could be a race condition when a channel is being closed at the same moment it's being acquired, so a non-reusable connection might get picked up by a new request.
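To make the acquire-time check concrete, here is a minimal sketch in plain Java. It is not the SDK's HealthCheckedChannelPool: PooledConn, CheckedPool, and PoolAcquireDemo are hypothetical stand-ins, and the real pool works with Netty channels and futures. It only illustrates the idea that a connection flagged as non-reusable is skipped on acquire even though it still looks active.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a pooled connection; not a Netty type.
final class PooledConn {
    final String name;
    final boolean active;     // the socket still looks open right now
    final Boolean keepAlive;  // null = never set, FALSE = must not be reused

    PooledConn(String name, boolean active, Boolean keepAlive) {
        this.name = name;
        this.active = active;
        this.keepAlive = keepAlive;
    }
}

// Hypothetical pool that health-checks idle connections when they are acquired.
final class CheckedPool {
    private final Deque<PooledConn> idle = new ArrayDeque<>();
    private int created = 0;

    void release(PooledConn conn) {
        idle.addLast(conn);
    }

    PooledConn acquire() {
        PooledConn candidate;
        // Discard idle connections until a healthy one is found.
        while ((candidate = idle.pollFirst()) != null) {
            if (isHealthy(candidate)) {
                return candidate;
            }
        }
        // Nothing reusable was idle: hand out a new connection instead.
        created++;
        return new PooledConn("fresh-" + created, true, null);
    }

    static boolean isHealthy(PooledConn conn) {
        // An explicit keepAlive == false wins over the "still active" state.
        return !Boolean.FALSE.equals(conn.keepAlive) && conn.active;
    }
}

final class PoolAcquireDemo {
    public static void main(String[] args) {
        CheckedPool pool = new CheckedPool();
        pool.release(new PooledConn("closing", true, Boolean.FALSE));
        pool.release(new PooledConn("reusable", true, Boolean.TRUE));
        System.out.println(pool.acquire().name); // reusable
        System.out.println(pool.acquire().name); // fresh-1
    }
}
```

In this sketch an unhealthy connection is simply dropped; a real pool would also need to make sure it gets closed.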
Codecov Report
@@             Coverage Diff              @@
##             master    #1476      +/-   ##
============================================
+ Coverage     73.81%   73.82%   +<.01%
  Complexity      720      720
============================================
  Files           850      850
  Lines         26158    26160       +2
  Branches       2018     2018
============================================
+ Hits          19309    19313       +4
+ Misses         5975     5973       -2
  Partials        874      874
private boolean isHealthy(Channel channel) {
    // There might be cases where the channel is not reusable but still active at the moment
    // See https://github.com/aws/aws-sdk-java-v2/issues/1380
    if (channel.attr(KEEP_ALIVE).get() != null && !channel.attr(KEEP_ALIVE).get()) {
        return false;
    }
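The excerpt above treats KEEP_ALIVE as a three-state value: unset (null), true, or false. Only an explicit false marks the channel as unhealthy. The following is a minimal plain-Java sketch of that logic. FakeChannel and KeepAliveCheckSketch are hypothetical stand-ins rather than Netty or SDK types, and the final isActive() fallback is an assumption about what follows the truncated excerpt.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Netty channel: an active flag plus an attribute map.
final class FakeChannel {
    private final Map<String, Object> attrs = new HashMap<>();
    private boolean active = true;

    boolean isActive() { return active; }
    void setActive(boolean active) { this.active = active; }
    Object attr(String key) { return attrs.get(key); }
    void setAttr(String key, Object value) { attrs.put(key, value); }
}

final class KeepAliveCheckSketch {
    static final String KEEP_ALIVE = "KEEP_ALIVE";

    static boolean isHealthy(FakeChannel channel) {
        // Boolean.FALSE.equals(...) is null-safe: an unset attribute does not
        // count as "not reusable", only an explicit false does.
        if (Boolean.FALSE.equals(channel.attr(KEEP_ALIVE))) {
            return false;
        }
        // Assumption: otherwise fall back to the plain "is it still open" check.
        return channel.isActive();
    }

    public static void main(String[] args) {
        FakeChannel unset = new FakeChannel();
        FakeChannel reusable = new FakeChannel();
        reusable.setAttr(KEEP_ALIVE, Boolean.TRUE);
        FakeChannel notReusable = new FakeChannel();
        notReusable.setAttr(KEEP_ALIVE, Boolean.FALSE);

        System.out.println(isHealthy(unset));       // true
        System.out.println(isHealthy(reusable));    // true
        System.out.println(isHealthy(notReusable)); // false
    }
}
```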
I think this is fine, but is there any reason we don't close the channel after the response is consumed instead?
We do invoke ChannelHandlerContext#close in ResponseHandler for this case, but the channel might not actually be closed yet and can still be active at this point. It's not guaranteed that the channel is closed before it's released into the pool.
Also, because the underlying HttpStreamsClientHandler from netty-reactive-streams does not close the channel if there are still in-flight messages, channel.close might not be initiated at all. (This is what we discussed this morning; we might need to figure out why there are still in-flight messages when the response is complete in some cases.)
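The ordering problem described here can be sketched in plain Java. AsyncCloseChannel and CloseRaceDemo are hypothetical names, and a CompletableFuture stands in for the asynchronous close. The sketch shows only the timeline: close is requested, the channel is still active when it goes back to the pool, and only the KEEP_ALIVE-aware check rejects it during that window.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical channel whose close() is asynchronous: the call returns
// immediately and the channel stays active until completeClose() runs.
final class AsyncCloseChannel {
    private volatile boolean active = true;
    private volatile Boolean keepAlive; // null until explicitly set
    private final CompletableFuture<Void> closeFuture = new CompletableFuture<>();

    boolean isActive() { return active; }
    Boolean keepAlive() { return keepAlive; }
    void markNotReusable() { keepAlive = Boolean.FALSE; }

    // Requesting a close does not change the active state by itself.
    CompletableFuture<Void> close() { return closeFuture; }

    // Simulates the event loop actually finishing the close some time later.
    void completeClose() {
        active = false;
        closeFuture.complete(null);
    }
}

final class CloseRaceDemo {
    static boolean activeOnlyCheck(AsyncCloseChannel ch) {
        return ch.isActive();
    }

    static boolean keepAliveAwareCheck(AsyncCloseChannel ch) {
        return !Boolean.FALSE.equals(ch.keepAlive()) && ch.isActive();
    }

    public static void main(String[] args) {
        AsyncCloseChannel ch = new AsyncCloseChannel();
        ch.markNotReusable();                        // response said: do not reuse
        CompletableFuture<Void> closing = ch.close(); // close requested, not yet done

        // The channel is released to the pool and acquired again in this window.
        System.out.println(activeOnlyCheck(ch));     // true: would be handed out
        System.out.println(keepAliveAwareCheck(ch)); // false: rejected

        ch.completeClose();                          // close finally completes
        System.out.println(closing.isDone());        // true
        System.out.println(activeOnlyCheck(ch));     // false
    }
}
```

If the close is never initiated at all, as in the in-flight message case, the active-only check keeps returning true indefinitely, which is why the attribute check matters.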
Description
Update HealthCheckedChannelPool to check KEEP_ALIVE when acquiring a channel from the pool to avoid soon-to-be inactive channels being picked up by a new request. This should reduce the frequency of IOException: Server failed to complete response errors. See #1380 #1466
Testing
Disabled retries and set maxIdleConnectionTimeout = 1 min on purpose to expose the errors:
Before
After
Screenshots (if appropriate)
Types of changes
Checklist
mvn install succeeds
License