[fix][client] Set fields earlier for correct ClientCnx initialization #19327
Good catch!
Awesome job on the investigation and the fix!
/pulsarbot rerun-failure-checks
Codecov Report
@@ Coverage Diff @@
## master #19327 +/- ##
=============================================
+ Coverage 32.31% 64.10% +31.78%
- Complexity 6350 25912 +19562
=============================================
Files 1644 1818 +174
Lines 123711 133090 +9379
Branches 13487 14640 +1153
=============================================
+ Hits 39983 85311 +45328
+ Misses 77828 39932 -37896
- Partials 5900 7847 +1947
LGTM
Great find!
…apache#19327) (cherry picked from commit 3d8b52a) (cherry picked from commit d5244c7)
Fixes #13923
Motivation
When the Java client's `ClientCnx` is initialized, there is a race to set the target broker and the remote hostname. The target broker is relevant when connecting through the proxy, and the remote hostname is relevant for certain kinds of authentication, like SASL. In most cases, this race resolves in the preferred order, where these values are set before the `ClientCnx#channelActive` method is called. However, when that method is called before they are set, problems occur.

This PR ensures that the fields are set before we call `channel.connect(...)`.

Additional motivation for understanding the correctness of this solution can be found in the relevant netty code:
https://github.com/netty/netty/blob/cbd324c178135a82f23749bc218c2c6ee3a9b140/transport-classes-epoll/src/main/java/io/netty/channel/epoll/AbstractEpollChannel.java#L648-L659
In that block, notice that `isActive` gets set to `true`, the promise gets completed (which likely triggers callbacks), and then the channel active event gets triggered.

Interestingly, the easiest way to reproduce the problematic behavior in the proxy is with the following steps:

1. Change `createConnection(InetSocketAddress logicalAddress, InetSocketAddress physicalAddress, int connectionKey)` so that the callback for `createConnection` is `thenAcceptAsync` instead of `thenAccept`.
2. Run `ProxyTest#testProducer()`.
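The ordering described above can be modeled without netty. The following is a minimal sketch, not the client's actual code: `FakeCnx`, `hostSeenOnActive`, and the queue-backed executor are hypothetical names introduced for illustration, and the deferred executor stands in for "the callback runs later on another thread". It assumes, as in the linked netty block, that the connect promise completes before the channel active event fires.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

final class ConnectRaceSketch {

    /** Hypothetical stand-in for ClientCnx; not the real class. */
    static final class FakeCnx {
        String remoteHostName;    // the field the connect callback sets
        String hostSeenOnActive;  // what channelActive() observed

        void channelActive() {
            hostSeenOnActive = remoteHostName;
        }
    }

    /** Returns the remote hostname that channelActive() observed. */
    static String hostSeenOnActive(boolean asyncCallback) {
        FakeCnx cnx = new FakeCnx();
        CompletableFuture<FakeCnx> connectPromise = new CompletableFuture<>();

        // Deferred executor (assumption): queues tasks to model a callback
        // that is handed to another thread and runs some time later.
        Queue<Runnable> otherThread = new ArrayDeque<>();
        Executor deferred = otherThread::add;

        if (asyncCallback) {
            connectPromise.thenAcceptAsync(c -> c.remoteHostName = "broker-1", deferred);
        } else {
            connectPromise.thenAccept(c -> c.remoteHostName = "broker-1");
        }

        // Same order as the linked netty block: promise first, then the event.
        connectPromise.complete(cnx); // a thenAccept callback runs inline here
        cnx.channelActive();

        // The "other thread" only gets to run after the event has fired.
        while (!otherThread.isEmpty()) {
            otherThread.poll().run();
        }
        return cnx.hostSeenOnActive;
    }

    public static void main(String[] args) {
        System.out.println("thenAccept:      " + hostSeenOnActive(false));
        System.out.println("thenAcceptAsync: " + hostSeenOnActive(true));
    }
}
```

In this model the `thenAccept` variant sees `broker-1` in `channelActive()`, while the `thenAcceptAsync` variant sees `null`. That is the worst-case interleaving; with a real thread pool the outcome would vary from run to run.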
Modifications
Update `ConnectionPool` to pass the logical broker address through to the initialization methods.
Verifying this change
It is challenging to create a pure reproducer for this case. I was easily able to reproduce it by making the callback complete in another thread. Here is the `thenAccept` that I changed to `thenAcceptAsync`: `pulsar/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ConnectionPool.java`, line 247 in 3c06a4a.
At the very least, the current test coverage will ensure this does not introduce a regression.
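For readers who want a concrete picture of the invariant being protected, here is a small sketch of the "set fields, then connect" ordering. It is an illustration under assumptions, not the code in this PR: `FakeCnx` and `connect` are hypothetical, the addresses are made up, and the connect itself is simulated by completing a future and then firing the active event.

```java
import java.net.InetSocketAddress;
import java.util.concurrent.CompletableFuture;

final class SetBeforeConnectSketch {

    /** Hypothetical stand-in for ClientCnx; not the real class. */
    static final class FakeCnx {
        InetSocketAddress targetBroker;
        String remoteHostName;
        boolean fieldsReadyOnActive;

        void channelActive() {
            fieldsReadyOnActive = targetBroker != null && remoteHostName != null;
        }
    }

    /** Hypothetical helper: fill in the fields first, and only then "connect". */
    static CompletableFuture<FakeCnx> connect(InetSocketAddress logicalAddress,
                                              InetSocketAddress physicalAddress) {
        FakeCnx cnx = new FakeCnx();
        // Both values are known before connecting, so no callback is needed.
        cnx.targetBroker = logicalAddress;
        cnx.remoteHostName = physicalAddress.getHostString();

        // Simulated connect: complete the promise, then fire the active event.
        CompletableFuture<FakeCnx> promise = new CompletableFuture<>();
        promise.complete(cnx);
        cnx.channelActive();
        return promise;
    }

    public static void main(String[] args) {
        InetSocketAddress broker = InetSocketAddress.createUnresolved("broker-1", 6650);
        InetSocketAddress proxy = InetSocketAddress.createUnresolved("proxy", 6650);
        FakeCnx cnx = connect(broker, proxy).join();
        System.out.println("fields ready on active: " + cnx.fieldsReadyOnActive);
    }
}
```

Because the fields are assigned before the simulated connect begins, the result no longer depends on which thread runs a later callback, or on when it runs.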
Does this pull request potentially affect one of the following parts:
This is not a breaking change.
Documentation
doc-not-needed
Matching PR in forked repository
PR in forked repository: michaeljmarshall#20