Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZOOKEEPER-3706: ZooKeeper.close() would leak SendThread when the netw… #1235

Closed
wants to merge 1 commit into from

Conversation

yfxhust
Copy link

@yfxhust yfxhust commented Jan 25, 2020

…ork is broken

  • add unit test to verify the bug
  • bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin [email protected]

@maoling
Copy link
Member

maoling commented Jan 31, 2020

@yfxhust
Great issue description in JIRA. this issue is a little hidden. Do you observe it in the production?

@yfxhust
Copy link
Author

yfxhust commented Feb 3, 2020

@yfxhust
Great issue description in JIRA. this issue is a little hidden. Do you observe it in the production?

We observed occasional SendThread leak case in our production. But the root cause is not 100 percent sure. But from source code and some network clues, we believe that this issue is a potential cause.

@hanm
Copy link
Contributor

hanm commented Feb 3, 2020

nice work - left two comments. since the request timeout is a feature disabled by default, the impact of this bug is limited to those who enabled this feature only.

@yfxhust
Copy link
Author

yfxhust commented Feb 4, 2020

nice work - left two comments. since the request timeout is a feature disabled by default, the impact of this bug is limited to those who enabled this feature only.

Thanks for your comments

@hanm
Copy link
Contributor

hanm commented Feb 5, 2020

left two more comments - current approach should work but I am wondering if my proposal would sound better in term of more structurally solve the race condition around the state variable; also was a little bit concerning around RuntimeException we throw and the implications it brought (w.r.t resource leaks)

@yfxhust
Copy link
Author

yfxhust commented Feb 6, 2020

left two more comments - current approach should work but I am wondering if my proposal would sound better in term of more structurally solve the race condition around the state variable; also was a little bit concerning around RuntimeException we throw and the implications it brought (w.r.t resource leaks)

Thank you for your comments. I adapt my code with your suggestions. Please review the latest patch.

@yfxhust yfxhust force-pushed the ZOOKEEPER-3706 branch 3 times, most recently from f787f2b to 95accad Compare February 10, 2020 08:18
@yfxhust yfxhust force-pushed the ZOOKEEPER-3706 branch 2 times, most recently from d1e235b to 16500a6 Compare February 17, 2020 03:48
Copy link
Member

@maoling maoling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hanm Do you have other concerns?

@hanm
Copy link
Contributor

hanm commented Feb 19, 2020

I just have two nits - one on the sync approach and the other on unit test.

@yfxhust
Copy link
Author

yfxhust commented Feb 19, 2020

I just have two nits - one on the sync approach and the other on unit test.

Thank you for your comments. Adapt my code with your suggestion. Please see the latest patch.

@yfxhust
Copy link
Author

yfxhust commented Feb 25, 2020

@hanm Any further comments ?

Copy link
Contributor

@hanm hanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Any feedback from other committers? If not, I'll land this in next few days.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work.
I left a few comments

@yfxhust
Copy link
Author

yfxhust commented Mar 2, 2020

@eolivelli Thank you for your comments. I adapt the patch with your suggestions. Please review the latest patch.

…ork is broken

- add unit test to verify the bug
- wrap state modification in SendThread.changeZkState()
- SendThread.startConnect() could be interrupted by IOException caused by SendThread.changeZkState()

Author: Fangxi Yin <[email protected]>
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hanm please go ahead and merge if you are okay with the latest version

@asfgit asfgit closed this in d3e72c6 Mar 3, 2020
@hanm
Copy link
Contributor

hanm commented Mar 3, 2020

merged to master - thanks for driving this @yfxhust

stickyhipp pushed a commit to stickyhipp/zookeeper that referenced this pull request Aug 19, 2020
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <[email protected]>

Reviewers: Michael Han <[email protected]>, Enrico Olivelli <[email protected]>, maoling

Closes apache#1235 from yfxhust/ZOOKEEPER-3706
asfgit pushed a commit that referenced this pull request May 17, 2022
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <[email protected]>

Reviewers: Michael Han <[email protected]>, Enrico Olivelli <[email protected]>, maoling

Closes #1235 from yfxhust/ZOOKEEPER-3706
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <[email protected]>

Reviewers: Michael Han <[email protected]>, Enrico Olivelli <[email protected]>, maoling

Closes apache#1235 from yfxhust/ZOOKEEPER-3706
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <[email protected]>

Reviewers: Michael Han <[email protected]>, Enrico Olivelli <[email protected]>, maoling

Closes apache#1235 from yfxhust/ZOOKEEPER-3706
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Aug 31, 2022
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <[email protected]>

Reviewers: Michael Han <[email protected]>, Enrico Olivelli <[email protected]>, maoling

Closes apache#1235 from yfxhust/ZOOKEEPER-3706
RokLenarcic pushed a commit to RokLenarcic/zookeeper that referenced this pull request Sep 3, 2022
…ork is broken

- add unit test to verify the bug
- bypass the SendThread.startConnect() by throw RuntimeExcepth if state.isAlive is false

Author: Fangxi Yin <yinfangxikuaishou.com>

Author: yinfangxi <[email protected]>

Reviewers: Michael Han <[email protected]>, Enrico Olivelli <[email protected]>, maoling

Closes apache#1235 from yfxhust/ZOOKEEPER-3706
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants