Windows a deadlock on nng_close()
#1827
Comments
SHA-1: aac4dc3
https://gist.github.com/mikisch81/428c4ad87afcc1c8881b282cd5e80eb3
Add the above code for easy debugging.
I also reproduced the crash reported by @leowang552 a couple of times.
The crash is a different issue... it's a sign that we're trying to remove an item from a list when it is no longer on the list. Let's open another ticket for it. (And I'd like to know whether that happens only with IPC on Windows, or also on other platforms or for non-IPC connections, because the Windows named pipe IPC is totally unlike the other transports.)
As to this bug, I have a PR out that I'd appreciate it if folks could try out.
I find the bug is not easily reproduced when the computer's performance is high.
Does this problem only happen with Windows IPC?
@alzix it seems you actually have an item on the pipes list. This differs from the issue reported earlier, where the pipe list was empty. Again, I am wondering if this is exclusive to the IPC dialer.
Ah, I see the original problem was reported on macOS.
In my project I use nng for IPC, thus I cannot provide any input on other protocols.
Interesting. OK, I have pushed another change to the same PR, which attempts to fix a possible issue if the pipes are closed during negotiation (which is likely what is happening here). If you can try that PR branch again, I'd be grateful, @alzix.
Done. I left my insights in the PR comments.
Describe the bug
When calling nng_close() there is sometimes a deadlock which causes nng_close() to hang.
This happens also when using only sync APIs (no AIOs).
Expected Behavior
nng_close() should finish successfully.
Actual Behavior
nng_close() hangs.
To Reproduce
I created a modified version of the reqrep example code here (I use it with IPC transport): https://gist.github.com/mikisch81/428c4ad87afcc1c8881b282cd5e80eb3
In the modified example, the client code starts a thread that just calls nng_close right before calling nng_recv for the reply from the server. After a couple of successful runs the deadlock happens:
mischwar@tlv-mpawy reqrep % ./reqrep client ipc:///tmp/reqrep_test
1712848018 - CLIENT: SENDING DATE REQUEST
1712848018 - CLIENT: WAITING FOR REPLY
1712848018 - CLIENT: CLOSING SOCKET
nng_recv error - Object closed
...
...
1712848018 - CLIENT: SENDING DATE REQUEST
1712848018 - CLIENT: WAITING FOR REPLY
1712848018 - CLIENT: CLOSING SOCKET
nng_recv error - Object closed
1712848018 - CLIENT: SENDING DATE REQUEST
1712848018 - CLIENT: WAITING FOR REPLY
1712848018 - CLIENT: CLOSING SOCKET <--- The thread is stuck in nng_close here
Environment Details
NNG version: 1.7.3
Operating system and version: macOS Sonoma 14.4.1 (but also happens on Windows and Linux)
Compiler and language used: C/C++ clang
Shared or static library - static
Additional context
It seems that on Windows it is a TOCTOU issue in nni_sock_shutdown: it is stuck while s_pipes is already empty.
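A hang in which a shutdown path keeps waiting although the pipe list is already empty is the classic symptom of a check-then-wait race. One common way to avoid it is to check the predicate and wait under the same lock, and to have the removing side update the state and signal under that lock too. The following is a minimal sketch of that pattern using POSIX threads. It is not nng's code and not the fix from the PR: every name in it (sketch_sock, sketch_pipe_removed, sketch_shutdown, sketch_run, and the npipes counter standing in for s_pipes) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>

#define SKETCH_MAX_PIPES 16

/* Hypothetical stand-in for a socket that tracks its open pipes. */
struct sketch_sock {
    pthread_mutex_t mtx;
    pthread_cond_t  cv;
    int             npipes; /* assumption: a counter replaces the pipe list */
};

/* A pipe goes away: change the state and signal while holding the lock,
 * so a waiter can never miss the transition to "empty". */
static void sketch_pipe_removed(struct sketch_sock *s)
{
    pthread_mutex_lock(&s->mtx);
    s->npipes--;
    if (s->npipes == 0) {
        pthread_cond_broadcast(&s->cv);
    }
    pthread_mutex_unlock(&s->mtx);
}

/* Shutdown: the predicate is re-checked under the lock on every wakeup.
 * If the "list" is already empty, the loop body never runs. */
static void sketch_shutdown(struct sketch_sock *s)
{
    pthread_mutex_lock(&s->mtx);
    while (s->npipes > 0) {
        pthread_cond_wait(&s->cv, &s->mtx);
    }
    pthread_mutex_unlock(&s->mtx);
}

/* Thread entry point: simulates one pipe being torn down concurrently. */
static void *sketch_closer(void *arg)
{
    sketch_pipe_removed((struct sketch_sock *) arg);
    return NULL;
}

/* Starts npipes closer threads, runs the shutdown wait, and returns the
 * number of pipes left afterwards (0 when shutdown completed normally). */
static int sketch_run(int npipes)
{
    struct sketch_sock s;
    pthread_t          thr[SKETCH_MAX_PIPES];
    int                i, rc, left;

    assert(npipes >= 0 && npipes <= SKETCH_MAX_PIPES);
    pthread_mutex_init(&s.mtx, NULL);
    pthread_cond_init(&s.cv, NULL);
    s.npipes = npipes;

    for (i = 0; i < npipes; i++) {
        rc = pthread_create(&thr[i], NULL, sketch_closer, &s);
        assert(rc == 0);
        (void) rc;
    }
    sketch_shutdown(&s);
    for (i = 0; i < npipes; i++) {
        pthread_join(thr[i], NULL);
    }

    left = s.npipes;
    pthread_cond_destroy(&s.cv);
    pthread_mutex_destroy(&s.mtx);
    return left;
}
```

Calling sketch_run(0) exercises the "already empty" case, where the shutdown wait must return immediately. Calling it with a positive count exercises the race between removal and waiting.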