TCP RST on Stale Socket #65
Comments
Hi, I think JSIP used to make reconnect attempts, but a local-side disconnect can force a massive number of sockets to start reconnecting with each retransmission, which causes sudden havoc because it takes time to recycle sockets on some OSes. I think it makes sense to have this behind a flag, but I am not sure how hard it is to test properly. If I understand correctly, you are sending a request on a stale socket. This transaction should time out (32 seconds by default if your stack is transaction or dialog stateful; configurable), and at the user level you should be able to handle the transaction timeout. At that point it is up to you whether to send the request again. Many applications send periodic OPTIONS requests as a heartbeat. You can also try periodic TCP keepalives (an OS-level setting), but their success varies by OS, and I haven't tested this in years.
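The "handle the timeout at the user level and decide whether to resend" idea could look roughly like the sketch below. It is deliberately independent of JAIN-SIP: `TimeoutRetry`, `Outcome`, and `sendWithRetry` are hypothetical names, and the `Supplier` stands in for whatever the application does to send one request and wait for either a response or the transaction timeout (in JAIN-SIP the timeout would normally arrive through `SipListener.processTimeout`).

```java
import java.util.function.Supplier;

/** Hypothetical user-level helper, not part of JAIN-SIP. */
class TimeoutRetry {
    /** Outcome of one attempt as the application sees it. */
    enum Outcome { RESPONSE, TIMEOUT }

    /**
     * Runs the attempt until it yields a response or maxAttempts is reached.
     * Returns the number of attempts used, or -1 if every attempt timed out.
     */
    static int sendWithRetry(Supplier<Outcome> attempt, int maxAttempts) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.get() == Outcome.RESPONSE) {
                return i; // a response arrived on attempt i
            }
            // TIMEOUT: each new attempt is a new transaction, so the stack
            // gets a chance to open a fresh connection before sending.
        }
        return -1; // give up and let the caller report the failure
    }
}
```

With the default timer mentioned above, each timed-out attempt costs about 32 seconds, so `maxAttempts` should stay small.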
Because a new instance of Bob is listening on the port, it sends a TCP RST immediately. At the user level, the transaction will time out. Ideally, the user-level code should be independent of the transport. On UDP, we don't have to retry after a transaction timeout because we know that the retransmissions were sent. On TCP, the user-level code would have to behave differently and retry. We tried TCP keepalives at the OS level, but there is still a gap where the problem can happen.
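To make the transport-level alternative concrete, here is a minimal sketch of "reconnect once when a write on the cached socket fails", written against plain `java.net` rather than the JAIN-SIP internals. `ReconnectingTcpSender` and its methods are hypothetical names used only for illustration; this is not how the stack currently behaves.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: caches one TCP connection and reconnects once on a write failure. */
class ReconnectingTcpSender {
    private final String host;
    private final int port;
    private Socket socket;   // cached connection, may have gone stale
    private int reconnects;  // number of times a failure forced a new connection

    ReconnectingTcpSender(String host, int port) {
        this.host = host;
        this.port = port;
    }

    synchronized void send(byte[] payload) throws IOException {
        try {
            write(payload);
        } catch (IOException stale) {
            // The cached socket is unusable (for example the peer answered with RST).
            closeQuietly();
            reconnects++;
            write(payload); // single retry on a fresh connection; a second failure propagates
        }
    }

    private void write(byte[] payload) throws IOException {
        if (socket == null) {
            Socket fresh = new Socket();
            try {
                fresh.connect(new InetSocketAddress(host, port), 2000);
            } catch (IOException e) {
                fresh.close();
                throw e;
            }
            socket = fresh;
        }
        OutputStream out = socket.getOutputStream();
        out.write(payload);
        out.flush();
    }

    private void closeQuietly() {
        if (socket != null) {
            try { socket.close(); } catch (IOException ignored) { }
            socket = null;
        }
    }

    synchronized Socket cachedSocket() { return socket; }
    synchronized int reconnectCount() { return reconnects; }
}
```

Two caveats apply. A write to a stale socket can succeed locally, with the RST only surfacing on a later read or write, so this does not close the gap entirely. A blind resend can also duplicate a request that the peer did receive, which matters for non-idempotent SIP methods.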
(cherry picked from commit 3c97c4cb577d4e857ee9d0ca0266ffc85ba8a080)
Setup
Alice uses TCP to communicate with the VIP (Virtual IP) of Bob.
Scenario
Logs
The IOException is handled by ConnectionOrientedMessageChannel within the catch (IOException ex) at line 595.
Experiments
We've disabled gov.nist.javax.sip.CACHE_CLIENT_CONNECTIONS. It helps for new dialogs, but we still have the problem on an existing dialog, where the SIP OPTIONS request is sent to a stale TCP socket.
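For anyone reproducing these experiments, the stack properties involved might be assembled roughly as follows. This is a sketch under assumptions: the stack name `alice` and the class name `StaleSocketExperimentConfig` are made up, and the NIO factory property and class name are given as I recall them from the JAIN-SIP reference implementation documentation, so they are worth checking against the version in use.

```java
import java.util.Properties;

/** Hypothetical helper that collects the properties used in the experiments. */
class StaleSocketExperimentConfig {
    static Properties build(boolean useNio) {
        Properties p = new Properties();
        // Required by JAIN-SIP; the value here is an arbitrary example.
        p.setProperty("javax.sip.STACK_NAME", "alice");
        // Experiment 1: do not cache client connections.
        p.setProperty("gov.nist.javax.sip.CACHE_CLIENT_CONNECTIONS", "false");
        if (useNio) {
            // Experiment 2: NIO message processor (names assumed, see the note above).
            p.setProperty("gov.nist.javax.sip.MESSAGE_PROCESSOR_FACTORY",
                    "gov.nist.javax.sip.stack.NioMessageProcessorFactory");
        }
        return p;
    }
}
```

The resulting `Properties` object would then be passed to `SipFactory.createSipStack` when the stack is created.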
We've switched to NIO and we have the same behavior.
How Should We Handle It?
In SIPTransactionImpl, the retransmission is only activated when the connection is UDP. When using TCP, the error is caught silently. I see 2 options
Should it be fixed in Jain-SIP? What should we do?