Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Fix reverse named pipe server resume hang. (#2573)
Running a reverse connect (--diagnostic-port) against a runtime running with nosuspend, hangs/deadlocks IPC client and IPC server thread on start collect -> resume command sequence. Happens since IPC server thread will write out initial header data into stream but client only waits for session id and will then issue a resume command. If the reverse server is not setup with any buffering this will cause IPC server thread to block when writing into session, server thread won’t be able to handle incoming resume command, and client won’t read anything from session unblocking server thread, since its waiting on completion of resume command. This is not an issue on Unix domain, TCP/IP sockets since it by default will allocate smaller in/out buffers, preventing server to block. It’s not an issue on runtime IPC named pipe listener port’s, since they are allocated with 16 KB in/out buffers. It is however an issue on reverse servers created by IpcWindowsNamedPipeServerTransport since those will default to 0 byte in/out buffers, triggering the issue. Fix increase the default size of in/out buffers in sync with default named pipe in/out buffers used by runtime, 16KB. Not an issue on CI since CI uses its own implementation of reverse named pipe server, using 16 KB as in/out buffers as well. Keeping Unix Domain, TCP/IP sockets as is, since runtime also uses defaults in its implementation. CI on the other hand uses 16 KB for Unix Domain Socket in/out buffers, something IpcTcpSocketServerTransport could do as well, but could be adjusted if ever needed. The number of bytes written into stream when starting up streaming is small (xxx) and should fit into the default buffer sizes used by Unix Domain, TCP/IP sockets. This fix is however just fixing the symptom of the underlying issue, that the IPC server thread writes into a stream that it expects reader to consume or it might block from processing further of commands. One alternative runtime fix would be to move the write of stream init data into streaming thread, but since we also write all rundown data on IPC server thread it won’t solve the complete problem since stop collect command could cause same issues. If client doesn’t have a consumer reading the data on the stopped stream, while client is stopping it in parallel, it will block. This is much worse scenario since the size of the data written back as part of rundown will most likely be bigger than any IPC buffer sizes. Changing this behaviour in runtime requires some re-architecture of current IPC infrastructure moving write into trace session away from server thread.
- Loading branch information