Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update socketio.c #976

Closed
wants to merge 1 commit into from
Closed

Conversation

itnic
Copy link

@itnic itnic commented Dec 2, 2017

Issue #787

There is a bug in the Microsoft POSIX compatibility layer used to
translate POSIX socket API to Windows socket API. The bug concerns the
emulated function accept() (socketio_accept() in the file socketio.c),
which may return an invalid socket and may lock the program in an
infinite loop.

A race is happening between the Windows kernel signaling a socket
issue to the POSIX compatibility layer. If a connection to the ssh
service is dropped before being fully handled by the POSIX
compatibility layer, that layer may reach a state where the Windows
kernel is aware of the dropped connection and update all the socket
states of the Windows socket API. However, the POSIX compatibility
layer is in the middle of the accept() function task and fails without
updating the local state of the emulated POSIX socket.

When the emulated accept() exits, the current socket state is not
updated and the emulated select() call would carry on detecting
activity (previously already detected) on the socket, triggering the
same exact error in the accept() call, running then in an infinite
loop.

This may not happen all the time: if the Windows kernel signals the
error before the compatibility POSIX layer has setup its accept()
state and returned successfully, other calls such as send() and recv()
will update the state of the socket and will handle the issue.

However, on a loaded machine where the synchronization between the
Windows socket API and the compatibility POSIX layer may be slower,
the emulated accept() call may finish before being notified by the
Windows kernel of a client disconnection. This may be triggered with
nmap, which makes two TCP connections in a row quickly: one to detect
an opened port, and a second to retrieve the ssh banner.

The fix proposed is to update the emulated accept() function to modify
the internal state of the compatibility POSIX layer as a regular POSIX
kernel would do. I personally had this issue and this patch fixed it.

Please note that we have identified other situations where it could
potentially happen (in particular the socketio_setsockopt() call). But
we haven't investigated this issue more deeply. Other issues such as
#414 or #606 may be related.

Issue 787

There is a bug in the Microsoft POSIX compatibility layer used to
translate POSIX socket API to Windows socket API. The bug concerns the
emulated function accept() (socketio_accept() in the file socketio.c),
which may return an invalid socket and may lock the program in an
infinite loop.

A race is happening between the Windows kernel signaling a socket
issue to the POSIX compatibility layer. If a connection to the ssh
service is dropped before being fully handled by the POSIX
compatibility layer, that layer may reach a state where the Windows
kernel is aware of the dropped connection and update all the socket
states of the Windows socket API. However, the POSIX compatibility
layer is in the middle of the accept() function task and fails without
updating the local state of the emulated POSIX socket.

When the emulated accept() exits, the current socket state is not
updated and the emulated select() call would carry on detecting
activity (previously already detected) on the socket, triggering the
same exact error in the accept() call, running then in an infinite
loop.

This may not happen all the time: if the Windows kernel signals the
error before the compatibility POSIX layer has setup its accept()
state and returned successfully, other calls such as send() and recv()
will update the state of the socket and will handle the issue.

However, on a loaded machine where the synchronization between the
Windows socket API and the compatibility POSIX layer may be slower,
the emulated accept() call may finish before being notified by the
Windows kernel of a client disconnection. This may be triggered with
nmap, which makes two TCP connections in a row quickly: one to detect
an opened port, and a second to retrieve the ssh banner.

The fix proposed is to update the emulated accept() function to modify
the internal state of the compatibility POSIX layer as a regular POSIX
kernel would do. I personally had this issue and this patch fixed it.

Please note that we have identified other situations where it could
potentially happen (in particular the socketio_setsockopt() call). But
we haven't investigated this issue more deeply. Other issues such as
PowerShell#414 or PowerShell#606 may be related.
@bagajjal
Copy link
Collaborator

bagajjal commented Dec 6, 2017

Please raise the PR on https://github.com/PowerShell/openssh-portable repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants