When accepting a socket connection and ERROR_NETNAME_DELETED occurs, leads this into a closing of the serving socket (BaseProactorEventLoop) #93821
Comments
Have you reported this to the uvicorn/starlette bug tracker? Can you provide a reproducer with only asyncio and no third-party packages? Have you tested the main branch?
I will try to create a reproducer with only asyncio. I just tested the latest released version, not the main branch. When an OSError happens in the finish_accept function, BaseProactorEventLoop._start_serving.loop falls into the OSError except block and closes the listener sock at line 864. But it is not the listening sock that produces this OSError, it is the connection socket, so an OSError on the connection sock leads to closing the listening sock. Connection socket after this issue happens: While I am trying to create a reproducer with asyncio, could you tell me whether this is expected behaviour? Thank you
Tested with the latest main branch and I can also reproduce this error.

Server

import asyncio


class Serve(asyncio.Protocol):
    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(b"OK")


async def main():
    loop = asyncio.get_running_loop()
    server = await loop.create_server(lambda: Serve(), '127.0.0.1', 8088)
    async with server:
        await server.serve_forever()


asyncio.run(main(), debug=True)

Client

import threading
import requests
import sys


def create():
    for i in range(100):
        requests.post("http://127.0.0.1:8088/", data="acs")


if __name__ == "__main__":
    for i in range(100):
        thread = threading.Thread(target=create, args=(), daemon=True)
        thread.start()
    sys.exit()
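The client above depends on requests, while the maintainers asked for a reproducer without third-party packages. The following is a minimal sketch of a stdlib-only client, under the assumption (not confirmed in this thread) that abruptly resetting connections is what provokes the error. The names `abort_connection` and `hammer` are hypothetical, and the sketch is not known to trigger ERROR_NETNAME_DELETED reliably.

```python
import socket
import struct
import sys

# struct linger is two unsigned shorts on Windows and two ints elsewhere.
_LINGER_FMT = "HH" if sys.platform == "win32" else "ii"


def abort_connection(host: str, port: int, payload: bytes = b"acs") -> None:
    """Connect, send a few bytes, then close with a reset instead of a FIN."""
    sock = socket.create_connection((host, port), timeout=5)
    try:
        sock.sendall(payload)
        # l_onoff=1, l_linger=0: close() discards pending data and sends RST.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack(_LINGER_FMT, 1, 0))
    finally:
        sock.close()


def hammer(host: str, port: int, count: int = 1000) -> int:
    """Abort `count` connections in a row; return how many connected."""
    connected = 0
    for _ in range(count):
        try:
            abort_connection(host, port)
            connected += 1
        except OSError:
            # The server may already have stopped listening; keep going.
            pass
    return connected
```

Running `hammer("127.0.0.1", 8088)` against the server above and then checking whether port 8088 still accepts connections would show whether the listening socket survived.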
Latest findings

When AcceptEx fails with ERROR_NETNAME_DELETED, WSAGetLastError() (overlapped.c) shows that the "real" underlying error in my case is WSAECONNRESET, raised while accepting the connection socket. How can we deal with WSAECONNRESET or ERROR_NETNAME_DELETED errors when accepting a socket without closing the listening socket? An easy fix is to close the accept socket and raise a ConnectionResetError, which is then caught in the BaseProactorEventLoop serving loop.

--- a/Lib/asyncio/windows_events.py
+++ b/Lib/asyncio/windows_events.py
@@ -577,7 +577,13 @@ def accept(self, listener):
         ov.AcceptEx(listener.fileno(), conn.fileno())

         def finish_accept(trans, key, ov):
-            ov.getresult()
+            try:
+                ov.getresult()
+            except OSError as exc:
+                if exc.winerror in (_overlapped.ERROR_NETNAME_DELETED,
+                                    _overlapped.ERROR_OPERATION_ABORTED):
+                    conn.close()
+                    raise ConnectionResetError(*exc.args)
+                raise
             # Use SO_UPDATE_ACCEPT_CONTEXT so getsockname() etc work.
             buf = struct.pack('@P', listener.fileno())
             conn.setsockopt(socket.SOL_SOCKET,
index ddb9daca02..4b3d24c4e8 100644
--- a/Lib/asyncio/proactor_events.py
+++ b/Lib/asyncio/proactor_events.py
@@ -854,6 +854,8 @@ def loop(f=None):
                 if self.is_closed():
                     return
                 f = self._proactor.accept(sock)
+            except ConnectionResetError:
+                self.call_soon(loop)
             except OSError as exc:
                 if sock.fileno() != -1:
                     self.call_exception_handler({
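To illustrate the control flow this patch changes, here is a toy model that runs on any platform. It is only a sketch: `FakeListener` and `serve` are hypothetical stand-ins and do not reproduce the real asyncio internals, which are callback-driven. The point is only that a reset on one accepted connection either ends the loop and closes the listener (unpatched) or is skipped (patched).

```python
class FakeListener:
    """Stand-in for the listening socket; replays scripted accept results."""

    def __init__(self, outcomes):
        self._outcomes = list(outcomes)
        self.closed = False

    def accept(self):
        outcome = self._outcomes.pop(0)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome

    def close(self):
        self.closed = True


def serve(listener, attempts, patched):
    """Toy accept loop; returns the connections that were accepted."""
    accepted = []
    for _ in range(attempts):
        try:
            accepted.append(listener.accept())
        except ConnectionResetError:
            if not patched:
                # Unpatched: the reset is treated like any other OSError.
                listener.close()
                break
            # Patched: drop the dead connection and accept again.
        except OSError:
            listener.close()
            break
    return accepted
```

With the script `["a", ConnectionResetError("reset"), "b"]`, the unpatched loop stops after "a" with the listener closed, while the patched loop goes on to accept "b".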
Can you post a diff rather than copying functions, as it will be easier to apply your changes? Also, it seems like #18199 is similar to this issue.
Sure, I updated my latest comment. Yes, it looks similar.
The PR linked above (gh-18199) is more appropriate for the issue in Assuming the best reaction to I have another niggling suspicion. There is a background
in windows_events.py::accept. It is known that sometimes such tasks get garbage-collected before they've completed (e.g. #96323). But really, I don't think it is the cause of @iUnknwn's problem, based on the logs he sent me privately (those show that task as completed with Anyway, if anyone knows how to reproduce getting ERROR_NETNAME_DELETED from an accept() call, please let us know!
@kumaraditya303 Have you ever heard of ERROR_NETNAME_DELETED?
I remember this. ERROR_NETNAME_DELETED is a very annoying error and hard to reproduce. I don't remember much now and haven't used
Could it be caused by IIS in reverse proxy mode?
One thing I remember is that when I was getting this I looked at how other programming languages handle it, and IIRC golang just retries in this case. This might have changed, as it's been a while since then.
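The "just retry" idea can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of what Go or asyncio actually does: `is_peer_reset` and `accept_with_retry` are hypothetical helpers, and the numeric values are the Windows codes for ERROR_NETNAME_DELETED (the WinError 64 reported in this issue) and WSAECONNRESET.

```python
import errno

# Windows error codes for a connection that died before accept completed.
ERROR_NETNAME_DELETED = 64
WSAECONNRESET = 10054


def is_peer_reset(exc: OSError) -> bool:
    """Return True if the error only concerns the connection being accepted."""
    if isinstance(exc, (ConnectionResetError, ConnectionAbortedError)):
        return True
    # winerror only exists on Windows, hence getattr with a default.
    return getattr(exc, "winerror", None) in (ERROR_NETNAME_DELETED,
                                              WSAECONNRESET)


def accept_with_retry(listener, max_retries: int = 100):
    """Call listener.accept(), retrying when the peer has already gone away."""
    for _ in range(max_retries):
        try:
            return listener.accept()
        except OSError as exc:
            if not is_peer_reset(exc):
                raise  # A real listener failure: let the caller handle it.
    raise ConnectionAbortedError(errno.ECONNABORTED,
                                 "too many aborted connections in a row")
```

The retry cap is a design choice of this sketch: it avoids spinning forever if the listener itself is broken in a way that looks like a reset.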
I don't know. Maybe @eryksun can shed some light on this issue?
Winsock special cases several status codes returned by the AFD device that implements socket files. For example:
If you're using asynchronous
Either way it means the connection was reset or disconnected, it's the same to asyncio. So it seems
Looking for a volunteer to implement the proposed fix from my previous comment --
Thanks. This keeps my Uvicorn API (FastAPI) from crashing. However, the following error continues to appear on the terminal: It happens to me when making multiple API requests in a short time to an endpoint which fetches data from another external API.
…ETED occurs, leads this into a closing of the serving socket This is done with the patches described by fercod in the GitHub thread python#93821. These patches were successfully tested by me and worked properly. The error propagates properly without closing the original socket and without causing the server to hang.
I stumbled upon this solution after discovering it myself for the same issue mentioned. For me, this bug manifests itself by causing my server to hang, requiring a hard restart to restore its ability to serve again. There are other +1 for solving the immediate issue. If we can get this in it would be a step in the right direction.
Just in case someone needs a quick drop-in workaround without waiting for an update: https://gist.github.com/cybergrind/9cbbdc94503548d74dc4d5d3ed99248c
Hey Guys
Application Description
We have found that the listening socket on our uvicorn server sometimes closes.
The uvicorn server receives image uploads from cameras, and the connection is sometimes or often lost because of mobile network conditions.
When a connection is lost on the remote side, this error sometimes occurs.
I am not very deep into such low-level Python code, so I tried hard to find the root cause, and these are my results.
I hope these results show that there is something here that we can talk about, or that you can give me hints or help to resolve this issue.
Thank you in advance
Bug/Error that occurs
The error occurs in ov.getresult() (asyncio.windows_events.py:560)
OSError: [WinError 64] The specified network name is no longer available.
After that the server is unresponsive and the listening socket is closed. In that case we need to restart the service.
To me it looks like this happens when ov.getresult() is called after the remote host has disconnected.
When I just wrap this call in a try/except block, the listening socket does not close and uvicorn detects a lost connection.
Please let me know why just wrapping this in a try block resolves the issue,
because I don't know ;-)
Original
Mod (No socket closings)
Error without the try/except block
Minimal Example to reproduce
Server (Minimal Reproducible Example)
Client (Minimal Reproducible Example)
Tested with:
Uvicorn [x]
Hypercorn [x]
Environment
Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32