
Failing to detect lost connections (asyncio only) #95

Closed
didip opened this issue Jun 18, 2019 · 12 comments · Fixed by #103, #111 or #627
Labels
bug Something isn't working

Comments

@didip

didip commented Jun 18, 2019

About 1 in 5 requests I initiate raises ConnectionResetError('Connection lost') when calling AsyncClient from inside Starlette and Uvicorn.

In comparison, urllib, requests, and Tornado's AsyncHTTPClient all work as expected.

Stacktrace:

  File "/project/venv/lib/python3.7/site-packages/http3/client.py", line 145, in send_handling_redirects
    request, verify=verify, cert=cert, timeout=timeout
  File "/project/venv/lib/python3.7/site-packages/http3/dispatch/connection_pool.py", line 121, in send
    raise exc
  File "/project/venv/lib/python3.7/site-packages/http3/dispatch/connection_pool.py", line 116, in send
    request, verify=verify, cert=cert, timeout=timeout
  File "/project/venv/lib/python3.7/site-packages/http3/dispatch/connection.py", line 59, in send
    response = await self.h11_connection.send(request, timeout=timeout)
  File "/project/venv/lib/python3.7/site-packages/http3/dispatch/http11.py", line 53, in send
    await self._send_event(event, timeout)
  File "/project/venv/lib/python3.7/site-packages/http3/dispatch/http11.py", line 101, in _send_event
    await self.writer.write(data, timeout)
  File "/project/venv/lib/python3.7/site-packages/http3/concurrency.py", line 89, in write
    self.stream_writer.drain(), timeout.write_timeout
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/tasks.py", line 388, in wait_for
    return await fut
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/streams.py", line 348, in drain
    await self._protocol._drain_helper()
  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/streams.py", line 202, in _drain_helper
    raise ConnectionResetError('Connection lost')
@tomchristie
Member

Could you break this down to the simplest possible reproduction? Which URL, what kind of request? Does it reproduce on a range of URLs, or mainly on one specific site. Can you reproduce independently of Starlette, and just demonstrate it standalone? Thanks so much!
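For anyone trying to isolate this: the exception itself comes from asyncio's `StreamWriter.drain()` once the protocol has seen `connection_lost()`. Here is a minimal stdlib-only sketch (no http3, no real keep-alive timeout — the transport is closed by hand to stand in for a connection that died while idle) that triggers exactly the code path in the traceback above:

```python
import asyncio

async def main() -> str:
    # A throwaway local server; the handler just hangs up immediately.
    async def handler(reader, writer):
        writer.close()

    server = await asyncio.start_server(handler, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection("127.0.0.1", port)

    # Stand-in for the connection dying while idle in the pool: tear the
    # transport down, then give the loop a moment so connection_lost() runs.
    writer.transport.close()
    await asyncio.sleep(0.1)

    try:
        writer.write(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        await writer.drain()  # raises ConnectionResetError('Connection lost')
        result = "no error"
    except ConnectionResetError as exc:
        result = str(exc)

    server.close()
    await server.wait_closed()
    return result

print(asyncio.run(main()))  # prints "Connection lost"
```

This is not a reproduction of the server-side keep-alive timeout itself, only a demonstration that any write-after-loss surfaces as this exact `ConnectionResetError('Connection lost')` from `_drain_helper()`.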

@tomchristie
Member

Could be that it’s caused by #5 - will need to replicate, and then do some digging.

@simonw
Contributor

simonw commented Jul 7, 2019

I'm seeing this bug too. Here's a recent traceback (from my Heroku logs):

heroku[router]: at=info method=GET path="/-/auth-callback?code=e124aa75f61efd442f86" host=datasette-auth-demo.herokuapp.com request_id=834768c7-8903-4853-98c0-5bae77ffba3e fwd="148.64.98.14" dyno=web.1 connect=1ms service=93ms status=500 bytes=212 protocol=https
app[web.1]: ERROR: Exception in ASGI application
app[web.1]: Traceback (most recent call last):
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/uvicorn/protocols/http/httptools_impl.py", line 370, in run_asgi
app[web.1]: result = await app(self.scope, self.receive, self.send)
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/datasette_auth_github/github_auth.py", line 77, in __call__
app[web.1]: return await self.auth_callback(scope, receive, send)
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/datasette_auth_github/github_auth.py", line 216, in auth_callback
app[web.1]: profile = (await self.github_api_client.get(profile_url)).json()
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/client.py", line 316, in get
app[web.1]: timeout=timeout,
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/client.py", line 548, in request
app[web.1]: timeout=timeout,
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/client.py", line 145, in send
app[web.1]: allow_redirects=allow_redirects,
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/client.py", line 178, in send_handling_redirects
app[web.1]: request, verify=verify, cert=cert, timeout=timeout
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/dispatch/connection_pool.py", line 121, in send
app[web.1]: raise exc
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/dispatch/connection_pool.py", line 116, in send
app[web.1]: request, verify=verify, cert=cert, timeout=timeout
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/dispatch/connection.py", line 59, in send
app[web.1]: response = await self.h11_connection.send(request, timeout=timeout)
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/dispatch/http11.py", line 49, in send
app[web.1]: await self._send_request(request, timeout)
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/dispatch/http11.py", line 83, in _send_request
app[web.1]: await self._send_event(event, timeout)
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/dispatch/http11.py", line 115, in _send_event
app[web.1]: await self.writer.write(bytes_to_send, timeout)
app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/http3/concurrency.py", line 133, in write
app[web.1]: self.stream_writer.drain(), timeout.write_timeout
app[web.1]: File "/app/.heroku/python/lib/python3.6/asyncio/tasks.py", line 358, in wait_for
app[web.1]: return fut.result()
app[web.1]: File "/app/.heroku/python/lib/python3.6/asyncio/streams.py", line 339, in drain
app[web.1]: yield from self._protocol._drain_helper()
app[web.1]: File "/app/.heroku/python/lib/python3.6/asyncio/streams.py", line 210, in _drain_helper
app[web.1]: raise ConnectionResetError('Connection lost')
app[web.1]: ConnectionResetError: Connection lost
app[web.1]: INFO: ('10.13.156.79', 19351) - "GET /-/auth-callback HTTP/1.1" 500

Here's my code that's making the API call: https://github.com/simonw/datasette-auth-github/blob/b4bbb0ecbc66a67fe6b57dc9c47870fe8fa6c0fc/datasette_auth_github/github_auth.py#L188-L199

        # Exchange that code for a token
        github_response = (
            await self.github_api_client.post(
                "https://github.com/login/oauth/access_token",
                data={
                    "client_id": self.client_id,
                    "client_secret": self.client_secret,
                    "code": qs["code"],
                },
            )
        ).text
        parsed = dict(parse_qsl(github_response))

Where github_api_client = http3.AsyncClient() is set as a class-level property.

I'm going to try instantiating a new AsyncClient() each time rather than re-using one for each request.

@simonw
Contributor

simonw commented Jul 7, 2019

I changed my code to stop re-using the AsyncClient() instance across multiple requests and it seems to have fixed the problem for me - see simonw/datasette-auth-github#27

@tomchristie
Member

Gotcha - thanks for the report. Seems pretty likely it'd be occurring when attempting to use a connection that's already been disconnected, e.g. due to the keep-alive timing out on the server side.

@tomchristie
Member

Alrighty. I could recreate this issue by creating a client, issuing a request, then waiting 5+ mins, and making another request.

Version 0.6.7 properly deals with the case where an expired connection is obtained from the connection pool. In my testing this resolves the issue. 👍

@rwboyer

rwboyer commented Aug 25, 2019

> Alrighty. I could recreate this issue by creating a client, issuing a request, then waiting 5+ mins, and making another request.
>
> Version 0.6.7 properly deals with the case where an expired connection is obtained from the connection pool. In my testing this resolves the issue. 👍

I seem to be having the same sort of issue but I am running 0.7.1 using

# global
client = httpx.AsyncClient()

# inside route
r = await client.post(
    "https://www.google.com/recaptcha/api/siteverify",
    data={"secret": "MYSECRET", "response": body["gresponse"]},
)

I've moved the client creation inside the route for now. Am I missing something, or is this the same issue as above?

@dmontagu

dmontagu commented Nov 27, 2019

I'm also getting this issue on 0.7.6, with infrequent calls to an Elasticsearch server:

    response = await self.session.post(path, json=body)
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/client.py", line 484, in post
    trust_env=trust_env,
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/client.py", line 626, in request
    trust_env=trust_env,
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/client.py", line 650, in send
    trust_env=trust_env,
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/client.py", line 265, in _get_response
    return await get_response(request)
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/middleware/redirect.py", line 31, in __call__
    response = await get_response(request)
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/client.py", line 226, in get_response
    request, verify=verify, cert=cert, timeout=timeout
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/dispatch/connection_pool.py", line 126, in send
    raise exc
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/dispatch/connection_pool.py", line 121, in send
    request, verify=verify, cert=cert, timeout=timeout
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/dispatch/connection.py", line 65, in send
    response = await self.h11_connection.send(request, timeout=timeout)
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/dispatch/http11.py", line 49, in send
    await self._send_request(request, timeout)
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/dispatch/http11.py", line 91, in _send_request
    await self._send_event(event, timeout)
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/dispatch/http11.py", line 124, in _send_event
    await self.stream.write(bytes_to_send, timeout)
  File "/home/worker/.local/lib/python3.7/site-packages/httpx/concurrency/asyncio.py", line 144, in write
    self.stream_writer.drain(), timeout.write_timeout
  File "/usr/local/lib/python3.7/asyncio/tasks.py", line 442, in wait_for
    return fut.result()
  File "/usr/local/lib/python3.7/asyncio/streams.py", line 348, in drain
    await self._protocol._drain_helper()
  File "/usr/local/lib/python3.7/asyncio/streams.py", line 202, in _drain_helper
    raise ConnectionResetError('Connection lost')

@tomchristie tomchristie reopened this Dec 2, 2019
@tomchristie tomchristie changed the title from "AyncClient frequently loses connection." to "Failing to detect lost connections." Dec 2, 2019
@tomchristie
Member

Right - I don't think that the issue here is that we're losing the connection, I think it's that we're not detecting that we've lost the connection.

When we acquire a connection from the pool we check if it has been closed or not...

if connection is not None and connection.is_connection_dropped():

However it's been noted that the "is closed" detection on asyncio may be flaky... #346 (comment)

  • Anyone want to dig into double checking that our is_connection_dropped() call is returning False in the case that this issue occurs?
  • Anyone want to dig into checking that this only occurs on asyncio, and not on trio?
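For whoever picks up the first bullet: the probe boils down to "is the socket readable with zero bytes pending". Below is a stdlib-only sketch of that check — the function name mirrors is_connection_dropped() but this is not httpx's actual implementation, and MSG_DONTWAIT assumes a POSIX platform:

```python
import select
import socket

def is_connection_dropped(sock: socket.socket) -> bool:
    """Best-effort liveness probe for an idle pooled connection.

    An idle connection should have nothing to read. If select() reports
    the socket readable and a MSG_PEEK recv returns b"", the peer sent
    EOF; an error on the probe also counts as dropped. Readable-with-
    pending-data or not-readable both count as still alive.
    """
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False
    try:
        data = sock.recv(1, socket.MSG_PEEK | socket.MSG_DONTWAIT)
    except BlockingIOError:
        return False
    except OSError:
        return True
    return data == b""
```

The flaky case this issue describes would be a connection the server has dropped without the FIN having reached us yet (or an RST that only arrives in response to the next write) — no local probe can catch that, which is why the keep-alive expiry floated in the next comment is a useful belt-and-braces mitigation.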

@tomchristie
Member

There are potential mitigations here, such as having a keep-alive time on connections, and assuming that they're expired once the keep-alive is passed, but it'd be good to have a clear handle on if we've got to the root of the cause first.
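A sketch of that mitigation — all of the class and attribute names below are invented for illustration, not httpx API:

```python
import time

class PoolEntry:
    """A pooled connection plus the bookkeeping for a keep-alive window."""

    def __init__(self, connection, keepalive_expiry: float = 5.0):
        self.connection = connection
        self.keepalive_expiry = keepalive_expiry
        self._idle_since = time.monotonic()

    def mark_idle(self) -> None:
        # Called when the connection is returned to the pool.
        self._idle_since = time.monotonic()

    def is_expired(self) -> bool:
        # Assume the server has dropped the connection once it has sat
        # idle past the keep-alive window -- no socket probe needed.
        return time.monotonic() - self._idle_since > self.keepalive_expiry
```

The pool would then discard expired entries instead of handing them out, trading a few unnecessary reconnects for never reusing a silently dead socket.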

@tomchristie tomchristie added the bug Something isn't working label Dec 5, 2019
@tomchristie tomchristie changed the title from "Failing to detect lost connections." to "Failing to detect lost connections (asyncio only)" Dec 5, 2019
@daxartio

daxartio commented Dec 9, 2019

How can I work around this problem? Maybe by retrying the request when I catch ConnectionResetError?

@tomchristie
Member

@daxartio I'm looking at resolving this issue.
Until then, a cheap fix is to just use a new client on each connection.
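Until the fix lands, @daxartio's retry idea can also work for idempotent requests. A generic sketch — the helper and its names are invented here, not part of httpx:

```python
import asyncio

async def send_with_retry(send, attempts: int = 2):
    """Call a zero-argument request coroutine, retrying once if the
    pooled connection turns out to be dead. Only safe for idempotent
    requests, and it assumes the pool drops a connection that failed.
    """
    for attempt in range(attempts):
        try:
            return await send()
        except ConnectionResetError:
            if attempt == attempts - 1:
                raise
            # Retry; a fresh connection should serve the next attempt.
```

For example, wrapping `lambda: client.get(url)` in such a helper retries exactly the failure mode in the tracebacks above while still surfacing a genuine outage.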
