Memory leak when creating lots of AsyncClient contexts #978

Closed
2 tasks done
Recursing opened this issue May 21, 2020 · 29 comments

Recursing commented May 21, 2020

Checklist

  • The bug is reproducible against the latest release and/or master.
  • There are no similar issues or pull requests to fix it yet.

Describe the bug

After an AsyncClient context (entered with async with) exits, the client does not seem to be garbage collected. That can be a problem for very long-running services that create many of them and eventually run out of memory.

To reproduce

import httpx
import gc
import asyncio

print(f"httpx version: {httpx.__version__}")


async def make_async_client():
    async with httpx.AsyncClient() as client:
        await asyncio.sleep(10)


async def main(n):
    tasks = []
    for _ in range(n):
        tasks.append(make_async_client())
    print(f"Creating {n} contexts, sleeping 10 secs")
    await asyncio.wait(tasks)


asyncio.run(main(2000))
print("Finished run, still using lots of memory")
gc.collect()
input("gc.collect() does not help :(")

Comparison with aiohttp

import aiohttp
import asyncio

print(f"aiohttp version {aiohttp.__version__}")


async def make_async_client():
    async with aiohttp.ClientSession() as client:
        await asyncio.sleep(10)


async def main(n):
    tasks = []
    for _ in range(n):
        tasks.append(make_async_client())
    print(f"Creating {n} contexts and sleeping")
    await asyncio.wait(tasks)


asyncio.run(main(200000))
input("Finished run, all memory is freed")

Expected behavior

Memory gets freed after exiting the async context, as it does for aiohttp

Actual behavior

Memory does not get freed, even after explicitly calling gc.collect()

Debugging material

Environment

  • OS: Linux (many versions)
  • Python version: 3.8.3
  • HTTPX version: both 0.12.1 and master
  • Async environment: both asyncio and trio
  • HTTP proxy: no
  • Custom certificates: no

Additional context

I understand that you typically only need one async client session, but it shouldn't leak memory regardless; for very long-running processes this can be a problem.

Thanks for this great library! If you're interested I can try to debug this issue and send a PR

tomchristie (Member)

That’s super surprising if so. Can you test against master and see if you’re getting the same results?

There’s not anywhere I can think of on a first pass where that could be an issue for us, so yeah any digging into it would be most welcome.

What are you using to measure the memory usage?

Recursing (Author) commented May 21, 2020

I did test against master and see the same issue; can you reproduce?
I'm just measuring system memory usage with top and htop. The test above uses several gigabytes (which seems like a lot anyway, even disregarding the leak).

tomchristie (Member)

It might be worth trying verify=False to see if that makes a difference - that'd help isolate whether it's something to do with SSL contexts or something elsewhere.

Recursing (Author) commented May 21, 2020

Running the test with verify=False seems to fix both issues: contexts use much less memory (I can create 100,000 of them with the same RAM usage as the 2,000 in the test), and gc collects everything.

tomchristie (Member)

Interesting. I wonder if simply creating instances of SSLConfig() is enough to reproduce the issue.

Recursing (Author) commented May 21, 2020

I think I have a more minimal (if even weirder) example:

import httpx
import gc
import trio


async def make_async_client():
    # Create an SSLConfig that immediately goes out of scope...
    sslc = httpx._config.SSLConfig()
    # ...the leak only appears if an await point follows before returning.
    await trio.sleep(0)


async def main(n):
    tasks = []
    async with trio.open_nursery() as nursery:
        for _ in range(n):
            nursery.start_soon(make_async_client)
        print(f"Creating {n} SSLConfigs")


gc.collect()
trio.run(main, 1000)
print("Finished run, still using lots of memory")
input()
gc.collect()
input("gc.collect() does not help :(")
gc.collect()
input("again :(")

Removing await trio.sleep(0) somehow eliminates the memory leak. httpcore.SyncConnectionPool and httpcore.AsyncConnectionPool leak the same amount of memory.

Recursing (Author)

Edited the minimal example. You're right: just creating an SSLConfig reproduces the issue, but only if it's followed by an await before returning.

Recursing (Author) commented May 21, 2020

Found a more minimal example that doesn't use httpx or httpcore; I switched back to asyncio so I can open the issue on CPython.

import ssl
import certifi
import gc
import asyncio


ca_path = certifi.where()
async def make_async_context() -> None:
    context = ssl.SSLContext(ssl.PROTOCOL_TLS)
    context.load_verify_locations(ca_path)
    await asyncio.sleep(1)


async def main(n: int) -> None:
    tasks = []
    for _ in range(n):
        tasks.append(make_async_context())
    print(f"Creating {n} SSLContexts")
    await asyncio.wait(tasks)


gc.collect()
asyncio.run(main(2000))
input("Finished run, still using lots of memory :(")
gc.collect()
input("gc.collect() does not help :(")

victoraugustolls

Just out of curiosity, does the bug exist in Python 3.7?

Recursing (Author) commented May 21, 2020

I have no idea; building 3.9.0a5 right now, will check.
Edit: I can reproduce it in Python 3.6.9, 3.7.7, and 3.9.0a5, on different machines.

tomchristie (Member) commented May 22, 2020

Useful points for further investigation...

  • Is the sleep necessary to replicate?
  • Can this be simplified further by just creating the SSLContexts sequentially, or not?
  • If so, can it also be replicated by creating them in plain old synchronous code, or not?

Recursing (Author) commented May 22, 2020

  • Is the sleep necessary to replicate?

Yes, which implies a "no" to the other two points.

Can you reproduce the behavior I observe on my systems? I think it might depend on the system-level SSL library.

florimondmanca (Member)

Can you reproduce the behavior I observe on my systems?

I'm on macOS 10.14.6, Python 3.8.2, and it seems I'm not able to reproduce the leak.

I ran your latest script and monitored the memory usage with 1000 tasks. I see RAM usage increase up to ~700MB, then once the asyncio.run() call finishes, usage drops back down to ~24MB, which I guess is fairly acceptable for a running interpreter.

I also tried the script from the issue description, the one with AsyncClient. I see a similar increase in RAM, and once the main function finishes, usage drops down to ~37MB. Calling gc.collect() makes that drop to 35MB.

Recursing (Author)

See https://bugs.python.org/issue40727. I also cannot reproduce the leak on Windows, only on Linux.

tomchristie (Member)

I can't exactly replicate that, no. It will use 1.5GB of memory while it has 2000 instances simultaneously in memory, although it'll free up once they're out of scope.

One thing we might want to consider in any case is globally caching our SSL contexts. They're a little bit slow to create, and they're memory hungry, so it'd probably make sense for us to cache a small number of them so that users who're (unnecessarily) creating lots of clients aren't being negatively impacted.
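
For illustration, here's a minimal sketch of the kind of caching meant above, assuming the cache is keyed on the CA bundle path (cached_ssl_context is hypothetical, not an httpx API):

import functools
import ssl

import certifi


@functools.lru_cache()
def cached_ssl_context(ca_path: str = certifi.where()) -> ssl.SSLContext:
    # Building an SSLContext and loading a CA bundle is slow and memory
    # hungry, so build one per bundle path and reuse it everywhere.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS)
    context.load_verify_locations(ca_path)
    return context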

Recursing (Author) commented May 22, 2020

Creating and using them synchronously doesn't run into memory issues (you don't keep them simultaneously in memory, and memory gets freed correctly); it also takes just a few seconds to create thousands.
So I think that once the memory leak is resolved (which seems to be a Python-on-Linux issue), the improvement from caching (even for users who are unnecessarily creating lots of clients) might be negligible, but it could work around the issue until then.

tomchristie (Member)

Just adding this so we've got some bearings on this.

import ssl

import certifi

ca_path = certifi.where()

contexts = []
for i in range(2000):
    context = ssl.SSLContext(ssl.PROTOCOL_TLS)
    context.load_verify_locations(ca_path)
    contexts.append(context)

This takes about 1.5GB and nearly 15 seconds on my machine, which is fairly slow and hungry either way. Might be a good argument in favour of caching SSLContexts.

In any case, I'll close this since it's not our issue.

You don't need to create 2000 clients - create a single client, and reuse it throughout. You only need to create additional clients if you need a different configuration on them.
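
To spell that out, a minimal sketch of the single-client pattern (the URL is a placeholder):

import asyncio

import httpx


async def main() -> None:
    # One client, and therefore one SSLContext, shared by every request.
    async with httpx.AsyncClient() as client:
        responses = await asyncio.gather(
            *(client.get("https://www.example.org/") for _ in range(100))
        )
        print(f"Fetched {len(responses)} responses with a single client")


asyncio.run(main())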

UrbiJr commented Apr 13, 2021

I'm experiencing this too:

  • Windows 10 x64
  • Python 3.7.9 x64

And it only happens when setting verify=True on my httpx.AsyncClient... and no, unfortunately I cannot use the same client instance for all requests. So what would be a possible solution to this? Memory goes up from 100MB to 1GB in less than a minute when running 50 simultaneous clients.
Setting verify=False is not an acceptable solution in my case, as many servers require a valid SSL (TLS) context...

gc.collect() does not solve the issue.

tomchristie (Member)

unfortunately I cannot use the same client instance for all the requests

You might want to dig into why that's not possible...

There's no particularly graceful way for us to get around the fact that instantiating a bunch of ssl.SSLContext instances is super-expensive.

Recursing (Author) commented Apr 14, 2021

@UrbiJr I think aiohttp may do some implicit caching that helps in this situation.

You can try that library to see if it fixes your issue; if it does, httpx might consider doing the same, if enough other people run into this.

Also, I'm not sure your issue is related to what I ran into last year, as I was unable to reproduce that on Windows, but I guess a lot has changed in a year.

UrbiJr commented Apr 14, 2021

unfortunately I cannot use the same client instance for all the requests

You might want to dig into why that's not possible...

There's no particularly graceful way for us to get around the fact that instantiating a bunch of ssl.SSLContext instances is super-expensive.

I would use the same httpx.AsyncClient instance, but then I'd need to repeatedly update headers, cookies, and proxies. For the first two that's still possible, but the third would be a problem, I guess, since currently httpx only allows setting proxies when creating the instance (?).
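
(For reference: headers and cookies can be passed per request on a shared client, so only the proxy configuration forces separate clients. A rough sketch; fetch is just an illustrative helper.)

import httpx

client = httpx.AsyncClient()


async def fetch(url: str, headers: dict, cookies: dict) -> httpx.Response:
    # Per-request headers and cookies are merged over the client defaults;
    # proxies, by contrast, can only be configured when the client is built.
    return await client.get(url, headers=headers, cookies=cookies)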

UrbiJr commented Apr 14, 2021

@UrbiJr I think aiohttp maybe does some implicit caching to help in this situation.

You can try to see if using that library would fix your issue, in that case httpx might think of doing the same if enough other people run into this.

Also I'm not sure your issue is related to what I ran into last year, as I was unable to reproduce that on windows, but in a year I guess lots of things changed

I have too much code based on the httpx API, and I also think it's much easier to use; I wouldn't want to change to another library. I think this is something we could definitely solve. Unfortunately there isn't as much interest in this as I expected.

tomchristie (Member)

Hrm.

One option here might be to create a single global SSLContext instance...

ssl_context = httpx.create_ssl_context()

And then to pass that to all the client instances...

client = httpx.AsyncClient(verify=ssl_context)
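
Put together, a minimal sketch of that approach (the URL is a placeholder):

import asyncio

import httpx

# Build the expensive SSLContext exactly once, at module level...
ssl_context = httpx.create_ssl_context()


async def fetch(url: str) -> httpx.Response:
    # ...and share it across however many clients get created.
    async with httpx.AsyncClient(verify=ssl_context) as client:
        return await client.get(url)


print(asyncio.run(fetch("https://www.example.org/")))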

UrbiJr commented Apr 14, 2021

Hrm.

One option here might be to create a single global SSLContext instance...

ssl_context = httpx.create_ssl_context()

And then to pass that to all the client instances...

client = httpx.AsyncClient(verify=ssl_context)

OK, sorry, I did not think of that, as I was focused on the verify=True setting that originated my memory leak issue. Just tested it, and it seems good. Yeah, this could be the solution, I guess; I don't see any cons as of now.

dalf added a commit to searxng/searxng that referenced this issue May 5, 2021

…l.SSLContext

before there was one ssl.SSLContext per client.

see encode/httpx#978

kvch pushed a commit to searx/searx that referenced this issue May 6, 2021

…l.SSLContext

before there was one ssl.SSLContext per client.

see encode/httpx#978
memclutter commented Jul 14, 2021

import httpx
import gc
import asyncio

print(f"httpx version: {httpx.__version__}")


async def make_async_client():
    async with httpx.AsyncClient(verify=False) as client:
        await client.request(method='get', url='https://gorest.co.in/public/v1/users')
        await asyncio.sleep(10)


async def main(n):
    tasks = []
    for _ in range(n):
        tasks.append(make_async_client())
    print(f"Creating {n} contexts, sleeping 10 secs")
    await asyncio.wait(tasks)


asyncio.run(main(2000))
print("Finished run, still using lots of memory")
gc.collect()
input("gc.collect() does not help :(")

memory leak again. What am I doing wrong?

  • httpx: 0.18.2
  • python: 3.7.9
  • docker: 20.10.7
  • ubuntu: 20.04

Recursing (Author) commented Jul 14, 2021 via email

yarikdevcom

If I create one client, how should I close it?

yarikdevcom

From the docs — so you can't just create one client for your session; you have to manually close it and create a new one, otherwise it will continue to leak memory:

# However, an explicit call to `aclose()` is also sufficient:
#
#    client = httpx.AsyncClient()
#    try:
#        ...
#    finally:
#        await client.aclose()

yarikdevcom commented Jan 16, 2022

For example, if await self.api.get('adadada') is called 1000 times, it will create 1000 clients. Is this OK? Because I need to close them, and my client library should not require the user to call self.api.close_client() after all these requests <- super inconvenient
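
One way out is for the client library itself to own a single long-lived client and expose one shutdown hook. A hypothetical sketch (ApiClient is illustrative, not an httpx API):

import httpx


class ApiClient:
    """Wraps one shared AsyncClient so callers never manage clients."""

    def __init__(self, base_url: str) -> None:
        # One client (and one SSLContext) reused for every request.
        self._client = httpx.AsyncClient(base_url=base_url)

    async def get(self, path: str) -> httpx.Response:
        return await self._client.get(path)

    async def aclose(self) -> None:
        # Called once at application shutdown, not after each request.
        await self._client.aclose()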
