Native async implementation #282

Closed
Archmonger opened this issue Jul 7, 2023 · 1 comment

Comments

@Archmonger

python-diskcache currently relies on the built-in sqlite3 library to function. It would be relatively simple to duplicate the current API into an async version, which should give much higher theoretical performance in high-utilization scenarios.

In theory, the current API could be perfectly replicated as async. For example, with __getitem__...

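class AsyncCache:
    async def __getitem__(self, key: str):
        # The awaited database lookup for `key` would go here.
        ...

async def test():
    cache = AsyncCache()  # non-functional example
    # cache["abc"] returns a coroutine, which is then awaited.
    await cache["abc"]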

I would recommend using aiosqlite for this. It will likely need to be an optional dependency, installable as an extra:

pip install diskcache[async]

I would recommend validating whether aiosqlite is installed during the __init__ of AsyncCache. Raising an error like this seems reasonable: "AsyncCache requires the aiosqlite dependency."
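A minimal sketch of that check, assuming a hypothetical helper named require_optional and a placeholder AsyncCache constructor (neither exists in DiskCache). It only uses importlib from the standard library, so it runs whether or not aiosqlite is installed:

```python
import importlib.util


def require_optional(module_name: str, feature: str) -> None:
    """Raise ImportError if an optional dependency is not installed."""
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(f"{feature} requires the {module_name} dependency.")


class AsyncCache:
    """Hypothetical class; only the dependency check is sketched here."""

    def __init__(self, directory: str = "cache"):
        # Fail early, before any database work, if aiosqlite is absent.
        require_optional("aiosqlite", "AsyncCache")
        self.directory = directory
```

Checking in __init__ rather than at import time keeps the synchronous API usable for people who never install the extra.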

Adding support for async within DjangoCache would require creating a-prefixed methods such as adelete and aset.
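To show the shape of those methods, here is a rough sketch. SyncBackend and AsyncBackend are hypothetical stand-ins, not DjangoCache itself, and the a-prefixed coroutines simply delegate to the synchronous methods on a worker thread through asyncio.to_thread (Python 3.9+) instead of using aiosqlite:

```python
import asyncio


class SyncBackend:
    """Hypothetical stand-in for a synchronous cache backend."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value
        return True

    def delete(self, key):
        # True if the key existed, False otherwise.
        return self._data.pop(key, None) is not None


class AsyncBackend(SyncBackend):
    """Adds a-prefixed coroutines that delegate to the sync methods."""

    async def aget(self, key, default=None):
        return await asyncio.to_thread(self.get, key, default)

    async def aset(self, key, value):
        return await asyncio.to_thread(self.set, key, value)

    async def adelete(self, key):
        return await asyncio.to_thread(self.delete, key)


async def demo():
    cache = AsyncBackend()
    await cache.aset("abc", 123)
    value = await cache.aget("abc")
    deleted = await cache.adelete("abc")
    missing = await cache.aget("abc", "gone")
    return value, deleted, missing
```

Every synchronous method needs a hand-written twin in this style, which is the duplication cost a real implementation would have to weigh.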

@grantjenks
Owner

A few thoughts:

  1. aiosqlite doesn’t implement a real async layer atop SQLite. Instead, it just uses threads (see https://github.com/omnilib/aiosqlite/blob/main/aiosqlite/core.py#L76). The same pattern can already be applied to DiskCache with threads: just use the run_in_executor API provided by the built-in asyncio module. I would expect lower utilization due to the overhead of the additional threading layer.

  2. I don’t want to duplicate all the methods in DiskCache just to prefix the letter “a” to the methods and sprinkle the “async” keyword everywhere. I understand it’s maybe a simple source transformation but it seems burdensome from a maintenance perspective. I wish there was some way to make the same method “optionally” async.

  3. Because SQLite runs in-process, even if there were a native async-friendly driver layer, what would it await? There’s no network call as there is with separate server databases like Postgres or MySQL. There’s a disk call deep inside the SQLite code but I don’t think the SQLite API exposes an async interface at that low level. I don’t think Python supports async for disk reads/writes in the standard library either.

  4. If it’s just the async Django cache methods that you’d like to use then the run_in_executor trick should work fine there. DiskCache needs to be updated for the Django 4.2 release too.
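The run_in_executor approach from points 1 and 4 can be sketched as follows. ExecutorProxy is a hypothetical helper, not part of DiskCache, and a plain dict stands in for the blocking cache object so the example runs with the standard library alone. Because the proxy builds each coroutine on demand in __getattr__, no method has to be duplicated by hand:

```python
import asyncio
import functools
from concurrent.futures import ThreadPoolExecutor


class ExecutorProxy:
    """Hypothetical helper: await any method of a blocking object."""

    def __init__(self, target, executor=None):
        self._target = target
        self._executor = executor  # None means the loop's default pool

    def __getattr__(self, name):
        # Only reached for names not set on the proxy itself.
        method = getattr(self._target, name)

        async def call(*args, **kwargs):
            loop = asyncio.get_running_loop()
            # run_in_executor takes positional args only, so bind first.
            bound = functools.partial(method, *args, **kwargs)
            return await loop.run_in_executor(self._executor, bound)

        return call


async def demo():
    store = {}  # stand-in for a blocking cache object
    with ThreadPoolExecutor(max_workers=2) as pool:
        cache = ExecutorProxy(store, executor=pool)
        await cache.update({"abc": 1, "xyz": 2})
        # Both lookups run on worker threads; the event loop stays free.
        return await asyncio.gather(
            cache.get("abc"), cache.get("missing", "default")
        )
```

The blocking call still occupies a thread for its full duration; the only gain is that the event loop can serve other tasks in the meantime, which matches the caveat about threading overhead in point 1.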
