Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify MapAsyncIterable using async generator semantics #197

Conversation

kristjanvalur
Copy link
Contributor

@kristjanvalur kristjanvalur commented Apr 21, 2023

This PR rewrites the MapAsyncIteable using simple native async primitives

Problem

The existing implementation of MapAsyncIterable has several problems, mostly to do with aclose() semantics.
This has become evident in projects such as Strawberry-graphql
where calling aclose() on a subscription sometimes behaves in unexpected manner. Fundamentally this is caused by the unnecessarily complex implementation, containing inner Task objects (created by ensure_future()) created for each iterated value.

A concrete problem which sometimes manifests itself is that calling aclose() on this iterator results in a RuntimeError. The reason is that the actual __anext()__ call is currently performed in a separate Task. Even when that task is cancelled it may still be pending

The iterable tries to look like an AsyncGenerator but fails to correctly implement the changes of aclose() which
added in Python 3.8, where such an iterator is not allowed to be re-entrant.

Ultimately, the class-based implementation is unnecessarily complex for something as fundamentally simple as iterating over, and mapping, another iterable.

Changes

This PR rewrites this iterator using an AsyncGenerator, thus ensuring that simple and understandable internal mechanisms
are used. It avoids any use of synchronization primitives and Task objects and simply iterates over the results. It provides the extended aclose() optional semantics for the inner iterable as expected.

We remove unnecessary unit tests for this class and fix the ones making strange assumptions, such as ones making it allowable to ignore GeneratorExits and yield data. Also a test for un-closing an iterator is removed, since that is a completely un-pythonic idea.

Further improvements

Further enhancements might be:

  • remove the athrow() and is_closed members which serve no purpose
  • implement this method entirely as an Async Generator, and drop the special type, which only complicates code. The relevant checks in graphql-core could then use the inspect.isasyncgen() method instead.

@kristjanvalur kristjanvalur changed the title Rewrite MapAsyncIterable using async generator semantics Simplify MapAsyncIterable using async generator semantics Apr 21, 2023
@Cito
Copy link
Member

Cito commented Apr 22, 2023

Thanks a lot @kristjanvalur. I always wanted to revisit this part of the code anyway. Will definitely look into this, but please have some patience since I'm currently busy with other things.

@Cito Cito self-assigned this Apr 22, 2023
@Cito Cito added the investigate Needs investigaton or experimentation label Apr 22, 2023
@kristjanvalur
Copy link
Contributor Author

Also, I think it is prudent for you to review the semantics here. I see that this has recently been renamed to "Iterable" rather than "Iterator". I am not sure that this is wise. A list is an "iterable", but a list iterator is, while technically also an iterable, is an iterator. The concept of an iterable is something which can be iterated over multiple times.
In the sense of subscriptions, this makes no sense. (You don't want to execute the same subscription multiple times)
If something has a __next__() method, it is primarily an iterator. (same goes for async)

@kristjanvalur
Copy link
Contributor Author

By the way, one of the bad side effects of the previous implementation is that it makes debugging hard. The __anext__() method is executed in its own Task which means that the call chain is broken.

@Cito
Copy link
Member

Cito commented Apr 23, 2023

Also, I think it is prudent for you to review the semantics here. I see that this has recently been renamed to "Iterable" rather than "Iterator". I am not sure that this is wise.

I was mirroring that change in Change.js here. I think JavaScript was inspired by Python regarding iterables, so the semantics is pretty similar. The reason for the renaming was that the class has an __aiter__ method. So it is not only an Iterator, but also an Iterable (similar for JavaScript).

Remember that the goal of GraphQL.js is to be a pretty much 1:1 translation of GraphQL.js to Python. I try to avoid doing or naming things differently, because only by staying close to the original it is sustainable to maintain the library and keep it up to date with the current developments.

For instance, I'm currently working on getting the @defer/@stream directives from GraphQL.js into GraphQL.core which is quite a big feature which requires many changes in the execution module. I will look into your PR once I'm done with that big change, and make sure it's compatible.

By the way, one of the bad side effects of the previous implementation is that it makes debugging hard. The __anext__() method is executed in its own Task which means that the call chain is broken.

Agreed, that's another good reason to change it.

@kristjanvalur
Copy link
Contributor Author

Yes, no rush at all. I'll probably implement workarounds in strawberry anyway.

I also had time to think it over and I don't think my comments on iterables/iterators are necessarily correct, I guess I got caught up in the moment. Cheers!

@kristjanvalur
Copy link
Contributor Author

Closing this in favor of #199 , now that 62749e5 has been committed. There is no need to keep the MapAsyncIterable class around.

@Cito
Copy link
Member

Cito commented May 28, 2023

Thanks. Will try to get #199 into the next pre-release (very soon).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
investigate Needs investigaton or experimentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants