-
Notifications
You must be signed in to change notification settings - Fork 803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
async support #1714
async support #1714
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
- Focusing on Search first is a good idea
- I like that most methods are not doing I/O and can be in BaseSearch (which should be named
SearchBase
which is the convention in this project) - Since this code is not generated like in elasticsearch-py, lint should check that
unasync(_async/search.py) == _sync/search.py
. The alternative is to not store_sync/search.py
on disk and use unasync's setuptools integration. - We need to use unasync on tests too, and the lint/docs builds need to work.
- Should we keep
search.py
so thatfrom elasticsearch_dsl.search import Search
still works? It's unfortunately widely used
Do you mean to add this as a check, to make sure the code in
Yes, I will look at that next.
Right, I noticed that the search.py module is often referenced explicitly, so I thought it should be an alias for the generated |
Yes, exactly, we want to make sure in
Most of those results are from forks of a discontinued Mozilla products (zamboni and kuma) but who knows how widely used it is in private code. That said, I don't want to encourage this usage. Not sure if we can emit a warning on |
e89e2a3
to
0f11586
Compare
@pquentin Making good progress on this. See the top issue comment for the completed tasks. I will try to get through the document tests and see if that is a good stop point for this PR, depending on how much is left. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good, thanks! I have reviewed the general approach so far, not yet the nitty gritty detail of what is sync, async or shared. But I'm not expecting any big issues there.
The next steps:
- opening a pull request to drop Python 3.7 support (I can do that if you'd prefer)
- add an async page to the docs, probably close to https://elasticsearch-py.readthedocs.io/en/v8.12.1/async.html (with general explanations around installation and usage and a complete API Reference at the end)
- explain how to contribute code in CONTRIBUTING.rst
@@ -11,3 +11,4 @@ filterwarnings = | |||
error | |||
ignore:Legacy index templates are deprecated in favor of composable templates.:elasticsearch.exceptions.ElasticsearchWarning | |||
ignore:datetime.datetime.utcfromtimestamp\(\) is deprecated and scheduled for removal in a future version..*:DeprecationWarning | |||
asyncio_mode = auto |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep the default (strict mode) as this is more explicit and will help to add Trio support in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how would that work with unasync. This is one example in which the code would need to be different. Async tests need a decorator to declare them as async, but that decorator needs to be removed in the sync conversion.
Example:
_async version:
@pytest.mark.asyncio
async def test_some_code():
res = await library.do_something()
assert b"expected result" == res
_sync version:
def test_some_code():
res = library.do_something()
assert b"expected result" == res
How do I tell unasync to remove the decorator when doing this conversion? The auto mode solves this, because neither async nor sync need a decorator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, unasync only replaces tokens and @pytest.mark.asyncio
is tokenized with three OP
(@
, .
and .
) and three NAME
(pytest
, mark
, asyncio
). Asking unasync to look ahead sounds overly complex for this use case.
We could add our own marker (sync
) that would not do anything, but it would be confusing and would prevent from even importing the asyncio module. Maybe that's good if we do want to add Trio support because anything fancy would require AnyIO anyway.
We could add the marker using pytestmark = pytest.mark.asyncio
at the top of the file and then modify run-unasync.py
to remove that line for sync tests, and ask unasync to translate it to pytest.mark.trio
for Trio tests.
In any case, that's too complicated for the current pull request, and can be left for future work. Resolving this, thanks.
@pquentin Some more updates:
Will work on the docs next! |
There's some metaclass magic that I don't understand, but this is exposed through
The second option seems better. Is there any disadvantage I'm missing?
Right, I do plan to read everything at least once, but I'm bound to miss things given the size of the pull request. Which is why I'm procrastinating :) |
I have yet to convert the examples, so I'll test how analysis is used and determine if we need to do a conversion. None of the tests seemed to need an async analysis.py, but like you, I don't fully understand what's going on with this module. For the faceted search class I agree with you that unasync is overkill. We just need to add the search class as an argument, and then we can create a Update on the faceted search: I missed that there is an |
@pquentin This PR is ready for review. |
I'm going to review in the same order as you did your work, and the checklist will be in this comment.
|
- remove unused iter/aiter in unasync conversion table - improve contributing documentation wrt _async subdirectories - remove some code duplication in async test fixtures
5cfffac
to
2468e8a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed a third of the files now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed another third of the files: docs, examples and their tests are left.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed everything now! Thanks again for this massive body of work.
Examples
I believe examples should be using unasync too. I'm already seeing divergences (such as yield from
, moving away from @property
or closing connections) that should not be here, and they will only grow in the future. This would require:
- having a script (run-unasync.py or a wrapper) to replace
asyncio.run(main())
withmain()
and removeimport asyncio
, since unasync can't do it itself. - checking we can generate sync examples from async example just like the rest of the files with CI
Docs
Regarding documentation, we should mention async/await support in the introduction and have an async example in index.rst
@pquentin I think I have addressed everything except the unasync for the examples. Please have a look at the new structure for the docs with a separate reference page for the async classes. I will look at the examples next week. I would like to avoid having to change the locations of the sync examples, I'm going to need a special unasync rule to handle going from |
The current navigation feels off: We have two pages about asyncio, one with text and the other with the reference API. And the async reference API has a different name from the sync reference API. Maybe we should have the following navigation instead?
The section names come from https://diataxis.fr/. Here's an example of Sphinx documentation using sections in the navigation: https://esrally.readthedocs.io/en/stable/. Since the links don't reference the section, I believe all existing links will continue to work.
I don't believe this was addressed yet. |
I like the two-level heading organization much better. And you are correct that I forgot to add the async mention and example in the introduction. Will do that as well. |
b8a6b09
to
9c8e63d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new doc navigation is amazing, thanks! And thanks for working on the examples too.
Looks good to me.
subprocess.check_call(["black", "--target-version=py38", output_dir]) | ||
subprocess.check_call(["isort", output_dir]) | ||
for file in glob("*.py", root_dir=dir[0]): | ||
# remove asyncio from sync files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few doubts about 1/ doing this for all files and not only examples and 2/ using a sed command carefully crafted to work on both Linux and macOS. This seems fragile, but it works, so let's do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, 100% agree on the fragility. But I don't feel doing this for all files is a problem, since what we are doing is the correct thing to do as a general case. I could actually make the regex in the sed a bit smarter and capture the function name that is passed to asyncio.run()
instead of having it hardcoded. I would think this type of transformation would make sense to add to unasync (although not with sed!).
All public classes have async versions now. The documentation navigation was also updated to follow the Diátaxis documentation framework.
All public classes have async versions now. The documentation navigation was also updated to follow the Diátaxis documentation framework. (cherry picked from commit b50d538)
All public classes have async versions now. The documentation navigation was also updated to follow the Diátaxis documentation framework. (cherry picked from commit b50d538) Co-authored-by: Miguel Grinberg <[email protected]>
Async support. Tasks:
Closes #1480, #1672