Any plans to give an easy way to make DSL compatible with elasticsearch_async #556

pfreixes · 2017-01-12T00:13:44Z

Hi,

I've seen that you have been working on a project [1] to proxy the official client in aims to achieve a pure asynchronous behavior fully compatible with asyncio. I was wondering if there is any plan to do the same for the DSL project.

Even though is possible to use an asynchronous connection, as the following snippet shows, the principal interface implementation does not allow the developer to use the await statement to get the response. The intermediate object Response is not designed to do so.

client = AsyncElasticsearch(hosts=['localhost'])
async def test():
    s = Search(using=client, index="my-index") \
        .filter("term", category="search") \
        .query("match", title="python")   \
        .query(~Q("match", description="beta"))

    response = await s.execute()

Is there an plan to implement all of this stuff necessary to make the library compatible with asynchronous environments ?

[1] https://github.com/elastic/elasticsearch-py-async

The text was updated successfully, but these errors were encountered:

a-urth · 2017-01-12T10:53:05Z

Some thoughts - actually it mighty be possible without breaking api. if there was a way that search/request object won't use client/connection inside it and be operated just for query construction and not execution, then it would be possible to use it with any other client, for example async one. And we could just wrap response from client in the same manner Search object does it. Of course open question is scan and some other features, but it seems to be easier and faster than just forking and reimplementing api in async manner.

honzakral · 2017-01-12T16:00:46Z

I currently have no plan to implement this purely for time- and expertise- constraints. I would be very happy to help review and merge a solution if someone came up with it.

I think it should just be a matter of implementing a subclass of Response that will be able to work with the future returned by AsyncElasticsearch.search. You can, even now, override the class used on a per request basis by calling the response_class method (0).

I think the proper way to implement this would be to create AsyncSearch class that would default to such AsyncResponse and perhaps add corresponding async_search methods to DocType and Index classes. My only concern (from my lack of expertise in asyncio) is how easy it would be to make this python 2 compatible since I am not ready (though I would love to! :) ) to make elasticsearch-dsl python 3 only.

0 - http://elasticsearch-dsl.readthedocs.io/en/latest/api.html#elasticsearch_dsl.Search.response_class

pfreixes · 2017-01-12T16:34:28Z

The compatibility can reach more or less easily. As an example have a look to this MR [1].

As you said the idea is work with an adhoc class for the response_class in aims to deal with the asynchronous client, handling either Futures - just thinking in elasticsearch_async - or coroutines.

Maybe the tricky thing here is how to get support for that without replace the Search for a new class that it might be annoying for developers.

Let me think a bit and I will try to come back with a solution.

[1] https://github.com/rkern/line_profiler/pull/75/files

pfreixes · 2017-01-12T23:31:52Z

I have a proof of concept that mixes asyncronous and syncronous behaviour using an intermediate Future when the object returned by the client is awaitable. Just as a resume the execute function might get the following shape:

    def execute(self, ignore_cache=False):
        """
        Execute the search and return an instance of ``Response`` wrapping all
        the data.

        :arg response_class: optional subclass of ``Response`` to use instead.
        """
        if ignore_cache or not hasattr(self, '_response'):
            es = connections.get_connection(self._using)
            response = es.search(
                index=self._index,
                doc_type=self._doc_type,
                body=self.to_dict(),
                **self._params
            )

            if inspect.isawaitable(response):
                f = Future()
                def _build_response(task):
                    self._response = self._response_class(
                        self,
                        response.result()
                    )
                    f.set_result(self._response)
                response.add_done_callback(_build_response)
                return f
            else:
                self._response = self._response_class(
                    self,
                    response
                )
        return self._response

Leaving aside readability u other stuff this code allows the library to keep the interface without the need of create derivated classes that will be used ad-hoc for asynchronous environments.

Then what ? Refine a bit the code and cover all of these places where there is an interaction with the client to test if the object returned is awaitable or not. Worth mentioning that the awaitable function is available from Python 3.5, since this version asyncio API almost didn't change, in fact in Python 3.6 has been marked as unbreakable and Python community is gonna keep the contract, something that didn't happen with older versions.

My suggestion here is kept this implementation only for Python 3.5, previous versions since 2.X to 3.4 will work with the regular and synchronous implementation only. It can be achieved with a simple test of the Python version and introducing a if statement to decide if the asynchronous version is supported or not.

And last but not least, I should be able to prepare a proper MR the following days.

Cheers,

pfreixes · 2017-01-15T08:21:34Z

@honzakral I have a proposal, follow this link [1].

This commit only enables the support for asynchronous drivers for the Search.execute method, later when we will come up with an agreement the whole library can be covered to get support for asynchronous clients.

Basically, the commit incorporates the following ideas:

A new constant called ASYNC_SUPPORTED created
Different function implementation having in mind if the library is running in a ASYNC_SUPPORTED or not.
The use of none compatible statements such as async, await is forbidden in the most upper layer. If these are needed, hide the use of them to another file and import if ASYNC_SUPPORTED is True
All asynchronous tests are placed at _test_async.py to avoid be gathered by the runner.
Tests from _test_async are incorporated to each module if and only if ASYNC_SUPPORTED is True

Comments will be wellcomed.

[1] https://github.com/Skyscanner/elasticsearch-dsl-py/commit/814fff7d953362b3d16915b986e7eba2cf574721

pfreixes · 2017-01-22T08:25:32Z

This last commit [1] improves a bit the current implementation, right now there is no need to implement two different versions of each ES-Dsl function that needs to deal with both client implementations. As a result, maintainability becomes easier than before.

[1] https://github.com/Skyscanner/elasticsearch-dsl-py/commit/be83cb2b2c15c64c4bcd45fc14bdf0705f0350ad

pfreixes · 2017-01-22T17:58:04Z

I have a couple of red flags. Not a big deal but they must be resolved.

First of all about the helpers, such as scan one implemented by the elasticsearch-py synchronous implementation and not implemented by elasticsearch_async. I can see that ES-dsl publishes an interface that uses this method. Seeing that, what I can do here is just publish a new PR to the elasticsearch_async project to get support for the asynchronous version of this method. Even thoguth won't be a bad idea implement all helpers in asynchronous flavor, but it will take time and can be done later in other MR.

About integration tests, first of all, how can I run them? Once say that, will be nice implement a minimum set of integration tests using an asynchronous client which they will cover the interfaces that are compatible with asynchronous clients, explicitly using elasticsearch_async client. I guess that some of the dependencies to - again - helpers can be implemented in that integration tests.

Any thoughts?

sethmlarson · 2020-08-10T22:02:09Z

Closing this as a duplicate of #1355

a-urth mentioned this issue Feb 20, 2017

Asyncio support anti-social/elasticmagic#26

Closed

honzakral mentioned this issue Jan 1, 2018

Async branch of this library? #704

Closed

vaidsu mentioned this issue Jun 20, 2018

Integration of official async ES client into DSL library #921

Closed

sethmlarson closed this as completed Aug 10, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Any plans to give an easy way to make DSL compatible with elasticsearch_async #556

Any plans to give an easy way to make DSL compatible with elasticsearch_async #556

pfreixes commented Jan 12, 2017

a-urth commented Jan 12, 2017

honzakral commented Jan 12, 2017

pfreixes commented Jan 12, 2017

pfreixes commented Jan 12, 2017

pfreixes commented Jan 15, 2017 •

edited

Loading

pfreixes commented Jan 22, 2017

pfreixes commented Jan 22, 2017 •

edited

Loading

sethmlarson commented Aug 10, 2020

Any plans to give an easy way to make DSL compatible with elasticsearch_async #556

Any plans to give an easy way to make DSL compatible with elasticsearch_async #556

Comments

pfreixes commented Jan 12, 2017

a-urth commented Jan 12, 2017

honzakral commented Jan 12, 2017

pfreixes commented Jan 12, 2017

pfreixes commented Jan 12, 2017

pfreixes commented Jan 15, 2017 • edited Loading

pfreixes commented Jan 22, 2017

pfreixes commented Jan 22, 2017 • edited Loading

sethmlarson commented Aug 10, 2020

pfreixes commented Jan 15, 2017 •

edited

Loading

pfreixes commented Jan 22, 2017 •

edited

Loading