Too slow query fetch #150
Comments
@umax In order to optimize query performance, you need to create composite indexes for the various queries your application makes. Per the Datastore docs on index configuration:
|
@tseaver are the provided composite indexes not enough for my query? |
@umax Firestore doesn't auto-provide composite indexes: it only has indexes on individual fields. You have to define composite indexes manually. |
@tseaver in the description of this issue I provided the list of composite indexes from |
Possibly related to #145 |
What I did: I created a composite index and use |
@umax I'm sorry I missed seeing the index definition in your initial description. I wonder if the second index (the "descending" one) is causing the back-end to choke? |
Hi, I guess we've run into the same issue (tested on 2.1.0 and several older versions). We have a query that returns 30 entities (no filter, no order). Their combined size is less than 200 KB. Fetching them from an App Engine F2 instance takes close to 0.8 seconds. To debug this, I split the fetch by running the query keys-only (which is fast!) and then calling get_multi separately. As far as I can tell, the time is spent in client.py:547: copying these 30 protobufs into entities takes 0.6 of the 0.8 seconds total. I also iterated over the list of protobufs beforehand and cast each PB to a string, to ensure there is no RPC in flight being waited on inside entity_from_protobuf. To me it looks entirely CPU-bound. Our entities contain several lists and nested entities; maybe this contributes to the slowdown, since I can't see anything obviously wrong with the entity_from_protobuf function. |
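The two-phase measurement described in this comment can be sketched roughly as follows. This is only an illustration, not the commenter's actual script: the helper names `timed` and `fetch_in_two_phases` are made up here, and `query` and `client` are assumed to behave like a google-cloud-datastore `Query` and `Client` (only `keys_only()`, `fetch()` and `get_multi()` are used). Recording CPU time next to wall-clock time is an added assumption, meant to help tell CPU-bound work from time spent waiting on an RPC.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store (wall_seconds, cpu_seconds) for the enclosed block in results[label]."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        results[label] = (
            time.perf_counter() - wall_start,
            time.process_time() - cpu_start,
        )


def fetch_in_two_phases(query, client):
    """Run the query keys-only, then load the entities, timing each phase."""
    timings = {}
    query.keys_only()  # restrict the query to returning keys only
    with timed("keys_only_query", timings):
        keys = [entity.key for entity in query.fetch()]
    with timed("get_multi", timings):
        entities = client.get_multi(keys)
    return entities, timings


# Hypothetical usage (needs credentials and network access):
#   from google.cloud import datastore
#   client = datastore.Client()
#   entities, timings = fetch_in_two_phases(client.query(kind="Order"), client)
#   for label, (wall, cpu) in timings.items():
#       print(f"{label}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

If the CPU time of the `get_multi` phase is close to its wall-clock time, the cost is most likely in client-side conversion rather than in the network round trip.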
Relatedly, I've been exploring some query slowness here. I'm going to add @tsteinruecken's findings to my search for the culprit. |
I have written a script to test
$ python3.8 -m venv /tmp/datastore-1.15.3
$ /tmp/datastore-1.15.3/bin/pip install --upgrade setuptools pip wheel
...
Successfully installed pip-21.0.1 setuptools-56.0.0 wheel-0.36.2
$ /tmp/datastore-1.15.3/bin/pip install "google.cloud-datastore==1.15.3"
...
Successfully installed cachetools-4.2.1 certifi-2020.12.5 chardet-4.0.0 google-api-core-1.26.3 google-auth-1.28.1 google-cloud-core-1.6.0 google.cloud-datastore googleapis-common-protos-1.53.0 grpcio-1.37.0 idna-2.10 packaging-20.9 protobuf-3.15.8 pyasn1-0.4.8 pyasn1-modules-0.2.8 pyparsing-2.4.7 pytz-2021.1 requests-2.25.1 rsa-4.7.2 six-1.15.0 urllib3-1.26.4
$ python3.8 -m venv /tmp/datastore-2.1.0
$ /tmp/datastore-2.1.0/bin/pip install --upgrade setuptools pip wheel
...
Successfully installed pip-21.0.1 setuptools-56.0.0 wheel-0.36.2
$ /tmp/datastore-2.1.0/bin/pip install "google-cloud-datastore==2.1.0"
...
Successfully installed cachetools-4.2.1 certifi-2020.12.5 chardet-4.0.0 google-api-core-1.26.3 google-auth-1.28.1 google-cloud-core-1.6.0 google-cloud-datastore-2.1.0 googleapis-common-protos-1.53.0 grpcio-1.37.0 idna-2.10 libcst-0.3.18 mypy-extensions-0.4.3 packaging-20.9 proto-plus-1.18.1 protobuf-3.15.8 pyasn1-0.4.8 pyasn1-modules-0.2.8 pyparsing-2.4.7 pytz-2021.1 pyyaml-5.4.1 requests-2.25.1 rsa-4.7.2 six-1.15.0 typing-extensions-3.7.4.3 typing-inspect-0.6.0 urllib3-1.26.4
$ /tmp/datastore-1.15.3/bin/python compare_perf_issue_1_15_3.py
Time: 1.3033227920532227
$ /tmp/datastore-2.1.0/bin/python compare_perf_issue_1_15_3.py
Time: 13.733032941818237
Those times are pretty repeatable. |
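The comparison script itself is not shown in this thread. As a rough sketch of how repeatability of such timings could be checked, the helper below runs a callable several times and summarizes the durations. The names `benchmark` and `summarize` are hypothetical, and the workload in the `__main__` block is a placeholder rather than a Datastore call.

```python
import statistics
import time


def benchmark(func, repeats=5):
    """Call func() `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (min, mean, stdev); stdev is 0.0 when there is a single sample."""
    spread = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return min(durations), statistics.mean(durations), spread


if __name__ == "__main__":
    # Placeholder workload; in a real comparison this would be the query fetch.
    runs = benchmark(lambda: sum(range(100_000)))
    fastest, mean, spread = summarize(runs)
    print(f"Time: min={fastest:.4f}s mean={mean:.4f}s stdev={spread:.4f}s")
```

Running the same file under each virtualenv's interpreter would give per-version figures; a small standard deviation relative to the mean suggests the difference between versions is not noise.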
More efficiently uses proto-plus wrappers, as well as inner protobuf attribute access, to greatly reduce the performance costs seen in version 2.0.0 (which stemmed from the introduction of proto-plus). The size of the performance improvement scales with the number of attributes on each Entity, but in general, speeds once again closely approximate those from 1.15. Fixes #145 Fixes #150
Note that the merge of @craiglabenz's PR #155 improves the time for Datastore 2.x on my benchmark, from being a full 10x slower than Datastore 1.15.3 to about 3x slower. |
@tseaver could you please reopen this issue? I don't consider it resolved at all. |
/cc @craiglabenz, @crwilcox |
Hi, team!
Imagine we have an entity Order with customer_id (str, uuid4), created_at (datetime) and other fields. I want to get all orders for a specified customer and time range. The query returns about 1500 records.
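A query of this shape might look roughly like the sketch below. It is not the code from the original report: the function name `fetch_orders` and the half-open time range are assumptions, and `client` is assumed to behave like `google.cloud.datastore.Client` (only `query()`, `add_filter()`, `order` and `fetch()` are used).

```python
from datetime import datetime, timezone


def fetch_orders(client, customer_id, start, end):
    """Return all Order entities for one customer with start <= created_at < end.

    `client` is expected to behave like google.cloud.datastore.Client.
    """
    query = client.query(kind="Order")
    # Equality filter on the customer, range filter on the timestamp.
    query.add_filter("customer_id", "=", customer_id)
    query.add_filter("created_at", ">=", start)
    query.add_filter("created_at", "<", end)
    # An equality filter combined with a range filter on another property
    # is served by a composite index on (customer_id, created_at).
    query.order = ["created_at"]
    return list(query.fetch())


# Hypothetical usage (needs credentials and network access):
#   from google.cloud import datastore
#   client = datastore.Client()
#   orders = fetch_orders(
#       client,
#       "<customer uuid>",
#       datetime(2021, 1, 1, tzinfo=timezone.utc),
#       datetime(2021, 2, 1, tzinfo=timezone.utc),
#   )
```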
The problem: it takes about 7-8 seconds!
Environment details
python:3.8-slim
google-cloud-datastore version: 2.0.1
Datastore indexes:
kind: Order
properties:
direction: asc
kind: Order
properties:
direction: desc
Code example
What am I doing wrong, or can you suggest how to speed up this query?
Thanks!