services/horizon: /order_book slower when reading from offers graph #1963
Comments
It seems that since the 0.23.1 release the performance of the /order_book endpoint with the in-memory order book graph has improved significantly. In the 1.4.0 release we started to query the Horizon DB to determine the order book spread instead of using the in-memory order book graph. It turns out that querying the DB is slower than using the in-memory graph, which is what we would expect. However, the response times are still very low using the Horizon DB. Before deploying Horizon 1.4.0 the average response time for /order_book requests was 5ms. After 1.4.0 was deployed the average response time increased to 9ms. Similarly, before deploying Horizon 1.4.0, the P99 response time was 10ms. After 1.4.0 was deployed the P99 response time increased to 20ms.
Thanks for measuring this! For now this seems not urgent to improve since it is fast enough. If we did want to improve performance in the future, are there any obvious steps to take?
Thanks for checking this @tamirms! Super interesting. Too bad it's hard to load the old data and check the metrics once again (maybe I did something wrong with my query). I zoomed out to check the 7d charts and found that in-memory

Anyway, I think the response times are great even with the DB version, so maybe we shouldn't revert it. It could be interesting to compare CPU profiles of 0.23.1 and 1.3.0. Maybe that will give us some useful hints connected to graph optimization.
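On comparing CPU profiles: here's a minimal, hedged sketch of how that could be done with Go's standard tooling. It assumes the process exposes the stock net/http/pprof handlers on a side port; the port and the way Horizon actually wires up its admin/profiling endpoints may differ.

```go
// Package profiling shows a generic way to expose Go's pprof endpoints.
// This is not Horizon's actual admin setup; it is an illustrative sketch.
package profiling

import (
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* on the default mux
)

// StartPprof serves the pprof handlers on a hypothetical side port.
// A 30-second CPU profile can then be captured with:
//
//	go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30
//
// Two captures taken under similar load (e.g. one from 0.23.1 and one from
// 1.3.0) can be compared with pprof's -diff_base flag.
func StartPprof() {
	go func() {
		_ = http.ListenAndServe("localhost:6060", nil)
	}()
}
```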
I'm against reverting because this is big for us: i) it means we (or anyone) can scale request-serving Horizons horizontally through read-only replicas, and ii) it removes the need for request-serving Horizons to ingest, breaking the N-N mapping between Horizon and core. This is a big deal for fast txmeta scalability.
@ire-and-curses sorry, I wasn't clear. I meant reverting to the code that uses the order-book graph as a data source instead of the DB (the reading part), not reverting ingestion on front-end nodes (the writing part). Once the graph is updated by
Sorry, I should have clarified that, since Horizon 1.4.0, the order book graph is populated by polling the Horizon DB instead of via distributed ingestion (#2630). This means the order book graph is available on all the request-serving Horizon nodes, so we could go back to querying the order book graph instead of the DB to serve /order_book responses. Doing so would not require the request-serving Horizon nodes to participate in ingestion again.

The reason I implemented order book queries against the Horizon DB in #2617 is that I thought reducing our dependency on the in-memory order book graph would make it easier to remove the frontend Horizon nodes from distributed ingestion. While implementing #2630 I realized that we could still have an in-memory order book graph without forcing the frontend nodes to participate in ingestion.

If we went back to querying the in-memory order book graph, I would want to check that the extra queries on the graph would not have a negative impact in terms of lock contention. A read-write lock ensures that the order book graph can be updated in a thread-safe manner while other goroutines are reading from it (see the sketch below).

If we ever want to join information from the order book DB query with data found in other tables, that will be easier than combining in-memory order book graph queries with DB queries.
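For illustration, a minimal sketch of that pattern. The types and names (OrderBookGraph, Offer, loadOffers) are hypothetical stand-ins rather than Horizon's actual code; it just shows a polling goroutine refreshing an in-memory graph under the write lock while request handlers read under the shared lock.

```go
// Package orderbook sketches an in-memory offers graph refreshed by polling a
// DB, guarded by a sync.RWMutex so reads and the periodic update are safe.
package orderbook

import (
	"sync"
	"time"
)

// Offer is an illustrative stand-in for a row from the offers table.
type Offer struct {
	Selling string
	Buying  string
	Price   float64 // units of Buying per unit of Selling (simplified)
	Amount  int64
}

// OrderBookGraph holds offers in memory and is safe for concurrent use:
// many readers can query it while a single writer refreshes it.
type OrderBookGraph struct {
	mu     sync.RWMutex
	offers []Offer
}

// Update replaces the in-memory snapshot under the write lock.
func (g *OrderBookGraph) Update(offers []Offer) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.offers = offers
}

// BestBidAsk computes an illustrative spread for a pair under the read lock,
// so concurrent /order_book requests never block each other.
func (g *OrderBookGraph) BestBidAsk(selling, buying string) (bid, ask float64) {
	g.mu.RLock()
	defer g.mu.RUnlock()
	for _, o := range g.offers {
		// Ask: cheapest offer selling `selling` for `buying`.
		if o.Selling == selling && o.Buying == buying && (ask == 0 || o.Price < ask) {
			ask = o.Price
		}
		// Bid: best offer on the reverse side, converted to the same quote.
		if o.Selling == buying && o.Buying == selling && (bid == 0 || 1/o.Price > bid) {
			bid = 1 / o.Price
		}
	}
	return bid, ask
}

// pollDB stands in for the "populate the graph by polling the Horizon DB"
// approach from #2630; loadOffers is a hypothetical query function.
func pollDB(g *OrderBookGraph, loadOffers func() []Offer, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for range ticker.C {
		g.Update(loadOffers())
	}
}
```

On the contention question: RLock holders don't block each other, so the main cost of going back to the graph would be brief stalls while Update holds the write lock during each poll.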
Closing now as |
What version are you using?

Horizon 0.23.1

What did you do?

I checked response times pre/post upgrade from 0.22.2 to 0.23.1. I was surprised that response times for `/order_book` are actually higher even though 0.23.0 started using the in-memory offers graph. Count of responses with `duration > 0.1`:

Not super urgent because p99 of `duration` for this route is actually below 0.10-0.13, but it's worth checking why it's slow (deploy around 22:10):

What did you expect to see?

Smaller `duration` of `/order_book` responses.

What did you see instead?

Higher `duration` of `/order_book` responses.