services/horizon: /order_book slower when reading from offers graph #1963

Closed
bartekn opened this issue Nov 20, 2019 · 7 comments

@bartekn
Contributor

bartekn commented Nov 20, 2019

What version are you using?

Horizon 0.23.1

What did you do?

I checked response times pre/post upgrade from 0.22.2 to 0.23.1. I was surprised that response times for /order_book are actually higher, even though 0.23.0 started using the in-memory offers graph. Count of responses with duration > 0.1 s:

[Screenshot 2019-11-20 at 23:51:31]

Not super urgent, because the p99 duration for this route is actually below 0.10-0.13 s, but it's worth checking why it's slower (deploy around 22:10):

[Screenshot 2019-11-20 at 23:58:04]

What did you expect to see?

Lower /order_book response times.

What did you see instead?

Higher /order_book response times.

@tamirms
Contributor

tamirms commented Jun 10, 2020

It seems that since the 0.23.1 release, the performance of the /order_book endpoint backed by the in-memory order book graph has improved significantly.

In the 1.4.0 release we started to query the Horizon DB to determine the order book spread instead of using the in-memory order book graph. It turns out that querying the DB is slower than using the in-memory graph, which is what we would expect. However, the response times are still very low using the Horizon DB.

[Screen Shot 2020-06-10 at 9:16:25 PM]

Before deploying Horizon 1.4.0, the average response time for /order_book requests was 5 ms. After 1.4.0 was deployed, the average response time increased to 9 ms.

Similarly, before deploying Horizon 1.4.0, the P99 response time was 10 ms. After 1.4.0 was deployed, the P99 response time increased to 20 ms.
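
For a sense of what the DB-backed read path might look like, here is a rough Go sketch. It assumes a simplified offers table with selling_asset, buying_asset, price, and amount columns; the schema, asset encoding, and SQL are illustrative assumptions, not Horizon's actual query.

```go
// Illustrative only: fetch one side of an order book directly from the DB.
// Table and column names are assumptions, not Horizon's actual schema.
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq" // Postgres driver; Horizon's DB is Postgres
)

type priceLevel struct {
	Price  float64
	Amount int64
}

// querySide returns the cheapest offers selling `selling` for `buying`.
// Bids and asks are the same query with the asset pair flipped.
func querySide(db *sql.DB, selling, buying string) ([]priceLevel, error) {
	rows, err := db.Query(
		`SELECT price, amount
		   FROM offers
		  WHERE selling_asset = $1 AND buying_asset = $2
		  ORDER BY price ASC
		  LIMIT 20`,
		selling, buying,
	)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var levels []priceLevel
	for rows.Next() {
		var l priceLevel
		if err := rows.Scan(&l.Price, &l.Amount); err != nil {
			return nil, err
		}
		levels = append(levels, l)
	}
	return levels, rows.Err()
}

func main() {
	db, err := sql.Open("postgres", "dbname=horizon sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	asks, err := querySide(db, "XLM", "USD")
	if err != nil {
		log.Fatal(err)
	}
	bids, err := querySide(db, "USD", "XLM")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("asks=%d bids=%d\n", len(asks), len(bids))
}
```

The extra round trips to Postgres (one per side of the book, per request) are presumably where the additional few milliseconds come from compared with reading an in-memory graph.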

@ire-and-curses
Member

Thanks for measuring this! For now this doesn't seem urgent to improve, since it's fast enough. If we did want to improve performance in the future, are there any obvious steps to take?

@bartekn
Contributor Author

bartekn commented Jun 10, 2020

Thanks for checking this @tamirms! Super interesting. Too bad it's hard to load the old data and check the metrics once again (maybe I did something wrong with my query). I zoomed out to check the 7d charts and found that the in-memory /order_book p99 and avg were slightly higher from time to time (still lower than the DB version), so I started wondering whether it has something to do with the type of queries (limit or market params actually affect response time):

[Screenshot 2020-06-11 at 00:44:51]

Anyway, I think the response times are great even with the DB version, so maybe we shouldn't revert it. It could be interesting to compare CPU profiles of 0.23.1 and 1.3.0; maybe that would give us some useful hints connected to graph optimization.

@ire-and-curses
Member

I'm against reverting, because this is big for us: i) it means we (or anyone) can scale request-serving Horizons horizontally through read-only replicas, and ii) it removes the need for request-serving Horizons to ingest, breaking the N-N mapping between Horizon and core. This is a big deal for fast txmeta scalability.

@bartekn
Contributor Author

bartekn commented Jun 10, 2020

@ire-and-curses sorry, I wasn't clear. I meant reverting to the code that uses the order book graph as a data source instead of the DB (the reading part), not reverting ingestion on front-end nodes (the writing part). Once the graph is updated by OrderBookStream it can be used by any Horizon component (in the previous code /order_book was using the same graph as /paths).
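
To illustrate the shared-graph point, here's a tiny sketch with placeholder types and trivial handler bodies (none of this is Horizon's actual code): both endpoints are constructed around the same graph instance, so whatever updates the graph serves both readers.

```go
// Illustrative only: two endpoints built around one shared graph value.
package main

import (
	"fmt"
	"log"
	"net/http"
)

// orderBookGraph stands in for the real in-memory graph (offers, locking, etc.).
type orderBookGraph struct{}

func orderBookHandler(g *orderBookGraph) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "order book read from the shared graph")
	}
}

func pathsHandler(g *orderBookGraph) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "path finding over the same shared graph")
	}
}

func main() {
	g := &orderBookGraph{} // updated elsewhere, e.g. by a polling stream
	http.Handle("/order_book", orderBookHandler(g))
	http.Handle("/paths", pathsHandler(g))
	log.Fatal(http.ListenAndServe(":8000", nil))
}
```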

@tamirms
Contributor

tamirms commented Jun 10, 2020

Sorry, I should have clarified that since Horizon 1.4.0 the order book graph is populated by polling the Horizon DB instead of via distributed ingestion (#2630). This means the order book graph is available on all the request-serving Horizon nodes, so we could go back to querying the order book graph instead of the DB to serve /order_book responses. Doing so would not require the request-serving Horizon nodes to participate in ingestion again.
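
A minimal sketch of that polling setup, with a hypothetical OffersGraph interface and loadOffers function standing in for the real OrderBookStream and DB query (these names are assumptions for illustration):

```go
// Illustrative only: a background goroutine keeps an in-memory graph fresh by
// polling the DB. OffersGraph and loadOffers are placeholders, not the actual
// OrderBookStream API.
package orderbook

import (
	"context"
	"log"
	"time"
)

type Offer struct {
	ID      int64
	Selling string
	Buying  string
	Price   float64
	Amount  int64
}

// OffersGraph is whatever in-memory structure the request handlers read.
type OffersGraph interface {
	Replace(offers []Offer)
}

// pollOffers re-reads the offers table on every tick and swaps the result
// into the graph, so request-serving nodes never have to ingest themselves.
func pollOffers(ctx context.Context, g OffersGraph, loadOffers func(context.Context) ([]Offer, error)) {
	ticker := time.NewTicker(5 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			offers, err := loadOffers(ctx)
			if err != nil {
				log.Printf("polling offers: %v", err)
				continue
			}
			g.Replace(offers)
		}
	}
}
```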

The reason I implemented order book queries using the Horizon DB in #2617 is that I thought reducing our dependency on the in-memory order book graph would make it easier to remove the frontend Horizon nodes from distributed ingestion. While implementing #2630 I realized that we could still have an in-memory order book graph without forcing the frontend nodes to participate in ingestion.

If we went back to querying the in-memory order book graph, I would want to check that the extra queries on the graph would not have a negative impact in terms of contention. A read-write lock ensures that the order book graph can be updated in a thread-safe manner while other goroutines are reading from it.
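
Roughly the pattern in question, with placeholder names rather than Horizon's actual types: the polling stream takes the write lock only for the swap, while any number of /order_book readers share the read lock, so the swap is the only point of contention.

```go
// Illustrative only: the write lock is held just long enough to swap in new
// offers, while /order_book readers share the read lock.
package orderbook

import "sync"

type Offer struct {
	Selling string
	Buying  string
	Price   float64
	Amount  int64
}

type OrderBookGraph struct {
	mu     sync.RWMutex
	offers []Offer
}

// Update is called by the polling stream; the write lock covers only the
// slice swap, so readers are blocked very briefly.
func (g *OrderBookGraph) Update(offers []Offer) {
	g.mu.Lock()
	g.offers = offers
	g.mu.Unlock()
}

// Asks is what an /order_book request would call; any number of goroutines
// can hold the read lock at the same time.
func (g *OrderBookGraph) Asks(selling, buying string) []Offer {
	g.mu.RLock()
	defer g.mu.RUnlock()
	var result []Offer
	for _, o := range g.offers {
		if o.Selling == selling && o.Buying == buying {
			result = append(result, o)
		}
	}
	return result
}
```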

If we ever want to join information from the order book DB query with data found in other tables, that will be easier than combining in-memory order book graph queries with DB queries.

bartekn removed this from the Horizon 1.4.0 milestone Jun 22, 2020
@bartekn
Copy link
Contributor Author

bartekn commented Mar 3, 2021

Closing now as /order_book works fine. We can reopen in the future.

bartekn closed this as completed Mar 3, 2021