-
Notifications
You must be signed in to change notification settings - Fork 501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
services/horizon/internal/db2/history: Improve effects for liquidity pool query #4065
services/horizon/internal/db2/history: Improve effects for liquidity pool query #4065
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but I wonder if the problem is connected to the index itself. We have:
"index_history_operation_liquidity_pools_on_ids" UNIQUE, btree (history_operation_id, history_liquidity_pool_id)
But we usually query this table like history_liquidity_pool_id = X AND history_operation_id >= Y
. So the fields in the index should be reversed with history_liquidity_pool_id
being the first field.
Nice finding!! I think we have a similar query for claimable balances (and maybe also accounts?). I think we should update those too |
@bartekn I added the reversed index you suggested in staging:
but it did not improve the query
EDIT: actually it did improve the query compared to the original but the improvement is not as significant as the rewritten query in this PR |
We do not have an endpoint for querying claimable balance effects. But, if we were to add one we definitely need to mimic the pattern in this PR. As for account effects, we do join on the accounts table. I'd need to do more testing to see if the query would be more performant without the join. |
I wonder if we can drop old and create new reversed index now that LP effects table is still small. Up to you! |
I think we can do it in the next release |
PR Checklist
PR Structure
otherwise).
services/friendbot
, orall
ordoc
if the changes are broad or impact manypackages.
Thoroughness
.md
files, etc... affected by this change). Take a look in the
docs
folder for a given service,like this one.
Release planning
needed with deprecations, added features, breaking changes, and DB schema changes.
semver, or if it's mainly a patch change. The PR is targeted at the next
release branch if it's not a patch change.
What
Currently, requests to
/liquidity_pools/{liquidity_pool_id}/effects
are timing out. Because the query to fetch effects for a given liquidity pool is running too slowly.This is the query and its explain analyze output:
The history_effects table does not have a foreign key which links to the liquidity pools table. However, the history_effects table does have an indexed column on operation id and we have a history_operation_liquidity_pools table which tracks operations occurring on liquidity pools:
Note that history_operation_liquidity_pools does not store the liquidity pool id as a string. Instead, there is a layer of indirection where we map liquidity pool id strings to integer ids in history_liquidity_pools.
Given all these tables we are able to construct the query to select effects for a given liquidity pool by joining history_effects with history_operation_liquidity_pools so that we can find all operation ids which involve our liquidity pool and then query for all effects occurring on those operation ids. We also need to join on history_liquidity_pools so we can determine the integer liquidity pool id from the given liquidity pool string.
The remaining part of the query is for handling paging logic (we page on heff.history_operation_id and heff.order).
I think the reason why the query is running slowly is because of the number of joins.
To speed up
/liquidity_pools/{liquidity_pool_id}/effects
I broke down the original select query into two separate queries which we execute in horizon.In the first query, we find all operation ids impacted by a given liquidity pool id string:
This query runs very quickly.
Next, we select all rows from the history_effects table which have a history_operation_id contained in the result of the previous query:
This query also runs very quickly. It appears that removing the joins on history_operation_liquidity_pools and history_liquidity_pools is what improved the query.
I also tried modifying the original query to use subqueries, eg:
However, those queries performed just as badly as the original query which used joins.