services/horizon/db2/history: Improve /trades DB queries performance #2869

Merged: 3 commits merged into stellar:master on Jul 30, 2020


@bartekn bartekn commented Jul 29, 2020

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • [x] I've updated any docs (developer docs, .md files, etc.) affected by this
    change. Take a look in the docs folder for a given service, like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

This commit improves the performance of queries generated by TradesQ, which is used by the /trades endpoint.

First, a quick fix. A new multicolumn index was added on base_asset_id, counter_asset_id, history_operation_id, order. It supports TradesForAssetPair queries and cut their execution time from ~5s on average in the SDF cluster to a few ms. Adding the index took around 2 minutes in the SDF cluster (with full history).
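
As a rough illustration of the index described above, here is a minimal Go sketch that builds the corresponding statement. The table name `history_trades` and the index name `htrd_pair_idx_example` are assumptions made for this example and are not taken from the PR; only the column list comes from the description. Note that `order` is a reserved word in SQL, so it has to be quoted.

```go
package main

import (
	"fmt"
	"strings"
)

// createIndexSQL builds a CREATE INDEX statement for a multicolumn index.
// Column names that are reserved words (such as "order") must be passed
// already wrapped in double quotes.
func createIndexSQL(name, table string, columns []string) string {
	return fmt.Sprintf("CREATE INDEX %s ON %s (%s)", name, table, strings.Join(columns, ", "))
}

func main() {
	// Index and table names are hypothetical; the columns follow the PR text.
	fmt.Println(createIndexSQL(
		"htrd_pair_idx_example",
		"history_trades",
		[]string{"base_asset_id", "counter_asset_id", "history_operation_id", `"order"`},
	))
}
```

The column order matters: the equality-filtered asset pair columns come first, followed by the columns used for ordering, so a single index scan can serve both the filter and the sort.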

Second, I rewrote queries that contain base_* = X OR counter_* = X conditions as UNIONs, specifically ForAccount and ForOffer. The previous query was using the wrong index (htrd_pid, which required a lot of filtering), and adding new indices (base_ + history_operation_id + order) didn't really help (I believe because of the OR condition). The new query brought execution times down to milliseconds.
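
The OR-to-UNION rewrite can be sketched roughly as follows. This is not the code from the PR: the function name `forAccountUnion`, the column names `base_account_id` and `counter_account_id`, the table name, and the `?` placeholder style are all assumptions chosen for illustration.

```go
package main

import "fmt"

// forAccountUnion rewrites "base = X OR counter = X" as a UNION of two
// single-condition queries, so that each branch can be served by an index
// on its own column. baseSelect is the shared SELECT ... FROM part without
// a WHERE clause. The column names used here are hypothetical.
func forAccountUnion(baseSelect string, accountID int64) (string, []interface{}) {
	first := baseSelect + " WHERE base_account_id = ?"
	second := baseSelect + " WHERE counter_account_id = ?"
	// UNION (unlike UNION ALL) removes duplicate rows, so a row matching
	// both branches is returned once, as it would be with OR.
	sql := fmt.Sprintf("(%s) UNION (%s)", first, second)
	// The same value is bound twice, once per branch.
	return sql, []interface{}{accountID, accountID}
}

func main() {
	sql, args := forAccountUnion("SELECT * FROM history_trades", 42)
	fmt.Println(sql)
	fmt.Println(len(args))
}
```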

Known limitations

  • github.com/Masterminds/squirrel does not support UNION queries, so the UNION query is built manually.
  • New tests use scc scenarios. A larger refactor is required to support mocking.

@bartekn bartekn requested a review from a team July 29, 2020 18:27
@cla-bot cla-bot bot added the cla: yes label Jul 29, 2020
return q
}

q.rawSQL = fmt.Sprintf("(%s) UNION (%s) ", firstSQL, secondSQL)
Contributor

take a look at TransactionByHash() to see an alternative way to construct the UNION query using squirrel

Contributor Author

This was my first approach, but it doesn't work when you want to add ORDER BY and/or LIMIT to a UNION (that requires brackets).
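
To illustrate the bracket requirement, here is a minimal sketch (not the PR's code) of attaching ORDER BY and LIMIT to a UNION. The helper name `pagedUnion` and the sample queries are hypothetical. Each branch is wrapped in parentheses so it can carry its own ORDER BY and LIMIT, and the same clauses are repeated on the combined result.

```go
package main

import "fmt"

// pagedUnion wraps each branch in parentheses so that ORDER BY and LIMIT
// apply per branch, then repeats them on the combined result. Without the
// parentheses, a trailing ORDER BY / LIMIT binds to the whole UNION, and
// placing them before the UNION keyword is a syntax error in PostgreSQL.
func pagedUnion(first, second, orderBy string, limit uint64) string {
	suffix := fmt.Sprintf("ORDER BY %s LIMIT %d", orderBy, limit)
	return fmt.Sprintf("(%s %s) UNION (%s %s) %s", first, suffix, second, suffix, suffix)
}

func main() {
	fmt.Println(pagedUnion(
		"SELECT * FROM t WHERE a = 1",
		"SELECT * FROM t WHERE b = 1",
		"id ASC",
		10,
	))
}
```

Limiting each branch first keeps the intermediate result small: the top N rows overall are always contained in the union of the top N rows of each branch.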

q.sql = q.appendOrdering(q.sql, op, idx, page.Order)
}

q.sql = q.sql.Limit(page.Limit)
Contributor

should this be moved to the else clause?

Contributor Author

Good catch, fixed in 013fdcb.

Contributor

@tamirms tamirms left a comment

looks good. I just had 2 comments

@ire-and-curses
Member

This is cool! What does the overhead of the migration that adds the index look like for full pubnet history?

@bartekn
Contributor Author

bartekn commented Jul 29, 2020

@ire-and-curses I included this in the PR summary:

Adding the index took around 2 minutes in SDF cluster (with full history).

The new index is around 1GB, as far as I can remember. I can confirm tomorrow.

@ire-and-curses
Member

Super.

@bartekn bartekn merged commit 1b2ed56 into stellar:master Jul 30, 2020
@bartekn bartekn deleted the tradesq-performance branch July 30, 2020 10:22
3 participants