Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update DB layer to use COPY to insert into history tables #4908

Closed
12 tasks done
tamirms opened this issue Jun 15, 2023 · 2 comments
Closed
12 tasks done

Update DB layer to use COPY to insert into history tables #4908

tamirms opened this issue Jun 15, 2023 · 2 comments
Assignees
Labels
horizon performance issues aimed at improving performance

Comments

@tamirms
Copy link
Contributor

tamirms commented Jun 15, 2023

Currently, we use the BatchInsertBuilder to batch history table rows into large Postgres INSERT statements. However, we have discovered that it is more efficient to use the Postgres COPY statement for bulk loading. Note that COPY does not provide support for updating existing rows but that is ok because the Horizon history tables are immutable (rows are only inserted and never updated).

To complete this issue we will need to:

  • Implement a component similar to BatchInsertBuilder which uses COPY instead of INSERT (feel free to use the prototype from the spike)
  • Once the component is implemented we will need to integrate it in all the services/horizon/internal/db2/history code which inserts rows into the history tables (see master...tamirms:go:ingest-perf-pq for how this is done in the spike branch). After this task is complete the following tables will be populated by the COPY batch insert builder:
    • history_effects
    • history_ledgers
    • history_operation_claimable_balances
    • history_operation_liquidity_pools
    • history_operation_participants
    • history_operations
    • history_trades
    • history_transaction_claimable_balances
    • history_transaction_liquidity_pools
    • history_transaction_participants
    • history_transactions
@tamirms tamirms moved this from Backlog to Next Sprint Proposal in Platform Scrum Jun 15, 2023
@mollykarcher mollykarcher added performance issues aimed at improving performance and removed snapshots labels Jun 15, 2023
@mollykarcher mollykarcher moved this from Next Sprint Proposal to Current Sprint in Platform Scrum Jun 20, 2023
@tamirms tamirms moved this from Current Sprint to In Progress in Platform Scrum Jul 19, 2023
@tamirms tamirms closed this as completed Jul 24, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in Platform Scrum Jul 24, 2023
@urvisavla
Copy link
Contributor

Note that COPY does not provide support for updating existing rows but that is ok because the Horizon history tables are immutable (rows are only inserted and never updated).

@tamirms, What happens during reingestion of a ledger range? I understand that it updates the history tables. I recall reading somewhere that it's ok for there to be an overlapping ledgers because the reingestion simply overwrites the existing data, is that correct?

@tamirms
Copy link
Contributor Author

tamirms commented Oct 17, 2023

@urvisavla during reingestion we delete rows in the history tables before we insert them:

https://github.com/stellar/go/blob/master/services/horizon/internal/ingest/fsm.go#L734

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
horizon performance issues aimed at improving performance
Projects
Status: Done
Development

No branches or pull requests

3 participants