-
Notifications
You must be signed in to change notification settings - Fork 500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support/db: PoC: Use COPY instead of INSERT in BatchInsertBuilder #4094
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple of issues with using COPY
especially connected to a DB transaction (linked in a comment). I think we can continue experimenting on this in a separate branch and decide later when we see all the pros and cons.
support/db/batch_insert_builder.go
Outdated
return b.Exec(ctx) | ||
func (b *BatchInsertBuilder) initStmt(ctx context.Context) error { | ||
// TODO: could the transaction had been started before? | ||
if err := b.Table.Session.Begin(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in general BatchInsertBuilder
is used inside a DB transaction of a single ledger ingestion, started before the first call to Row()
. I explained some problems with COPY
in #316 (comment).
When using a temporary table (so that an insertion suffix can be used) we still get a ~3x speedup:
|
2519abe
to
f4eafa5
Compare
I've had to move all the This reduces performance, but we are still at a ~2.7x speedup
|
dbadf2d
to
1e2eda3
Compare
9593843
to
1dfb634
Compare
This is proof-of-concept PR which changes
BatchInsertBuilder
to useCOPY
instead ofINSERT
.It brings a ~3.5x speedup.
Before:
After:
TODO:
ON CONFLICT
insertions (this can be done by inserting on a temporary table. This can even speed up things further when indexes are present, since the temporary table doesn't require indexes. Related: services/horizon: Remove by_hash index, replace with index on hash prefix #4087 and Feature Request: Allow Efficient Access to Account "Actions" Where Account is Source #4051).If this goes well and we incorporate it we should later evaluate migrating
func (q *Q) upsertRows()
to use the same approach.Related #4022