support/db: PoC: Use COPY instead of INSERT in BatchInsertBuilder #4094

Closed

@2opremio 2opremio commented Nov 22, 2021

This is a proof-of-concept PR that changes BatchInsertBuilder to use COPY instead of INSERT.

It brings a ~3.5x speedup.

Before:

goos: darwin
goarch: amd64
pkg: github.com/stellar/go/support/db
cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
BenchmarkBatchInsertBuilder
BenchmarkBatchInsertBuilder-8   	      76	  14374755 ns/op
PASS

After:

goos: darwin
goarch: amd64
pkg: github.com/stellar/go/support/db
cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
BenchmarkBatchInsertBuilder
BenchmarkBatchInsertBuilder-8   	     294	   3939888 ns/op
PASS
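
As a quick sanity check on the speedup figure, the ratio can be computed directly from the two ns/op values reported above. This is only a small arithmetic sketch; the helper name speedup is made up for illustration and is not part of the PR.

```go
package main

import "fmt"

// speedup returns how many times faster the "after" run is than the
// "before" run, given the ns/op values reported by `go test -bench`.
// (Hypothetical helper, used only for this illustration.)
func speedup(beforeNsPerOp, afterNsPerOp float64) float64 {
	return beforeNsPerOp / afterNsPerOp
}

func main() {
	// ns/op values copied from the two benchmark runs above.
	const insertNs = 14374755 // INSERT-based BatchInsertBuilder
	const copyNs = 3939888    // COPY-based BatchInsertBuilder

	fmt.Printf("speedup: %.2fx\n", speedup(insertNs, copyNs))
}
```

The same helper can be applied to the later benchmark runs in this thread by swapping in their ns/op values.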

TODO:

If this goes well and we incorporate it, we should later evaluate migrating func (q *Q) upsertRows() to the same approach.

Related #4022


@bartekn bartekn left a comment


There are a couple of issues with using COPY, especially in connection with a DB transaction (linked in a comment). I think we can continue experimenting with this in a separate branch and decide later, once we see all the pros and cons.

return b.Exec(ctx)
func (b *BatchInsertBuilder) initStmt(ctx context.Context) error {
// TODO: could the transaction have been started before?
if err := b.Table.Session.Begin(); err != nil {

Yes, in general BatchInsertBuilder is used inside a DB transaction of a single ledger ingestion, started before the first call to Row(). I explained some problems with COPY in #316 (comment).

@2opremio

When using a temporary table (so that an insertion suffix can be used) we still get a ~3x speedup:

goos: darwin
goarch: amd64
pkg: github.com/stellar/go/support/db
cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
BenchmarkBatchInsertBuilder
BenchmarkBatchInsertBuilder-8   	     216	   5379026 ns/op
PASS
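
For readers unfamiliar with the temporary-table trick, here is a rough sketch of the statement sequence it implies: stage the rows with COPY into a temporary table, then move them into the real table with an INSERT ... SELECT that can carry a suffix. The PR's actual code is not shown in this thread, so everything below is an assumption: the function name, the "_copy_tmp" naming scheme, the example table and column names, and the choice of ON COMMIT DROP (which only makes sense inside a transaction).

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent double-quotes a PostgreSQL identifier, escaping embedded quotes.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// tempTableCopyStatements returns the three statements of a hypothetical
// COPY-via-temporary-table flow:
//  1. create a temp table shaped like the target,
//  2. COPY the rows into the temp table,
//  3. INSERT ... SELECT into the target, with an optional suffix.
func tempTableCopyStatements(table string, columns []string, suffix string) []string {
	tmp := quoteIdent(table + "_copy_tmp") // assumed naming scheme
	target := quoteIdent(table)

	quoted := make([]string, len(columns))
	for i, c := range columns {
		quoted[i] = quoteIdent(c)
	}
	cols := strings.Join(quoted, ", ")

	insert := fmt.Sprintf("INSERT INTO %s (%s) SELECT %s FROM %s", target, cols, cols, tmp)
	if suffix != "" {
		insert += " " + suffix
	}

	return []string{
		fmt.Sprintf("CREATE TEMP TABLE %s (LIKE %s INCLUDING DEFAULTS) ON COMMIT DROP", tmp, target),
		fmt.Sprintf("COPY %s (%s) FROM STDIN", tmp, cols),
		insert,
	}
}

func main() {
	// Table, columns, and suffix are illustrative only.
	stmts := tempTableCopyStatements("people", []string{"name", "hunger_level"}, "ON CONFLICT DO NOTHING")
	for _, s := range stmts {
		fmt.Println(s + ";")
	}
}
```

The extra CREATE and INSERT ... SELECT steps are a plausible source of the overhead relative to a plain COPY into the target table.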

@2opremio 2opremio force-pushed the poc-use-copy-instead-of-insert branch from 2519abe to f4eafa5 Compare November 22, 2021 23:58
@2opremio

2opremio commented Nov 23, 2021

I've had to move all the COPYing into Exec() so that it happens all at once. Otherwise, ingestion would intersperse other SQL statements in the middle of the COPY, leading to an error (ERROR: unexpected message type 0x50 during COPY from stdin).

This reduces performance, but we are still at a ~2.7x speedup:

goos: darwin
goarch: amd64
pkg: github.com/stellar/go/support/db
cpu: Intel(R) Core(TM) i7-1068NG7 CPU @ 2.30GHz
BenchmarkBatchInsertBuilder
BenchmarkBatchInsertBuilder-8   	     204	   5994119 ns/op
PASS 

@2opremio 2opremio force-pushed the poc-use-copy-instead-of-insert branch from dbadf2d to 1e2eda3 Compare November 23, 2021 12:39