Issue #519: Fix drupal:import-db and drupal:export-db don't compress data. #581
Conversation
On the environment where I tested, the drupal:import-db task deletes the .gz file after importing if it is compressed, so I made a copy to preserve the original file. (I also added the --file-delete parameter because other environments may not clean the file up themselves.)
Thanks for the PR @alexanderpatriciop ! Is this the bug you're seeing? drush-ops/drush#5377
@justafish Thank you for the feedback, you're right, it's the same bug. I added the suggested workaround in the last commit. However, I realized it has an impact on performance (at least in my local environment): with the first approach (copying the .gz file), importing the database takes around 40 seconds, but with the latest approach it takes around 4 minutes.
That's very odd to see such a difference using a pipe like that. Do you have testing or reproduction steps others can use to check?
1. Apply the patch https://github.com/Lullabot/drainpipe/commit/3b3bb598b96bdf73deb00d46c0900903c4337926.diff
2. Remove the previous patch and apply https://patch-diff.githubusercontent.com/raw/Lullabot/drainpipe/pull/581.diff
3. Compare the execution times.
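To get a self-contained feel for the two import styles being compared (streaming decompression into a pipe vs. decompressing to disk first), here is a minimal sketch. It is not the project's task code: `wc -c` stands in for `drush sql:cli`, and the sample file names are invented.

```shell
# Stand-in comparison of the two approaches; `wc -c` substitutes for
# `drush sql:cli`, and db.sql / db.sql.gz are throwaway sample files.
printf 'SELECT 1;\n' > db.sql
gzip -kf db.sql                 # produces db.sql.gz, keeping db.sql

# Approach 1: stream-decompress and pipe (no temporary file on disk)
gunzip -c db.sql.gz | wc -c

# Approach 2: decompress to disk first, then read the plain file
gunzip -kf db.sql.gz            # recreates db.sql next to db.sql.gz
wc -c < db.sql
```

Both approaches feed identical bytes to the consumer; the difference under discussion is only in timing and whether a temporary decompressed file is left behind.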
tasks/drupal.yml
Outdated
```diff
@@ -44,15 +44,15 @@ tasks:
       - echo "🚮 Dropping existing database"
       - ./vendor/bin/drush {{ .site }} sql:drop --yes
       - echo "📰 Importing database"
-      - ./vendor/bin/drush {{ .site }} sql:query --file=$DB_DIR/db.sql.gz
+      - gunzip < $DB_DIR/db.sql.gz | ./vendor/bin/drush {{ .site }} sql:cli
```
A few years ago @m4olivei suggested using pigz instead of gunzip because it had significant performance benefits.
https://www.clouvider.com/knowledge_base/compression-options-in-linux-gzip-vs-pigz/
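As a hedged illustration (pigz may not be installed, and the sample file name is invented): pigz reads standard gzip streams, so it can be swapped in as a decompressor wherever it is available, with gunzip as the fallback.

```shell
# pigz decompresses regular .gz files; fall back to gunzip when it is absent.
printf 'SELECT 1;\n' > sample.sql
gzip -kf sample.sql             # makes sample.sql.gz, keeping sample.sql
if command -v pigz >/dev/null 2>&1; then
  pigz -dkc sample.sql.gz       # -d decompress, -k keep input, -c to stdout
else
  gunzip -c sample.sql.gz       # plain gzip fallback, same output bytes
fi
```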
Ohh, thanks for linking that article. I never did learn the reason why pigz was faster, just that it was 💡
I don't think pigz is in the default ddev image, so we'd need to add a Dockerfile to install it. Which IMO is fine, we do it on most of our projects anyways.
I do wonder if it's worth asking about adding it to the default web images in ddev upstream.
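If pigz were adopted, installing it could look like the following sketch, which assumes ddev's `.ddev/web-build/Dockerfile` extension point and a Debian/Ubuntu-based web image; treat the exact path and package manager as assumptions.

```shell
# Sketch only: writes a ddev web-image Dockerfile fragment that installs pigz.
# Assumes ddev merges .ddev/web-build/Dockerfile into its web image build.
mkdir -p .ddev/web-build
cat > .ddev/web-build/Dockerfile <<'EOF'
# Assumes a Debian/Ubuntu base image with apt-get available.
RUN apt-get update && apt-get install -y pigz && rm -rf /var/lib/apt/lists/*
EOF
```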
@mrdavidburns I tried pigz with the following command: `pigz -dc file.gz`. But the execution time was more than 2 minutes, so I unzipped the file before importing with sql:query and it worked fine; the execution time is the same with pigz and gunzip, so I left gunzip, because the other option would need a Dockerfile to install it, as @deviantintegral mentioned.
@alexanderpatriciop Were you trying pigz with an existing project or just a fresh site install?
Yes, I tried it with an existing project
I was trawling my workstation for larger databases to test, and surprisingly I don't have any! The largest at hand was 130MB compressed.
```console
$ hyperfine 'gunzip -dkc database.sql.gz > /dev/null' 'pigz -dkc database.sql.gz > /dev/null'
Benchmark 1: gunzip -dkc database.sql.gz > /dev/null
  Time (mean ± σ):     334.3 ms ±   1.2 ms    [User: 323.5 ms, System: 9.4 ms]
  Range (min … max):   332.9 ms … 336.2 ms    10 runs

Benchmark 2: pigz -dkc database.sql.gz > /dev/null
  Time (mean ± σ):     409.8 ms ±   2.2 ms    [User: 403.8 ms, System: 103.2 ms]
  Range (min … max):   407.0 ms … 412.9 ms    10 runs

Summary
  gunzip -dkc database.sql.gz > /dev/null ran
    1.23 ± 0.01 times faster than pigz -dkc database.sql.gz > /dev/null
```
I then manually blew it up to ~400MB, expecting to see better improvements:
```console
$ hyperfine 'gunzip -dkc database2.sql.gz > /dev/null' 'pigz -dkc database2.sql.gz > /dev/null'
Benchmark 1: gunzip -dkc database2.sql.gz > /dev/null
  Time (mean ± σ):      1.674 s ±  0.004 s    [User: 1.621 s, System: 0.043 s]
  Range (min … max):    1.664 s …  1.678 s    10 runs

Benchmark 2: pigz -dkc database2.sql.gz > /dev/null
  Time (mean ± σ):      2.041 s ±  0.008 s    [User: 1.996 s, System: 0.552 s]
  Range (min … max):    2.026 s …  2.053 s    10 runs

Summary
  gunzip -dkc database2.sql.gz > /dev/null ran
    1.22 ± 0.01 times faster than pigz -dkc database2.sql.gz > /dev/null
```
No decompression improvement from pigz either. When I was compressing the test file, though, pigz was by far faster:
```console
$ hyperfine 'gzip -kc database2.sql > /dev/null' 'pigz -kc database2.sql > /dev/null'
Benchmark 1: gzip -kc database2.sql > /dev/null
  Time (mean ± σ):     31.556 s ±  0.257 s    [User: 31.100 s, System: 0.176 s]
  Range (min … max):   31.266 s … 32.152 s    10 runs

Benchmark 2: pigz -kc database2.sql > /dev/null
  Time (mean ± σ):      4.462 s ±  0.183 s    [User: 33.747 s, System: 0.765 s]
  Range (min … max):    4.184 s …  4.774 s    10 runs

Summary
  pigz -kc database2.sql > /dev/null ran
    7.07 ± 0.30 times faster than gzip -kc database2.sql > /dev/null
```
What this tells me is that either something has changed in the move to Apple Silicon, or gzip itself has had performance improvements in the past years, or perhaps pigz has had performance losses, at least as far as decompression goes.
So, +1 to keeping gzip here, and as the default for what Drainpipe ships until we have a clear 80% use case that can benefit from pigz.