Significantly speed up creating backups with isal via zlib-fast #4843
Conversation
isal is a drop-in replacement for zlib, with the caveat that the compression level mappings are different. zlib-fast is a tiny piece of middleware that converts the standard zlib compression levels to isal compression levels to allow for drop-in replacement. https://github.com/bdraco/zlib-fast/releases/tag/v0.1.0 https://github.com/pycompression/python-isal Compression for backups is ~5x faster than the baseline: powturbo/TurboBench#43
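The level-remapping idea can be sketched as follows. This is an illustrative snippet, not the actual zlib-fast implementation: the `LEVEL_MAP` values and the fallback wiring are assumptions for demonstration. `isal_zlib.compress` is the real python-isal API, used only when that package is installed.

```python
import zlib

try:
    # python-isal only supports compression levels 0-3, so the standard
    # zlib levels (0-9) must be remapped before calling into it.
    from isal import isal_zlib as _backend
    # Hypothetical mapping; the exact values zlib-fast uses may differ.
    LEVEL_MAP = {0: 0, 1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3, 8: 3, 9: 3}
except ImportError:
    # Fall back to stdlib zlib when isal is unavailable.
    _backend = zlib
    LEVEL_MAP = {i: i for i in range(10)}

def compress(data: bytes, level: int = 6) -> bytes:
    """Compress with a standard zlib level, remapped for the backend."""
    return _backend.compress(data, LEVEL_MAP[level])
```

Either way the output stays in the standard zlib format, so `zlib.decompress()` can always read it back, which is what makes the replacement "drop-in".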
I tested this locally and verified it was loaded
On my local system it's much faster, but I tried it on production and there is no change, so I've got to be doing something wrong/different.
I see the issue: my test uses zlib, and I need to map the gzip path to IGzipFile as well.
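python-isal ships an `igzip` module that mirrors the stdlib `gzip` module, so mapping the gzip path can be as simple as swapping the class at import time. A hedged sketch of that idea (the fallback wiring here is illustrative, not the actual change made in this PR):

```python
import gzip

try:
    # igzip.IGzipFile mirrors gzip.GzipFile; note that isal only
    # accepts compression levels 0-3.
    from isal import igzip
    GzipFile = igzip.IGzipFile
    BEST_LEVEL = 3
except ImportError:
    GzipFile = gzip.GzipFile
    BEST_LEVEL = 9

def write_gzip(path: str, data: bytes) -> None:
    """Write data to a .gz file using the fastest available backend."""
    with GzipFile(path, mode="wb", compresslevel=BEST_LEVEL) as fh:
        fh.write(data)
```

Because `IGzipFile` emits standard gzip framing, files written this way remain readable by the stock `gzip` module, so restores are unaffected.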
Does this pose any issue with restores? Otherwise, LGTM.
Restores work the same. Decompression is faster as well, ~2.5x faster.
2024.2 is the month of the backup! Looking forward to seeing this, alongside the other changes we have made to reliability.
I've changed that somewhat recently with pvizeli/securetar#22. It already helped, dropping backup time by almost 2x. I remember considering an even lower default setting, since the compression levels in gz do not influence the compression ratio drastically (at least not as drastically as other algorithms, according to this somewhat older benchmark). However, I then thought let's move the needle slowly so as not to run into surprises. I've not heard any negative feedback, so from my point of view we can continue to change the standard.

That said, I was actually considering starting to look into another compression algorithm, e.g. zstd. It would mean introducing a backup format version, but that shouldn't be that big of a deal.

One thing I was wondering: is this library making use of newer instruction set extensions, and is the code written such that it will still work on old x86-64 systems without those extensions?
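The point about gzip levels barely moving the compression ratio is easy to check locally. A minimal measurement sketch using stdlib zlib (the sample data is made up, and the numbers will vary with real backup content):

```python
import zlib

# Repetitive, recorder-style text compresses well at every level,
# which is why raising the level buys little extra ratio.
sample = b"sensor.living_room_temperature,21.5,2024-02-01T00:00:00\n" * 2000

for level in (1, 3, 6, 9):
    size = len(zlib.compress(sample, level))
    ratio = len(sample) / size
    print(f"level {level}: {size} bytes (ratio {ratio:.1f}x)")
```

For typical Home Assistant data the higher levels cost substantially more CPU time for only a small reduction in size, which motivated nudging the default downward.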
I've tested the library and it works on older x86_64 systems as well as 32-bit ARM. It's just not as fast there because it uses fallbacks (but it's still faster than baseline).
I didn't test it on every old x86_64 chip (only the Synology box with the old chip we had problems with for other libs). Given its Intel origin (https://github.com/intel/isa-l), I expect they have tested it a bit more on older hardware, as they have a bit more access to it.
Messaging was added to the supervisor recently stating that backups are not backwards compatible (instead of them just silently failing), so this should not be a blocking issue.
I mean, if they use hand-written assembler, they probably thought about it and implemented fallbacks. Sometimes it is also the compiler's optimization steps that select some instructions from an extension, but that then depends on flags at build time. Since you already tested on older systems, I am not too concerned. Just something to keep an eye out for. I do have a Fujitsu machine with an AMD Geode here which also has a quite limited set of instructions available. I'll do a backup once this is on dev.
Nice. Thankfully this is trivial to revert if there is an issue. I'm going to merge it now and update everything I have once dev is built |
Bumped every one of my production and dev systems. Backups completed successfully on all of them.
Just tested on my Fujitsu AMD GX-222GC based system, works like a charm! 👍 |
I implemented that in #4884 |
Proposed change
isal (via python-isal) is a drop-in replacement for zlib, with the caveat that the compression level mappings are different. zlib-fast is a tiny compat layer to convert the standard zlib compression levels to isal compression levels to allow for drop-in replacement.

I considered a few other approaches here, such as using pigz or switching to another format, but this seems like the simplest and most compatible, resulting in only a one-time code change here.

Wheels are built for all supported platforms: https://wheels.home-assistant.io/musllinux/
https://github.com/bdraco/zlib-fast/releases/tag/v0.2.0 https://github.com/pycompression/python-isal
Differences
powturbo/TurboBench#43
Hopefully this means we won't see any more database backup failures where core gives up the lock on the SQLite database because it can't hold any more events in memory when the backup takes too long. I was especially keen on fixing this since, when that happens, the user's only hint that their backup is corrupt may be in the core log, and they find out the hard way and open an issue after it's already too late. https://github.com/home-assistant/core/blob/a793a5445f4a9f33f2e1c334c0d569ec772335fe/homeassistant/components/recorder/core.py#L1015
We can likely supplement the logger warning with a repair issue after this change, since most remaining cases should be the result of a hardware or overload problem with the system instead of the compression taking too long. Currently, issues like home-assistant/core#105987 end up going stale since there was no viable solution before. I have been holding off on adding the repair issue, since telling users to wait until we can make the backups faster seemed like it wasn't going to go over well, and the result was the same: the issue eventually went stale because there was no solution.
Testing:
A future improvement for I/O-constrained systems could be to use https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.addfile to add each gzipped tar file to the final archive, rewind the stream, rewrite the size over it, and jump back to the end, so we wouldn't have to write all the tgz files to disk and they could be streamed into the final result. This would cut the writes in half (and increase storage lifetime) and likely make more difference for these systems than this change.
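A simplified sketch of keeping the inner archives off disk: compress the payload into an in-memory buffer and hand it to `TarFile.addfile`, so the member is never written out as a separate .tgz file. (This buffers the member rather than rewinding and patching the size field as described above, and the function and member names are illustrative, not supervisor code.)

```python
import gzip
import io
import tarfile

def add_compressed_member(outer: tarfile.TarFile, name: str, payload: bytes) -> None:
    """Gzip the payload in memory and add it as a member of the outer archive."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        gz.write(payload)
    info = tarfile.TarInfo(name=name)
    info.size = buf.tell()  # addfile() needs the member size up front
    buf.seek(0)
    outer.addfile(info, buf)

# Build a backup-style archive without intermediate .tgz files on disk.
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w") as outer:
    add_compressed_member(outer, "homeassistant.tar.gz", b"config data " * 100)
```

The rewind-and-patch variant in the comment above would avoid even this buffering by writing the compressed stream directly into the outer tar and fixing up the header afterwards, at the cost of requiring a seekable output stream.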
Type of change
Additional information
Checklist
The code has been formatted using Black (`black --fast supervisor tests`)
If API endpoints of add-on configuration are added/changed: