Evaluate efficiency of zsync2 #53
Chopping the file into chunks more intelligently could possibly help, e.g., splitting after each file inside the squashfs. The choice of compressor might influence this as well; let's keep in mind that we might want to use the Zstandard compressor now that it is available. This area of research is especially important when combined with p2p: if we get chunks whose checksums are the same between different files, this helps p2p performance greatly. So let's see if we can get some people from #ipfs and #ipfs-dev interested in this, too.
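Splitting at content-defined boundaries (rather than at fixed offsets) is the property that would let identical files inside different squashfs images produce identical chunks for p2p deduplication. A minimal sketch of the idea — this is a toy rolling-sum chunker with made-up parameters, not the actual IPFS (Rabin) chunker and not anything zsync2 currently does:

```python
import hashlib
import random

def chunk_content_defined(data: bytes, mask: int = 0xFF, window: int = 48,
                          min_size: int = 256, max_size: int = 8192):
    """Split data where a rolling sum over the last `window` bytes hits a
    boundary condition. Boundaries depend only on local content, so identical
    byte runs embedded at different offsets yield identical chunks."""
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= window:
            rolling -= data[i - window]  # slide the window forward
        size = i - start + 1
        if (size >= min_size and (rolling & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            rolling = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

# The same payload embedded behind different-sized prefixes:
random.seed(42)
payload = bytes(random.randrange(256) for _ in range(1 << 16))
a = b"A" * 100 + payload
b = b"BB" * 150 + payload
hashes_a = {hashlib.sha256(c).hexdigest() for c in chunk_content_defined(a)}
hashes_b = {hashlib.sha256(c).hexdigest() for c in chunk_content_defined(b)}
# Once both streams hit a common content boundary, later chunks line up,
# so the two images typically share most chunk hashes despite the offset.
shared = hashes_a & hashes_b
```

A fixed-size chunker, by contrast, would produce entirely different chunk hashes here because the shared payload sits at different offsets in the two files.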
@probonopd that's what I thought, too; I've commented in the other issue. For now, this is all about playing with the parameters for both mksquashfs and zsync2, but it should be done properly. If it turns out that compression ruins block alignment (although I think the only sane way to implement it is file-wise), we'd have to switch to compressing individual files, or chunks larger than the block size, rather than the whole image. For now, I can tell that the block sizes of mksquashfs and zsync2 currently differ. The first approach will be to equalize them, because using a zsync2 block size lower than the one used for the squashfs image doesn't add anything, but costs performance and adds bloat (i.e., additional hashes, by the factor …
Maybe we can get @whyrusleeping's opinion on this one, too. He suggested that we may need to make the IPFS chunker aware of the compressed file format. Using the right type of compression for the AppImage could probably help a lot, so let's understand …
I'd like to spend some time (2+ days) on evaluating the current use of zsync2 with type 2 AppImages, i.e., compare how much has changed file-wise (before calling `mksquashfs`, that is) with the difference ratio that zsync2 calculates. I feel like it tends to download more data than what has actually changed, but I'd rather perform some measurements before speculating in any way.

User stories:
At the moment, efficient delta updates are promised, and the promise might hold, but it'd be nice to create a small, meaningful study on that which we can show to people asking about this. It is also a great way to find potential optimizations. And since we plan to keep zsync2 as our core functionality (even when using alternatives to the classic client-server architecture, such as peer-to-peer networks, see AppImage/AppImageKit#175), now seems to be the right time to investigate these issues.
Factors that might have to be optimized include unequal block sizes for zsyncmake2 and mksquashfs, compression applied after generation of the squashfs image (as far as I know, mksquashfs pads files to fill up the remaining bytes in the last block of a file), etc. Basically, anything that could lead to equal files being stored in a way that makes the hash sums of the occupied blocks differ.
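To illustrate why alignment matters (a sketch with made-up block size and data): a single byte inserted near the front of an image shifts every subsequent block, so no block hash at a fixed offset matches any more, even though the content is almost entirely unchanged. zsync2's rolling checksum can re-find blocks at shifted offsets, but blocks whose bytes actually changed, e.g. because compression mixed modified and unmodified files into the same block, can never match.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative; real mksquashfs/zsyncmake2 block sizes differ

def block_hashes(data: bytes, block_size: int = BLOCK_SIZE):
    """Hash fixed-size blocks, the way zsync-style tools index a file."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]

old = bytes(i % 251 for i in range(BLOCK_SIZE * 16))
# One byte inserted at the very front: content is >99.99% identical ...
new = b"\x00" + old

# ... yet not a single block hash at the same offset matches,
# because every block's content shifted by one byte.
matches = sum(a == b for a, b in zip(block_hashes(old), block_hashes(new)))
```

Here `matches` comes out as 0: at fixed offsets, the shifted file shares no block with the original.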
Rule of thumb for block sizes: mksquashfs block size `>=` zsyncmake2 block size, mksquashfs block size `mod` zsyncmake2 block size `= 0`, and block sizes should be powers of 2.

To my knowledge, such measurements haven't been performed yet, or at least haven't been set up in a scientific way, capturing and visualizing the results reproducibly.
I think we could, e.g., use any random Qt application (bundled with linuxdeployqt and the exact same Qt version) with changes in the main binary only, or some CLI application bundling just a few, rarely changing libraries.
The goal is to find potential optimizations in our use of squashfs that decrease the aforementioned difference ratio and thus save bandwidth. It should be possible to find an acceptable trade-off between larger files and more efficient updates.
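As a baseline for such measurements, one could compute a naive difference ratio between two image versions. The helper below is illustrative, not zsync2's actual algorithm: zsync2 additionally uses a rolling weak checksum to match blocks at arbitrary byte offsets, so it should generally do at least as well as this block-aligned estimate.

```python
import hashlib

def difference_ratio(old: bytes, new: bytes, block_size: int = 4096) -> float:
    """Fraction of the new file's blocks whose hash does not occur at any
    block-aligned offset in the old file. 0.0 means nothing would need to
    be downloaded, 1.0 means everything would."""
    old_hashes = {hashlib.sha256(old[i:i + block_size]).digest()
                  for i in range(0, len(old), block_size)}
    new_blocks = [new[i:i + block_size]
                  for i in range(0, len(new), block_size)]
    missing = sum(hashlib.sha256(b).digest() not in old_hashes
                  for b in new_blocks)
    return missing / len(new_blocks)

# Changing exactly one block of a 64-block file should require ~1/64
# of it to be re-downloaded:
old = bytes(i % 251 for i in range(4096 * 64))
new = old[:4096 * 10] + b"\xff" * 4096 + old[4096 * 11:]
ratio = difference_ratio(old, new)  # 1/64 == 0.015625
```

Running this over successive real AppImage builds, and comparing the result with what zsync2 actually downloads, would quantify the gap the study is after.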
TODO:
- … (`-j`), and some repository containing the AppDirs from which we generate AppImages with appimagetool, eliminating external dependencies