Inconsistent results when changing the -B parameter #261
You can use the -v2 parameter to see the size of each block, and play with the -W parameter to get the best results.
Max value of the -W param is 67108864 (64MB)
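If it helps, the -W sweep can be scripted; something like this (an untested sketch, file names are placeholders, and `stat -c%s` assumes GNU coreutils):

```bash
# Encode the same pair with several -W (input window) sizes
# and compare the resulting delta sizes.
for W in 16384 262144 2097152 16777216; do
  xdelta3 -f -e -W "$W" -s remuxed.mp4 original.ts "diff_W${W}.xd3"
  printf '%s\n' "-W $W -> $(stat -c%s "diff_W${W}.xd3") bytes"
done
```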
Thanks for the input, but how do the -B and -W parameters interact in practice, and how should they be defined relative to each other to approach the best possible compression efficiency? How is the block size defined based on those parameters? How can it be explained that the highest possible value for -B doesn't yield the smallest possible “diff” file? And how can I choose one set of -B and -W values to batch process an entire folder of files of different sizes, if, as I experienced, the behaviour is markedly different from one pair of files to another? Should files be grouped by size, with a specific set of parameters for each size group, or does the outcome of the “diff” computation depend on the specific distribution of matching / non-matching areas in each pair of files, rather than on their sheer size?

I already tested different values of -W with the same value for -B (the default), and it barely affected the result, much less than changing -B: for the 523175988-byte TS file mentioned above, with -W values decreasing from 16777216 to 16384, I got “diff” files between 29496958 and 31974330 bytes, the largest size being obtained with 16384 (the lowest possible value) and the smallest with -W 2097152 (but that test was made with the old 3.0t version; I haven't repeated it with the current version since -W didn't seem to have a significant effect).

Apparently the max value for -W is 16777216, not 67108864, and its default is 8388608; 67108864 is the default value for -B, from what I could see, with xdelta 3.0.11.
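For what it's worth, those values can be checked against one's own build: xdelta3 has a `config` command that prints its compiled-in constants (the constant names in the comment below are my assumption, taken from the xdelta3 headers):

```bash
# Print this build's compiled-in defaults and limits; the output should
# include the window-size constants from the sources, presumably
# XD3_DEFAULT_WINSIZE (-W default), XD3_HARDMAXWINSIZE (-W ceiling)
# and XD3_DEFAULT_SRCWINSZ (-B default).
xdelta3 config
```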
(8 months later)
I found out about xdelta recently and made some tests with it, with video files in particular, in order to keep a reference to the original file when that file has been remuxed, either by adding subtitles or an audio track, or simply by moving the streams into another container. For files which have a straightforward structure, with a (single) header followed by the video / audio streams, it works very well: the size of the generated “diff” file is as small as can be with the default settings.

But I'm having trouble with TS files (Transport Stream), which have a specific structure, with (as I understand it) multiple small chunks of video / audio data, each having its own header (which is meant to prevent playback issues when parts of the streams aren't transmitted properly). When remuxing such a file to MP4 or MKV, the video / audio streams are extracted from those small chunks and placed contiguously in the new container; the remuxed file is therefore significantly smaller because of the reduced “overhead”, and playback is generally smoother (no lag when randomly accessing any spot / timecode).

I have many TS files from television recordings which I have converted or want to convert to MP4 or MKV, but I'd like to keep a reference to the original TS file before deleting it, as I know from experience that there can be unexpected issues later on (for instance a video editor could refuse the remuxed file, or a glitch could cause a video / audio desynchronization in the remuxed file that could not be fixed without going back to the original TS file). Here comes xdelta (which was suggested here for such purposes).
I first used an old version, 3.0t, which I happened to find along with a “patch” for an animation movie in MKV. I will note below the beginning of each file name (which is the date of broadcast) and the size of the “diff” file obtained with xdelta. Using the basic command

```bash
xdelta3 -e -s "remuxed file" "original file" "diff file"
```

I got these results:

For the first two, the size of the “diff” file seemed acceptable, but for the third one it was obviously too big; so, based on the advice given in this discussion, I used the -B option, setting it to the size of the source file. With -B 649361203 I got:
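To make that invocation concrete, this is the kind of command I mean, with placeholder file names (a sketch; `stat -c%s` is the GNU coreutils way to get a file's size in bytes):

```bash
# Size -B to the original TS file, as described above.
B=$(stat -c%s "original.ts")
xdelta3 -e -B "$B" -s "remuxed.mp4" "original.ts" "diff.xd3"
```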
Much better indeed. Then I processed the first two files likewise, setting the -B parameter to the size of the corresponding TS file, figuring that it would shrink the size of the resulting “diff” file some more.
But the opposite happened: the “diff” files obtained from those two files were significantly bigger!
Then I tested one of those files with decreasing values of the -B parameter (decreasing powers of 2):
It turns out that for this file, the “sweet spot” is around the default value. Using a very large value (i.e. equal to the source file's size) yields a bigger “diff” file, and using a smaller value yields a much bigger “diff” file.
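For reference, the sweep I ran looked roughly like this (a sketch; file names and the exponent range are placeholders, and `stat -c%s` assumes GNU coreutils):

```bash
# Sweep -B over decreasing powers of 2 and print the delta size for each.
ORIGINAL="original.ts"
REMUXED="remuxed.mp4"

for exp in 29 28 27 26 25 24; do
  B=$((1 << exp))
  xdelta3 -f -e -B "$B" -s "$REMUXED" "$ORIGINAL" "diff_B${B}.xd3"
  printf '%s\n' "-B $B -> $(stat -c%s "diff_B${B}.xd3") bytes"
done
```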
I ran those tests again with xdelta 3.0.11, which seems to be the most recent stable release. The global performance was much improved (the sizes of the “diff” files are consistently smaller), yet I observed the same pattern with regard to the relative sizes of the “diff” files obtained with various values of the -B parameter.
Bigger “diff” file with -B set to the size of the source file for those two.
Much smaller “diff” file with -B set to the size of the source file for that one.
Smaller sizes than with the older version, but the same pattern: the smallest size is obtained with the default value; setting the -B parameter to the size of the source file yields a bigger “diff” file, and setting it to a lower value yields a much bigger one.
How could those results be explained?
And how could I batch process a whole directory with hundreds of TS files, if there's seemingly no way of finding an optimal setting for all of them?
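To illustrate what I mean by batch processing, here is a rough sketch of a brute-force approach, trying a few -B values per pair and keeping the smallest delta (the paths, file naming and candidate values are made up; `stat -c%s` assumes GNU coreutils):

```bash
#!/usr/bin/env bash
# For each original .ts / remuxed .mp4 pair, try a few -B values
# and keep only the smallest resulting delta file.
for orig in /recordings/*.ts; do
  remuxed="${orig%.ts}.mp4"
  [ -f "$remuxed" ] || continue
  best=""
  for B in 16777216 67108864 "$(stat -c%s "$orig")"; do
    out="${orig%.ts}.B${B}.xd3"
    xdelta3 -f -e -B "$B" -s "$remuxed" "$orig" "$out"
    if [ -z "$best" ] || [ "$(stat -c%s "$out")" -lt "$(stat -c%s "$best")" ]; then
      [ -n "$best" ] && rm -f "$best"
      best="$out"
    else
      rm -f "$out"
    fi
  done
  printf 'kept: %s\n' "$best"
done
```

Of course this multiplies the encoding time by the number of candidate values, which is exactly why I'd rather understand how to pick one setting up front.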
I could upload the video files used for those tests, if necessary, but they're quite big and I have a rather slow upload speed, so I would prefer to get some feedback first.
Thanks.
Gabriel, France