
Look at Density or other new secondary compression options #13

Open
bangnoise opened this issue Apr 6, 2016 · 14 comments

@bangnoise
Collaborator

Density promises improved performance over Snappy.
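For context, the second-stage decode in Hap is currently a single Snappy call per frame (or per chunk), so that is the call a Density-backed variant would replace. A minimal sketch of the current shape using the snappy-c API (the helper name is illustrative, not code from this repo, and the Density swap is only noted in a comment):

    #include <snappy-c.h>
    #include <stddef.h>

    /* Illustrative helper (not from this repo): decode one Hap
     * second-stage-compressed section into a DXT texture buffer.
     * A Density-backed variant would replace the snappy_uncompress()
     * call below with Density's equivalent buffer decompression. */
    static int decode_section_snappy(const char *compressed, size_t compressed_len,
                                     char *texture_out, size_t texture_capacity)
    {
        size_t out_len = texture_capacity;   /* in: capacity, out: bytes written */
        snappy_status st = snappy_uncompress(compressed, compressed_len,
                                             texture_out, &out_len);
        if (st != SNAPPY_OK)
            return -1;                       /* corrupt or truncated section */
        return (int)out_len;                 /* size of the DXT payload produced */
    }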

@elliotwoods

elliotwoods commented Jul 8, 2016

+1 for this. I was going to suggest it (it's what we use for ofxSquashBuddies generally).
There is an issue, though: Density leaks memory when it encounters corrupt data.

@VuiMuich
Contributor

There have recently been a bunch of commits and releases on Density, and among lots of other fixed issues, the one with the memory leak was closed.
So maybe now would be a good time to take another look at adding Density to Hap. Unfortunately, I doubt I'd be capable of making a commit for this myself.

@elliotwoods

The biggest issue is that this would break backwards compatibility, so it would be best placed in an entirely new Hap codec variant rather than as an incremental update to the existing variants (see the sketch after this list).
Examples of when this might happen:

  • HAP v2 (i.e. update all existing codec variants with a set of backwards-compatibility-breaking features)
  • HAP-L (a new lossless variant - though I presume it would be ideal to consider some GPU-side entropy / quad-tree decoding, or just rely on improvements in PCIe bandwidth)
  • HAP-F (a 'fast' version with the same characteristics as an existing codec; probably the least likely if Snappy isn't the bottleneck)
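
A minimal sketch of why this breaks compatibility, based on the section-type byte described in HapVideoDraft.md (the three compressor values shown are the published ones; kHapCompressorLZ4 is hypothetical and only illustrates the problem):

    /* Hap section type byte: high nibble = second-stage compressor,
     * low nibble = texture format (per HapVideoDraft.md). */
    enum {
        kHapCompressorNone    = 0xA,
        kHapCompressorSnappy  = 0xB,
        kHapCompressorComplex = 0xC,
        kHapCompressorLZ4     = 0xD  /* hypothetical new identifier */
    };

    static int known_compressor(unsigned char section_type)
    {
        switch ((section_type & 0xF0) >> 4) {
        case kHapCompressorNone:
        case kHapCompressorSnappy:
        case kHapCompressorComplex:
            return 1;   /* every shipping Hap decoder understands these */
        default:
            return 0;   /* e.g. kHapCompressorLZ4: existing decoders must reject
                         * the frame, hence a new codec variant rather than an
                         * in-place change to the current ones */
        }
    }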

@VuiMuich
Contributor

Yeah, I see that this would be somewhat of an issue.
But on the other hand, Hap is a very specialized codec with a userbase that is pretty used to handling complex codec details like different encoding flags or codec variants.
And I think such a beneficial feature could easily be promoted for quick implementation by most of the developers listed in the supported software.
And as far as I understand, this repo is meant to be just a sample of how to implement the codec based on HapVideoDraft.md, but actual implementations can vary - for example the FFmpeg implementation (at least the last time I checked), which has no support for HapM or HapY and handles the textures a bit differently, which could result in slightly different but visible results.

@leavittx
Contributor

leavittx commented Apr 6, 2018

This might also be worth investigating: https://github.com/Blosc/c-blosc

@ttoinou

ttoinou commented Jul 12, 2018

Changing the lossless compression algo is a good idea; we'll need to see whether these algorithms are suited to the DXT data fed to them.

Side question: what about turning those algos into lossy compression? i.e. the algo could detect small changes in the input that would allow better compression. Just curious about the artifacts it could generate.

@ttoinou

ttoinou commented Jul 12, 2018

Also, is the CPU -> GPU memory transfer a bottleneck in general? If so, a decompression algo that could run on the GPU would increase playback performance.

@elliotwoods

Squash is a compression abstraction library. Implementing Squash means you can easily switch between underlying compression codecs (e.g. Density, Snappy, etc.).
It's probably not a good idea to use it for Hap, since supporting a diverse set of compression algorithms in Hap is itself probably not a good idea. But they have some amazing benchmark results comparing all the different compression algorithms:
https://quixdb.github.io/squash-benchmark/
NB: it doesn't seem like they support Blosc just yet.

Also, in terms of lossy encoding: Hap already uses DXT texture compression (lossy), which happens in addition to the Snappy compression (lossless). If you want a lossy compression algorithm that is directly GPU-decoded, you will probably end up with something similar to H.264/H.265, which already has widespread GPU decoding support at up to 8K resolutions.
To see what your GPU supports, open Google Chrome and type chrome://gpu/ in the address bar. Scroll down to the Video Acceleration Information section and it will list the (lossy) codecs your GPU natively supports, e.g. for a GeForce 1080:
[screenshot: chrome://gpu Video Acceleration Information, GeForce 1080]

@ttoinou

ttoinou commented Jul 13, 2018

> https://quixdb.github.io/squash-benchmark/

It's missing a dataset with DXT textures :p

> If you want to have a lossy compression algorithm which is directly GPU decoded, then you will probably end up with something similar to h264/h265

In theory, no: Hap without Snappy fits your definition, as do a lot of other codecs. We could even create our own codec that is decodable with a custom compute shader. But it's a good idea to look into H.264 and compare it to Hap: how many simultaneous playbacks can you have in a real-time environment?

I think this company is trying to do similar stuff to Hap: http://www.binomial.info/

@leavittx
Contributor

leavittx commented Oct 25, 2019

Btw I've just taken a look at the above benchmarks, and it looks like Density has lower decompression speed (which I guess is the most crucial factor for the Hap case) compared to Snappy (40 MB/s vs 110 MB/s), in some cases at least - I chose the X-ray medical picture dataset.

@leavittx
Contributor

Also, lz4 looks quite promising

@bangnoise changed the title from "Look at Density compression" to "Look at Density or other new secondary compression options" on Nov 11, 2019
@leavittx
Contributor

leavittx commented Dec 9, 2019

I made a really quick prototype of Hap with different second-stage compressors a couple of weeks ago, and I will just post it here so everyone can work on it further.

@leavittx
Contributor

leavittx commented Dec 9, 2019

    { "lz4",        "lz4", 0, AV_OPT_TYPE_CONST, {.i64 = HAP_COMP_LZ4 }, 0, 0, FLAGS, "compressor" }, 



    { "lz4fast3",    "lz4fast (level 3)", 0, AV_OPT_TYPE_CONST, {.i64 = HAP_COMP_LZ4FAST3 }, 0, 0, FLAGS, "compressor" }, 



	{ "lz4fast17",    "lz4fast (level 17)", 0, AV_OPT_TYPE_CONST, {.i64 = HAP_COMP_LZ4FAST17 }, 0, 0, FLAGS, "compressor" }, 



	{ "lz4hc4",    "lz4hc (level 4)", 0, AV_OPT_TYPE_CONST, {.i64 = HAP_COMP_LZ4HC4 }, 0, 0, FLAGS, "compressor" }, 



	{ "lizard10",    "Lizard (level 10)", 0, AV_OPT_TYPE_CONST, {.i64 = HAP_COMP_LIZARD10 }, 0, 0, FLAGS, "compressor" }, 



	{ "lizard12",    "Lizard (level 12)", 0, AV_OPT_TYPE_CONST, {.i64 = HAP_COMP_LIZARD12 }, 0, 0, FLAGS, "compressor" }, 



	{ "zstd1",    "zstd (level 1)", 0, AV_OPT_TYPE_CONST, {.i64 = HAP_COMP_ZSTD1 }, 0, 0, FLAGS, "compressor" }, 

The code snippet above shows the list of new compressors I've tried
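
For reference, the encoder side of such a prototype boils down to a dispatch over the selected compressor. The sketch below is not the prototype's actual code - the HAP_COMP_* values are taken from the option table above, while hap_second_stage_compress() and its mapping onto library calls are assumptions - but it shows the shape of the change using the public LZ4/LZ4HC/zstd APIs (Lizard omitted):

    #include <lz4.h>
    #include <lz4hc.h>
    #include <zstd.h>

    /* Hypothetical dispatch over the second-stage compressors listed above.
     * Returns the compressed size, or a negative value on failure. */
    static int hap_second_stage_compress(int compressor,
                                         const char *src, int src_size,
                                         char *dst, int dst_capacity)
    {
        switch (compressor) {
        case HAP_COMP_LZ4:
            return LZ4_compress_default(src, dst, src_size, dst_capacity);
        case HAP_COMP_LZ4FAST3:
            return LZ4_compress_fast(src, dst, src_size, dst_capacity, 3);
        case HAP_COMP_LZ4FAST17:
            return LZ4_compress_fast(src, dst, src_size, dst_capacity, 17);
        case HAP_COMP_LZ4HC4:
            return LZ4_compress_HC(src, dst, src_size, dst_capacity, 4);
        case HAP_COMP_ZSTD1: {
            size_t n = ZSTD_compress(dst, (size_t)dst_capacity,
                                     src, (size_t)src_size, 1);
            return ZSTD_isError(n) ? -1 : (int)n;
        }
        default:
            return -1;  /* unknown compressor */
        }
    }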

@leavittx
Contributor

leavittx commented Dec 10, 2019

I used different DXT1/DXT5 DDS files to estimate the possible speed improvement over Snappy.
LZ4-family compressors and Lizard showed the best decompression speed results in my lzbench benchmark - that's why I decided to go on and try them inside Hap.

Initially I thought the performance gain could be 2-3x. However, I only got roughly a 20-30% improvement when using LZ4/Lizard inside an 8k×8k Hap video (no chunks), and when many threaded chunks are used it is even smaller. The good news is that Lizard/LZ4HC also tend to give 20-30% smaller file sizes while having similar or higher decompression speed compared to Snappy.

My current guess is that the new compressors work much faster with bigger inputs, though I couldn't fully confirm that with lzbench results yet (I didn't have much time for that). It looks like I'm missing something there.
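
One way to probe the block-size guess outside lzbench is to time raw decompression of the same payload at different block sizes - a minimal sketch, assuming an already-LZ4-compressed buffer (buffer names are placeholders) and using only the public LZ4 API plus POSIX clock_gettime():

    #include <lz4.h>
    #include <time.h>

    /* Decompress the same LZ4 payload repeatedly and report throughput.
     * 'compressed'/'compressed_size' would come from compressing one Hap
     * chunk (small block) or a whole frame (large block), so decode speed
     * can be compared per block size. Returns MB/s, or a negative value
     * on corrupt input. */
    static double lz4_decode_mbps(const char *compressed, int compressed_size,
                                  char *out, int out_capacity, int iterations)
    {
        struct timespec t0, t1;
        long long bytes = 0;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < iterations; i++) {
            int n = LZ4_decompress_safe(compressed, out, compressed_size, out_capacity);
            if (n < 0)
                return -1.0;
            bytes += n;
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        double seconds = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        return (bytes / 1e6) / seconds;
    }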

Some of my DDS compression benchmark results using lzbench:
