lower bit-rates? #42

ElijahHamilton · 2024-04-20T16:21:27Z

Is it possible for QOA to achieve lower bitrates for speech, like 8 kbit/s or 16 kbit/s?

phoboslab · 2024-04-20T16:49:03Z

Most speech codecs downsample audio to 16 kHz or 8 kHz. You can do the same. At 8 kHz mono, QOA needs about 25 kbit/s (8000 samples/s * 3 bits + some overhead for the frame headers). That's as low as it goes with the "official" version.

There's an experimental_1bit branch of QOA that uses just 1 bit per sample. Quality is... quite bad, but this would get you to ~8 kbit/s (assuming 8 kHz). With 2 bits per sample you'd get acceptable quality at ~16 kbit/s. Whether that's something you want to do depends entirely on your use case.
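The bitrate figures above can be checked with a few lines of arithmetic. This is a rough sketch, not part of QOA: the function names are made up for illustration, and the frame layout constants (8-byte frame header, 16 bytes of predictor state per channel, 256 slices per channel per frame, each slice packing 20 samples into 64 bits) are assumptions based on a reading of the published 3-bit format. The layout of the experimental 1-bit and 2-bit variants is not described in this thread, so only the nominal rate is computed for those.

```python
def nominal_bitrate(sample_rate, bits_per_sample, channels=1):
    """Raw payload rate in bits/s, ignoring headers and per-slice metadata."""
    return sample_rate * bits_per_sample * channels


def framed_bitrate(sample_rate, channels=1,
                   slice_bits=64, samples_per_slice=20, slices_per_frame=256,
                   frame_header_bytes=8, lms_state_bytes=16):
    """Estimated bits/s including per-frame overhead.

    All layout defaults are assumptions about the 3-bit QOA format.
    """
    samples_per_frame = samples_per_slice * slices_per_frame
    header_bits = 8 * (frame_header_bytes + channels * lms_state_bytes)
    payload_bits = channels * slices_per_frame * slice_bits
    return sample_rate * (header_bits + payload_bits) / samples_per_frame


print(nominal_bitrate(8000, 3))  # 24000
print(nominal_bitrate(8000, 2))  # 16000
print(nominal_bitrate(8000, 1))  # 8000
print(framed_bitrate(8000))      # 25900.0
```

Under these assumptions the gap between the nominal 24 kbit/s and the quoted "about 25 kbit/s" comes mostly from the extra bits each slice carries beyond its 3-bit samples, plus a small per-frame header.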

If you need better quality and have enough compute, go with Opus. If you need even lower bitrates, try Codec2 (this goes as low as 0.7 kbit/s).

phoboslab · 2024-04-20T17:55:31Z

Made some more experiments:

Results:

wav, 44 kHz, 16 bits per sample (704 kbit/s): https://phoboslab.org/files/temp/male_speech_44khz_16bit.wav
wav, 8 kHz, 16 bits per sample (128 kbit/s): https://phoboslab.org/files/temp/male_speech_8khz_16bit.wav
qoa, 8 kHz, 3 bits per sample (24 kbit/s): https://phoboslab.org/files/temp/male_speech_8khz_3bit.wav
qoa, 8 kHz, 2 bits per sample (16 kbit/s): https://phoboslab.org/files/temp/male_speech_8khz_2bit.wav
qoa, 8 kHz, 1 bit per sample (8 kbit/s): https://phoboslab.org/files/temp/male_speech_8khz_1bit.wav

As you can hear, it gets pretty noisy. The effect is worsened by the low sample rate (i.e. 1 bit at 44 kHz sounds way better than at 8 kHz), but it's still good enough to be intelligible.

ElijahHamilton · 2024-04-20T19:00:50Z

Thanks! I've been looking for an open-source codec that is this customizable.

ElijahHamilton · 2024-04-20T19:08:21Z

> Most speech codecs downsample audio to 16 kHz or 8 kHz. You can do the same. At 8 kHz mono, QOA needs about 25 kbit/s (8000 samples/s * 3 bits + some overhead for the frame headers). That's as low as it goes with the "official" version.
>
> There's an experimental_1bit branch of QOA that uses just 1 bit per sample. Quality is... quite bad, but this would get you to ~8 kbit/s (assuming 8 kHz). With 2 bits per sample you'd get acceptable quality at ~16 kbit/s. Whether that's something you want to do depends entirely on your use case.
>
> If you need better quality and have enough compute, go with Opus. If you need even lower bitrates, try Codec2 (this goes as low as 0.7 kbit/s).

Maybe adjusting the predictor length would result in less noise?

phoboslab · 2024-04-20T19:57:32Z

Sure, you can try. In my tests (at least with 3 bits per sample) I found 4 coefficients to be the sweet spot. Longer ones tend to become unstable quickly; shorter ones don't predict much at all.

You can also experiment with how fast the predictor adapts to the signal (the int delta = residual >> 5; and prediction >> 14; lines). For the 1 and 2 bit variants I have chosen a slower adaptation (i.e. higher shift values), but mostly because I've been testing with 8 kHz audio. At lower sample rates the difference between consecutive samples is larger, so a slower adaptation worked better.
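To make the knobs discussed above concrete, here is a minimal sketch of a sign-based LMS predictor with a configurable length and configurable shifts. It is written in Python for readability and is not the QOA source: the names `make_lms`, `lms_predict`, `lms_update` and `mean_abs_residual` are hypothetical, the zero initial state is an assumption, and quantization is skipped entirely, so the residual fed back is the exact prediction error rather than a dequantized one. The default shifts of 5 and 14 are simply the values quoted in this thread.

```python
def make_lms(length=4):
    # Assumption: history and weights both start at zero in this sketch.
    return {"history": [0] * length, "weights": [0] * length}


def lms_predict(lms, prediction_shift=14):
    # Weighted sum of past samples, scaled down by the prediction shift.
    total = sum(w * h for w, h in zip(lms["weights"], lms["history"]))
    return total >> prediction_shift


def lms_update(lms, sample, residual, delta_shift=5):
    # A larger delta_shift gives a smaller step, i.e. slower adaptation.
    delta = residual >> delta_shift
    for i, h in enumerate(lms["history"]):
        lms["weights"][i] += -delta if h < 0 else delta
    # Slide the history window and append the newest sample.
    lms["history"] = lms["history"][1:] + [sample]


def mean_abs_residual(samples, length=4, delta_shift=5, prediction_shift=14):
    """Average absolute prediction error over a list of integer samples."""
    lms = make_lms(length)
    total = 0
    for s in samples:
        residual = s - lms_predict(lms, prediction_shift)
        lms_update(lms, s, residual, delta_shift)
        total += abs(residual)
    return total / len(samples) if samples else 0.0
```

Calling `mean_abs_residual` on the same sample list with different `length` and `delta_shift` values gives a quick way to compare settings, though any conclusion should be confirmed with the real encoder, since quantization changes how the predictor behaves.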

Just try a bunch of things and check the reported PSNR when encoding with qoaconv :]
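For comparing variants outside of qoaconv, PSNR can also be computed directly from the original and decoded samples. The sketch below uses the generic textbook definition; the function name `psnr_db` and the peak value of 32768 for 16-bit audio are assumptions, and it is not claimed to match the exact formula qoaconv reports.

```python
import math


def psnr_db(original, decoded, peak=32768.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 20.0 * math.log10(peak / math.sqrt(mse))
```

Higher values mean the decoded signal is closer to the original. PSNR does not always track perceived quality, so listening to the result is still worthwhile.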
