lower bit-rates? #42

Open
ElijahHamilton opened this issue Apr 20, 2024 · 5 comments
Comments

@ElijahHamilton

Is it possible for QOA to achieve lower bitrates for speech, such as 8 kbit/s or 16 kbit/s?

@phoboslab
Owner

phoboslab commented Apr 20, 2024

Most speech codecs downsample audio to 16 kHz or 8 kHz. You can do the same. At 8 kHz mono, QOA needs about 25 kbit/s (8000 samples/s * 3 bits + some overhead for the frame headers). That's as low as it goes with the "official" version.

There's an experimental_1bit branch of QOA here that uses just 1 bit per sample. Quality is... quite bad, but this would get you to ~8 kbit/s (assuming 8 kHz). With 2 bits per sample you'd get acceptable quality at ~16 kbit/s. Whether that's something you want to do depends entirely on your use case.

If you need better quality and have enough compute, go with Opus. If you need even lower bitrates, try Codec2 (this goes as low as 0.7 kbit/s).
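
The bitrate figures above can be reproduced with a little arithmetic. The following is only a sketch: the function names are made up for illustration, and the frame layout used in `qoa_official_kbits` (8-byte frame header, 16 bytes of LMS state per channel, 256 slices of 8 bytes per channel, 20 samples per slice) is an assumption based on the published QOA format description. The layout of the 1 and 2 bit variants is not described in this thread, so only their raw payload is estimated.

```c
#include <stdio.h>

/* Raw residual payload in kbit/s: sample_rate * channels * bits per sample.
 * Ignores scale factors and frame headers. */
double payload_kbits(unsigned sample_rate, unsigned channels,
                     unsigned bits_per_sample) {
    return (double)sample_rate * channels * bits_per_sample / 1000.0;
}

/* Estimated bitrate of the official format in kbit/s, including per-frame
 * overhead. Layout constants are assumptions (see the text above). */
double qoa_official_kbits(unsigned sample_rate, unsigned channels) {
    const double frame_samples = 256.0 * 20.0;  /* samples per channel per frame */
    const double frame_bytes = 8.0                       /* frame header */
                             + channels * (16.0          /* LMS state */
                                           + 256.0 * 8.0 /* slices */);
    return sample_rate / frame_samples * frame_bytes * 8.0 / 1000.0;
}
```

Under these assumptions, `payload_kbits(8000, 1, 1)`, `payload_kbits(8000, 1, 2)` and `payload_kbits(8000, 1, 3)` give 8, 16 and 24 kbit/s, and `qoa_official_kbits(8000, 1)` lands a little under 26 kbit/s, in line with the "about 25 kbit/s" estimate.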

@phoboslab
Owner

Made some more experiments:

Results:

As you can hear, it gets pretty noisy. The effect is worsened by the low sample rate (i.e. 1 bit at 44 kHz sounds way better than at 8 kHz), but it's still good enough to be intelligible.

@ElijahHamilton
Author

Thanks! I've been looking for an open-source codec that is this customizable.

@ElijahHamilton
Author

Maybe adjusting the predictor length would result in less noise?

@phoboslab
Owner

Sure, you can try. In my tests (at least with 3 bits per sample) I found 4 coefficients to be the sweet spot. Longer ones tend to become unstable quickly; shorter ones don't predict much at all.

You can also experiment with how fast the predictor adapts to the signal (int delta = residual >> 5; and prediction >> 14; here). For the 1 and 2 bit variants I chose a slower adaptation (i.e. higher shift values), but mostly because I've been testing with 8 kHz audio. At lower sample rates the difference between consecutive samples is larger, so a slower adaptation worked better.

Just try a bunch of things and check the reported PSNR when encoding with qoaconv :]
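
For anyone who wants to try this, here is a minimal sketch of a sign-based LMS predictor with the two knobs mentioned above exposed as parameters. It follows the general shape of the predictor in the QOA reference code, but the names `lms_t`, `lms_predict`, `lms_update`, `predict_shift` and `delta_shift` are illustrative assumptions, not the actual API. Change `LMS_LEN` to experiment with the predictor length.

```c
#include <stdio.h>

#define LMS_LEN 4  /* number of predictor coefficients */

typedef struct {
    int history[LMS_LEN];  /* most recent decoded samples, oldest first */
    int weights[LMS_LEN];  /* adaptive filter coefficients */
} lms_t;

/* Predict the next sample as a weighted sum of the history.
 * predict_shift scales the fixed-point result back down (e.g. 14).
 * Note: >> on negative values is assumed to be an arithmetic shift. */
int lms_predict(const lms_t *lms, int predict_shift) {
    int prediction = 0;
    for (int i = 0; i < LMS_LEN; i++) {
        prediction += lms->weights[i] * lms->history[i];
    }
    return prediction >> predict_shift;
}

/* Nudge each weight by +/- delta depending on the sign of its history
 * entry, then push the new sample into the history.
 * A higher delta_shift (e.g. 5) means slower adaptation. */
void lms_update(lms_t *lms, int sample, int residual, int delta_shift) {
    int delta = residual >> delta_shift;
    for (int i = 0; i < LMS_LEN; i++) {
        lms->weights[i] += lms->history[i] < 0 ? -delta : delta;
    }
    for (int i = 0; i < LMS_LEN - 1; i++) {
        lms->history[i] = lms->history[i + 1];
    }
    lms->history[LMS_LEN - 1] = sample;
}
```

In an encoder loop you would call `lms_predict`, quantize the difference between the real sample and the prediction, and then call `lms_update` with the reconstructed sample and the dequantized residual, so that encoder and decoder stay in sync.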
