-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C port of core encoder and decoder #33
Conversation
@tmiw - is there a reason we are using
Maybe keeping the namespace as clean as possible? |
Oh Ok it's a libopus convention. Hmm, I need to use some "hidden" function calls inside libopus 🤔 |
I actually included the above code because I was running into linking issues when trying to integrate it with freedv-gui, but it does make sense that forcing |
…mpt at fixing Issue #7, patch to export libopus selected nnet.h funcs
…by export_weights
… model05 & model19_check3, both ctests work, updated opus nnet.h patch
…or bbfm_sc_internal
@tmiw - when convenient - could you pls review this PR? In particular the |
src/CMakeLists.txt
Outdated
target_include_directories(radae_rx PRIVATE | ||
"$<TARGET_PROPERTY:Python3::NumPy,INTERFACE_INCLUDE_DIRECTORIES>") | ||
|
||
add_library(radecore rade_enc.c rade_dec.c rade_enc_data.c rade_dec_data.c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to have a separate radecore
library just for the C port? There'd have to be modifications on the freedv-gui side to also make sure libradecore.dll
got properly packaged along with librade.dll
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure now the port is done I can reconcile into one library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/opus-nnet.h.diff
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should really be handled by the opus project (i.e. through some sort of ./configure
flag) as major code changes on their end could require changes to the patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, but changes to Opus is outside of the scope of this review, ie we are unlikely to get them to accept a patch in the near term. This code (nnet.h
) has been pretty stable for several years and its a small patch so this approach should be workable for us in the near term as RADE evolves.
@@ -71,6 +71,12 @@ extern "C" { | |||
#define RADE_MODEM_SAMPLE_RATE 8000 // modem waveform sample rate | |||
#define RADE_SPEECH_SAMPLE_RATE 16000 // speech sample rate | |||
|
|||
// init rade_open() flags | |||
#define RADE_USE_C_ENCODER 0x1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: is there a scenario where one would actually want to use Python for RX or TX (and C for the opposite)? If not, should these two be consolidated into one flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing, e.g. C encoder driving Python decoder. If you swapped both to C and a test failed you wouldn't know which one had bugs.
Question--where's
|
It's the opus |
I think it was something specific to macOS. I was able to get it to build on its own, but the warning below came back:
Trying to build into freedv-gui now. |
OK, I'm now able to build freedv-gui on both macOS and Windows with this branch. Will need to test actual execution when I have a chance. |
OTOH, now freedv-gui won't build on Linux:
I may need to think about this some more. |
I think the issue is that we're trying to do something different than what Opus does (i.e. make symbols public in our case when they're private in the original library). Maybe we can copy in the implementations of the various functions being used in the C port to avoid needing to link Opus in at all (or only for the tools and not the library). Alternatively, maybe we can build Opus as a dynamic library here instead of a static one. Then the build would produce both a |
So why is it a problem to expose a few library functions? It seems to work just fine for me on Linux. |
It's a problem mainly with integrating the library with others. I worked around my issue for now by having the freedv-gui build use the version of opus built by RADE (i.e. it no longer builds its own copy). I am getting a lot of warnings like this one, though:
But AFAICT they don't seem to affect execution (only tested with Windows so far). |
Warnings look like they're gone now with the latest commit. (BTW I'm working in drowe67/freedv-gui#774 right now for testing.) |
Yep 👍 That's the correct way to build the overall project. It's not a work around - we are patching libopus so of course we need to use only the patched version. |
@tmiw - I'll merge this now, just open another PR if there are any more tweaks you need. |
General strategy is to follow the fine work from DRED/RDOVAE. Scope for this PR is x86 float (int8/other arches later) Vanilla float might actually be fast enough, given the slow frame rate.
Summary:
dsp.py
TODO
export_rade_weights.py
doing something sensiblemodel05
first, as it can be tested without DSP (model19_check3
requires DSP)RADE_EXPORT
stuff). What are the req here? Core enc & dec might be buried and not callable by Windows code.radae_txe.py
, for cteststest_rade_dec.c
arch
. Do we need to? Not now - vanilla float (arch=0
) is currently fast enoughCPU Load Characterisation of Rx DSP + Core Dec
radae_rxe.py
, which includes Rx DSP plus Core Decoder. Usingall.wav
(49 seconds), modern Desktop Intel(R) Core(TM) i5-7600 CPU @ 3.50GHzThe lines below:
To measure just the C core dec:
These results suggest the C core decoder CPU is small (almost 0) compared to the DSP (currently coded in Python). Quite a surprising result. Note the core enc/dec is just using vanilla float (no SIMD).
This suggests (i) the CPU load for the RADE enc/dec is quite low (ii) the next target for optimisation should be the Rx DSP (iii) the overall CPU load will be dominated by the FARGAN decoder (iv) further optimisation of the RADE core enc/dec (e.g. SIMD) may not be warranted/is low priority.
Re (ii) - we should not leap into DSP optimisation for RADE V1 (especially a C port) without careful thought. The DSP will change for RADE V2 so this may be wasted effort. Most of the effort is spent in DSP coding and having the DSP in Python makes this much easier and greatly speeds our development. There may be some simple algorithmic changes we can make in the Python code to speed up performance (i.e. without a full C port).
CPU Load of complete Rx Stack (including FARGAN)
As measured by:
These results also suggest most of the CPU is in the Rx DSP, i.e. Rx DSP >> FARGAN (at least currently). Another test to demonstrate the breakdown between the RADE Rx and FARGAN synthesis (using
rx.f32
fromall.wav
):CPU Load of complete Tx Stack (including feat extract)
As measured by: