Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SKIP CI] Tools: Concepts: Draft of MFCC computing #5769

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

singalsu
Copy link
Collaborator

@singalsu singalsu commented May 4, 2022

Compute Mel Frequency Cepstral Coefficients (MFCC) from a SOF
audio stream. The purpose of this Matlab or Octave script is
to draft C implementation of a SOF MFCC generator component.

Signed-off-by: Seppo Ingalsuo [email protected]

tools/concepts/mfcc/test_mfcc.m Show resolved Hide resolved
tools/concepts/mfcc/test_mfcc.m Outdated Show resolved Hide resolved
tools/concepts/mfcc/test_mfcc.m Outdated Show resolved Hide resolved
tools/concepts/mfcc/test_mfcc.m Outdated Show resolved Hide resolved
tools/concepts/mfcc/test_mfcc.m Outdated Show resolved Hide resolved
tools/concepts/mfcc/test_mfcc.m Show resolved Hide resolved
tools/concepts/mfcc/test_mfcc.m Outdated Show resolved Hide resolved
@wszypelt
Copy link

wszypelt commented May 4, 2022

@singalsu There have been some significant changes to main(codec name), please rebase your branch before checking on CI again

@singalsu
Copy link
Collaborator Author

singalsu commented May 5, 2022

Some new library functions need:

  • linspace
  • window functions, tbd. (simplest need only trig. functions)
  • DCT, which type

We can use existing FFT

@singalsu singalsu force-pushed the mfcc_concept branch 4 times, most recently from 51d6077 to 624ea1a Compare May 12, 2022 16:19
@greg-intel
Copy link
Contributor

greg-intel commented May 12, 2022

SOFCI TEST
Reason: Re-running builds because there was a timeout while running xtensa-build-zephyr.py.

@singalsu singalsu force-pushed the mfcc_concept branch 3 times, most recently from 0123f1c to c856d6d Compare May 17, 2022 14:23
@singalsu
Copy link
Collaborator Author

The MFCC output from Matlab/Octave version now fairly well matches pytorch except in since sweep higher frequency part. I'm still finding out where it happens:

Screenshot from 2022-05-17 16-56-26

Copy link
Collaborator

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of this Matlab or Octave script is to draft C implementation of a SOF MFCC generator component.

Can you elaborate what you mean by "draft"? Run that octave code for a number of samples inputs and look at the outputs maybe?

In the future, would it be possible to run both the octave code and future C code and compare outputs as a unit test?

I'm not sure what would be a suitable location for Matlab concepts storing. I used here tools but it could be other too.

tests directory?

rows = numpy.size(mfcc_np, 0)
print("MFCC data size is %d x %d\n" % (cols, rows))
numpy.savetxt(MFCC_CSV, mfcc_np, delimiter=",")
print("Exported MFCC to %s\n" % MFCC_CSV)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added main() into the two Python scripts. Now pylint score at 5.83/10 :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only pylint issues are:

  • Module 'torchaudio' has no 'info' member
  • Module 'torchaudio' has no 'load' member

but the members info/load do exist, maybe we can silent the lint error by # pylint: disable=E1101

Copy link
Collaborator

@marc-hb marc-hb May 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I researched this a bit and yes, # pylint: disable=E1101 is the correct answer for the 3 functions load, info and save. That's because these functions are dynamically initialized depending on the OS-specific backend (sox for Linux or soundfile for Windows)

From https://github.com/pytorch/audio/blob/d62875cc6/torchaudio/backend/utils.py#L51

    for func in ["save", "load", "info"]:
        setattr(torchaudio, func, getattr(module, func))

setattr is one of the usual pylint enemies.

Please summarize this next to # pylint: disable=E1101

@singalsu
Copy link
Collaborator Author

The purpose of this Matlab or Octave script is to draft C implementation of a SOF MFCC generator component.

Can you elaborate what you mean by "draft"? Run that octave code for a number of samples inputs and look at the outputs maybe?

This reference algorithm may change a bit when implementing SOF component if there's a better way to do something vs. current version. We haven't run any neural network with this data yet. If speech metrics differ vs. reference then some small difference seen now may be significant and needs to be addressed.

We will also add intermediate test vectors extract functions to this. Now only input and output is available in files.

In the future, would it be possible to run both the octave code and future C code and compare outputs as a unit test?

I'm not sure what would be a suitable location for Matlab concepts storing. I used here tools but it could be other too.

tests directory?

Yep that would make sense, after making the C component this would remain as reference that C is checked against.

@singalsu
Copy link
Collaborator Author

I just pushed new version that achieves good compatibility with Kaldi and Matlab, and fair compatibility with librosa. The delta-MFCC plots are below for chirp test signal. Librosa needs a new centered zero-pad option. Also STFT phase is different than in Kaldi.

Screenshot from 2022-05-19 18-07-50

@singalsu
Copy link
Collaborator Author

no indent.

Yes, I've written with both Emacs and Matlab and they use different indent styles, though I've already changed to use tab instead of default small 2 character indent, I'll fix those to be more like SOF C code for .m files so it will look more familiar.

@lgirdwood
Copy link
Member

lgirdwood commented May 25, 2022

@singalsu how do we use the matlab/octave version and compare against the C/HiFi implementation using testbench ?

@singalsu
Copy link
Collaborator Author

@singalsu how do we use the matlab/octave version and compare against the C/HiFi implementation using testbench ?

Once the fixed fractional Q-formats are established I will add test vectors output to Matlab code. Then in testbench output the same intermediate data via traces likely and compare the results.

@lgirdwood
Copy link
Member

@singalsu how do we use the matlab/octave version and compare against the C/HiFi implementation using testbench ?

Once the fixed fractional Q-formats are established I will add test vectors output to Matlab code. Then in testbench output the same intermediate data via traces likely and compare the results.

Automatically ? i.e. will the UT test script invoke the matlab and the testbench and compare ? This would then be easy to add into CI.

@singalsu
Copy link
Collaborator Author

Once the fixed fractional Q-formats are established I will add test vectors output to Matlab code. Then in testbench output the same intermediate data via traces likely and compare the results.

Automatically ? i.e. will the UT test script invoke the matlab and the testbench and compare ? This would then be easy to add into CI.

That would be a good target. The built-in data files are a burden to maintain. As long as the reference runs in Octave then it can be done. We don't have Matlab licenses for CI computers.

@singalsu
Copy link
Collaborator Author

singalsu commented Jun 3, 2022

The just pushed draft contains start of src/audio/mfcc component. It can be run in testbench with the test topology. It segments input data in component copy() for STFT with three possible window functions made initially (rectangular, Blackman, Povey). Next I will work with Mel spectrum conversion.

@lgirdwood
Copy link
Member

Whats the plan for making this work for topology2 ?

@singalsu
Copy link
Collaborator Author

Whats the plan for making this work for topology2 ?

That would go to early Q3. Also I might be able to convert this to module adapter before vacation.

@singalsu
Copy link
Collaborator Author

I just pushed a version with lot of component C code added. It computes in testbench correct looking Mel spectrograms. This version is missing DCT for cepstral coefficients calculation. I will work with them next.

@@ -325,8 +325,9 @@
me = hz_to_mel(high_freq);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please ignore changes to this file, it's some mistake with git.

@singalsu singalsu force-pushed the mfcc_concept branch 2 times, most recently from 84ce213 to 85be20d Compare June 27, 2022 15:51
Copy link
Collaborator

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid you just triggered my copy/paste/diverge detector... can you add a couple variables and reduce duplication between test_mfcc_kaldi.py and test_mfcc_librosa.py?

Also wondering whether this PR could be "divided and conquered" into several PRs? It's big...

@singalsu
Copy link
Collaborator Author

I'm afraid you just triggered my copy/paste/diverge detector... can you add a couple variables and reduce duplication between test_mfcc_kaldi.py and test_mfcc_librosa.py?

Yes, they could be merged. The final form will depend on how I will do the unit and testbench tests for MFCC.

Also wondering whether this PR could be "divided and conquered" into several PRs? It's big...

Agree! I have now developed both Matlab and C parts the same time so the same git development branch has worked for me best. But I'd expect the Matlab part to stabilize quite soon so it can be separated.

@singalsu
Copy link
Collaborator Author

@marc-hb I've now split this work into two (or more) PRs. This remains the concept and reference code.

I wonder if this location /tools/concepts/<some_new_comp> is good. Since parts of this would be used for unit tests the location could be also e.g. something like /test/reference/audio/mfcc. The cmocka unit tests could call these functions from Octave to generate reference data to avoid make (new errors prone) floating point C functions of this. Any thoughts about this?

@marc-hb
Copy link
Collaborator

marc-hb commented Jun 29, 2022

so the same git development branch has worked for me best. But I'd expect the Matlab part to stabilize quite soon so it can be separated.

FWIW I submit multiple PRs from the same branch all the time. Example with two commits, one branch and two PRs.

git push myfork HEAD~1:refs/heads/newPR1 # or the equivalent from your editor's git plugin
git rebase -i # rotate commits
git push myfork HEAD~1:refs/heads/newPR2

Of course the risk is not testing commits in isolation but:

  • temporary git revert is your friend
  • you're supposed to know what you're doing
  • that's what CI is for :-)

Of course you need to use a very good git client to make that efficient, the git command line is too slow. I use magit.

I wonder if this location /tools/concepts/<some_new_comp> is good.

I cannot help here, sorry. I mostly stopped caring about directories; I only use search/find and "recent files". I generally have no clue in which directories are the files I'm working on.

Many projects have utterly meaningless top-level directories like scripts/, tools/ and utils/ which demonstrates this is a lost cause. Like Yahoo was :-)

Since parts of this would be used for unit tests the location could be also e.g. something like /test/reference/audio/mfcc.

Another fun fact: you'd expect low-level, CMocka unit tests to be located closed to the corresponding code they're testing (as opposed to higher-level tests). They're all isolated in a different directory. Go figure.

@singalsu
Copy link
Collaborator Author

Thanks for tips and thoughts @marc-hb !

@singalsu singalsu changed the title Tools: Concepts: Draft of MFCC computing [SKIP CI] Tools: Concepts: Draft of MFCC computing Sep 12, 2022
Compute Mel Frequency Cepstral Coefficients (MFCC) from a SOF
audio stream. The purpose of this Matlab or Octave script is
to draft C implementation of a SOF MFCC generator component.

Signed-off-by: Seppo Ingalsuo <[email protected]>
@gkbldcig
Copy link
Collaborator

Can one of the admins verify this patch?

@lgirdwood
Copy link
Member

@singalsu @andrula-song ping ?

@kv2019i kv2019i removed this from the v2.3 milestone Feb 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants