comparison of MFCC computation between librosa and essentia, for acoustic scene classification #525

edufonseca · 2016-12-02T18:29:26Z

I compared the MFCC computation in librosa and Essentia using data from the DCASE 2016 challenge and its baseline system (MFCC + GMM) for Task 1 (acoustic scene classification).

Procedure:

Matched the common input parameters in both libraries and used the same signal framing.
Edited minor differences in librosa (20log10 and truncation of the lowest amplitude values) so that both libraries treat amplitude the same way.
Did not look into the filterbank (in theory, both are based on Slaney's).
Essentia-specific parameters in the Windowing algorithm: zero-phase windowing disabled, normalization left as True (the default). Hence window normalization appears to be the only major difference between the two computations, at least to the best of my knowledge.

Ran two simulations for Task 1 (acoustic scene classification): with and without window normalization.
Difference in classification accuracy between the librosa-based and Essentia-based systems:

Normalized = True -> accuracy difference ~6 % (the librosa-based system performs better)
Normalized = False -> accuracy difference ~±0.3 %
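
To make the effect of window normalization concrete, here is a minimal sketch using only the Python standard library. It assumes that Essentia's normalized=True scales the window to unit area and then multiplies it by 2; that reading of the Windowing documentation is an assumption, not something verified in this thread. The naive DFT and the two-sinusoid test signal are illustrative stand-ins, not the DCASE pipeline.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def normalize_window(w):
    """Scale to unit area, then by 2 (ASSUMED behaviour of normalized=True)."""
    area = sum(w)
    return [2.0 * v / area for v in w]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..n/2 (fine for a short demo frame)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def to_db(x, floor=1e-10):
    """20*log10 with a small floor to avoid log(0)."""
    return 20.0 * math.log10(max(x, floor))


N = 64
# Hypothetical test frame: two sinusoids, one off the bin grid.
signal = [
    math.sin(2 * math.pi * 5 * i / N) + 0.5 * math.sin(2 * math.pi * 12.3 * i / N)
    for i in range(N)
]

w_plain = hamming(N)
w_norm = normalize_window(w_plain)

spec_plain = magnitude_spectrum([s * w for s, w in zip(signal, w_plain)])
spec_norm = magnitude_spectrum([s * w for s, w in zip(signal, w_norm)])

# Per-bin level difference in dB, and the offset predicted from the window scale.
offsets = [to_db(a) - to_db(b) for a, b in zip(spec_norm, spec_plain)]
expected_offset = 20.0 * math.log10(2.0 / sum(w_plain))

print(f"expected offset: {expected_offset:.2f} dB")
print(f"observed offsets: min {min(offsets):.2f} dB, max {max(offsets):.2f} dB")
```

Under that assumption the normalized window lowers every spectral bin by the same amount (about -24.6 dB for N = 64), so the log spectrum only moves by a constant as long as no value reaches a floor or truncation threshold.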

The next plot shows the Hamming window used in librosa and in Essentia (Normalized = True). Note the bottom of the plot.

The next two plots show the mean and standard deviation of the MFCCs computed over 1500 frames of the same audio file, for librosa and Essentia. Top: with window normalization. Bottom: without window normalization.

Comment:
This result holds for this particular scenario, audio content (soundscapes) and classifier (GMM). Would something similar happen in a different scenario?

dbogdanov · 2016-12-22T17:13:00Z

I've created a separate issue, #543, concerning changes in MFCC values due to signal level. Normalized windowing will further contribute to this problem by making the mel energy values even smaller.
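
A minimal sketch of why smaller mel energies matter once a silence floor is involved. All numbers here (band energies, gain, floor) are hypothetical and are not the values used by Essentia or librosa; the point is only that clamping turns a uniform level shift into a band-dependent one.

```python
import math


def log_band_energies_db(energies, silence_floor):
    """Return 10*log10 of each band energy, clamping values below silence_floor."""
    return [10.0 * math.log10(max(e, silence_floor)) for e in energies]


# Hypothetical mel band energies of one frame, from loud to nearly silent.
band_energies = [1e-2, 1e-5, 1e-8, 1e-11]

# Hypothetical amplitude scaling introduced by a normalized window;
# energies scale with the square of the amplitude gain.
amplitude_gain = 0.05
energy_gain = amplitude_gain ** 2

# Hypothetical silence floor; NOT the threshold of either library.
SILENCE_FLOOR = 1e-9

plain_db = log_band_energies_db(band_energies, SILENCE_FLOOR)
scaled_db = log_band_energies_db(
    [e * energy_gain for e in band_energies], SILENCE_FLOOR
)
shifts = [s - p for p, s in zip(plain_db, scaled_db)]

for p, s, d in zip(plain_db, scaled_db, shifts):
    print(f"plain {p:7.2f} dB   scaled {s:7.2f} dB   shift {d:7.2f} dB")
```

With these made-up numbers the two loud bands shift by about -26 dB, the third band is clamped and shifts by only -10 dB, and the last band does not move at all. The shape of the log-mel vector therefore changes, and so do the MFCCs derived from it.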

dbogdanov · 2016-12-22T17:16:20Z

We might want to change normalized to False by default.

dbogdanov · 2017-10-05T15:19:36Z

@edufonseca Do you still have your scripts, so that the accuracy difference with normalized windows can be evaluated again? (Since we lowered the silence threshold in #543, maybe the normalization is no longer a problem.)

ChenJunHero · 2018-07-24T09:17:31Z

I also found large differences in the spectrum amplitude matrix between Essentia and librosa. I suspect they are caused by "Padding" or "StartFromZero". I will try to get the formant frequencies and trace the difference in the results.

sildeag · 2018-07-24T20:09:46Z

@edufonseca Did you compare with any other tools, e.g. OpenSmile?

edufonseca · 2018-07-25T00:10:54Z

No, only with librosa. I think there is a good chance that the differences between librosa and Essentia have been mitigated in later Essentia versions.

-- Eduardo Fonseca, Music Technology Group, Universitat Pompeu Fabra


dbogdanov · 2018-07-25T11:17:16Z

One of the main differences with librosa is in the silence threshold. We have made some updates related to that in the mfcc_thresholding branch, but it is not merged yet. You can try comparing with MFCCs computed using that branch.

dbogdanov added the algorithms QA label Dec 4, 2016

dbogdanov added this to the 2.1 milestone Dec 4, 2016

dbogdanov mentioned this issue Dec 14, 2016

New Music Extractor #533

Open

dbogdanov mentioned this issue Apr 3, 2017

Changes in FreesoundExtractor #582

Open
