Poor results #1442

johnbuts · 2024-10-15T17:58:02Z

Hey everyone, thanks in advance for the help.

So I wanted to use some of the instrument detection models, and was not impressed by the results. I fed the model a WAV file containing only saxophone, around 1 minute and 10 seconds long. Here is the code and the output I got:

```python
from essentia.standard import MonoLoader, TensorflowPredictEffnetDiscogs, TensorflowPredict2D
import pandas as pd

audio = MonoLoader(filename="other_sax.wav", sampleRate=75000, resampleQuality=4)()
embedding_model = TensorflowPredictEffnetDiscogs(graphFilename="discogs-effnet-bs64-1.pb", output="PartitionedCall:1")
embeddings = embedding_model(audio)

model = TensorflowPredict2D(graphFilename="mtg_jamendo_instrument-discogs-effnet-1.pb")
predictions = model(embeddings)

instruments = [
    'accordion', 'acousticbassguitar', 'acousticguitar', 'bass', 'beat', 'bell', 'bongo', 'brass',
    'cello', 'clarinet', 'classicalguitar', 'computer', 'doublebass', 'drummachine', 'drums',
    'electricguitar', 'electricpiano', 'flute', 'guitar', 'harmonica', 'harp', 'horn', 'keyboard',
    'oboe', 'orchestra', 'organ', 'pad', 'percussion', 'piano', 'pipeorgan', 'rhodes', 'sampler',
    'saxophone', 'strings', 'synthesizer', 'trombone', 'trumpet', 'viola', 'violin', 'voice'
]

df = pd.DataFrame(predictions, columns=instruments)
instrument_sums = df.sum()

top_5_instruments = instrument_sums.sort_values(ascending=False).head(5)

print(top_5_instruments)
```

output:
synthesizer 218.628510
piano 175.836365
drums 113.429985
cello 85.704750
flute 83.436066

Please tell me what I'm doing wrong, thanks.

palonso · 2024-10-15T21:28:36Z

Hi @johnbuts
The problem with your script is that MonoLoader's sampleRate parameter should match the model's expected sample rate (16000 Hz).
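To illustrate why the mismatch matters, here is a minimal NumPy-only sketch (it does not use Essentia, and the helper names `apparent_frequency` and `dominant_frequency` are made up for this example). It assumes the model simply treats whatever samples it receives as 16000 Hz audio, so a signal loaded at 75000 Hz would look slowed down and pitched down by a factor of 75000 / 16000. In the original script, the corresponding change would be passing `sampleRate=16000` to MonoLoader.

```python
import numpy as np


def apparent_frequency(true_hz, load_rate, assumed_rate):
    """Frequency a consumer 'hears' when samples taken at load_rate
    are interpreted as if they were sampled at assumed_rate."""
    return true_hz * assumed_rate / load_rate


def dominant_frequency(signal, rate):
    """Return the frequency (Hz) of the strongest FFT bin, given a sample rate."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / rate)
    return freqs[np.argmax(spectrum)]


load_rate = 75000   # rate used in the original script
model_rate = 16000  # rate the model expects, per the reply above

# One second of a 440 Hz tone, sampled at the (wrong) loading rate.
t = np.arange(load_rate) / load_rate
tone = np.sin(2 * np.pi * 440.0 * t)

# The same samples, analysed as if they had been recorded at 16000 Hz.
seen_by_model = dominant_frequency(tone, model_rate)
predicted = apparent_frequency(440.0, load_rate, model_rate)

print(f"tone as seen at {model_rate} Hz: {seen_by_model:.2f} Hz")
print(f"predicted by the ratio:        {predicted:.2f} Hz")
```

Under that assumption, an A4 from a saxophone would reach the model more than two octaves too low, which could plausibly explain predictions drifting toward unrelated instruments.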

johnbuts · 2024-10-15T22:10:50Z

drums 53.502373
bass 42.751842
electricguitar 40.419640
piano 34.878933
guitar 32.601994

That didn't seem to help. I've been toying around with the sample rate, and nothing really seems to help. Is my code maybe wrong, like the order of the instruments or something?

johnbuts · 2024-10-15T23:39:11Z

It's like no matter the instrument, it's always really high on synthesizer and piano. For example, I'll put in a violin track and get this:
synthesizer 70.941162
piano 69.683495
drums 65.955116
electricguitar 63.488628
guitar 62.260086

and then I'll put in a saxophone track and get this:
drums 73.532722
synthesizer 72.120262
piano 70.608063
bass 69.452019
electricguitar 66.740585
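One thing that may make these numbers hard to compare: the script sums the per-patch predictions, so the totals grow with track length. A minimal sketch of averaging over patches instead is shown below. It uses synthetic data rather than real model output, the helper `track_level_scores` is hypothetical, and it assumes each row of `predictions` holds one activation per label in the same order as the label list (worth checking against the metadata shipped with the model).

```python
import numpy as np


def track_level_scores(predictions, labels):
    """Average per-patch activations and return (label, score) pairs,
    sorted from highest to lowest score."""
    predictions = np.asarray(predictions, dtype=float)
    if predictions.ndim != 2 or predictions.shape[1] != len(labels):
        raise ValueError(
            f"expected shape (n_patches, {len(labels)}), got {predictions.shape}"
        )
    mean_scores = predictions.mean(axis=0)
    order = np.argsort(mean_scores)[::-1]
    return [(labels[i], float(mean_scores[i])) for i in order]


# Synthetic stand-in for model output: 3 labels, 2 patches.
labels = ["drums", "piano", "saxophone"]
short_track = np.array([[0.1, 0.2, 0.9],
                        [0.2, 0.1, 0.8]])
# The same content repeated 50 times, i.e. a much longer track.
long_track = np.tile(short_track, (50, 1))

short_top = track_level_scores(short_track, labels)[0]
long_top = track_level_scores(long_track, labels)[0]

print("mean, short track:", short_top)
print("mean, long track: ", long_top)
print("sum,  short track:", short_track.sum(axis=0))
print("sum,  long track: ", long_track.sum(axis=0))
```

With the mean, both tracks give the same top label and a comparable score, whereas the summed values differ by the length ratio. The shape check also catches a label list whose length does not match the model output, though it cannot detect a list that is the right length but in the wrong order.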

csipapicsa · 2024-10-26T13:29:13Z

Have you tried setting resampleQuality=0?

resampleQuality (integer ∈ [0, 4], default = 1) :
the resampling quality, 0 for best quality, 4 for fast linear approximation

johnbuts closed this as completed Oct 15, 2024

johnbuts reopened this Oct 15, 2024
