Don't know how to copy audio into object #31

Open
5pacedo9 opened this issue Dec 11, 2023 · 9 comments

Comments

@5pacedo9

5pacedo9 commented Dec 11, 2023

As shown in docs/Introduction.txt and examples/basic.cpp, I set FrameRate, Channels, and SampleCount using values I get from TagLib. Then I need to copy the audio into the object as instructed, but my audio file is an MP3 and its sampleSize is not an integer (I don't know whether sampleSize is calculated correctly; if it isn't, please tell me). As a result, I don't know how to copy my audio into the object. In addition, I don't understand this code in basic.cpp:

// Copy audio into the object
    int8_t* buffer = new int8_t[sampleSize];
    int i = 0;
    while (fread(buffer, sizeof(buffer[0]), sampleSize / (sizeof buffer[0]), file) == sampleSize) {
        uint32_t sample = 0;
        for(int i = 0; i < sampleSize; i++) {
            sample |= buffer[i] << (i * 8);
        }
        a.setSample(i, sample);
        i++;
    }
    delete[] buffer;
    fclose(file);

Here's my code:

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    const char *filename = "/Users/5pacedo9/Music/网易云音乐/Anyma - The Answer (Extended Version).mp3";

    TagLib::FileRef f(filename);

    if(!f.isNull() && f.audioProperties()) {

        TagLib::AudioProperties *properties = f.audioProperties();

        int seconds = properties->lengthInSeconds() % 60;
        int minutes = (properties->lengthInSeconds() - seconds) / 60;

        cout << "-- AUDIO --" << endl;
        cout << "bitrate     - " << properties->bitrate() << endl;
        cout << "sample rate - " << properties->sampleRate() << endl;
        cout << "channels    - " << properties->channels() << endl;
        cout << "length      - " << minutes << ":" << setfill('0') << setw(2) << right << seconds << endl;
    }


    // Build an empty audio object
    KeyFinder::AudioData b;

    TagLib::AudioProperties *pr = f.audioProperties();
    // Prepare the object for your audio stream
    b.setFrameRate(pr->sampleRate());
    b.setChannels(pr->channels());
    b.addToSampleCount(pr->sampleRate() * pr->lengthInSeconds() );

    // Copy your audio into the object
    //const size_t sampleSize = pr-> bitrate() / pr->channels() / pr->sampleRate() / 8.0 ;
    double ss = 1024.0 * pr-> bitrate() / (pr->channels() * pr->sampleRate()) / 1.0;
    double sampleSize = ss;

    FILE *file = fopen(filename, "r");
    int8_t* buffer = new int8_t[sampleSize];
    int i = 0;
    while (fread(buffer, sizeof(buffer[0]), sampleSize / (sizeof buffer[0]), file) == sampleSize) {
        uint32_t sample = 0;
        for(int i = 0; i < sampleSize; i++) {
            sample |= buffer[i] << (i * 8);
        }
        b.setSample(i, sample);
        i++;
    }
    delete[] buffer;
    fclose(file);

    // Run the analysis
    KeyFinder::KeyFinder k;
    KeyFinder::key_t key =  k.keyOfAudio(b);
    // And do something with the result

    switch(key) {
    case KeyFinder::A_MAJOR:
        puts("A major\n");
        break;
....................................
    case KeyFinder::SILENCE:
        puts("Silence\n");
        break;
    }
    return a.exec();
}

output:

-- AUDIO --
bitrate     - 320
sample rate - 44100
channels    - 2
length      - 6:27
Silence

The value of sampleSize is about 3.71.
Because sampleSize is not an integer, I get the result "Silence". If I set sampleSize manually to 3 or 4, I get a wrong key such as B minor or E minor, but the real key of the song is G minor.
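
To make the arithmetic concrete, here is a minimal sketch that converts an average bitrate into bytes per single-channel sample. The helper name `bytesPerSample` is hypothetical, and it assumes the reported bitrate is in kilobits per second with 1 kbit = 1000 bits. With the values printed above (320, 44100, 2) it gives roughly 0.45 bytes per sample; the 3.71 figure is the same ratio expressed in bits, using 1024 instead of 1000. Either way the result is not a whole number of bytes, which is what you would expect from a compressed stream rather than raw PCM.

```cpp
#include <cassert>

// Hypothetical helper: average bytes per single-channel sample implied by
// an average bitrate. Assumes bitrateKbps uses 1 kbit = 1000 bits.
double bytesPerSample(int bitrateKbps, int sampleRate, int channels) {
    assert(sampleRate > 0 && channels > 0);
    const double bytesPerSecond = bitrateKbps * 1000.0 / 8.0;
    const double samplesPerSecond = static_cast<double>(sampleRate) * channels;
    return bytesPerSecond / samplesPerSecond;
}
```

For comparison, calling `bytesPerSample(1411, 44100, 2)` for a 16-bit stereo WAV gives a value very close to 2, which matches a bit depth of 16.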

@Swiftb0y
Member

Swiftb0y commented Dec 14, 2023

As far as I can tell, the keyfinder example code assumes the file is plain PCM-encoded audio data (e.g. a WAV or AIFF). The fread loop is essentially a super simple decoder that transforms PCM audio data into a buffer of float samples (the format commonly used in digital signal processing and in Keyfinder).
The file you're using is an MP3, so the "decoder" from the example is obviously producing garbage from the binary data read from that MP3. The odd sampleSize is an obvious hint that the decoding doesn't work.

In order to proceed, either ensure your input audio is in the format the example's decoder expects, or use a proper audio decoding library (such as ffmpeg) that produces the float sample buffer for you.
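
For illustration, here is a minimal sketch of what such a simple PCM decoder does. It is not the library's code: the function name `decodePcm16le` is hypothetical, and it assumes signed 16-bit little-endian samples, that the caller has already skipped the file header, and that scaling to roughly [-1.0, 1.0) is acceptable. Bytes are assembled as unsigned values; with a signed int8_t buffer, as in the quoted loop, a byte of 0x80 or above sign-extends when shifted and sets bits it should not.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: decode interleaved signed 16-bit little-endian PCM
// bytes into samples scaled to roughly [-1.0, 1.0).
// Assumes the header has already been skipped; a trailing odd byte is ignored.
std::vector<double> decodePcm16le(const std::vector<std::uint8_t>& bytes) {
    std::vector<double> samples;
    samples.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        // Low byte first, then high byte; both are unsigned here.
        int value = bytes[i] | (bytes[i + 1] << 8);
        // Map the unsigned 16-bit pattern onto the signed range.
        if (value >= 32768) {
            value -= 65536;
        }
        samples.push_back(value / 32768.0);
    }
    return samples;
}
```

Each returned element is one interleaved sample, so a stereo file yields left and right values alternately.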

I will take the liberty of closing this issue, as it is not related to Keyfinder itself. The examples probably need some thorough modernization as well as better explanations to avoid this kind of confusion in the future.
Edit: I don't have the rights to close this issue...

@5pacedo9
Author

5pacedo9 commented Dec 16, 2023

Thank you @Swiftb0y, I'll try to get the float samples from the MP3 file. I've also found that in ibsh's repo is_KeyFinder, he used libav to decode MP3 files into the float samples that libkeyfinder expects, but the code in that repo is too complicated for me to understand because I know little about encoding and decoding audio files.

@Swiftb0y
Member

Yes, that's exactly the code you need in order to use keyfinder with audio data that is not plain WAV. If decoding the file automatically is too complicated for you, you can use the simpler example code from the beginning (even though I don't recommend it, because its simplicity brings some major flaws), provided you manually transcode the file you're interested in to a .wav beforehand.

@5pacedo9
Author

5pacedo9 commented Dec 19, 2023

@Swiftb0y, I've decided not to spend time on understanding the decoding code for now. I did what basic.cpp does, using an original .wav file as input, and then ran into a new problem, the same one reported in other issues (#16, #20):
a "cannot set out-of-bounds sample" exception.
Code:

`
void getKey()
{
const char *filename = "/Volumes/B/Imagine Dragons - Smoke + Mirrors/Smoke + Mirrors - 11 Summer.wav";
TagLib::FileRef f(filename);

if(!f.isNull() && f.audioProperties()) {

    TagLib::RIFF::WAV::File *ffile = new TagLib::RIFF::WAV::File(filename);
    //::FLAC::Properties *properties = new TagLib::FLAC::Properties(ffile);

    int seconds = ffile->audioProperties()->lengthInSeconds() % 60;
    int minutes = (ffile->audioProperties()->lengthInSeconds() - seconds) / 60;

    cout << "-- AUDIO --" << endl;
    cout << "bitrate     - " << ffile->audioProperties()->bitrate() << endl;
    cout << "sample rate - " << ffile->audioProperties()->sampleRate() << endl;
    cout << "channels    - " << ffile->audioProperties()->channels() << endl;
    cout << "length      - " << minutes << ":" << setfill('0') << setw(2) << right << seconds << endl;
    cout << "bit depth   - " << ffile->audioProperties()->bitsPerSample() << endl;
}


// Build an empty audio object
KeyFinder::AudioData b;

TagLib::RIFF::WAV::File *ffile = new TagLib::RIFF::WAV::File(filename);
// Prepare the object for your audio stream
b.setFrameRate(ffile->audioProperties()->sampleRate());
b.setChannels(ffile->audioProperties()->channels());
b.addToSampleCount(ffile->audioProperties()->sampleRate() * ffile->audioProperties()->lengthInSeconds() );


// Copy your audio into the object
const size_t sampleSize = ffile->audioProperties()->bitsPerSample() / 8;

FILE *file = fopen(filename, "r");
int8_t* buffer = new int8_t[sampleSize];
int i = 0;
while (fread(buffer, sizeof(buffer[0]), sampleSize / (sizeof buffer[0]), file) == sampleSize) {
    uint32_t sample = 0;
    for(int i = 0; i < sampleSize; i++) {
        sample |= buffer[i] << (i * 8);
    }
    b.setSample(i, sample);
    i++;
}
delete[] buffer;
fclose(file);

// Run the analysis
KeyFinder::KeyFinder k;
KeyFinder::key_t key =  k.keyOfAudio(b);
// And do something with the result

switch(key) {
case KeyFinder::A_MAJOR:
    puts("A major\n");
    break;

...............
case KeyFinder::SILENCE:
puts("Silence\n");
break;
}
}
output:
-- AUDIO --
bitrate - 1411
sample rate - 44100
channels - 2
length - 3:38
bit depth - 16
libc++abi.dylib: terminating with uncaught exception of type KeyFinder::Exception: Cannot set out-of-bounds sample (9613800/9613800)
qtc.process_stub: Inferior error: QProcess::Crashed Process crashed
`

@5pacedo9
Author

5pacedo9 commented Dec 19, 2023

And the source code in libkeyfinder/src/fftadapter.cpp:
void FftAdapter::setInput(unsigned int i, double real) {
    if (i >= frameSize) {
        std::ostringstream ss;
        ss << "Cannot set out-of-bounds sample (" << i << "/" << frameSize << ")";
        throw Exception(ss.str().c_str());
    }
    if (!std::isfinite(real)) {
        throw Exception("Cannot set sample to NaN");
    }
    priv->inputReal[i] = real;
}
I'm not sure whether 'frameSize' is calculated incorrectly.
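
One possible reading of the exception, offered as an assumption rather than a confirmed diagnosis: 9613800 is exactly 44100 × 218 (the 3:38 length in seconds), which is the value passed to addToSampleCount in the code above. If the sample count is meant to cover interleaved samples from every channel, a stereo file needs twice that many slots. In addition, lengthInSeconds() is a whole number of seconds, so the file may hold slightly more audio than the estimate, and the read loop starts at byte 0, so header bytes are written as samples too. The sketch below only shows the arithmetic and a bound for the write loop; both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of interleaved samples in a PCM stream,
// one sample per channel per frame.
std::size_t interleavedSampleCount(std::size_t frameRate,
                                   std::size_t seconds,
                                   std::size_t channels) {
    return frameRate * seconds * channels;
}

// Hypothetical helper: how many samples a copy loop may write. Never more
// than were reserved, and never more than the data bytes can supply.
std::size_t writableSamples(std::size_t reserved,
                            std::size_t dataBytes,
                            std::size_t bytesPerSample) {
    assert(bytesPerSample > 0);
    const std::size_t available = dataBytes / bytesPerSample;
    return available < reserved ? available : reserved;
}
```

With the values from the output above, `interleavedSampleCount(44100, 218, 2)` is 19227600, twice the 9613800 in the exception message. Stopping the loop at the value returned by `writableSamples` would keep the index in range even when the duration estimate is slightly short.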

@5pacedo9 5pacedo9 reopened this Dec 19, 2023
@Swiftb0y
Member

Swiftb0y commented Dec 19, 2023

Yeah. I'm not a keyfinder dev, so I don't know how to solve that or even what exactly the cause is. Try b.setFrameRate(ffile->audioProperties()->sampleRate() / ffile->audioProperties()->channels());. That's the most obvious issue I can spot in your code.

@5pacedo9
Author

Thank you @Swiftb0y, I tried b.setFrameRate(ffile->audioProperties()->sampleRate() / ffile->audioProperties()->channels()); and still got the same exception from keyfinder.

@Swiftb0y
Member

Yeah, I'm not sure what the problem is. I'm sorry, I don't have the time to look into it.

@5pacedo9
Author

It's ok, thanks for your help.
