Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix time offset in key and beat analysis #2152

Merged
merged 14 commits into from
Aug 8, 2019
Merged
Show file tree
Hide file tree
Changes from 11 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 9 additions & 14 deletions src/analyzer/analyzerthread.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -256,7 +256,8 @@ AnalyzerThread::AnalysisResult AnalyzerThread::analyzeAudioSource(
}

// 2nd: step: Analyze chunk of decoded audio data
if (readableSampleFrames.frameLength() == mixxx::kAnalysisFramesPerBlock) {
if (readableSampleFrames.frameLength() == mixxx::kAnalysisFramesPerBlock ||
remainingFrames.empty()) {
// Complete chunk of audio samples has been read for analysis
for (auto&& analyzer : m_analyzers) {
analyzer.processSamples(
Expand All @@ -267,19 +268,13 @@ AnalyzerThread::AnalysisResult AnalyzerThread::analyzeAudioSource(
result = AnalysisResult::Complete;
}
} else {
// Partial chunk of audio samples has been read.
// This should only happen at the end of an audio stream,
// otherwise a decoding error must have occurred.
if (remainingFrames.empty()) {
result = AnalysisResult::Complete;
} else {
// EOF not reached -> Maybe a corrupt file?
kLogger.warning()
<< "Aborting analysis after failure to read sample data:"
<< "expected frames =" << inputFrameIndexRange
<< ", actual frames =" << readableSampleFrames.frameIndexRange();
result = AnalysisResult::Partial;
}
// Partial chunk of audio samples has been read, but not the final.
// A decoding error must have occurred, maybe a corrupt file?
kLogger.warning()
<< "Aborting analysis after failure to read sample data:"
<< "expected frames =" << inputFrameIndexRange
<< ", actual frames =" << readableSampleFrames.frameIndexRange();
result = AnalysisResult::Partial;
}

// Don't check again for paused/stopped and simply finish the
Expand Down
45 changes: 28 additions & 17 deletions src/analyzer/plugins/analyzerqueenmarybeats.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -53,7 +53,6 @@ bool AnalyzerQueenMaryBeats::initialize(int samplerate) {
}

bool AnalyzerQueenMaryBeats::processSamples(const CSAMPLE* pIn, const int iLen) {
DEBUG_ASSERT(iLen == kAnalysisSamplesPerBlock);
DEBUG_ASSERT(iLen % kAnalysisChannels == 0);
if (!m_pDetectionFunction) {
return false;
Expand All @@ -63,30 +62,21 @@ bool AnalyzerQueenMaryBeats::processSamples(const CSAMPLE* pIn, const int iLen)
}

bool AnalyzerQueenMaryBeats::finalize() {
// TODO(rryan) if iLen is less than frame size, pad with zeros. Do we need
// flush support?
m_helper.finalize();

int nonZeroCount = m_detectionResults.size();
while (nonZeroCount > 0 && m_detectionResults.at(nonZeroCount - 1) <= 0.0) {
--nonZeroCount;
}

std::vector<double> df;
std::vector<double> beatPeriod;
std::vector<double> beatPeriod(nonZeroCount);
std::vector<double> tempi;

df.reserve(nonZeroCount);
beatPeriod.reserve(nonZeroCount);

// NOTE(rryan): The VAMP plugin skipped the first 2 detection function
// results so I do as well. Not sure why.
for (int i = 2; i < nonZeroCount; ++i) {
for (int i = 0; i < nonZeroCount; ++i) {
df.push_back(m_detectionResults.at(i));
beatPeriod.push_back(0.0);
}

if (df.empty()) {
return false;
}

TempoTrackV2 tt(m_iSampleRate, kStepSize);
Expand All @@ -95,10 +85,31 @@ bool AnalyzerQueenMaryBeats::finalize() {
std::vector<double> beats;
tt.calculateBeats(df, beatPeriod, beats);

m_resultBeats.resize(beats.size());
double* result = (double*)&m_resultBeats.at(0);
for (size_t i = 0; i < beats.size(); ++i) {
result[i] = beats[i] * kStepSize;
// In some tracks a beat at 0:00 is detected when a noise floor starts.
// Here we check the level and the position for plausibility and remove
// the beat if this is the case.
size_t firstBeat = 0;
if (beats.size() >= 3) {
if (m_detectionResults.at(beats.at(0)) <
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 times beats.at(0)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

(m_detectionResults.at(beats.at(0)) +
m_detectionResults.at(beats.at(0))) / 4) {
// the beat is not half es high than the average of the two
// following beats. Skip it.
firstBeat = 1;
} else {
int diff = (beats.at(1) - beats.at(0)) - (beats.at(2) - beats.at(1));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why in the else part? Shouldn't this be applied regardless if firstBeat = 0 or 1, i.e. after shifting the index of the first beat?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...but then we would shift the first beat once more. Not sure what the intention was here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I am trying to solve is that the fist beat is often off.
It is detected when a noise starts, or it is detected if the track does not start at a beat.
If the automatic detected cue uses this beat, the track is cued in off beat.
The code drops this beat by these two conditions. Thinking about it, it is probably sufficient to use the time condition only.
What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the details of beat detection algorithm and would follow your recommendation. Just tried to verify the code and check for inconsistencies or unintended behaviour.

// we don't allow a signifcant tempo change after the first beat
if (diff > 2 || diff < -2) {
// firt beat is off grid. Skip it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

firstBeat = 1;
}
}
}

m_resultBeats.reserve(beats.size());
for (size_t i = firstBeat; i < beats.size(); ++i) {
double result = beats[i] * kStepSize;
m_resultBeats.push_back(result);
}

m_pDetectionFunction.reset();
Expand Down
3 changes: 0 additions & 3 deletions src/analyzer/plugins/analyzerqueenmarykey.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -77,9 +77,7 @@ bool AnalyzerQueenMaryKey::initialize(int samplerate) {
}

bool AnalyzerQueenMaryKey::processSamples(const CSAMPLE* pIn, const int iLen) {
DEBUG_ASSERT(iLen == kAnalysisSamplesPerBlock);
uklotzde marked this conversation as resolved.
Show resolved Hide resolved
DEBUG_ASSERT(iLen % kAnalysisChannels == 0);

if (!m_pKeyMode) {
return false;
}
Expand All @@ -90,7 +88,6 @@ bool AnalyzerQueenMaryKey::processSamples(const CSAMPLE* pIn, const int iLen) {
}

bool AnalyzerQueenMaryKey::finalize() {
// TODO(rryan) do we need a flush?
m_helper.finalize();
m_pKeyMode.reset();
return true;
Expand Down
1 change: 0 additions & 1 deletion src/analyzer/plugins/analyzersoundtouchbeats.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,6 @@ bool AnalyzerSoundTouchBeats::processSamples(const CSAMPLE* pIn, const int iLen)
if (!m_pSoundTouch) {
return false;
}
DEBUG_ASSERT(iLen == kAnalysisSamplesPerBlock);
DEBUG_ASSERT(iLen % kAnalysisChannels == 0);
// We analyze a mono mixdown of the signal since we don't think stereo does
// us any good.
Expand Down
52 changes: 38 additions & 14 deletions src/analyzer/plugins/buffering_utils.cpp
Original file line number Diff line number Diff line change
@@ -1,34 +1,63 @@
#include "analyzer/plugins/buffering_utils.h"

#include "util/math.h"
#include "util/sample.h"

#include <string.h>

namespace mixxx {

bool DownmixAndOverlapHelper::initialize(size_t windowSize, size_t stepSize, WindowReadyCallback callback) {
m_buffer.resize(windowSize);
m_buffer.assign(windowSize, 0.0);
m_callback = callback;
m_windowSize = windowSize;
m_stepSize = stepSize;
m_bufferWritePosition = 0;
// make sure the first frame is centered into the fft window. This makes sure
// that the result is significant starting fom the first step.
m_bufferWritePosition = windowSize / 2;
uklotzde marked this conversation as resolved.
Show resolved Hide resolved
return m_windowSize > 0 && m_stepSize > 0 &&
m_stepSize <= m_windowSize && callback;
}

bool DownmixAndOverlapHelper::processStereoSamples(const CSAMPLE* pInput, size_t inputStereoSamples) {
const size_t numInputFrames = inputStereoSamples / 2;
return processInner(pInput, numInputFrames);
}

bool DownmixAndOverlapHelper::finalize() {
// We need to append at least m_windowSize / 2 - m_stepSize silence
// to have a valid analysis results for the last track samples.
// Since we go in fixed steps up to "m_stepSize - 1" samples remains
// unrocessed. That th reason we use "m_windowSize / 2 - 1" below
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor typos in comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping

// instead of "m_windowSize / 2 - m_stepSize"
size_t framesToFillWindow = m_windowSize - m_bufferWritePosition;
size_t numInputFrames = math_max(framesToFillWindow, m_windowSize / 2 - 1);
return processInner(nullptr, numInputFrames);
}

bool DownmixAndOverlapHelper::processInner(
const CSAMPLE* pInput, size_t numInputFrames) {
size_t inRead = 0;
double* pDownmix = m_buffer.data();

while (inRead < numInputFrames) {
size_t writeAvailable = math_min(numInputFrames,
m_windowSize - m_bufferWritePosition);

for (size_t i = 0; i < writeAvailable; ++i) {
// We analyze a mono downmix of the signal since we don't think
// stereo does us any good.
pDownmix[m_bufferWritePosition + i] = (pInput[(inRead + i) * 2] +
pInput[(inRead + i) * 2 + 1]) *
0.5;
if (pInput) {
for (size_t i = 0; i < writeAvailable; ++i) {
// We analyze a mono downmix of the signal since we don't think
// stereo does us any good.
pDownmix[m_bufferWritePosition + i] = (pInput[(inRead + i) * 2] +
pInput[(inRead + i) * 2 + 1]) *
0.5;
}
} else {
// we are in the finalize call. Add silence to
// complete samples left in th buffer.
for (size_t i = 0; i < writeAvailable; ++i) {
pDownmix[m_bufferWritePosition + i] = 0;
}
}
m_bufferWritePosition += writeAvailable;
inRead += writeAvailable;
Expand All @@ -52,9 +81,4 @@ bool DownmixAndOverlapHelper::processStereoSamples(const CSAMPLE* pInput, size_t
return true;
}

bool DownmixAndOverlapHelper::finalize() {
// TODO(rryan) flush support?
return true;
}

}
} // namespace mixxx
2 changes: 2 additions & 0 deletions src/analyzer/plugins/buffering_utils.h
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,8 @@ class DownmixAndOverlapHelper {
bool finalize();

private:
bool processInner(const CSAMPLE* pInput, size_t numInputFrames);

std::vector<double> m_buffer;
// The window size in frames.
size_t m_windowSize = 0;
Expand Down
2 changes: 1 addition & 1 deletion src/track/keyfactory.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -98,7 +98,7 @@ Keys KeyFactory::makePreferredKeys(
pChange->set_key(it->first);
pChange->set_frame_position(frame);
}
key_map.set_global_key(KeyUtils::calculateGlobalKey(key_changes, iTotalSamples));
key_map.set_global_key(KeyUtils::calculateGlobalKey(key_changes, iTotalSamples, iSampleRate));
key_map.set_source(mixxx::track::io::key::ANALYZER);
Keys keys(key_map);
keys.setSubVersion(subVersion);
Expand Down
4 changes: 2 additions & 2 deletions src/track/keyutils.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -395,7 +395,7 @@ ChromaticKey KeyUtils::scaleKeySteps(ChromaticKey key, int key_changes) {

// static
mixxx::track::io::key::ChromaticKey KeyUtils::calculateGlobalKey(
const KeyChangeList& key_changes, const int iTotalSamples) {
const KeyChangeList& key_changes, const int iTotalSamples, int iSampleRate) {
const int iTotalFrames = iTotalSamples / 2;
QMap<mixxx::track::io::key::ChromaticKey, double> key_histogram;

Expand All @@ -413,7 +413,7 @@ mixxx::track::io::key::ChromaticKey KeyUtils::calculateGlobalKey(
qDebug() << "Key Histogram";
for (auto it = key_histogram.constBegin();
it != key_histogram.constEnd(); ++it) {
qDebug() << it.key() << ":" << keyDebugName(it.key()) << it.value();
qDebug() << it.key() << ":" << keyDebugName(it.key()) << it.value() / iSampleRate;
if (it.value() > max_delta) {
max_key = it.key();
max_delta = it.value();
Expand Down
2 changes: 1 addition & 1 deletion src/track/keyutils.h
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ class KeyUtils {
static mixxx::track::io::key::ChromaticKey guessKeyFromText(const QString& text);

static mixxx::track::io::key::ChromaticKey calculateGlobalKey(
const KeyChangeList& key_changes, int iTotalSamples);
const KeyChangeList& key_changes, int iTotalSamples, int iSampleRate);

static void setNotation(
const QMap<mixxx::track::io::key::ChromaticKey, QString>& notation);
Expand Down