
Fix "Loading Track .." forever #1446

Merged
merged 4 commits into from
Mar 16, 2018
Merged

Fix "Loading Track .." forever #1446

merged 4 commits into from
Mar 16, 2018

Conversation

daschuer (Member) commented Dec 29, 2017

This fixes
https://bugs.launchpad.net/mixxx/+bug/1740448

Both rare bugs can be simulated by stopping the audio callback.
Could this be related to the hard-to-catch race condition in the analyzer?

Be-ing added this to the 2.1.0 milestone on Dec 29, 2017
uklotzde (Contributor) commented Jan 7, 2018

I remember that I might have run into both issues in the past. Do you have simple instructions for reproducing both reliably, for verification? Maybe even by injecting a line of code to trigger the issues? I'm too lazy to figure it out myself ;)

daschuer (Member, Author) commented Jan 7, 2018

On Linux with ALSA you can use a 2.67 ms buffer size and provoke a buffer underrun; this makes the audio callback not return for a long time (at least on my system). This is a pending bug.
If you do not suffer from this bug, you can add

    if (Experiment::isEnabled()) { for (;;); }

somewhere in the engine thread and enable it via the developer menu.

while (i != m_workerStates.end()) {
    if (i.key() && i.value()) {
        i.key()->wake();
        i.value() = false;

Contributor

This is a typical race condition: the worker thread might re-schedule itself after being woken up, but before the scheduler thread resets the flag to false, so the new request is lost. Even if the probability is low, these are the critical spots that distinguish correct from incorrect multi-threaded code!

Contributor

If you can ensure that the worker is not running during this loop then switching both lines would be sufficient. Without the "not running" assumption you need to use an atomic flag and clear() instead of a simple boolean. This is what I would prefer instead of relying on some implicit assumptions that may change at any time without notice.
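
A minimal sketch of the flag-based approach suggested here, using std::atomic_flag (the names WorkerFlag, markReady, and consumeReady are illustrative, not the PR's actual code):

    #include <atomic>

    class WorkerFlag {
      public:
        WorkerFlag() { m_notReady.test_and_set(); }  // start as "not ready"

        // Engine side: signal that the worker has pending work.
        // Safe to call at any time, even while the worker is running
        // and about to re-schedule itself.
        void markReady() { m_notReady.clear(std::memory_order_release); }

        // Scheduler side: atomically consume the pending state.
        // test_and_set() returns the previous value, so a false result
        // means "was ready"; the flag is re-armed in the same atomic
        // step, so a concurrent markReady() can never be lost.
        bool consumeReady() {
            return !m_notReady.test_and_set(std::memory_order_acquire);
        }

      private:
        std::atomic_flag m_notReady = ATOMIC_FLAG_INIT;
    };

Because the test and the reset happen in one atomic operation, the window between wake() and resetting the boolean in the reviewed loop above disappears.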

@@ -30,7 +31,7 @@ class EngineWorkerScheduler : public QThread {
    // runWorkers was run. This should only be touched from the engine callback.
    bool m_bWakeScheduler;

    FIFO<EngineWorker*> m_scheduleFIFO;
    QHash<EngineWorker*, bool> m_workerStates;

Contributor

How do we ensure that the QHash is never modified from another thread while being read or iterated from the scheduler thread? This should be documented in every detail, because we are in multi-threaded land here! It is not obvious at first sight.

@@ -27,11 +24,17 @@ void EngineWorkerScheduler::workerReady(EngineWorker* pWorker) {
    // If the write fails, we really can't do much since we should not block
    // in this slot. Write the address of the variable pWorker, since it is
    // a 1-element array.
    m_scheduleFIFO.write(&pWorker, 1);
    m_workerStates[pWorker] = true;

Contributor

This needs at least a debug assertion that the worker has been registered before, to avoid modifying the QHash. But as already explained, I'm not happy with the decision to use QHash here, at least not without additional safety nets in the form of many debug assertions that validate our assumptions.
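
A minimal sketch of the suggested guard, assuming Mixxx's DEBUG_ASSERT macro and that all workers are registered before the scheduler starts:

    void EngineWorkerScheduler::workerReady(EngineWorker* pWorker) {
        // Only flip the flag of an already-registered worker; inserting
        // a new key here would modify the QHash from the wrong thread.
        DEBUG_ASSERT(m_workerStates.contains(pWorker));
        m_workerStates[pWorker] = true;
    }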

daschuer (Member, Author) commented Mar 4, 2018

Now it should be safer, hopefully.

EngineWorker* pWorker = NULL;
while (m_scheduleFIFO.read(&pWorker, 1) == 1) {
    if (pWorker) {
        m_mutex.lock();

Contributor

QMutexLocker provides all the functionality you need in a safe way.
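
For illustration, the reviewed loop above rewritten with QMutexLocker (a sketch, not the PR's final code):

    #include <QMutexLocker>

    EngineWorker* pWorker = NULL;
    while (m_scheduleFIFO.read(&pWorker, 1) == 1) {
        if (pWorker) {
            // Locks m_mutex now and unlocks it automatically when the
            // locker leaves scope, even on early return or exception.
            QMutexLocker locker(&m_mutex);
            // ... work guarded by m_mutex ...
        }
    }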

uklotzde (Contributor)

The current implementation was lock-free when adding workers, and workers had to be re-added after being scheduled. With your changes the worker registration is no longer lock-free, and workers are scheduled forever once they have been registered. Can you explain briefly why the new version is more appropriate than the previous one and why it will not cause any new issues?

I'm not happy with the current design with all those delegations and cyclic dependencies. But we don't need to fix this now.

@@ -16,12 +16,17 @@ void EngineWorker::run() {

void EngineWorker::setScheduler(EngineWorkerScheduler* pScheduler) {
    m_pScheduler = pScheduler;

Contributor

We should at least check by a debug assertion that the worker currently doesn't have any scheduler when this function is called. Otherwise it would be scheduled twice.
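
A sketch of the suggested assertion, again assuming Mixxx's DEBUG_ASSERT macro:

    void EngineWorker::setScheduler(EngineWorkerScheduler* pScheduler) {
        // A worker must not be registered with a scheduler twice,
        // otherwise it would be scheduled twice.
        DEBUG_ASSERT(m_pScheduler == nullptr);
        m_pScheduler = pScheduler;
    }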

if (pWorker) {
    m_mutex.lock();
for (const auto& pWorker : m_workers) {
    if (pWorker->isReady()) {

Contributor

The worker should decide internally if it should actually wake up when triggered. With this modification std::atomic_flag looks like a perfect choice for m_ready.

daschuer (Member, Author)

> The current implementation was lock-free when adding workers, and workers had to be re-added after being scheduled.

The workers are only added during initialization and when a skin with more samplers is loaded,
so the lock will not affect us during a normal run. The lock is also not used by the engine thread.

> With your changes the worker registration is no longer lock-free, and workers are scheduled forever once they have been registered.

No, they are sleeping on m_sema until they become ready. We have an extra check of the m_notReady flag, though.

> Can you explain briefly why the new version is more appropriate than the previous one and why it will not cause any new issues?

The FIFO can overflow under overload conditions; the flag-based solution cannot overflow. As usual I cannot guarantee that it will not suffer from any new issues, but at least I see none.
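
A self-contained sketch of why the flag-based design cannot overflow: marking a worker ready is idempotent, so any number of requests collapses into one pending state, whereas a bounded FIFO starts dropping wake-ups once it is full.

    #include <atomic>
    #include <cassert>

    int main() {
        std::atomic_flag notReady = ATOMIC_FLAG_INIT;
        notReady.test_and_set();  // the worker starts out "not ready"

        // Any number of "ready" signals under overload...
        for (int i = 0; i < 1000; ++i) {
            notReady.clear();  // idempotent: no queue, nothing to overflow
        }

        // ...is consumed as exactly one wake-up.
        assert(!notReady.test_and_set());  // was ready: wake the worker
        assert(notReady.test_and_set());   // re-armed: no spurious wake
        return 0;
    }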

> I'm not happy with the current design with all those delegations and cyclic dependencies. But we don't need to fix this now.

If we consider these changes too risky, I can put them into a new PR targeted at master instead.
The original bug happens only under rare conditions anyway. That way we could also do some more refactoring.

daschuer changed the title from 'Fix "Loading Track .." forever / Moving waveforms' to 'Fix "Loading Track .." forever' on Mar 13, 2018
daschuer (Member, Author)

I have rebased it on 2.1 and moved the unrelated commits to #1547.

daschuer (Member, Author)

Notes addressed.

}

void EngineWorker::wakeIfReady() {
    if (!m_notReady.test_and_set()) {

Contributor

Damn. I assumed that clear() returns the previous state as a result.
Would it be easier to read if we renamed m_notReady to m_idle? On the other hand, this would introduce another term besides "ready", so let's leave it as it is.

Member Author

I had the same thought; in the end, notReady is the least painful name.
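
For reference, a minimal demonstration of the std::atomic_flag semantics discussed here: test_and_set() returns the previous state, while clear() returns void, which is why the inverted name m_notReady falls out naturally.

    #include <atomic>
    #include <cassert>

    int main() {
        std::atomic_flag flag = ATOMIC_FLAG_INIT;  // starts cleared
        assert(!flag.test_and_set());  // returns the previous state: false
        assert(flag.test_and_set());   // returns the previous state: true
        flag.clear();                  // returns void, no previous state
        assert(!flag.test_and_set());
        return 0;
    }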

while (m_scheduleFIFO.read(&pWorker, 1) == 1) {
    if (pWorker) {
        pWorker->wake();
{

Contributor

You don't need to introduce separate scopes here, just use unlock() and relock().

Member Author

Do you mean:

    while (!m_bQuit) {
        Event::start("EngineWorkerScheduler");
        QMutexLocker lock(&m_mutex);
        for(const auto& pWorker: m_workers) {
            pWorker->wakeIfReady();
        }
        lock.unlock();
        Event::end("EngineWorkerScheduler");
        lock.relock();
        if (!m_bQuit) {
            // Wait for next runWorkers() call
            m_waitCondition.wait(&m_mutex); // unlock mutex and wait
        }
    }

I can do this, but it looks less nice to me.

Contributor

I'm fine with both.

Member Author

Then I'd like to stick with the current version.
Anything else to do?

uklotzde (Contributor)

LGTM

uklotzde merged commit bd456c2 into mixxxdj:2.1 on Mar 16, 2018
daschuer deleted the lp1740448 branch on September 26, 2021