Fix "Loading Track .." forever #1446

daschuer · 2017-12-29T02:03:45Z

This fixes
https://bugs.launchpad.net/mixxx/+bug/1740448

Both rare bugs can be simulated by stopping the audio callback.
Can this be part of the hard to catch race condition in the analyzer?

uklotzde · 2018-01-07T19:20:46Z

I remember that I might have discovered both issues in the past. Do you have some simple instructions how to reproduce both reliably for verification? Maybe even by injecting a line of code to trigger the issues? I'm too lazy to figure it out myself ;)

daschuer · 2018-01-07T21:04:26Z

On Linux ALSA you can user 2.67 ms buffer size and provoke a buffer underrun, this makes the audio callback not return for a long time (at least on my system). This is a pending bug.
If you do not suffer this bug, you can add a
if (Experiment::isEnabled()) { for(;;); }
somewhere into the engine thead and enable it by the developer menu.

uklotzde · 2018-01-10T21:35:36Z

src/engine/engineworkerscheduler.cpp

+        while (i != m_workerStates.end()) {
+            if (i.key() && i.value()) {
+                i.key()->wake();
+                i.value() = false;


This is a typical race condition: The worker thread might re-schedule itself after woken up before the scheduler thread resets the flag to false. Even if the probability might be low, those are the critical spots that distinguish correct from wrong multi-threaded code!

If you can ensure that the worker is not running during this loop then switching both lines would be sufficient. Without the "not running" assumption you need to use an atomic flag and clear() instead of a simple boolean. This is what I would prefer instead of relying on some implicit assumptions that may change at any time without notice.

uklotzde · 2018-01-10T21:39:07Z

src/engine/engineworkerscheduler.h

@@ -30,7 +31,7 @@ class EngineWorkerScheduler : public QThread {
    // runWorkers was run. This should only be touched from the engine callback.
    bool m_bWakeScheduler;

-    FIFO<EngineWorker*> m_scheduleFIFO;
+    QHash<EngineWorker*, bool> m_workerStates;


How do we ensure that QHash is never modified from another thread while being read/iterated from the scheduler thread? This should be documented in every detail, because we are in multi-threaded land here! I don't recognize it at first sight.

uklotzde · 2018-01-10T21:47:12Z

src/engine/engineworkerscheduler.cpp

@@ -27,11 +24,17 @@ void EngineWorkerScheduler::workerReady(EngineWorker* pWorker) {
        // If the write fails, we really can't do much since we should not block
        // in this slot. Write the address of the variable pWorker, since it is
        // a 1-element array.
-        m_scheduleFIFO.write(&pWorker, 1);
+        m_workerStates[pWorker] = true;


This needs at least a debug assertion that the worker has been registered before to avoid modifying QHash, But as already explained I'm not happy with the decision to use QHash here, at least not without additional safety lanes in form of many debug assertions that validate our assumptions.

daschuer · 2018-03-04T15:55:20Z

Now it should be more save, hopefully.

uklotzde · 2018-03-12T15:35:08Z

src/engine/engineworkerscheduler.cpp

-        EngineWorker* pWorker = NULL;
-        while (m_scheduleFIFO.read(&pWorker, 1) == 1) {
-            if (pWorker) {
+        m_mutex.lock();


QMutexLocker provides all the functionality you need in a safe way.

uklotzde · 2018-03-12T15:52:01Z

The current implementation was lock free when adding workers and workers had to be re-added after scheduled. With your changes the worker registration is not lock-free any more and workers are scheduled forever once they have been registered. Can you explain shortly why the new version is more appropriate than the previous one and why it will not new cause any new issues?

I'm not happy with the current design with all those delegations and cyclic dependencies. But we don't need to fix this now.

uklotzde · 2018-03-12T15:53:41Z

src/engine/engineworker.cpp

@@ -16,12 +16,17 @@ void EngineWorker::run() {

 void EngineWorker::setScheduler(EngineWorkerScheduler* pScheduler) {
    m_pScheduler = pScheduler;


We should at least check by a debug assertion that the worker currently doesn't have any scheduler when this function is called. Otherwise it would be scheduled twice.

uklotzde · 2018-03-12T15:56:54Z

src/engine/engineworkerscheduler.cpp

-            if (pWorker) {
+        m_mutex.lock();
+        for(const auto& pWorker: m_workers) {
+            if (pWorker->isReady()) {


The worker should decide internally if it should actually wake up when triggered. With this modification std::atomic_flag looks like a perfect choice for m_ready.

daschuer · 2018-03-12T23:56:26Z

The current implementation was lock free when adding workers and workers had to be re-added after scheduled.

The workers are only added during initialization and if a skin is loaded with more samplers.
So the lock will not effect us during a normal run. The lock is also not used by the engine thread.

With your changes the worker registration is not lock-free any more and workers are scheduled forever once they have been registered.

No, they are sleeping on m_sema unit they become ready. We have an extra check of the m_notReady flag though.

Can you explain shortly why the new version is more appropriate than the previous one and why it will not new cause any new issues?

The Fifo suffers overflows when in overload conditions. The current solution cannot overflow. As usual I cannot guarantee that it will not suffer any new issues. At least I see none.

I'm not happy with the current design with all those delegations and cyclic dependencies. But we don't need to fix this now.

If we consider these changes as too risky, I can put them into a new PR targeted to master instead.
The original bug happens only under rare conditions anyway. This way we may also do som more refactoring.

…ing Bug #1740448

daschuer · 2018-03-13T23:17:43Z

I have rebased in on 2.1 and moved the unrelated commits to #1547

daschuer · 2018-03-14T21:58:42Z

Notes addressed.

uklotzde · 2018-03-15T18:38:03Z

src/engine/engineworker.cpp

+}
+
+void EngineWorker::wakeIfReady() {
+    if (!m_notReady.test_and_set()) {


Damn. I assumed that clear() returns the previous state as a result.
Does it get easier to read if we rename m_notReady to m_idle? On the other hand this would introduce another term around "ready", so let's better leave it as it is.

I had the same thought, and notReady is finally the least hurting name.

uklotzde · 2018-03-15T18:43:00Z

src/engine/engineworkerscheduler.cpp

-        while (m_scheduleFIFO.read(&pWorker, 1) == 1) {
-            if (pWorker) {
-                pWorker->wake();
+        {


You don't need to introduce separate scopes here, just use unlock() and relock().

do you mean:

while (!m_bQuit) { Event::start("EngineWorkerScheduler"); QMutexLocker lock(&m_mutex); for(const auto& pWorker: m_workers) { pWorker->wakeIfReady(); } lock.unlock(); Event::end("EngineWorkerScheduler"); lock.relock(); if (!m_bQuit) { // Wait for next runWorkers() call m_waitCondition.wait(&m_mutex); // unlock mutex and wait } }

I can do this, but it looks less nice for me.

I'm fine with both.

Than I like to s tick with the current version.
Anything else to do?

uklotzde · 2018-03-16T06:03:41Z

LGTM

Be-ing added this to the 2.1.0 milestone Dec 29, 2017

uklotzde reviewed Jan 10, 2018

View reviewed changes

uklotzde reviewed Mar 12, 2018

View reviewed changes

daschuer mentioned this pull request Mar 13, 2018

Stop Waveforms: Lp1740403 #1547

Merged

daschuer added 3 commits March 13, 2018 23:37

Replace fifo by QHash to fix overlow and loss of wake up signals. Fix…

0d9f1e5

…ing Bug #1740448

Make worker management thread save in any case

4cd7d1b

Use QMutexLocker

99b747d

daschuer force-pushed the lp1740448 branch from 271036d to 99b747d Compare March 13, 2018 23:13

daschuer changed the title ~~Fix "Loading Track .." forever / Moving waveforms~~ Fix "Loading Track .." forever Mar 13, 2018

use std::atomic_flag as ready flag

343fb1f

uklotzde reviewed Mar 15, 2018

View reviewed changes

uklotzde merged commit bd456c2 into mixxxdj:2.1 Mar 16, 2018

daschuer deleted the lp1740448 branch September 26, 2021 17:40

This was referenced Aug 23, 2022

Waveform continues in case off missing callbacks #9024

Closed

"Loading Track .." forever #9026

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix "Loading Track .." forever #1446

Fix "Loading Track .." forever #1446

daschuer commented Dec 29, 2017 •

edited

Loading

uklotzde commented Jan 7, 2018

daschuer commented Jan 7, 2018

uklotzde Jan 10, 2018

uklotzde Jan 10, 2018

uklotzde Jan 10, 2018

uklotzde Jan 10, 2018

daschuer commented Mar 4, 2018

uklotzde Mar 12, 2018

uklotzde commented Mar 12, 2018

uklotzde Mar 12, 2018

uklotzde Mar 12, 2018

daschuer commented Mar 12, 2018

daschuer commented Mar 13, 2018

daschuer commented Mar 14, 2018

uklotzde Mar 15, 2018

daschuer Mar 15, 2018

uklotzde Mar 15, 2018

daschuer Mar 15, 2018

uklotzde Mar 15, 2018

daschuer Mar 15, 2018

uklotzde commented Mar 16, 2018

		@@ -16,12 +16,17 @@ void EngineWorker::run() {

		void EngineWorker::setScheduler(EngineWorkerScheduler* pScheduler) {
		m_pScheduler = pScheduler;

Fix "Loading Track .." forever #1446

Fix "Loading Track .." forever #1446

Conversation

daschuer commented Dec 29, 2017 • edited Loading

uklotzde commented Jan 7, 2018

daschuer commented Jan 7, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

daschuer commented Mar 4, 2018

Choose a reason for hiding this comment

uklotzde commented Mar 12, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

daschuer commented Mar 12, 2018

daschuer commented Mar 13, 2018

daschuer commented Mar 14, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

uklotzde commented Mar 16, 2018

daschuer commented Dec 29, 2017 •

edited

Loading