Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HID controller thread hangs randomly on Windows #7877

Closed
mixxxbot opened this issue Aug 22, 2022 · 12 comments
Closed

HID controller thread hangs randomly on Windows #7877

mixxxbot opened this issue Aug 22, 2022 · 12 comments

Comments

@mixxxbot
Copy link
Collaborator

Reported by: neogeo-dc
Date: 2015-03-01T17:09:18Z
Status: Triaged
Importance: Medium
Launchpad Issue: lp1426925
Tags: hid, windows
Attachments: patch.diff


Hi

This is a problem i have seen/suffered since some months, first with a Hercules Dj Control MP3 and now with a Gemini G2V (always on windows environment, w7 or greater, both x86 and x64), and it's a random loss of the controller (Mixxx works fine, but controller seems off, no input or output, and hangs at Mixxx exit waiting for the controller thread to stop)

To keep things simple, what i think i've found looking at the source and doing my own tests, is that you can't (with lastest hidapi and Mixxx 1.12 from two days ago) send and retrieve data at the same time from a HID controller (under windows, at least!). Removing the hid_write() calls the code doesn't hangs ever (or so it seems).

The call to hid_read_timeout(), at least in windows, seems to return almost immediatly, this generates also a lot of traffic for the mapping script (incomingData unnecessary calls).

So i have done some changes on HidReader::run() to work around this:

-The first thing is to make a copy of the incoming data, to see if the next call is any different. If not, don't emit incomingData signal.
-Next, put a simple mutex to control access to the HID device: on the HidReader, i modify a variable while reading, and before calling hid_write(), i look up if i can or not. If i cannot i simply "drop" the packet (output is not as "vital").

With those changes, i could see that the hid_write call doesn't get a lot of opportunities to run (almost none), because the hid_read_timeout() call returns inmediatly on a continous loop.

So, i put a little usleep(1) after hid_read_timeout().

With those changes, debug build and all, seems to go fine (i have still to test it more time, but have sense): low latency, responsive, 20-30% cpu while performing... good! :). And as it doesn't hang, Mixxx can quit just fine.

The controller thread is shared between midi and HID devices, that explains why when i was using both at the same time (nanokontrol with midi and hercules with hid), i will always lost both at the same time.

I don't know if it's an issue with hidapi on windows (i would think so if it's not a know problem), but with those changes seems workable from the outside.

I also don't know much about multithread programming, so probably my workaround it's not the best for this case ;)

PS1: I'm doing now a HID mapping for the G2V, and because it has 4 vumeters, generates more feedback/output than the old Hercules, so now the probabilty of locks is worst than ever. This bug happens equally on 1.11 and 1.12.
PS2: I have posted about this on the forums some months ago :  http://www.mixxx.org/forums/viewtopic.php?f=3&t=6461, but i think have better info now ;)
@mixxxbot
Copy link
Collaborator Author

Commented by: neogeo-dc
Date: 2015-03-02T00:29:18Z


Well, after those changes, it has been autodj'ing for almost 7 hours with no problems (i have stopped the test :) ).... more than 578k packets sent (i only send when data changes), and 817 dropped ones (so, 817 possibly chances to hang "saved"). I don't count the incoming ones, but should be almost two orders of magnitude greater.

@mixxxbot
Copy link
Collaborator Author

Commented by: neogeo-dc
Date: 2015-03-15T22:55:53Z


Well, after some days, i can confirm that's the case.

With that kludge i haven't had any problem with this again :)

@mixxxbot
Copy link
Collaborator Author

Commented by: neogeo-dc
Date: 2015-04-03T20:54:05Z


After all those days, the issue it's totally solved with my kludge code ;) , so i want to make an "official" source patch to be integrated into Mixxx (seriously, nobody uses HID under windows or what? :))

I have taken a look at hidapi tests (from outside Mixxx), and the issue it's the same: they just not block, ever, the reads (with or without controller movements). I don't have more HID devices right now, so i can't report if it's a "hardware thing", but reading about it gives me the impression that i'm not alone:

https://msdn.microsoft.com/en-us/library/windows/hardware/ff542426%28v=vs.85%29.aspx

"Using ReadFile

An application uses the open file handle it obtained by using CreateFile to open a file on the collection. When the application calls ReadFile, it does not have to specify overlapped I/O because the HID Client Drivers buffers reports in a ring buffer. However, an application can use overlapped I/O to have more than one outstanding read request."

Here it says clearly that ReadFile will return inmediatly ever. So, there is a need (always speaking about the possibility that this is a "windows only" thing) of a little delay on the controller thread, to avoid collapsing the CPU reading the same data over and over. Apart from that, there is the thing of emit only changes to javascript (an optimization), and controlling the access to the hid file to avoid simultaneous read/write (avoid lockups).

Another thing reading the Mixxx code, it's the call to ..

hid_set_nonblocking(m_pHidDevice, 1);   // Quit blocking

...on closing the hid device. That does absolutelly nothing on windows, as it's used only for hid_read, and hid_read it's not called throught the code (Mixxx uses hid_read_timeout directly instead). Always speaking about the hidapi windows code, don't know about the linux implementation (or mac or whatever)

Well, i'll try to look into this "git" thing and make a patch...(i'm not at all used to version controlling systems :(, too old for that)

@mixxxbot
Copy link
Collaborator Author

Commented by: esbrandt
Date: 2015-04-04T07:12:19Z


Hi shalty,
you might want to try http://www.sourcetreeapp.com/
It takes the scary out of git.

@mixxxbot
Copy link
Collaborator Author

Commented by: Pegasus-RPG
Date: 2015-04-04T15:06:21Z


Shalty: thank you for your detailed work! We have been in contact with the HIDAPI maintainer in the past and he's very friendly and receptive to patches. Since you said you suspect the root problem is in that library, I'd rather you work with him to resolve it, since it will help other users of HIDAPI as well (and I'm afraid your changes may have negative effects for Mixxx users on other platforms.) The maintainer's E-mail address is at the bottom of this page: http://www.signal11.us/oss/hidapi/ Feel free to CC mixxx-devel on your conversation.

@mixxxbot
Copy link
Collaborator Author

Commented by: Pegasus-RPG
Date: 2015-04-04T15:13:36Z


As for Mixxx using hid_read_timeout, that's because Mixxx would stall on shutdown waiting for some input from the controller with hid_read so it could continue and send any shutdown messages. Using hid_read_timeout avoids that problem, but has the side effect of using more CPU (and apparently triggering your problem.) Mr. Ott of HIDAPI had mentioned a possible alternative solution that I haven't had the chance to investigate. Perhaps it's time for me to do so. :)

@mixxxbot
Copy link
Collaborator Author

Commented by: neogeo-dc
Date: 2015-04-04T19:55:23Z
Attachments: patch.diff


I have sent a mail to Alan refering to this bug, i will update accordingly.

In the meantime, i have also cleaned the code and created (i hope so ;) a patch that i'm using now.

I have though about it, and the use of QTAtomics to avoid active locking (in fact i could let the reading have all the priority to the device and then only send packets if available.. the way it is on my mapping, that's not an issue), but as i'm not proficient with multithread programming and i don't know what others do on their mappings, i have taken the safest route (than i know).

@mixxxbot
Copy link
Collaborator Author

Commented by: joan-ardiaca
Date: 2015-09-22T22:43:21Z


I have a similar problem when using two controllers simultaneously (a Pioneer DDJ-SB using MIDI and an Hercules DJ Control MP3 LE over HID) on Linux using the latest Mixxx master. The DDJ-SB stops working randomly after some time, but only when both controllers are connected, and happens even if the controllers are not being used. But your patch seems to solve this issue, thanks a lot for that!

I can't judge if this patch is an appropriate solution to this problem, but it definitely improved my experience with Mixxx.

@mixxxbot
Copy link
Collaborator Author

Commented by: joan-ardiaca
Date: 2015-09-22T22:49:01Z


Damn, forget it, it just happened again :( I did not wait long enough.

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2018-09-20T17:43:32Z


Due to lack of progress, marking Triaged and clearing assignee. Feel free to revert if it is in fact still in progress :).

I think this might be fixed though? Shalty do you know?

@mixxxbot mixxxbot transferred this issue from another repository Aug 24, 2022
@JoergAtGithub
Copy link
Member

This is fixed in #4585

@ronso0 ronso0 closed this as completed Aug 24, 2022
@Swiftb0y
Copy link
Member

First issue closed on github 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants