Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows&libusb: hid_write Issue: hid_write/WaitForSingleObject: (0x000003E5) Overlapped I/O operation is in progress. (Regression from signal11 version, 1 sec write timeout issue!) #686

Open
fricpa opened this issue Aug 6, 2024 · 11 comments · Fixed by #700
Labels
libusb Related to libusb backend Windows Related to Windows backend

Comments

@fricpa
Copy link

fricpa commented Aug 6, 2024

Hi. I have some USB HID devices here (with some old version of a proprietary firmware...) that work fine with signal11's hidapi (ca. 0.7.0/0.8.0), but create troubles with 0.14.0.

Interestingly, I only see the issues on Windows and on Linux with the -libusb backend - hidraw seems to work fine.
For the libusb backend I don't see the precise error (because #684).

The Windows backend says: hid_write/WaitForSingleObject: (0x000003E5) Overlapped I/O operation is in progress.

I think the culprit is the new rather strange (to me...) implementation of hid_write in Windows. Here's the diff of hid_write in my context
image
in my context, we build this into a JNI library for Java or call directly into into the hidapi .so with JNA and there is also usb4java (a libusb wrapper) involved, however I am quite convinced that these complications are not the culprit.

To give a bit more background, I see it failing like this:

2024-08-06T18:22:31,218 TRACE [main] org.hid4java.jna.HidApi - > [40]: 3f 3e 77 01 6d 6f 64 65 00 53 65 6e 73 6f 72 20 66 61 69 6c 65 64 00 53 68 75 74 64 6f 77 6e 00 4c 6f 77 20 62 61 74 74 65 72 79 3a 20 53 4c 45 45 50 00 52 65 63 68 61 72 67 65 20 73 6f 6f 6e
2024-08-06T18:22:31,220 TRACE [main] org.hid4java.jna.HidApi - > [40]: 3f 3e 00 41 74 74 72 69 62 75 74 69 6f 6e 00 66 61 69 6c 65 64 2c 00 73 74 61 72 74 20 61 67 61 69 6e 00 44 65 61 74 74 72 69 62 75 74 69 6f 6e 20 72 65 71 75 69 72 65 64 00 00 00 00 00 00 00
2024-08-06T18:22:31,221 TRACE [main] org.hid4java.jna.HidApi - > [40]: 3f 3e 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
2024-08-06T18:22:32,237 ERROR [main] ***MainTest - inner error, devicePath=UsbDevicePath(vendorId=0x2047, productId=0x0301, serialNumber='703C936F1F000000'), lastKnownStatus=31119h,630636,0,0,0,0,0
 java.lang.RuntimeException: org.hid4java.jna.HidApiException: hid_write failed hid_error()='hid_write/WaitForSingleObject: (0x000003E5) Overlapped I/O operation is in progress.'

our device is receiving a 512 byte block of data split into multiple 64 byte hid reports and apparently this firmware of the device performs some slow logic on that third message of all 0 there (since in this context, this message marks the end of the useful data of the block).

On a fairly low level of the USB stack then, probably, the device takes a long time to respond, maybe longer than 1000 ms (the device has a ti msp430 usb with the manufacturer's usb & hid stack if that helps as an info).

I don't understand why it was decided that 1000 was a good timeout to pick here or why overlapped io is used in the first place (even by the original implementation), given the api is intended to be synchronous.

I vote to revert to the signal11 implementation or remove overlapped io entirely, any reason not to? I will make my own branch replacing that code with signal11's version (or revert to 0.8.0) and keep testing, but at least for my context the old version reproducibly worked better while this version is BROKEN.

@fricpa
Copy link
Author

fricpa commented Aug 6, 2024

to expand a bit on the discussion for writing to devices/files in general: The API documentation/specification of hid_write should be changed to say that all the function return value indicates is that the operating system or some level of buffering accepted the amount of bytes that were written there. The implementations on any OS (e.g. hidraw), i.e. the different backends, do not guarantee that when this function returns successfully that the data has actually made it to the device in any meaningful way.

Ensugin that is always the job of an overlaid protocol where the device needs to answer back something in response...

@fricpa
Copy link
Author

fricpa commented Aug 6, 2024

Other hid-api's have tried to make this caveat of writing/sending data more clear in their documentation, for instance:
https://github.com/nyholku/purejavahidapi/blob/f769fcddf62503cff554e646587c92350ca664e5/src/purejavahidapi/HidDevice.java#L98

  	 * The method returning is no guarantee that the data has been physically
	 * transmitted from the host to the device.
	 * <p>
	 * The method returns the actual number of bytes successfully scheduled to
	 * be sent to the device.

@fricpa
Copy link
Author

fricpa commented Aug 6, 2024

We should revise

res = WaitForSingleObject(dev->write_ol.hEvent, 1000);
and signal11/hidapi#88 and https://github.com/signal11/hidapi/blob/windows_write_timeout/windows/hid.c#L602

The decision to add a 1 sec timeout seems completely arbitrary.

If necessary, maybe hid_write could provide a hid_write_timeout and if need to make sure not to have to wait forever you need to give an explicit timeout as with hid_read_timeout - that would also increase the symmetry of the api and restore the synchronous/blocking nature/intention.

@fricpa fricpa changed the title Windows: hid_write Issue: hid_write/WaitForSingleObject: (0x000003E5) Overlapped I/O operation is in progress. (Regression from signal11 version!) Windows: hid_write Issue: hid_write/WaitForSingleObject: (0x000003E5) Overlapped I/O operation is in progress. (Regression from signal11 version, 1 sec write timeout issue!) Aug 6, 2024
@fricpa fricpa changed the title Windows: hid_write Issue: hid_write/WaitForSingleObject: (0x000003E5) Overlapped I/O operation is in progress. (Regression from signal11 version, 1 sec write timeout issue!) Windows&libusb: hid_write Issue: hid_write/WaitForSingleObject: (0x000003E5) Overlapped I/O operation is in progress. (Regression from signal11 version, 1 sec write timeout issue!) Aug 6, 2024
@fricpa
Copy link
Author

fricpa commented Aug 6, 2024

As mentioned in the OP, libusb has the same issue with an arbitrary 1 sec timeout:

hidapi/libusb/hid.c

Lines 1431 to 1435 in d101e5c

res = libusb_interrupt_transfer(dev->device_handle,
dev->output_endpoint,
(unsigned char*)data,
length,
&actual_length, 1000);

hidraw has no timeout

bytes_written = write(dev->device_handle, data, length);

@Youw
Copy link
Member

Youw commented Aug 8, 2024

How long does it take to perform the write on platforms where it doesn't fail? (Or with 0.8.0 implementation?)

If it takes longer than 1sec, it means to properly fix this we need to increase the timeout. (See also: #493).

We should revise ... and signal11/hidapi#88 and https://github.com/signal11/hidapi/blob/windows_write_timeout/windows/hid.c#L602

If you look at signal11/hidapi@master...windows_write_timeout and current master of libusb/hidapi - you should find that current master pretty much implements what windows_write_timeout was suggesting back in the day.

I believe the issue is only with the timeout:

  • need to make it configurable;
  • no I don't think we should change the default now, as it would break backward compatibility - someone may already depend on it;
  • yes, we should change the default when we release v1.0;

hidraw has no timeout

It has no configurable timeout. I've checked the kernel implementation - it uses 5sec as a hard-coded timeout.

@fricpa
Copy link
Author

fricpa commented Aug 8, 2024

Thanks for your insights @Youw . Do you have the link to the line in the kernel? Interesting that there would be any timeout at all, my impression was always that pipes (and sometimes even network io devices and files in general) always have infinite read/write timeout by default...

What do we prefer, increasing the timeout to 5s? (maybe making someone's project slower that relies on the timeout?), or adding hid_write_timeout?

I would vote for adding hid_write_timeout to the api and implementing it for all platforms...

@Youw
Copy link
Member

Youw commented Aug 9, 2024

Do you have the link to the line in the kernel?

#493 (comment)

I would vote for adding hid_write_timeout to the api and implementing it for all platforms...

Again, not for all platforms possible (not for hidraw AFAIK).

What do we prefer

Make it configurable.


I suggest you to try to increase a timeout in your own fork, at least to test/confirm my suggestions.

@mcuee mcuee added Windows Related to Windows backend libusb Related to libusb backend labels Aug 9, 2024
@fricpa
Copy link
Author

fricpa commented Aug 9, 2024

I can confirm that a timeout of 5000 ms = 5 sec works well in my context. First I tried to revert to the signal11 version with infinite timeout on Windows, but that literally made the program hang forever in the case of a bad device.

So my final vote is to change the timeout to 5 sec everywhere as a pragmatic solution.

Youw added a commit that referenced this issue Oct 2, 2024
Add API function to control hid_write timeout on WinAPI backend.

Resolves: #686
@Masterxilo
Copy link

The libusb backend has the same issue, can we add a solution for that too? Couldn't we add a set write timeout which is documented as a "best effort" call and no operation for other platforms?

@Youw
Copy link
Member

Youw commented Oct 3, 2024

The libusb backend has the same issue, can we add a solution for that too?

That is my plan, yes.

documented as a "best effort" call and no operation for other platforms?

For libusb it will change timeouts not only for the write operation, again to match the behaviod or linux kernel (hidraw).
So it not only not possible to implement for all platforms, but also has different behavior for those platforms which (kind of) support it.
That is not the "best efforst", but one of the worst, from design perspective.

@Youw Youw closed this as completed in #700 Oct 3, 2024
@Youw Youw closed this as completed in c15392a Oct 3, 2024
@Youw
Copy link
Member

Youw commented Oct 3, 2024

To be resolved when libusb part is done too

@Youw Youw reopened this Oct 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libusb Related to libusb backend Windows Related to Windows backend
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants