CAN-USB device stop working after 10 xmit fails #187

Open
Kira-sempai opened this issue Jan 13, 2025 · 17 comments

@Kira-sempai

Hi

I have a UCAN adapter with candleLight firmware. After some time it stops working: it neither reads nor writes CAN messages, and the LED stays on constantly. I use the device from a C++ app on Linux with the socket API. Unplugging the USB device solves the problem, but I want to know the proper way to handle this without reinserting the device. What commands should I send to the device to make it work again?

dmesg -T | grep -i can
[Thu Dec  5 20:38:45 2024] can: controller area network core
[Thu Dec  5 20:38:45 2024] can: raw protocol
[Thu Dec  5 20:38:52 2024] usb 1-1: Product: canable gs_usb
[Thu Dec  5 20:38:52 2024] usb 1-1: Manufacturer: canable.io
[Thu Dec  5 20:38:54 2024] CAN device driver interface
[Thu Dec  5 20:38:55 2024] IPv6: ADDRCONF(NETDEV_CHANGE): can0: link becomes ready
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 0
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 1
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 2
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 3
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 4
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 5
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 6
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 7
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 8
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 9
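
A burst like the one above can be detected automatically by counting the failure lines in the kernel log. This is a minimal sketch that assumes the unpatched message format shown in this log (`<iface>: usb xmit fail <n>`); `count_xmit_fails` is a hypothetical helper name, and it reads log text from standard input so it can be fed from `dmesg`.

```shell
# Count "usb xmit fail" lines for one interface in kernel log text on stdin.
# The message format is taken from the dmesg output above; the helper name is
# made up for this sketch.
count_xmit_fails() {
    # grep -c prints the number of matching lines (0 if there are none);
    # "|| true" hides grep's non-zero exit status in the no-match case.
    grep -c -F -- " $1: usb xmit fail" || true
}
```

Possible usage: `dmesg | count_xmit_fails can0`.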
@marckleinebudde
Collaborator

marckleinebudde commented Jan 16, 2025

That doesn't sound good. My gut feeling says the UCAN has a serious problem. Can you add the following patch to the kernel and reproduce the problem? Then we should have some more debug output.

diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index bc86e9b329fd..46d6fa5c630e 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -752,7 +752,8 @@ static void gs_usb_xmit_callback(struct urb *urb)
         struct net_device *netdev = dev->netdev;
 
         if (urb->status)
-                netdev_info(netdev, "usb xmit fail %u\n", txc->echo_id);
+                netdev_info(netdev, "error %pe: usb xmit fail %u\n",
+                            ERR_PTR(urb->status), txc->echo_id);
 }
 
 static netdev_tx_t gs_can_start_xmit(struct sk_buff *skb,

You can try to recover by bringing the interface down and up again:

ip link set dev can0 down
ip link set dev can0 up

This sends USB control messages, maybe these are still properly handled. Otherwise you can try to disable/enable the USB port with something like this (edit: fixed access of disable variable):

disable=$(realpath /sys/class/net/can0/device/../port/disable)
echo 1 > ${disable}
echo 0 > ${disable}
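
A slightly more defensive variant of the snippet above, as a sketch only: it checks that the `disable` attribute actually exists before writing, since not every hub or host controller exposes it. The function name `toggle_usb_port`, the `SYSFS_NET` override and the one-second pause are assumptions made for illustration; it has to run as root.

```shell
# Disable and re-enable the USB port behind a network interface via sysfs.
# Must run as root. SYSFS_NET is a hypothetical override used only to point
# the function at a fake tree for testing; the default is /sys/class/net.
toggle_usb_port() {
    local iface=$1
    local sysfs_net=${SYSFS_NET:-/sys/class/net}
    local disable
    disable=$(realpath "$sysfs_net/$iface/device/../port/disable" 2>/dev/null) || disable=""
    # Not every hub or host controller exposes this attribute.
    if [ -z "$disable" ] || [ ! -e "$disable" ]; then
        echo "$iface: no port/disable attribute (not supported by this hub?)" >&2
        return 1
    fi
    echo 1 > "$disable" || return 1   # disable the port
    sleep 1                           # arbitrary settle time (an assumption)
    echo 0 > "$disable"               # re-enable the port
}
```

Possible usage, as root: `toggle_usb_port can0`.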

@fenugrec
Collaborator

Also post the version information of your firmware build (somewhere in the lsusb -v output).

@Kira-sempai
Author

Thank you for the reply!

ip link set dev can0 down

brings can0 down, but it can't be brought up again after that:

# ip link set dev can0 up
RTNETLINK answers: Protocol error
# echo $?
2

ifconfig before and after "down":

can0: flags=193<UP,RUNNING,NOARP>  mtu 16
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 10  (UNSPEC)
        RX packets 91396933  bytes 339440296 (323.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 4660208  bytes 18764901 (17.8 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
can0: flags=128<NOARP>  mtu 16
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 10  (UNSPEC)
        RX packets 91396933  bytes 339440296 (323.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 4660208  bytes 18764901 (17.8 MiB)
        TX errors 0  dropped 10 overruns 0  carrier 10  collisions 0

Hope it will help.

Will fix gs_usb.c print, but it will take some time to reproduce the problem.

lsusb -v output:

Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  idVendor           0x1d50 OpenMoko, Inc.
  idProduct          0x606f Geschwister Schneider CAN adapter
  bcdDevice            0.00
  iManufacturer           1 canable.io
  iProduct                2 canable gs_usb
  iSerial                 3 0031001F5943571020343932
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0032
    bNumInterfaces          2
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              150mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass    255 Vendor Specific Subclass
      bInterfaceProtocol    255 Vendor Specific Protocol
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0020  1x 32 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0020  1x 32 bytes
        bInterval               0
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        1
      bAlternateSetting       0
      bNumEndpoints           0
      bInterfaceClass       254 Application Specific Interface
      bInterfaceSubClass      1 Device Firmware Update
      bInterfaceProtocol      1
      iInterface            224 (error)
      Device Firmware Upgrade Interface Descriptor:
        bLength                             9
        bDescriptorType                    33
        bmAttributes                       11
          Will Detach
          Manifestation Intolerant
          Upload Supported
          Download Supported
        wDetachTimeout                    255 milliseconds
        wTransferSize                    2048 bytes
        bcdDFUVersion                   1.1a
can't get device qualifier: Resource temporarily unavailable
can't get debug descriptor: Resource temporarily unavailable
cannot read device status, Resource temporarily unavailable (11)

Can't find port/disable in path to try reenable UCAN:

/sys/class/net/can0/device# ls -l
total 0
-rw-r--r-- 1 root root 4096 Jan 16 17:27 authorized
-r--r--r-- 1 root root 4096 Jan 16 17:27 bAlternateSetting
-r--r--r-- 1 root root 4096 Jan 16 17:27 bInterfaceClass
-r--r--r-- 1 root root 4096 Jan 16 17:27 bInterfaceNumber
-r--r--r-- 1 root root 4096 Jan 16 17:27 bInterfaceProtocol
-r--r--r-- 1 root root 4096 Jan 16 17:27 bInterfaceSubClass
-r--r--r-- 1 root root 4096 Jan 16 17:27 bNumEndpoints
lrwxrwxrwx 1 root root    0 Jan 16 17:27 driver -> ../../../../../../../../../bus/usb/drivers/gs_usb
drwxr-xr-x 3 root root    0 Jan 16 17:27 ep_02
drwxr-xr-x 3 root root    0 Jan 16 17:27 ep_81
-r--r--r-- 1 root root 4096 Jan 16 17:27 modalias
drwxr-xr-x 3 root root    0 Jan 16 17:27 net
drwxr-xr-x 2 root root    0 Jan 16 17:27 power
lrwxrwxrwx 1 root root    0 Jan 16 17:27 subsystem -> ../../../../../../../../../bus/usb
-r--r--r-- 1 root root 4096 Jan 16 17:27 supports_autosuspend
-rw-r--r-- 1 root root 4096 Jan 16 17:27 uevent

Where should I look next?

@Kira-sempai
Author

Yes, the UCAN firmware is based on this commit:
4ae1a7c

  • I added a watchdog to the STM32 because I thought the problem might be in some while loops. But it isn't; it looks like the STM32 is still running.

@fenugrec
Collaborator

lsusb -v output:

Thanks.
I don't know where you got your firmware, but it was not compiled from this tree recently; since ~ 2021 the descriptors should have a fw version such as

...
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0032
    bNumInterfaces          2
    bConfigurationValue     1
    iConfiguration          4 fw_0801b39_2024-01-21

So there is no way for me to know whether you are:

  • running super old firmware?
  • running some modified firmware from another source?

You will need to try a current firmware build, from this repo, otherwise there's no point in trying to analyze the situation.

@Kira-sempai
Author

Hmm, that's strange. It could be that I flashed one UCAN adapter and plugged in the other one. I will check.

@Kira-sempai
Author

yeah, in my version.h file I see this:

#define GIT_HASH "fw_c877406_2024-11-08+"

I will check why it's not in iConfiguration.

@marckleinebudde
Collaborator

Will fix gs_usb.c print, but it will take some time to reproduce the problem.

I think this isn't needed anymore: if bringing the interface down and up again doesn't help, the patch will not provide any new information.

@marckleinebudde
Collaborator

marckleinebudde commented Jan 17, 2025

Can't find port/disable in path to try reenable UCAN:

/sys/class/net/can0/device# ls -l
total 0
-rw-r--r-- 1 root root 4096 Jan 16 17:27 authorized
-r--r--r-- 1 root root 4096 Jan 16 17:27 bAlternateSetting
-r--r--r-- 1 root root 4096 Jan 16 17:27 bInterfaceClass
-r--r--r-- 1 root root 4096 Jan 16 17:27 bInterfaceNumber
-r--r--r-- 1 root root 4096 Jan 16 17:27 bInterfaceProtocol
-r--r--r-- 1 root root 4096 Jan 16 17:27 bInterfaceSubClass
-r--r--r-- 1 root root 4096 Jan 16 17:27 bNumEndpoints
lrwxrwxrwx 1 root root    0 Jan 16 17:27 driver -> ../../../../../../../../../bus/usb/drivers/gs_usb
drwxr-xr-x 3 root root    0 Jan 16 17:27 ep_02
drwxr-xr-x 3 root root    0 Jan 16 17:27 ep_81
-r--r--r-- 1 root root 4096 Jan 16 17:27 modalias
drwxr-xr-x 3 root root    0 Jan 16 17:27 net
drwxr-xr-x 2 root root    0 Jan 16 17:27 power
lrwxrwxrwx 1 root root    0 Jan 16 17:27 subsystem -> ../../../../../../../../../bus/usb
-r--r--r-- 1 root root 4096 Jan 16 17:27 supports_autosuspend
-rw-r--r-- 1 root root 4096 Jan 16 17:27 uevent

Sorry, I was dazed and confused and used the wrong $() instead of ${}; fixed snippet here:

disable=$(realpath /sys/class/net/can0/device/../port/disable)
echo 1 > ${disable}
echo 0 > ${disable}

Or do it step by step:

$ realpath /sys/class/net/can0/device/../port/disable
/sys/devices/pci0000:00/0000:00:14.0/usb2/2-0:1.0/usb2-port2/disable
$ echo 1 > /sys/devices/pci0000:00/0000:00:14.0/usb2/2-0:1.0/usb2-port2/disable
$ echo 0 > /sys/devices/pci0000:00/0000:00:14.0/usb2/2-0:1.0/usb2-port2/disable

Does this work?

@Kira-sempai
Author

Hi!

disable doesn't work:

# sudo echo 1 > ${disable}
-bash: /sys/devices/platform/soc/2100000.bus/2184000.usb/ci_hdrc.0/usb1/1-0:1.0/usb1-port1/disable: Permission denied

I'm under root

/sys/devices/platform/soc/2100000.bus/2184000.usb/ci_hdrc.0/usb1/1-0:1.0/usb1-port1# ls -al
total 0
drwxr-xr-x 3 root root    0 Jan 17 08:57 .
drwxr-xr-x 5 root root    0 Jan 17 08:57 ..
-r--r--r-- 1 root root 4096 Jan 17 09:40 connect_type
lrwxrwxrwx 1 root root    0 Jan 17 09:35 device -> ../../1-1
-r--r--r-- 1 root root 4096 Jan 17 09:40 location
-r--r--r-- 1 root root 4096 Jan 17 09:40 over_current_count
drwxr-xr-x 2 root root    0 Jan 17 09:40 power
-rw-r--r-- 1 root root 4096 Jan 17 09:40 quirks
-rw-r--r-- 1 root root 4096 Jan 17 09:40 uevent

@marckleinebudde
Collaborator

marckleinebudde commented Jan 17, 2025

# sudo echo 1 > ${disable}
-bash: /sys/devices/platform/soc/2100000.bus/2184000.usb/ci_hdrc.0/usb1/1-0:1.0/usb1-port1/disable: Permission denied

I'm under root

The command on the left side of the > runs as root, but the redirection on the right side is done by your own shell, not by sudo. Try instead:

echo 1 | sudo tee ${disable}
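
The tee pattern can be wrapped in a small helper. This is a sketch only; `write_attr` and the `SUDO` variable are names invented here. Note also that the directory listing earlier in this thread shows no `disable` file in that port directory, and sysfs presumably does not let even root create new files, so the helper checks for the attribute first.

```shell
# Write a value to a (sysfs) attribute with elevated privileges.
# In "sudo echo 1 > file" the redirection is done by the calling shell, not
# by sudo; piping into "sudo tee" makes the privileged process open the file.
# SUDO defaults to "sudo"; set SUDO="" when already running as root.
write_attr() {
    local value=$1
    local path=$2
    if [ ! -e "$path" ]; then
        echo "$path: attribute does not exist" >&2
        return 1
    fi
    # The exit status of the pipeline is that of tee.
    printf '%s\n' "$value" | ${SUDO-sudo} tee "$path" > /dev/null
}
```

Possible usage: `write_attr 1 "${disable}"` followed by `write_attr 0 "${disable}"`.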

@Kira-sempai
Author

I accidentally broke USB. First I sent

# usbreset 1d50:606f
Resetting canable gs_usb ... ok

and the can0 interface disappeared from the ifconfig list, but the device was still in lsusb.
Next I sent

echo 0 | sudo tee /sys/bus/usb/devices/1-0:1.0/authorized
0
echo 1 | sudo tee /sys/bus/usb/devices/1-0:1.0/authorized
1

and the UCAN disappeared from lsusb too.

@marckleinebudde
Collaborator

marckleinebudde commented Jan 17, 2025

What's the dmesg output during all this?

Please use disable, not authorized. If there is no disable, your USB controller/hub doesn't support this feature. If your hub supports turning the power off, it should do so.

@Kira-sempai
Author

dmesg -T | grep -i usb
...
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 7
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 8
[Mon Jan  6 13:46:58 2025] gs_usb 1-1:1.0 can0: usb xmit fail 9
[Thu Jan 16 16:44:05 2025] gs_usb 1-1:1.0 can0: Couldn't shutdown device (err=-71)
[Thu Jan 16 16:44:11 2025] gs_usb 1-1:1.0 can0: Couldn't start device (err=-71)
[Thu Jan 16 16:44:46 2025] gs_usb 1-1:1.0 can0: Couldn't start device (err=-71)
[Thu Jan 16 16:45:48 2025] gs_usb 1-1:1.0 can0: Couldn't start device (err=-71)
[Fri Jan 17 08:56:45 2025] gs_usb 1-1:1.0 can0: Couldn't start device (err=-71)
[Fri Jan 17 10:07:09 2025] gs_usb 1-1:1.0 can0: Couldn't start device (err=-71)
[Fri Jan 17 10:09:13 2025] usb 1-1: can't set config #1, error -71
[Fri Jan 17 10:09:13 2025] usb 1-1: authorized to connect
[Fri Jan 17 10:26:08 2025] usb 1-1: reset full-speed USB device number 2 using ci_hdrc
[Fri Jan 17 11:32:16 2025] usb 1-1: reset full-speed USB device number 2 using ci_hdrc
[Fri Jan 17 11:41:14 2025] usb 1-1: USB disconnect, device number 2
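
The `err=-71` values in this log are negative Linux errno codes; 71 is `EPROTO`, the same "Protocol error" that `ip link` reported earlier in the thread. The following lookup is a sketch for the errno values that commonly show up as USB transfer status. The numbers are the generic Linux values (a few architectures number them differently), and `usb_errname` is a hypothetical helper, not part of any existing tool.

```shell
# Map a (negative) errno from gs_usb/usbcore log lines to its symbolic name.
# Values are the generic Linux errno numbers.
usb_errname() {
    case ${1#-} in
        2)   echo ENOENT ;;      # URB was unlinked synchronously
        19)  echo ENODEV ;;      # device was removed
        32)  echo EPIPE ;;       # endpoint stalled
        71)  echo EPROTO ;;      # bit-stuffing error, no response, or unknown USB error
        84)  echo EILSEQ ;;      # CRC mismatch or no response
        104) echo ECONNRESET ;;  # URB was unlinked asynchronously
        108) echo ESHUTDOWN ;;   # device or host controller was disabled
        110) echo ETIMEDOUT ;;   # synchronous transfer timed out
        *)   echo "errno ${1#-}" ;;
    esac
}
```

Possible usage: `usb_errname -71`.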

@marckleinebudde
Collaborator

You can try to use https://github.com/mvp/uhubctl. I think it does the same in the background.

I however didn't manage to completely power down the USB port on any of my hubs.

@Kira-sempai
Author

Hi!
Looks like uhubctl doesn't work with my hub either:

# uhubctl -a off -l 1 -p 1
Current status for hub 1 [1d6b:0002 Linux 5.10.35-wb172 ehci_hcd EHCI Host Controller ci_hdrc.0, USB 2.00, 1 ports, ppps]
  Port 1: 010b power oc enable connect [1d50:606f FYSETC UCAN USB to CAN adapter 004200275043571820353339]
Sent power off request
New status for hub 1 [1d6b:0002 Linux 5.10.35-wb172 ehci_hcd EHCI Host Controller ci_hdrc.0, USB 2.00, 1 ports, ppps]
  Port 1: 0003 enable connect [1d50:606f FYSETC UCAN USB to CAN adapter 004200275043571820353339]
# ifconfig
can0: flags=193<UP,RUNNING,NOARP>  mtu 16
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00  txqueuelen 10  (UNSPEC)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
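
The four hex digits that uhubctl prints after the port number are the hub port status word. Below is a minimal decoder sketch that handles only the well-known USB 2.0 port status bits and ignores the rest; `decode_port_status` is a hypothetical helper, and the flag order is simply descending bit order. Under that reading, 010b has the power, over-current, enable and connect bits set, while 0003 has only enable and connect: the power bit was cleared, yet the device still shows as connected.

```shell
# Decode a USB 2.0 hub port status word (hex, as printed by uhubctl).
# Only the common USB 2.0 bits are handled; any other bits are ignored.
decode_port_status() {
    local status=$((0x$1))
    local out="" pair mask name
    for pair in 0x0400:highspeed 0x0200:lowspeed 0x0100:power 0x0010:reset \
                0x0008:oc 0x0004:suspend 0x0002:enable 0x0001:connect; do
        mask=${pair%%:*}
        name=${pair#*:}
        if [ $((status & $mask)) -ne 0 ]; then
            out="${out:+$out }$name"   # append, space-separated
        fi
    done
    echo "$out"
}
```

Possible usage: `decode_port_status 010b`.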

Currently I can't access the CAN-USB adapter that got stuck. It looks like it has old firmware; maybe it's not a UCAN at all (probably a CANable). But I have seen the same problem with a UCAN and new firmware, so I will replace it and continue the search.

Testing another adapter: I checked lsusb -v with another of my UCAN adapters, and it shows:

iConfiguration          4 fw_c877406_2024-11-08+

Fun fact: if I run lsusb without sudo, it shows only

iConfiguration          4
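
To pull just the firmware version string out of the descriptors, a filter over the `lsusb -v` output is enough. This sketch assumes the version is reported in `iConfiguration`, as described earlier in this thread; `fw_version` is a made-up helper name.

```shell
# Print the string attached to iConfiguration in "lsusb -v" output on stdin.
# Prints nothing when only the index is shown (e.g. lsusb run without root).
fw_version() {
    awk '$1 == "iConfiguration" && NF >= 3 { print $3 }'
}
```

Possible usage: `sudo lsusb -v -d 1d50:606f | fw_version`.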

I spent some hours debugging the adapter to find out what's going on.

Next time I catch the problem, I will try

echo 1 | sudo tee ${disable}

Is there any way to send some magic command via USB to the CAN adapter so that its firmware could catch it and reboot? Or will the Linux machine not try to reconnect to it until the adapter is manually replugged?

@fenugrec
Collaborator

Fun fact: if I run lsusb without sudo, it shows only

Ah yes, I could have mentioned that, sorry. Depending on your udev rules etc sometimes those extra queries need root perms.

Is there any way to send some magic command via USB to the CAN adapter so that its firmware could catch it and reboot?

IIRC not really; see related #168 and #137.
