Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eaton 9E Reconnect broke NUT plugin for Unraid by Fix #2671 #2681

Closed
masterwishx opened this issue Nov 16, 2024 · 15 comments · Fixed by #2682
Closed

Eaton 9E Reconnect broke NUT plugin for Unraid by Fix #2671 #2681

masterwishx opened this issue Nov 16, 2024 · 15 comments · Fixed by #2682

Comments

@masterwishx
Copy link
Contributor

masterwishx commented Nov 16, 2024

After compiled latest master with pr #2671 ,, it seems nut not working as usual and reconnect every time .
this fix should maybe also solve #2643 #2644 issues but seems not working as expected.

Eaton 9E200i UPS

@desertwitch also broked NUT Plugin for Unraid :

image

Nov 16 11:24:28 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:24:31 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:24:35 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:24:38 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:24:41 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:24:44 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:24:48 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:24:51 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:24:54 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:24:57 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:01 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:04 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:07 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:10 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:14 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:17 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:20 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:24 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:27 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:30 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:34 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:37 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:40 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:44 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:47 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:50 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
Nov 16 11:25:54 server usbhid-ups[8198]: [D4:8198]  device has been disconnected, try to reconnect
@masterwishx
Copy link
Contributor Author

bad nut 16.11.24.txt

@masterwishx masterwishx changed the title Reconnect broke NUT plugin for Unraid by Fix #2671 Eaton 9E Reconnect broke NUT plugin for Unraid by Fix #2671 Nov 16, 2024
@desertwitch
Copy link
Contributor

desertwitch commented Nov 16, 2024

Hmm, first thing that comes to mind is that the early data collection round(s) may still return 0 HID objects (maybe while the UPS is still initializing the connection, collecting data, ...). This would make it reconnect, only to again see 0 HID objects in the first data round(s), reconnect again - resulting in an infinite loop as we see above? Perhaps we should exclude this reconnection condition in the early stages of the connection, and e.g. only consider it when HID elements have been seen at least x times or in the last x previous round(s)?

@masterwishx
Copy link
Contributor Author

masterwishx commented Nov 16, 2024

Hmm, first thing that comes to mind is that the early data collection round(s) may still return 0 HID objects (maybe while the UPS is still initializing the connection, collecting data, ...). This would make it reconnect, only to again see 0 HID objects in the first data round(s), reconnect again - resulting in an infinite loop as we see above? Perhaps we should exclude this reconnection condition in the early stages of the connection, and e.g. only consider it when HID elements have been seen at least x times or in the last x previous round(s)?

Yes maybe or for now Maybe to make this fix for APC model only for what its fixed?

im not sure why i have 0 HID objects, but by function :

Error handler for usb_get/set_* functions.
 * 0 unknown or temporary failure (ignored)

upsdebugx(2, "%s: Connection timed out", desc);

https://github.com/networkupstools/nut/blob/0f6ae03925b139f64e8f45576dd9d5623cd1fd34/drivers/libusb1.c#L838C1-L841C4

Befor fix i had:

Oct 23 16:05:09 server usbhid-ups[32153]: [D1:32153] upsdrv_updateinfo...
Oct 23 16:05:10 server usbhid-ups[32153]: [D2:32153] nut_libusb_get_interrupt: Connection timed out
Oct 23 16:05:10 server usbhid-ups[32153]: [D1:32153] Got 0 HID objects...
Oct 23 16:05:10 server usbhid-ups[32153]: [D1:32153] Quick update...
Oct 23 16:05:10 server usbhid-ups[32153]: [D4:32153] Entering libusb_get_report
Oct 23 16:05:10 server usbhid-ups[32153]: [D3:32153] Report[get]: (3 bytes) => 26 05 02
Oct 23 16:05:10 server usbhid-ups[32153]: [D5:32153] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0

After:

Nov 16 11:05:49 server usbhid-ups[10186]: [D1:10186] upsdrv_updateinfo...
Nov 16 11:05:49 server usbhid-ups[10186]: [D1:10186] Got to reconnect!
Nov 16 11:05:49 server usbhid-ups[10186]: [D5:10186] send_to_all: SETINFO driver.state "reconnect.trying"
Nov 16 11:05:49 server usbhid-ups[10186]: [D6:10186] send_to_all: write 40 bytes to socket 16 succeeded (ret=40): SETINFO driver.state "reconnect.trying"
Nov 16 11:05:49 server usbhid-ups[10186]: [D4:10186] Closing comm_driver previous handle
Nov 16 11:05:49 server usbhid-ups[10186]: [D4:10186] ===================================================================
Nov 16 11:05:49 server usbhid-ups[10186]: [D4:10186]  device has been disconnected, try to reconnect
Nov 16 11:05:49 server usbhid-ups[10186]: [D4:10186] ===================================================================
Nov 16 11:05:49 server usbhid-ups[10186]: [D4:10186] Opening comm_driver ...
Nov 16 11:05:49 server usbhid-ups[10186]: [D2:10186] Checking device 1 of 11 (1D6B/0003)
Nov 16 11:05:49 server usbhid-ups[10186]: [D1:10186] nut_libusb_open: invalid libusb bus number 0

@masterwishx
Copy link
Contributor Author

Perhaps we should exclude this reconnection condition in the early stages of the connection, and e.g. only consider it when HID elements have been seen at least x times or in the last x previous round(s)?

Seems I have 0 hid object and timeout on like every update but then quick update, but after fix I have reconnect on every update...

@jimklimov
Copy link
Member

I see :\

Would it help to make this a toggle (with 0=>reconnect behavior off by default), and/or a counter of how many 0-entries in a row we've got (similar to throttling elsewhere)?

@masterwishx
Copy link
Contributor Author

I see :\

Would it help to make this a toggle (with 0=>reconnect behavior off by default), and/or a counter of how many 0-entries in a row we've got (similar to throttling elsewhere)?

Not shure becose I have on every update 0 hid
:(

@desertwitch
Copy link
Contributor

I see :\

Would it help to make this a toggle (with 0=>reconnect behavior off by default), and/or a counter of how many 0-entries in a row we've got (similar to throttling elsewhere)?

I'm definitely in a favor of a no-surprise, default-off toggle solution (counter seems good, can be -1 as default for disabled / 0 instant reconnect / 1 one time in a row / 2 two times in a row, ...)

@masterwishx
Copy link
Contributor Author

image

@desertwitch
Copy link
Contributor

desertwitch commented Nov 16, 2024

image

May not fix your case, probably, but at least your UPS is functional so far (iirc?) and doesn't lock up requiring manual reconnect as the initial PR author's. So for such severe cases as his a configurational toggle of this behavior could be helpful, I think.

@masterwishx
Copy link
Contributor Author

I see :\
Would it help to make this a toggle (with 0=>reconnect behavior off by default), and/or a counter of how many 0-entries in a row we've got (similar to throttling elsewhere)?

I'm definitely in a favor of a no-surprise, default-off toggle solution (counter seems good, can be -1 as default for disabled / 0 instant reconnect / 1 one time in a row / 2 two times in a row, ...)

What is mean for row? i have 0 HID for circle when update

@masterwishx
Copy link
Contributor Author

May not fix your case, probably,

Not a problem its working for now ,just i think maybe realted some commands maybe not working all time but maybe will find fix later

@desertwitch
Copy link
Contributor

desertwitch commented Nov 16, 2024

I see :\

Would it help to make this a toggle (with 0=>reconnect behavior off by default), and/or a counter of how many 0-entries in a row we've got (similar to throttling elsewhere)?

I'm definitely in a favor of a no-surprise, default-off toggle solution (counter seems good, can be -1 as default for disabled / 0 instant reconnect / 1 one time in a row / 2 two times in a row, ...)

What is mean for row? i have 0 HID for circle when update

Means for some UPS could be:

zero_hid_reconnect 3

4 HID elements
4 HID elements
0 HID elements (no reconnect)
4 HID elements
0 HID elements
0 HID elements
0 HID elements
(3 in a row - reconnect)
5 HID elements

Like three consecutive zero HID element cycles.

@masterwishx
Copy link
Contributor Author

I see :\

Would it help to make this a toggle (with 0=>reconnect behavior off by default), and/or a counter of how many 0-entries in a row we've got (similar to throttling elsewhere)?

I'm definitely in a favor of a no-surprise, default-off toggle solution (counter seems good, can be -1 as default for disabled / 0 instant reconnect / 1 one time in a row / 2 two times in a row, ...)

What is mean for row? i have 0 HID for circle when update

Means for some UPS could be:

zero_hid_reconnect 3

4 HID elements 4 HID elements 0 HID elements (no reconnect) 4 HID elements 0 HID elements 0 HID elements 0 HID elements (3 in a row - reconnect) 5 HID elements

Like three consecutive zero HID element cycles.

Looks Fine to me!

@masterwishx
Copy link
Contributor Author

i have like every cicle 0 HID :

Nov 16 12:00:44 DaRKNaS usbhid-ups[32357]: [D1:32357] upsdrv_updateinfo...
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] nut_libusb_get_interrupt: Connection timed out
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D1:32357] Got 0 HID objects...
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D1:32357] Quick update...
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D4:32357] Entering libusb_get_report
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D3:32357] Report[get]: (3 bytes) => 26 05 02
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Unit = 00000000, UnitExp = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Exponent = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] Path: UPS.BatterySystem.Charger.ChargerType, Type: Feature, ReportID: 0x26, Offset: 0, Size: 8, Value: 5
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] hu_find_infoval: found Constant Charge (CC) (value: 5)
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D3:32357] Report[buf]: (3 bytes) => 26 05 02
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Unit = 00000000, UnitExp = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Exponent = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] Path: UPS.BatterySystem.Charger.Status, Type: Feature, ReportID: 0x26, Offset: 8, Size: 8, Value: 2
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] ABM Status is 2
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Lookup [2] failed for [battery.charger.type.status]
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D3:32357] Report[buf]: (3 bytes) => 26 05 02
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Unit = 00000000, UnitExp = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Exponent = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] Path: UPS.BatterySystem.Charger.Status, Type: Feature, ReportID: 0x26, Offset: 8, Size: 8, Value: 2
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] ABM numeric status: 2
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] ABM string status: floating
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] ABM is -1
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D4:32357] Entering libusb_get_report
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D3:32357] Report[get]: (5 bytes) => 0a ff ff ff ff
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] PhyMax = 0, PhyMin = 0, LogMax = 2147483647, LogMin = -1
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Unit = 00001001, UnitExp = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Exponent = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] Path: UPS.PowerSummary.DelayBeforeStartup, Type: Feature, ReportID: 0x0a, Offset: 0, Size: 32, Value: -1
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D4:32357] Entering libusb_get_report
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D3:32357] Report[get]: (5 bytes) => 09 ff ff ff ff
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] PhyMax = 0, PhyMin = 0, LogMax = 2147483647, LogMin = -1
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Unit = 00001001, UnitExp = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Exponent = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] Path: UPS.PowerSummary.DelayBeforeShutdown, Type: Feature, ReportID: 0x09, Offset: 0, Size: 32, Value: -1
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D4:32357] Entering libusb_get_report
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D3:32357] Report[get]: (6 bytes) => 01 01 00 01 00 01
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Unit = 00000000, UnitExp = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Exponent = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] Path: UPS.PowerSummary.PresentStatus.ACPresent, Type: Feature, ReportID: 0x01, Offset: 0, Size: 8, Value: 1
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] hu_find_infoval: found online (value: 1)
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] process_boolean_info: online
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D3:32357] Report[buf]: (6 bytes) => 01 01 00 01 00 01
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Unit = 00000000, UnitExp = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Exponent = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] Path: UPS.PowerSummary.PresentStatus.Discharging, Type: Feature, ReportID: 0x01, Offset: 24, Size: 8, Value: 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] process_boolean_info: !dischrg
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D3:32357] Report[buf]: (6 bytes) => 01 01 00 01 00 01
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Unit = 00000000, UnitExp = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Exponent = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] Path: UPS.PowerSummary.PresentStatus.Charging, Type: Feature, ReportID: 0x01, Offset: 16, Size: 8, Value: 1
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] process_boolean_info: !chrg
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D3:32357] Report[buf]: (3 bytes) => 26 05 02
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Unit = 00000000, UnitExp = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Exponent = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] Path: UPS.BatterySystem.Charger.Status, Type: Feature, ReportID: 0x26, Offset: 8, Size: 8, Value: 2
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] ABM CHRG/DISCHRG legacy string status (ups.status): !chrg
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] process_boolean_info: !chrg
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D3:32357] Report[buf]: (6 bytes) => 01 01 00 01 00 01
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] PhyMax = 0, PhyMin = 0, LogMax = 255, LogMin = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Unit = 00000000, UnitExp = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] Exponent = 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D2:32357] Path: UPS.PowerSummary.PresentStatus.BelowRemainingCapacityLimit, Type: Feature, ReportID: 0x01, Offset: 8, Size: 8, Value: 0
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] hu_find_infoval: found !lowbatt (value: 0)
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] process_boolean_info: !lowbatt
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D5:32357] send_to_all: SETINFO driver.state "quiet"
Nov 16 12:00:45 DaRKNaS usbhid-ups[32357]: [D6:32357] send_to_all: write 29 bytes to socket 16 succeeded (ret=29): SETINFO driver.state "quiet"
Nov 16 12:00:46 DaRKNaS usbhid-ups[32357]: [D5:32357] send_to_all: SETINFO driver.state "updateinfo"
Nov 16 12:00:46 DaRKNaS usbhid-ups[32357]: [D6:32357] send_to_all: write 34 bytes to socket 16 succeeded (ret=34): SETINFO driver.state "updateinfo"
Nov 16 12:00:46 DaRKNaS usbhid-ups[32357]: [D1:32357] upsdrv_updateinfo...
Nov 16 12:00:47 DaRKNaS usbhid-ups[32357]: [D2:32357] nut_libusb_get_interrupt: Connection timed out
Nov 16 12:00:47 DaRKNaS usbhid-ups[32357]: [D1:32357] Got 0 HID objects...
Nov 16 12:00:47 DaRKNaS usbhid-ups[32357]: [D1:32357] Quick update...
Nov 16 12:00:47 DaRKNaS usbhid-ups[32357]: [D4:32357] Entering libusb_get_report

you mean someone have 0 HID 3 time printed one after another without any printed lines between:

upsdrv_updateinfo...
 nut_libusb_get_interrupt: Connection timed out
 Got 0 HID objects...
 Got 0 HID objects... 
 Got 0 HID objects...

@jimklimov
Copy link
Member

Ack, sorry about that, I'll poke a counter solution then.

jimklimov added a commit to jimklimov/nut that referenced this issue Nov 16, 2024
…e the feature with optional `interrupt_pipe_no_events_tolerance` setting [networkupstools#2681]

Signed-off-by: Jim Klimov <[email protected]>
jimklimov added a commit to jimklimov/nut that referenced this issue Nov 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants