Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leviton switches sending hundreds of messages, won't toggle after latest update #146

Open
sheepster-9k opened this issue Aug 2, 2020 · 29 comments

Comments

@sheepster-9k
Copy link

sheepster-9k commented Aug 2, 2020

The problem

Below posted as bug in integration for HA, however after porting everything to the docker container implementation I'm experiencing the same issue.

Latest Open Z Wave integration update results in abnormal behavior with Leviton DZPA1-15A switch. On switch, log is showing Value Refreshed: Node (X), Instance 1, Label Target State 6 or 7 times per second according to the log. MQTT is receiving these messages as well. Hundreds of packets per minute.
Have tried with 3 other DZPA1-15A's, same behavior. Behavior only observed when switch is in one state vs the other, messages stop when switch is switched. Tried to see if there were any settings regarding polling however there appear to not be.


Behavior is identical with docker container running on Ubuntu

##Logs
Below is the log from OZW switching from off to on twice waiting a couple seconds in between each actuation.

[20200801 11:50:01.169 EDT] [ozw.library] [debug]: Detail - Node: 2 Extended TxStatus: Time: 30, Hops: 0, Rssi: -82 --- --- --- ---, AckChannel: 1, TxChannel: 1, RouteScheme: Last Working Route, Route: 0 0 0 0, RouteSpeed: 40K, RouteTries: 1, FailedLinkFrom: 0, FailedLinkTo: 0
[20200801 11:50:01.169 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.169 EDT] [ozw.library] [debug]: Detail - Node: 2 Expected callbackId was received
[20200801 11:50:01.169 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.185 EDT] [ozw.library] [debug]: Detail - Node: 2 Received: 0x01, 0x0b, 0x00, 0x04, 0x00, 0x02, 0x03, 0x25, 0x03, 0xff, 0xae, 0x00, 0x86
[20200801 11:50:01.185 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.185 EDT] [ozw.library] [info]: Info - Node: 2 Response RTT 48 Average Response RTT 48
[20200801 11:50:01.185 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.185 EDT] [ozw.library] [info]: Info - Node: 2 Received SwitchBinary report from node 2: level=On
[20200801 11:50:01.185 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.185 EDT] [ozw.library] [debug]: Detail - Node: 2 Value Updated: old value=true, new value=true, type=bool
[20200801 11:50:01.185 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.186 EDT] [ozw.library] [debug]: Detail - Node: 2 Target Value is Set to true
[20200801 11:50:01.186 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.186 EDT] [ozw.library] [debug]: Detail - Node: 2 Value Updated: old value=true, new value=true, type=bool
[20200801 11:50:01.186 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.186 EDT] [ozw.library] [debug]: Detail - Node: 2 Changes to this value are not verified
[20200801 11:50:01.186 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.186 EDT] [ozw.library] [debug]: Detail - Node: 2 Expected reply and command class was received
[20200801 11:50:01.186 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.187 EDT] [ozw.library] [debug]: Detail - Node: 2 Message transaction complete
[20200801 11:50:01.187 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.187 EDT] [ozw.library] [debug]: Detail - Node: 2 Removing current message
[20200801 11:50:01.187 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.187 EDT] [ozw.library] [debug]: Detail - Node: 2 Notification: ValueChanged CC: COMMAND_CLASS_SWITCH_BINARY Instance: 1 Index: 0
[20200801 11:50:01.187 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.187 EDT] [ozw.library] [debug]: Detail - Node: 2 Notification: ValueRefreshed CC: COMMAND_CLASS_SWITCH_BINARY Instance: 1 Index: 1
[20200801 11:50:01.187 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.187 EDT] [ozw.notifications] [debug]: Notification pvt_valueChanged: 38354960 Thread: 0x7f7ef41b7d48
[20200801 11:50:01.188 EDT] [ozw.notifications] [debug]: Notification pvt_valueRefreshed: 281475023454224 Thread: 0x7f7ef41b7d48
[20200801 11:50:01.188 EDT] [ozw.mqtt.publisher] [debug]: Publishing Event valueRefreshed: 281475023454224
[20200801 11:50:01.578 EDT] [ozw.library] [info]: Info - Node: 0 Timer: delayed event
[20200801 11:50:01.578 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.579 EDT] [ozw.library] [info]: Info - Node: 2 Sending Get to Refresh Value after Target Check
[20200801 11:50:01.579 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.580 EDT] [ozw.library] [debug]: Detail - Node: 2 Queuing (Send) SwitchBinaryCmd_Get (Node=2): 0x01, 0x09, 0x00, 0x13, 0x02, 0x02, 0x25, 0x02, 0x25, 0xf1, 0x16
[20200801 11:50:01.580 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.580 EDT] [ozw.library] [debug]: Detail - Node: 0 Timer: waiting with timeout -1 ms
[20200801 11:50:01.580 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.580 EDT] [ozw.library] [info]: Info - Node: 2 Sending (Send) message (Callback ID=0xf1, Expected Reply=0x04) - SwitchBinaryCmd_Get (Node=2): 0x01, 0x09, 0x00, 0x13, 0x02, 0x02, 0x25, 0x02, 0x25, 0xf1, 0x16
[20200801 11:50:01.580 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.581 EDT] [ozw.library] [info]: Info - Node: 2 Encrypted Flag is 0
[20200801 11:50:01.581 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.589 EDT] [ozw.library] [debug]: Detail - Node: 2 Received: 0x01, 0x04, 0x01, 0x13, 0x01, 0xe8
[20200801 11:50:01.589 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.589 EDT] [ozw.library] [debug]: Detail - Node: 2 ZW_SEND_DATA delivered to Z-Wave stack
[20200801 11:50:01.589 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.613 EDT] [ozw.library] [debug]: Detail - Node: 2 Received: 0x01, 0x18, 0x00, 0x13, 0xf1, 0x00, 0x00, 0x02, 0x00, 0xad, 0x7f, 0x7f, 0x7f, 0x7f, 0x01, 0x01, 0x03, 0x00, 0x00, 0x00, 0x00, 0x02, 0x01, 0x00, 0x00, 0xaa
[20200801 11:50:01.613 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.614 EDT] [ozw.library] [debug]: Detail - Node: 2 ZW_SEND_DATA Request with callback ID 0xf1 received (expected 0xf1)
[20200801 11:50:01.614 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.614 EDT] [ozw.library] [info]: Info - Node: 2 Request RTT 32 Average Request RTT 32
[20200801 11:50:01.614 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.614 EDT] [ozw.library] [debug]: Detail - Node: 2 Extended TxStatus: Time: 20, Hops: 0, Rssi: -83 --- --- --- ---, AckChannel: 1, TxChannel: 1, RouteScheme: Last Working Route, Route: 0 0 0 0, RouteSpeed: 40K, RouteTries: 1, FailedLinkFrom: 0, FailedLinkTo: 0
[20200801 11:50:01.614 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.615 EDT] [ozw.library] [debug]: Detail - Node: 2 Expected callbackId was received
[20200801 11:50:01.615 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.634 EDT] [ozw.library] [debug]: Detail - Node: 2 Received: 0x01, 0x0b, 0x00, 0x04, 0x00, 0x02, 0x03, 0x25, 0x03, 0xff, 0xad, 0x00, 0x85
[20200801 11:50:01.634 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.634 EDT] [ozw.library] [info]: Info - Node: 2 Response RTT 52 Average Response RTT 50
[20200801 11:50:01.635 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.635 EDT] [ozw.library] [info]: Info - Node: 2 Received SwitchBinary report from node 2: level=On
[20200801 11:50:01.635 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.635 EDT] [ozw.library] [debug]: Detail - Node: 2 Value Updated: old value=true, new value=true, type=bool
[20200801 11:50:01.635 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.636 EDT] [ozw.library] [debug]: Detail - Node: 2 Target Value is Set to true
[20200801 11:50:01.636 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.636 EDT] [ozw.library] [debug]: Detail - Node: 2 Value Updated: old value=true, new value=true, type=bool
[20200801 11:50:01.636 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.636 EDT] [ozw.logging] [debug]: Increasing Logging Capacity to 2004
[20200801 11:50:01.636 EDT] [ozw.library] [debug]: Detail - Node: 2 Changes to this value are not verified
[20200801 11:50:01.636 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.637 EDT] [ozw.library] [debug]: Detail - Node: 2 Expected reply and command class was received
[20200801 11:50:01.637 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.637 EDT] [ozw.library] [debug]: Detail - Node: 2 Message transaction complete
[20200801 11:50:01.637 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.637 EDT] [ozw.library] [debug]: Detail - Node: 2 Removing current message
[20200801 11:50:01.637 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.638 EDT] [ozw.library] [debug]: Detail - Node: 2 Notification: ValueChanged CC: COMMAND_CLASS_SWITCH_BINARY Instance: 1 Index: 0
[20200801 11:50:01.638 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.638 EDT] [ozw.library] [debug]: Detail - Node: 2 Notification: ValueRefreshed CC: COMMAND_CLASS_SWITCH_BINARY Instance: 1 Index: 1
[20200801 11:50:01.638 EDT] [ozw.logging] [debug]: popping Log Mesages
[20200801 11:50:01.638 EDT] [ozw.notifications] [debug]: Notification pvt_valueChanged: 38354960 Thread: 0x7f7ef41b7d48
[20200801 11:50:01.638 EDT] [ozw.notifications] [debug]: Notification pvt_valueRefreshed: 281475023454224 Thread: 0x7f7ef41b7d48
[20200801 11:50:01.638 EDT] [ozw.mqtt.publisher] [debug]: Publishing Event valueRefreshed: 281475023454224
[20200801 11:50:04.021 EDT] [default] [debug]: QVariant(int, 1)
[20200801 11:50:04.021 EDT] [default] [debug]: QVariant(int, 2)
[20200801 11:50:09.551 EDT] [ozw.manager] [info]: Client Connection Disconnected!
[11:50:09] INFO: Starting ozw-admin...
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
ozwadmin: Settings Path is at "/root/.config/OpenZWave/ozw-admin.conf"
ozwadmin: Checking "/data/ozw/config" for manufacturer_specific.xml
ozwadmin: Checking "/data/ozw/config" for options.xml
ozwadmin: DBPath: "/data/ozw/config/"
ozwadmin: userPath: "/data/ozw/config/"
ozwadmin: Settings Path: "/root/.config/OpenZWave/ozw-admin.conf"
ozwadmin: Using Saved Network Cache Size: 1000
QT_MESSAGE_PATTERN: %{backtrace} is not supported by this Qt build
[20200801 11:50:09.751 EDT] [ozwadmin] [info]: OZWCore Initilized QTOpenZwave(0x7fcf1dac2420) QTOZWManager(0x55b5c5d7a700)
[20200801 11:50:09.765 EDT] [default] [warning]: QVncServer created on port 5900 -->

@sheepster-9k
Copy link
Author

Update, given I couldn't find a direct correlation between the HomeAssistant addon & ozwdaemon build, I cycled backwards through the versions to find where it stabilized. All builds prior to 152 have this issue, 152 is fine.

@kpine
Copy link
Contributor

kpine commented Aug 3, 2020

That would most likely be a side-effect of the work being done for OpenZWave/open-zwave#1321.

@l3arcf
Copy link

l3arcf commented Aug 12, 2020

FWIW, I experience this on the docker version on anything after 152 as well. Just tested with HA Core 0.114.0 that stated "unnecessary polling of devices has been reduced" but that had no effect. It could be related to #1321, but that's not confirmed.
Looking forward to testing again on the next qt-openzwave build (current build is 170).

@l3arcf
Copy link

l3arcf commented Aug 16, 2020

Build (tagged as amd64-172) which I assume with be the next 'latest' for my architecture still exhibits this behavior. I'm struggling to believe there are only a few of us with this issue so there must be something environmental at play here too. Any evidence of the DZ15S for anyone after tag 152?

@kpine
Copy link
Contributor

kpine commented Aug 16, 2020

The issue I linked has not been fixed yet.

@ledhed-jgh
Copy link

I'm experiencing this also. Here is a pastebin of the MQTT messages I'm seeing: https://pastebin.com/ZfZJn7U4

@l3arcf
Copy link

l3arcf commented Aug 25, 2020

As kpine noted, the linked issue has not been fixed, but it is nearly 3 years since it was opened, so I still think there's value in testing new deploys as our issue could fix before 1321 is closed. Also, it's not confirmed to be related, just most likely a side-effect, which I agree with.

It seems like other folks with the DZ15S are experiencing this issue now too. Still no evidence of anyone successfully supporting this switch on anything newer than build 152. I was secretly hoping that something else in my config could be at fault but that theory is fading now.

@kpine
Copy link
Contributor

kpine commented Aug 25, 2020

The problem was introduced by the recent commits for supporting target values. It's not a coincidence that you see it after build 152.

@l3arcf
Copy link

l3arcf commented Aug 25, 2020

@kpine, this is insightful for me. I had done a comparison between my old 1.4 cache xml (I have an old backup) and the 1.6 cache. For CC37, the old one is missing entries for target values and reported as version="1". The new (1.6) cache reports CCVersion "2" and has a Target State and Duration setting of 0.

Now when I compare CC37 between the DZ15S and a ZEN23 (ZooZ) which is also a Binary Switch on my new cache, I see the ZEN reports CCVersion "1" and is missing the Target State and Duration, and has no zwave network flooding issue.

I am wondering if the Levition DZ15S is getting confgured as a version 2 supported device and whether I can somehow force it to a version 1 for this CC.

Recognizing that this is likely related to supporting target values, it does seem weird to me that a binary switch would report values for a feature it doesn't support. Is there a possibility that this is the root cause, or would it be expected that some binary switches support transition?

@kpine
Copy link
Contributor

kpine commented Aug 25, 2020

@l3arcf I forgot to mention that this problem was already reported at OpenZWave/open-zwave#2342 which was closed as being a duplicate of 1321.

In that issue logs were included that show that the switch is reporting V2 of the Binary Switch command class. V2 introduced Target and Duration values. The logs also show that the switch is responding to Get commands with a V1 reply, that does not include the target value. That is the where the problem resides with this device. OZW expects a target value to be returned so it can decide whether or not it needs to refresh the value. I think it's getting confused since the target value is not returned. So if a device says it's V2 and reports V1 messages, it's unexpected. But...

I had done a comparison between my old 1.4 cache xml (I have an old backup) and the 1.6 cache. For CC37, the old one is missing entries for target values and reported as version="1". The new (1.6) cache reports CCVersion "2" and has a Target State and Duration setting of 0.

This seems strange to me. OZW 1.4 does not support V2 of the command class, so it's expected that Target and Duration values are not created. However, I would still expect it to report the version number as 2. If there's some issue in OZW 1.6 that determines the wrong version number, that would be a separate issue and issue 1321 would just be a side effect of that.

The logs from 2342 clearly show the device reporting V2. If you issue a "refresh node" command on yours, you will probably see the same thing. It might be helpful if you could confirm with other software what version is being reported. Even getting logs from OZW 1.4 would be useful, but the SI Labs PC Controller software would be a good 3rd Party tool to verify with. If you want to upload some logs here I would be happy to point out where the problem is occurring.

If the tool reports V1, then we know something has changed in OZW 1.6 that causes it to determine the wrong version number. That would be a separate bug itself (e.g. OpenZWave/open-zwave#2342). If the tool reports V2, then we know the device is buggy because it responds to the Get command with the wrong data.

Anyways, this is an upstream issue no matter the cause, not a problem with qt-openzwave/ozwdaemon.

@l3arcf
Copy link

l3arcf commented Aug 25, 2020

I would still expect it to report the version number as 2. If there's some issue in OZW 1.6 that determines the wrong version number, that would be a separate issue and issue 1321 would just be a side effect of that.

When I said 'report' I meant in the old cache*.xml, sorry if that was misleading. Given that 1.4 didn't support v2, it's quite possible that the version in the cache need not be what the device actually reported. For completeness, here's that snip from
my old 1.4 cache:

<CommandClass id="37" name="COMMAND_CLASS_SWITCH_BINARY" version="1" request_flags="4" innif="true"> <Instance index="1" /> <Value type="bool" genre="user" instance="1" index="0" label="Switch" units="" read_only="false" write_only="false" verify_changes="false" poll_intensity="0" min="0" max="0" value="False" /> </CommandClass>

Again, this does not prove the device reported v1, just that it was in the cache.

I loaded PC Controller, excluded, and included my one switch on a separate zwave network and can see a bunch of stuff that makes more sense now than it did 6 months ago when I first used it for firmware updates, but I confess that my proficiency is low still.

  1. The device seems to report v2 (which likely does not surprise anyone)
  2. I am able to turn it on and off, using the template forms provided (suggesting it's working with V2)
  3. I learned that when you 'set' you need to set the target value, and the duration as "Instantly" or "Default"
  4. I could not figure out how to export a log (image here in lieu of that)
  5. I can not decipher the log at all, so not sure if the response to a 'get' is v1 or v2...

LogFromPCController

Note that the above log was a 'on' then a 'get'. I sent a 'on' again, then an 'off' and a 'get'. Finally a Node info to see if the network was slow. Things seem reasonable here.

I fear my contributions to figuring this out may be diminishing as I'm now deep in territory that I am not familiar with, but it does seem like the PC Controller software has no problem setting and getting values. Maybe that, in itself, is of value?

@kpine
Copy link
Contributor

kpine commented Aug 25, 2020

When I said 'report' I meant in the old cache*.xml, sorry if that was misleading.

No, that was understood. My comment was that from examining the 1.4 code, I could not see a reason why the version in the XML file would not be set to version 2. Obviously there's something I'm missing, based on your results with the PC Controller software (maybe).

Again, this does not prove the device reported v1, just that it was in the cache.

Agreed. You would need a log file from ozw 1.4 to confirm, but I'm not sure it's relevant at this point.

The device seems to report v2 (which likely does not surprise anyone)

If you are basing this on the screenshot above, then no, that doesn't tell you the node's version. I believe the command class window defaults to selecting the latest version of the command class. You would need to navigate to the node itself and see the version. The version is not used for a Get command anyways.

I can not decipher the log at all, so not sure if the response to a 'get' is v1 or v2...

The log shows Rx SWITCH_BINARY_REPORT(25 03) + 00, which means the response is v1, since the data payload is only 1 byte (the current level). It's exactly the same thing OZW reports.

@kpine
Copy link
Contributor

kpine commented Aug 25, 2020

Well, getting the version from the PC Controller software is not as easy as I thought, but it's still not hard. AFAIK you need to manually request the version.

image

In Requested Command Class, select the Switch Binary one, I chose Switch Multilevel. The log will show the result.

image

The first byte is the command class, and the second byte is the version. In my case it's Switch Multilevel V2.

@l3arcf
Copy link

l3arcf commented Aug 25, 2020

Thanks for the step-by step. Here's my log output. I had both 25 - Switch Binary v.1 and a v.2 to select from for "Requested Command Class". I did each with the same result (evidenced in the log).

image

It looks like it reports v2 right?

Now, if the version is not used in the SWITCH_BINARY_GET, then how do we influence the version of the response? For a controller that only supports version 1, the device should not send a v2 report, and for a controller that supports v2, we should somehow let the device know to do that.

Saying it another way, in OZW's world, we're not supporting the v1 response in later releases, but I'm not clear how we tell the device what version response we need.

Although, this may all be mute if the fixes for the dimming switches also include response version detection. (which may come with the 1321 fix.)

I wonder if there are any binary switches running version 2 successfully with builds after 152, proving that the issue is the Leviton only switch.

@kpine
Copy link
Contributor

kpine commented Aug 25, 2020

It looks like it reports v2 right?

Yes.

Now, if the version is not used in the SWITCH_BINARY_GET, then how do we influence the version of the response? For a controller that only supports version 1, the device should not send a v2 report, and for a controller that supports v2, we should somehow let the device know to do that.

I don't think it works that way. If a controller only understands V1 of a command class, then it must ignore any extra data it gets from a Report message. There is no version information present in the message, besides its length. Z-Wave is highly backwards compatible.

OZW does not always support the latest version of everything, yet is is capable of understanding newer devices with updated command class versions just fine. It just won't support any new functionality added by newer command classes.

Saying it another way, in OZW's world, we're not supporting the v1 response in later releases, but I'm not clear how we tell the device what version response we need.

Not sure what you mean here. OZW 1.4 supports V1 of this particular command class and will ignore the data for V2 if a device returns it. OZW 1.6 supports V2 of this command class, and currently expects the Report data be in V2 format if that's what the device says it supports. The bug here is probably in both places, the Leviton returns the wrong version of the report data and OZW is not handling it the unexpected data. When this latest refresh code was being written though, this non-compliant behavior was likely not considered.

Although, this may all be mute if the fixes for the dimming switches also include response version detection. (which may come with the 1321 fix.)

I don't know if Fishwaldo was aware or not of this particular device's behavior. In general though, the message parsing code in OZW considers both version and length of the message, so a similar approach is probably necessary for this case.

I wonder if there are any binary switches running version 2 successfully with builds after 152, proving that the issue is the Leviton only switch.

I have yet to see any other reports. That doesn't disprove anything, but there are plenty of reports for this Leviton. The code works fine if the device returns the data as expected, and I would guess most devices are compliant.

In summary:

  • This Leviton device reports that it supports V2 of the Switch Binary command class
  • When turning a switch ON, OZW sends a V2 Set command, which includes the target value (ON) and duration. It looks like the switch handles the Set message just fine.
  • After the Set, OZW immediately issues a Get to verify the value. Since the device supports V2, OZW expects a response with the current value and target value.
  • The Leviton sends a Report in response to the Get, but it is in V1 format so it includes only the *current value, not the target value. OZW doesn't check the length of the data and reads the target value from a buffer, which is likely some default initialized memory. In C++, this would be zero of initialized properly. So this appears as if the target value is OFF, but the current value is ON. Since the device is not at the (incorrect) target value, OZW will issue another Get again and again until current value == target (OFF) (does it stop?).

@l3arcf
Copy link

l3arcf commented Aug 25, 2020

@kpine , I appreciate you stepping me through all of this. I'm understanding more and more. I dug into the ozw code now and I think I see where the decision is made. I believe it was introduced in the following change (https://github.com/OpenZWave/open-zwave/commit/98c7c036ddeaad05a770993cd7c791dcbc2eb77d#)

Note that the pattern used for the 'Basic.cpp' files was to check the length, but the pattern used for SwitchBinary.cpp was to check the version. SwitchMultilevel.cpp, actually does version and value validation too, if I'm interpreting that correctly.

Do you think Fishwaldo would consider "if (_length == 4)" rather than "if (GetVersion() >= 2)" ? It may be better practice to test the payload you're making a decision on rather than making the assumption, Also, the possibility is still there that there are currently limited Binary switches that support version 2 and that's why the issue seems related to the DZ15s only.

@kpine
Copy link
Contributor

kpine commented Aug 25, 2020

Do you think Fishwaldo would consider "if (_length == 4)" rather than "if (GetVersion() >= 2)" ?

That's a question I can't answer, but I'm pretty sure Fishwaldo will know how to fix it. If you're asking if you should submit a PR with that change? Well, feel free, but I would assume it would get fixed along with the rest of the open issues. Doesn't hurt if you're up to it, but it might not get accepted. With all the refresh value stuff it might be more complicated.

It's pretty trivial to build your own version of ozwdaemon by modifying the Docker file to import your versions of code.

@l3arcf
Copy link

l3arcf commented Aug 25, 2020

I'd need to test before a PR. I think I need to build my own version to do that, so I better get learning. Thanks for all the advice and direction.

@kaijk
Copy link

kaijk commented Aug 31, 2020

In late July PR-2274 at https://github.com/OpenZWave/open-zwave was merged. This completed adding DZ15S and DZPA1 which had previously been somewhat scrambled from someone apparently copying the DZPA1 definition to create the DZ15S (meta data and productID, etc. info was wrong). Anyway, the PR validation was failing at command class 134 (COMMAND_CLASS_VERSION that seems to be about "Compatibility"....

All the Leviton devices, but only the Leviton devices have this command class. We removed the command class from the two definitions, and fishwaldo merged them in. At that time the spamming did not occur either with or without the command class.

After seeing this current Issue, I went back and added the 134 command class back into my local DZ15S definition (I have the DZ15S in my docker test environment).

  <CommandClass id="134">
    <Compatibility>
      <ClassGetVersionSupported>false</ClassGetVersionSupported>
    </Compatibility>
  </CommandClass>

Doing so didn't seem to mitigate the spamming in build 170, nor did it seem to change the cache file for the device from without the command class. Build 152 still works OK with or without the 134 command class.

I wonder if code related to that command class changed between 152 and 170?
Hope this might help shed some light.

@kpine
Copy link
Contributor

kpine commented Sep 1, 2020

@kaijk That's interesting, I wasn't aware of that flag.

After seeing this current Issue, I went back and added the 134 command class back into my local DZ15S definition

Did you refresh the node info after doing so? OZW won't adjust the cache file (and behavior) until it's refreshed. You may also need to restart ozwd in order for something like this to be updated. If all else fails, you can delete the single node from the cache file and it will be re-generated.

If you can prevent OZW from detecting the version and force V1, then this should fix the problem for these switches. OZW will not use the problematic refresh code for V1 switches (unless manually enabled).

The wiki does say "The versions must then be manually specified in the device Config File.". There are a handful of XML files with this flag, and nearly all of them are Leviton devices, none of them have any versions manually defined. I'm thinking maybe OZW just defaults it to version 1 in this case.

@l3arcf
Copy link

l3arcf commented Sep 1, 2020

@kaijk , I'd followed all the config updates and settled on excluding this setting in my config. I tried all kinds of overrides, but nothing seemed to influence the creation of config*.xml. Since my last comment on this thread, I've created a PR for a code change that was introduced post 152. Actually, to test I took a copy of the code, did my own PR and built a docker image that is essentially somewhere after 170 plus the changes I made for binary switches supporting version 2 that are only sending version 1 reports. Feel free to give it a try in your environment if you like.

Code: https://github.com/l3arcf/open-zwave
Docker: https://hub.docker.com/r/l3arcf/ozwdaemon

Not sure if you saw my PR, that I am hoping will be reviewed/accepted/rejected soon: OpenZWave/open-zwave#2370

Note that I have 3 different Binary Switches, but only the DZ15S reports support for V2. I am wondering if there are any Binary Switches that are V2 and send a V2 report. It would be nice to test that scenario. Either way, the proposed change is intended to support either scenario.

Obviously better to find a way to force the code to treat the switch as V1 until Fishwaldo or Leviton have a better solution, But if waiting for that is not an option, the docker image I created seems to be doing the job for me.

@kaijk
Copy link

kaijk commented Sep 1, 2020

@kpine, , I took the container down and deleted the cache between edits. I'll play with it some more, and try scenarios as well. Its easy in that the DZ15S is the only device I have in my test env at the moment. I hadn't done anything with it really since the device PR.

It seemed that the cache file just gets created with no influence of that COMMAND_CLASS. @l3arcf, I'll also try your patched ozwd.

I've got a house full of DX15S and its dimmer kin the DZ6HD so I'd like this to work before I fully commit to the beta.

@kpine
Copy link
Contributor

kpine commented Sep 1, 2020

I took the container down and deleted the cache between edits

That's certainly the best way to test XML changes.

It seemed that the cache file just gets created with no influence of that COMMAND_CLASS

Strange. It should not ask for any command class versions at all if the flag is false, maybe I'm missing something.

https://github.com/OpenZWave/open-zwave/blob/f10f80f7e592d9e1f6b2766ae0d8db9894406183/cpp/src/command_classes/Version.cpp#L189

@l3arcf
Copy link

l3arcf commented Sep 1, 2020

@kaijk , my DZ6HD's are working the same as the did for me prior to moving to the beta. It's beta for a reason though. Your needs may uncover something that mine did not.

@blhoward2
Copy link

blhoward2 commented Sep 20, 2020

@l3arcf

Code: https://github.com/l3arcf/open-zwave
Docker: https://hub.docker.com/r/l3arcf/ozwdaemon

Not sure if you saw my PR, that I am hoping will be reviewed/accepted/rejected soon: OpenZWave/open-zwave#2370

Should it help anyone else, I've built and published a docker image for qt-openzwave incorporating l3arcf's great bugfix that is compiled to run on Rasberry Pis.

https://hub.docker.com/r/blhoward2/qt-openzwave_leviton_bugfix

@bachya
Copy link

bachya commented Sep 20, 2020

Follow up on @blhoward2's post above, in case anyone needs to compile the OpenZWave/open-zwave#2370 bugfix into a Docker image for a different architecture, here are instructions on how to do it manually:

  1. Clone this repo and cd into it:
$ git clone https://github.com/OpenZWave/qt-openzwave
$ cd qt-openzwave/
  1. Open Docker/Dockerfile in a text editor and modify lines 48-51 from this:
RUN git clone https://github.com/OpenZWave/open-zwave.git \
	&& cd open-zwave \
	&& make -j${concurrency} \
	&& make install instlibdir=/usr/local/lib/

...to this:

RUN git clone https://github.com/OpenZWave/open-zwave.git \
	&& cd open-zwave \
        && git fetch origin pull/2370/head:bugfix \
        && git checkout bugfix \
	&& make -j${concurrency} \
	&& make install instlibdir=/usr/local/lib/
  1. Since the building of this image can consume quite a bit of memory, make sure to give Docker enough (on my MacBook, I temporarily gave Docker access to 8 GB).

  2. Build the image (I'm using the allinone variant here):

$ docker build . -f Docker/Dockerfile -t ozw:leviton-bugfix --target allinone

@john3jo
Copy link

john3jo commented Oct 6, 2020

@l3arcf

Code: https://github.com/l3arcf/open-zwave
Docker: https://hub.docker.com/r/l3arcf/ozwdaemon
Not sure if you saw my PR, that I am hoping will be reviewed/accepted/rejected soon: OpenZWave/open-zwave#2370

Should it help anyone else, I've built and published a docker image for qt-openzwave incorporating l3arcf's great bugfix that is compiled to run on Rasberry Pis.

https://hub.docker.com/r/blhoward2/qt-openzwave_leviton_bugfix

So Nice to know I wasn't going insane and this issue is affecting others!
I have the same Leviton switches with above behavior and willing to try your Docker built on my Raspi4.
Could you help me understand how to do that? I am using Zwave2Mqtt via Docker. How do I patch my system with this?
I am ok with starting the build over if necessary.
Thanks for your help on this!

@l3arcf
Copy link

l3arcf commented Oct 6, 2020

@john3jo , I think you want to research the OpenZwave (Beta) integration of Zwave. It uses an MQTT bridge (much like Zwave2Mqtt) and is likely the path forward for you at some point. I do not know the pros and cons of the two. I came from the imbedded z-wave setup., The docker image @blhoward2 is for the Beta version, so maybe take the leap? YMMV though.

I'm hoping it's not long until we get a new version as there are a number of other PR's in the queue so I think the next version will be a leap forward.

@l3arcf
Copy link

l3arcf commented Nov 3, 2020

My PR was merged into qt-openzwave. @sheepster-9k , I have this working for me using qt-openzwave now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants