
Nest Protect stops updating sensors after losing connection with Internet - requires reload #385

Closed
puterboy opened this issue Dec 29, 2024 · 33 comments
Labels
bug Something isn't working

Comments

@puterboy

puterboy commented Dec 29, 2024

The problem

This is probably related to the same problem identified and verified for the standard Nest integration: "Google Nest Integration loses connection with Nest Thermostats..." (#130407).

The problem is that when the router reboots or otherwise loses connection to the Internet, the integration stops receiving pub/sub messages from the Google Cloud.

The temporary workaround is to manually reload the integration every time you reboot the router or otherwise lose Internet connection.

Interestingly, the problem with the standard Nest Integration seemed to start for me about 6 weeks ago (I do automatic router reboots every Friday night) BUT the problem with the Nest Protect integration only first affected me a week ago and again last night.

What version of this integration (ha-nest-protect) has the issue?

0.4.0b9

What version of Home Assistant Core has the issue?

2024.12.5

Device / Model

Nest Protect Temperature sensors

Diagnostics information

As above.

Home Assistant log

Logs
Copy/paste any log here, between the starting and ending backticks (`)

Additional information

No response

@puterboy puterboy added the bug Something isn't working label Dec 29, 2024
@puterboy puterboy changed the title Nest Protect stops updating after losing connection with Internet - requires reload Nest Protect stops updating sensors after losing connection with Internet - requires reload Dec 29, 2024
@GSzabados

Are you using NabuCasa or some fancy DIY solutions for the callback?

@iMicknl
Owner

iMicknl commented Dec 30, 2024

Duplicate of #63

@iMicknl iMicknl marked this as a duplicate of #63 Dec 30, 2024
@puterboy
Author

Are you using NabuCasa or some fancy DIY solutions for the callback?
No cloud or any fancy callback.

@puterboy
Author

Duplicate of #63
It's interesting, going back further in my history, I notice that this problem has come and gone over the past 8 months that I have used this integration.
I reboot my routers once a week.
So far:

  • April 18 - Aug 25 - Loses connection
  • Aug 30 - Dec 14 - Doesn't lose connection
  • Dec 21 - Dec 30 - Loses connection

Could be that it has always been "broken" but that the router reboot time is borderline in terms of causing loss of connectivity.

Interestingly, though, the standard Nest Integration only started to lose connectivity on reboot at the start of November, so whatever is causing the loss of connectivity may differ between the two integrations.

Regardless though the bugs for both the standard Nest Integration and Nest Protect are totally repeatable.

Given that per #63 the bug already dates back a couple of years, do you plan to fix it?

@iMicknl
Owner

iMicknl commented Dec 30, 2024

@puterboy if we figure out the exact issue, I am happy to see if I have time to fix it. Or perhaps any of the other contributors like @GSzabados.

I will close this issue, let's continue the discussion in #63.

@iMicknl iMicknl closed this as completed Dec 30, 2024
@GSzabados

@puterboy, are you sure that you are on the latest beta?

If yes, could you please turn on debug logging before your router reboots and check the logs for nest_protect, before and after the reboot. No need to share, but just please check that any message shows up before and after the reboot. Don't reload the integration after the reboot, just wait about 2 hours and check for any logs with nest_protect in it.

Also, which version of protect do you have?

I have had a router reboot a few days ago and I have not seen any issue like that.

@puterboy
Author

v0.4.0b9

I will turn on debug logging and check back...

@GSzabados

Turn it on in the configuration.yaml not on the Integration's page, so it would be persistent.

@puterboy
Author

How long does it last when turned on via the GUI? It seems to have survived a ha core restart

@GSzabados

I have mixed experience, that's why I suggested the yaml way.

@puterboy
Author

OK.
Debug logs are normal until I reboot the router.
After rebooting the router... logs are silent
Then resume again as soon as I reload the Nest protect integration...

@GSzabados

Ok, could you provide some details.

Which Nest Protect do you have wired or wireless?

How long did you wait after rebooting the router before restarting the integration?

What kind of router do you have as it requires regular reboots? And why does it require regular reboots?

How is the IP assigned by your ISP? Do you have IPv4 or IPv6, fixed or dynamic? Do you have a public IP at all, or are you sitting behind a NAT?

How is your IP assignment on your internal network? How is the IP assigned to HA, and how is it assigned to your Nest Protects?

I really think that your issue is likely a wrong network configuration, not an integration issue, as I cannot reproduce it with a router reboot. And mine does take some time to reboot.

Otherwise, do you have any error message in your logs when you reboot your router, that relates to the integration? Or any log from nest_protect regarding fetching new token, unknown exception, etc...

@puterboy
Author

Which Nest Protect do you have wired or wireless?

I have 3 (wireless) Nest Remote Thermostat Sensors (one each for my 3 Nest thermostats)

How long did you wait after rebooting the router before restarting the integration?

I waited about 4 hours.

By then, I knew the temperature had changed several times based on other temperature sensors in the house.
Note: with the Nest Thermostat sensors, the temperature updates/changes every time the room temperature changes by about a degree.
I have tracked this closely for months.
Note that as soon as I restarted the integration, the temperatures updated as seen in HA and in its logs so it really seems that somehow the Nest Protect integration lost and was unable to regain connection either to my Nest thermostats or to the Google server.

What kind of router do you have as it requires regular reboots? And why does it require regular reboots?

I use dd-wrt and reboot weekly as it is a recommended best practice.

How is the IP assigned by your ISP? Do you have IPv4 or IPv6, fixed or dynamic? Do you have a public IP at all, or are you sitting behind a NAT?

IPv4 only.
IP assignment is dynamic but it doesn't change on reboot (I would have to disconnect for multiple hours or even days to get a new IP address from my ISP)

How is your IP assignment on your internal network? How is the IP assigned to HA, and how is it assigned to your Nest Protects?

IP address for HA is static.
The Nest sensor itself doesn't have an IP address (it connects to the Nest thermostat via Bluetooth). The Nest thermostat has its IP address set by DHCP but it is always the same (i.e., it gets assigned the same address as it did the first time it connected).

I really think that your issue is likely a wrong network configuration, not an integration issue, as I cannot reproduce it with a router reboot. And mine does take some time to reboot.

My network configuration is pretty simple and stable -- I am not doing anything crazy and I understand my setup very well.

Have you tried this with a Nest thermostat remote sensor?
Per your questions it seems like you are using a different type of nest device.

Otherwise, do you have any error message in your logs when you reboot your router, that relates to the integration? Or any log from nest_protect regarding fetching new token, unknown exception, etc...

No logs after I reboot. Only get logs again after restarting the integration -- at which point everything is working again normally.

Again, I think this is probably similar to the bug identified with the standard Nest integration (that I use for the thermostats themselves): "Google Nest Integration loses connection with Nest Thermostats..." (#130407)

What seems to happen, at least with the Nest Integration, is that after rebooting, the Google pub/sub API loses connection with the thermostat and aborts. Perhaps a similar issue occurs with Nest Protect.
(The maintainer of the standard Nest integration claims to have fixed this in dev code that will be released in 2024.2.x.)

NOTE: I need to use both the standard "Nest Integration" plus "Nest Protect" since "Nest Integration" is required for the Nest thermostats themselves (using the published Google Nest API) while "Nest Protect" is needed to capture the temperature of the Nest remote sensors (which is not included in the Google API)

BTW, thank you so much for helping me to troubleshoot this!!!!!

@puterboy
Author

The following is quoted from the update released to fix the problem with the classic Nest Integration:

Update python-google-nest-sdm to 7.0.0. This contains breaking API changes that use the async Google Cloud Pub/Sub APIs and move away from background threads. The APIs have been updated to make registration a little simpler, and internals require different approach to mocking. The tests have been updated to use simpler more generic fixtures based on AiohttpClientMocker, made with the minimal number of changes to existing tests.

This is meant to address home-assistant/core#130407 as a side effect by simplifying all the network code, and being more resilient to failures with a simpler subscriber loop and retry handling that does not involve any threads. The error handling was manually tested with breaking the network and observing everything can backoff nicely, then reconnect and resume.

Release Notes: https://github.com/allenporter/python-google-nest-sdm/releases/tag/7.0.0
Changes: allenporter/python-google-nest-sdm@6.1.5...7.0.0

Perhaps that is helpful...
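The "simpler subscriber loop and retry handling" with backoff described in the quoted release note can be illustrated generically. The following is a minimal sketch only, not the code of python-google-nest-sdm or of the Nest Protect integration: `connect_and_listen` is a hypothetical coroutine that returns when the server ends the stream and raises on network errors, and `base`, `cap`, `max_attempts` and `sleep` are illustrative parameters.

```python
import asyncio
import logging

_LOGGER = logging.getLogger("subscriber_sketch")


async def run_with_backoff(connect_and_listen, *, base=1.0, cap=300.0,
                           max_attempts=None, sleep=asyncio.sleep):
    """Keep calling connect_and_listen(), doubling the wait after each failure.

    connect_and_listen is a hypothetical coroutine function (an assumption,
    not a real library call). Returns the number of attempts made, which is
    only reachable when max_attempts is set.
    """
    delay = base
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            await connect_and_listen()
        except (OSError, asyncio.TimeoutError) as err:
            # Log every failure so a broken connection is never silent.
            _LOGGER.warning("Subscriber failed (%s); retrying in %.1fs", err, delay)
            await sleep(delay)
            delay = min(delay * 2, cap)  # exponential backoff, capped
        else:
            _LOGGER.info("Stream ended cleanly; reconnecting")
            delay = base  # a successful session resets the backoff
            await sleep(base)
    return attempts
```

The point of the pattern is that both outcomes (clean end of stream and network error) lead back to a new connection attempt, and each failure leaves a log entry.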

@GSzabados

GSzabados commented Jan 1, 2025

First of all, this is not the same API as the Nest integration.

Second, we need some logs to see what is happening and how it disconnects. If your HA loses connection, how do you know that your Nest thermostat is really publishing changes as well?

For me the odd thing is that I cannot reproduce it with a router restart or configuration reload, even by dropping all states from the router's table, which is like a full disconnect.

Otherwise, I have not heard of this suggested router restart thing, but sounds quite weird and suggests that something is not working well in the internals, if you need to clear out everything.

Do you have any other integration which fails like this?

Edit: Happy New Year! 🥳

@GSzabados

And thinking a bit more. Are you using NabuCasa? And can you access your HA remotely just after the router is rebooted? Because I think that after rebooting, the router likely just blocks any incoming connection to your HA server.

@puterboy
Author

puterboy commented Jan 1, 2025

Most happiest of New Years to you too!

First of all, this is not the same API as the Nest integration.
Second, we need some logs to see what is happening and how it disconnects.

Well the point is that there are NO Nest Protect debug log entries UNTIL I reload the Nest Protect integration. It's like the integration is just sitting there waiting for messages but never receiving them.

Is it possible to create a stand-alone python script that subscribes to the messages and prints them to stdout for example.
I could then instrument the code with additional debug statements to help narrow this down further...

If your HA loses connection, how do you know that your Nest thermostat is really publishing changes as well?

Well, I can see the Nest remote temperature sensor still updating in both the Google Nest App and Google Home.
So the Nest thermostat is still publishing remote sensor temperatures to the Google Nest cloud

For me the odd thing is that I cannot reproduce it with a router restart or configuration reload, even by dropping all states from the router's table, which is like a full disconnect.

Perhaps it depends on how long the timeout is before reconnecting?
Also, are you testing with a Nest Thermostat and Remote Sensor?

Otherwise, I have not heard of this suggested router restart thing, but sounds quite weird and suggests that something is not working well in the internals, if you need to clear out everything.

Per the DD-WRT wiki
"Scheduled reboot is often used to give the routers a little clearing now and then to keep its performance at its peak."

Do you have any other integration which fails like this?

Only the regular Nest Integration

@puterboy
Author

puterboy commented Jan 1, 2025

And thinking a bit more. Are you using NabuCasa?
Nope. Local Hassos install on RPi5.

And can you access your HA remotely just after the router is rebooted? Because I think that after rebooting, the router likely just blocks any incoming connection to your HA server.

Yes - I can immediately access HA after reboot via both web and ssh -- in fact, it doesn't even break existing ssh connections from within my LAN.

@GSzabados

in fact, it doesn't even break existing ssh connections from within my LAN.

I am asking about remote access. As for it not being broken on the LAN, I guess you might have some other network equipment on the LAN side as well, either a mesh in bridge mode or a switch.

Anyhow, how do you access HA remotely? Can you access it remotely, NOT on local LAN but for example 4G?

I still think that your router just closes the connection and even if a callback happens it does not let it through due to the reboot and whatever it does...

@puterboy
Author

puterboy commented Jan 2, 2025

I am asking about remote access. As for it not being broken on the LAN, I guess you might have some other network equipment on the LAN side as well, either a mesh in bridge mode or a switch.

Anyhow, how do you access HA remotely? Can you access it remotely, NOT on local LAN but for example 4G?

Sure... I can access it via my phone on 4G/5G using nginx and https

I still think that your router just closes the connection and even if a callback happens it does not let it through due to the reboot and whatever it does...

The reboot takes a couple of minutes... so it is possible that the callback occurs while the reboot is still occurring.

Can you share with me the logic of how connections and callbacks are supposed to be maintained and restarted?
That in turn will help me troubleshoot on my end...

@GSzabados

The reboot takes a couple of minutes... so it is possible that the callback occurs while the reboot is still occurring.

Really unlikely.

Can you share with me the logic of how connections and callbacks are supposed to be maintained and restarted?
That in turn will help me troubleshoot on my end...

I guess your router closes the connection without notifying the participants, so there is no reconnect. I have an idea, which I will discuss with @iMicknl, to make it more resilient.

@puterboy
Author

puterboy commented Jan 2, 2025

I was able to isolate and trigger the problem by simulating a WAN disconnect by adding and deleting the following firewall rules that block connection between my HA instance and Google Nest Cloud. No reboot or other router changes required.

  1. Disconnect HA from Google Nest Cloud
     iptables -I FORWARD -s homeassistant -o vlan2 -j DROP
     iptables -I FORWARD -i vlan2 -d homeassistant -j DROP
  2. Wait a few minutes for disconnect to register (I waited about 5 minutes but may not need to be that long)
  3. Remove above iptables rules
     iptables -D FORWARD -s homeassistant -o vlan2 -j DROP
     iptables -D FORWARD -i vlan2 -d homeassistant -j DROP

Note: vlan2 = WAN

Doing the above stops all updates from my Nest Remote Temperature Sensors from registering in HA.
To resume updates I need to either restart HA or reload the Nest Protect integration.

This is 100% reproducible and should confirm that the issue is caused by HA (and thus the Google Nest Protect Integration) losing connection to Google Nest Cloud (and/or vice-versa).

This also should rule out any user issues with my Router or my general WAN connectivity or my LAN setup & topology, or how I reboot the router or how I access HA etc...

@GSzabados

You just proved, that it is not a HA issue, it is a router configuration issue. The reboot should not drop those forwarding rules. I would check after reboot, what is happening with your iptables. You might have set up some rules which are not applied correctly, and blocking access remotely.

@puterboy
Author

puterboy commented Jan 2, 2025

You just proved, that it is not a HA issue, it is a router configuration issue. The reboot should not drop those forwarding rules. I would check after reboot, what is happening with your iptables. You might have set up some rules which are not applied correctly, and blocking access remotely.

I am confused about why you are so insistent on attributing the problem to my specific choice or configuration of router. The reboot doesn't drop any forwarding rules. It just temporarily breaks Internet connectivity.

The root cause is that the Nest Protect Integration fails to recover from temporarily lost Internet connectivity, regardless of cause -- a router reboot is just one of many ways to temporarily lose connectivity (as any router does while rebooting).

Regarding my previous post, I only dropped and re-added the iptables forwarding rules to simulate a temporary disconnection
I could have equally demonstrated such a disconnection with iptables on the HA device itself...
Or I could have just temporarily unplugged my HA device from the Internet...
Or unplugged my router from the cable box...
Or my ISP could have had a temporary network disruption...
Or Google could have a temporary issue with its cloud server...
Or any of the many other reasons one can temporarily lose network connectivity...

Indeed, just to be sure, I confirmed that temporarily physically unplugging the Ethernet cable to the HA device and restoring the physical connection also causes the Nest Protect Integration to fail and not recover (in contrast, other services like SSH recover when network connectivity is restored) -- nothing to do with my router.

At a minimum, this uncovers a lack of robustness in the HA Nest Protect Integration. This silent failure to reconnect is insidious because unless one is regularly observing the temperature traces for Nest sensors, one may never know that the integration needs restarting -- and many users may experience temporary Internet disruptions without even being aware of it.

Bottom line is that the root cause has nothing to do with my particular router choice, router configuration, network topology etc...

Can we focus on how one might go about fixing this?

As a temporary workaround, I have set up cron jobs that use the REST API to restart the integration after every (scheduled) reboot.
I also wrote automations to notify me of stale Nest Remote temperature sensors.
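The actual cron job and automations are not shown in the thread. As a rough sketch of the same idea, the following uses Home Assistant's REST API: an entity's `last_updated` timestamp (as returned by `GET /api/states/<entity_id>`) is compared against a maximum age, and a reload is requested through the `homeassistant.reload_config_entry` service. The base URL, the long-lived access token and the config entry ID are placeholders that must come from your own installation.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def is_stale(last_updated_iso, max_age, now=None):
    """Return True if an ISO 8601 last_updated timestamp is older than max_age."""
    now = now or datetime.now(timezone.utc)
    last = datetime.fromisoformat(last_updated_iso)
    return now - last > max_age


def build_reload_request(base_url, token, entry_id):
    """Build (but do not send) the POST that reloads one config entry.

    base_url, token and entry_id are placeholders for your own installation.
    """
    return urllib.request.Request(
        f"{base_url}/api/services/homeassistant/reload_config_entry",
        data=json.dumps({"entry_id": entry_id}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request is then a matter of `urllib.request.urlopen(req, timeout=10)`. Because sensor updates can legitimately be hours apart, the staleness threshold has to be generous, which limits how quickly this kind of check can notice a dead connection.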

It would seem that the right solution though is to make the integration itself more robust and self-aware.
Is there a heartbeat or keepalive possible? Can it occasionally ping the Google Nest Server to check for connectivity?
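One generic way to get watchdog behavior is an idle timeout around the receive call. This is a minimal sketch under stated assumptions, not the integration's code: `receive` is a hypothetical coroutine that returns the next message from the cloud stream, and `idle_timeout` is an illustrative parameter.

```python
import asyncio


async def listen_with_watchdog(receive, on_message, idle_timeout):
    """Relay messages until none arrives for idle_timeout seconds.

    receive is a hypothetical coroutine function returning the next message.
    When the stream stays silent for too long, asyncio.wait_for raises
    asyncio.TimeoutError, which the caller can treat as "connection may be
    dead" and respond to by tearing down and re-creating the connection.
    """
    while True:
        message = await asyncio.wait_for(receive(), timeout=idle_timeout)
        on_message(message)
```

This only distinguishes a dead connection from a quiet one if the stream carries some periodic traffic. Without that, a long stretch with no temperature changes looks the same as a lost connection, and the timeout simply forces a periodic reconnect.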

If it's not fixable, the documentation should at least warn the user that the integration may need to be restarted any time that HA loses connection with the Internet.

@GSzabados

GSzabados commented Jan 2, 2025

Your router after reboot blocks the incoming message from the Nest API. The integration waits for incoming messages. The problem is your router. It is not your internet connection, it is not your HA connecting to your router. It is your router.

Or I could have just temporarily unplugged my HA device from the Internet...
Or unplugged my router from the cable box...
Or my ISP could have had a temporary network disruption...
Or Google could have a temporary issue with its cloud server...
Or any of the many other reasons one can temporarily lose network connectivity...

Or Elon Musk can rule the world from tomorrow...

Is there a heartbeat or keepalive possible?

Yes, there is a keepalive message coming from the Nest API which is blocked by your router.

  • Aug 30 - Dec 14 - Doesn't lose connection

FYI. It was reconnecting every 5 minutes and meanwhile it was not updating anything, breaking the whole purpose of the integration. #347 (comment)

It would seem that the right solution though is to make the integration itself more robust and self-aware.

Feel free to contribute....

@puterboy
Author

puterboy commented Jan 2, 2025

Your router after reboot blocks the incoming message from the Nest API. The integration waits for incoming messages. The problem is your router. It is not your internet connection, it is not your HA connecting to your router. It is your router.

We seem to be talking past each other :(
The problem occurs even if no reboot -- it occurs any time there is a loss of network connectivity. It would happen no matter what router I have. It would happen even if I had no router and just connected HA raw to my cable modem

I know my router iptables rules thoroughly.
In summary, at a high level, regarding my HA device

  • All outgoing connections from my HA device are allowed (both to WAN and LAN)
  • All incoming connections from the WAN are blocked (unless ESTABLISHED)

This is all pretty basic.

Presumably though the problem is:

The integration waits for incoming messages.

That would explain the behavior since:

  • Pretty much any consumer router will block all NEW messages by default so the connection needs to be existing ESTABLISHED as initiated by HA
  • If a router (ANY ROUTER) reboots, then all existing ESTABLISHED connections are flushed anyway, so the Nest Protect Integration will end up waiting FOREVER for messages that will never arrive
  • If connectivity is otherwise temporarily lost, an ESTABLISHED connection will persist for some time before expiring, depending in part on the value of nf_conntrack_tcp_timeout_established (if Linux-based router). But after expiry, again messages will never arrive.

So if truly the Nest Protect Integration is just sitting around waiting for an incoming message, then a reboot or any of the other Internet disconnect use cases listed below will cause the integration to fail silently when the ESTABLISHED connection flushes or expires.
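The reasoning above can be made concrete with a toy model. This is a deliberately simplified illustration, not real netfilter behavior: the class name, its methods and the single timeout value are all made up for the example.

```python
class ToyConntrack:
    """Toy model of a stateful firewall that only admits replies to known flows."""

    def __init__(self, established_timeout):
        self.established_timeout = established_timeout
        self._flows = {}  # (lan_host, remote) -> time the flow was last seen

    def outbound(self, lan_host, remote, now):
        """A LAN host sends a packet: create or refresh the flow entry."""
        self._flows[(lan_host, remote)] = now

    def inbound_allowed(self, remote, lan_host, now):
        """A WAN packet is admitted only if a live flow entry matches it."""
        seen = self._flows.get((lan_host, remote))
        if seen is None or now - seen > self.established_timeout:
            self._flows.pop((lan_host, remote), None)  # expired or unknown
            return False
        self._flows[(lan_host, remote)] = now  # traffic keeps the entry alive
        return True

    def reboot(self):
        """A reboot flushes every tracked flow."""
        self._flows.clear()


fw = ToyConntrack(established_timeout=100)
fw.outbound("ha", "cloud", now=0)           # HA opens the connection
print(fw.inbound_allowed("cloud", "ha", now=50))   # True: reply to a known flow
fw.reboot()
print(fw.inbound_allowed("cloud", "ha", now=60))   # False until HA reconnects
```

In this model the only way to restore delivery after a flush or expiry is a new outbound packet from the LAN side, which is why a client that only waits never recovers.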

Or I could have just temporarily unplugged my HA device from the Internet...
Or unplugged my router from the cable box...
Or my ISP could have had a temporary network disruption...
Or Google could have a temporary issue with its cloud server...
Or any of the many other reasons one can temporarily lose network connectivity...

Or Elon Musk can rule the world from tomorrow...

Indeed, I left out losing Starlink satellite coverage which would be another possible cause :)

Of course all the above use cases do happen -- more frequently than you might imagine (Google Nest even had an outage a couple of weeks ago)

Is there a heartbeat or keepalive possible?

Yes, there is a keepalive message coming from the Nest API which is blocked by your router.

My router only blocks NEW connections from WAN -- as does pretty much any other consumer router on the market.
My router does NOT block ESTABLISHED connections -- similar to pretty much any other consumer router on the market.

Can you enlighten me on what IP addresses, protocols, ports and/or states you think my router is uniquely blocking versus all the other routers out there on the market?

But without any specifics of what IP address/protocols/ports/states need to be allowed, it's really impossible for me to address the situation -- other than to say that my router does not block any incoming messages that are not blocked by pretty much any other consumer router on the market.

  • Aug 30 - Dec 14 - Doesn't lose connection

FYI. It was reconnecting every 5 minutes and meanwhile it was not updating anything, breaking the whole purpose of the integration. #347 (comment)

That would explain why it used to work!!! Good catch!!!
As the 5 minute reconnect served as an inadvertent but effective watchdog process :)

It would seem that the right solution though is to make the integration itself more robust and self-aware.

Feel free to contribute....

Happy to... which is why I asked previously if there is any way to abstract this code from the Nest Protect Integration so I could play with it independent of the more complicated HA environment. That being said, I am not a coder and am a rank beginner at Python.

@GSzabados

GSzabados commented Jan 2, 2025

How often do you receive updates from the Nest API? And how long does a reboot take on your router?

@puterboy
Author

puterboy commented Jan 3, 2025

The updates are asynchronous -- only when the temperature changes by about 1 degree.
So it depends on how fast a room heats and cools.
So it can be anywhere from every 5 minutes to many hours depending on whether there is active heating or cooling along with the temperature delta between the room and outdoors.

A reboot takes about 3-5 minutes I would say.

Presumably the Nest thermostat itself has to also deal with and recover from intermittent Internet disconnects -- and it always does manage to do so without requiring a HW or SW reset... in fact, you can see a delay in that the thermostat seems to take a couple of minutes (1-3?) to recover from an offline event caused by losing Internet connection. So the Nest thermostat must be getting a heartbeat and/or pinging the server periodically.

(Note the Bluetooth-based Nest Remote Sensor communicates to the Internet via the WiFi-based thermostat rather than directly)

@GSzabados

Ok, what if you would try a different router?

I have disconnected my HA server for more than 5 minutes from my network, updates still came in as normal. Forced updates during the disconnect, and once connected those updates came in immediately.

The integration is resilient enough to survive these two cases on my side, so I can narrow it down that the issue is with your router and network setup.

Don't tell me, that the Nest Thermostat has a heartbeat and reconnects, that must communicate with the Nest server, that is its purpose, to get weather forecast and adjust your heating pattern. This integration is like the Nest app on your phone. It shouldn't continuously ping the servers.

As you keep rebooting your router, you can keep rebooting your HA as well or reloading the integration once you rebooted the router. It might help HA as well, if you keep it rebooting.

And just to give you some background on the router.

I used to have a 4G modem/router, which was disconnecting from the internet continuously, so I set up a script to check connectivity, and once it has failed for 15 seconds, then it rebooted the router. I was convinced that one of the base stations in the area must be faulty, and drops connections. It went this way, until I bought a different modem/router, and to my biggest surprise it did not drop connections anymore, and it did not require regular reboots, even it was from the same manufacturer.

@puterboy
Author

puterboy commented Jan 3, 2025

Ok, what if you would try a different router?

Per your advice, I swapped the router for another physical one (same popular, well-tested Netgear R6700 model) and got the same behavior so it's not the HW.
Given the maturity, stability, and widespread use of dd-wrt, I doubt it's the router firmware.
And given that I haven't changed any of the iptables (or ebtables) rules affecting my HA device and my network settings are otherwise standard, I don't think it's the config either.

I have disconnected my HA server for more than 5 minutes from my network, updates still came in as normal. Forced updates during the disconnect, and once connected those updates came in immediately.

The integration is resilient enough to survive these two cases on my side, so I can narrow it down that the issue is with your router and network setup.

The real question (in my mind) is why in my situation, the Nest Protect integration doesn't try to re-initiate connection with the Google FCM server when Internet connectivity is resumed -- while in your case, the HA Nest Protect integration seems to detect the disconnect and re-initiate connection.

Indeed, I don't see how my router firewall rules (or router behavior more generally) would allow for an initial (new) outbound connection to the Google FCM server but then block subsequent (new) connections when attempting to reconnect after a disconnect or reboot -- the router firewalls can't possibly know the difference since they just look at IP addresses, protocols, and ports when establishing new connections (and after a reboot, a new connection is needed for sure)

Without understanding further how the Nest Protect integration is supposed to reconnect to the Google server after losing connectivity, it's really hard for me to troubleshoot.

Don't tell me, that the Nest Thermostat has a heartbeat and reconnects, that must communicate with the Nest server, that is its purpose, to get weather forecast and adjust your heating pattern. This integration is like the Nest app on your phone. It shouldn't continuously ping the servers.

The Nest Thermostat app behavior is rightly different from the Nest integration with HA.
The Nest app only maintains a connection with the Google FCM servers (that allow push notifications) when the phone is on and the app is open and selected.
Shortly after closing the app or a little longer after closing the phone, the ESTABLISHED connection disappears and the Google FCM servers lose their ability to push anything to the phone -- which is OK since you only need to push weather or temperature data to the user or upload changes from the user when you are in the app.
And when you open the app again, it just establishes a new outbound connection to the Google FCM server.

In contrast, if the Nest Protect Integration relies on 'push' communications from Google servers, then it needs to maintain constant connectivity via an ESTABLISHED connection so that HA can receive asynchronous push notifications since the FCM server can't initiate inbound connections

As you keep rebooting your router, you can keep rebooting your HA as well or reloading the integration once you rebooted the router. It might help HA as well, if you keep it rebooting.

And just to give you some background on the router.

I used to have a 4G modem/router, which was disconnecting from the internet continuously, so I set up a script to check connectivity, and once it has failed for 15 seconds, then it rebooted the router. I was convinced that one of the base stations in the area must be faulty, and drops connections. It went this way, until I bought a different modem/router, and to my biggest surprise it did not drop connections anymore, and it did not require regular reboots, even it was from the same manufacturer.

Again to be clear. My router doesn't require weekly reboots - just people I respect in the router dev community have told me that it's a good practice to reboot periodically to clear out an accumulating "rot". Indeed, that is probably why most routers have a feature to allow regular reboots.

Also my router's connectivity is rock solid - I track it regularly...

@puterboy
Author

puterboy commented Jan 3, 2025

I was just looking through the pynest codebase and I don't see any code that checks for network connectivity and could detect a network disconnect. The code can detect server disconnections that are signaled by the server but not network disconnects themselves.

To do so it would need to have a heartbeat or keep-alive or network ping or OS-level check that the TCP connection is alive etc.
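As a rough illustration of the heartbeat idea, the sketch below tracks when data last arrived and flags the link as stale after too long a silence. `StalenessMonitor` and its methods are hypothetical names; nothing here is taken from the pynest code, and choosing a sensible `max_silence_s` assumes the server sends something at a reasonably regular rate.

```python
import time


class StalenessMonitor:
    """Tracks when data last arrived and reports when the link has been silent too long."""

    def __init__(self, max_silence_s, clock=time.monotonic):
        self.max_silence_s = max_silence_s
        self._clock = clock  # injectable so the logic can be tested without waiting
        self._last_seen = clock()

    def mark_alive(self):
        """Call whenever a message, heartbeat, or ping reply is received."""
        self._last_seen = self._clock()

    def silence_s(self):
        """Seconds elapsed since the last sign of life."""
        return self._clock() - self._last_seen

    def is_stale(self):
        """True once the silence exceeds the allowed maximum."""
        return self.silence_s() > self.max_silence_s
```

A receive loop would call `mark_alive()` on every message, and a periodic task would check `is_stale()` and trigger a reconnect (and a log entry) when it returns `True`.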

In practice:

  1. If the Nest Protect integration can detect a loss of connectivity, then such an "error" should presumably be logged along with any reconnect attempts... so if it is trying but failing to reconnect then that should be logged and I don't have any such error messages -- i.e., if my router causes reconnects to fail, then that failure should obviously be caught and logged.

  2. Conversely, if the Nest Protect integration can't detect connectivity losses, then that would explain why network disconnects cause the problems I am seeing. Said another way, if the integration can't detect that its connection to the Google servers has been lost, then it can't be expected to reconnect and will continue waiting for new messages that won't ever come. This would apply to any cause of network disconnection, whether due to "routers" or anything else.
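The logging expectation in the two points above could look roughly like the following sketch, where every failed reconnect attempt and the eventual success or give-up is logged. `reconnect_with_logging` and the `connect` callable are hypothetical and stand in for whatever the integration would actually use; the exception types caught are also an assumption.

```python
import asyncio
import logging

_LOGGER = logging.getLogger("reconnect_sketch")


async def reconnect_with_logging(connect, max_attempts=5, base_delay_s=1.0,
                                 max_delay_s=60.0, sleep=asyncio.sleep):
    """Retry `connect()` with exponential backoff, logging every outcome."""
    for attempt in range(1, max_attempts + 1):
        try:
            result = await connect()
        except (OSError, asyncio.TimeoutError) as err:
            if attempt == max_attempts:
                break  # no point sleeping after the final failure
            delay = min(base_delay_s * 2 ** (attempt - 1), max_delay_s)
            _LOGGER.warning("Reconnect attempt %d/%d failed: %s; retrying in %.0fs",
                            attempt, max_attempts, err, delay)
            await sleep(delay)
        else:
            _LOGGER.info("Reconnected on attempt %d", attempt)
            return result
    _LOGGER.error("Giving up after %d failed reconnect attempts", max_attempts)
    raise ConnectionError(f"could not reconnect after {max_attempts} attempts")
```

With something like this in place, a setup that tries and fails to reconnect would leave warnings in the log, while a setup that never notices the disconnect would leave nothing, which makes the two cases distinguishable.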

Are you using a Nest Remote Sensor?
Is it possible that other Nest devices have more regular communication whose absence could be detected?

Are you using HAOS?
If not, is it possible that your setup responds differently to network disconnects?

In summary

  • If my setup is disconnecting and failing to reconnect then such failure should be logged.
  • If your setup is disconnecting and successfully reconnecting, then such re-login attempts and successes should be logged
  • If the Nest Protect integration can't detect network disconnects, then that should be fixed since there are many things that can lead to temporary network disconnects of varying lengths and silent failure is dangerous.
  • Finally, just because your network disconnects don't lead to terminated and/or non resumable connections, doesn't mean that other setups behave nicely... it just may mean that you aren't fully disconnecting in a way that drops the connection or there is something else on your network that is maintaining the connection even when HA is fully disconnected.

@GSzabados

  • If my setup is disconnecting and failing to reconnect then such failure should be logged.
  • If your setup is disconnecting and successfully reconnecting, then such re-login attempts and successes should be logged
  • If the Nest Protect integration can't detect network disconnects, then that should be fixed since there are many things that can lead to temporary network disconnects of varying lengths and silent failure is dangerous.
  • Finally, just because your network disconnects don't lead to terminated and/or non resumable connections, doesn't mean that other setups behave nicely... it just may mean that you aren't fully disconnecting in a way that drops the connection or there is something else on your network that is maintaining the connection even when HA is fully disconnected.

Please add it all. You just need to handle the right exceptions and add more debug logging. Feel free to read the documentation of asyncio and aiohttp; you could also rewrite the whole logic of the code to take care of checking the connection state.

Here is some unsupported documentation on the subject:

https://www.home-assistant.io/more-info/unsupported/connectivity_check/

Otherwise, while you are digging into the codebase and have reached an expert level in the disconnection problems, try changing the following value at line 278 of client.py to 600:

timeout = 3600 * 24
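The thread does not show what this value controls. Assuming it bounds how long a single long-lived request may sit waiting for data, `3600 * 24` is 86400 seconds (24 hours), so a silently dead connection could go unnoticed for up to a day, whereas 600 would cap that at 10 minutes. The sketch below illustrates the general effect with `asyncio.wait_for`; `wait_for_push` and `receive` are hypothetical names, not part of pynest.

```python
import asyncio


async def wait_for_push(receive, timeout_s):
    """Wait for one pushed message; return None if nothing arrives within timeout_s."""
    try:
        return await asyncio.wait_for(receive(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Nothing arrived in time: the caller can treat this as a cue to
        # tear the connection down and open a fresh one.
        return None
```

The trade-off is that a shorter timeout means more frequent reconnects even on a healthy link, so the value should stay comfortably above the normal gap between messages.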

And to note, I am not a coder myself either, but your essay-writing skills amaze me, all for a bloody cloud-connected temperature sensor... Maybe you should put this effort somewhere else and contribute to something instead of whining about it.

@GSzabados

Crickets... I can hear crickets...
