Entities not leaving Unavailable state until after integration reload #235

Closed
walberjunior opened this issue Jul 10, 2024 · 21 comments · Fixed by #269

@walberjunior

walberjunior commented Jul 10, 2024

Version of the custom_component

Bermuda v0.6.7

Configuration

HA:
Core 2024.6.4
Supervisor 2024.06.2
Operating System 12.4
Frontend 20240610.1
Esphome 2024.6.6

esphome config:

bluetooth_proxy:
  active: True

esp32_ble_tracker:
  scan_parameters:
    interval: 320ms
    window: 300ms

Describe the bug

If the device to be tracked is out of range of the esp board when restarting the HA, the entity remains unavailable until the integration is manually reloaded.

image (11)
image

Debug log

30:ae:xx:xx:xx:34:
  name: display_proxy
  local_name: null
  prefname: null
  address: 30:ae:xx:xx:xx:34
  options:
    attenuation: 3
    devtracker_nothome_timeout: 30
    max_area_radius: 20
    max_velocity: 3
    ref_power: -55
    smoothing_samples: 20
    update_interval: 10
    configured_devices:
      - D1419F0xxxxxxxxxx7958_100_40004
  unique_id: 30:ae:xx:xx:xx:34
  address_type: bd_addr_type_unknown
  area_id: frente
  area_name: Frente
  area_distance: null
  area_rssi: null
  area_scanner: null
  zone: not_home
  manufacturer: null
  connectable: false
  is_scanner: true
  beacon_type: []
  beacon_sources: []
  beacon_unique_id: null
  beacon_uuid: null
  beacon_major: null
  beacon_minor: null
  beacon_power: null
  entry_id: 2dafa38xxxxxxxxxaf01a1dc3ff8
  create_sensor: false
  create_sensor_done: false
  create_tracker_done: false
  last_seen: 0
  scanners: {}
24:6f:xx:xx:xx:64:
  name: Cama
  local_name: null
  prefname: null
  address: 24:6f:xx:xx:xx:64
  options:
    attenuation: 3
    devtracker_nothome_timeout: 30
    max_area_radius: 20
    max_velocity: 3
    ref_power: -55
    smoothing_samples: 20
    update_interval: 10
    configured_devices:
      - D1419F0xxxxxxxxxx7958_100_40004
  unique_id: 24:6f:xx:xx:xx:64
  address_type: bd_addr_type_unknown
  area_id: bedroom
  area_name: Quarto
  area_distance: null
  area_rssi: null
  area_scanner: null
  zone: not_home
  manufacturer: null
  connectable: false
  is_scanner: true
  beacon_type: []
  beacon_sources: []
  beacon_unique_id: null
  beacon_uuid: null
  beacon_major: null
  beacon_minor: null
  beacon_power: null
  entry_id: 519388xxxxxxxxxxx15d22cb68c
  create_sensor: false
  create_sensor_done: false
  create_tracker_done: false
  last_seen: 0
  scanners: {}
94:e6:xx:xx:xx:28:
  name: Espgaragem
  local_name: null
  prefname: null
  address: 94:e6:xx:xx:xx:28
  options:
    attenuation: 3
    devtracker_nothome_timeout: 30
    max_area_radius: 20
    max_velocity: 3
    ref_power: -55
    smoothing_samples: 20
    update_interval: 10
    configured_devices:
      - D1419F0xxxxxxxxxx7958_100_40004
  unique_id: 94:e6:xx:xx:xx:28
  address_type: bd_addr_type_unknown
  area_id: garagem
  area_name: Garagem
  area_distance: null
  area_rssi: null
  area_scanner: null
  zone: not_home
  manufacturer: null
  connectable: false
  is_scanner: true
  beacon_type: []
  beacon_sources: []
  beacon_unique_id: null
  beacon_uuid: null
  beacon_major: null
  beacon_minor: null
  beacon_power: null
  entry_id: 6e4adddaebxxxxxxxxxxxxxxxxxxxf939ab9
  create_sensor: false
  create_sensor_done: false
  create_tracker_done: false
  last_seen: 0
  scanners: {}

agittins self-assigned this Jul 11, 2024

@kafisc1

kafisc1 commented Jul 13, 2024

Can confirm the issue here.

Core 2024.7.2
Supervisor 2024.06.2
Operating System 12.4
Frontend 20240710.0
Esphome 2024.6.6

@agittins
Owner

I've been having trouble reproducing this issue. I tried the following:

  • I have my android phone running the companion app, configured to transmit iBeacon.
  • It is also set up as an IRK device, but I was just checking the configured iBeacon sensor
  • Turned off phone transmitter (by disabling it in the app)
  • Waited for sensors to go "Unknown" (or "Away" in the case of the device_tracker sensor)
  • Restart HA
  • Sensors remain at Unknown and Away (not "unavailable")
  • Tap "Enable Sensor" in companion app, in less than a second Area, Distance and Tracker sensors all reported presence.

I've tested this on my dev machine, which has only a single ESPHome tracker, and on my production server, which has six ESPHome proxies and a local bluetooth adaptor.

Is there anything in my test actions that seems different from what you are experiencing?

I repeated the test on my dev machine with a different device (a usb power thingie, it's not an iBeacon so Bermuda is tracking just its normal BLE advertisements). Same results - within three seconds (its bootup time) it showed as present.

Questions:

  • does the above process replicate the issue on your end, or are there different steps?
  • If you can update (via HACS) to the main version, you should have a new "Download diagnostics" option. First get your system into the state where it's showing "Unavailable" even though the device is transmitting, and download the diagnostics. Then restart the integration so the sensors are working, and download the diagnostics again. If you send me both files, that would probably help a lot.

The "Download Diagnostics" menu is here (only in main or any version released after 14th July 2024):
image

@walberjunior
Author

My iBeacon arrived, so I can test with more devices besides my smartphone.

Your steps are correct, the only differences are:
I don't disable the sensor, I take the smartphone away from the proxy.
Sometimes it happens when restarting the HA and sometimes when restarting the HOST. In my case it is a VM on Proxmox.
I would say the failure happens 60~70% of the time.

I'm already on the latest version. I tried redoing the download via HACS, but the "Download diagnostics" option still didn't appear.

I made a video, this time restarting the HA I had no problems, but when restarting the HOST the problem appeared.
At 4:20 the devices are already close to the proxy, approximately 50cm away.
https://youtu.be/o4WfH2NIPBc
How did you configure IRK? I searched, but I only found information for the iPhone.

@spetryjohnson

I came here to write up my own issue, but I think this might cover it.

I recently converted a bunch of ESPresense nodes to Bermuda and I'm having a rough time of it. It seems like nodes just stop tracking my phone, and presence will be "Unknown" even when I'm sitting right next to the ESP32.

Rebooting the ESP32 always fixes it, but I'm wondering if HA restarts are the issue.

I'll do some testing with your repro steps and report back; in the meantime, I just wanted to bump this thread so I get notified if you find a solution ;)

@agittins
Owner

Hi Seth, that's interesting, I hadn't considered that it could be related to the proxies, but it may be possible.

Frankly I haven't been able to replicate the issue yet. On my systems it comes up fine every time.

I do get proxies that stop reporting sometimes, and there are a few possible reasons for that, but I haven't seen them fail to come back after an HA restart. It's entirely possible, though: they need to reconnect when HA reboots, so if they're having trouble they might not manage to reconnect.

I wouldn't expect a reload of just Bermuda to fix that though. Have you found that clicking "Reload" per the screenshot below restores operation, or just the esphome restart, so far?

image

Some things to consider if your proxies stop sending updates:

  • Have they been flashed by USB/serial at least once since moving from ESPresense, or only OTA? I think ESPHome changed the flash layout not too long ago, which improves BLE stability, but it only applies that change if it's flashed via USB.
  • What window and interval settings are you using? If the timings are too close that might cause it to randomly stop working
  • Are you using C3 boards? (probably not if you had them for a while) but the C3 modules can benefit from some specific settings - https://github.com/agittins/bermuda/wiki/ESPHome-Configurations#esp32-c3-modules
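
As a rough way to compare window and interval settings, the sketch below computes how much of each scan interval is spent scanning, using the values from the original report (interval 320ms, window 300ms). It is illustrative only: `parse_ms` and `scan_duty_cycle` are hypothetical helper names, not part of ESPHome or Bermuda, and this thread does not say what ratio counts as "too close".

```python
def parse_ms(value: str) -> int:
    """Parse a duration written like '320ms' into whole milliseconds.

    Only the 'ms' suffix is handled here (an assumption made for brevity).
    """
    if not value.endswith("ms"):
        raise ValueError(f"expected a value like '320ms', got {value!r}")
    return int(value[:-2])


def scan_duty_cycle(interval: str, window: str) -> float:
    """Return the fraction of each scan interval spent actively scanning."""
    interval_ms = parse_ms(interval)
    window_ms = parse_ms(window)
    if window_ms > interval_ms:
        # A scan window cannot be longer than the interval that contains it.
        raise ValueError("window must not exceed interval")
    return window_ms / interval_ms


def idle_gap_ms(interval: str, window: str) -> int:
    """Milliseconds per interval left over for everything except scanning."""
    return parse_ms(interval) - parse_ms(window)


if __name__ == "__main__":
    # Values taken from the esp32_ble_tracker config in the original report.
    print(scan_duty_cycle("320ms", "300ms"))
    print(idle_gap_ms("320ms", "300ms"))
```

With the reported settings this works out to a duty cycle of 0.9375, leaving 20 ms of each interval for non-scanning work.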

Failing that, could you post one of your esphome yaml configs? I can take a look and see if there's anything that might be causing issues - BLE scanning does tax the esp32 a bit, so sometimes having other components active can make them less stable.

I'll be interested to hear if reloading Bermuda causes different behaviour, too. I'm keen to try and get to the bottom of this one.

@spetryjohnson

@agittins thanks for the reply! The more I look into it the less I'm convinced I have the same issue as this one, because reloading the Bermuda integration didn't seem to make a difference, and today I had proxies stop working without an HA restart.

I'm in the middle of opening a new issue but it will take me a bit to collect the debug logs and everything else. I can also do some testing with different interval settings and the like and report my results.

I'm using D1 mini boards that are fully dedicated to the BLE proxy, so I can rule out issues w/ a C3 processor or other ESPHome tasks causing issues.

I'm also going to try and repro this issue as it was reported, just to help determine if there's a common root cause or not.

Stay tuned!

@hellcry37

hellcry37 commented Jul 29, 2024

come here to write about this lol so I have exactly the same issue, new everything all working but after a while it all goes unavailable ... always

@agittins
Owner

> come here to write about this lol so I have exactly the same issue, new everything all working but after a while it all goes unavailable ... always

Hi @hellcry37, sorry you're having trouble! Can you do a 'download diagnostics' before and after the issue occurs? There's instructions on this comment: #235 (comment)

If you can upload the two files it gives you I can see if I can work out what's going on.

@hellcry37

> > come here to write about this lol so I have exactly the same issue, new everything all working but after a while it all goes unavailable ... always
>
> Hi @hellcry37, sorry you're having trouble! Can you do a 'download diagnostics' before and after the issue occurs? There's instructions on this comment: #235 (comment)
>
> If you can upload the two files it gives you I can see if I can work out what's going on.

I've removed everything, but OK, I'll reinstall it and help you debug. I'll try this tonight and hope to have some debug output by tomorrow.

@jeremysherriff

Hi all
I had a similar but possibly different issue which is what @agittins has referenced in 264 - but was unrelated to the original request in 264 and I shouldn't have crossed the streams :)

I encountered this the other day, and reloading the integration did not resolve it.
Weirdly the entire integration appeared to die, and the new diagnostics info in the Config Flow indicated that things were totally broken somehow. Note the "0 active" for both endpoint devices and scanner devices:
image

I have the json debug info and debug logs at home and will attach them when I can.
Interestingly, they indicate (to me) that the integration was actually completely healthy with BLE data coming in and all BLE receivers participating.

The fix was for me to change the batteries in the device I was tracking (my cat, via a pet fitness collar) - as soon as I did that, everything started working again. This is the only device I am tracking.

I believe that the integration configuration screen indicating a system-wide issue with no devices being seen (above screenshot) is in error.

I should note that I also rebooted the HA instance, rebooted the proxmox host (which has USB passthrough for the dongle), and eventually removed the Bermuda integration completely, rebooted HA, and reinstalled Bermuda. The reinstall was successful in that I could then see other devices and the dongle and proxies showed as online, but the device with low battery was not found.

I'll upload the logs I have as soon as I can access them.

@agittins
Owner

> Logs as promised

Excellent, thanks!

That did the trick - the diags and logs showed me what was going on - if there are no "available" configured devices at start-up, then Bermuda never got called to update later on even if a device appeared.

This explains why some folk found it recovered after a reload - it was contingent on there being a configured device visible at the time, otherwise it would still not get any updates.

I think the various presentations of this bug ARE NOW FIXED with the latest commit, which can be tested by updating to main in HACS, and will be in the next release - which I will push out once I am confident that the fix is working and I haven't broken anything else in the meantime!
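
To illustrate the failure mode described above, here is a minimal toy model. It is not Bermuda's actual code: the `ToyCoordinator` class, its `always_poll` flag and the polling mechanism are all assumptions made for illustration. It only shows the general pattern where updates are skipped unless something was already visible at start-up, so a device that appears later is never picked up until a reload.

```python
class ToyCoordinator:
    """Toy model of a polling coordinator (hypothetical, not Bermuda's code)."""

    def __init__(self, visible, always_poll):
        self.visible = visible          # shared set of addresses currently advertising
        self.always_poll = always_poll  # False models the bug, True models a fix
        self.configured = []
        self.entities = {}              # address -> state, created on first sighting

    def setup(self, configured):
        """Start-up (or integration reload): reset state and do one poll."""
        self.configured = list(configured)
        self.entities = {}
        self._poll()

    def tick(self):
        """Periodic update. The buggy variant only polls if entities exist."""
        if self.always_poll or self.entities:
            self._poll()

    def _poll(self):
        for address in self.configured:
            if address in self.visible:
                self.entities[address] = "home"
            elif address in self.entities:
                self.entities[address] = "not_home"

    def state(self, address):
        # An entity that was never created reads as unavailable.
        return self.entities.get(address, "unavailable")


if __name__ == "__main__":
    address = "aa:bb:cc:dd:ee:ff"  # made-up address for the demo
    visible = set()

    buggy = ToyCoordinator(visible, always_poll=False)
    buggy.setup([address])   # device out of range at start-up
    visible.add(address)     # device comes back into range
    buggy.tick()
    print(buggy.state(address))

    buggy.setup([address])   # manual reload while the device is visible
    print(buggy.state(address))
```

In this model a reload only helps when the device happens to be visible at that moment, which matches the behaviour reported in this thread. Constructing the coordinator with `always_poll=True` lets the later `tick()` pick the device up without a reload.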

@jeremysherriff

Great work! I had a feeling this might be due to me having only a single device tracked, whereas you @agittins probably have quite a lot more :)

I've "updated" to main and will call out if there are any issues experienced.

@hellcry37

I was about to add logs as well, since it just happened again, but I believe this is the same situation as jeremysherriff's. Thank you!

@agittins
Owner

agittins commented Aug 3, 2024

Awesome, fingers crossed that's sorted it!

@jeremysherriff

No issues so far

@spetryjohnson

spetryjohnson commented Aug 4, 2024

> I think the various presentations of this bug ARE NOW FIXED with the latest commit, which can be tested by updating to main in HACS

I must be doing something stupid; I tried updating to main, but it fails with this log entry:

Download failed - Got status code 404 when trying to download https://github.com/agittins/bermuda/releases/download/main/bermuda.zip
Traceback (most recent call last):
  File "/config/custom_components/hacs/base.py", line 730, in async_download_file
    raise HacsException(
custom_components.hacs.exceptions.HacsException: Got status code 404 when trying to download https://github.com/agittins/bermuda/releases/download/main/bermuda.zip

And sure enough, "github.com/agittins/bermuda/releases/download/main/bermuda.zip" gives me a 404.

I went into HACS -> Bermuda -> Redownload, clicked to show beta versions, and then picked "main". Is that not the right process?

@jeremysherriff

@spetryjohnson you are correct, not sure why it doesn't work correctly for this integration/github repo.
I believe you have a few options:

  1. Remove the integration completely, and then download it again and choose main. You won't need to enable beta versions, it's there at the bottom of the list. Or,
  2. Use ssh to access your HA instance, go to /config/custom_components/bermuda, and then use wget to download the raw version/content of the one changed file (noted in the commit), overwriting the existing file. Or lastly,
  3. Manually edit the existing file, making the same edits that you see from the commit.

I went with option 2 because it is easiest and didn't require removing and reconfiguring everything.
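
For anyone who prefers Python to wget, option 2 can be sketched roughly as below. Treat it as an assumption-laden example: `changed_file.py` is a placeholder (the real file name is the one noted in the commit), the `custom_components/bermuda/` path inside the repository is assumed from the usual layout of custom integrations, and `raw_url` / `fetch_to` are hypothetical helper names.

```python
from pathlib import Path
from urllib.request import urlopen

RAW_BASE = "https://raw.githubusercontent.com"


def raw_url(owner: str, repo: str, branch: str, path: str) -> str:
    """Build the raw-content URL for one file on a given branch."""
    return f"{RAW_BASE}/{owner}/{repo}/{branch}/{path.lstrip('/')}"


def fetch_to(url: str, dest: Path) -> int:
    """Download url and overwrite dest; returns the number of bytes written."""
    with urlopen(url, timeout=30) as response:
        data = response.read()
    dest.write_bytes(data)
    return len(data)


if __name__ == "__main__":
    # Placeholder file name: substitute the file named in the commit.
    name = "changed_file.py"
    url = raw_url("agittins", "bermuda", "main", f"custom_components/bermuda/{name}")
    print(url)
    # Uncomment to actually overwrite the installed file (back it up first):
    # fetch_to(url, Path("/config/custom_components/bermuda") / name)
```

Restart Home Assistant afterwards so the changed file is loaded.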

@agittins
Owner

agittins commented Aug 5, 2024

Ugh. Yeah, I hadn't noticed that. I think this is because I recently changed the hacs.json to use the zip file releases, so that it would show version numbers in the HA gui.

Looks like it might be broken for the 'default branch' feature though, or maybe I've got something set up wrong. I'll see what I can work out.

It looks like perhaps using zip releases is mutually exclusive with being able to include the default branch version, which I don't understand: hacs/integration#3513 - seems to me it would make sense to use zips, but use a checkout for the default branch.

Hmm. I'm going to revisit how I do that, because in-app version reporting is great, but being able to get folks to try out main in an easy way is greater. :-/

@jeremysherriff

jeremysherriff commented Aug 5, 2024

@agittins I suspect that is what the "show beta versions" toggle is for? As in, you may need to establish a beta/dev branch and promote changes through from there.

@agittins
Owner

agittins commented Aug 6, 2024

> @agittins I suspect that is what the "show beta versions" toggle is for? As in, you may need to establish a beta/dev branch and promote changes through from there.

My understanding is that show beta versions is specifically for releases tagged as beta versions. The "default branch" (which is main in Bermuda, usually main or master in most repos) is another thing, and is available without having to select beta versions. It's all pretty well documented, just not the very specific bit I mention above.
