Bouncing back & forth between correct room and "unknown" #301

Open
devindudeman opened this issue Sep 7, 2024 · 43 comments
Labels
moreinfo More information required to progress further

Comments

@devindudeman

Newest version, Default configuration

image
image
image

Bermuda works perfectly, except that it's going back and forth between "unknown" and correct area every minute. Very strange! Any suggestions?

Using Apollo MSR-1 ESP32's.

@agittins
Owner

agittins commented Sep 7, 2024

Howdy!

Yes, that is odd. I think what might be happening is that we're not getting frequent enough updates from the MSR1, perhaps.

If you can do a "download diagnostics" in Bermuda and upload the results, that will give me a very clear idea of what's going on. Note that it may take a while to generate the diagnostics; it will be quicker if you do it a few minutes after a restart.

What config are you using on the MSR-1? It looks like their default config might not be ideal for a BLE proxy, but the diags will give me more to go on.

@devindudeman
Author

devindudeman commented Sep 7, 2024

Howdy!

Thanks for taking a look.

My esphome config for these devices was simple, and I didn't do much to get them set up as proxies. I'm pretty new to esphome in general.

config_entry-bermuda-01J755TP00Y4S3Z6AZWCTGAV2Y(1).json

Attached my diagnostics!

substitutions:
  name: apollo-msr-1-06b3e4
  friendly_name: Apollo Multisensor Mk1 (MSR-1) 06b3e4
packages:
  ApolloAutomation.MSR-1: github://ApolloAutomation/MSR-1/Integrations/ESPHome/MSR-1.yaml
esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}
api:
  encryption:
    key: !secret encryption_key

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  power_save_mode: LIGHT

bluetooth_proxy:
  active: true

@rikardkjell

image
I'm experiencing the same problem. Is there a way to create a sensor that reflects the last "known" state, ignoring the Unknown/Unavailable ones?

@agittins
Owner

Attached my diagnostics!

Great, thanks! I can see that HA is not receiving regular updates from your proxies, and this is why Bermuda will keep reporting them as "unknown". msr-1-a54354 shows these intervals between packets from your configured iBeacon:

image

This is pretty odd. Usually you'd see intervals between 1 second and maybe 4 or 5 seconds, but here we see 19 seconds or more quite commonly, which is unusual. These are the advertisements that HomeAssistant sees. So the source of the issue is almost certainly either of:

  • Your proxy is not sending updated advertisements. Maybe because it keeps rebooting, or perhaps the config is turning off bluetooth at startup, and it reboots
  • Your iBeacon is only sending adverts every 19 seconds or so. This would be pretty unusual, since beacons should be sending at least every 2 seconds, if I recall correctly.

The easiest way to rule out option 2 is to grab a phone and install an app like NRFConnect on it. It will show you the BLE devices the phone can see, and tell you the average advertising interval. If you click "More" it will show you a graph with the advertising packets received arranged by time vs signal strength. If that shows regular adverts (at somewhere between 200ms and 2000ms) then we know the beacon is behaving normally, and we can concentrate on the config of your proxies.

I suspect the most likely issue is your proxy, but I haven't had a chance to dig into it properly yet. The config you posted includes a package from their repo, and at first glance it does look like it turns off the bluetooth at bootup but that's just to disable the radar module's bluetooth.

My guess is that since this is an esp32-C3 board, it might need some extra tweaks. The C3 boards only have a single cpu core, so they need a little extra help to get bluetooth and wifi working solidly at the same time.

I'm going to find a way to make this simpler to fix, but in the meantime if you wanted to try some things with altering the firmware, @grigi has given some excellent notes on making esp32-C3 devices work more reliably, which I'll be integrating into the wiki soon.

@grigi

grigi commented Sep 17, 2024

A slightly different issue: I have some really old, low-spec tablets that I gave the kids. They have the HA app installed to run the BLE beacons, which then struggle under load when the kids play Minecraft, as Android rightfully gives it more resources.
So when they play it, their location flip/flops between the room they are in and unknown. It gets as bad as a 5 minute cycle.

It almost always correlates with Minecraft time 😅

Having "last seen in" and "last seen at" sensors would significantly help for some automations as I really only want to know which room they are in, as they pretty much never leave the house.

@rikardkjell

I run bluetooth proxy on a couple of XIAO ESP32S3 boards from Seeed Studio. I'm assuming the stability fix above is applicable to the S3 as well?

@grigi

grigi commented Sep 17, 2024

I run bluetooth proxy on a couple of XIAO ESP32S3 boards from Seeed Studio. I'm assuming the stability fix above is applicable to the S3 as well?

Honestly, all that does is give the ESP device:

  1. A chance to boot up without doing useless BLE scanning while not connected to HA
  2. More time spent listening for BLE messages.

So I'd basically do the same as the C3 example I have at the end of the post, but change the esp32: section to instead have:

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf

The sdkconfig_options: may or may not help things (or even work at all). I have an s3 device but it's not used for BLE scanning at the moment.

But for the S3, I'd say the most important is the scan_parameters.
Setting those up even on full ESP32's helps lots!

@rikardkjell

rikardkjell commented Sep 17, 2024

Super. Thanx. I'll try it out....
Screenshot 2024-09-18 17 21 40
Looking much better !!

@rikardkjell

Now I have another strange problem...
Screenshot 2024-09-18 20 18 20
Area is always "Unknown" even though the log says it picks up changes in area...

@rikardkjell

Now I have another strange problem... Area is always "Unknown" even though the log says it picks up changes in area...

Strangely enough this solved itself after a couple of restarts, both "soft" and a "hard one". No worries there

@agittins
Owner

@grigi
Having "last seen in" and "last seen at" sensors would significantly help

Yes, hopefully #202 will address that for you specifically.

@rikardkjell

Area always "Unknown" even though log say it picks up changes

There's a fix for #235 in the beta release v0.6.9rc2, which might be the issue you had here. If it happens again please feel free to open a new issue, and if you could include the result of a "download diagnostics" as well that will help a lot to work out what's going on.

@devindudeman did you have any luck improving the situation? The two main things to try are:

  • Run NRFConnect or a similar app on your phone, and use it to verify how often your device is sending advertisements (I'd typically expect somewhere between 0.2 seconds and 2 seconds). This just rules out the problem being the transmitting side of things.
  • Try out the suggestions from @grigi for the config. Their notes have been incorporated into the Wiki page now, so you can just work directly from there. If you're feeling a bit stuck with the whole re-flashing-the-device thing let us know and we can help you out with that.

@rikardkjell

I'll look in to that. Thank you very much for your help and support.

@rikardkjell

release v0.6.9rc2
The v0.6.9rc2 release seems like a good alternative for me. Mille grazie, signore agittins!

@rikardkjell

It's now, with a little automation magic, playing along quite well!!
Thank you for great support.
Screenshot_20241004-161448

@agittins
Owner

agittins commented Oct 4, 2024

playing along quite well!!
Thank you for great support.

Woohoo! That's great, and you're welcome 🤗

@agittins
Owner

@devindudeman how have you gone with this? Have any of the suggestions worked for you, or still looking for a chance to try them out?

agittins added the moreinfo (More information required to progress further) label Oct 15, 2024

@convicte

First, thank you so very much for a relatively simple and effective tool for room location purposes!

Second, I have implemented all the changes I was able to find across several open issues here and still have a similar problem with the sensor bouncing back and forth, especially as I sit in my office.
image

@convicte

It's now, with a little automation magic, playing along quite well!!

Would you mind sharing your template sensor or automation you've implemented for filtering?

Cheers!!

@rikardkjell

rikardkjell commented Oct 15, 2024

First, thank you so very much for a relatively simple and effective tool for room location purposes!

Second, I have implemented all the changes I was able to find across several open issues here and still have a similar problem with the sensor bouncing back and forth, especially as I sit in my office. image

I solved the glitchiness from a room to the "unknown" state by adding a "layer". I created a Helper text sensor that presents the last known area. Then I use an Automation that pushes the values via an "input_text.set_value" action into this sensor, only if the state isn't "Unavailable" or "Unknown".
See the code below, which has been cut down a bit, so it might not be 100% correct...

alias: "Tracker: Pixel/Kronaby Area Helper"
description: ""
triggers:
  - entity_id:
      - sensor.kronaby_area
    id: kronaby
    trigger: state
conditions: []
actions:
  - choose:
      - conditions:
          - condition: and
            conditions:
              - condition: trigger
                id:
                  - kronaby
              # Skip the update when the area sensor has dropped out.
              - condition: not
                conditions:
                  - condition: state
                    entity_id: sensor.kronaby_area
                    state: unavailable
              - condition: not
                conditions:
                  - condition: state
                    entity_id: sensor.kronaby_area
                    state: unknown
        sequence:
          # Copy the current (valid) area into the helper.
          - action: input_text.set_value
            target:
              entity_id: input_text.kronaby_last_known_area
            data:
              value: "{{ states('sensor.kronaby_area') }}"
mode: single
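
For anyone who finds the automation hard to follow, the rule it implements is small: keep the previous value unless the new state is a real area. Here is a minimal Python sketch of just that rule. The function name `latch_area` is made up for illustration, and this is not code from Bermuda or Home Assistant.

```python
# States that should never overwrite the last known area.
IGNORED_STATES = {"unknown", "unavailable"}


def latch_area(last_known, new_state):
    """Return the value to store in the helper after a state change."""
    if new_state.lower() in IGNORED_STATES:
        return last_known  # dropout: keep whatever we had before
    return new_state


# Replay a bouncing sequence similar to the ones described in this thread.
last_known = None
for state in ["Office", "unknown", "Office", "unavailable", "Kitchen"]:
    last_known = latch_area(last_known, state)

print(last_known)
```

Note that this only hides dropouts to "unknown". It does nothing about bouncing between two real areas.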

@convicte

Thanks for clarifying - I think I'll take a stab at it with a template sensor, as the automation feels excessive.
Will have to deal with edge cases where unknown is a valid output, since someone may have left or have gone to a place where a proxy doesn't exist.

Hopefully this is only temporary and a more root-cause solution can be implemented, as per @agittins in #202.

Cheers!!

@PlayFaster

Hi. Like some others, I also see the issue of area bounce when my phone is not moving. For me, much of this seems to be down to the expected regular variation in RSSI, coupled with non-ideal ESP Tracker placement.

Using the info from the wiki, and on here, coupled with using the calibration functions, and moving some of the ESP devices, I have certainly seen this improve, but it has definitely not been eliminated, for me.

I have implemented a de-bounced / latched Last Known Good Area sensor, as suggested. This certainly helps, and eliminates the issue of the Area showing as Unknown when the device is at home. It improves, but does not eliminate the area-bounce issue. I have included a delay (9 seconds) which reduces the bounce. Increasing this time delay would further reduce the bounce, but at the cost of slower responsiveness, when the phone is actually moving.

YAML Code below in case it's of use to anyone. It's a triggered template sensor, and triggers based on the Bermuda Area, the Bermuda Device Tracker, but also on the BLE Transmitter sensor provided by the HA App on Android. The use of 'this.state' in the final clause keeps the Area unchanged unless a new area has been stable for 9 seconds.


  - trigger:
      - platform: state
        entity_id:
          - sensor.mbbile_phone_ble_transmitter
          - device_tracker.mbbile_phone_bermuda_tracker
          - sensor.mbbile_phone_area
      - platform: state
        entity_id:
          - sensor.mbbile_phone_area
        for:
          seconds: 10
    sensor:
    - name: Phone Area Last Known
      state: >
        {% if is_state('sensor.mbbile_phone_ble_transmitter','Stopped') -%}
        Away
        {%- elif
          ( not states('sensor.mbbile_phone_area')|lower in ('unavailable','unknown','none') )
          and
          ( as_timestamp(now(),[0]) - as_timestamp(states.sensor.mbbile_phone_area.last_changed)
          > 9 )
          %}
        {{ states('sensor.mbbile_phone_area') }}
        {%- elif this.state|lower in ('unavailable','unknown','none') -%}
        {{ states('sensor.mbbile_phone_area') }}
        {%- else -%}
        {{ this.state }}
        {%- endif %}
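
The latch-plus-delay idea in the template above can also be written as a small state machine, which may make the trade-off between bounce and responsiveness easier to see. This is a rough Python sketch, not a line-by-line translation of the template: the class name `LatchedArea` is invented here, the "Away" branch for the BLE transmitter is left out, and timestamps are passed in as plain seconds.

```python
INVALID = {"unavailable", "unknown", "none"}


class LatchedArea:
    """Hold the last good area; accept a new one only once it has been stable."""

    def __init__(self, hold_seconds=9.0):
        self.hold_seconds = hold_seconds
        self.state = None       # area currently reported
        self._candidate = None  # area waiting to become stable
        self._since = None      # time the candidate was first seen

    def update(self, area, now):
        if area.lower() in INVALID:
            # Dropouts never overwrite the latched area, and they reset the timer.
            self._candidate = None
            return self.state
        if area != self._candidate:
            self._candidate = area
            self._since = now
        # Take the first valid area immediately, later ones only after the hold time.
        if self.state is None or now - self._since >= self.hold_seconds:
            self.state = area
        return self.state


tracker = LatchedArea(hold_seconds=9.0)
readings = [(0, "Office"), (5, "Kitchen"), (8, "unknown"), (10, "Kitchen"), (20, "Kitchen")]
history = [tracker.update(area, t) for t, area in readings]
print(history)
```

A longer `hold_seconds` suppresses more bounce, at the cost of reacting more slowly when the phone really does move.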

@convicte

Thanks for the sensor draft. One less task to deal with.

Having said that, the calibration and placement can only go so far.
image
My phone was 1m away from the proxy in the kitchen between 18.45 and 19.35, while this trace is completely unusable.
It's a Shelly 1 PM placed on the counter with my phone 1m away.

Without some internally improved preprocessing, it's going to be difficult to draw much from such chaos.

@yisforcolour

I had the same issue with bouncing back and forth between home and unknown. I swapped from using the M5 Atom Lite to an M5Stamp C3 Mate board and put in the config file from here, and it is rock solid now. Thanks!

@agittins
Owner

My phone was 1m away from the proxy in the kitchen between 18.45 and 19.35, while this trace is completely unusable.
It's a Shelly 1 PM placed on the counter with my phone 1m away.

Wow, what a mess! This is almost certainly the result of phone/shelly/wifi issues - there is only so much that Bermuda can do with garbage input, and this looks like something is seriously wrong upstream of HA. Can you either upload a "download diagnostics" or go in to "Developer Tools", "Actions/Services", choose "bermuda.dump_devices", switch to YAML mode and paste in:

action: bermuda.dump_devices
data:
  configured_devices: true
  redact: false

Look through the listing for your phone's entry, then inside that you should see scanner entries with hist_interval entries. Those hist_interval entries tell you how often HA is receiving advertisements from that scanner for that device. For a shelly device they should be all around 3 seconds. For an esphome, more like 1 to 4 seconds. If they're above 6 seconds or so, you have something awful going on.

Here's a fairly healthy example for an esphome proxy:

      hist_interval:
        - 1.1269979262724519
        - 1.1179979434236884
        - 0.9299982897937298
        - 1.017998126335442
        - 0.9039983376860619
        - 0.9379982743412256
        - 10.846980054862797
        - 0.2189995963126421
        - 0.9159983182325959
        - 1.127997925505042
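
If eyeballing the list is tedious, a few lines of Python can summarise it. This is only a sketch: `summarize_intervals` is a made-up helper, and the 6-second cutoff is simply the rule of thumb mentioned above, not a value taken from Bermuda.

```python
from statistics import median


def summarize_intervals(intervals, slow_threshold=6.0):
    """Summarise the gaps (in seconds) between adverts seen by one scanner."""
    slow = [i for i in intervals if i > slow_threshold]
    return {
        "count": len(intervals),
        "median": median(intervals),
        "max": max(intervals),
        "slow": len(slow),  # how many gaps exceeded the threshold
    }


# The "fairly healthy" esphome example from above.
healthy = [
    1.1269979262724519, 1.1179979434236884, 0.9299982897937298,
    1.017998126335442, 0.9039983376860619, 0.9379982743412256,
    10.846980054862797, 0.2189995963126421, 0.9159983182325959,
    1.127997925505042,
]

summary = summarize_intervals(healthy)
print(summary)
```

For this list the median gap is just under one second and only a single gap exceeds the threshold, which is the sort of picture to aim for.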

If digging through the yaml and the instructions above is a bit much, you can do a "download diagnostics" instead and upload that. Go into Bermuda, click the three-dots menu and choose "Download Diagnostics" (it might take a while), save the result, then upload it here. I can then go through it and see what's going on.

@convicte

convicte commented Oct 16, 2024

My phone was 1m away from the proxy in the kitchen between 18.45 and 19.35, while this trace is completely unusable.
It's a Shelly 1 PM placed on the counter with my phone 1m away.

Wow, what a mess! This is almost certainly the result of phone/shelly/wifi issues - there is only so much that Bermuda can do with garbage input, and this looks like something is seriously wrong upstream of HA.

Not sure what else to do here though, as the Shelly has typically just been enabled via GUI and then through UI flow in HA.
For ESPHome, I've adjusted all my Xiao ESP32-S3s already (https://wiki.seeedstudio.com/xiao_esp32s3_getting_started/)

Those hist_interval entries tell you how often HA is receiving advertisements from that scanner for that device. For a shelly device they should be all around 3 seconds. For an esphome, more like 1 to 4 seconds. If they're above 6 seconds or so, you have something awful going on.

I know it's beyond the scope of what you asked for, but there seems to be a trove of information in there, so I've decided to add the whole section.

TL;DR: with all the recommended ESPHome adjustments applied, my ESP32-S3 used for my Office proxy doesn't seem to be at all healthy:

      hist_stamp:
        - 61612.17311204
        - 61596.609817611
        - 61581.04745711
        - 61573.785194679
        - 61562.411307733
        - 61555.141988238
        - 61547.871641337
        - 61546.84773092
        - 61536.506599077
      rssi: -61
      hist_rssi:
        - -61
        - -66
        - -70
        - -69
        - -67
        - -66
        - -67
        - -67
        - -65
      hist_distance:
        - 0.5011872336272722
        - 0.7356422544596414
        - 1
        - 0.9261187281287935
        - 0.7943282347242815
        - 0.7356422544596414
        - 0.7943282347242815
        - 0.7943282347242815
        - 0.6812920690579612
      hist_distance_by_interval:
        - 0.5011872336272722
        - 0.5011872336272722
        - 0.5011872336272722
        - 0.5011872336272722
        - 0.5011872336272722
        - 0.5011872336272722
        - 0.5011872336272722
        - 0.5011872336272722
        - 0.5011872336272722
        - 0.5011872336272722
      hist_interval:
        - 15.563294428997324
        - 15.56236050100415
        - 7.262262430995179
        - 11.373886946006678
        - 7.269319494997035
        - 7.270346901001176
        - 1.0239104169959319
        - 10.341131843000767
        - 61536.506599077
      hist_velocity:
        - -0.015064613851649278
        - -0.015064613851649278
        - -0.015064613851649278
        - -0.015064613851649278
        - -0.015064613851649278
        - -0.015064613851649278
        - -0.015064613851649278
        - -0.015064613851649278
        - -0.015064613851649278
        - -0.015064613851649278
      stale_update_count: 75
      tx_power: -127
      rssi_distance: 0.5011872336272721
      rssi_distance_raw: 0.5011872336272722
      adverts: {}
      scanner_sends_stamps: true
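
As a sanity check on how to read this dump: the hist_interval values look like the differences between consecutive hist_stamp entries (newest first). That is an observation from these numbers only, not a statement about Bermuda's internals, and the short Python sketch below just verifies the arithmetic.

```python
# Values copied from the office dump above (newest first).
hist_stamp = [
    61612.17311204, 61596.609817611, 61581.04745711, 61573.785194679,
    61562.411307733, 61555.141988238, 61547.871641337, 61546.84773092,
    61536.506599077,
]
hist_interval = [
    15.563294428997324, 15.56236050100415, 7.262262430995179,
    11.373886946006678, 7.269319494997035, 7.270346901001176,
    1.0239104169959319, 10.341131843000767, 61536.506599077,
]

# Gap between each stamp and the one before it.
derived = [newer - older for newer, older in zip(hist_stamp, hist_stamp[1:])]

# Compare against the reported intervals (the last reported one has no pair).
matches = all(abs(d - r) < 1e-6 for d, r in zip(derived, hist_interval))
print(matches)
```

The final 61536.5 entry equals the oldest stamp itself, so it is presumably the first reading measured against zero rather than a real 17-hour gap, and can be ignored when judging health.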

Shelly in the kitchen looks very similar:

      hist_stamp:
        - 62570.928125497
        - 62557.469460483
        - 62553.295558947
        - 62539.833857611
        - 62533.116995109
        - 62523.799172149
        - 62519.642245721
        - 62510.343397412
        - 62462.613846548
        - 62444.992842727
      rssi: -96
      hist_rssi:
        - -96
        - -83
        - -90
        - -93
        - -82
        - -78
        - -79
        - -81
        - -93
        - -91
      hist_distance:
        - 7.3564225445964135
        - 2.712272579332028
        - 4.641588833612778
        - 5.843414133735177
        - 2.51188643150958
        - 1.847849797422291
        - 1.9952623149688795
        - 2.326305067153626
        - 5.843414133735177
        - 5.011872336272722
      hist_distance_by_interval:
        - 7.3564225445964135
        - 7.3564225445964135
        - 7.3564225445964135
        - 7.3564225445964135
        - 7.3564225445964135
        - 7.3564225445964135
        - 7.3564225445964135
        - 7.3564225445964135
        - 7.3564225445964135
        - 7.3564225445964135
      hist_interval:
        - 13.458665013997233
        - 4.173901535999903
        - 13.461701335996622
        - 6.716862502005824
        - 9.317822959994373
        - 4.156926428004226
        - 9.298848309001187
        - 47.72955086399452
        - 17.621003821004706
        - 13.479078386997571
      hist_velocity:
        - 0.34506765421640206
        - 0.34506765421640206
        - 0.34506765421640206
        - 0.34506765421640206
        - 0.34506765421640206
        - 0.34506765421640206
        - 0.34506765421640206
        - 0.34506765421640206
        - 0.34506765421640206
        - 0.34506765421640206
      stale_update_count: 137
      tx_power: -127
      rssi_distance: 7.3564225445964135
      rssi_distance_raw: 7.3564225445964135  

Which still seems to be giving a reasonable output with some hiccups (2-3, maybe):
image

Any help would be appreciated!

@convicte

convicte commented Oct 18, 2024

Quick update @agittins :

  1. It's either that I live in a Bermuda triangle of RF signals (a single family house, so most likely not), or there is something strange going on with the detection algorithm/hardware.

  2. I've added 2 Android phones and a Samsung watch and tested 5 different ESP32 board types across WROOM-32U, Seeed Studio ESP32C3, ESP32S3, WaveShare ESP32S3-ZERO, WROOM-32D and 3 Shelly types (Shelly 2.5, Shelly Mini PM Gen3, Shelly PM Gen2), all produce location oscillations and dropouts minimized only by a latching template sensor.
    Even this sensor struggles to clean it up:
    image

  3. The template does a decent job, but it's also struggling when the detection oscillation is most severe.

  4. Having looked at the Distance to node parameter, standing in one room I can see that the calculated distances for this room's node and for another room's node 5-10 meters away often barely differ. This may be an issue with attenuation, which I currently have set to the default of 3. I have no idea what scale it should be on (0-1000 or 0-10), or whether the signal drops off less with an attenuation of 1 or 10 or 100. Logically, the attenuation would be a rate of drop, but this could as well be addition or multiplication... IDK

  5. I've tried changing the signal strength on my phone and watch from ultralow to medium, and no matter whether it's low or high the issue is very similar. With a high setting, the oscillation picks up a further room more often, since the signal delta is small. With a low signal strength it fails to consistently pick up even the room I am in, and it drops out to unknown.
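
On the attenuation question in point 4: the rssi and hist_distance pairs in the dumps above are consistent with the standard log-distance path loss formula, with attenuation acting as the path loss exponent. The Python sketch below is inferred from those numbers only. The reference power of -70 is an assumption that happens to fit (rssi -70 maps to exactly 1), and `rssi_to_distance` is a hypothetical name, not Bermuda's actual function.

```python
def rssi_to_distance(rssi, ref_power=-70.0, attenuation=3.0):
    """Log-distance path loss: estimated distance in metres for an RSSI in dBm."""
    return 10 ** ((ref_power - rssi) / (10 * attenuation))


# RSSI values taken from the two dumps above.
for rssi in (-61, -66, -70, -96):
    print(rssi, round(rssi_to_distance(rssi), 3))

# Same RSSI, different attenuation settings.
for attenuation in (2.0, 3.0, 4.0):
    print(attenuation, round(rssi_to_distance(-96, attenuation=attenuation), 2))
```

Under this model attenuation is neither added nor multiplied; it divides the exponent. A larger value assumes the signal fades faster, so every estimate is pulled toward the 1 m reference point, and plausible values sit in the low single digits rather than on a 0-1000 scale.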

I am really not sure what I am doing wrong, but it feels less and less like hardware, since I've tested so much and get this as a result at best:
image

Here are my global settings:
image

No scanner specific adjustments or offsets yet.

The dead nodes are permanently OFF and not a part of the array:
image

I have now also switched to an ESP32S3-ZERO from WaveShare in every room, so that the signal strength should be matched, but it's still not significantly better.

@convicte

For reference - ESP32S3 setup for proxy:

substitutions:
  device_name: office-ble-proxy
  friendly_name: Office BLE proxy

esphome:
  name: ${device_name}
  friendly_name: ${friendly_name}
  min_version: 2024.6.0
  name_add_mac_suffix: false
  platformio_options:
    board_build.flash_mode: dio
  project:
    name: esphome.web
    version: dev

esp32:
  board: esp32-s3-devkitc-1
  framework:
    type: esp-idf
    # version: latest
    sdkconfig_options:
      # @grigi found in testing that these options resulted in better responsiveness.
      # BLE 4.2 is supported by ALL ESP32 boards that have bluetooth, the original and derivatives.
      CONFIG_BT_BLE_42_FEATURES_SUPPORTED: y
      # Also enable this on any derivative boards (S2, C3 etc) but not the original ESP32.
      CONFIG_BT_BLE_50_FEATURES_SUPPORTED: y
      # Extend the watchdog timeout, so the device reboots if it appears locked up for over 10 seconds.
      CONFIG_ESP_TASK_WDT_TIMEOUT_S: "10"
    

# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: !secret encryption_key

# Allow Over-The-Air updates
ota:
  - platform: esphome
    password: !secret wifi_password

# Allow provisioning Wi-Fi via serial
improv_serial:

wifi:
  reboot_timeout: 5min
  fast_connect: true
  output_power: 20dB
  # power_save_mode: none
  networks:
  - ssid: !secret wifi_ssid
    password: !secret wifi_password
    channel: REDACTED
    hidden: true
    manual_ip:
      static_ip: REDACTED
      gateway: REDACTED
      subnet: REDACTED
      dns1: REDACTED
      dns2: REDACTED

  ap:
    ssid: ${friendly_name} AP
    password: !secret wifi_password

captive_portal:

web_server:
  port: 80

button:
  - platform: restart
    name: Restart

esp32_ble_tracker:
  scan_parameters:
    # Scan continuously from boot; set this to false if you don't want BLE scanning to auto-start.
    continuous: true
    
    # Whether to send scan-request packets to devices to gather more info (like devicename)
    active: true
    # Listen on BLE for 300ms out of every 320ms interval
    # If the device is failing to keep up, reduce the window to give it more time to do stuff
    interval: 320ms # suggested 211ms # default 320ms
    window:   300ms # suggested 120ms # default 300ms 

bluetooth_proxy:
  active: true

sensor:
  - platform: wifi_signal
    id: wifi_water_meter
    name: RSSI
    update_interval: 10s
    
  - platform: uptime
    name: Uptime

light:
  - platform: esp32_rmt_led_strip
    rgb_order: RGB
    pin: GPIO21
    num_leds: 1
    rmt_channel: 0
    chipset: ws2812
    name: LED
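
To put the scan_parameters in the config above into perspective, the fraction of each interval spent listening can be worked out directly. A small Python sketch follows; `scan_duty_cycle` is just an illustrative helper, and the two pairs of numbers are the configured and "suggested" values from the comments in that config.

```python
def scan_duty_cycle(window_ms, interval_ms):
    """Fraction of each scan interval the radio spends listening for BLE adverts."""
    if window_ms > interval_ms:
        raise ValueError("window cannot be longer than interval")
    return window_ms / interval_ms


configured = scan_duty_cycle(300, 320)  # values used in the config above
suggested = scan_duty_cycle(120, 211)   # values marked "suggested" in its comments

print(f"configured: {configured:.1%}")
print(f"suggested:  {suggested:.1%}")
```

The configured values leave very little of each interval for anything other than BLE listening, which is why the comment in the config recommends shrinking the window if the device is failing to keep up.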

@agittins
Owner

That's some excellent info-gathering, thanks!

So the hist_interval as you noted looks pretty unhealthy, and that is the primary indicator of the problem. We can ignore what the sensors are doing for now, and really just focus on hist_interval.

You have your max_radius set to 7, which for now is actually helping, since it will cause devices to switch to "unknown" rather than switch to another area, but after we solve the root cause you'll almost certainly want to increase that enough to disable it - like 200m or something.
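
To illustrate the max_radius effect described above, here is a deliberately simplified Python sketch. It is not Bermuda's real area-selection code (which is not shown in this thread); `pick_area` and the Office distance are made up, and only the kitchen figure comes from the Shelly dump.

```python
def pick_area(distances, max_radius):
    """Pick the closest area within max_radius, else report "unknown"."""
    in_range = {area: d for area, d in distances.items() if d <= max_radius}
    if not in_range:
        return "unknown"
    return min(in_range, key=in_range.get)


# Kitchen distance taken from the Shelly dump; the Office value is invented.
distances = {"Kitchen": 7.3564225445964135, "Office": 9.0}

print(pick_area(distances, max_radius=7))    # every scanner is out of range
print(pick_area(distances, max_radius=200))  # limit effectively disabled
```

With a radius of 7 a noisy 7.36 m reading falls outside the limit and the device drops to "unknown"; with a very large radius the closest scanner always wins, which is only desirable once the readings themselves are trustworthy.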

Typically, long or variable hist_interval values point to the proxy itself not reporting advertisements regularly, often due to its firmware config. However, the fact you're seeing it over a variety of different hardware and firmware configs is surprising, and points toward the issue being upstream of the devices themselves, so wifi, network, HA, Bermuda.

I'm basing this on the kitchen shelly showing:

hist_interval:
        - 13.458665013997233
        - 4.173901535999903
        - 13.461701335996622
        - 6.716862502005824
        - 9.317822959994373
        - 4.156926428004226
        - 9.298848309001187
        - 47.72955086399452
        - 17.621003821004706
        - 13.479078386997571

And your office esp32-S3 showing:

    hist_interval:
        - 15.563294428997324
        - 15.56236050100415
        - 7.262262430995179
        - 11.373886946006678
        - 7.269319494997035
        - 7.270346901001176
        - 1.0239104169959319
        - 10.341131843000767
        - 61536.506599077

Here's an example from a D1mini32 (orig esp32-wroom) tracking my watch, so a moderately-variable signal:

hist_interval:
        - 0.6079991906881332
        - 3.1419958118349314
        - 0.7589989891275764
        - 3.1649957774206996
        - 2.019997302442789
        - 0.8449988728389144
        - 1.3379982132464647
        - 1.7409976739436388
        - 2.14099713973701
        - 0.8329988839104772

I wouldn't worry about what the sensors are doing until we can get a good-looking hist_interval set. The intervals should be fairly consistent, with Shellys all about 3 seconds (because Shelly firmware limits advert proxying to 3 seconds), and esphome proxies trending down toward 1 second. This is because Bermuda only checks every second, so that's the smallest the intervals can get.

If you go into the ESPHome integration and "Enable Debug Logging", wait 10 seconds, and disable it again, you'll get a logfile. You can either send it to me (it will have mac addresses, IP addresses etc in it) or you can process it locally to see how many adverts each proxy received in that time for each device.

The logfile has multi-line entries, the mac addresses are shown as integers(!) and each log line may include multiple adverts. This means the file is a little tricky to parse, but if you have awk installed (most *nix's do) you can use:

cat yourlogfile.log |  awk '/BluetoothLERawAdvertisementsResponse/{proxy=$6};/^  address: /{counts[sprintf("%x",$2), " ", proxy]++};END{for(i in counts){printf("%s: %d" RS, i, counts[i])}}'

(!)

This will tally up how many adverts each proxy saw by device, and spit out the result. You can grep this output for the mac address you're looking for. Eg, add the following on to the end of the previous line:

| grep eee8379f6b54

This is the mac address of my watch, so my command window looks like:

$ cat home-assistant_esphome_2024-10-19T09-13-39.022Z.log | awk '/BluetoothLERawAdvertisementsResponse/{proxy=$6};/^  address: /{counts[sprintf("%x",$2), " ", proxy]++};END{for(i in counts){printf("%s: %d" RS, i, counts[i])}}' | grep eee8379f6b54
eee8379f6b54 prox-studio: 46
eee8379f6b54 camfront: 47
$

So in those 10 seconds, my two studio proxies received 46 and 47 adverts - which means if my watch is broadcasting every 200ms then we caught almost every one of the 50 adverts that were sent in those 10 seconds (less my manual counting of seconds etc).
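
If awk isn't available, the same tally can be done in Python. This sketch mirrors only what the awk one-liner above relies on: the proxy name is the sixth whitespace-separated field on lines containing BluetoothLERawAdvertisementsResponse, and the MAC is the integer after a line starting with two spaces and "address: ". The sample lines are made up to fit that shape and are not real log output.

```python
from collections import Counter


def tally_adverts(lines):
    """Count adverts per (mac, proxy), following the awk script's logic."""
    counts = Counter()
    proxy = None
    for line in lines:
        fields = line.split()
        if "BluetoothLERawAdvertisementsResponse" in line and len(fields) >= 6:
            proxy = fields[5]  # awk's $6
        if line.startswith("  address: "):
            mac = format(int(fields[1]), "x")  # awk's sprintf("%x", $2)
            counts[(mac, proxy)] += 1
    return counts


def capture_rate(received, window_s, advert_interval_s):
    """Share of the adverts sent during the window that actually arrived."""
    return received / (window_s / advert_interval_s)


# Synthetic sample, shaped only so that the field positions line up.
watch = int("eee8379f6b54", 16)
sample = [
    "DATE TIME DEBUG THREAD LOGGER prox-studio BluetoothLERawAdvertisementsResponse",
    f"  address: {watch}",
    "  rssi: -61",
    f"  address: {watch}",
]

print(tally_adverts(sample))
print(capture_rate(46, window_s=10, advert_interval_s=0.2))
```

For a real file, pass the open file object instead of `sample`, for example `with open("yourlogfile.log") as f: counts = tally_adverts(f)`.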

If you can run the above on your logfile (or mail it to me at [email protected] if you prefer) we can see if the ESPHome integration running inside HA actually received as many adverts as it should. If we see a lot of adverts (like more than 1 per second) that indicates that the proxies, the network and the esphome integration are all working fine, and we can concentrate on what's happening in the HA and Bermuda area of things.

If you don't get lots of adverts, then we are still looking at a hardware/firmware/network issue.

Interestingly, my HA is getting 5 adverts a second, but in Bermuda it sometimes still sees 3 second gaps! I know there is some rate-limiting happening in the aiohabluetooth lib, which I hope to nail down and address at some point. But for now... let's see what you are getting from an esphome debug log.

@convicte

convicte commented Oct 21, 2024

Thank you so very much for the detailed overview and your willingness to help.
I was out over the weekend, so please excuse the tardiness.

In your inbox you will find the logs requested - my sincere apologies but command line data wrangling goes through one ear and out the other, as my Linux knowhow is basic at best.

Looking forward to your thoughts!

@EFH52

EFH52 commented Nov 14, 2024

image
Note the consistency after the 0.70 update. Thank you. Hope others in this thread are seeing similar improvements. I've yet to fiddle with any of the new settings permitted.

This is a bluecharm tag sitting about 24" from my server. It hasn't moved in months, but had been bouncing over 4 rooms until the last update. Looking forward to testing more things out.

@convicte

Not my experience at all. I see no major change with the root of the problem as outlined above.

Constantly bouncing back and forth for a phone 1-2 meters away from a proxy to one that is 2-3 times the distance, with the sensor claiming it's now way closer somehow:
Screenshot_20241115_220757_Home Assistant.jpg

@agittins any chance you had a moment to look at the logs I sent above?

Cheers in advance!

@agittins
Owner

Hey @convicte sorry for the delay and thanks for the nudge!

I've looked over the logs you emailed me, and it looks to me that you either have too much broken stuff configured in HA, or your network (or HA device's network connection) is borked.

Lots of [re]connection errors for all sorts of integrations:

  • Lots of blebox_uniapi failing to connect to various devices (shutters?)
  • Lots of timeouts connecting to an Android SM-S918B (a Samsung?)
  • Quite a few disconnections from the Sony TV
  • Quite a few disconnections, timeouts and other network issues from deebot, nabucasa, dahua, smartthings/samsung_tv
  • Heaps of errors from shelly just for getting sensor values, like:
    [homeassistant.components.shelly] Error fetching Kitchen lamp data: Error fetching data: DeviceConnectionTimeoutError(TimeoutError())

These integrations look like they have particular issues / are broken:

  • sat
  • alexa_media_player
  • smartthings

modbus has been flooding the logs with an Unexpected response error.

Basically, from what Home Assistant can see, the network is in such a poor state that it can't reliably connect or stay connected to anything. There's no chance you'll get good results from Bermuda without taking care of that first.

It's possible that the issue is local to your HA box: it might be so overwhelmed that it can't manage to keep its network connections working (hmm.... you're not running HA in a VM under Windows, are you?)

See what your HA CPU usage is like. If it's particularly high, that might be the root of the issue. Otherwise, maybe your router/modem/wifi-ap/ethernet-switch is having serious issues.

Sorry I don't have better news!

@andersonimes

andersonimes commented Nov 20, 2024

I am having a very similar issue. I have read this thread and done some tweaking of settings, but it seems like I am not able to get consistent results either, even less than a meter away. Here is my diagnostic file.
config_entry-bermuda-01JCZZE6HBTZ9JGPAK6X74JPHZ.json

Graph of presence + distance. Notice the drop-outs in the distance information, which correlate with the lack of presence data. I'm sure this is expected.

Not really sure where to go from here. Happy to provide any other diagnostic info. Everything seems well calibrated - the distances are reasonably accurate. The device just disappears regularly.

@agittins
Owner

@andersonimes I took a look at your diags and the problem looks likely to be the proxy, probably the firmware setup. The hist_interval shows how many seconds have passed between each update, so ideally we want to see a list of 1 to 2 second values. This is what the office proxy sees for the ibeacon:

            "hist_interval": [
              10.259905143000651,
              29.791724173992407,
              39.019637816993054,
              5.095952618023148,
              5.139952189987525,
              10.243904655013466,
              5.11595235299319,
              93.29112738199183,
              5.119951889006188,
              34.89567142300075
            ]

From a Bermuda point of view, this is "broken". :-) Can you share the full yaml you used to flash the office proxy, and advise which board it is?

(If you just want to improve the home/away from the device tracker, you could change your Device Tracker timeout from 120 seconds to 300 or 600, but this is definitely in the "sweep it under the rug" category of fixes :-) )
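
To make the hist_interval check above concrete, here is a minimal sketch that summarises a list of intervals against the "1 to 2 seconds" target. The function name `summarise_intervals` and the `target` parameter are made up for this illustration and are not part of Bermuda; the values are the office proxy's list quoted above.

```python
def summarise_intervals(hist_interval, target=2.0):
    """Summarise the gaps (in seconds) between updates seen by one proxy.

    `target` is the longest gap we'd be happy with; 2.0 is taken from
    the "1 to 2 second values" rule of thumb, not from Bermuda's code.
    """
    if not hist_interval:
        return {"count": 0, "late": 0, "worst": 0.0, "mean": 0.0}
    late = [gap for gap in hist_interval if gap > target]
    return {
        "count": len(hist_interval),          # number of intervals recorded
        "late": len(late),                    # intervals longer than target
        "worst": max(hist_interval),          # longest single gap
        "mean": sum(hist_interval) / len(hist_interval),
    }


# hist_interval for the ibeacon as seen by the office proxy.
office = [
    10.259905143000651, 29.791724173992407, 39.019637816993054,
    5.095952618023148, 5.139952189987525, 10.243904655013466,
    5.11595235299319, 93.29112738199183, 5.119951889006188,
    34.89567142300075,
]
summary = summarise_intervals(office)
print(summary)
```

For the office proxy, every one of the ten intervals is over 2 seconds and the worst is about 93 seconds, which is why it counts as "broken" here.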

@andersonimes

andersonimes commented Nov 21, 2024

I have the same behavior with all of my two devices:

  • The "office" device, which is an m5stack atom
  • The "bedroom" device, which is an Everything Presence One

office.txt
bedroom.txt

Thanks a bunch for looking into this!

@p0macs

p0macs commented Nov 30, 2024

Hi,
Thank you for all the information in this thread. I had the very same symptom, but the info about the "hist_interval" was very helpful.
Recently I was happy to find out that I have a bunch of Shelly devices in the house, and I turned on the BLE proxy on all of them (9 devices). Then I was forced to disable the Bermuda integration for a while, when I realized that it had flooded the log with devices bouncing from one tracker to another.

Now, I can say the problem is with Shelly. I turned off BLE proxy on all of them and turned on two Atom Echo devices flashed as BLE proxy - and the magic happened: no more bouncing of the devices, rock stable - for now.

I have a question:
Is it possible to remove the old "dead" scanners from the Bermuda integration?
Now I have "2 active out of 11 bluetooth scanner devices" - and I don't want to see the inactive ones, or the disabled distance entities related to them.

How can I remove them from the configuration?

Thank you!

@agittins
Owner

agittins commented Dec 5, 2024

> I have the same behavior with all of my two devices:
>
>   • The "office" device, which is an m5stack atom
>   • The "bedroom" device, which is an Everything Presence One

Hi, sorry for the delay in my follow-up!

The EP1's are a great little board! I've created a repo to hold my suggested settings and make it easy to add those to a config as a package. For your bedroom proxy, I'd suggest trying this:

substitutions:
  name: everything-presence-one-24fcf0
  friendly_name: EP1 Bedroom

packages:
  Everything Smart Technology.Everything Presence One: github://everythingsmarthome/everything-presence-one/everything-presence-one-sen0609-co2.yaml@main
  Bermuda.esp32: github://agittins/bermuda-proxies/packages/bermuda-proxy-esp32-orig-minimal.yaml

esphome:
  name: ${name}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}
  min_version: 2024.11.2

api:
  encryption:
    key: MYKEY

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

(The bermuda line sets up all the bluetooth stuff for you).

For the office proxy, it looks like you're running a fairly standard config from the esphome site, possibly the "ready-made projects" thingie. I suggest trying this:

substitutions:
  name: esphome-web-a07100
  friendly_name: Office BT Proxy

packages:
  Bermuda.esp32: github://agittins/bermuda-proxies/packages/bermuda-proxy-esp32-orig-minimal.yaml

esphome:
  name: ${name}
  friendly_name: ${friendly_name}
  min_version: 2024.11.2
  name_add_mac_suffix: false

esp32:
  board: m5stack-atom
  framework:
    type: esp-idf

# Enable logging
logger:

# Enable Home Assistant API
api:

# Allow Over-The-Air updates
ota:
- platform: esphome

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

I think that should give you a solid firmware setup, so we can then see how that performs and see if we need to look at the wifi or not.

I've bumped the min_version to 2024.11.2 on both configs so that it should pull in the latest esphome. If it's convenient, I'd suggest flashing them via USB at least once, just to make sure you have the latest flash partitioning layout on them (the OTA update can't update the partitioning). I think it was a while ago that it changed though, so if they're both recent(ish) acquisitions then it may not be necessary.

Let me know how you go, would be good to see a fresh diags after they've been up for a few minutes!

@andersonimes

Thanks, I will do it! A note - when I tried your config for the EP1, it set the baud_rate to 0, which wasn't compatible with the EP1 config, and I got a pre-compiler error. I added baud_rate: "115200" to the set of substitutions and all was well.

I will let you know how it goes after a while of trying it, thanks a bunch!

@drothenberger

I recently discovered that the

bluetooth_proxy:
  active: true

stanza in my ESPHome config was causing this issue for me. I have two devices in my house that require active BT connections. The proxy will sometimes spend up to a minute servicing those connections instead of doing passive scanning, causing any Bermuda devices in that proxy's area to go to unknown until the passive scanning starts again.

I switched off active scanning on all my proxies and the unknown area issue went away. I did have to enable active scanning on two other MSR presence sensors to get back access to the devices that require active connections, but they are in rooms with other proxies that are not doing the active scanning, so they don't cause any issues.

@andersonimes

andersonimes commented Dec 15, 2024

I used the suggested configuration and have been trying it for a week, but things aren't really better, unfortunately. I've attached my latest debug log to see if it points to anything obvious.

bermuda.txt

@Kibukx

Kibukx commented Dec 30, 2024

I was having the same issue, and it seems it was due to my RPi having its Bluetooth enabled but not assigned to an area in Home Assistant. Bermuda detected my phone near it, and that caused the "unknown" status. Check whether you have other BT devices around the house that don't have a designated area, as this could be the cause. Hope this helps someone.

@andersonimes

Great call-out. My RPi's Bluetooth sensor was not in a room either. Fixing that didn't help my situation, but I'm confident it would have been the next problem I needed to solve, so thanks for that!

@franz82

franz82 commented Jan 5, 2025

Same issue here, solved by adjusting the transmitter settings in the HA app as suggested here:
https://www.reddit.com/r/homeassistant/comments/1etz44q/comment/lih585s/
-f
