HA Database bloat after introduction of Bermuda_Global sensors #389

jeremysherriff · 2024-11-17T01:29:10Z

Configuration

Not applicable (I think?)

Describe the bug

Since v0.7 and the addition of the Global Bermuda device, my HA database has grown by about 15%, and the growth appears to be entirely in the states table.
My database is MariaDB, so I used phpMyAdmin to check which entities have the highest state counts, grouped by entity_id:

select states_meta.entity_id, count(*)
 from states left join states_meta on states.metadata_id = states_meta.metadata_id
 group by entity_id
 order by count(*) desc;

entity_id                                     count(*)
sensor.bermuda_global_visible_device_count      151901
sensor.bermuda_global_total_device_count         89860
sensor.deck_sensor_humidity                      31100
sensor.atc_c280_humidity                         30346
sensor.deck_sensor_temperature                   28670
sensor.lywsd03mmc_7d73_humidity                  26242
sensor.atc_c280_temperature                      24805
sensor.lywsd03mmc_7d73_temperature               24085
sensor.zm_memory_used                            22791
...

The sensor.bermuda_global_visible_device_count has a state change count that is an order of magnitude higher than most sensors.
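For anyone who wants to try the per-entity count without touching a live database, here is a minimal sketch that runs the same grouped query against toy in-memory tables. It assumes only the two tables and columns named in the query above (states.metadata_id, states_meta.metadata_id, states_meta.entity_id); the SQLite backend, the state_id column, and the entity names sensor.noisy and sensor.quiet are made up for illustration and are not part of the report.

```python
import sqlite3

# Toy stand-in for the recorder tables: only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states_meta (metadata_id INTEGER PRIMARY KEY, entity_id TEXT);
    CREATE TABLE states (state_id INTEGER PRIMARY KEY, metadata_id INTEGER);
""")

# Hypothetical entities and how many state rows each one has written.
toy_rows = {"sensor.noisy": 5, "sensor.quiet": 2}
for metadata_id, (entity_id, n) in enumerate(toy_rows.items(), start=1):
    conn.execute("INSERT INTO states_meta VALUES (?, ?)", (metadata_id, entity_id))
    conn.executemany(
        "INSERT INTO states (metadata_id) VALUES (?)", [(metadata_id,)] * n
    )

# Same shape as the query in the report: count state rows per entity_id,
# noisiest entity first.
top = conn.execute("""
    SELECT states_meta.entity_id, COUNT(*) AS state_count
    FROM states
    LEFT JOIN states_meta ON states.metadata_id = states_meta.metadata_id
    GROUP BY states_meta.entity_id
    ORDER BY state_count DESC
""").fetchall()

for entity_id, state_count in top:
    print(f"{entity_id:<20} {state_count:>6}")
```

On a real installation the table would be far larger, so adding a LIMIT to the query is probably sensible.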

I suspect a lot of the bloat comes from these sensors lacking a state_class attribute, which causes every state change to be recorded as a discrete value rather than being aggregated.

I think the following attribute should be added to these sensors:

state_class: measurement

I have now manually added this attribute using the customize.yaml file and will report back on whether it helps.

Alternatively, these sensors should perhaps be disabled by default?

Diagnostics

Not applicable (I think?)


jeremysherriff · 2024-11-17T03:36:05Z

Update: my edits to customize.yaml did what I was hoping, in that the states are now seen as numbers and can be aggregated and fed into the statistics engine. Whether this causes the state changes to be stored more efficiently I am unsure (but I have some other very high-change stats that do not bloat the database like these were, so I am quietly confident...)

agittins · 2024-11-17T19:40:20Z

Hi Jeremy!

v0.7.1 introduced rate-limits on the global sensor updates, can you check how it works for you on v0.7.2?

You're right about the state_class though, I'll add that to fix the treatment of the sensor values - I don't think it directly affects the state recording, at least until it goes into long-term stats where the aggregations happen - but it will fix how they're displayed.

jeremysherriff · 2024-11-18T01:04:04Z

Ha I am always late to the party! I'm running 0.7.2 so the rate limit (once per minute?) is already in place, so my "testing" will be guaranteed successful :)

My recorder purge is 7 days so by this coming weekend it will be a fair test to see the difference there.

jeremysherriff · 2024-11-18T04:22:57Z

About 25 hours since I purged the state data, now looking much better:

select states_meta.entity_id,count(*) from states left join states_meta on states.metadata_id = states_meta.metadata_id
 where states_meta.entity_id LIKE 'sensor.bermuda_global_%'
 group by entity_id
 order by count(*) desc;

entity_id                                     count(*)
sensor.bermuda_global_total_device_count          1566
sensor.bermuda_global_visible_device_count        1315
sensor.bermuda_global_active_proxy_count           412
sensor.bermuda_global_total_proxy_count             25

The ~152k in the OP was 7 days so approx 21k per day.

With state data being written every 1 minute (1,440 per day) and it being very likely that a change is observed at every interval, it is now trending towards 22k for the 7 days (which is still high; it'll be in the top 10 noisiest sensors but a vast improvement).
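The projection above can be checked with a short calculation. This is a rough sketch that scales the counts linearly from the two query results in this thread; it assumes the "about 25 hours" window is exactly 25 hours, and the helper name project_7_days is made up for illustration. Under that assumption the ~22k figure appears to line up with the four bermuda_global sensors combined rather than with any single sensor.

```python
# Counts copied from the two query results in this thread.
OLD_VISIBLE_COUNT_7_DAYS = 151_901  # visible_device_count over the 7-day purge window
HOURS_SINCE_PURGE = 25              # "about 25 hours", so treat results as approximate

new_counts = {
    "sensor.bermuda_global_total_device_count": 1566,
    "sensor.bermuda_global_visible_device_count": 1315,
    "sensor.bermuda_global_active_proxy_count": 412,
    "sensor.bermuda_global_total_proxy_count": 25,
}


def project_7_days(count, hours):
    """Linearly scale a count observed over `hours` to a 7-day window."""
    return count * (7 * 24) / hours


old_per_day = OLD_VISIBLE_COUNT_7_DAYS / 7
per_sensor_7d = {
    name: project_7_days(count, HOURS_SINCE_PURGE)
    for name, count in new_counts.items()
}
combined_7d = sum(per_sensor_7d.values())
ceiling_7d = 1440 * 7  # one write per minute, per sensor, for 7 days

print(f"old rate, visible_device_count: {old_per_day:,.0f} per day")
for name, projected in per_sensor_7d.items():
    print(f"{name}: {projected:,.0f} projected over 7 days")
print(f"all four combined: {combined_7d:,.0f} projected over 7 days")
print(f"per-sensor ceiling at one write per minute: {ceiling_7d:,} over 7 days")
```

The total_device_count projection comes out slightly above the one-per-minute ceiling, which is most likely just the imprecision of the 25-hour window.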

agittins · 2024-11-21T01:40:33Z

Great, thanks for the follow-up!

I think it's probably worth winding it back further - these sensors are more about gathering a wide-viewed picture of health (and there are more to come!) so perhaps every 5 minutes would be sufficient - possibly with an extra mechanism to force a refresh for any realtime diagnostic needs. Given their purpose it doesn't make a lot of sense for them to be near the top of that list! :-)
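As an illustration of the idea, here is a minimal sketch of a rate limit with a forced-refresh escape hatch. It is not Bermuda's actual implementation: the class name RateLimitedValue, the offer method and the force flag are all hypothetical, and the 300-second interval is simply the "every 5 minutes" suggested above.

```python
import time


class RateLimitedValue:
    """Publish a value at most once per `min_interval` seconds unless forced.

    Hypothetical helper for illustration only.
    """

    def __init__(self, min_interval=300.0, clock=time.monotonic):
        self.min_interval = min_interval
        self._clock = clock          # injectable so the example can fake time
        self._last_publish = None    # time of the last published value
        self.published = None        # last value that was actually published

    def offer(self, value, force=False):
        """Return True if `value` was published, False if it was suppressed."""
        now = self._clock()
        due = (
            self._last_publish is None
            or now - self._last_publish >= self.min_interval
        )
        if not (due or force):
            return False
        self.published = value
        self._last_publish = now
        return True


# Usage with a fake clock: updates inside the 5-minute window are dropped,
# while a forced refresh goes through immediately.
fake_now = [0.0]
sensor = RateLimitedValue(min_interval=300.0, clock=lambda: fake_now[0])

results = []
for second, value in [(0, 10), (60, 11), (299, 12), (300, 13)]:
    fake_now[0] = second
    results.append(sensor.offer(value))

fake_now[0] = 310
forced = sensor.offer(14, force=True)
print(results, forced, sensor.published)
```

At one update per 5 minutes the ceiling is 288 writes per sensor per day, down from 1,440 at one per minute.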

agittins self-assigned this Nov 17, 2024

agittins added the "enhancement" (New feature or request) label Nov 17, 2024
