-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HAOS Core: "Fatal Python error: Segmentation fault" in RaspberryPi4 after 2024.02 upgrade related to threading #110538
Comments
This is too unstable, here is a chart showing how right after the update is installed the restarts start occurring. If I look closer, latest days all reboots happen exactly at around 4hours 10 minutes sin last restart: With same error stack: |
I'm seeing something similar since upgrading from 2024.1 to 2024.2.1 (home-assistant.log.fault.txt). Raspberry Pi 4, Ubuntu 20.04 64-bit, using Home Assistant Container.
|
Might be fixed in 2024.2.2: #110464 (comment) |
That one mentions python 3.11, but my fault stack strace only shows python 3.12. let's see if we're lucky and it gets fixed too but could potentially not be the same. |
I confirm is still an issue in 2024.2.2. I can't believe is just a couple of us?
|
Nope not a couple of us. I have the same issue. I'm running on Proxmox VE 8.1 **Fatal Python error: Segmentation fault Thread 0x00007f1b3d0ffb30 (most recent call first): Thread 0x00007f1b3d71fb30 (most recent call first): Thread 0x00007f1b3de3fb30 (most recent call first): Thread 0x00007f1b3ee46b30 (most recent call first): Thread 0x00007f1b3f133b30 (most recent call first): Thread 0x00007f1b3f246b30 (most recent call first): Thread 0x00007f1b4186cb30 (most recent call first): Thread 0x00007f1b43c7fb30 (most recent call first): Thread 0x00007f1b4537fb30 (most recent call first): Thread 0x00007f1b40c59b30 (most recent call first): Thread 0x00007f1b40d6cb30 (most recent call first): Thread 0x00007f1b4287fb30 (most recent call first): Thread 0x00007f1b46496b30 (most recent call first): Thread 0x00007f1b465a9b30 (most recent call first): Thread 0x00007f1b466bcb30 (most recent call first): Thread 0x00007f1b467cfb30 (most recent call first): Thread 0x00007f1b480efb30 (most recent call first): Thread 0x00007f1b5bb5eb30 (most recent call first): Thread 0x00007f1b48f0fb30 (most recent call first): Thread 0x00007f1b4a519b30 (most recent call first): Thread 0x00007f1b4a919b30 (most recent call first): Thread 0x00007f1b4f1cfb30 (most recent call first): Thread 0x00007f1b5588fb30 (most recent call first): Thread 0x00007f1b5608fb30 (most recent call first): Thread 0x00007f1b4bd2cb30 (most recent call first): Thread 0x00007f1b4c43fb30 (most recent call first): Thread 0x00007f1b4e75cb30 (most recent call first): Thread 0x00007f1b5196fb30 (most recent call first): Thread 0x00007f1b51f8fb30 (most recent call first): Thread 0x00007f1b595afb30 (most recent call first): Thread 0x00007f1b5b2ccb30 (most recent call first): Thread 0x00007f1b5b3e3b30 (most recent call first): Thread 0x00007f1b5b4fab30 (most recent call first): Thread 0x00007f1b5b611b30 (most recent call first): Thread 0x00007f1b5b728b30 (most recent call first): Thread 0x00007f1b5b83fb30 (most recent call first): Thread 0x00007f1b5bd75b30 (most recent call first): Thread 0x00007f1b5be8cb30 (most recent call first): Current thread 0x00007f1b5c69fb30 (most recent call first): Thread 0x00007f1b67d5db38 (most recent call first): Thread 0x00007f1b67f74b38 (most recent call first): Thread 0x00007f1b78c5ab38 (most recent call first): Thread 0x00007f1b7a76fb38 (most recent call first): Thread 0x00007f1b845ffb38 (most recent call first): Thread 0x00007f1b84ffbb38 (most recent call first): Thread 0x00007f1b850ffb38 (most recent call first): Thread 0x00007f1b859ffb38 (most recent call first): Thread 0x00007f1b87b97b38 (most recent call first): Thread 0x00007f1b87cbfb38 (most recent call first): Thread 0x00007f1b8cc01b08 (most recent call first): |
Found my problem. It was the hacs watchman plugin what got screwed. Disabled it and everything is fine now. |
I closed #110464 but after 4days it crashed again, same symptoms -
|
How did you figured out which one it was? |
I had an ! on my device_tracker.yaml file. I checked it and it gave me an error on the file with a line couldn't be empty or only had special characters. I first disabled my ble_tracker and rebooted. Same fault. Then I disabled the watchman plugin and rebooted and fault was gone. It also had a lot of devices in the log and errors that devices where not found. I think it brought my Home Assistant install in a constant error loop and that it got to much so the core got rebooted every now and then. |
Ok, now is clear something is wrong. Now I think I know what it is: iperf integration, which I've set it such scan interval to be that one. Not a coincidence the last log message before crash is precisely a server unavailable. Nothing new that shouldn't crash HA.
|
Hey there @rohankapoorcom , mind taking a look at this issue as it has been labeled with an integration (iperf3) you are listed as a code owner for? Thanks! |
Forget it is iPerf. Mine is still crashing and I'm not using iperf at all. It is definitelly core issue! |
Crap, thanks for the update, will revert the ticket title and integration. Something that bothers me: in my last fault that shows the segmentation fault, it mentions some custom integrations that shouldn't have been enabled as I was in safe boot mode. Another work action I took, I tried to understand if Python 3.12 had any main changes regarding threading and could find there has been important changes https://peps.python.org/pep-0684/ Not sure if HA devs already announced an impact due to this? |
Well, I'm still on Python 3.11, 1st thoughts was that, but I see segfaults on 3.12 here.. Must be related to 2024.02 as with previous release I had weeks of uptime with no issues. Such an error is mostly related to stack memory, buffer overflow, etc., not threading. |
Add me to the list, same problem here. Using 2024.2.2 with Python 3.12. MemTest ran for a couple of hours, no problem detected. Error from supervisor: |
This is my commont part where it ends with segfault (3 crashed in recent 24h)
I suspect orjson ijl/orjson#458 |
afaik ijl/orjson#458 has been an issue since day 1, if you're seeing "new" crashes I think it's more likely to be related to ijl/orjson#457 |
This is most likely due to ijl/orjson#452. The crashes affect all orjson versions starting from 3.9.11. The crash rate was reduced in 3.9.14, but it's apparently still present. The HOC seems to use 3.9.14. Warning! Personal Opinion: |
An short update: after deleting all refresh-tokes, system is up for 8 hours. |
An update from my side. I did this and the system didn't reboot in a full day. I re-added the custom_components folder and performed a normal restart again and the system now is not crashing. I hope it remains like that.
I think this might merit for a new bug ticket:
|
It crashed again. core 2024.2.3 System Information
Home Assistant Community Store
Home Assistant Cloud
Home Assistant Supervisor
Dashboards
Recorder
|
Still an issue in core-2024.2.5 I'm wondering, could it be a Database related issue? I'm using the addon MariaDB Current version: 2.6.1. System Information
Home Assistant Community Store
Home Assistant Cloud
Home Assistant Supervisor
Dashboards
Recorder
|
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. |
The problem
After 2024.02 upgrade, my HA randomly restarts on its own at least 2-3 times per day.
Haven't found a pattern to it. System resources usage seems normal. HA created a .fault log file which I'm attaching from the last occurence today. I'm also adding the log file from before and after the restart; can't see anything odd.
Based on the error seems to be some issue when using the multiprocessing in some core function.
I'm happy to provide more details.
I installed HA on a new SD card back on early December and was running just fine until the previous version, so shouldn't be tear-down related AFAIK.
What version of Home Assistant Core has the issue?
core-2024.2.1
What was the last working version of Home Assistant Core?
core-2024.1.6
What type of installation are you running?
Home Assistant OS
Integration causing the issue
Unknown (iperf3?)
Link to integration documentation on our website
Unknown
Diagnostics information
home-assistant.log
I'm changing the file extensions so that it can be uploaded.
home-assistant.1.log
home-assistant.fault.log
Example YAML snippet
N/A
Anything in the logs that might be useful for us?
Additional information
System Information
Home Assistant Community Store
Home Assistant Cloud
Home Assistant Supervisor
Dashboards
Recorder
The text was updated successfully, but these errors were encountered: