Error updating versions higher than Home Assistant OS 8.0.dev20220321 #1830

Closed
pcartwright81 opened this issue Apr 3, 2022 · 34 comments
Labels
board/ova Open Virtual Appliance (Virtual Machine) bug

Comments

@pcartwright81

pcartwright81 commented Apr 3, 2022

Describe the issue you are experiencing

When I try to update my Hyper-V image to a version higher than Home Assistant OS 8.0.dev20220321, it always gets stuck at the GRUB bootloader. If I select try 0 it says missing; try 1 is the old image. I also tried updating from 7.6 and it failed as well.

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

8.0.dev20220321

Did you upgrade the Operating System.

Yes

Steps to reproduce the issue

  1. Click update for OS
  2. Open Hyper-V

...

Anything in the Supervisor logs that might be useful for us?

No

Anything in the Host logs that might be useful for us?

No

System Health information

System Health

version core-2022.5.0.dev20220403
installation_type Home Assistant OS
dev true
hassio true
docker true
user root
virtualenv false
python_version 3.9.9
os_name Linux
os_version 5.15.25
arch x86_64
timezone America/Chicago
Home Assistant Community Store
GitHub API ok
GitHub Content ok
GitHub Web ok
GitHub API Calls Remaining 4907
Installed Version 1.24.2
Stage running
Available Repositories 1078
Downloaded Repositories 10
Home Assistant Cloud
logged_in true
subscription_expiration March 28, 2023, 8:00 PM
relayer_connected true
remote_enabled true
remote_connected true
alexa_enabled false
google_enabled true
remote_server us-east-1-3.ui.nabu.casa
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok
Home Assistant Supervisor
host_os Home Assistant OS 8.0.dev20220321
update_channel dev
supervisor_version supervisor-2022.03.dev3104
docker_version 20.10.12
disk_total 62.3 GB
disk_used 13.4 GB
healthy true
supported true
board ova
supervisor_api ok
version_api ok
installed_addons Check Home Assistant configuration (3.10.0), Dnsmasq (1.4.4), Duck DNS (1.14.0), Home Assistant Google Drive Backup (0.106.2), NGINX Home Assistant SSL proxy (3.1.1), Samba share (9.5.1), phpMyAdmin (0.7.1), MariaDB (2.4.0), ESPHome (2022.3.1), DHCP server (1.2), Eufy Security Add-on (0.8.4)
Dashboards
dashboards 5
resources 3
views 5
mode storage

Additional information

No response

@agners agners added the board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) label Apr 3, 2022
@agners
Member

agners commented Apr 3, 2022

So 8.0.dev20220321 itself works? Obviously, the change to the GRUB2 bootloader could be the culprit, but 8.0.dev20220321 already uses the GRUB2 bootloader...

If I select try 0 it says missing

What is it missing exactly?

@agners
Member

agners commented Apr 3, 2022

Btw, for Hyper-V the OVA should be the preferred image, as it's made for virtualization environments.

@pcartwright81
Author

pcartwright81 commented Apr 4, 2022

Sorry, yes. The OVA started as hassos_ova-5.10.vhdx. I also tried starting with haos_ova-7.6.vhdx.zip and experienced the same issue. I wrote the error message from memory, so I took screenshots of the GRUB loader and the error. SMART status and chkdsk say my drive is good, and using a new OVA should have fixed the corruption if there was any. I also tried a clean VM and got the same issue.

image
image

@agners agners added board/ova Open Virtual Appliance (Virtual Machine) and removed board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) labels Apr 4, 2022
@agners
Member

agners commented Apr 4, 2022

In a quick try I installed Hyper-V on Windows 11 and was able to upgrade from a new OS 7.6 to 8.0.dev20220331. Which version of Windows are you using? Can you reproduce this every time when upgrading from OS 7.6?

@pcartwright81
Author

Yes, I am able to reproduce every time. Even creating a new VM with 7.6 as base did not work.
My test was probably the same as yours:
  1. Create a Gen2 VM with haos_ova-7.6.vhdx as the drive, and disable secure boot.
  2. Start the VM.
  3. Type OS update.
I am also using Windows 11.
image

@agners
Member

agners commented Apr 4, 2022

Hm, 22H2 seems to be an Insider Preview. Could it be a Hyper-V bug?

I use 21H2 and just updated 7.6 again, this time to 8.0.dev20220404, and it seems to work fine on my system.

@pcartwright81
Author

Could be. After getting the error updating from 7.6 directly to 8.0.dev20220321, I am looking at updates and changes that have happened since then. I am also going to put the OVA on another hard drive to see whether it is some type of corruption that neither chkdsk nor a full SMART diagnostic could find.

@agners
Member

agners commented Apr 5, 2022

Could also be on the read side, e.g. that the new GRUB2 binary triggers some Hyper-V bug which causes the read failure.

@arikhris

arikhris commented Apr 9, 2022

I get the same problem with an Intel NUC. When I updated, it wouldn't boot into Hassio. I plugged it into a monitor and it came up with a GRUB error. I thought it was an SSD failure, so I bought a new one, but still had the same problem. I put 7.6 on the old SSD and it worked straight away.

@nubicula

nubicula commented Apr 9, 2022

I have also had the same on an Intel NUC. I tried twice to upgrade. I then booted with a live image and overwrote the first 5 partitions using the first 3 from the previous image, and the system is back to a working state.

Upgrade attempt to 8.0 rc1

@agners
Member

agners commented Apr 11, 2022

@arikhris @nubicula which version did you try upgrading to?

@arikhris

@arikhris @nubicula which version did you try upgrading to?

8 rc1, I think. I also downloaded the image and put it on a new SSD, which gave me the same error.

@agners
Member

agners commented Apr 12, 2022

@arikhris @nubicula what operating system, exact versions and Hypervisor are you using?

I tried again to reproduce this here, unfortunately without success. I need exact configurations and steps to reproduce the issue on my end; otherwise I won't be able to address this.

@nubicula

nubicula commented Apr 12, 2022 via email

@pcartwright81
Author

pcartwright81 commented Apr 12, 2022

I am still the exact opposite, and can reproduce it every time.
This is my process:
  1. Create a new VM using the instructions from https://www.home-assistant.io/installation/windows on the Hyper-V tab.
  2. Disable secure boot.
  3. Boot the machine and let it sit for 5-10 minutes.
  4. Open the shell.
  5. supervisor options --channel dev
  6. supervisor reload
  7. OS Update
The good news is 2 systems have the same partitions, but I do not know where to start troubleshooting.
Is there a way to disable the reboot after OS update? I think this is a write issue and not a read issue.
Just tried 8.0RC1 to latest dev version and it failed also.
Slot A Both
Slot B Both

@agners
Member

agners commented Apr 12, 2022

Hello, I’ve been using the standard generic x86-64 Home Assistant image directly on the NUC. Currently running 7.6. The system has been upgraded several times, so I cannot remember the original install version. It would have been built using the version at the time, but from the equivalent versioned URL: https://github.com/home-assistant/operating-system/releases/download/7.6/haos_generic-x86-64-7.6.img.xz

Hm, I do have an Intel NUC Gen11 here which has upgraded to 8.0.rc1 fine.

@nubicula was the failing sector number the same (0x149c9)? Can you share the exact model number and hard drive you are using on your Intel NUC?

@agners
Member

agners commented Apr 12, 2022

The good news is 2 systems have the same partitions, but I do not know where to start troubleshooting.

Not sure if it's possible, but can you downgrade Windows 11 back to a stable (non-Insider) version?

Is there a way to disable the reboot after OS update? I think this is a write issue and not a read issue.
Just tried 8.0RC1 to latest dev version and it failed also.

You can manually execute the update using the rauc command. E.g. use the following commands:

cd /mnt/data
curl -L -O https://github.com/home-assistant/operating-system/releases/download/8.0.rc1/haos_ova-8.0.rc1.raucb
rauc install haos_ova-8.0.rc1.raucb
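
For other release tags or boards, the same three commands can be generated rather than typed by hand. The following is a minimal Python sketch, assuming that tagged releases keep the asset naming seen in the URLs in this thread (`haos_<board>-<version>.raucb`); `rauc_bundle_url` and `manual_update_commands` are hypothetical helper names, and the commands are only printed, not executed.

```python
# Base URL taken from the download links quoted in this thread.
RELEASE_BASE = "https://github.com/home-assistant/operating-system/releases/download"


def rauc_bundle_url(version, board="ova"):
    """Build the download URL of a RAUC bundle for a release tag and board.

    Assumption: the asset is named haos_<board>-<version>.raucb.
    """
    return f"{RELEASE_BASE}/{version}/haos_{board}-{version}.raucb"


def manual_update_commands(version, board="ova"):
    """Return the manual update commands shown above for one version."""
    url = rauc_bundle_url(version, board)
    bundle = url.rsplit("/", 1)[-1]  # file name that curl -O will write
    return [
        "cd /mnt/data",
        f"curl -L -O {url}",
        f"rauc install {bundle}",
    ]


if __name__ == "__main__":
    print("\n".join(manual_update_commands("8.0.rc1")))
```

This pattern is only expected to hold for tagged releases; development builds are hosted separately.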

@pcartwright81
Author

Tried on Windows 10 with Configuration Version 9 and it works great. Exported the machine and imported into Windows 11 and had the same boot issue. Also on Windows 11 using rauc the update is successful.
RAUC

@nubicula

nubicula commented Apr 16, 2022

I tried upgrading to 8.0 rc2 this morning and these are the screenshots :

image

image

This is running on a NUC7i7DNB board which is in model NUC7i7DNKE system, with a Samsung SSD 970 Evo Plus.

@derekcannady

I am receiving the exact same error code on updating to 8.0rc1. Of note, I am also using a Samsung SSD 970 Evo Plus, with the same failed sector on hd0.

@agners
Member

agners commented Apr 19, 2022

@derekcannady

I am also using a Samsung SSD 970 Evo Plus with same failed sector on hd0

The same as seen in which report? From what I can see, the OP reported errors on sector 0x149c0, then we have 0x9c900 and 0x12a00 from @nubicula.

@nubicula do those values change on every boot in your case?

@pcartwright81 how is it in your case: is it the same sector on every boot?

It seems that GRUB2 has problems reading some sectors on those systems. I assume that all systems did work with HAOS 7.x, so there doesn't seem to be a "hardware" problem per se, and it's likely a software bug. That surprises me a bit since GRUB2 is the most widely used boot loader.
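
One way to test the "not a hardware problem" assumption is to read a reported sector from a running Linux system (for example a live image) instead of from GRUB. The following is a minimal Python sketch, assuming 512-byte logical sectors and a readable raw device or disk image; `read_sector` is a hypothetical helper, not part of any Home Assistant tooling.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def read_sector(path, sector, sector_size=SECTOR_SIZE):
    """Read one sector from a raw disk device or disk image file.

    Raises OSError if the underlying read fails, and ValueError if the
    sector lies beyond the end of the device or file.
    """
    with open(path, "rb") as disk:
        disk.seek(sector * sector_size)
        data = disk.read(sector_size)
    if len(data) != sector_size:
        raise ValueError(f"sector {sector:#x} is out of range for {path}")
    return data
```

For example, `read_sector("/dev/sda", 0x149c0)` (the device path is only an example, and reading it usually requires root) returns 512 bytes if the operating system can read the sector that GRUB rejected. A successful read would point away from the drive itself.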

agners added a commit to agners/operating-system that referenced this issue Apr 20, 2022
@agners
Member

agners commented Apr 20, 2022

One thing I've noticed is that all sectors reported here are in the kernel squashfs partition (0x12a00 and 0x149c0 are in the first squashfs partition /dev/sda2, 0x9c900 is in the second squashfs partition /dev/sda4).

Device       Start       End   Sectors   Size Type
/dev/sda1     2048     67583     65536    32M EFI System
/dev/sda2    67584    116735     49152    24M Linux filesystem
/dev/sda3   116736    641023    524288   256M Linux filesystem
/dev/sda4   641024    690175     49152    24M Linux filesystem
/dev/sda5   690176   1214463    524288   256M Linux filesystem
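
The mapping from reported sectors to partitions can be checked with a few lines of Python. This is a minimal sketch, assuming the sector numbers printed by GRUB are absolute sector numbers in the same units as the Start and End columns of the listing above; `PARTITIONS` is copied from that listing and `find_partition` is a hypothetical helper name.

```python
# Partition layout copied from the listing above: (device, start, end),
# where start and end are inclusive sector numbers.
PARTITIONS = [
    ("/dev/sda1", 2048, 67583),
    ("/dev/sda2", 67584, 116735),
    ("/dev/sda3", 116736, 641023),
    ("/dev/sda4", 641024, 690175),
    ("/dev/sda5", 690176, 1214463),
]


def find_partition(sector):
    """Return the device whose sector range contains `sector`, or None."""
    for device, start, end in PARTITIONS:
        if start <= sector <= end:
            return device
    return None


if __name__ == "__main__":
    # Sector numbers reported in this thread, as printed by GRUB (hexadecimal).
    for reported in ("0x12a00", "0x149c0", "0x9c900"):
        sector = int(reported, 16)
        print(f"{reported} = sector {sector}: {find_partition(sector)}")
```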

We use a compressed squashfs file system, which might not be as well tested in GRUB2. I found a single bugfix in the newest GRUB2 code in that area and added it with #1858. This is a rather long shot, but who knows 😅

I'll create a new development build today.

@agners
Member

agners commented Apr 20, 2022

The development build is available on the development channel and downloadable from here:
https://os-builds.home-assistant.io/8.0.dev20220420/

Let me know if that fixes it for any one of you!

@adonno

adonno commented Apr 29, 2022

Just tagging along: I installed the most recent beta (I think it's a beta since I'm on the beta channel; I think it was rc3). Anyway, I'm following this thread since I'm having the same issue:

error reading sector 0xa01c0 from 'hd0'

@pcartwright81
Author

I deleted the previous vhdx that was having issues, and downloaded haos_ova-8.0.dev20220414 which boots slot A.
image
I have tried updating to 8.0.dev20220420 and 8.0.dev20220427, which should boot as slot B. I tried twice, and sector 0x9c900 was the one having issues.
image

@agners
Member

agners commented Apr 29, 2022

@pcartwright81 it is interesting that you can reproduce this on Hyper-V but I don't see it here, despite the same configuration (like you I set it up as documented https://www.home-assistant.io/installation/windows).

What kind of host system are you using?

@pcartwright81
Author

X570-A PRO board
AMD Ryzen 7 3700X, 4 processors dedicated to the VM
32 GB DDR4, 2 GB dedicated to the VM
Windows 11 22H2
VM Version 11
Main drive WDS100T1X0E-00AFY0

@arikhris

arikhris commented Apr 29, 2022 via email

@agners
Member

agners commented Apr 29, 2022

@adonno and I tried with GRUB2 2.06 from this Debian package; however, GRUB showed the same read error. So it really looks like a bug in upstream GRUB2. As soon as I have access to a system (probably sometime next week), I'll try to get to the bottom of it.

agners added a commit to agners/operating-system that referenced this issue May 5, 2022
@agners
Member

agners commented May 6, 2022

With 8.0.rc4, GRUB2 now has a bug fix which should make it boot on all affected systems as well. Please speak up if 8.0.rc4 does not work for you!

Thanks @nubicula for your help to get this fixed!

@agners agners closed this as completed May 6, 2022
@derekcannady

Still having the same issue as previously mentioned.

@agners
Member

agners commented May 9, 2022

@derekcannady using HAOS 8.0.rc4? What system are you using and how is your disk drive attached?

@agners
Member

agners commented May 9, 2022

@pcartwright81 can you check if the latest release 8.0.rc4 works for you?

@pcartwright81
Author

Home Assistant OS 8.0.dev20220505 works. 8.0.rc4 also works, installed via rauc.

6 participants