GRUB failing to load kernel on Intel Atom boards (Intel NM10 chipset) #3305

Closed
HAPSagan opened this issue Apr 12, 2024 · 118 comments · Fixed by #3324
Labels
board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) bug

Comments

@HAPSagan

Describe the issue you are experiencing

I see GNU GRUB with 4 options - Slot A, Slot B, Slot A rescue shell, Slot B rescue shell.
Selecting any of them results in a message that it's unable to boot.
I don't get any CLI options.
I used Linux Reader to download the backups from the disk and then tried a fresh installation with OS 12.2. The result is the same as after the update: unable to boot from any slot. After that I did a fresh install with OS 12.1 and everything started fine again.

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

12.1

Did you upgrade the Operating System?

Yes

Steps to reproduce the issue

  1. Do a fresh install with OS 12.2 or update to OS 12.2 and it's unable to boot.

...

Anything in the Supervisor logs that might be useful for us?

Can't get a log.

Anything in the Host logs that might be useful for us?

Can't get a log.

System information

No response

Additional information

No response

@HAPSagan HAPSagan added the bug label Apr 12, 2024
@ahmetem

ahmetem commented Apr 12, 2024

Same problem. HA OS doesn't boot after updating from 12.1 to 12.2.

@Mpgod80

Mpgod80 commented Apr 12, 2024

Same issue here. After going from 12.1 to 12.2, I can't boot.

@Onepamopa

Same here... When the VM reboots after the update - it selects "Slot B" which has "unknown filesystem".
I have to manually select "Slot A" to boot - it boots 12.1 w/o an update.

@agners
Member

agners commented Apr 12, 2024

@HAPSagan @ahmetem @Mpgod80 you all are running on native x86-64 hardware? What hardware are you using?

@Onepamopa can you open a new issue along with information about your virtualization environment? It also seems yours behaved differently, as the old boot slot still worked (unlike the case OP reported).

@agners agners added the board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) label Apr 12, 2024
@ahmetem

ahmetem commented Apr 12, 2024

The specs are below. It was working smoothly until the last update.
I tried a fresh installation with OS 12.2, but the result did not change.
Slot A, Slot B, Slot A rescue shell, Slot B rescue shell: none of them work.

MacBook Air (11-inch, Mid 2011) a1370
64GB flash storage
1.6GHz dual-core Intel Core i5
2GB of 1333MHz DDR3 onboard memory
Advanced Intel HD Graphics 3000.

@agners
Member

agners commented Apr 12, 2024

On a new installation, what happens exactly when you choose Slot A with HAOS 12.2? Anything written on the screen? Black screen? Reset?

@Mpgod80

Mpgod80 commented Apr 12, 2024

@agners
The hardware both HAPSagan and I are running on is:

Intel Atom CPU D2500 @ 1.86 GHz, 4.0 GB RAM, 60 GB SSD.
Running the HAOS image directly on the SSD, flashed with balenaEtcher.

@ahmetem

ahmetem commented Apr 12, 2024

When I try a new installation, the same list appears:
Slot A (OK=0 TRY=0)
Slot B (OK=0 TRY=0)
After a selection is made, a line flashes on the screen.

@Onepamopa

This comment was marked as off-topic.

@henrikcaesar

Same here 😱 Also running an old Intel Atom with the CPU integrated on the motherboard. Can’t get any info out of the system. If it would help, I can get a live Linux USB.

@ahmetem

ahmetem commented Apr 12, 2024

I booted HA with a Super Grub2 Disk image on a USB disk, listed all boot entries, and selected Slot A. It started normally, but I don't know how to make this permanent or fix it.
I think reinstalling GRUB, or maybe downgrading to 12.1, would fix it.

@coc

coc commented Apr 12, 2024

Same problem on a ThinkCentre m93p.

@SellSan

SellSan commented Apr 13, 2024

I have an HP T610 and the same issue. The only thing I can do is run commands from this window and edit the parameters of these 4 options.
[photo: 20240413_085029]

@Onepamopa

This comment was marked as off-topic.

@Mpgod80

Mpgod80 commented Apr 13, 2024

> > I have an HP T630 and the same issue. The only thing I can do is run commands from this window and edit the parameters of these 4 options. [photo: 20240413_085029]
>
> I have the same screen but when I select "Slot A" it boots into 12.1. Have you tried?

Tried that, but it won't work for me :/

@SellSan

SellSan commented Apr 13, 2024

> > I have an HP T630 and the same issue. The only thing I can do is run commands from this window and edit the parameters of these 4 options. [photo: 20240413_085029]
>
> I have the same screen but when I select "Slot A" it boots into 12.1. Have you tried?

I tried, none of the options worked...

@JitteryDoodle

I have an Intel NUC D54250WYK and also experienced this issue when upgrading to 12.2. Luckily, selecting Slot A booted back into 12.1.

@Omri-Kleynhans

Even downloading haos_generic-x86-64-12.2.img and creating a new installation does not work. Booting is not possible. Same error as after the upgrade from 12.1 to 12.2.

@Onepamopa

This comment was marked as off-topic.

@agners
Member

agners commented Apr 13, 2024

It sounds like a GRUB2 issue of some kind. That got updated from GRUB 2.06 to 2.12 as far as I can tell.

FWIW, the rc (and release) versions have been tested and are working fine on various Intel NUC systems. It seems to be particular hardware/BIOS that causes the issue.

It kinda reminds me of that bug fix we added in 8.0.rc4 (#1830 (comment)), but it seems that is applied upstream in GRUB 2.12 today, so that should no longer be the problem 🤔

Can you replace the GRUB binary (on the first partition of the disk at /EFI/BOOT/bootx64.efi) with the version from HAOS 12.1 and see if this boots?

@Omri-Kleynhans

Omri-Kleynhans commented Apr 13, 2024

> It sounds like a GRUB2 issue of some kind. That got updated from GRUB 2.06 to 2.12 as far as I can tell.
>
> FWIW, the rc (and release) versions have been tested and are working fine on various Intel NUC systems. It seems to be particular hardware/BIOS that causes the issue.
>
> It kinda reminds me of that bug fix we added in 8.0.rc4 (#1830 (comment)), but it seems that is applied upstream in GRUB 2.12 today, so that should no longer be the problem 🤔
>
> Can you replace the GRUB binary (on the first partition of the disk at /EFI/BOOT/bootx64.efi) with the version from HAOS 12.1 and see if this boots?

Yes, I have done that and replaced all files in `/EFI/BOOT/`. It works. The server can boot again and HA is back up. Thanks.

@SellSan

SellSan commented Apr 13, 2024

> > It sounds like a GRUB2 issue of some kind. That got updated from GRUB 2.06 to 2.12 as far as I can tell.
> > FWIW, the rc (and release) versions have been tested and are working fine on various Intel NUC systems. It seems to be particular hardware/BIOS that causes the issue.
> > It kinda reminds me of that bug fix we added in 8.0.rc4 (#1830 (comment)), but it seems that is applied upstream in GRUB 2.12 today, so that should no longer be the problem 🤔
> > Can you replace the GRUB binary (on the first partition of the disk at /EFI/BOOT/bootx64.efi) with the version from HAOS 12.1 and see if this boots?
>
> Yes, I have done that and replaced all files in `/EFI/BOOT/`. It works. The server can boot again and HA is back up. Thanks.

Hi, could you share where I can download that file from?

@pierrepaap

pierrepaap commented Apr 13, 2024

Same issue.
Hardware: Intel D525MW motherboard (BIOS from 2010).
A bit more history here: https://community.home-assistant.io/t/os-12-2-upgrade-left-ha-on-grub-menu-unbootable/715974/17

I'm surprised, though, that the 'backup' entry also does not start. I would assume that the EFI file would still be from 12.1?

@Onepamopa

This comment was marked as off-topic.

@pierrepaap

pierrepaap commented Apr 14, 2024

> So... any viable solution to update to 12.2? Or should we wait for 12.3?

Well... after replacing the GRUB EFI binary with the one from the 12.1 image, I am now in 12.2.
Obviously a hassle, so I'm going to stay like this until this bug is fixed.

Core 2024.4.2
Supervisor 2024.04.0
Operating System 12.2
Frontend 20240404.1

EDIT: corrected typo 'I am now in 12.2'

@Onepamopa

This comment was marked as off-topic.

@pierrepaap

> > > So... any viable solution to update to 12.2 ? Or we should wait for 12.3..
> >
> > well... after replacing the uboot efi with the one of the 12.1 image I am not in 12.2 Obivously a hassle. So 'im going to stay like this until this bug is fixed
> >
> > Core 2024.4.2
> > Supervisor 2024.04.0
> > Operating System 12.2
> > Frontend 20240404.1
>
> Ugh.... it says you're on 12.2 ?

typo... I am NOW in 12.2

@agners
Member

agners commented Apr 15, 2024

> I'm surprised though that the 'backup' entry also does not start. I would assume that the EFI file would still be from 12.1 ?

We only ship a single boot loader. In a way, we rely on the bootloader to implement the backup boot method. But if that new bootloader fails in some way, it is essentially game over 😢

> Hi, could you share where from I can download that file?

These are the two GRUB boot loader files from HAOS 12.1 (replace the existing ones in /EFI/BOOT on the first boot partition with these files):
grub-2.06-haos-12.1.zip

Are the affected UEFI BIOSes maybe 32-bit? E.g. can someone check whether replacing only bootia32.efi helps (so maybe this is related to #1752)?
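
One way to answer the 32-bit question from a live Linux USB booted in UEFI mode is to read the firmware word size that the kernel exposes in sysfs. A minimal Python sketch, assuming the running kernel provides /sys/firmware/efi/fw_platform_size; the function name is hypothetical, and the file-name mapping follows the usual UEFI removable-media naming on x86:

```python
from pathlib import Path

# Default fallback boot loader names on x86, keyed by firmware word size.
FALLBACK_LOADER = {32: "bootia32.efi", 64: "bootx64.efi"}


def firmware_bits(sysfs_file="/sys/firmware/efi/fw_platform_size"):
    """Return 32 or 64 for UEFI firmware, or None if the file is absent."""
    path = Path(sysfs_file)
    if not path.exists():
        # Legacy BIOS boot, or a kernel that does not expose this file.
        return None
    return int(path.read_text().strip())


if __name__ == "__main__":
    bits = firmware_bits()
    if bits is None:
        print("No EFI firmware information found (not booted in UEFI mode?)")
    else:
        loader = FALLBACK_LOADER.get(bits, "unknown")
        print(f"{bits}-bit UEFI firmware, expected loader: /EFI/BOOT/{loader}")
```

A 32-bit result would suggest that bootia32.efi is the file the firmware actually loads on that machine.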

@sairon
Member

sairon commented Aug 14, 2024

@ChernyaevAN Right, I know what's wrong 🤦 I will create a patch for that, however, that means the original patch only works on D525 😬 Unfortunately, no one with the affected boards answered my call for testing or tried the RC builds of 13.0.

sairon added a commit that referenced this issue Aug 14, 2024
Report from @ChernyaevAN in [1] revealed it's 13829424153406670433 in decimal,
which means the endianness was wrong for the CPUs I didn't have available for
testing.

[1] #3305 (comment)
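
To illustrate why byte order matters for this value: SMBIOS stores the Processor ID (type 4, offset 8) as eight raw bytes, and on x86 the first four hold the CPUID EAX signature while the last four hold the EDX feature flags. A minimal Python sketch, assuming GRUB's smbios --get-qword prints those bytes read as a little-endian unsigned integer; the helper names are hypothetical and this is not the code from the patch:

```python
import struct


def split_processor_id(qword):
    """Split a little-endian SMBIOS Processor ID qword into (EAX, EDX)."""
    raw = struct.pack("<Q", qword)        # the 8 bytes as stored in the table
    eax, edx = struct.unpack("<II", raw)  # first dword: signature, second: flags
    return eax, edx


def byte_swapped(qword):
    """Reinterpret the same 8 bytes in the opposite (big-endian) order."""
    return struct.unpack(">Q", struct.pack("<Q", qword))[0]


if __name__ == "__main__":
    reported = 13829424153406670433  # value quoted in the commit message
    eax, edx = split_processor_id(reported)
    print(f"qword     = {reported:#018x}")
    print(f"signature = {eax:#010x}, feature flags = {edx:#010x}")
    print(f"swapped   = {byte_swapped(reported):#018x}")
```

A comparison written against the byte-swapped form of the ID would never match the value GRUB reports, which is the kind of mismatch the commit message describes.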
@ChernyaevAN

> @ChernyaevAN Right, I know what's wrong 🤦 I will create a patch for that, however, that means the original patch only works on D525 😬 Unfortunately, no one with the affected boards answered my call for testing or tried the RC builds of 13.0.

Sorry, but I'm not advanced enough to try RCs.
Do I understand correctly that I just need to wait for the patch and instructions?

@sairon
Member

sairon commented Aug 14, 2024

@ChernyaevAN It's a bit more complicated, since the current GRUB installed on your machine is faulty. You will need to connect the drive to a different PC, or boot any live USB distro on your Atom, and copy the files from the following archive to overwrite those in the /EFI/BOOT folder on the boot partition of HAOS.

grub2-nm10-fixed.zip

Sorry for the complications 😢
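
The manual fix described above can be scripted from a live Linux system once the HAOS boot partition is mounted. A minimal Python sketch, assuming the archive has already been extracted; the function name and the example paths are hypothetical and need adjusting to the actual setup:

```python
import shutil
from pathlib import Path


def replace_boot_files(extracted_dir, efi_boot_dir, backup_suffix=".bak"):
    """Copy every .efi file from extracted_dir into efi_boot_dir.

    An existing file of the same name is first copied to
    <name><backup_suffix> so the previous boot loader can be restored.
    Returns the list of file names that were written.
    """
    src = Path(extracted_dir)
    dst = Path(efi_boot_dir)
    if not dst.is_dir():
        raise FileNotFoundError(f"{dst} not found; is the boot partition mounted?")
    written = []
    for new_file in sorted(src.iterdir()):
        # Match the extension case-insensitively (BOOTX64.EFI vs bootx64.efi).
        if not new_file.is_file() or new_file.suffix.lower() != ".efi":
            continue
        target = dst / new_file.name
        if target.exists():
            shutil.copyfile(target, target.with_name(target.name + backup_suffix))
        # copyfile copies contents only, which avoids metadata errors on FAT.
        shutil.copyfile(new_file, target)
        written.append(new_file.name)
    return written


# Example call (hypothetical paths; run as root from a live Linux system):
# replace_boot_files("/tmp/grub2-nm10-fixed", "/mnt/hassos-boot/EFI/BOOT")
```

Keeping a backup copy next to each replaced file makes it possible to revert if the replacement boot loader does not start either.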

@ChernyaevAN

> @ChernyaevAN It's a bit more complicated, since the current GRUB installed on your machine is faulty. You will need to connect the drive to a different PC, or boot any live USB distro on your Atom, and copy the files from the following archive to overwrite those in the /EFI/BOOT folder on the boot partition of HAOS.
>
> grub2-nm10-fixed.zip
>
> Sorry for the complications 😢

That's not a problem. I will report back after installation.

@ahmetem

ahmetem commented Aug 14, 2024

The same problem occurred with 13.0. There was no problem with version 12.4.
I booted HA with a Super Grub2 Disk image on a USB disk, listed all boot entries, and selected Slot A. It started normally.
After downgrading to 12.4, it boots normally again.

MacBook Air (11-inch, Mid 2011) a1370
64GB flash storage
1.6GHz dual-core Intel Core i5
2GB of 1333MHz DDR3 onboard memory
Advanced Intel HD Graphics 3000.

@ChernyaevAN

> @ChernyaevAN It's a bit more complicated, since the current GRUB installed on your machine is faulty. You will need to connect the drive to a different PC, or boot any live USB distro on your Atom, and copy the files from the following archive to overwrite those in the /EFI/BOOT folder on the boot partition of HAOS.
>
> grub2-nm10-fixed.zip
>
> Sorry for the complications 😢

It works after patching, but it offers me an upgrade. Do I just need to wait for the new operating system version?

@sairon
Member

sairon commented Aug 14, 2024

@ChernyaevAN It should be safe to select the other boot slot with OS 13.0 in GRUB boot menu, or run ha os boot-slot other. But definitely don't upgrade to OS 13.0 again, wait at least for 13.1.rc1 which should fix that.

@ahmetem Can you please also follow the instructions above to get the processor ID?

@randallzapata

I am able to get into grub, but I can't get HAOS to boot linux.

I installed
https://os-artifacts.home-assistant.io/13.1.dev20240816/haos_generic-x86-64-13.1.dev20240816.img.xz

Also, I tried the grub patch, and it looked the same as what I have.

I am running on an AtomMan X7 Ti with an Intel® Core™ Ultra 9 processor 185H.

smbios --type 4 --get-qword 8
13829424153407129252

When I am in debug mode, this is what I get.

loader/efi/linux.c:218:linux: * linux command line:
BOOT_IMAGE=(hd0,gpt2)/bzImage root=PARTUUID=8d3d53a9-6d49-4c38-8349-aff6859e82fd rootwait zram.enabled=1
zram.num_devices=3 systemd.machine_id= fsck.repair=yes
systemd.condition=first-boot=true systemd.firstboot=tty0 rauc.slot=A debug'
loader/efi/linux.c:238:linux: starting image 0x5e9fad18

It just does nothing after this.

@sairon
Member

sairon commented Aug 20, 2024

@randallzapata It's a modern system, so I don't expect it to have problems with the EFI kernel loader. It looks rather like this issue, you can try disabling VT-d in the BIOS settings too. Let's follow up there if it helps, or please create a new issue for your system.

@ChernyaevAN

I have updated to HAOS 13.1. Everything works fine. Thank you.

@VerbruggenBart

Just wanted to let you know I still had this problem with the latest update on a Lenovo ThinkPad 11e (Type 20D9, 20DA).

I used the download above (grub-2.06-haos-12.1.zip) together with a Debian live USB stick to copy both files to the BOOT directory, which seems to have fixed it again (temporarily)...

@pitzer

pitzer commented Sep 11, 2024

I also ran into this issue on my 2011 Mac Mini running Generic x86-64 when updating HAOS a week ago. It runs a 2.3GHz dual-core Intel Core i5 CPU. Reverting grub via bootable USB solved the issue. Thank you for the pointers in this thread!

@agners
Member

agners commented Sep 12, 2024

@VerbruggenBart hm, looking at this datasheet, that seems to be a Celeron CPU from the same era. Can you run the commands @sairon pointed out in #3305 (comment)?

@sairon
Member

sairon commented Sep 12, 2024

@pitzer Could you please also send me the CPU ID reported on the Mac Mini, using the smbios command from the above post? It is probably the same platform that @ahmetem reported earlier as having issues, but no one has shared the required information so far.

@ahmetem

ahmetem commented Sep 17, 2024

@sairon I tried smbios --type 4 --get-qword 8 but it gives an error:
error: can't find command 'smbios'. Maybe the situation is different on a Mac.

@agners
Member

agners commented Sep 18, 2024

@ahmetem did you run the command from the GRUB2 command line? Note you need to install the HAOS 13.0 version to get the smbios command.

@ahmetem

ahmetem commented Sep 18, 2024

Yes, I ran it from the GRUB2 command line, but on HAOS 12.4. I will update and try again.

@ahmetem

ahmetem commented Sep 19, 2024

@sairon smbios --type 4 --get-qword 8
13829424153406604967

sairon added a commit that referenced this issue Sep 25, 2024
Fix loading issues on this Intel-based platform as well. As described in the
patch commit message, there will likely be a bigger collateral effect from
referring just to the CPU ID, but it shouldn't have major detrimental effects.

[1] #3305 (comment)
@KnzHz

KnzHz commented Sep 27, 2024

I had this issue updating to 13.1 last week on a Dell Wyse 5070. Using the rescue files mentioned here fixed it. Do you need any data from me, or can I safely try to upgrade again?

I tried to go from 12.4 to 13.1.

@okh-mzny

I too have a broken UEFI platform that doesn't load the kernel.
HP T640 Thin Client with Ryzen Embedded R1505G CPU.

grub> smbios --type 4 --get-qword 8
1696726757278814081

@MYF540

MYF540 commented Nov 20, 2024

Same here, Lenovo ThinkCentre M715q with a Ryzen 5 2400GE

grub> smbios --type 4 --get-qword 8
1696726757278813968
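
For reference, the values reported in this thread can be decoded into CPUID family, model, and stepping. A minimal Python sketch, assuming the low 32 bits of each qword are the CPUID leaf 1 EAX signature and using the standard x86 rules for extended family and model; the function name is hypothetical:

```python
def decode_signature(qword):
    """Return (family, model, stepping) from an SMBIOS Processor ID qword."""
    eax = qword & 0xFFFFFFFF  # low dword: CPUID leaf 1 EAX
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    # The extended family is added only when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model applies to base families 6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


# Values as posted in this thread, keyed by who reported them.
REPORTED = {
    "ChernyaevAN (commit message)": 13829424153406670433,
    "ahmetem": 13829424153406604967,
    "randallzapata": 13829424153407129252,
    "okh-mzny": 1696726757278814081,
    "MYF540": 1696726757278813968,
}

if __name__ == "__main__":
    for reporter, value in REPORTED.items():
        family, model, stepping = decode_signature(value)
        print(f"{reporter}: family {family:#x}, model {model:#x}, stepping {stepping}")
```

Decoding the IDs this way makes it easier to see which reports come from the same CPU generation and which are unrelated platforms.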
