KDE/SDDM fails to start on NVIDIA proprietary driver v560.35.03 + Kernel 6.11.0 (Could not initialize egl/EGL not available) #344167

Closed
opl- opened this issue Sep 24, 2024 · 11 comments

opl- commented Sep 24, 2024

Updating NixOS to nixpkgs c04d565 results in SDDM crashing on startup with "Could not initialize egl" and "EGL not available" errors logged in the journal.

Additional context

nixpkgs: c04d565
Kernel: v6.11.0
NVIDIA driver: v560.35.03 (crashes with both open and non-open kernel module)
KDE: v6.1.5 (wayland)
dGPU: NVIDIA RTX 3070 Ti Laptop

Previous working generation was running nixpkgs c374d94 (Linux kernel v6.10.6 with the beta v560.31.02 NVIDIA driver).

# configuration.nix
boot.kernelPackages = pkgs.linuxPackages_latest;
services.xserver.videoDrivers = [ "nvidia" ];
hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.beta;
hardware.nvidia.modesetting.enable = true;
hardware.nvidia.open = true; # crashes with either true or false
hardware.nvidia.powerManagement.enable = true;
hardware.nvidia.powerManagement.finegrained = false;
hardware.nvidia.prime.sync.enable = true;

sudo journalctl -b -1 | grep sddm

The output is nearly identical with the open and non-open kernel modules; the only difference is that the HDMI-A-1 display is named unknown.

sddm[1664]: Greeter session started successfully
sddm-helper-start-wayland[1874]: Starting Wayland process "/nix/store/yxy38krm4jpq9f4xbb3i31bszyp5dvv3-kwin-6.1.5/bin/kwin_wayland --no-global-shortcuts --no-kactivities --no-lockscreen --locale1" "sddm"
sddm-helper-start-wayland[1874]: started succesfully "/nix/store/yxy38krm4jpq9f4xbb3i31bszyp5dvv3-kwin-6.1.5/bin/kwin_wayland --no-global-shortcuts --no-kactivities --no-lockscreen --locale1"
sddm-helper-start-wayland[1874]: "No backend specified, automatically choosing drm\n"
sddm-helper-start-wayland[1874]: Directory "/run/user/175" has changed, checking for Wayland socket
sddm-helper-start-wayland[1874]: Found Wayland socket "/run/user/175/wayland-0"
sddm-helper-start-wayland[1874]: "Accepting client connections on sockets: QList(\"wayland-0\")\n"
sddm-greeter-qt6[1893]: High-DPI autoscaling Enabled
sddm-helper-start-wayland[1874]: "\"applications.menu\"  not found in  QList(\"/run/current-system/sw/etc/xdg/menus\")\n"
sddm-helper-start-wayland[1874]: "kwin_scene_opengl: Creating the OpenGL rendering failed:  \"Could not initialize egl\"\n"
sddm-greeter-qt6[1893]: Reading from "/nix/store/7j5hgwyngfx5vpdkyh29ar8bzg43xdip-desktops/share/wayland-sessions/plasma.desktop"
sddm-greeter-qt6[1893]: Reading from "/nix/store/7j5hgwyngfx5vpdkyh29ar8bzg43xdip-desktops/share/xsessions/plasmax11.desktop"
sddm-greeter-qt6[1893]: Loading theme configuration from "/run/current-system/sw/share/sddm/themes/breeze/theme.conf"
sddm-greeter-qt6[1893]: Connected to the daemon.
sddm[1664]: Message received from greeter: Connect
sddm-greeter-qt6[1893]: EGL not available
sddm-greeter-qt6[1893]: Loading file:///run/current-system/sw/share/sddm/themes/breeze/Main.qml...
sddm-greeter-qt6[1893]: failed to acquire GL context to resolve capabilities, using defaults..
sddm-greeter-qt6[1893]: Adding view for "HDMI-A-1" QRect(800,0 2048x1152)
sddm-greeter-qt6[1893]: Loading file:///run/current-system/sw/share/sddm/themes/breeze/Main.qml...
sddm-greeter-qt6[1893]: failed to acquire GL context to resolve capabilities, using defaults..
sddm-greeter-qt6[1893]: Adding view for "eDP-2" QRect(2848,0 1707x1067)
sddm-greeter-qt6[1893]: Loading file:///run/current-system/sw/share/sddm/themes/breeze/Main.qml...
sddm-greeter-qt6[1893]: failed to acquire GL context to resolve capabilities, using defaults..
sddm-greeter-qt6[1893]: Adding view for "Unknown-1" QRect(0,0 800x600)
sddm-greeter-qt6[1893]: Message received from daemon: Capabilities
sddm-greeter-qt6[1893]: Message received from daemon: HostName
sddm-greeter-qt6[1893]: QRhiGles2: Failed to create temporary context
sddm-greeter-qt6[1893]: QRhiGles2: Failed to create context
sddm-greeter-qt6[1893]: Failed to create RHI (backend 2)
sddm-greeter-qt6[1893]: Failed to initialize graphics backend for OpenGL.
systemd-coredump[2002]: Process 1893 (sddm-greeter-qt) of user 175 terminated abnormally with signal 6/ABRT, processing...
systemd-coredump[2003]: Process 1893 (sddm-greeter-qt) of user 175 dumped core.
                        Module sddm-greeter-qt6 without build-id.
                        #20 0x00000000004125b4 main (sddm-greeter-qt6 + 0x125b4)
                        #23 0x0000000000412a25 _start (sddm-greeter-qt6 + 0x12a25)
sddm-helper-start-wayland[1874]: wayland greeter finished 6 QProcess::CrashExit
sddm-helper-start-wayland[1874]: quitting helper-start-wayland
sddm-helper-start-wayland[1874]: Stopping... "/nix/store/yxy38krm4jpq9f4xbb3i31bszyp5dvv3-kwin-6.1.5/bin/kwin_wayland"
sddm-helper-start-wayland[1874]: wayland compositor finished 15 QProcess::NormalExit
sddm-helper-start-wayland[1874]: quitting helper-start-wayland
sddm-helper[1764]: [PAM] Closing session
sddm-helper[1764]: pam_systemd(sddm-greeter:session): New sd-bus connection (system-bus-pam-systemd-1764) opened.
drkonqi-coredump-processor[2004]: "/nix/store/shlcpqycfm5ni30aigipjfig8lxg112w-sddm-unwrapped-0.21.0/bin/sddm-greeter-qt6" 1893 "/var/lib/systemd/coredump/core.sddm-greeter-qt.175.8d57ab7e4618474cabfaa73d494e5ada.1893.1727162623000000.zst"
drkonqi-coredump-launcher[2034]: Unable to find file for pid 1893 expected at "kcrash-metadata/sddm-greeter-qt6.8d57ab7e4618474cabfaa73d494e5ada.1893.ini"
sddm-helper[1764]: [PAM] Ended.
sddm[1664]: Auth: sddm-helper exited successfully
sddm[1664]: Greeter stopped. SDDM::Auth::HELPER_SUCCESS
(sd-pam)[1790]: pam_unix(systemd-user:session): session closed for user sddm

The simple-framebuffer section is not present in the drmdevice output when using my previous system generation.

nix shell nixpkgs#libdrm^bin -c drmdevice
--- Checking the number of DRM device available ---
--- Devices reported 3 ---
--- Retrieving devices information (PCI device revision is ignored) ---
device[0]
+-> available_nodes 0x01
+-> nodes
|   +-> nodes[0] /dev/dri/card0
+-> bustype 0002
|   +-> platform
|       +-> fullname	simple-framebuffer
+-> deviceinfo
    +-> platform
        +-> compatible
                    simple-framebuffer

--- Opening device node /dev/dri/card0 ---
--- Retrieving device info, for node /dev/dri/card0 ---
device[0]
+-> available_nodes 0x01
+-> nodes
|   +-> nodes[0] /dev/dri/card0
+-> bustype 0002
|   +-> platform
|       +-> fullname	simple-framebuffer
+-> deviceinfo
    +-> platform
        +-> compatible
                    simple-framebuffer

device[1]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card2
|   +-> nodes[2] /dev/dri/renderD129
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus    01
|       +-> dev    00
|       +-> func   0
+-> deviceinfo
    +-> pci
        +-> vendor_id     10de
        +-> device_id     24a0
        +-> subvendor_id  1043
        +-> subdevice_id  1a8c
        +-> revision_id   IGNORED

--- Opening device node /dev/dri/card2 ---
--- Retrieving device info, for node /dev/dri/card2 ---
device[1]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card2
|   +-> nodes[2] /dev/dri/renderD129
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus    01
|       +-> dev    00
|       +-> func   0
+-> deviceinfo
    +-> pci
        +-> vendor_id     10de
        +-> device_id     24a0
        +-> subvendor_id  1043
        +-> subdevice_id  1a8c
        +-> revision_id   a1

--- Opening device node /dev/dri/renderD129 ---
--- Retrieving device info, for node /dev/dri/renderD129 ---
device[1]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card2
|   +-> nodes[2] /dev/dri/renderD129
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus    01
|       +-> dev    00
|       +-> func   0
+-> deviceinfo
    +-> pci
        +-> vendor_id     10de
        +-> device_id     24a0
        +-> subvendor_id  1043
        +-> subdevice_id  1a8c
        +-> revision_id   a1

device[2]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card1
|   +-> nodes[2] /dev/dri/renderD128
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus    00
|       +-> dev    02
|       +-> func   0
+-> deviceinfo
    +-> pci
        +-> vendor_id     8086
        +-> device_id     46a6
        +-> subvendor_id  1043
        +-> subdevice_id  1a8c
        +-> revision_id   IGNORED

--- Opening device node /dev/dri/card1 ---
--- Retrieving device info, for node /dev/dri/card1 ---
device[2]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card1
|   +-> nodes[2] /dev/dri/renderD128
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus    00
|       +-> dev    02
|       +-> func   0
+-> deviceinfo
    +-> pci
        +-> vendor_id     8086
        +-> device_id     46a6
        +-> subvendor_id  1043
        +-> subdevice_id  1a8c
        +-> revision_id   0c

--- Opening device node /dev/dri/renderD128 ---
--- Retrieving device info, for node /dev/dri/renderD128 ---
device[2]
+-> available_nodes 0x05
+-> nodes
|   +-> nodes[0] /dev/dri/card1
|   +-> nodes[2] /dev/dri/renderD128
+-> bustype 0000
|   +-> pci
|       +-> domain 0000
|       +-> bus    00
|       +-> dev    02
|       +-> func   0
+-> deviceinfo
    +-> pci
        +-> vendor_id     8086
        +-> device_id     46a6
        +-> subvendor_id  1043
        +-> subdevice_id  1a8c
        +-> revision_id   0c

Notify maintainers

@Kiskae @edwtjo

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

 - system: `"x86_64-linux"`
 - host os: `Linux 6.11.0, NixOS, 24.11 (Vicuna), 24.11.20240919.c04d565`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.5`
 - nixpkgs: `/nix/store/hiasfhl8f5yy88hcfbr3s8s4bm63wsjw-source`


opl- added the 0.kind: bug (Something is broken) label Sep 24, 2024

opl- commented Sep 24, 2024

Linking issue #343774 as it might be related, but the errors in the logs given there differ from mine.

This comment on that issue links to an Arch forum thread, where someone explains the issue is caused by "simpledrm" not being automatically disabled by the NVIDIA driver due to header changes in kernel v6.11.0.

With the open kernel module, SDDM and KDE start correctly when I apply the suggested workaround of adding initcall_blacklist=simpledrm_platform_driver_init to the kernel parameters. However, it causes console TTYs to freeze almost immediately during boot, permanently showing only the first two lines of the boot log. Without the open kernel module, I think KDE crashed twice.

To quickly test if this will fix the issue, I selected the NixOS generation with kernel v6.11.0 in grub, pressed [e], then added initcall_blacklist=simpledrm_platform_driver_init at the end of the text box at the bottom, separated from the rest by a space, and pressed [enter] to boot.
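
To keep the workaround across reboots instead of editing the GRUB entry each time, the same parameter can presumably be set through the `boot.kernelParams` option. This is a minimal sketch of a module fragment, assuming the open kernel module is in use (as in the test above) and that the frozen console TTYs are an acceptable trade-off:

```nix
{ ... }:
{
  # Workaround reported in the linked Arch forum thread: prevent the simpledrm
  # platform driver from initializing on kernel v6.11.0.
  # Side effect described above: console TTYs freeze early during boot.
  boot.kernelParams = [ "initcall_blacklist=simpledrm_platform_driver_init" ];
}
```

Removing the line and rebuilding should restore the default behaviour once a fixed driver is available.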


opl- commented Sep 24, 2024

There's already a PR to the NVIDIA open-gpu-kernel-modules repository which adds support for the renamed kernel header files.

I tried to test it with the following NixOS configuration change after merging the PR into the v560.35.03 kernel module. I think this is technically incorrect as I'm not globally overriding the linuxPackages.nvidia_x11 package, but the Nix documentation again failed to assist me in doing that.

As a result SDDM was no longer crashing, but wasn't rendering correctly either, staying as a black screen. The only reason I realized it's running is because it briefly flashed (at the wrong resolution) when I switched to a console TTY.

After blindly entering my password into the black SDDM, KDE crashed with the errors from #343774 appearing in it.

I guess I'm finally experiencing the reasons why people always say not to run the latest kernel with NVIDIA proprietary drivers.

{ config, pkgs, ... }: {
  # This does not work. Kind of.
  hardware.nvidia.open = true;
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.beta.overrideAttrs {
    open = config.boot.kernelPackages.nvidiaPackages.beta.open.overrideAttrs {
      src = pkgs.fetchFromGitHub {
        owner = "opl-";
        repo = "open-gpu-kernel-modules";
        rev = "main";
        hash = "sha256-SzbXewSU1Mn8uFtLlDGiJKJSEkXBoTRpLlFzlvZiliU=";
      };
    };
  };
}


opl- commented Sep 24, 2024

And indeed, Kernel v6.10.11 ({ boot.kernelPackages = pkgs.linuxPackages_6_10; }) works fine with NVIDIA proprietary v560.35.03 + open kernel module.

@VeilSilence

Nvidia issue. Stay at 6.10 until new driver release.

TLATER added a commit to TLATER/dotfiles that referenced this issue Oct 12, 2024
NovaViper added a commit to NovaViper/NixConfig that referenced this issue Oct 18, 2024
- Pin kernel to 6.10.11 for ryzennova, due to NixOS/nixpkgs#344167
- Fix eza icon options, made it set to "auto" instead of true
- Removed qtbase, breaks Plasma 6.2.1 theming
- Remove python nose as it was deprecated and remove from nixpkgs
- Add space to transient prompt for oh-my-posh
- Switch from base xwaylandvideobridge to kdePackages.xwaylandvideobridge
- Remove gpg scdaemon settings as it seems to break Yubikey support

djmaze commented Oct 26, 2024

With kernel 6.11.5 (boot.kernelPackages = pkgs.linuxPackages_latest;), the latest beta nvidia driver seems to work for me. Running 24.05 stable, I did this:

    package = config.boot.kernelPackages.nvidiaPackages.mkDriver {
      version = "565.57.01";
      sha256_64bit = "sha256-buvpTlheOF6IBPWnQVLfQUiHv4GcwhvZW3Ks0PsYLHo=";
      sha256_aarch64 = "sha256-aDVc3sNTG4O3y+vKW87mw+i9AqXCY29GVqEIUlsvYfE=";
      openSha256 = "sha256-/tM3n9huz1MTE6KKtTCBglBMBGGL/GOHi5ZSUag4zXA=";
      settingsSha256 = "sha256-H7uEe34LdmUFcMcS6bz7sbpYhg9zPCb/5AmZZFTx1QA=";
      persistencedSha256 = "sha256-hdszsACWNqkCh8G4VBNitDT85gk9gJe1BlQ8LdrYIkg=";
    };

I needed to disable nvidia-settings, though, because of a compilation error:

    # The nvidia-settings build is currently broken due to a missing
    # vulkan header; re-enable whenever
    # 0384602eac8bc57add3227688ec242667df3ffe3 hits stable.
    nvidiaSettings = false;
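
The two fragments above leave their enclosing attribute set implicit. A sketch of how they might fit together, assuming both belong under `hardware.nvidia` in a NixOS module (the version and hashes are copied unchanged from the `mkDriver` call above):

```nix
{ config, ... }:
{
  hardware.nvidia = {
    # Build the 565.57.01 beta driver against the currently selected kernel.
    package = config.boot.kernelPackages.nvidiaPackages.mkDriver {
      version = "565.57.01";
      sha256_64bit = "sha256-buvpTlheOF6IBPWnQVLfQUiHv4GcwhvZW3Ks0PsYLHo=";
      sha256_aarch64 = "sha256-aDVc3sNTG4O3y+vKW87mw+i9AqXCY29GVqEIUlsvYfE=";
      openSha256 = "sha256-/tM3n9huz1MTE6KKtTCBglBMBGGL/GOHi5ZSUag4zXA=";
      settingsSha256 = "sha256-H7uEe34LdmUFcMcS6bz7sbpYhg9zPCb/5AmZZFTx1QA=";
      persistencedSha256 = "sha256-hdszsACWNqkCh8G4VBNitDT85gk9gJe1BlQ8LdrYIkg=";
    };

    # Skip the nvidia-settings build, which failed to compile in this setup.
    nvidiaSettings = false;
  };
}
```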

Also, booting the system with an external monitor attached makes the system freeze instantly when loading the kernel on my device (ProArt PX13), so for now I disconnect it before booting the machine.


Murazaki commented Oct 28, 2024

Could not stay on 6.10, as I can't rebuild with it "because it reached end of life upstream".
And KDE is still crashing after a few minutes and refusing to reboot.
Trying to switch to beta (565.57.01) like @djmaze.

No nvidia-settings build issue for me.

Edit: getting an NVIDIA driver mismatch issue...

Edit: fixed by removing the nvidia_x11 entry from boot.extraModulePackages (or you can switch it to nvidia_x11_beta):

# boot.extraModulePackages = [ config.boot.kernelPackages.nvidia_x11 ];
boot.extraModulePackages = [ config.boot.kernelPackages.nvidia_x11_beta ];
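
A minimal sketch of the first fix mentioned above, with no NVIDIA package listed in `boot.extraModulePackages`. It assumes, as far as I understand the NixOS `hardware.nvidia` module, that the module already installs the kernel module belonging to `hardware.nvidia.package`, so a manual `nvidia_x11` entry mainly risks pulling in a second driver version:

```nix
{ config, ... }:
{
  services.xserver.videoDrivers = [ "nvidia" ];

  # Single place that selects the driver version for both the userspace
  # libraries and the kernel module.
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.beta;

  # Intentionally no boot.extraModulePackages entry for nvidia_x11 here.
  # Other hardware.nvidia options (open, modesetting, ...) are omitted.
}
```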

@Murazaki

Confirming using nvidia 565.57.01 is much more stable than previous versions after several hours of running.
Fixed SDDM not booting.
Fixed KDE crash.
No issues with Electron and Firefox (might be due to Firefox update to 131 though).
Games build shaders and run properly at high performance.

@mksafavi

> Confirming using nvidia 565.57.01 is much more stable than previous versions after several hours of running.

Great 👍
Is that with kernel >6.11 ?

> Could not stay on 6.10 as I can't rebuild with it "because it reached end of life upstream".

I'm still on 6.10 on my nvidia machine. I didn't notice this issue.
I switched to 6.10 with this:

boot.kernelPackages = pkgs.linuxPackages_6_10;

@Murazaki

> Great 👍
> Is that with kernel >6.11 ?

Yes, this is on latest :

$ uname -r
6.11.5

@NovaViper

I can confirm that v565.57.01 works with 6.11.5-xanmod1!

@RedEtherbloom

We can also confirm this on Kernel 6.11.5.
Troubles first began on 6.11, went away when we had to upgrade to 6.11.
Switching to the driver's beta branch fixed them for us again.

@opl- opl- closed this as completed Nov 22, 2024

8 participants