Update to Linux 5.17 drivers #236
Tried to compile and run it and I get
If it's too early for testing, just ignore me :o) |
The backport is finished, but polishing it will take a significant amount of work: I need to work on/revisit the framebuffer/vt(4) integration code. The |
The driver loads and works fine with Xorg, and suspend/resume works without glitches on my amdgpu laptop. |
@mekanix I'm getting the error you did. What did you do to address it? |
I had to check again: I was on the wrong branch, so I get the same error as before. |
I ran into the same issue as above, so I tried to fix it. I added the following shot-in-the-dark patch to fix the linking error:
diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
index 036aaf7a28..312d538231 100644
--- a/drivers/gpu/drm/drm_sysfs.c
+++ b/drivers/gpu/drm/drm_sysfs.c
@@ -31,6 +31,7 @@
#include <drm/drm_device.h>
#include <drm/drm_print.h>
#include <drm/drm_sysfs.h>
+#include <drm/drm_property.h>
#include "drm_internal.h"
@@ -97,6 +98,23 @@ void drm_sysfs_connector_hotplug_event(struct drm_connector *connector)
sbuf_delete(sb);
}
+void drm_sysfs_connector_status_event(
+ struct drm_connector *connector, struct drm_property *property)
+{
+ struct drm_device *dev = connector->dev;
+ struct sbuf *sb = sbuf_new_auto();
+
+ DRM_DEBUG("generating status event\n");
+
+ sbuf_printf(sb, "cdev=dri/%s connector_id=%u connector_name=\"%s\" "
+ "property_name=\"%s\"",
+ dev_name(dev->primary->kdev), connector->base.id, connector->name,
+ property->name);
+ sbuf_finish(sb);
+ devctl_notify("DRM", "CONNECTOR", "STATUS", sbuf_data(sb));
+ sbuf_delete(sb);
+}
+
static void drm_sysfs_release(struct device *dev)
{
kfree(dev);
Then drm loaded without issue, but attempting to load i915kms seems to crash my entire system (Alder Lake-P). The screen goes black or freezes, remote sessions immediately hang, no crash dump is produced, and (to my eye) nothing shows up in the syslogs (booted verbose, logged over the network):
I would like to be useful on this project! Please let me know how I can help or how I can begin debugging this issue. |
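To illustrate the payload that the drm_sysfs_connector_status_event() patch above hands to devctl_notify(), here is a minimal userland sketch. It uses snprintf() as a stand-in for the kernel's sbuf API. Both format_status_event() and count_fields() are hypothetical helpers, not part of drm-kmod, and the whitespace-splitting consumer is an assumption made for illustration. It shows why each key=value pair needs a separating space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the key=value payload in a plain buffer.
 * snprintf() stands in for sbuf_printf(); the field names follow the patch.
 */
int
format_status_event(char *buf, size_t len, const char *cdev,
    unsigned int connector_id, const char *connector_name,
    const char *property_name)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len,
	    "cdev=dri/%s connector_id=%u connector_name=\"%s\" "
	    "property_name=\"%s\"",
	    cdev, connector_id, connector_name, property_name));
}

/*
 * Hypothetical consumer: count fields separated by spaces, treating
 * spaces inside double quotes as part of the value.
 */
int
count_fields(const char *s)
{
	int fields = 0;
	int in_quotes = 0;
	int in_token = 0;

	for (; *s != '\0'; s++) {
		if (*s == '"')
			in_quotes = !in_quotes;
		if (*s == ' ' && !in_quotes) {
			in_token = 0;
		} else if (!in_token) {
			in_token = 1;
			fields++;
		}
	}
	return (fields);
}
```

With sample values such as card0, 42, eDP-1 and Content Protection, the payload splits into four fields. If the space before property_name is missing, the last two fields merge into one.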
Immediately after I sent this, I figured out that after the system hangs, I can trigger the break-to-debugger keystroke and type in the dump command even though the display isn't working. I now have a core dump from after the driver is loaded. I see a bunch of i915 threads and I will look into what they're doing, but help would still be appreciated :) |
Discovered the DRM_DEBUG_LOG_ALL flag. Here are the logs from that: /var/log/messages excerpt
|
Thank you @rhelmot for testing! Sorry I didn't push everything to both my freebsd-src and drm-kmod forks as it's really not working at all currently :-) I made several temporary/testing commits that are not made to be pushed to GitHub... Once I have something closer to a testable version, I will push again and ask for more help :-) |
Thanks for the update. Is there anything I can do to facilitate development? Also, what revision of 5.17 are you targeting? I tried debugging this by comparing the boot logs to my Linux 5.17 install on this laptop (Ubuntu), and the thing that popped out at me was that Linux is using DMC firmware v2.14 while this port seems to want to use v2.12. When I ran the diff tool myself, I found a huge number of unported changes, many of which look like correctness fixes. I attempted to port all the changes I could find that seemed relevant to Alder Lake, and upgraded the firmware version, but haven't gotten any different results. |
Patches come from Linux I selfishly admit I would like to find the problem myself for the challenge and "adrenaline". I have been working on the backports since the Linux 5.11 patches and this has been uneventful so far. I kind of need/want to go through this and find the issue. |
Quick update on the backport of 5.17.
|
f7b9ef3
to
da06239
Compare
The rewrite of the vt(4) integration layer is almost finished, but it's already working for me; see #243. The Linux 5.17 backport branch was rebased on top of it. It will be required anyway. |
@dumbbell thank you so much for your hard work! |
For now, I still need to work on 5.17 which is a lot harder to port compared to all previous versions. Once this is done, I'm not sure. Probably, I'll continue with 5.18, 5.19 and so on. That would be really cool to be in sync with Linux! I'm glad that it was easier than expected to fix the vt(4) integration, though there are still a few things left. |
After loading amdgpu, my laptop seems frozen. SSH works, so I can do stuff to figure out what's going on. Relevant
|
da06239
to
9b0a547
Compare
I pushed a commit to disable GuC by default as it's clearly causing the computer to freeze when i915kms is loaded. With GuC disabled and the vt(4) integration rewritten in #243 (which this branch is based on), it appears to work with Intel 12th gen GPUs (I tested that very lightly for now). I also noticed that a function returns an error during amdgpu init. It always did, but the error was ignored so far. I will investigate. |
Sway works fine on 5.15-lts on my Intel 12th gen GPU (i7-1260p), but not on 5.17 where I get a black screen with a working mouse cursor (and lots of errors in |
Under a GENERIC kernel, I experience this crash:
I assume you tested i915kms under a GENERIC-NODEBUG kernel. |
Indeed, because that's my daily driver. I will try on a GENERIC kernel like you. Thanks for the bug report! |
I can triage this - the locks being held are vt's vd->vd_curwindow->vb_lock spinlock from vt_flush and the work queue lock, a sleep lock (from drm_fb_helper_damage adding to the workqueue). I have absolutely no idea what the implications of these uses of locks are, but I'm gonna try switching the vt lock to a sleep lock and seeing what happens. edit: as anyone could have predicted, that did not work! |
I finally twiddled with locks to the point that the driver loads, though X refuses to start. I suspect that's a separate issue. These patches are assuredly not commit-ready but they should at least help you skip some steps in figuring out where all the paths that are locking erroneously are, since there's more than just the one which is documented above.
kernel:
diff --git a/sys/dev/vt/vt_core.c b/sys/dev/vt/vt_core.c
index 267dd7bf2489..80002b9c21f9 100644
--- a/sys/dev/vt/vt_core.c
+++ b/sys/dev/vt/vt_core.c
@@ -1406,17 +1406,23 @@ vt_flush(struct vt_device *vd)
vd->vd_flags &= ~VDF_INVALID;
a = teken_get_curattr(&vw->vw_terminal->tm_emulator);
+ /*
+ * The below calls may need to run a sleep lock in order to
+ * dispatch linuxkpi work
+ */
+ vtbuf_unlock(&vw->vw_buf);
vt_set_border(vd, &vw->vw_draw_area, a->ta_bgcolor);
vt_termrect(vd, vf, &tarea);
if (vd->vd_driver->vd_invalidate_text)
vd->vd_driver->vd_invalidate_text(vd, &tarea);
if (vt_draw_logo_cpus)
vtterm_draw_cpu_logos(vd);
+ vtbuf_lock(&vw->vw_buf);
}
if (tarea.tr_begin.tp_col < tarea.tr_end.tp_col) {
- vd->vd_driver->vd_bitblt_text(vd, vw, &tarea);
vtbuf_unlock(&vw->vw_buf);
+ vd->vd_driver->vd_bitblt_text(vd, vw, &tarea);
return (1);
}
drm-kmod:
diff --git a/drivers/gpu/drm/vt_drmfb.c b/drivers/gpu/drm/vt_drmfb.c
index 116121e14f..62ef106d0c 100644
--- a/drivers/gpu/drm/vt_drmfb.c
+++ b/drivers/gpu/drm/vt_drmfb.c
@@ -257,6 +257,7 @@ vt_drmfb_postswitch(struct vt_device *vd)
{
struct fb_info *fbio;
struct linux_fb_info *info;
+ int locked = mtx_owned(&vd->vd_lock);
fbio = vd->vd_softc;
info = to_linux_fb_info(fbio);
@@ -266,8 +267,14 @@ vt_drmfb_postswitch(struct vt_device *vd)
}
if (!kdb_active && !KERNEL_PANICKED()) {
+ if (locked) {
+ VT_UNLOCK(vd);
+ }
linux_set_current(curthread);
info->fbops->fb_set_par(info);
+ if (locked) {
+ VT_LOCK(vd);
+ }
} else {
#ifdef DDB
db_trace_self_depth(10);
|
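The locking change in vt_drmfb_postswitch() above follows a drop-and-reacquire pattern: remember whether the caller holds the lock, release it around the call that may sleep, then take it again. A minimal userland sketch, assuming a pthread mutex in place of the kernel mutex and a hand-rolled owner record in place of mtx_owned(). All demo_* names are hypothetical, and the owner tracking is only meant for single-threaded illustration:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct vt_device: a lock plus an owner record. */
struct demo_dev {
	pthread_mutex_t lock;
	pthread_t owner;		/* valid only while held is true */
	bool held;
	int set_par_calls;
	bool callback_saw_lock_held;
};

void
demo_init(struct demo_dev *d)
{
	pthread_mutex_init(&d->lock, NULL);
	d->owner = pthread_self();
	d->held = false;
	d->set_par_calls = 0;
	d->callback_saw_lock_held = false;
}

/* Userland approximation of mtx_owned(): does this thread hold the lock? */
bool
demo_owned(struct demo_dev *d)
{
	return (d->held && pthread_equal(d->owner, pthread_self()));
}

void
demo_lock(struct demo_dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->owner = pthread_self();
	d->held = true;
}

void
demo_unlock(struct demo_dev *d)
{
	assert(demo_owned(d));
	d->held = false;
	pthread_mutex_unlock(&d->lock);
}

/* Stand-in for fb_set_par(): assumed to sleep, so it must run unlocked. */
void
demo_set_par(struct demo_dev *d)
{
	d->callback_saw_lock_held = demo_owned(d);
	d->set_par_calls++;
}

/* Drop the lock around the callback only if the caller came in holding it. */
void
demo_postswitch(struct demo_dev *d)
{
	bool locked = demo_owned(d);

	if (locked)
		demo_unlock(d);
	demo_set_par(d);
	if (locked)
		demo_lock(d);
}
```

Whether the caller enters demo_postswitch() with or without the lock, the callback runs unlocked and the caller's lock state is the same on return. The cost of this pattern is that any state protected by the lock may change while it is dropped, which is one reason such patches need review before being committed.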
Intel DG1:
It does populate /dev/dri/renderD128, which is significant progress :) Unfortunately vainfo doesn't like it and can't open the device (my use case and primary test case is transcoding video). I compiled against freebsd/main, and needed the following patch on this PR:
|
ab28294
to
ef78ba2
Compare
Hello! I've been busy for a few months and have not been able to try the latest code. Looks like @dumbbell has been busy too. Do you have a chance to say what the latest status is? Is it worth testing? |
MMHUB PG needs to be disabled for Picasso for stability reasons. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
Add a quirk in sienna_cichlid_ppt.c to fix some OEM SKU specific stability issues. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
Even if PSR is allowed for a present GPU, there might be no eDP link which supports PSR. Fixes: 708978487304 ("drm/amdgpu/display: Only set vblank_disable_immediate when PSR is not enabled") Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
When we switch to dma_resv_wait_timeout() the returned type changes as well. Signed-off-by: Christian König <[email protected]> Fixes: 89aae41d740f ("drm/radeon: use dma_resv_wait_timeout() instead of manually waiting") Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215600 Reviewed-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
In order to fill the drm_display_info structure each time an EDID is read, the code currently will call drm_add_display_info with the parsed EDID. drm_add_display_info will then call drm_reset_display_info to reset all the fields to 0, and then set them to the proper value depending on the EDID. In the color_formats case, we will thus report that we don't support any color format, and then fill it back with RGB444 plus the additional formats described in the EDID Feature Support byte. However, since that byte only contains format-related bits since the 1.4 specification, this doesn't happen if the EDID is following an earlier specification. In turn, it means that for one of these EDID, we end up with color_formats set to 0. The EDID 1.3 specification never really specifies what it means by RGB exactly, but since both HDMI and DVI will use RGB444, it's fairly safe to assume it's supposed to be RGB444. Let's move the addition of RGB444 to color_formats earlier in drm_add_display_info() so that it's always set for a digital display. Fixes: da05a5a71ad8 ("drm: parse color format support for digital displays") Cc: Ville Syrjälä <[email protected]> Reported-by: Matthias Reichl <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Reviewed-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
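The reordering this commit message describes can be sketched as follows. This is a simplified userland model, not the drm_edid.c code: the struct, the demo_* names, and the flag values are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch; not the kernel's definitions. */
#define DEMO_INPUT_DIGITAL	(1u << 7)
#define DEMO_FEATURE_YCRCB444	(1u << 3)
#define DEMO_FEATURE_YCRCB422	(1u << 4)

#define DEMO_FMT_RGB444		(1u << 0)
#define DEMO_FMT_YCRCB444	(1u << 1)
#define DEMO_FMT_YCRCB422	(1u << 2)

/* Only the EDID fields this sketch needs. */
struct demo_edid {
	uint8_t revision;	/* 3 for EDID 1.3, 4 for EDID 1.4 */
	uint8_t input;		/* video input definition byte */
	uint8_t features;	/* feature support byte */
};

uint32_t
demo_color_formats(const struct demo_edid *edid)
{
	uint32_t formats = 0;	/* models the reset to "no format" */

	assert(edid != NULL);

	if (!(edid->input & DEMO_INPUT_DIGITAL))
		return (formats);

	/* Set RGB444 for every digital display, before the revision check. */
	formats |= DEMO_FMT_RGB444;

	/* The format bits in the feature byte are only defined from 1.4 on. */
	if (edid->revision < 4)
		return (formats);

	if (edid->features & DEMO_FEATURE_YCRCB444)
		formats |= DEMO_FMT_YCRCB444;
	if (edid->features & DEMO_FEATURE_YCRCB422)
		formats |= DEMO_FMT_YCRCB422;

	return (formats);
}
```

In this model, a digital EDID 1.3 display reports RGB444 regardless of its feature byte, while an EDID 1.4 display adds whatever extra formats its feature byte advertises. If the RGB444 line sat after the revision check, the 1.3 case would return 0, which is the problem the commit message describes.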
commit 0064b0ce85bb ("drm/amd/pm: enable ASPM by default") enabled ASPM by default but a variety of hardware configurations it turns out that this caused a regression. * PPC64LE hardware does not support ASPM at a hardware level. CONFIG_PCIEASPM is often disabled on these architectures. * Some dGPUs on ALD platforms don't work with ASPM enabled and PCIe subsystem disables it Check with the PCIe subsystem to see that ASPM has been enabled or not. Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default") Link: https://wiki.raptorcs.com/w/images/a/ad/P9_PHB_version1.0_27July2018_pub.pdf Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1723 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1739 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1907 Tested-by: [email protected] Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
[Why] Found when running igt@kms_atomic. Userspace attempts to do a TEST_COMMIT when 0 streams which calls dc_remove_stream_from_ctx. This in turn calls link_enc_unassign which ends up modifying stream->link = NULL directly, causing the global link_enc to be removed preventing further link activity and future link validation from passing. [How] We take care of link_enc unassignment at the start of link_enc_cfg_link_encs_assign so this call is no longer necessary. Fixes global state from being modified while unlocked. Reviewed-by: Jimmy Kizito <[email protected]> Acked-by: Jasdeep Dhillon <[email protected]> Signed-off-by: Nicholas Kazlauskas <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
The GPU reset function of raven2 is not maintained or tested, so it should be very unstable. Now the amdgpu_asic_reset function is added to amdgpu_pmops_suspend, which causes the S3 test of raven2 to fail, so the asic_reset of raven2 is ignored here. Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Chen Gong <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
vkms leverages common amdgpu framebuffer creation, and also as it does not support FB modifier, there is no need to check tiling flags when initing framebuffer when virtual display is enabled. This can fix below calltrace: amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu] v2: check adev->enable_virtual_display instead as vkms can be enabled in bare metal as well. Signed-off-by: Leslie Shi <[email protected]> Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
Workstation application ANSA/META v21.1.4 get this error dmesg when running CI test suite provided by ANSA/META: [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16) This is caused by: 1. create a 256MB buffer in invisible VRAM 2. CPU map the buffer and access it causes vm_fault and try to move it to visible VRAM 3. force visible VRAM space and traverse all VRAM bos to check if evicting this bo is valuable 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable() will set amdgpu_vm->evicting, but latter due to not in visible VRAM, won't really evict it so not add it to amdgpu_vm->evicted 5. before next CS to clear the amdgpu_vm->evicting, user VM ops ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted) but fail in amdgpu_vm_bo_update_mapping() (check amdgpu_vm->evicting) and get this error log This error won't affect functionality as next CS will finish the waiting VM ops. But we'd better clear the error log by checking the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling amdgpu_vm_bo_update_mapping() later. Another reason is amdgpu_vm->evicted list holds all BOs (both user buffer and page table), but only page table BOs' eviction prevent VM ops. amdgpu_vm->evicting flag is set only for page table BOs, so we should use evicting flag instead of evicted list in amdgpu_vm_ready(). The side effect of this change is: previously blocked VM op (user buffer in "evicted" list but no page table in it) gets done immediately. v2: update commit comments. Acked-by: Paul Menzel <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Qiang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
SLPC unset param H2G only needs one parameter - the id of the param. Fixes: 025cb07bebfa ("drm/i915/guc/slpc: Cache platform frequency limits") Suggested-by: Umesh Nerlige Ramappa <[email protected]> Signed-off-by: Vinay Belgaumkar <[email protected]> Reviewed-by: Umesh Nerlige Ramappa <[email protected]> Signed-off-by: Ramalingam C <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 9648f1c3739505557d94ff749a4f32192ea81fe3) Signed-off-by: Tvrtko Ursulin <[email protected]>
This JSP2 PCH actually seems to be some special Apple specific ICP variant rather than a JSP. Make it so. Or at least all the references to it seem to be some Apple ICL machines. Didn't manage to find these PCI IDs in any public chipset docs unfortunately. The only thing we're losing here with this JSP->ICP change is Wa_14011294188, but based on the HSD that isn't actually needed on any ICP based design (including JSP), only TGP based stuff (including MCC) really need it. The documented w/a just never made that distinction because Windows didn't want to differentiate between JSP and MCC (not sure how they handle hpd/ddc/etc. then though...). Cc: [email protected] Cc: Matt Roper <[email protected]> Cc: Vivek Kasireddy <[email protected]> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4226 Fixes: 943682e3bd19 ("drm/i915: Introduce Jasper Lake PCH") Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Acked-by: Vivek Kasireddy <[email protected]> Tested-by: Tomas Bzatek <[email protected]> (cherry picked from commit 53581504a8e216d435f114a4f2596ad0dfd902fc) Signed-off-by: Tvrtko Ursulin <[email protected]>
VRR capable property is not attached by default to the connector It is attached only if VRR is supported. So if the driver tries to call drm core set prop function without it being attached that causes NULL dereference. Cc: Jani Nikula <[email protected]> Cc: Ville Syrjälä <[email protected]> Cc: [email protected] Signed-off-by: Manasi Navare <[email protected]> Reviewed-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Regression has been reported that suspend/resume may hang with the previous vm ready check commit. So bring back the evicted list check as a temp fix. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1922 Fixes: c1a66c3bc425 ("drm/amdgpu: check vm ready by amdgpu_vm->evicting flag") Reviewed-by: Christian König <[email protected]> Signed-off-by: Qiang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
Currently we are observing occasional screen flickering when PSR2 selective fetch is enabled. More specifically glitch seems to happen on full frame update when cursor moves to coords x = -1 or y = -1. According to Bspec SF Single full frame should not be set if SF Partial Frame Enable is not set. This happened to be true for ADLP as PSR2_MAN_TRK_CTL_ENABLE is always set and for ADL_P it's actually "SF Partial Frame Enable" (Bit 31). Setting "SF Partial Frame Enable" bit also on full update seems to fix screen flickering. Also make code more clear by setting PSR2_MAN_TRK_CTL_ENABLE only if not on ADL_P. Bit 31 has different meaning in ADL_P. Bspec: 49274 v2: Fix Mihai Harpau email address v3: Modify commit message and remove unnecessary comment Tested-by: Lyude Paul <[email protected]> Fixes: 7f6002e58025 ("drm/i915/display: Enable PSR2 selective fetch by default") Reported-by: Lyude Paul <[email protected]> Cc: Mihai Harpau <[email protected]> Cc: José Roberto de Souza <[email protected]> Cc: Ville Syrjälä <[email protected]> Bugzilla: https://gitlab.freedesktop.org/drm/intel/-/issues/5077 Signed-off-by: Jouni Högander <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]> Signed-off-by: José Roberto de Souza <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 8d5516d18b323cf7274d1cf5fe76f4a691f879c6) Signed-off-by: Tvrtko Ursulin <[email protected]>
4f9e064
to
ebaf4bd
Compare
I just marked the branch as ready to review now that #243 is merged!
Does it work with the drm-515-kmod package (5.15-lts branch) otherwise?
It won't because the patches to vt(4) (related to #243) are not backported yet. |
Does this work with FreeBSD 15-CURRENT? |
@thewanderingtraderm: Yes, it works with FreeBSD 15-CURRENT |
How do I install it? I tried compiling and installing it, but it didn't load because of a version mismatch. I'm on FreeBSD 15-CURRENT. |
@dumbbell Great seeing this merged! Have the installation instructions changed? Is it enough to compile and install the FreeBSD |
@orbitz: Yes drm-kmod |
Do you think this will make it into 14.0-RELEASE at some point? |
@ko56: 14.0-RELEASE, no, the patches to freebsd-src were not ready in time unfortunately. I think they can go into the future 14.1-RELEASE however, perhaps 13.x-RELEASE as well, I don't know. |
This is the backport of the DRM drivers from Linux 5.17.
Progress:
Changes in Linux 5.17
You can read this Phoronix article to learn about the changes in the DRM drivers in Linux 5.17:
https://www.phoronix.com/news/Linux-5.17-DRM-Submitted
Patches to linuxkpi
This update depends on the following patches to linuxkpi in FreeBSD:
https://reviews.freebsd.org/D39049
https://reviews.freebsd.org/D39050
https://reviews.freebsd.org/D39051
https://reviews.freebsd.org/D39052
https://reviews.freebsd.org/D39053
https://reviews.freebsd.org/D39054
https://reviews.freebsd.org/D39055
https://reviews.freebsd.org/D39056
https://reviews.freebsd.org/D39057
These patches are maintained in the following repository and branch:
https://github.com/dumbbell/freebsd-src/tree/linuxkpi-updates-for-drm
How to test
You need to run a recent FreeBSD 14-CURRENT to test it.
Here are some instructions:
You need to check out the FreeBSD src branch I mentioned, linuxkpi-updates-for-drm, and compile a kernel from that branch:
You need to check out the branch referenced in this pull request and compile it:
This will need access to the FreeBSD src tree cloned above. I don't remember the name of the variable to point the build to it. You can link /usr/src to your clone and it will be enough.
You will need GPU firmware in the kernel.drm directory as well. To compile and install it:
Load the relevant driver(s) as you usually do.