weston produces a black screen with fkms mode but not kms #78

Closed
Gerharddc opened this issue Jan 13, 2017 · 9 comments


@Gerharddc

For reasons described in #72, I am trying to eventually get OpenGL ES 2 running on fkms. In that issue I was working with self-compiled kernels and so on, but I have now tested with an entirely stock Raspbian install to make sure this was not my distro's fault.

When configuring Raspbian to use VC4 and then changing the overlay to fkms, everything seems to work fine at first: the framebuffer console comes up and works. In my previous issue I tested with a DRM OpenGL application, but in this case it would not compile, so instead of wasting time on it I tried to see whether Weston would launch. To little surprise, after launching Weston on fkms the screen just went black, while on normal kms it worked fine. I therefore believe that firmwarekms is broken at either the EGL or the DRM level. Am I missing something, or is this a known problem?

@ktb83

ktb83 commented Jan 20, 2017

Hi Gerharddc,

I've been trying to get the Anholt driver working with the RPF touchscreen.

I did an ARM64 Debian Sid build with the 4.10 64-bit kernel which is running great in general:
Linux version 4.10.0-rc3-v8+ (ktb@KTBMacMini) (gcc version 6.2.1 20161124 (Ubuntu/Linaro 6.2.1-5ubuntu1) ) #2 SMP PREEMPT Wed Jan 18 11:34:51 CST 2017

I've been able to use vc4-kms-v3d with an HDMI connected TV. Works OK. Unfortunately, I haven't gotten the RPF touchscreen to work with it. I'll keep trying.

Using vc4-fkms-v3d, I'm able to get the RPF touchscreen to sort of work. The desktop/display seems to basically freeze up after logging in via LightDM (EDIT: actually, it locks up after I try to launch an application). However, if I SSH in from another machine, then it unfreezes. Strange? Permissions issue?

EDIT: On second review, perhaps applications are just launching very slowly, hanging on something? Whatever it is, the hangs tend to go away if I SSH in to the Pi while it's locked up (cursor spinning, but not movable).

Did you build the 64-bit userland? I did.

The onboard WiFi works, but I haven't worked out the bluetooth error I'm seeing. Somehow the driver feels slower than it did last year when I was testing/playing Open Arena with cma=512M. Unfortunately, it no longer boots with cma=512M.

@Gerharddc
Author

@ktb83 Yes, I have built a 64-bit userspace, but not "userland", as that is actually the package for the proprietary GPU driver, which I cannot build. I have also been able to get 64-bit OpenGL applications running in normal KMS on the RPF touchscreen, even though the console framebuffer doesn’t show until I run one of these applications and then exit it. I have no idea what causes that.

As for this issue, though, I cannot get OpenGL + DRM stuff to work with FKMS, even though I can get it to boot and show a framebuffer.

I have not tested WiFi yet.

@Electron752

@ktb83 - I was able to track the weird locks to GPU Reset/Hangs. I have no idea what is causing it though. If anybody has any ideas...

I also noticed cma=512M doesn't work anymore either. I haven't been able to determine why yet. The RPI 3 doesn't make it very far into the boot process at all. I guess one thing to try is the good old fashioned serial console to see if any error messages are being printed.

The cma=512M is too bad, because my monitor likes to run at 1920x1080 and running at that resolution is a bit memory intensive.

@ktb83

ktb83 commented Jan 21, 2017

I think Eric mentioned somewhere in a GitHub comment that it is limited to 256 for some reason. I don't know why.

I found some of my notes from 2015:
gohai/vc4-buildbot#3
gohai/vc4-buildbot#4

@Electron752

OK, I now remember reading that the 256 limit is a hardware limitation of the GPU. I read it in this topic:

raspberrypi#1707

I have no idea how cma=512M ever worked.

@anholt
Owner

anholt commented Feb 1, 2017

Testing the 4.9 backport PR with HDMI-only, fkms came up with lightdm autologin into XFCE just fine and GL apps are fine.

weston produces a black screen with fkms, but not with proper kms. Renaming this issue and focusing it on that bug.

@anholt changed the title from "Not much seems to work on firmwarekms" to "weston produces a black screen with fkms mode but not kms" on Feb 1, 2017
@anholt
Owner

anholt commented Feb 2, 2017

Things I've learned so far:

  • Sometimes weston will go from black to displaying once I move the cursor.
  • Sometimes it doesn't.
  • Weston, weirdly, doesn't display the cursor at first startup.
  • Our display is being scanned out as ARGB, with HVS looking at the alpha channel. That's no good. DRM (fbdev, at least) requires that we support XRGB. Unfortunately, the alpha_mode=0 property channel message to set xrgb mode will get smashed back to 0 when we do our MBOX message. OTOH, given that below our primary plane is just black, I guess this is fine.
  • Got a pageflipping failure under X, will investigate that first since it doesn't require screwing around on the console like weston does.
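
The ARGB versus XRGB point in the list above can be illustrated with a small Python model of one pixel scanned out over a black background. This is only a sketch: the function names are hypothetical, and whether the HVS treats the buffer as straight or premultiplied alpha is not stated in this thread, so both cases are shown as assumptions.

```python
def unpack_argb8888(pixel):
    """Split a 32-bit 0xAARRGGBB word into (a, r, g, b) bytes."""
    return ((pixel >> 24) & 0xFF, (pixel >> 16) & 0xFF,
            (pixel >> 8) & 0xFF, pixel & 0xFF)


def scanout_over_black(pixel, fmt="XRGB", premultiplied=False):
    """Model the colour seen on screen when `pixel` sits above a black layer.

    fmt="XRGB": the top byte is padding and is ignored.
    fmt="ARGB": the top byte is alpha and is used for blending.
    """
    a, r, g, b = unpack_argb8888(pixel)
    if fmt == "XRGB":
        return (r, g, b)
    if premultiplied:
        # out = src + (1 - a) * dst, and dst is black, so out = src.
        return (r, g, b)
    # Straight alpha: out = a * src + (1 - a) * dst, with dst = 0.
    return tuple(c * a // 255 for c in (r, g, b))


# A client that writes XRGB data may leave the top byte at zero.
pixel = 0x00FF8040
print(scanout_over_black(pixel, "XRGB"))                      # (255, 128, 64)
print(scanout_over_black(pixel, "ARGB"))                      # (0, 0, 0)
print(scanout_over_black(pixel, "ARGB", premultiplied=True))  # (255, 128, 64)
```

Under the premultiplied assumption a black layer below the primary plane leaves the colour unchanged, which is consistent with the remark that the ARGB scanout is probably harmless here.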

@anholt
Owner

anholt commented Feb 2, 2017

Got it. Failed to attach pageflip completion events to the CRTC. Fixed in my 4.9 PR now, and I'm working on a 4.4.
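
To show why a missing pageflip completion event can stall a display client, here is a toy Python model. It is not the vc4 or weston code: the `Crtc` class, `frames_submitted`, and the assumption that the client submits a new frame only after the previous flip has been reported complete are all hypothetical simplifications.

```python
from collections import deque


class Crtc:
    """Toy CRTC: delivers a completion event at vblank only if one was attached."""

    def __init__(self):
        self.fb = None
        self.pending_event = None
        self.delivered = deque()

    def queue_flip(self, fb, event, attach_event=True):
        self.fb = fb
        if attach_event:
            self.pending_event = event  # the step the buggy path skipped

    def vblank(self):
        if self.pending_event is not None:
            self.delivered.append(self.pending_event)
            self.pending_event = None


def frames_submitted(attach_event, vblanks=3):
    """Count frames a client submits if it waits for each flip to complete."""
    crtc = Crtc()
    submitted = 0
    waiting = False
    for _ in range(vblanks):
        if not waiting:
            crtc.queue_flip(fb=submitted, event=("flip_done", submitted),
                            attach_event=attach_event)
            submitted += 1
            waiting = True
        crtc.vblank()
        if crtc.delivered:
            crtc.delivered.popleft()
            waiting = False
    return submitted


print(frames_submitted(attach_event=True))   # 3
print(frames_submitted(attach_event=False))  # 1
```

In this model the client without completion events never gets past its first frame, which is one plausible way a compositor could end up showing nothing new.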

@anholt
Owner

anholt commented Feb 8, 2017

4.4 PR is merged. 4.9 is still stalled, but I'm going to mark this fixed since we've got the patch in the current stable branch.

@anholt anholt closed this as completed Feb 8, 2017
anholt pushed a commit that referenced this issue Nov 13, 2017
commit 18ae68f upstream.

WMI ops wrappers did not properly check for null
function pointers for spectral scan. This caused
null dereference crash with WMI-TLV based firmware
which doesn't implement spectral scan.
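
The guard described above follows a common pattern: check an optional operation before calling it. The sketch below models it in Python rather than the driver's C; `WmiOps` and `wmi_spectral_conf` are hypothetical names, and returning a negative `EOPNOTSUPP` is an assumption about how the missing operation might be reported.

```python
import errno


class WmiOps:
    """Hypothetical ops table; None marks an operation the backend lacks."""

    def __init__(self, spectral_conf=None):
        self.spectral_conf = spectral_conf


def wmi_spectral_conf(ops, arg):
    """Wrapper that refuses cleanly instead of calling a missing operation."""
    if ops.spectral_conf is None:
        return -errno.EOPNOTSUPP
    return ops.spectral_conf(arg)


tlv_ops = WmiOps()                                  # no spectral scan support
full_ops = WmiOps(spectral_conf=lambda arg: 0)      # supports it

print(wmi_spectral_conf(tlv_ops, {"mode": "background"}) == -errno.EOPNOTSUPP)
print(wmi_spectral_conf(full_ops, {"mode": "background"}))
```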

The crash could be triggered with:

  ip link set dev wlan0 up
  echo background > /sys/kernel/debug/ieee80211/phy0/ath10k/spectral_scan_ctl

The crash looked like this:

  [  168.031989] BUG: unable to handle kernel NULL pointer dereference at           (null)
  [  168.037406] IP: [<          (null)>]           (null)
  [  168.040395] PGD cdd4067 PUD fa0f067 PMD 0
  [  168.043303] Oops: 0010 [#1] SMP
  [  168.045377] Modules linked in: ath10k_pci(O) ath10k_core(O) ath mac80211 cfg80211 [last unloaded: cfg80211]
  [  168.051560] CPU: 1 PID: 1380 Comm: bash Tainted: G        W  O    4.8.0 #78
  [  168.054336] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
  [  168.059183] task: ffff88000c460c00 task.stack: ffff88000d4bc000
  [  168.061736] RIP: 0010:[<0000000000000000>]  [<          (null)>]           (null)
  ...
  [  168.100620] Call Trace:
  [  168.101910]  [<ffffffffa03b9566>] ? ath10k_spectral_scan_config+0x96/0x200 [ath10k_core]
  [  168.104871]  [<ffffffff811386e2>] ? filemap_fault+0xb2/0x4a0
  [  168.106696]  [<ffffffffa03b97e6>] write_file_spec_scan_ctl+0x116/0x280 [ath10k_core]
  [  168.109618]  [<ffffffff812da3a1>] full_proxy_write+0x51/0x80
  [  168.111443]  [<ffffffff811957b8>] __vfs_write+0x28/0x120
  [  168.113090]  [<ffffffff812f1a2d>] ? security_file_permission+0x3d/0xc0
  [  168.114932]  [<ffffffff8109b912>] ? percpu_down_read+0x12/0x60
  [  168.116680]  [<ffffffff811965f8>] vfs_write+0xb8/0x1a0
  [  168.118293]  [<ffffffff81197966>] SyS_write+0x46/0xa0
  [  168.119912]  [<ffffffff818f2972>] entry_SYSCALL_64_fastpath+0x1a/0xa4
  [  168.121737] Code:  Bad RIP value.
  [  168.123318] RIP  [<          (null)>]           (null)

Signed-off-by: Michal Kazior <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Amit Pundir <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
anholt pushed a commit that referenced this issue Apr 17, 2018
[ Upstream commit df93dc6 ]

Currently, there's no check if an invalid buffer range
is passed. However, while testing DVB memory mapped apps,
I got this:

   videobuf2_core: VB: num_buffers -2143943680, buffer 33, index -2143943647
   unable to handle kernel paging request at ffff888b773c0890
   IP: __vb2_queue_alloc+0x134/0x4e0 [videobuf2_core]
   PGD 4142c7067 P4D 4142c7067 PUD 0
   Oops: 0002 [#1] SMP
   Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables bluetooth rfkill ecdh_generic binfmt_misc rc_dvbsky sp2 ts2020 intel_rapl x86_pkg_temp_thermal dvb_usb_dvbsky intel_powerclamp dvb_usb_v2 coretemp m88ds3103 kvm_intel i2c_mux dvb_core snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul videobuf2_vmalloc videobuf2_memops snd_hda_intel ghash_clmulni_intel videobuf2_core snd_hda_codec rc_core mei_me intel_cstate snd_hwdep snd_hda_core videodev intel_uncore snd_pcm mei media tpm_tis tpm_tis_core intel_rapl_perf tpm snd_timer lpc_ich snd soundcore kvm irqbypass libcrc32c i915 i2c_algo_bit drm_kms_helper
   e1000e ptp drm crc32c_intel video pps_core
   CPU: 3 PID: 1776 Comm: dvbv5-zap Not tainted 4.14.0+ #78
   Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0364.2017.0511.0949 05/11/2017
   task: ffff88877c73bc80 task.stack: ffffb7c402418000
   RIP: 0010:__vb2_queue_alloc+0x134/0x4e0 [videobuf2_core]
   RSP: 0018:ffffb7c40241bc60 EFLAGS: 00010246
   RAX: 0000000080360421 RBX: 0000000000000021 RCX: 000000000000000a
   RDX: ffffb7c40241bcf4 RSI: ffff888780362c60 RDI: ffff888796d8e130
   RBP: ffffb7c40241bcc8 R08: 0000000000000316 R09: 0000000000000004
   R10: ffff888780362c00 R11: 0000000000000001 R12: 000000000002f000
   R13: ffff8887758be700 R14: 0000000000021000 R15: 0000000000000001
   FS:  00007f2849024740(0000) GS:ffff888796d80000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: ffff888b773c0890 CR3: 000000043beb2005 CR4: 00000000003606e0
   Call Trace:
    vb2_core_reqbufs+0x226/0x420 [videobuf2_core]
    dvb_vb2_reqbufs+0x2d/0xc0 [dvb_core]
    dvb_dvr_do_ioctl+0x98/0x1d0 [dvb_core]
    dvb_usercopy+0x53/0x1b0 [dvb_core]
    ? dvb_demux_ioctl+0x20/0x20 [dvb_core]
    ? tty_ldisc_deref+0x16/0x20
    ? tty_write+0x1f9/0x310
    ? process_echoes+0x70/0x70
    dvb_dvr_ioctl+0x15/0x20 [dvb_core]
    do_vfs_ioctl+0xa5/0x600
    SyS_ioctl+0x79/0x90
    entry_SYSCALL_64_fastpath+0x1a/0xa5
   RIP: 0033:0x7f28486f7ea7
   RSP: 002b:00007ffc13b2db18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
   RAX: ffffffffffffffda RBX: 000055b10fc06130 RCX: 00007f28486f7ea7
   RDX: 00007ffc13b2db48 RSI: 00000000c0086f3c RDI: 0000000000000007
   RBP: 0000000000000203 R08: 000055b10df1e02c R09: 000000000000002e
   R10: 0036b42415108357 R11: 0000000000000246 R12: 0000000000000000
   R13: 00007f2849062f60 R14: 00000000000001f1 R15: 00007ffc13b2da54
   Code: 74 0a 60 8b 0a 48 83 c0 30 48 83 c2 04 89 48 d0 89 48 d4 48 39 f0 75 eb 41 8b 42 08 83 7d d4 01 41 c7 82 ec 01 00 00 ff ff ff ff <4d> 89 94 c5 88 00 00 00 74 14 83 c3 01 41 39 dc 0f 85 f1 fe ff
   RIP: __vb2_queue_alloc+0x134/0x4e0 [videobuf2_core] RSP: ffffb7c40241bc60
   CR2: ffff888b773c0890

So, add a sanity check in order to prevent going past array.
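
A sanity check of this kind can be sketched as a range test that runs before any array slot is touched. This is a Python model, not the videobuf2 patch itself: `MAX_BUFFERS`, `buffer_range_ok`, and the limit of 32 are hypothetical, and the negative count is simply the value printed in the log above.

```python
MAX_BUFFERS = 32  # hypothetical array size, chosen for illustration


def buffer_range_ok(first_index, count, max_buffers=MAX_BUFFERS):
    """Return True only if [first_index, first_index + count) fits the array."""
    if first_index < 0 or count < 0:
        return False
    return first_index + count <= max_buffers


print(buffer_range_ok(0, 8))             # True
print(buffer_range_ok(33, -2143943680))  # False: count from the log is negative
print(buffer_range_ok(30, 4))            # False: would run past the end
```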

Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Acked-by: Sakari Ailus <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
anholt pushed a commit that referenced this issue Jun 22, 2018
syzbot reported a use-after-free:

BUG: KASAN: use-after-free in ip6_route_mpath_notify+0xe9/0x100 net/ipv6/route.c:4180
Read of size 4 at addr ffff8801bf789cf0 by task syz-executor756/4555

CPU: 1 PID: 4555 Comm: syz-executor756 Not tainted 4.17.0-rc7+ #78
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
 print_address_description+0x6c/0x20b mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
 __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
 ip6_route_mpath_notify+0xe9/0x100 net/ipv6/route.c:4180
 ip6_route_multipath_add+0x615/0x1910 net/ipv6/route.c:4303
 inet6_rtm_newroute+0xe3/0x160 net/ipv6/route.c:4391
 ...

Allocated by task 4555:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
 kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554
 dst_alloc+0xbb/0x1d0 net/core/dst.c:104
 __ip6_dst_alloc+0x35/0xa0 net/ipv6/route.c:361
 ip6_dst_alloc+0x29/0xb0 net/ipv6/route.c:376
 ip6_route_info_create+0x4d4/0x3a30 net/ipv6/route.c:2834
 ip6_route_multipath_add+0xc7e/0x1910 net/ipv6/route.c:4240
 inet6_rtm_newroute+0xe3/0x160 net/ipv6/route.c:4391
 ...

Freed by task 4555:
 save_stack+0x43/0xd0 mm/kasan/kasan.c:448
 set_track mm/kasan/kasan.c:460 [inline]
 __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
 __cache_free mm/slab.c:3498 [inline]
 kmem_cache_free+0x86/0x2d0 mm/slab.c:3756
 dst_destroy+0x267/0x3c0 net/core/dst.c:140
 dst_release_immediate+0x71/0x9e net/core/dst.c:205
 fib6_add+0xa40/0x1650 net/ipv6/ip6_fib.c:1305
 __ip6_ins_rt+0x6c/0x90 net/ipv6/route.c:1011
 ip6_route_multipath_add+0x513/0x1910 net/ipv6/route.c:4267
 inet6_rtm_newroute+0xe3/0x160 net/ipv6/route.c:4391
 ...

The problem is that rt_last can point to a deleted route if the insert
fails.

One reproducer is to insert a route and then add a multipath route that
has a duplicate nexthop, e.g.:
    $ ip -6 ro add vrf red 2001:db8:101::/64 nexthop via 2001:db8:1::2
    $ ip -6 ro append vrf red 2001:db8:101::/64 nexthop via 2001:db8:1::4 nexthop via 2001:db8:1::2

Fix by not setting rt_last until it is verified that the insert succeeded.
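
The ordering that this fix describes can be modelled in a few lines of Python. This is a simplified sketch, not the kernel code: `Route`, `insert_route`, and `multipath_add` are hypothetical, a `freed` flag stands in for the route being destroyed, and stopping the loop on the first failure is an assumption.

```python
class Route:
    def __init__(self, nexthop):
        self.nexthop = nexthop
        self.freed = False


def insert_route(table, rt):
    """Hypothetical insert: a duplicate nexthop fails and releases the route."""
    if any(r.nexthop == rt.nexthop for r in table):
        rt.freed = True  # stands in for the route being destroyed on failure
        return False
    table.append(rt)
    return True


def multipath_add(table, nexthops):
    """Return the last route that was actually inserted, or None."""
    rt_last = None
    for nh in nexthops:
        rt = Route(nh)
        if not insert_route(table, rt):
            break
        rt_last = rt  # updated only after the insert is known to have succeeded
    return rt_last


table = [Route("2001:db8:1::2")]
last = multipath_add(table, ["2001:db8:1::4", "2001:db8:1::2"])
print(last.nexthop, last.freed)  # 2001:db8:1::4 False
```

Had `rt_last` been assigned before the insert, it would end up pointing at the released duplicate, which is the situation the KASAN report describes.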

Fixes: 3b1137f ("net: ipv6: Change notifications for multipath add to RTA_MULTIPATH")
Cc: Eric Dumazet <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
4 participants