Sometimes you have to debug kernel GPU driver issues on the same
machine that you use for development. For Panfrost at least, recent
kernel releases have made GPU resets much more reliable (but they
sometimes still fail), so this isn't a problem for userspace
development any more.
But other times, you get weird cases where this happens:
It's the WebGL spec. Therefore the bug is caused by WebGL.
Or this:
This is a nearest-filtered hole, inside a linearly-filtered image,
while zoomed into a webpage. It is showing through my desktop
background, so this is clearly a vulnerability. When pinch-zooming in
and out of the webpage, the hole stayed attached to the texture, but
it did grow in size once.
(My current theory is that GEM objects are being freed without using
the proper functions from panfrost.ko, causing use-after-free issues
in the shrinker, as well as attempts to map memory that is already
mapped to point somewhere else.)
How do you debug this?
Often, you'll hit cases where your graphical session, including the
fifty terminal tabs each with an emacsclient(1) instance, decides to
completely freeze. Hitting a button on an IR remote control to launch
a script on another device to SSH back into your computer to
pkill -9 sway; rmmod --force panfrost means that you have to spend a
lot of time getting things set up again after you recompile and
insmod(8) your new panfrost.ko.
An obvious workaround would be to use virtual machines, so that it
only affects a nested window inside your session, but alas, even if
there were a way to use the GPU or at least something that looks like
a GPU from a VM, my old Chromebook is apparently unable to run virtual
machines without firmware modifications. User-Mode Linux doesn't
appear to have ARM support, so that isn't an option either.
(I'm waiting for someone to add GPU register emulation for Mali GPUs
to QEMU or crosvm, to allow catching panfrost.ko bugs by testing
against a fake GPU from a VM.)
What's the solution then?
$ WLR_RENDERER=pixman sway
Yup, that's it.
Or the equivalent flag for whatever compositor you use. If you can't
find it, just rmmod panfrost before you launch it.
Except for one little problem: now you don't have accelerated
graphics, so you can't reproduce the bug!
What to do next depends on whether the kernel bug requires importing
and exporting resources to be hit or not.
If not, you've got it easy.
It only takes a single patch to Mesa which includes sub-optimal
error-checking (oops), a bit of code stolen from the asahi Gallium
driver, and a bit more code with my usual bad habit of
micro-optimising things even when the fast-path is hardly any faster,
and, worse, inconsistent indentation:
if (rsrc->dt_stride == trans->stride) {
memcpy(map, tex_map, rsrc->base.height0 * rsrc->dt_stride);
} else {
for (unsigned row = 0; row < rsrc->base.height0; ++row) {
memcpy(map + row * rsrc->dt_stride,
tex_map + row * trans->stride,
MIN2(rsrc->dt_stride, trans->stride));
}
}
Oh, and that memcpy reads from write-combine memory. I think that
the next part in the register allocation series might cover just how
much of an effect that has on performance.
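For anyone who wants to play with the copy logic outside of Mesa, here is a minimal standalone sketch of the same idea: one big memcpy when both strides match, otherwise a row-by-row copy of the bytes both layouts share. The function name copy_rows and its parameters are made up for illustration and are not Mesa API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy `height` rows of pixel data between two buffers whose rows may be
 * padded to different strides (bytes per row). All names here are
 * illustrative assumptions; none of this is Mesa API.
 */
static void copy_rows(uint8_t *dst, size_t dst_stride,
                      const uint8_t *src, size_t src_stride,
                      size_t height)
{
    assert(dst != NULL && src != NULL);

    if (dst_stride == src_stride) {
        /* Fast path: rows are laid out identically, so one copy will do. */
        memcpy(dst, src, height * dst_stride);
        return;
    }

    /* Slow path: per row, copy only the bytes both layouts have room for. */
    size_t row_bytes = dst_stride < src_stride ? dst_stride : src_stride;
    for (size_t row = 0; row < height; ++row)
        memcpy(dst + row * dst_stride, src + row * src_stride, row_bytes);
}
```

The slow path alone would be correct for both cases, which is why the fast path counts as a micro-optimisation.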
Now you just need to build Mesa with -Dgallium-drivers=panfrost,swrast
and you can launch GPU-accelerated GL apps from inside your
unaccelerated Sway session. But be careful! Other Qt and GTK4 programs
will use OpenGL themselves when given the chance, so it's probably
safest to leave panfrost.ko unloaded until you actually want to run
your test app. Otherwise you may find removing panfrost.ko after
something goes wrong a little difficult.
Another fix, no Mesa patches required
What if... I told you about another super-secret wlroots variable
you can use, to launch a nested sway instance inside your already
running (and software rendered) compositor:
$ WLR_BACKENDS=headless WLR_RENDERER=gles2 WLR_LIBINPUT_NO_DEVICES=1 sway
(I don't know if the LIBINPUT variable is really needed, but it's
probably safer to keep it.)
Weston has a headless backend that probably works as well, and you
could give kwin_wayland --virtual a shot if you really wanted to.
If you don't care about seeing the output from the application that
reproduces the bug, you can just run
WAYLAND_DISPLAY=wayland-2 glmark2-es2-wayland or whatever; otherwise
you can use something like wayvnc and connect from TigerVNC vncviewer
or whatever your favourite VNC client is.
(Aside: To use legacy X11 applications on accelerated Wayland and
avoid #5758, start a new Xwayland instance without -rootless, and use
whatever X11 window manager you want (I like fvwm) inside it. You lose
clipboard syncing, though.)
"Can I haz root?"
Anyone with a Mali GPU at home can test for themselves whether their
kernel is vulnerable to the (possible) use-after-free I hit, which
probably allows for root escalation from any web browser with
accelerated rendering (even without canvas or WebGL) easily enough.
First, increase memory usage so that you only have a few hundred megs
of free memory. An easy way is to create a tmpfs with a size of most
of RAM, and fill it with data via
dd if=/dev/urandom of=/tmp/randomdata bs=1M count=<a few thousand>.
Disable physically backed swap (i.e. anything but zram) as well.
Alternatively, start a Chromium-based web browser and open a new tab
or two.
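If you would rather not guess the dd count for the tmpfs method, one rough way is to derive it from the MemAvailable field of /proc/meminfo. The sketch below is only an illustration: the helper name fill_count_mib and the idea of passing in a target amount to leave free are my assumptions, and the kernel's estimate is approximate anyway.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Given the text of /proc/meminfo, return how many MiB to write (the
 * count= value for dd with bs=1M) so that roughly `leave_mib` MiB stay
 * available. Returns 0 if MemAvailable is missing or already below the
 * target. Hypothetical helper, named for illustration only.
 */
static unsigned long fill_count_mib(const char *meminfo,
                                    unsigned long leave_mib)
{
    assert(meminfo != NULL);

    const char *line = strstr(meminfo, "MemAvailable:");
    unsigned long avail_kib;

    /* The field is reported in kB, e.g. "MemAvailable:   2097152 kB". */
    if (line == NULL || sscanf(line, "MemAvailable: %lu kB", &avail_kib) != 1)
        return 0;

    unsigned long avail_mib = avail_kib / 1024;
    return avail_mib > leave_mib ? avail_mib - leave_mib : 0;
}
```

Read /proc/meminfo into a buffer, pass it in along with however many megs you want left over, and use the result as the count.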
Now:
$ for i in {1..800}; do glmark2-es2-wayland -b refract & sleep 0.2; done
Unless you are (un)lucky, you should hit at least one kernel bug. You
get different ones with the two different methods I described above,
and you can also try changing the sleep(1) time.
Fin.
Though I've already written the second part of the series on speeding
up register allocation, I've been distracted by too many other things
to post it. I am also halfway through writing about the tiler heap,
which unfortunately does not appear to have secret backdoor
instructions.
But right now, the highest-priority thing for me is to recompile the
kernel with KASAN. Maybe that will catch this pesky bug. Um, these
pesky bugs.
Appendix A: stuff from dmesg
(The stack being shown as ???????? is a kernel bug which survived
for a couple of releases, where __get_user is used to look at the
memory to dump, rather than get_kernel_nofault.)
[36401.844840][ T111] 8<--- cut here ---
[36401.844870][ T111] Unable to handle kernel paging request at virtual address 61df8928
[36401.844879][ T111] pgd = 69d587fd
[36401.844888][ T111] [61df8928] *pgd=00000000
[36401.844902][ T111] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[36401.844909][ T111] Modules linked in: cros_ec_debugfs cros_ec_sysfs hantro_vpu(C) snd_soc_rockchip_max98090 panfrost rockchip_rga gpu_sched dw_hdmi_cec videobuf2_dma_contig v4l2_h264 v4l2_mem2mem snd_soc_rockchip_i2s rk_crypto snd_soc_rockchip_pcm brcmfmac brcmutil snd_soc_max98090 cros_ec_spi snd_soc_ts3a227e
[36401.844954][ T111] CPU: 1 PID: 111 Comm: kswapd0 Tainted: G WC 5.15.0-rc5-darkstar #3
[36401.844963][ T111] Hardware name: Rockchip (Device Tree)
[36401.844969][ T111] PC is at panfrost_gem_shrinker_count+0x38/0x9c [panfrost]
[36401.844997][ T111] LR is at 0x0
[36401.845005][ T111] pc : [<bf1853b4>] lr : [<00000000>] psr: 300f0113
[36401.845010][ T111] sp : c278fdb8 ip : 00000000 fp : 00000000
[36401.845017][ T111] r10: c4b0c5f8 r9 : 0000000c r8 : 00000080
[36401.845022][ T111] r7 : c4b0c5f8 r6 : c1c12000 r5 : c4b0c5d0 r4 : 00000000
[36401.845027][ T111] r3 : 61df8820 r2 : 61df892c r1 : 00000000 r0 : c4b0c5f0
[36401.845034][ T111] Flags: nzCV IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[36401.845042][ T111] Control: 10c5387d Table: 0d05406a DAC: 00000051
[36401.845049][ T111] Register r0 information: slab kmalloc-1k start c4b0c400 pointer offset 496 size 1024
[36401.845067][ T111] Register r1 information: NULL pointer
[36401.845075][ T111] Register r2 information: non-paged memory
[36401.845081][ T111] Register r3 information: non-paged memory
[36401.845088][ T111] Register r4 information: NULL pointer
[36401.845094][ T111] Register r5 information: slab kmalloc-1k start c4b0c400 pointer offset 464 size 1024
[36401.845107][ T111] Register r6 information: slab kmalloc-2k start c1c12000 pointer offset 0 size 2048
[36401.845120][ T111] Register r7 information: slab kmalloc-1k start c4b0c400 pointer offset 504 size 1024
[36401.845131][ T111] Register r8 information: non-paged memory
[36401.845138][ T111] Register r9 information: non-paged memory
[36401.845145][ T111] Register r10 information: slab kmalloc-1k start c4b0c400 pointer offset 504 size 1024
[36401.845158][ T111] Register r11 information: NULL pointer
[36401.845164][ T111] Register r12 information: NULL pointer
[36401.845171][ T111] Process kswapd0 (pid: 111, stack limit = 0xc8c0da63)
[36401.845179][ T111] Stack: (0xc278fdb8 to 0xc2790000)
[36401.845186][ T111] fda0: ???????? ????????
[36401.845192][ T111] fdc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845198][ T111] fde0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845203][ T111] fe00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845209][ T111] fe20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845215][ T111] fe40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845221][ T111] fe60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845226][ T111] fe80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845232][ T111] fea0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845237][ T111] fec0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845243][ T111] fee0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845248][ T111] ff00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845254][ T111] ff20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845259][ T111] ff40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845264][ T111] ff60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845269][ T111] ff80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845276][ T111] ffa0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845281][ T111] ffc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845286][ T111] ffe0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845298][ T111] [<bf1853b4>] (panfrost_gem_shrinker_count [panfrost]) from [<c029b194>] (do_shrink_slab+0x30/0x464)
[36401.845334][ T111] [<c029b194>] (do_shrink_slab) from [<c029b674>] (shrink_slab+0xac/0x2c0)
[36401.845348][ T111] [<c029b674>] (shrink_slab) from [<c029f310>] (shrink_node+0x2b8/0x6ec)
[36401.845360][ T111] [<c029f310>] (shrink_node) from [<c029ff78>] (kswapd+0x420/0xa1c)
[36401.845371][ T111] [<c029ff78>] (kswapd) from [<c0152604>] (kthread+0x164/0x194)
[36401.845384][ T111] [<c0152604>] (kthread) from [<c0100130>] (ret_from_fork+0x14/0x24)
[36401.845395][ T111] Exception stack(0xc278ffb0 to 0xc278fff8)
[36401.845402][ T111] ffa0: ???????? ???????? ???????? ????????
[36401.845408][ T111] ffc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[36401.845414][ T111] ffe0: ???????? ???????? ???????? ???????? ???????? ????????
[36401.845421][ T111] Code: e5302008 e2423f43 e1500002 0a00000f (e5932108)
[36401.845431][ T111] ---[ end trace 5371a792ba72612c ]---
[34856.107292][T25454] ------------[ cut here ]------------
[34856.107319][T25454] WARNING: CPU: 1 PID: 25454 at drivers/iommu/io-pgtable-arm.c:293 arm_lpae_map_pages+0x4d0/0x5cc
[34856.107346][T25454] Modules linked in: cros_ec_debugfs cros_ec_sysfs hantro_vpu(C) snd_soc_rockchip_max98090 panfrost rockchip_rga gpu_sched dw_hdmi_cec videobuf2_dma_contig v4l2_h264 v4l2_mem2mem snd_soc_rockchip_i2s rk_crypto snd_soc_rockchip_pcm brcmfmac brcmutil snd_soc_max98090 cros_ec_spi snd_soc_ts3a227e
[34856.107400][T25454] CPU: 1 PID: 25454 Comm: Renderer Tainted: G WC 5.15.0-rc5-darkstar #3
[34856.107411][T25454] Hardware name: Rockchip (Device Tree)
[34856.107423][T25454] [<c010f488>] (unwind_backtrace) from [<c010ae8c>] (show_stack+0x10/0x14)
[34856.107442][T25454] [<c010ae8c>] (show_stack) from [<c103a21c>] (dump_stack_lvl+0x40/0x4c)
[34856.107463][T25454] [<c103a21c>] (dump_stack_lvl) from [<c012df98>] (__warn+0xec/0x148)
[34856.107477][T25454] [<c012df98>] (__warn) from [<c1032a80>] (warn_slowpath_fmt+0x68/0x7c)
[34856.107489][T25454] [<c1032a80>] (warn_slowpath_fmt) from [<c09c7478>] (arm_lpae_map_pages+0x4d0/0x5cc)
[34856.107503][T25454] [<c09c7478>] (arm_lpae_map_pages) from [<c09c75a0>] (arm_lpae_map+0x2c/0x34)
[34856.107515][T25454] [<c09c75a0>] (arm_lpae_map) from [<bf187f98>] (mmu_map_sg+0xa8/0x12c [panfrost])
[34856.107546][T25454] [<bf187f98>] (mmu_map_sg [panfrost]) from [<bf188ab8>] (panfrost_mmu_map+0x6c/0xc8 [panfrost])
[34856.107579][T25454] [<bf188ab8>] (panfrost_mmu_map [panfrost]) from [<bf184e44>] (panfrost_gem_open+0x114/0x208 [panfrost])
[34856.107612][T25454] [<bf184e44>] (panfrost_gem_open [panfrost]) from [<c09e8190>] (drm_gem_handle_create_tail+0xec/0x1c4)
[34856.107638][T25454] [<c09e8190>] (drm_gem_handle_create_tail) from [<c09f7944>] (drm_gem_prime_fd_to_handle+0xb4/0x1e4)
[34856.107650][T25454] [<c09f7944>] (drm_gem_prime_fd_to_handle) from [<c09e92ac>] (drm_ioctl+0x204/0x394)
[34856.107660][T25454] [<c09e92ac>] (drm_ioctl) from [<c031ae50>] (sys_ioctl+0x130/0xba8)
[34856.107674][T25454] [<c031ae50>] (sys_ioctl) from [<c0100060>] (ret_fast_syscall+0x0/0x48)
[34856.107685][T25454] Exception stack(0xc5657fa8 to 0xc5657ff0)
[34856.107693][T25454] 7fa0: ???????? ???????? ???????? ???????? ???????? ????????
[34856.107699][T25454] 7fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[34856.107704][T25454] 7fe0: ???????? ???????? ???????? ????????
[34856.107712][T25454] ---[ end trace 5371a792ba72611a ]---
[11438.233380][ T110] 8<--- cut here ---
[11438.233400][ T110] Unable to handle kernel NULL pointer dereference at virtual address 00000010
[11438.233408][ T110] pgd = 487417c0
[11438.233418][ T110] [00000010] *pgd=00000000
[11438.233430][ T110] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[11438.233440][ T110] Modules linked in: cros_ec_debugfs cros_ec_sysfs hantro_vpu(C) snd_soc_rockchip_max98090 panfrost videobuf2_dma_contig v4l2_h264 dw_hdmi_cec gpu_sched rockchip_rga brcmfmac snd_soc_rockchip_i2s rk_crypto v4l2_mem2mem snd_soc_rockchip_pcm brcmutil snd_soc_max98090 cros_ec_spi snd_soc_ts3a227e [last unloaded: stap_8344c1eb88b90f297c3daaca90da5cf_3507]
[11438.233496][ T110] CPU: 3 PID: 110 Comm: kswapd0 Tainted: G C O 5.15.0-rc5-darkstar #3
[11438.233506][ T110] Hardware name: Rockchip (Device Tree)
[11438.233511][ T110] PC is at arm_dma_unmap_sg+0x38/0x74
[11438.233525][ T110] LR is at drm_gem_shmem_purge_locked+0x64/0x144
[11438.233537][ T110] pc : [<c0113dd0>] lr : [<c0a1564c>] psr: 200d0013
[11438.233543][ T110] sp : c26ffd50 ip : c0113d98 fp : dcd65754
[11438.233548][ T110] r10: 00000000 r9 : 00000000 r8 : 00000005
[11438.233554][ T110] r7 : c1d80010 r6 : c11014e4 r5 : 00000000 r4 : 00000000
[11438.233559][ T110] r3 : c11014e4 r2 : 00000005 r1 : 00000000 r0 : c1d80010
[11438.233565][ T110] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[11438.233574][ T110] Control: 10c5387d Table: 11c5006a DAC: 00000051
[11438.233581][ T110] Register r0 information: slab kmalloc-1k start c1d80000 pointer offset 16 size 1024
[11438.233597][ T110] Register r1 information: NULL pointer
[11438.233606][ T110] Register r2 information: non-paged memory
[11438.233612][ T110] Register r3 information: non-slab/vmalloc memory
[11438.233620][ T110] Register r4 information: NULL pointer
[11438.233626][ T110] Register r5 information: NULL pointer
[11438.233632][ T110] Register r6 information: non-slab/vmalloc memory
[11438.233638][ T110] Register r7 information: slab kmalloc-1k start c1d80000 pointer offset 16 size 1024
[11438.233651][ T110] Register r8 information: non-paged memory
[11438.233658][ T110] Register r9 information: NULL pointer
[11438.233664][ T110] Register r10 information: NULL pointer
[11438.233670][ T110] Register r11 information: slab kmalloc-512 start dcd65600 pointer offset 340 size 512
[11438.233684][ T110] Register r12 information: non-slab/vmalloc memory
[11438.233690][ T110] Process kswapd0 (pid: 110, stack limit = 0xbdf943f7)
[11438.233696][ T110] Stack: (0xc26ffd50 to 0xc2700000)
[11438.233702][ T110] fd40: ???????? ???????? ???????? ????????
[11438.233708][ T110] fd60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233713][ T110] fd80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233719][ T110] fda0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233724][ T110] fdc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233729][ T110] fde0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233734][ T110] fe00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233740][ T110] fe20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233747][ T110] fe40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233754][ T110] fe60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233759][ T110] fe80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233766][ T110] fea0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233772][ T110] fec0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233777][ T110] fee0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233784][ T110] ff00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233789][ T110] ff20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233794][ T110] ff40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233799][ T110] ff60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233805][ T110] ff80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233810][ T110] ffa0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233815][ T110] ffc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233820][ T110] ffe0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233829][ T110] [<c0113dd0>] (arm_dma_unmap_sg) from [<c0a1564c>] (drm_gem_shmem_purge_locked+0x64/0x144)
[11438.233845][ T110] [<c0a1564c>] (drm_gem_shmem_purge_locked) from [<bf0dc334>] (panfrost_gem_shrinker_scan+0x148/0x190 [panfrost])
[11438.233873][ T110] [<bf0dc334>] (panfrost_gem_shrinker_scan [panfrost]) from [<c029b2e8>] (do_shrink_slab+0x184/0x464)
[11438.233900][ T110] [<c029b2e8>] (do_shrink_slab) from [<c029b674>] (shrink_slab+0xac/0x2c0)
[11438.233912][ T110] [<c029b674>] (shrink_slab) from [<c029f310>] (shrink_node+0x2b8/0x6ec)
[11438.233923][ T110] [<c029f310>] (shrink_node) from [<c029ff78>] (kswapd+0x420/0xa1c)
[11438.233934][ T110] [<c029ff78>] (kswapd) from [<c0152604>] (kthread+0x164/0x194)
[11438.233946][ T110] [<c0152604>] (kthread) from [<c0100130>] (ret_from_fork+0x14/0x24)
[11438.233955][ T110] Exception stack(0xc26fffb0 to 0xc26ffff8)
[11438.233961][ T110] ffa0: ???????? ???????? ???????? ????????
[11438.233967][ T110] ffc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[11438.233972][ T110] ffe0: ???????? ???????? ???????? ???????? ???????? ????????
[11438.233980][ T110] Code: da00000f e1a07000 e1a04001 e3a05000 (e5942010)
[11438.233988][ T110] ---[ end trace 3c244c3514f41c75 ]---
[ 2512.582786][ T1676] ptrace attach of "glmark2-es2 -b refract"[1428] was attempted by "gdb glmark2-es2 1428"[1676]
[ 2538.460964][ T34] INFO: task glmark2-es2:1300 blocked for more than 120 seconds.
[ 2538.460985][ T34] Tainted: G C 5.15.0-rc5-darkstar #3
[ 2538.460992][ T34] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2538.460997][ T34] task:glmark2-es2 state:D stack: 0 pid: 1300 ppid: 1292 flags:0x00000004
[ 2538.461012][ T34] [<c1042c14>] (__schedule) from [<c1043398>] (schedule+0x58/0x10c)
[ 2538.461029][ T34] [<c1043398>] (schedule) from [<c1043924>] (schedule_preempt_disabled+0x14/0x20)
[ 2538.461039][ T34] [<c1043924>] (schedule_preempt_disabled) from [<c1044b14>] (__mutex_lock.constprop.0+0x248/0x5e4)
[ 2538.461051][ T34] [<c1044b14>] (__mutex_lock.constprop.0) from [<bf068acc>] (panfrost_gem_free_object+0x20/0x104 [panfrost])
[ 2538.461078][ T34] [<bf068acc>] (panfrost_gem_free_object [panfrost]) from [<c09e77f0>] (drm_gem_object_release_handle+0x64/0x6c)
[ 2538.461102][ T34] [<c09e77f0>] (drm_gem_object_release_handle) from [<c0918eec>] (idr_for_each+0x44/0xdc)
[ 2538.461116][ T34] [<c0918eec>] (idr_for_each) from [<c09e85bc>] (drm_gem_release+0x1c/0x28)
[ 2538.461127][ T34] [<c09e85bc>] (drm_gem_release) from [<c09e60a4>] (drm_file_free.part.0+0x1ec/0x224)
[ 2538.461139][ T34] [<c09e60a4>] (drm_file_free.part.0) from [<c09e6460>] (drm_release+0x64/0x148)
[ 2538.461150][ T34] [<c09e6460>] (drm_release) from [<c0307a58>] (__fput+0x74/0x248)
[ 2538.461162][ T34] [<c0307a58>] (__fput) from [<c014f7ac>] (task_work_run+0x90/0xbc)
[ 2538.461174][ T34] [<c014f7ac>] (task_work_run) from [<c0133c88>] (do_exit+0x3a0/0xad0)
[ 2538.461186][ T34] [<c0133c88>] (do_exit) from [<c0134420>] (do_group_exit+0x3c/0xb8)
[ 2538.461209][ T34] [<c0134420>] (do_group_exit) from [<c01410e4>] (get_signal+0x1c8/0x9c4)
[ 2538.461222][ T34] [<c01410e4>] (get_signal) from [<c010a5b4>] (do_work_pending+0xf0/0x558)
[ 2538.461233][ T34] [<c010a5b4>] (do_work_pending) from [<c01000c0>] (slow_work_pending+0xc/0x20)
[ 2538.461243][ T34] Exception stack(0xceaa3fb0 to 0xceaa3ff8)
[ 2538.461249][ T34] 3fa0: ???????? ???????? ???????? ????????
[ 2538.461255][ T34] 3fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2538.461260][ T34] 3fe0: ???????? ???????? ???????? ???????? ???????? ????????
[ 689.948918] 8<--- cut here ---
[ 689.948947] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[ 689.948955] pgd = ddb227dd
[ 689.948965] [00000004] *pgd=00000000
[ 689.948978] Internal error: Oops: 817 [#1] PREEMPT SMP ARM
[ 689.948986] Modules linked in: cros_ec_sysfs cros_ec_debugfs snd_soc_rockchip_max98090 hantro_vpu(C) panfrost brcmfmac videobuf2_dma_contig gpu_sched v4l2_h264 brcmutil rockchip_rga dw_hdmi_cec v4l2_mem2mem rk_crypto snd_soc_rockchip_i2s snd_soc_rockchip_pcm snd_soc_max98090 cros_ec_spi snd_soc_ts3a227e
[ 689.949030] CPU: 1 PID: 1155 Comm: glmark2-es2 Tainted: G C 5.15.0-rc5-darkstar #3
[ 689.949041] Hardware name: Rockchip (Device Tree)
[ 689.949047] PC is at panfrost_mmu_reset+0x4c/0xac [panfrost]
[ 689.949077] LR is at 0xc4a3fd68
[ 689.949085] pc : [<bf0de9ec>] lr : [<c4a3fd68>] psr: 20010113
[ 689.949091] sp : cc6a5d60 ip : fffffef8 fp : 000000a0
[ 689.949096] r10: a335bd86 r9 : c4b10a80 r8 : c4a3fd4c
[ 689.949102] r7 : ffffffff r6 : c4a3fc40 r5 : 00000000 r4 : c4a3fd68
[ 689.949107] r3 : cc3e7708 r2 : cc3e7600 r1 : 00000000 r0 : c4a3fd4c
[ 689.949114] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 689.949123] Control: 10c5387d Table: 0c91006a DAC: 00000051
[ 689.949127] Register r0 information: slab kmalloc-1k start c4a3fc00 pointer offset 332 size 1024
[ 689.949147] Register r1 information: NULL pointer
[ 689.949157] Register r2 information: slab kmalloc-512 start cc3e7600 pointer offset 0 size 512
[ 689.949171] Register r3 information: slab kmalloc-512 start cc3e7600 pointer offset 264 size 512
[ 689.949186] Register r4 information: slab kmalloc-1k start c4a3fc00 pointer offset 360 size 1024
[ 689.949200] Register r5 information: NULL pointer
[ 689.949207] Register r6 information: slab kmalloc-1k start c4a3fc00 pointer offset 64 size 1024
[ 689.949222] Register r7 information: non-paged memory
[ 689.949228] Register r8 information: slab kmalloc-1k start c4a3fc00 pointer offset 332 size 1024
[ 689.949242] Register r9 information: slab kmalloc-128 start c4b10a80 pointer offset 0 size 128
[ 689.949257] Register r10 information: non-paged memory
[ 689.949265] Register r11 information: non-paged memory
[ 689.949272] Register r12 information: non-paged memory
[ 689.949280] Process glmark2-es2 (pid: 1155, stack limit = 0x7c115980)
[ 689.949291] Stack: (0xcc6a5d60 to 0xcc6a6000)
[ 689.949300] 5d60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949306] 5d80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949312] 5da0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949319] 5dc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949325] 5de0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949332] 5e00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949338] 5e20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949345] 5e40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949351] 5e60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949357] 5e80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949364] 5ea0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949371] 5ec0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949377] 5ee0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949384] 5f00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949390] 5f20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949398] 5f40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949405] 5f60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949411] 5f80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949417] 5fa0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949424] 5fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949429] 5fe0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949443] [<bf0de9ec>] (panfrost_mmu_reset [panfrost]) from [<bf0da5e0>] (panfrost_device_resume+0x20/0x38 [panfrost])
[ 689.949490] [<bf0da5e0>] (panfrost_device_resume [panfrost]) from [<c0a52e30>] (genpd_runtime_resume+0x94/0x244)
[ 689.949531] [<c0a52e30>] (genpd_runtime_resume) from [<c0a47a24>] (__rpm_callback+0x3c/0x108)
[ 689.949550] [<c0a47a24>] (__rpm_callback) from [<c0a47b50>] (rpm_callback+0x60/0x64)
[ 689.949563] [<c0a47b50>] (rpm_callback) from [<c0a480e0>] (rpm_resume+0x58c/0x7a8)
[ 689.949578] [<c0a480e0>] (rpm_resume) from [<c0a48330>] (__pm_runtime_resume+0x34/0x6c)
[ 689.949590] [<c0a48330>] (__pm_runtime_resume) from [<bf0df5bc>] (panfrost_perfcnt_close+0x24/0x78 [panfrost])
[ 689.949620] [<bf0df5bc>] (panfrost_perfcnt_close [panfrost]) from [<bf0d9328>] (panfrost_postclose+0x10/0x2c [panfrost])
[ 689.949651] [<bf0d9328>] (panfrost_postclose [panfrost]) from [<c09e606c>] (drm_file_free.part.0+0x1b4/0x224)
[ 689.949692] [<c09e606c>] (drm_file_free.part.0) from [<c09e6460>] (drm_release+0x64/0x148)
[ 689.949710] [<c09e6460>] (drm_release) from [<c0307a58>] (__fput+0x74/0x248)
[ 689.949731] [<c0307a58>] (__fput) from [<c014f7ac>] (task_work_run+0x90/0xbc)
[ 689.949753] [<c014f7ac>] (task_work_run) from [<c0133c88>] (do_exit+0x3a0/0xad0)
[ 689.949775] [<c0133c88>] (do_exit) from [<c0134420>] (do_group_exit+0x3c/0xb8)
[ 689.949785] [<c0134420>] (do_group_exit) from [<c01410e4>] (get_signal+0x1c8/0x9c4)
[ 689.949807] [<c01410e4>] (get_signal) from [<c010a5b4>] (do_work_pending+0xf0/0x558)
[ 689.949825] [<c010a5b4>] (do_work_pending) from [<c01000c0>] (slow_work_pending+0xc/0x20)
[ 689.949841] Exception stack(0xcc6a5fb0 to 0xcc6a5ff8)
[ 689.949847] 5fa0: ???????? ???????? ???????? ????????
[ 689.949856] 5fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949864] 5fe0: ???????? ???????? ???????? ???????? ???????? ????????
[ 689.949880] Code: e3e07000 e592e10c e5825104 e5827100 (e581e004)
[ 689.949896] ---[ end trace de6f7d01ac03d9e0 ]---
[ 689.959456] note: glmark2-es2[1155] exited with preempt_count 1
[ 690.013004] Fixing recursive fault but reboot is needed!
[ 2379.786174] 8<--- cut here ---
[ 2379.786205] Unable to handle kernel paging request at virtual address fffffffc
[ 2379.786212] pgd = f3957c37
[ 2379.786221] [fffffffc] *pgd=2fffd861, *pte=00000000, *ppte=00000000
[ 2379.786237] Internal error: Oops: 37 [#1] PREEMPT SMP ARM
[ 2379.786246] Modules linked in: panfrost cros_ec_sysfs cros_ec_debugfs hantro_vpu(C) dw_hdmi_cec snd_soc_rockchip_max98090 gpu_sched rockchip_rga videobuf2_dma_contig v4l2_h264 brcmfmac rk_crypto v4l2_mem2mem snd_soc_rockchip_i2s snd_soc_rockchip_pcm brcmutil snd_soc_max98090 cros_ec_spi snd_soc_ts3a227e [last unloaded: panfrost]
[ 2379.786293] CPU: 1 PID: 110 Comm: kswapd0 Tainted: G C 5.15.0-rc5-darkstar #3
[ 2379.786301] Hardware name: Rockchip (Device Tree)
[ 2379.786307] PC is at panfrost_gem_shrinker_count+0x38/0x9c [panfrost]
[ 2379.786337] LR is at 0x0
[ 2379.786343] pc : [<bf0a03b4>] lr : [<00000000>] psr: a0070013
[ 2379.786349] sp : c266fdb8 ip : 00000000 fp : 00000000
[ 2379.786355] r10: c4a901f8 r9 : 0000000c r8 : 00000080
[ 2379.786361] r7 : c4a901f8 r6 : c1c12000 r5 : c4a901d0 r4 : 00000000
[ 2379.786367] r3 : fffffef4 r2 : 00000000 r1 : 00000000 r0 : c4a901f0
[ 2379.786374] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 2379.786382] Control: 10c5387d Table: 0d17806a DAC: 00000051
[ 2379.786387] Register r0 information: slab kmalloc-1k start c4a90000 pointer offset 496 size 1024
[ 2379.786405] Register r1 information: NULL pointer
[ 2379.786414] Register r2 information: NULL pointer
[ 2379.786422] Register r3 information: non-paged memory
[ 2379.786428] Register r4 information: NULL pointer
[ 2379.786434] Register r5 information: slab kmalloc-1k start c4a90000 pointer offset 464 size 1024
[ 2379.786449] Register r6 information: slab kmalloc-2k start c1c12000 pointer offset 0 size 2048
[ 2379.786462] Register r7 information: slab kmalloc-1k start c4a90000 pointer offset 504 size 1024
[ 2379.786474] Register r8 information: non-paged memory
[ 2379.786482] Register r9 information: non-paged memory
[ 2379.786488] Register r10 information: slab kmalloc-1k start c4a90000 pointer offset 504 size 1024
[ 2379.786502] Register r11 information: NULL pointer
[ 2379.786510] Register r12 information: NULL pointer
[ 2379.786516] Process kswapd0 (pid: 110, stack limit = 0xc70f127f)
[ 2379.786522] Stack: (0xc266fdb8 to 0xc2670000)
[ 2379.786528] fda0: ???????? ????????
[ 2379.786534] fdc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786540] fde0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786545] fe00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786551] fe20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786557] fe40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786563] fe60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786568] fe80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786573] fea0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786579] fec0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786585] fee0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786590] ff00: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786597] ff20: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786602] ff40: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786608] ff60: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786614] ff80: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786621] ffa0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786626] ffc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786631] ffe0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786641] [<bf0a03b4>] (panfrost_gem_shrinker_count [panfrost]) from [<c029b194>] (do_shrink_slab+0x30/0x464)
[ 2379.786677] [<c029b194>] (do_shrink_slab) from [<c029b674>] (shrink_slab+0xac/0x2c0)
[ 2379.786688] [<c029b674>] (shrink_slab) from [<c029f310>] (shrink_node+0x2b8/0x6ec)
[ 2379.786700] [<c029f310>] (shrink_node) from [<c029ff78>] (kswapd+0x420/0xa1c)
[ 2379.786712] [<c029ff78>] (kswapd) from [<c0152604>] (kthread+0x164/0x194)
[ 2379.786725] [<c0152604>] (kthread) from [<c0100130>] (ret_from_fork+0x14/0x24)
[ 2379.786735] Exception stack(0xc266ffb0 to 0xc266fff8)
[ 2379.786741] ffa0: ???????? ???????? ???????? ????????
[ 2379.786747] ffc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786752] ffe0: ???????? ???????? ???????? ???????? ???????? ????????
[ 2379.786761] Code: e5302008 e2423f43 e1500002 0a00000f (e5932108)
[ 2379.786803] ---[ end trace eb69bad2a84d2a4f ]---
1118353-[ 2301.719545] panfrost ffa30000.gpu: map: as=2, iova=7404000, paddr=44b98000, len=8000
1118354-[ 2301.719558] panfrost ffa30000.gpu: map: as=2, iova=740c000, paddr=b1ce8000, len=8000
1118355-[ 2301.719567] ------------[ cut here ]------------
1118356-[ 2301.719576] WARNING: CPU: 2 PID: 7155 at drivers/iommu/io-pgtable-arm.c:293 arm_lpae_map_pages+0x4d0/0x5cc
1118357-[ 2301.719611] Modules linked in: panfrost(O) cros_ec_debugfs cros_ec_sysfs hantro_vpu(C) snd_soc_rockchip_max98090 videobuf2_dma_contig v4l2_h264 gpu_sched brcmfmac rk_crypto dw_hdmi_cec rockchip_rga snd_soc_rockchip_i2s v4l2_mem2mem snd_soc_rockchip_pcm brcmutil cros_ec_spi snd_soc_max98090 snd_soc_ts3a227e [last unloaded: panfrost]
1118358-[ 2301.719677] CPU: 2 PID: 7155 Comm: sway Tainted: G WC O 5.15.0-rc5-darkstar #3
1118359:[ 2301.719695] Hardware name: Rockchip (Device Tree)
1118360-[ 2301.719709] [<c010f488>] (unwind_backtrace) from [<c010ae8c>] (show_stack+0x10/0x14)
1118361-[ 2301.719742] [<c010ae8c>] (show_stack) from [<c103a21c>] (dump_stack_lvl+0x40/0x4c)
1118362-[ 2301.719765] [<c103a21c>] (dump_stack_lvl) from [<c012df98>] (__warn+0xec/0x148)
1118363-[ 2301.719785] [<c012df98>] (__warn) from [<c1032a80>] (warn_slowpath_fmt+0x68/0x7c)
1118364-[ 2301.719801] [<c1032a80>] (warn_slowpath_fmt) from [<c09c7478>] (arm_lpae_map_pages+0x4d0/0x5cc)
1118365-[ 2301.719823] [<c09c7478>] (arm_lpae_map_pages) from [<c09c75a0>] (arm_lpae_map+0x2c/0x34)
1118366-[ 2301.719839] [<c09c75a0>] (arm_lpae_map) from [<bf0b6048>] (mmu_map_sg+0x108/0x164 [panfrost])
1118367-[ 2301.719882] [<bf0b6048>] (mmu_map_sg [panfrost]) from [<bf0b6b94>] (panfrost_mmu_map+0x6c/0xc8 [panfrost])
1118368-[ 2301.719924] [<bf0b6b94>] (panfrost_mmu_map [panfrost]) from [<bf0b2e9c>] (panfrost_gem_open+0x10c/0x204 [panfrost])
1118369-[ 2301.719963] [<bf0b2e9c>] (panfrost_gem_open [panfrost]) from [<c09e8190>] (drm_gem_handle_create_tail+0xec/0x1c4)
1118370-[ 2301.720014] [<c09e8190>] (drm_gem_handle_create_tail) from [<c09f7944>] (drm_gem_prime_fd_to_handle+0xb4/0x1e4)
1118371-[ 2301.720033] [<c09f7944>] (drm_gem_prime_fd_to_handle) from [<c09e92ac>] (drm_ioctl+0x204/0x394)
1118372-[ 2301.720045] [<c09e92ac>] (drm_ioctl) from [<c031ae50>] (sys_ioctl+0x130/0xba8)
1118373-[ 2301.720063] [<c031ae50>] (sys_ioctl) from [<c0100060>] (ret_fast_syscall+0x0/0x48)
1118374-[ 2301.720076] Exception stack(0xcfd7dfa8 to 0xcfd7dff0)
1118375-[ 2301.720077] panfrost ffa30000.gpu: unmap: as=3, iova=63e0000, len=20000
1118376-[ 2301.720084] dfa0: ???????? ???????? ???????? ???????? ???????? ????????
1118377-[ 2301.720088] dfc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ????????
1118378-[ 2301.720099] dfe0: ???????? ???????? ???????? ????????
1118379-[ 2301.720108] ---[ end trace 576f58ddaf677cf1 ]---
1118380-[ 2301.720121] panfrost ffa30000.gpu: map: as=2, iova=7414000, paddr=bed50000, len=10000
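One rough way to test the "mapping memory that is already mapped" theory against a log like this is to replay the map and unmap lines and flag any map that overlaps a range that is still live. The following is only a minimal sketch in Python, not something used for the debugging above: `LINE_RE` and `find_double_maps` are hypothetical names. It assumes the numbers in these lines are hex, which the consecutive iova values suggest. It also assumes that the `as=` number identifies one address space for the whole log, which may not hold if the slot gets reassigned.

```python
import re

# Matches the panfrost map/unmap debug lines as they appear in the log above.
# Assumption: iova and len are printed in hex without a 0x prefix.
LINE_RE = re.compile(
    r"panfrost \S+: (?P<op>map|unmap): as=(?P<asid>-?\d+), "
    r"iova=(?P<iova>[0-9a-f]+),(?: paddr=[0-9a-f]+,)? len=(?P<len>[0-9a-f]+)"
)


def find_double_maps(lines):
    """Return (log line, (start, end)) for each map that overlaps a live range."""
    live = {}  # as number -> list of (start, end) ranges currently mapped
    problems = []
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not a map/unmap line (backtraces, other drivers, ...)
        start = int(m["iova"], 16)
        end = start + int(m["len"], 16)
        ranges = live.setdefault(int(m["asid"]), [])
        if m["op"] == "map":
            for s, e in ranges:
                if start < e and s < end:  # half-open ranges overlap
                    problems.append((line.rstrip(), (s, e)))
                    break
            ranges.append((start, end))
        else:
            # Simplification: an unmap drops every live range it touches,
            # even if it only covers part of that range.
            ranges[:] = [(s, e) for s, e in ranges if e <= start or s >= end]
    return problems


# Three lines copied from the log above; they are adjacent, not overlapping.
sample = [
    "[ 2301.719545] panfrost ffa30000.gpu: map: as=2, iova=7404000, paddr=44b98000, len=8000",
    "[ 2301.719558] panfrost ffa30000.gpu: map: as=2, iova=740c000, paddr=b1ce8000, len=8000",
    "[ 2301.720077] panfrost ffa30000.gpu: unmap: as=3, iova=63e0000, len=20000",
]
print(find_double_maps(sample))  # prints []
```

Feeding it the whole log rather than a small excerpt matters here, because the mapping that a later map collides with may have been created many thousands of lines earlier.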
About the author
Apparently too young both to drink and to develop GPU drivers,
Icecream95 ignores only the second of these, and spends his time
working on a fork of the Panfrost driver in Mesa for Arm Midgard,
Bifrost, and Valhall GPUs. From his first bug report to the kernel
back when he could still write his age with a single hex digit,
Icecream95 has found kernel bugs that others haven't, but later
discovered that if they are only mentioned on IRC, they can end up
forgotten. Now his age in hex is a palindrome, and some of the bugs
are over a year old and still unfixed.
Lightning McQueen also has 95 on his side, and crashes sometimes, but
this isn't the reason for Icecream95's username. Icecream98 was
already taken on Scratch, and he decided that XP was too stable to go
in that direction. Icecream95 doesn't remember much about using
Windows 95 though, and certainly would've forgotten about any kernel
bugs.