What do I do when I get a Linux kernel bug?
- by raldi
I just bought a tiny computer called a fit-pc2, which came with a somewhat customized Ubuntu 9.10 installation. uname -a reports:
Linux 2.6.31-34-fitpc2 #7 SMP Thu Apr 22 17:43:26 IDT 2010 i686 GNU/Linux
It seems that after several hours under heavy network load, all networking stops and the following appears in kern.log:
BUG: unable to handle kernel paging request at ff09dfc0
IP: [<c0150300>] kthread_should_stop+0x10/0x20
*pde = 00000000
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb1/idVendor
Modules linked in: binfmt_misc ppdev sbc_fitpc2_wdt snd_usb_audio snd_usb_lib i2c_isch sch_gpio snd_seq_dummy snd_hda_intel snd_pcm_oss snd_seq_oss snd_seq_midi snd_rawmidi snd_mixer_oss snd_seq_midi_event snd_seq snd_pcm snd_timer snd_page_alloc snd_seq_device iptable_filter ip_tables x_tables snd_hwdep lpc_sch snd psmouse rt2860sta(C) uvcvideo video pl2303 soundcore mfd_core output videodev v4l1_compat lirc_igorplugusb lirc_dev serio_raw lp parport usbhid r8169 mii iegd_mod drm agpgart
Pid: 16, comm: kblockd/1 Tainted: G C (2.6.31-34-fitpc2 #7) SBC-FITPC2
EIP: 0060:[<c0150300>] EFLAGS: 00010246 CPU: 1
EIP is at kthread_should_stop+0x10/0x20
EAX: ff09dfc4 EBX: c180cbac ECX: 0109d000 EDX: f709df98
ESI: f709df98 EDI: c180cba0 EBP: f709dfb8 ESP: f709df90
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process kblockd/1 (pid: 16, ti=f709c000 task=f7084b60 task.ti=f709c000)
Stack:
c014c14d c180cba4 00000000 f7084b60 c0150770 f709dfa4 f709dfa4 f7023ef4
<0> c180cba0 c014c0d0 f709dfe0 c015047c 00000000 00000000 00000000 f709dfcc
<0> f709dfcc c0150400 00000000 00000000 00000000 c0103ce7 f7023ef4 00000000
Call Trace:
[<c014c14d>] ? worker_thread+0x7d/0xe0
[<c0150770>] ? autoremove_wake_function+0x0/0x40
[<c014c0d0>] ? worker_thread+0x0/0xe0
[<c015047c>] ? kthread+0x7c/0x90
[<c0150400>] ? kthread+0x0/0x90
[<c0103ce7>] ? kernel_thread_helper+0x7/0x10
Code: a6 8b 55 0c 8d 4d e0 89 f8 89 34 24 e8 7a fd ff ff 89 c3 eb 92 90 90 90 90 90 90 55 64 a1 00 80 76 c0 8b 80 70 02 00 00 89 e5 5d <8b> 40 fc c3 8d b6 00 00 00 00 8d bf 00 00 00 00 55 ba d7 86 62
EIP: [<c0150300>] kthread_should_stop+0x10/0x20 SS:ESP 0068:f709df90
CR2: 00000000ff09dfc0
---[ end trace 06004df70b9cf435 ]---
BUG: unable to handle kernel paging request at ff09dfc8
IP: [<c0521bc8>] _spin_lock_irqsave+0x18/0x30
*pde = 00000000
Oops: 0002 [#2] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb1/idVendor
Modules linked in: binfmt_misc ppdev sbc_fitpc2_wdt snd_usb_audio snd_usb_lib i2c_isch sch_gpio snd_seq_dummy snd_hda_intel snd_pcm_oss snd_seq_oss snd_seq_midi snd_rawmidi snd_mixer_oss snd_seq_midi_event snd_seq snd_pcm snd_timer snd_page_alloc snd_seq_device iptable_filter ip_tables x_tables snd_hwdep lpc_sch snd psmouse rt2860sta(C) uvcvideo video pl2303 soundcore mfd_core output videodev v4l1_compat lirc_igorplugusb lirc_dev serio_raw lp parport usbhid r8169 mii iegd_mod drm agpgart
Pid: 16, comm: kblockd/1 Tainted: G D C (2.6.31-34-fitpc2 #7) SBC-FITPC2
EIP: 0060:[<c0521bc8>] EFLAGS: 00010086 CPU: 1
EIP is at _spin_lock_irqsave+0x18/0x30
EAX: 00000100 EBX: ff09dfc8 ECX: 00000286 EDX: ff09dfc8
ESI: f7084b60 EDI: ff09dfc4 EBP: f709dd88 ESP: f709dd88
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process kblockd/1 (pid: 16, ti=f709c000 task=f7084b60 task.ti=f709c000)
Stack:
f709dda4 c0127c0b 00000082 00000001 ff09dfc4 f7084b60 00000000 f709ddd0
<0> c0137fd2 00000086 f70954c4 00000000 f7098480 f709ddf0 f7094fc0 f7084b60
<0> 00000000 00000009 f709ddf0 c013c3f8 00000001 c1807c60 f709ddf0 f7084b60
Call Trace:
[<c0127c0b>] ? complete+0x1b/0x60
[<c0137fd2>] ? mm_release+0x52/0xf0
[<c013c3f8>] ? exit_mm+0x18/0x110
[<c013c6db>] ? do_exit+0xfb/0x2e0
[<c013998a>] ? print_oops_end_marker+0x2a/0x30
[<c0522aab>] ? oops_end+0x8b/0xd0
[<c011eac4>] ? no_context+0xb4/0xd0
[<c011eb1d>] ? __bad_area_nosemaphore+0x3d/0x1a0
[<c0133a56>] ? load_balance_newidle+0x96/0x320
[<c011ec92>] ? bad_area_nosemaphore+0x12/0x20
[<c0524106>] ? do_page_fault+0x2f6/0x380
[<c012cc30>] ? finish_task_switch+0x50/0xe0
[<c0523e10>] ? do_page_fault+0x0/0x380
[<c0522006>] ? error_code+0x66/0x70
[<c0523e10>] ? do_page_fault+0x0/0x380
[<c0150300>] ? kthread_should_stop+0x10/0x20
[<c014c14d>] ? worker_thread+0x7d/0xe0
[<c0150770>] ? autoremove_wake_function+0x0/0x40
[<c014c0d0>] ? worker_thread+0x0/0xe0
[<c015047c>] ? kthread+0x7c/0x90
[<c0150400>] ? kthread+0x0/0x90
[<c0103ce7>] ? kernel_thread_helper+0x7/0x10
Code: 00 00 00 55 89 e5 f0 83 28 01 79 05 e8 02 ff ff ff 5d c3 55 89 c2 89 e5 9c 58 8d 74 26 00 89 c1 fa 90 8d 74 26 00 b8 00 01 00 00 <f0> 66 0f c1 02 38 e0 74 06 f3 90 8a 02 eb f6 89 c8 5d c3 90 8d
EIP: [<c0521bc8>] _spin_lock_irqsave+0x18/0x30 SS:ESP 0068:f709dd88
CR2: 00000000ff09dfc8
---[ end trace 06004df70b9cf436 ]---
Fixing recursive fault but reboot is needed!
This seems to happen at least once a day. How do I even begin to debug this?
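As a first triage step, the headline fields of each oops (fault address, faulting symbol, process name, taint flags) can be pulled out of kern.log so that crashes from different days can be compared side by side. The following is only a sketch: the function name summarize_oopses and the regular expressions are assumptions written against the format of the two reports above, not part of any standard tool, and other kernel versions or architectures may print these lines differently.

```python
import re

# Patterns written against the oops format shown above (assumed, not a standard).
BUG_RE = re.compile(r"BUG: (?P<reason>.+?) at (?P<addr>[0-9a-f]+)")
# The lookbehind skips the later "EIP: [<...>]" summary line.
IP_RE = re.compile(r"(?<!E)IP: \[<(?P<ip>[0-9a-f]+)>\] (?P<symbol>\S+)")
TAINT_RE = re.compile(r"comm: (?P<comm>\S+) Tainted: (?P<flags>[A-Z ]+?) \(")


def summarize_oopses(lines):
    """Return one dict per 'BUG:' report found in an iterable of log lines."""
    reports = []
    for line in lines:
        bug = BUG_RE.search(line)
        if bug:
            # A new report starts at every "BUG:" line.
            reports.append({"reason": bug["reason"], "addr": bug["addr"]})
            continue
        if not reports:
            continue  # ignore everything before the first report
        current = reports[-1]
        ip = IP_RE.search(line)
        if ip and "ip" not in current:
            current["ip"] = ip["ip"]
            current["symbol"] = ip["symbol"]
        taint = TAINT_RE.search(line)
        if taint and "taint" not in current:
            current["comm"] = taint["comm"]
            current["taint"] = taint["flags"].split()
    return reports


# A few lines copied from the log above, used as sample input.
sample = [
    "BUG: unable to handle kernel paging request at ff09dfc0",
    "IP: [<c0150300>] kthread_should_stop+0x10/0x20",
    "Pid: 16, comm: kblockd/1 Tainted: G C (2.6.31-34-fitpc2 #7) SBC-FITPC2",
    "EIP: [<c0150300>] kthread_should_stop+0x10/0x20 SS:ESP 0068:f709df90",
    "BUG: unable to handle kernel paging request at ff09dfc8",
    "IP: [<c0521bc8>] _spin_lock_irqsave+0x18/0x30",
    "Pid: 16, comm: kblockd/1 Tainted: G D C (2.6.31-34-fitpc2 #7) SBC-FITPC2",
]

for report in summarize_oopses(sample):
    print(report)
```

To run it on the real log, pass an open file object for kern.log in place of sample. For the two reports above, the taint flags come out as G C and G D C. The C flag marks a staging driver, which matches the rt2860sta(C) entry in the module list, and the D on the second report means the kernel had already oopsed once.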