Server randomly freezes
- by PsySkeletor
Im facing a very strange issue, my debian squeeze freezes up always at night (Berlin, time).
Here is what i get from a time and after doing this a few times, it becomes frozen and must be hard-reset.
From /var/log/messages
Dec 11 01:36:11 srv156 kernel: [125983.204251] CPU 1:
Dec 11 01:36:11 srv156 kernel: [125983.204251] Modules linked in: xt_multiport nf_conntrack_ipv4 nf_defrag_ipv4 xt_recent xt_state nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables hwmon_vid snd_hda_codec_atihdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm radeon snd_timer ttm drm_kms_helper snd k10temp i2c_piix4 soundcore snd_page_alloc edac_core parport_pc drm i2c_algo_bit i2c_core shpchp pci_hotplug pcspkr edac_mce_amd parport wmi evdev processor button ext3 jbd mbcache raid1 md_mod sd_mod crc_t10dif ata_generic ahci ohci_hcd pata_atiixp e100 mii libata xhci floppy ehci_hcd thermal thermal_sys usbcore scsi_mod nls_base [last unloaded: i2c_dev]
Dec 11 01:36:11 srv156 kernel: [125983.204251] Pid: 758, comm: flush-9:0 Tainted: G B 2.6.32-5-amd64 #1 GA-78LMT-USB3
Dec 11 01:36:11 srv156 kernel: [125983.204251] RIP: 0010:[<ffffffff810b3506>] [<ffffffff810b3506>] find_get_pages_tag+0x66/0xdd
Dec 11 01:36:11 srv156 kernel: [125983.204251] RSP: 0018:ffff8804235e7b30 EFLAGS: 00000286
Dec 11 01:36:11 srv156 kernel: [125983.204251] RAX: ffffffffffffffff RBX: ffff8804235e7c00 RCX: 0000000000000000
Dec 11 01:36:11 srv156 kernel: [125983.204251] RDX: 0000000000040000 RSI: ffffea000496b2a8 RDI: ffffea000496b2a0
Dec 11 01:36:11 srv156 kernel: [125983.204251] RBP: ffffffff8101166e R08: ffff8804235e7af0 R09: 0000000000000000
Dec 11 01:36:11 srv156 kernel: [125983.204251] R10: 0000000000000000 R11: 0000000000040000 R12: ffff8804235e7c08
Dec 11 01:36:11 srv156 kernel: [125983.204251] R13: 0000000d22678a20 R14: ffff8804235e7af0 R15: 00000000091b9060
Dec 11 01:36:11 srv156 kernel: [125983.204251] FS: 0000000000000000(0000) GS:ffff880010440000(0000) knlGS:000000007ebf7b70
Dec 11 01:36:11 srv156 kernel: [125983.204522] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Dec 11 01:36:11 srv156 kernel: [125983.204522] CR2: 00000000dec86000 CR3: 0000000001001000 CR4: 00000000000006e0
Dec 11 01:36:11 srv156 kernel: [125983.204522] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Dec 11 01:36:11 srv156 kernel: [125983.204522] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Dec 11 01:36:11 srv156 kernel: [125983.204522] Call Trace:
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff810bb792>] ? pagevec_lookup_tag+0x1a/0x21
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff810ba330>] ? write_cache_pages+0x162/0x327
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff810b9d48>] ? __writepage+0x0/0x25
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff8110758a>] ? writeback_single_inode+0xe7/0x2da
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff81108290>] ? writeback_inodes_wb+0x424/0x4ff
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff81108497>] ? wb_writeback+0x12c/0x1ab
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff8110870d>] ? wb_do_writeback+0x14f/0x165
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff81108754>] ? bdi_writeback_task+0x31/0xaa
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff810c8664>] ? bdi_start_fn+0x0/0xd0
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff810c86d4>] ? bdi_start_fn+0x70/0xd0
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff810c8664>] ? bdi_start_fn+0x0/0xd0
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff81064ac1>] ? kthread+0x79/0x81
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff81011baa>] ? child_rip+0xa/0x20
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff81064a48>] ? kthread+0x0/0x81
Dec 11 01:36:11 srv156 kernel: [125983.204522] [<ffffffff81011ba0>] ? child_rip+0x0/0x20
From /var/log/syslog
Dec 10 21:20:29 srv156 kernel: [110625.162930] BUG: Bad page map in process java pte:14fa4f067 pmd:424b54067
Dec 10 21:20:29 srv156 kernel: [110625.162937] page:ffffea000496c148 flags:0200000000000878 count:2 mapcount:-1 mapping:ffff88014f8d7de8 index:2f4
Dec 10 21:20:29 srv156 kernel: [110625.162946] addr:0000000009096000 vm_flags:00100077 anon_vma:ffff880422410d40 mapping:(null) index:9096
Dec 10 21:20:29 srv156 kernel: [110625.162955] Pid: 21356, comm: java Tainted: G B 2.6.32-5-amd64 #1
Dec 10 21:20:29 srv156 kernel: [110625.162961] Call Trace:
Dec 10 21:20:29 srv156 kernel: [110625.162966] [<ffffffff810ca4bf>] ? print_bad_pte+0x232/0x24a
Dec 10 21:20:29 srv156 kernel: [110625.162973] [<ffffffff810cb56f>] ? unmap_vmas+0x62d/0x931
Dec 10 21:20:29 srv156 kernel: [110625.162980] [<ffffffff810cfc74>] ? exit_mmap+0xc4/0x148
Dec 10 21:20:29 srv156 kernel: [110625.162986] [<ffffffff8104bbc1>] ? mmput+0x3c/0xdf
Dec 10 21:20:29 srv156 kernel: [110625.162992] [<ffffffff8104f81e>] ? exit_mm+0x102/0x10d
Dec 10 21:20:29 srv156 kernel: [110625.162998] [<ffffffff81051243>] ? do_exit+0x1f8/0x6c9
Dec 10 21:20:29 srv156 kernel: [110625.163004] [<ffffffff81071abb>] ? futex_wake+0xd6/0xe7
Dec 10 21:20:29 srv156 kernel: [110625.163010] [<ffffffff8105178a>] ? do_group_exit+0x76/0x9d
Dec 10 21:20:29 srv156 kernel: [110625.163016] [<ffffffff8105df9f>] ? get_signal_to_deliver+0x310/0x339
Dec 10 21:20:29 srv156 kernel: [110625.163023] [<ffffffff81010037>] ? do_notify_resume+0x87/0x73f
Dec 10 21:20:29 srv156 kernel: [110625.163029] [<ffffffff810cc664>] ? handle_mm_fault+0x7aa/0x80f
Dec 10 21:20:29 srv156 kernel: [110625.163036] [<ffffffff81073f14>] ? compat_sys_futex+0x10d/0x12b
Dec 10 21:20:29 srv156 kernel: [110625.163043] [<ffffffff812fb546>] ? do_page_fault+0x2e0/0x2fc
Dec 10 21:20:29 srv156 kernel: [110625.163049] [<ffffffff81010e0e>] ? int_signal+0x12/0x17
Dec 10 21:20:29 srv156 kernel: [110625.163114] BUG: Bad page state in process java pfn:14fa0c
Dec 10 21:20:29 srv156 kernel: [110625.163120] page:ffffea000496b2a0 flags:020000000002001c count:0 mapcount:-1 mapping:ffff88039dc0db30 index:11e3
Dec 10 21:20:29 srv156 kernel: [110625.164563] Pid: 21356, comm: java Tainted: G B 2.6.32-5-amd64 #1
Dec 10 21:20:29 srv156 kernel: [110625.164570] Call Trace:
Dec 10 21:20:29 srv156 kernel: [110625.164578] [<ffffffff810b71a9>] ? bad_page+0x116/0x129
Dec 10 21:20:29 srv156 kernel: [110625.164586] [<ffffffff810b7692>] ? free_pages_check+0x38/0x57
Dec 10 21:20:29 srv156 kernel: [110625.164595] [<ffffffff810b89cf>] ? free_hot_cold_page+0x46/0x190
Dec 10 21:20:29 srv156 kernel: [110625.164603] [<ffffffff810b8b82>] ? __pagevec_free+0x69/0x7f
Dec 10 21:20:29 srv156 kernel: [110625.164611] [<ffffffff810bba3f>] ? release_pages+0x137/0x18d
Dec 10 21:20:29 srv156 kernel: [110625.164620] [<ffffffff810d8559>] ? free_pages_and_swap_cache+0x57/0x73
Dec 10 21:20:29 srv156 kernel: [110625.164629] [<ffffffff810cb5ed>] ? unmap_vmas+0x6ab/0x931
Dec 10 21:20:29 srv156 kernel: [110625.164637] [<ffffffff810cfc74>] ? exit_mmap+0xc4/0x148
Dec 10 21:20:29 srv156 kernel: [110625.164644] [<ffffffff8104bbc1>] ? mmput+0x3c/0xdf
Dec 10 21:20:29 srv156 kernel: [110625.164652] [<ffffffff8104f81e>] ? exit_mm+0x102/0x10d
Dec 10 21:20:29 srv156 kernel: [110625.164660] [<ffffffff81051243>] ? do_exit+0x1f8/0x6c9
Dec 10 21:20:29 srv156 kernel: [110625.164667] [<ffffffff81071abb>] ? futex_wake+0xd6/0xe7
Dec 10 21:20:29 srv156 kernel: [110625.164675] [<ffffffff8105178a>] ? do_group_exit+0x76/0x9d
Dec 10 21:20:29 srv156 kernel: [110625.164683] [<ffffffff8105df9f>] ? get_signal_to_deliver+0x310/0x339
Dec 10 21:20:29 srv156 kernel: [110625.164692] [<ffffffff81010037>] ? do_notify_resume+0x87/0x73f
Dec 10 21:20:29 srv156 kernel: [110625.164700] [<ffffffff810cc664>] ? handle_mm_fault+0x7aa/0x80f
The last piece of log, has been recently posted, because I've just found it. It seems Java process do something and began to slowly eat all the resources of the server. I don't know exactly if this could be the root cause.
Im using Debian Squeeze.
uname -a
Linux srv156 2.6.32-5-amd64 #1 SMP Sun Sep 23 11:00:33 UTC 2012 x86_64 GNU/Linux
I really will appreciate your help, i dont know what more to do.