Server hang - data loss on reboot, post mortem analysis

Posted by rovangju on Server Fault
Published on 2011-06-20T19:44:08Z Indexed on 2011/06/25 0:24 UTC

Filed under: linux | debian | disk | freezing | data-loss

A development server I'm responsible for (ext3 on RAID 5 with Debian Squeeze) froze over the weekend and I was forced to reset it: it was unresponsive from KVM and physical keyboard access, no eth devices were responding, etc. Not even the backup process ran (figures, the one time I don't check for confirmation).

After the reset, it turns out that every trace of ~~disk IO~~ activity that should have happened over a period of about 24 hours is completely gone. The log files have a big gap in the dates and times. It is as if the writes were never committed to disk; no processes seem to have run.
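To pin down exactly where such a gap starts and ends, a small script can scan a log file for the widest jump between consecutive timestamps. This is only a minimal sketch, assuming classic syslog-style lines that begin with a `Mon DD HH:MM:SS` stamp; the function names `parse_syslog_time` and `largest_gap` are hypothetical, and the year has to be passed in because that stamp format omits it.

```python
from datetime import datetime, timedelta


def parse_syslog_time(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog-style line.

    Returns None when the line does not start with such a stamp.
    The year is an assumption supplied by the caller.
    """
    try:
        # The stamp is the first 15 characters, e.g. "Jun 18 03:12:45".
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def largest_gap(lines, year):
    """Return (gap, before, after) for the widest jump between stamps.

    Lines without a parsable stamp are skipped. If fewer than two
    stamped lines exist, the result is (timedelta(0), None, None).
    """
    best = (timedelta(0), None, None)
    prev = None
    for line in lines:
        ts = parse_syslog_time(line, year)
        if ts is None:
            continue
        if prev is not None and ts - prev > best[0]:
            best = (ts - prev, prev, ts)
        prev = ts
    return best
```

Passing an open file such as `/var/log/syslog` as `lines` with `year=2011` would return the length of the widest gap together with the last timestamp before it and the first one after it. Comparing those two times across several log files would show whether every process stopped writing at the same moment.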

Luckily it was a weekend, so nothing of value would have been lost, and I don't suspect a hack.

What can I do in a post mortem of this event to prevent it from ever happening again? I've seen this happen before on a completely different machine running FreeBSD.

I am rounding up the disk checking tools right now - but there must be more going on!

Mount options: /dev/sda1 on / type ext3 (rw,errors=remount-ro)
Kernel: Linux dev 2.6.32-5-686-bigmem
Disk/Inodes: 13%/3%

© Server Fault or respective owner
