Hang while starting several daemons [solved]
- by Adrian Lang
I’m running a Debian Squeeze AMD64 server. Target runlevel after boot is runlevel 2, which includes rsyslogd, cron, sshd and some other stuff, but not dovecot, postfix, apache2, etc. The system fails to reach runlevel 2 with several symptoms:
The system hangs at trying to start rsyslogd
Booting into runlevel 1 works, then login from the console works
Starting rsyslogd from runlevel 1 via /etc/init.d/rsyslog hangs
Starting runlevel 2 with rsyslogd disabled works
But then, logging in via console fails: I get the motd, and then nothing
Starting sshd from runlevel 1 succeeds
But then, I cannot log in via ssh. Sometimes password ssh login gives me the motd and then nothing, sometimes not even this. Trying to offer a public key seems to annoy the sshd enough to not talk to me any further.
When rebooting from runlevel 1, the server hangs while trying to stop apache2 (which is not running, so this really should be trivial). Trying to stop apache2 while logged in in runlevel 1 hangs as well.
And that’s just the stuff which fails all the time. RAM has been tested, dmesg shows no problems. I have no clue.
Update: (shortened) output from rsyslogd -c4 -d called in runlevel 1
rsyslogd 4.6.4 startup, compatibility mode 4, module path ''
caller requested object 'net', not found (iRet -3003)
Requested to load module 'lmnet'
loading module '/usr/lib/rsyslog/lmnet.so'
module of type 2 being loaded
conf.c requested ref for 'lmnet', refcount 1
rsyslog runtime initialized, version 4.6.4, current users 1
syslogd.c requested ref for 'lmnet', refcount now 2
I can then kill rsyslogd with Ctrl+C. /var/log shows none of the configured log files, though.
Update2: Thanks to @DerfK I still have no clue, but at least I narrowed down the problem. I’m now testing with /etc/init.d/apache2 stop (without an apache2 running, of course) which hangs as well and looks like an even more obvious failure.
After some testing I found out that a file with one single line:
/usr/sbin/apache2ctl configtest > /dev/null 2>&1
hangs, while the same line executed in an interactive shell works. I was not able to reduce this line any further, i.e. every single part (the stream redirections and the command itself) is necessary to reproduce the hang. @DerfK also pointed me to strace, which gave a shallow hint about what kind of hang we have here:
For the init scripts: wait4(-1
For the rsyslogd / apache2 binaries called by the init scripts: futex(0xsomepointer, FUTEX_WAIT_PRIVATE, 2, NULL
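For anyone chasing a similar hang, here is a rough sketch of how the test can be made repeatable without blocking the console. The helper name looks_hung and the 30-second limit are my own assumptions, not something from the original debugging session; it relies on coreutils timeout, which exits with status 124 when it had to kill the command.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the command is still running
# after LIMIT seconds, i.e. it looks hung. coreutils `timeout` exits
# with status 124 when the time limit was hit.
looks_hung() {
    limit=$1
    shift
    # Same redirections as in the failing one-line script.
    timeout "$limit" "$@" > /dev/null 2>&1
    [ "$?" -eq 124 ]
}

# Example use against the init script from the post (left commented out,
# since it needs root and the affected machine):
#   looks_hung 30 /etc/init.d/apache2 stop && echo "apache2 stop hangs"
#
# Recording the system calls of the script and its children for later
# reading (-f follows forks, -o writes the trace to a file):
#   strace -f -o /tmp/apache2-stop.trace /etc/init.d/apache2 stop
```

The last lines of such a trace file show which call each process was blocked in when it was killed.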
The system was installed as a Debian Lenny by my hoster in autumn 2011. I upgraded it to Squeeze immediately and kept it up to date with Squeeze, which was still testing at the time. There were no big changes, though. I guess I never tried to reboot the system before.
Update3: I found the problem. My /etc/nsswitch.conf specified ldap as a backup source for hosts lookups, and LDAP is not available at that point of the boot. Relying solely on dns fixes my boot problems.
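A quick way to check for this on another machine might look like the sketch below. The function names hosts_sources and hosts_uses_ldap are made up for illustration, and the exact contents of my original hosts line are not reproduced here; the script only reads the hosts: entry of an nsswitch.conf-style file and reports whether ldap is among the listed sources.

```shell
#!/bin/sh
# Print the lookup sources on the "hosts:" line of an nsswitch.conf-style
# file (default: /etc/nsswitch.conf), with trailing comments stripped.
hosts_sources() {
    conf=${1:-/etc/nsswitch.conf}
    sed -n 's/#.*//; s/^hosts:[[:space:]]*//p' "$conf"
}

# Succeed (exit 0) if "ldap" is one of those sources.
hosts_uses_ldap() {
    hosts_sources "$1" | grep -qw ldap
}

# Example use:
#   hosts_sources
#   hosts_uses_ldap && echo "hosts lookups may wait on LDAP"
```

If ldap shows up there and the LDAP server or the network is not yet reachable during boot, anything that resolves a host name at startup is a candidate for the same kind of hang.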