Finding latency issues (stalls) in embedded Linux systems
- by camh
I have an embedded Linux system running on an Atmel AT91SAM9260EK board on which I have two processes running at real-time priority. A manager process periodically "pings" a worker process using POSIX message queues to check the health of the worker process. Usually the round-trip ping takes about 1ms, but very occasionally it takes much longer - about 800ms. There are no other processes that run at a higher priority.
It appears the stall may be related to logging (syslog). If I stop logging the problem seems to go away. However it makes no difference if the log file is on JFFS2 or NFS. No other processes are writing to the "disk" - just syslog.
What tools are available to me to help me track down why these stalls are occurring? I am aware of latencytop and will be using that. Are there some other tools that may be more useful?
Some details:
Kernel version: 2.6.32.8
libc (syslog functions): uClibc 0.9.30.1
syslog: busybox 1.15.2