Scenario: All of a sudden, my computer feels sluggish. Mouse moves but windows take ages to open, etc. uptime says the load is 7.69 and raising.
What is the fastest way to find out which process(es) are the cause of the load?
Now, "top" and similar tools isn't the answer because they either show CPU or memory usage but not both at the same time. What I need is the single command which I might be able to type as it happens - something that will figure out any of
System is trying to swap 8GB of RAM to disk because process X ...
or
process X seeks all over the disk
or
process X uses 400% CPU"
So what I'm looking for is iostat, htop/atop and similar tools run into one with an output like this:
1235 cp - Disk trashing
87 chrome - Uses 2 GB of RAM
137 nfs_bench - Uses 95% of the network bandwidth
I don't want a tool that gives me some numbers which I can analyze but a tool that tells me exactly which process causes the current load. Assume that the user in front of the keyboard barely knows how to write "process", but the user is quickly overwhelmed when it comes to "resident size", "virtual memory" or "process life cycle".
My argument goes like this: A user notices a problem. There can be thousands of reasons ... well, almost :-) The user wants to know the source of the problem.
The current solutions give me lots of numbers, and I need to know what these numbers mean. What I'm looking for is a meta tool. 99% of the data is irrelevant to the problem. So what the tool should do is look for processes which hog some resource and list only those along with "this process needs a lot of CPU, this produces many IRQs, this process allocates a lot of RAM (and it's still growing)".
This will be a relatively short list. It will be much more simple for someone new to this to locate the culprit from this list than from the output of, say, htop which gives me about 5000 numbers but requires me to fold multi-threaded processes myself (I have 50 lines which say VIRT 2750M but only 16 GB of RAM - the machine ought to swap itself to death but of course, this is a misinterpretation of the data that can happen quickly).