How to find out what is causing a slow down of the application on this server?
Posted
by
Jan P.
on Server Fault
See other posts from Server Fault
or by Jan P.
Published on 2012-04-14T20:17:04Z
Indexed on
2012/04/14
23:33 UTC
Read the original article
Hit count: 281
This is not the typical serverfault question, but I'm out of ideas and don't know where else to go. If there are better places to ask this, just point me there in the comments. Thanks.
Situation
We have this web application that uses Zend Framework, so runs in PHP on an Apache web server. We use MySQL for data storage and memcached for object caching.
The application has a very unique usage and load pattern. It is a mobile web application where every full hour a cronjob looks through the database for users that have some information waiting or action to do and sends this information to a (external) notification server, that pushes these notifications to them. After the users get these notifications, the go to the app and use it, mostly for a very short time. An hour later, same thing happens.
Problem
In the last few weeks usage of the application really started to grow. In the last few days we encountered very high load and doubling of application response times during and after the sending of these notifications (so basically every hour). The server doesn't crash or stop responding to requests, it just gets slower and slower and often takes 20 minutes to recover - until the same thing starts again at the full hour.
We have extensive monitoring in place (New Relic, collectd) but I can't figure out what's wrong; I can't find the bottlekneck. That's where you come in:
Can you help me figure out what's wrong and maybe how to fix it?
Additional information
The server is a 16 core Intel Xeon (8 cores with hyperthreading, I think) and 12GB RAM running Ubuntu 10.04 (Linux 3.2.4-20120307 x86_64). Apache is 2.2.x and PHP is Version 5.3.2-1ubuntu4.11.
If any configuration information would help analyze the problem, just comment and I will add it.
Graphs
info
collectd
New Relic
(Sorry the graphs are gifs and not the same time period, but I think the most important info is in there)
© Server Fault or respective owner