Setup: ASP.NET 4.0 website on IIS 6.0 on Win 2003 64 bit, 8xCPUs, 16GB memory, separate SQL 2005 DB server.
Had a serious slowdown today with any otherwise fairly well performing ASP.NET site. For a period of a couple of hours all page requests were taking a very long time to be served - e.g. 30-60s compared to usual 2s. The w3wp.exe's CPU and memory usage on the webserver was not much higher than normal. The application pool was not in the middle of recycling (and it hadn't recycled for several hours). Bottlenecks in the database were ruled out - no blocks occurring and query results were being returned quickly. I couldn't make any sense of it and set up the following Perfmon counters:
Current Anonymous Users (for site in question)
Get requests/sec (ditto)
Requests/sec for the ASP.NET application running the site
Get requests/sec was averaging 100-150. Requests/sec for ASP.NET was averaging 5-10. However Current Anonymous Users was around 200. And then as I was watching, the Current Anonymous Users began to climb steeply going up to about 500 within a few minutes. All this time Get requests/sec & Requests/sec for ASP.NET was if anything going down.
I did a whole load of things (in a panic!) to try to get the site working, like shutting it down, recycling the app pool, and adding another worker process to the pool. I also extended the expiration time for content (in IIS under HTTP Headers) in an attempt to lower the number of requests for static files (there are a lot of images on the site).
The site is now back to normal, and the counters are fairly steady and reading (added Current Connections counter):
Current Anonymous Users : average 30
Get requests/sec : average 100
Requests/sec for ASP.NET : 5
Current Connections : average 300
I have also observed an inverse relationship between Get requests/sec & Current Anonymous Users. Usually both are fairly steady but there will be short periods when Get requests/sec will go down dramatically and Current Anonymous Users will go up in a perfect mirror image. Then they will flip back to their usual levels.
So, my questions are:
Thinking of the original performance issue - if w3wp.exe CPU, memory usage were normal and there was no DB bottleneck, what could explain page requests taking 20 times longer to be served than usual? What other counters should I be looking at if this happens again?
What explains the inverse relationship between Get requests/sec & Current Anonymous Users?
What could explain Current Anonymous Users going from 200 to 500 within a few minutes?
Many thanks for any insight into this.