Postgres 9.0 locking up, 100% CPU
- by Jake
We are having a problem where our Postgres 9.0 server occasionally locks up and kills our webapp. Restarting Postgres fixes the problem.
Here's what I've been able to observe:
First, usage of one CPU jumps to 100% for a few minutes
  Disk operations drop to ~0 during this time
  Database operations drop to 0 (blocks and tuples per second)
  Logs show during this time:
    WARNING: worker took too long to start; cancelled
    WARNING: worker took too long to start; cancelled
  No queries appear in the logs (only queries taking over 200 ms are logged)
  No unusually long-running queries are logged before or during this time
Then the second CPU jumps to 100%
  The number of postgres processes jumps from the usual 8-10 to ~20
  This is matched by a spike in Postgres blocks per second (about twice normal)
  Logs show:
    LOG: could not accept SSL connection: EOF detected
  Queries are running, but slowly
Restarting Postgres returns everything to normal
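To compare the two phases, the log messages above can be tallied by severity and text. This is only a minimal Python sketch: the names `LOG_RE` and `tally_messages` are made up for illustration, and it assumes each log line contains a severity followed by a colon, with any prefix (timestamp, pid) depending on the server's `log_line_prefix` setting.

```python
import re
from collections import Counter

# Assumption: each log line contains "<SEVERITY>: <message>", possibly after a
# prefix (timestamp, pid, ...) whose shape depends on log_line_prefix.
LOG_RE = re.compile(
    r"\b(DEBUG\d?|INFO|NOTICE|WARNING|LOG|ERROR|FATAL|PANIC):\s+(.*)$"
)


def tally_messages(lines):
    """Count log lines grouped by (severity, message text)."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line.rstrip("\n"))
        if match:
            counts[(match.group(1), match.group(2).strip())] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the observations above.
    sample = [
        "WARNING: worker took too long to start; cancelled",
        "WARNING: worker took too long to start; cancelled",
        "LOG: could not accept SSL connection: EOF detected",
    ]
    for (severity, message), count in tally_messages(sample).most_common():
        print(count, severity, message)
```

Feeding it the log lines from the window around a lockup, instead of the sample list, would show which message dominates each phase.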
Setup:
Server: Amazon EC2 Large
Ubuntu 10.04.2 LTS
Postgres 9.0.3
Dedicated DB server
Does anyone have any idea what's causing this? Or any suggestions about what else I should be checking out?