Websites down, EC2 inaccessible via SSH, CPU utilisation at 100% for the last few hours - what should I do?
- by fuzzybee
I have multiple websites hosted on a single EC2 instance.
One website, "abc", was down for a few hours; sometimes it threw a database connection error and sometimes it just took too long to respond.
One website, "def", was incredibly slow but still up and running.
The rest of the websites had the same symptoms as "abc".
I can afford 15 minutes or less of downtime for "def".
Should I then (in the AWS console):
reboot my instance
or
create an AMI from my instance, launch it, and associate my Elastic IP with the new instance
or
"launch more like this"
Background on what may have happened to my EC2 instance
The last time I made changes was 21 hours ago.
A cron job to create snapshots started around 19 hours ago and has been running for a long time.
Google Analytics shows that traffic to my websites, such as kidlander.sg, has been nothing exceptional.
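Since traffic looks normal, the likely suspect is a process on the box itself, such as the long-running snapshot job. Once the instance is reachable again, a small script along these lines could list the heaviest CPU consumers. This is only a sketch: it assumes a Linux `ps` that accepts `-A -o pid,pcpu,args`, and the function names `parse_ps` and `top_cpu_processes` are made up for illustration.

```python
import subprocess


def parse_ps(output, limit=5):
    """Parse `ps -A -o pid,pcpu,args` text into (pid, cpu_percent, command) tuples."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)  # pid, %cpu, then the full command line
        if len(parts) < 3:
            continue  # ignore malformed or empty lines
        pid, cpu, command = parts
        rows.append((int(pid), float(cpu), command.strip()))
    # Highest CPU usage first
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


def top_cpu_processes(limit=5):
    """Run ps on the local machine and return the top CPU consumers."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid,pcpu,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout, limit)
```

Calling `top_cpu_processes()` on the web server should show whether the cron job, the web server workers, or something else is holding the CPU at 100%.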
Are there any other actions I should take, or better options I could have?
(I have already contacted AWS support but their turnaround is 12 hours so I appreciate all the help I could get)
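For reference, the second option above (AMI, new instance, Elastic IP) could also be scripted instead of clicked through in the console. The following is only a rough sketch under several assumptions: it expects a boto3 EC2 client to be passed in (boto3 is a third-party package and needs configured credentials), the function name and all IDs are placeholders, and a real launch would almost certainly also need a key pair, security groups, and a subnet.

```python
def replace_instance_from_ami(ec2, instance_id, allocation_id, instance_type, image_name):
    """Image the struggling instance, launch a copy, and move the Elastic IP to it.

    `ec2` is assumed to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # NoReboot=True keeps the original instance running while the image is
    # taken, at the cost of a possibly inconsistent filesystem snapshot.
    image_id = ec2.create_image(
        InstanceId=instance_id, Name=image_name, NoReboot=True
    )["ImageId"]
    ec2.get_waiter("image_available").wait(ImageIds=[image_id])

    # Assumption: default networking is acceptable. In practice KeyName,
    # SecurityGroupIds and SubnetId would likely be passed here too.
    new_instance_id = ec2.run_instances(
        ImageId=image_id, InstanceType=instance_type, MinCount=1, MaxCount=1
    )["Instances"][0]["InstanceId"]
    ec2.get_waiter("instance_running").wait(InstanceIds=[new_instance_id])

    # Move the Elastic IP only once the replacement is running, so the old
    # instance keeps serving "def" until the last moment.
    ec2.associate_address(
        InstanceId=new_instance_id,
        AllocationId=allocation_id,
        AllowReassociation=True,
    )
    return new_instance_id


# Hypothetical usage (placeholder IDs, requires boto3 and AWS credentials):
#   import boto3
#   replace_instance_from_ami(boto3.client("ec2"), "i-0123456789abcdef0",
#                             "eipalloc-0123456789abcdef0", "t3.medium", "web-rescue")
```

The Elastic IP is reassociated last on purpose: given the 15-minute downtime budget for "def", the slow original keeps answering until the copy is confirmed running.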
Update
I got everything back up and running and CPU utilisation back to normal, around 30%.
There is one difference between "def" and "abc" (as well as my other websites):
"def"'s database is hosted on RDS
"abc"'s database is hosted on an EC2 instance (different from my web server instance) configured by myself
Nevertheless, I checked the EC2 instance I'm using as a MySQL server yesterday and it was absolutely fine during the incident:
low CPU utilisation
I could log in via the Linux command line