I'm using the RDS database from amazon with a some very big tables, and yesterday I started to face 100% CPU utilisation on the server and a bunch of slow query logs that were not happening before.
I tried to check the queries that were running and faced this result from the explain command
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | businesses | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | activities_businesses | ref | PRIMARY,index_activities_users_on_business_id,index_tweets_users_on_tweet_id_and_business_id | index_activities_users_on_business_id | 9 | const | 2252 | Using index condition; Using where |
| 1 | SIMPLE | activities_b_taggings_975e9c4 | ref | taggings_idx | taggings_idx | 782 | const,myapp_production.activities_businesses.id,const | 1 | Using index condition; Using where |
| 1 | SIMPLE | activities | eq_ref | PRIMARY,index_activities_on_created_at | PRIMARY | 8 | myapp_production.activities_businesses.activity_id | 1 | Using where |
+----+-------------+-------------------------------+--------+----------------------------------------------------------------------------------------------+---------------------------------------+---------+-----------------------------------------------------------------+------+----------------------------------------------+
Also checkin in the process list, I got something like this:
+----+-----------------+-------------------------------------+----------------------------+---------+------+--------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+-------------------------------------+----------------------------+---------+------+--------------+------------------------------------------------------------------------------------------------------+
| 1 | my_app | my_ip:57152 | my_app_production | Sleep | 0 | | NULL |
| 2 | my_app | my_ip:57153 | my_app_production | Sleep | 2 | | NULL |
| 3 | rdsadmin | localhost:49441 | NULL | Sleep | 9 | | NULL |
| 6 | my_app | my_other_ip:47802 | my_app_production | Sleep | 242 | | NULL |
| 7 | my_app | my_other_ip:47807 | my_app_production | Query | 231 | Sending data | SELECT my_fields... |
| 8 | my_app | my_other_ip:47809 | my_app_production | Query | 231 | Sending data | SELECT my_fields... |
| 9 | my_app | my_other_ip:47810 | my_app_production | Query | 231 | Sending data | SELECT my_fields... |
| 10 | my_app | my_other_ip:47811 | my_app_production | Query | 231 | Sending data | SELECT my_fields... |
| 11 | my_app | my_other_ip:47813 | my_app_production | Query | 231 | Sending data | SELECT my_fields... |
...
So based on the numbers, it looks like there is no reason to have a slow query, since the worst execution plan is the one that goes through 2k rows which is not much.
Edit 1
Another information that might be useful is the slow query_log
SET timestamp=1401457485;
SELECT my_query...
# User@Host: myapp[myapp] @ ip-10-195-55-233.ec2.internal [IP] Id: 435
# Query_time: 95.830497 Lock_time: 0.000178 Rows_sent: 0 Rows_examined: 1129387
Edit 2
After profiling, I got this result. The result have approximately 250 rows with two columns each.
+----------------------+----------+
| state | duration |
+----------------------+----------+
| Sending data | 272 |
| removing tmp table | 0 |
| optimizing | 0 |
| Creating sort index | 0 |
| init | 0 |
| cleaning up | 0 |
| executing | 0 |
| checking permissions | 0 |
| freeing items | 0 |
| Creating tmp table | 0 |
| query end | 0 |
| statistics | 0 |
| end | 0 |
| System lock | 0 |
| Opening tables | 0 |
| logging slow query | 0 |
| Sorting result | 0 |
| starting | 0 |
| closing tables | 0 |
| preparing | 0 |
+----------------------+----------+
Edit 3
Adding query as requested
SELECT activities.share_count,
activities.created_at
FROM `activities_businesses`
INNER JOIN `businesses`
ON `businesses`.`id` = `activities_businesses`.`business_id`
INNER JOIN `activities`
ON `activities`.`id` = `activities_businesses`.`activity_id`
JOIN taggings activities_b_taggings_975e9c4
ON activities_b_taggings_975e9c4.taggable_id = activities_businesses.id
AND activities_b_taggings_975e9c4.taggable_type =
'ActivitiesBusiness'
AND activities_b_taggings_975e9c4.tag_id = 104
AND activities_b_taggings_975e9c4.created_at >=
'2014-04-30 13:36:44'
WHERE ( businesses.id = 1 )
AND ( activities.created_at > '2014-04-30 13:36:44' )
AND ( activities.created_at < '2014-05-30 12:27:03' )
ORDER BY activities.created_at;
Edit 4
There may be a chance that the indexes are not being applied due to difference in column type between the taggings and the activities_businesses, on the taggable_id column.
mysql> SHOW COLUMNS FROM activities_businesses;
+-------------+------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| activity_id | bigint(20) | YES | MUL | NULL | |
| business_id | bigint(20) | YES | MUL | NULL | |
+-------------+------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)
mysql> SHOW COLUMNS FROM taggings;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| tag_id | int(11) | YES | MUL | NULL | |
| taggable_id | bigint(20) | YES | | NULL | |
| taggable_type | varchar(255) | YES | | NULL | |
| tagger_id | int(11) | YES | | NULL | |
| tagger_type | varchar(255) | YES | | NULL | |
| context | varchar(128) | YES | | NULL | |
| created_at | datetime | YES | | NULL | |
+---------------+--------------+------+-----+---------+----------------+
So it is examining way more rows than it shows in the explain query, probably because some indexes are not being applied.
Do you guys can help m with that?