I have a query that's used in a reporting system of ours that sometimes runs quicker than a second, and other times takes 1 to 10 minutes to run.
Here's the entry from the slow query log:
# Query_time: 543 Lock_time: 0 Rows_sent: 0 Rows_examined: 124948974
use statsdb;
SELECT count(distinct Visits.visitorid) as 'uniques'
FROM Visits,Visitors
WHERE Visits.visitorid=Visitors.visitorid
and candidateid in (32)
and visittime>=1275721200 and visittime<=1275807599
and (omit=0 or omit>=1275807599)
AND Visitors.segmentid=9
AND Visits.visitorid NOT IN
(SELECT Visits.visitorid
FROM Visits,Visitors
WHERE Visits.visitorid=Visitors.visitorid
and candidateid in (32)
and visittime<1275721200
and (omit=0 or omit>=1275807599)
AND Visitors.segmentid=9);
It's basically counting unique visitors, and it's doing that by counting the visitors for today and then substracting those that have been here before. If you know of a better way to do this, let me know.
I just don't understand why sometimes it can be so quick, and other times takes so long - even with the same exact query under the same server load.
Here's the EXPLAIN on this query. As you can see it's using the indexes I've set up:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY Visits range visittime_visitorid,visitorid visittime_visitorid 4 NULL 82500 Using where; Using index
1 PRIMARY Visitors eq_ref PRIMARY,cand_visitor_omit PRIMARY 8 statsdb.Visits.visitorid 1 Using where
2 DEPENDENT SUBQUERY Visits ref visittime_visitorid,visitorid visitorid 8 func 1 Using where
2 DEPENDENT SUBQUERY Visitors eq_ref PRIMARY,cand_visitor_omit PRIMARY 8 statsdb.Visits.visitorid 1 Using where
I tried to optimize the query a few weeks ago and came up with a variation that consistently took about 2 seconds, but in practice it ended up taking more time since 90% of the time the old query returned much quicker. Two seconds per query is too long because we are calling the query up to 50 times per page load, with different time periods.
Could the quick behavior be due to the query being saved in the query cache? I tried running 'RESET QUERY CACHE' and 'FLUSH TABLES' between my benchmark tests and I was still getting quick results most of the time.
Note: last night while running the query I got an error: Unable to save result set. My initial research shows that may be due to a corrupt table that needs repair. Could this be the reason for the behavior I'm seeing?
In case you want server info:
Accessing via PHP 4.4.4 MySQL 4.1.22
All tables are InnoDB
We run optimize table on all tables weekly
The sum of both the tables used in the query is 500 MB
MySQL config:
key_buffer = 350M
max_allowed_packet = 16M
thread_stack = 128K
sort_buffer = 14M
read_buffer = 1M
bulk_insert_buffer_size = 400M
set-variable = max_connections=150
query_cache_limit = 1048576
query_cache_size = 50777216
query_cache_type = 1
tmp_table_size = 203554432
table_cache = 120
thread_cache_size = 4
wait_timeout = 28800
skip-external-locking
innodb_file_per_table
innodb_buffer_pool_size = 3512M
innodb_log_file_size=100M
innodb_log_buffer_size=4M