MySQL query optimization - distinct, order by and limit
- by Manuel Darveau
I am trying to optimize the following query:
select distinct this_.id as y0_
from Rental this_
left outer join RentalRequest rentalrequ1_ on this_.id=rentalrequ1_.rental_id
left outer join RentalSegment rentalsegm2_ on rentalrequ1_.id=rentalsegm2_.rentalRequest_id
where
this_.DTYPE='B'
and this_.id<=1848978
and this_.billingStatus=1
and rentalsegm2_.endDate between 1273631699529 and 1274927699529
order by rentalsegm2_.id asc
limit 0, 100;
This query is done multiple time in a row for paginated processing of records (with a different limit each time). It returns the ids I need in the processing. My problem is that this query take more than 3 seconds. I have about 2 million rows in each of the three tables.
Explain gives:
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+----------------------------------------------+
| 1 | SIMPLE | rentalsegm2_ | range | index_endDate,fk_rentalRequest_id_BikeRentalSegment | index_endDate | 9 | NULL | 449904 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | rentalrequ1_ | eq_ref | PRIMARY,fk_rental_id_BikeRentalRequest | PRIMARY | 8 | solscsm_main.rentalsegm2_.rentalRequest_id | 1 | Using where |
| 1 | SIMPLE | this_ | eq_ref | PRIMARY,index_billingStatus | PRIMARY | 8 | solscsm_main.rentalrequ1_.rental_id | 1 | Using where |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+----------------------------------------------+
I tried to remove the distinct and the query ran three times faster. explain without the query gives:
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+-----------------------------+
| 1 | SIMPLE | rentalsegm2_ | range | index_endDate,fk_rentalRequest_id_BikeRentalSegment | index_endDate | 9 | NULL | 451972 | Using where; Using filesort |
| 1 | SIMPLE | rentalrequ1_ | eq_ref | PRIMARY,fk_rental_id_BikeRentalRequest | PRIMARY | 8 | solscsm_main.rentalsegm2_.rentalRequest_id | 1 | Using where |
| 1 | SIMPLE | this_ | eq_ref | PRIMARY,index_billingStatus | PRIMARY | 8 | solscsm_main.rentalrequ1_.rental_id | 1 | Using where |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+-----------------------------+
As you can see, the Using temporary is added when using distinct.
I already have an index on all fields used in the where clause.
Is there anything I can do to optimize this query?
Thank you very much!