Optimize slow ranking query

Posted by Juan Pablo Califano on Stack Overflow See other posts from Stack Overflow or by Juan Pablo Califano
Published on 2010-05-07T13:20:14Z Indexed on 2010/05/07 13:48 UTC
Read the original article Hit count: 239

Filed under:
|
|
|

I need to optimize a query for a ranking that is taking forever (the query itself works, but I know it's awful and I've just tried it with a good number of records and it gives a timeout).

I'll briefly explain the model. I have 3 tables: player, team and player_team. I have players, that can belong to a team. Obvious as it sounds, players are stored in the player table and teams in team. In my app, each player can switch teams at any time, and a log has to be mantained. However, a player is considered to belong to only one team at a given time. The current team of a player is the last one he's joined.

The structure of player and team is not relevant, I think. I have an id column PK in each. In player_team I have:

id          (PK)
player_id   (FK -> player.id)
team_id     (FK -> team.id)

Now, each team is assigned a point for each player that has joined. So, now, I want to get a ranking of the first N teams with the biggest number of players.

My first idea was to get first the current players from player_team (that is one record top for each player; this record must be the player's current team). I failed to find a simple way to do it (tried GROUP BY player_team.player_id HAVING player_team.id = MAX(player_team.id), but that didn't cut it.

I tried a number of querys that didn't work, but managed to get this working.

SELECT 
    COUNT(*) AS total,
    pt.team_id,
    p.facebook_uid AS owner_uid, 
    t.color 
FROM 
    player_team pt 
JOIN player p ON (p.id = pt.player_id)  
JOIN team t ON (t.id = pt.team_id) 
WHERE 
    pt.id IN (
        SELECT max(J.id) 
        FROM player_team J 
        GROUP BY J.player_id
    )  

GROUP BY 
    pt.team_id 
ORDER BY 
    total DESC 
LIMIT 50            

As I said, it works but looks very bad and performs worse, so I'm sure there must be a better way to go. Anyone has any ideas for optimizing this?

I'm using mysql, by the way.

Thanks in advance

Adding the explain. (Sorry, not sure how to format it properly)

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   PRIMARY     t   ALL     PRIMARY     NULL    NULL    NULL    5000    Using temporary; Using filesort
1   PRIMARY     pt  ref     FKplayer_pt77082,FKplayer_pt265938,new_index    FKplayer_pt77082    4   t.id    30  Using where
1   PRIMARY     p   eq_ref  PRIMARY     PRIMARY     4   pt.player_id    1
2   DEPENDENT SUBQUERY  J   index   NULL    new_index   8   NULL    150000  Using index

© Stack Overflow or respective owner

Related posts about mysql

Related posts about subquery