A very interesting MySQL problem (related to indexing, millions of records, and algorithms)
- by terence410
This problem is pretty hard to describe, and therefore difficult to search for an answer to. I hope some experts could share their opinions on it.
I have a table with around 1 million records. The table structure is similar to this:
CREATE TABLE items (
    uid BIGINT(15) NOT NULL PRIMARY KEY,
    updated INT(11),
    enabled TINYINT(1),
    KEY (updated),
    KEY (enabled)
);
The scenario is like this: I have to select all of the records every day and do some processing. It takes around 3 seconds to handle each item.
I have written a PHP script to fetch 200 items at a time using the following query:
select * from items where updated < unix_timestamp(now()) - 86400 and enabled = 1 limit 200;
I then update the "updated" field of the selected items to make sure they won't be selected again within one day. The update query is something like this:
update items set updated = unix_timestamp(now()) where uid in (1,2,3,4,...);
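The cycle those two queries implement can be modeled in a few lines. Here is a minimal in-memory Python sketch of the fetch-and-mark logic (the data and structure are purely illustrative; the real table lives in MySQL and the real script is PHP):

```python
import time

DAY = 86400  # one day in seconds

# Tiny in-memory stand-in for the `items` table: uid -> row.
items = {
    1: {"updated": 0, "enabled": 1},
    2: {"updated": 0, "enabled": 1},
    3: {"updated": 0, "enabled": 0},  # disabled: must never be selected
}

def fetch_batch(now, limit=200):
    """Models: SELECT ... WHERE updated < now - 86400 AND enabled = 1 LIMIT 200."""
    return [uid for uid, row in items.items()
            if row["updated"] < now - DAY and row["enabled"] == 1][:limit]

def mark_updated(uids, now):
    """Models: UPDATE items SET updated = now WHERE uid IN (...)."""
    for uid in uids:
        items[uid]["updated"] = now

now = int(time.time())
batch = fetch_batch(now)   # picks up uids 1 and 2, skips the disabled row
mark_updated(batch, now)   # after this, the same uids are not selected
                           # again until a day has passed
```

Each 10-second run of the script performs exactly this fetch-then-mark pair before moving on to the slow per-item processing.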
The PHP script then continues running and processes the data, which doesn't require any MySQL connection anymore.
Since I have a million records and each record takes 3 seconds to process, it's definitely impossible to do it sequentially. Therefore, I execute the PHP script every 10 seconds.
However, as time goes by and the table grows, the select gets much slower. Sometimes it takes more than 100 seconds to run!
Do you guys have any suggestions on how I might solve this problem?