A very interesting MySQL problem (related to indexing, million records, algorithm)
Posted by terence410 on Stack Overflow on 2010-05-03.
This problem is pretty hard to describe, and therefore difficult to search for an answer to. I hope some experts can share their opinions on it.
I have a table with around 1 million records. The table structure is similar to this:

items {
    uid     (primary key, bigint, 15)
    updated (indexed, int, 11)
    enabled (indexed, tinyint, 1)
}
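In actual SQL the definition is roughly the following (the exact column options and the index names are my approximation):

CREATE TABLE items (
    uid     BIGINT(15)  NOT NULL,
    updated INT(11)     NOT NULL,
    enabled TINYINT(1)  NOT NULL,
    PRIMARY KEY (uid),
    KEY idx_updated (updated),
    KEY idx_enabled (enabled)
);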
The scenario is like this: I have to select all of the records every day and do some processing, and it takes around 3 seconds to handle each item.
I have written a PHP script that fetches 200 items at a time, i.e. items whose "updated" timestamp is more than one day old, using the following query:

select * from items where updated < unix_timestamp(now()) - 86400 and enabled = 1 limit 200;
I then update the "updated" field of the selected items to make sure they won't be selected again within one day. The update query is something like this:
update items set updated = unix_timestamp(now()) where uid in (1,2,3,4,...);
The PHP script then continues to run and processes the data, which doesn't require any MySQL connection anymore.
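Putting the pieces together, one cycle of the script looks roughly like this (the connection details and the process_item() function are placeholders for illustration, not my real code):

<?php
// One batch per run: fetch up to 200 stale items, mark them as
// handled, then do the slow per-item work with no further queries.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$rows = $pdo->query(
    'SELECT * FROM items
     WHERE updated < UNIX_TIMESTAMP(NOW()) - 86400
       AND enabled = 1
     LIMIT 200'
)->fetchAll(PDO::FETCH_ASSOC);

if ($rows) {
    // Stamp the batch first so the next run (10 seconds later)
    // does not pick up the same rows again.
    $ids = array_column($rows, 'uid');
    $in  = implode(',', array_fill(0, count($ids), '?'));
    $stmt = $pdo->prepare("UPDATE items SET updated = UNIX_TIMESTAMP(NOW()) WHERE uid IN ($in)");
    $stmt->execute($ids);

    // The expensive part: around 3 seconds per item, no MySQL needed.
    foreach ($rows as $row) {
        process_item($row); // placeholder for the actual processing
    }
}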
Since I have a million records and each record takes 3 seconds to process, it's definitely impossible to do it sequentially, so I execute the PHP script every 10 seconds.
However, as time goes by and the table grows, the select gets much slower. Sometimes it takes more than 100 seconds to run!
Do you guys have any suggestions on how I might solve this problem?