MySQL : table organisation for very large sets with high update frequency
- by Remiz
I'm facing a dilemma in the choice of my MySQL schema application. So before I start here is a picture extremely simplified of my database :
Schema here : http://i43.tinypic.com/2wp5lxz.png
In one sentence : for each customer, the application harvest text data and attached tags to each data collected.
As approximation of the usage of each table, here is what I expect :
customer : ~5000, shouldn't grow fast
data : 5 millions per customer, could double or triple for big customers.
tag : ~1000, quite fixed size
data_tag : hundred of millions per customer easily. Each data can be tagged a lot.
The harvesting process is permanent, that means that around every 15 minutes new data come and are tagged, that require a very constant index refreshing.
A lot of my queries are a SELECT COUNT of DATA between specific DATES and tagged with a specific TAG on a specific CUSTOMER (very rarely it will involve several customers).
Here is the situation, you can imagine with this kind of volume of data I'm facing a challenge in term of data organization and indexing. Again, it's a very minimalistic and simplified version of my structure. My question is, is it better:
to stick with this model and to manage crazy index optimization ? (which involves potentially having billions of rows in the data_tag table)
change the schema and use one data table and one data_tag table per customer ? (which involves having 5000 tables on my database)
I'm running all of this on a MySQL 5.0 dedicated server (quad-core, 8Go of ram) replicated. I only use InnoDB, I also have another server that run Sphinx. So knowing all of this, I can't wait to hear your opinion about this.
Thanks.