MySQL: Working With 192 Trillion Records... (Yes, 192 Trillion)
Posted by Sarah on Server Fault, 2010-08-08.
Here's the question...
Given 192 trillion records, what should my considerations be?
My main concern is speed.
Here's the table...
CREATE TABLE `ref` (
-- BIGINT here: an INT key tops out around 4.3 billion values, far short of 192 trillion rows
`id` BIGINT(13) NOT NULL AUTO_INCREMENT,
`rel_id` INTEGER(13) NOT NULL,
`p1` INTEGER(13) NOT NULL,
`p2` INTEGER(13) DEFAULT NULL,
`p3` INTEGER(13) DEFAULT NULL,
`s` INTEGER(13) NOT NULL,
`p4` INTEGER(13) DEFAULT NULL,
`p5` INTEGER(13) DEFAULT NULL,
`p6` INTEGER(13) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY (`s`),
KEY (`rel_id`),
KEY (`p3`),
KEY (`p4`)
);
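Some back-of-envelope math on raw size (my own arithmetic, assuming 8 bytes for the BIGINT id, 4 bytes per INT column, and ignoring row overhead):

row width ≈ 8 + 8 × 4 = 40 bytes
192 × 10^12 rows × 40 bytes ≈ 7.7 × 10^15 bytes ≈ 7.7 PB

...and that's the data alone, before the four secondary keys.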
Here are the queries...
SELECT id, s FROM ref WHERE rel_id="$rel_id" AND p3="$p3" AND p4="$p4"
SELECT rel_id, p1, p2, p3, p4, p5, p6 FROM ref WHERE id="$id"
INSERT INTO ref (rel_id, p1, p2, p3, s, p4, p5, p6)
VALUES ("$rel_id", "$p1", "$p2", "$p3", "$s", "$p4", "$p5", "$p6")
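One assumption on my part (not tested): since the first SELECT filters on rel_id, p3, and p4 together, a composite key might serve it better than the separate single-column keys. A sketch, with an index name of my choosing:

ALTER TABLE ref ADD KEY idx_rel_p3_p4 (rel_id, p3, p4);

If the table ends up on InnoDB, appending s to that key would also make it covering for the first SELECT, since InnoDB secondary indexes already carry the primary key (id).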
Here are some notes...
- The SELECTs will be done much more frequently than the INSERTs. However, occasionally I want to add a few hundred records at a time (see the batch-insert sketch after this list).
- Load-wise, there will be nothing for hours, then maybe a few thousand queries all at once.
- I don't think I can normalize any further (the p values are needed in combination).
- The database as a whole is very relational.
- This will be the largest table by far (the next largest is about 900k rows).
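For those occasional batches, a single multi-row INSERT is what I have in mind (a sketch with placeholder values):

INSERT INTO ref (rel_id, p1, p2, p3, s, p4, p5, p6) VALUES
(1, 2, 3, 4, 5, 6, 7, 8),
(9, 10, 11, 12, 13, 14, 15, 16);

One statement per batch of a few hundred rows keeps the per-statement overhead down.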
UPDATE (08/11/2010)
Interestingly, I've been given a second option...
Instead of 192 trillion records I could store 2.6 × 10^16 (26 quadrillion, i.e. 26 followed by 15 zeros)...
But in this second option I would only need to store one BIGINT(18) as the index in a table. That's it: just the one column. So I would just be checking for the existence of a value, occasionally adding records, never deleting them.
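To make that concrete, here is all the table would be (the name ref2 and the literal value are mine; 2.6 × 10^16 fits comfortably in a signed BIGINT):

CREATE TABLE `ref2` (
`id` BIGINT(18) NOT NULL,
PRIMARY KEY (`id`)
);

SELECT 1 FROM ref2 WHERE id = 26000000000000000 LIMIT 1;

The SELECT returns a row if and only if the value is present.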
So that makes me think there must be a better solution than MySQL for simply storing numbers...
Given this second option, should I take it or stick with the first?
[edit] Just got news of some testing that's been done: with 100 million rows, this setup returns the query in 0.0004 seconds. [/edit]