mySQL Efficiency Issue - How to find the right balance of normalization...?
- by Foo
I'm fairly new to working with relational databases, but have read a few books and know the basics of good design.
I'm facing a design decision, and I'm not sure how to continue. Here's a very over simplified version of what I'm building: People can rate photos 1-5, and I need to display the average votes on the picture while keeping track of the individual votes. For example, 12 people voted 1, 7 people voted 2, etc. etc.
The normalization freak of me initially designed the table structure like this:
Table pictures
id* | picture | userID |
Table ratings
id* | pictureID | userID | rating
With all the foreign key constraints and everything set as they shoudl be. Every time someone rates a picture, I just insert a new record into ratings and be done with it.
To find the average rating of a picture, I'd just run something like this:
SELECT AVG(rating) FROM ratings WHERE pictureID = '5' GROUP by pictureID
Having it setup this way lets me run my fancy statistics to. I can easily find who rated a certain picture a 3, and what not.
Now I'm thinking if there's a crapload of ratings (which is very possible in what I'm really designing), finding the average will became very expensive and painful.
Using a non-normalized version would seem to be more efficient. e.g.:
Table picture
id | picture | userID | ratingOne | ratingTwo | ratingThree | ratingFour | ratingFive
To calculate the average, I'd just have to select a single row. It seems so much more efficient, but so much more uglier.
Can someone point me in the right direction of what to do? My initial research shows that I have to "find the right balance", but how do I go about finding that balance? Any articles or additional reading information would be appreciated as well.
Thanks.