MySQL table organization and optimization (Rails)
Posted
by
aguynamedloren
on Stack Overflow
See other posts from Stack Overflow
or by aguynamedloren
Published on 2011-03-09T21:50:25Z
Indexed on
2011/03/10
0:10 UTC
Read the original article
Hit count: 370
I've been learning Ruby on Rails over the past few months with no prior programming experience. Lately, I've been thinking about database optimization and table organization. I know there are great books on the subject, but I typically learn by example / as I go.
Here's a hypothetical situation:
Let's say I am building a social network for a niche community with 250,000 members (users). The users have the ability to attend events. Let's say there are 50,000 past/present/future events. Much like Facebook events, a user can attend any number of events and an event can have any number of attendees.
In the database, there would be a table for users and a table for events. Somehow I would have to create an association between the users and events. I could create an "events" column in the users table such that each user row would contain a hash of event IDs, or I could create an "attendees" column in the events table such that each event row would contain a hash of user IDs.
Neither of these solutions seem ideal, however. On a users profile page, I want to display the list of events they are associated with, which would require scanning the 50,000 event rows for the user ID of said user if I include an "attendees" column in the events table. Likewise, on an event page, I want to display a list of attendees for the event, which would require scanning the 250,000 user rows for the event ID of said event if I include an "events" column in the users table.
Option 3 would be to create a third table that contains the attendee information for each and every event - but I don't see how this would solve any problems.
Are these non-issues? Rails makes accessing all of this information easy, but I guess I'm worried about scale. It is entirely possible that I am under-estimating the speed and processing power of modern databases / servers / etc. How long would it take to scan 250,000 user rows for specific event IDs - 10ms? 100ms? 1,000ms? I guess that's not that bad. Am I just over-thinking this?
© Stack Overflow or respective owner