Database solution for 200 million writes/day, monthly summarization queries

Posted by sb on Stack Overflow
Published on 2010-04-30T21:07:30Z Indexed on 2010/05/01 15:37 UTC

Hello.

I'm looking for help deciding on which database system to use. (I've been googling and reading for the past few hours; it now seems worthwhile to ask for help from someone with firsthand knowledge.)

I need to log around 200 million rows (or more) per 8-hour workday to a database, then run weekly/monthly/yearly summary queries on that data. The summary queries would collect data for things like billing statements, e.g. "How many transactions of type A did each user run this month?" (they could be more complex, but that's the general idea).
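To make the workload concrete, here is a minimal sketch of the numbers and the query shape. It assumes a hypothetical `transactions` table with `user_id`, `txn_type`, and `created_at` columns (those names are made up for illustration), and it uses an in-memory SQLite database only as a stand-in to show the query, not as a candidate for the real system.

```python
import sqlite3

# Sustained write rate implied by 200 million rows in an 8-hour workday.
ROWS_PER_DAY = 200_000_000
WORKDAY_SECONDS = 8 * 60 * 60
rows_per_second = ROWS_PER_DAY / WORKDAY_SECONDS  # roughly 6,944 rows/s


def monthly_counts(conn, txn_type, month):
    """Count transactions of one type per user for a month given as 'YYYY-MM'."""
    cur = conn.execute(
        """
        SELECT user_id, COUNT(*) AS n
        FROM transactions
        WHERE txn_type = ? AND strftime('%Y-%m', created_at) = ?
        GROUP BY user_id
        ORDER BY user_id
        """,
        (txn_type, month),
    )
    return cur.fetchall()


def demo():
    """Build a tiny hypothetical data set and run the billing-style query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transactions (user_id TEXT, txn_type TEXT, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO transactions VALUES (?, ?, ?)",
        [
            ("alice", "A", "2010-04-01 09:00:00"),
            ("alice", "A", "2010-04-15 10:30:00"),
            ("alice", "B", "2010-04-15 11:00:00"),
            ("bob", "A", "2010-04-20 16:45:00"),
            ("bob", "A", "2010-05-02 08:15:00"),  # falls outside April
        ],
    )
    return monthly_counts(conn, "A", "2010-04")


if __name__ == "__main__":
    print(f"{rows_per_second:.0f} rows/s sustained")
    print(demo())
```

The real question is how a query of this shape behaves when the table holds a month (or a year) of rows at that write rate, spread across several machines.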

I can spread the database amongst several machines, as necessary, but I don't think I can take old data offline. I'll definitely need to be able to query a month's worth of data, maybe a year. These queries would be for my own use, and wouldn't need to be generated in real-time for an end-user (they could run overnight, if needed).

Does anyone have any suggestions as to which databases would be a good fit?

P.S. Cassandra looks like it would have no problem handling the writes, but what about the huge monthly table scans? Is anyone familiar with Cassandra/Hadoop MapReduce performance?

© Stack Overflow or respective owner