Data architecture for event log metrics?

Posted by elliot42 on Programmers
Published on 2012-07-19T18:21:19Z Indexed on 2012/09/17 3:52 UTC

My service generates a large, ongoing stream of user events, and we would like to run queries such as "count occurrences of event type T since date D."

We are trying to make two basic decisions:

  1. What to store? Storing every event vs. only storing aggregates

    • (Event log style) log every event and count them later, vs.
    • (Time-series style) store a single aggregated "count of event type T on date D" per event type for every day
  2. Where to store the data

    • In a relational database (particularly MySQL)
    • In a non-relational (NoSQL) database
    • In flat log files (collected centrally over the network via syslog-ng)
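
To make the "what to store" choice concrete, here is a minimal sketch of both styles side by side. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL, and the table names (`events`, `daily_counts`), column names, and helper functions are hypothetical, not part of any existing system. Each event is written both as a raw row and as an increment to a per-day counter, so the two query styles can be compared directly.

```python
import sqlite3
from datetime import date

# In-memory SQLite database, used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        event_type  TEXT NOT NULL,
        occurred_on TEXT NOT NULL          -- ISO date, e.g. 2012-07-19
    );
    CREATE TABLE daily_counts (
        event_type TEXT NOT NULL,
        day        TEXT NOT NULL,
        n          INTEGER NOT NULL,
        PRIMARY KEY (event_type, day)
    );
""")

def record_event(event_type, day):
    """Store one event in both styles (hypothetical helper)."""
    iso = day.isoformat()
    # Event log style: one row per event.
    conn.execute("INSERT INTO events VALUES (?, ?)", (event_type, iso))
    # Time-series style: increment the counter for (event_type, day),
    # creating the row on the first event of that day.
    cur = conn.execute(
        "UPDATE daily_counts SET n = n + 1 WHERE event_type = ? AND day = ?",
        (event_type, iso),
    )
    if cur.rowcount == 0:
        conn.execute("INSERT INTO daily_counts VALUES (?, ?, 1)", (event_type, iso))

def count_from_log(event_type, since):
    """Count raw event rows of a type on or after a date."""
    row = conn.execute(
        "SELECT COUNT(*) FROM events WHERE event_type = ? AND occurred_on >= ?",
        (event_type, since.isoformat()),
    ).fetchone()
    return row[0]

def count_from_aggregates(event_type, since):
    """Sum the per-day counters of a type on or after a date."""
    row = conn.execute(
        "SELECT COALESCE(SUM(n), 0) FROM daily_counts "
        "WHERE event_type = ? AND day >= ?",
        (event_type, since.isoformat()),
    ).fetchone()
    return row[0]

# Made-up sample data for illustration only.
record_event("login", date(2012, 7, 17))
record_event("login", date(2012, 7, 18))
record_event("login", date(2012, 7, 18))
record_event("signup", date(2012, 7, 18))
record_event("login", date(2012, 7, 19))

print(count_from_log("login", date(2012, 7, 18)))
print(count_from_aggregates("login", date(2012, 7, 18)))
```

Both functions answer the same question. The trade-off is that the raw table grows by one row per event but can answer questions nobody thought of in advance, while the aggregate table stays small but only answers the questions it was designed for.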

What is standard practice here, and where can I read more about comparing these types of systems?


Additional details:

  • The total event stream is large, potentially hundreds of thousands of entries per day
  • But our current need is only to count certain types of events within it
  • We don't necessarily need real-time access to the raw data or aggregation results

IMHO, "log all events to files, crawl them at a later time to filter and aggregate the stream" is a pretty standard UNIX Way, but my Rails-y compatriots seem to think that nothing is real unless it's in MySQL.
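
For reference, the "crawl the files later" approach could look roughly like the sketch below. The line format (tab-separated UTC timestamp, event type, free-form payload) is an assumption made for illustration, not the actual syslog-ng output format, and `count_events` is a hypothetical helper. It accepts any iterable of lines, so an open file object works as well as a list.

```python
from datetime import datetime

# Assumed line format (hypothetical): "<UTC timestamp>\t<event type>\t<payload>"
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def count_events(lines, event_type, since):
    """Count lines of the given event type with a timestamp at or after `since`."""
    count = 0
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # skip lines that do not match the assumed format
        try:
            timestamp = datetime.strptime(parts[0], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # skip lines with an unparseable timestamp
        if parts[1] == event_type and timestamp >= since:
            count += 1
    return count

# Made-up sample lines for illustration only.
sample_log = [
    "2012-07-17T09:00:00Z\tlogin\tuser=1",
    "2012-07-18T10:30:00Z\tlogin\tuser=2",
    "2012-07-18T11:00:00Z\tsignup\tuser=3",
    "not a valid line",
    "2012-07-19T18:21:19Z\tlogin\tuser=1",
]

print(count_events(sample_log, "login", datetime(2012, 7, 18)))
```

Since we don't need real-time results, a script like this could run as a nightly batch job over the centrally collected files.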

© Programmers or respective owner
