Developing an analytics system that processes large amounts of data - where to start?
Posted by Ryan on Programmers
Published on 2012-07-10T19:00:31Z
Indexed on 2012/07/10 21:24 UTC
Imagine you're writing some sort of web analytics system: you're recording raw page hits along with some extra data (tagging cookies etc.) and then producing stats such as:
- Which pages got the most traffic over a time period
- Which referrers sent the most traffic
- Goals completed (a goal being a view of a particular page)
- And more advanced things, like which referrers sent the most visitors who later hit a goal.
The naive way of approaching this would be to throw it all in a relational database and run queries over the raw hits, but that won't scale as the hit volume grows.
You could pre-calculate everything (have a queue of incoming 'hits' and use it to update report tables), but what if you later change a goal? How could you efficiently re-calculate just the data that would be affected?
Obviously this has been done before ;) so any tips on where to start would be welcome: methods & examples, architecture, technologies, etc.
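To make the naive approach concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `hits` table, its columns, the sample rows, and the function names are all made up for illustration; every report is a query over the raw hits, which is what becomes expensive as the table grows.

```python
import sqlite3

# Hypothetical schema: one row per raw page hit (all names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hits (visitor_id TEXT, url TEXT, referrer TEXT, ts INTEGER)"
)
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?, ?)",
    [
        ("v1", "/home", "google.com", 1),
        ("v1", "/signup", None, 2),
        ("v2", "/home", "google.com", 3),
        ("v3", "/home", "bing.com", 4),
        ("v3", "/signup", None, 5),
        ("v2", "/pricing", None, 6),
    ],
)


def top_pages(conn, start, end):
    """Pages ordered by hit count within [start, end]."""
    return conn.execute(
        "SELECT url, COUNT(*) AS n FROM hits WHERE ts BETWEEN ? AND ? "
        "GROUP BY url ORDER BY n DESC, url",
        (start, end),
    ).fetchall()


def referrers_leading_to_goal(conn, goal_url):
    """Referrers ranked by distinct visitors who later viewed goal_url."""
    return conn.execute(
        """
        SELECT h.referrer, COUNT(DISTINCT h.visitor_id) AS visitors
        FROM hits AS h
        JOIN hits AS g
          ON g.visitor_id = h.visitor_id AND g.url = ? AND g.ts > h.ts
        WHERE h.referrer IS NOT NULL
        GROUP BY h.referrer
        ORDER BY visitors DESC, h.referrer
        """,
        (goal_url,),
    ).fetchall()


pages = top_pages(conn, 1, 6)
goal_referrers = referrers_leading_to_goal(conn, "/signup")
```

Note that the goal report is a self-join on the hits table, so its cost grows with the full history every time it runs.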
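As a rough sketch of the pre-calculated idea (not a claim about how real analytics products do it), the report tables could be split into goal-independent aggregates, such as page views per day, and goal-specific ones derived from them. Since a goal here is just a view of a particular page, changing a goal would then only rebuild that goal's counters from the per-day aggregates instead of replaying every hit. The `Reports` class and all names below are hypothetical.

```python
from collections import Counter, defaultdict, deque


class Reports:
    """Hypothetical in-memory 'report tables' updated from a queue of hits."""

    def __init__(self, goals):
        self.goals = dict(goals)                # goal name -> goal URL
        self.page_views = defaultdict(Counter)  # day -> Counter of URL views
        self.goal_hits = defaultdict(Counter)   # goal name -> Counter of days

    def apply(self, hit):
        """Fold one (day, visitor_id, url) hit into the aggregates."""
        day, _visitor_id, url = hit
        self.page_views[day][url] += 1
        for name, goal_url in self.goals.items():
            if url == goal_url:
                self.goal_hits[name][day] += 1

    def drain(self, queue):
        """Consume all queued hits in arrival order."""
        while queue:
            self.apply(queue.popleft())

    def change_goal(self, name, new_url):
        """Re-derive one goal's counters from per-day page views only."""
        self.goals[name] = new_url
        self.goal_hits[name] = Counter(
            {day: views[new_url]
             for day, views in self.page_views.items()
             if views[new_url]}
        )


queue = deque([
    ("2012-07-09", "v1", "/home"),
    ("2012-07-09", "v1", "/signup"),
    ("2012-07-10", "v2", "/home"),
    ("2012-07-10", "v2", "/pricing"),
    ("2012-07-10", "v3", "/pricing"),
])
reports = Reports({"signup": "/signup"})
reports.drain(queue)
before = dict(reports.goal_hits["signup"])
reports.change_goal("signup", "/pricing")
after = dict(reports.goal_hits["signup"])
```

This only covers goals that can be answered from the aggregates that were kept; a report such as "referrers whose visitors later hit a goal" would presumably still need the raw hits (or a per-visitor aggregate) to be retained somewhere.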
© Programmers or respective owner