Summarising (permanently) data in a SQL table
- by Cylindric
Geetings, Stackers.
I have a huge number of data-points in a SQL table, and I want to summarise them in a way reminiscent of RRD.
Assuming a table such as
ID | ENTITY_ID | SCORE_DATE | SCORE | SOME_OTHER_DATA
----+-----------+------------+-------+-----------------
1 | A00000001 | 01/01/2010 | 100 | some data
2 | A00000002 | 01/01/2010 | 105 | more data
3 | A00000003 | 01/01/2010 | 104 | various text
... | ......... | .......... | ..... | ...
... | A00009999 | 01/01/2010 | 101 |
... | A00000001 | 02/01/2010 | 104 |
... | A00000002 | 02/01/2010 | 119 |
... | A00000003 | 02/01/2010 | 119 |
... | ......... | .......... | ..... |
... | A00009999 | 02/01/2010 | 101 | arbitrary data
... | ......... | .......... | ..... | ...
... | A00000001 | 01/02/2010 | 104 |
... | A00000002 | 01/02/2010 | 119 |
... | A00000003 | 01/01/2010 | 119 |
I want to end up with one record per entity, per month:
ID | ENTITY_ID | SCORE_DATE | SCORE |
----+-----------+------------+-------+
... | A00000001 | 01/01/2010 | 100 |
... | A00000002 | 01/01/2010 | 105 |
... | A00000003 | 01/01/2010 | 104 |
... | A00000001 | 01/02/2010 | 100 |
... | A00000002 | 01/02/2010 | 105 |
... | A00000003 | 01/02/2010 | 104 |
(I Don't care about the SOME_OTHER_DATA - I'll pick something - either the first or last record probably.)
What's an easy way of doing this on a regular basis, so that anything in the last calendar month is summarised in this way?
At the moment my plan is kind of:
For each EntityID
For each month
Find average score for all records in given month
Update first record with results of previous step
Delete all records that aren't the first
I can't think of a neat way of doing it though, that doesn't involve lots of updates and iteration.
This can either be done in a SQL Stored Procedure, or it can be incorporated into the .Net app that's generating this data, so the solution doesn't really need to be "one big SQL script", but can be :)
(SQL-2005)