Summarising (permanently) data in a SQL table

Posted by Cylindric on Stack Overflow See other posts from Stack Overflow or by Cylindric
Published on 2010-03-11T16:15:11Z Indexed on 2010/03/11 17:34 UTC
Read the original article Hit count: 287

Filed under:
|

Geetings, Stackers.

I have a huge number of data-points in a SQL table, and I want to summarise them in a way reminiscent of RRD.

Assuming a table such as

 ID | ENTITY_ID | SCORE_DATE | SCORE | SOME_OTHER_DATA
----+-----------+------------+-------+-----------------
  1 | A00000001 | 01/01/2010 |   100 | some data
  2 | A00000002 | 01/01/2010 |   105 | more data
  3 | A00000003 | 01/01/2010 |   104 | various text
... | ......... | .......... | ..... | ...
... | A00009999 | 01/01/2010 |   101 | 
... | A00000001 | 02/01/2010 |   104 | 
... | A00000002 | 02/01/2010 |   119 | 
... | A00000003 | 02/01/2010 |   119 | 
... | ......... | .......... | ..... | 
... | A00009999 | 02/01/2010 |   101 | arbitrary data
... | ......... | .......... | ..... | ...
... | A00000001 | 01/02/2010 |   104 | 
... | A00000002 | 01/02/2010 |   119 | 
... | A00000003 | 01/01/2010 |   119 | 

I want to end up with one record per entity, per month:

 ID | ENTITY_ID | SCORE_DATE | SCORE |
----+-----------+------------+-------+
... | A00000001 | 01/01/2010 |   100 |
... | A00000002 | 01/01/2010 |   105 |
... | A00000003 | 01/01/2010 |   104 |
... | A00000001 | 01/02/2010 |   100 |
... | A00000002 | 01/02/2010 |   105 |
... | A00000003 | 01/02/2010 |   104 |

(I Don't care about the SOME_OTHER_DATA - I'll pick something - either the first or last record probably.)

What's an easy way of doing this on a regular basis, so that anything in the last calendar month is summarised in this way?

At the moment my plan is kind of:

  • For each EntityID
    • For each month
      • Find average score for all records in given month
      • Update first record with results of previous step
      • Delete all records that aren't the first

I can't think of a neat way of doing it though, that doesn't involve lots of updates and iteration.

This can either be done in a SQL Stored Procedure, or it can be incorporated into the .Net app that's generating this data, so the solution doesn't really need to be "one big SQL script", but can be :)

(SQL-2005)

© Stack Overflow or respective owner

Related posts about sql

Related posts about sql-server-2005