Average over a timeframe with missing data
- by BHare
Assuming a table such as:
UID  Name    Datetime             Users
4    Room 4  2012-08-03 14:00:00  3
2    Room 2  2012-08-03 14:00:00  3
3    Room 3  2012-08-03 14:00:00  1
1    Room 1  2012-08-03 14:00:00  2
3    Room 3  2012-08-03 14:15:00  1
2    Room 2  2012-08-03 14:15:00  4
1    Room 1  2012-08-03 14:15:00  3
1    Room 1  2012-08-03 14:30:00  6
1    Room 1  2012-08-03 14:45:00  3
2    Room 2  2012-08-03 14:45:00  7
3    Room 3  2012-08-03 14:45:00  8
4    Room 4  2012-08-03 14:45:00  4
I want to get the average user count of each room (1, 2, 3, 4) between 2 PM and 3 PM. The problem is that a room sometimes fails to "check in" at a 15-minute interval, so its last known user count has to be assumed still valid.
For example, room 4 never checked in at 2012-08-03 14:15:00, so it must be assumed that room 4 had 3 users at that time, because that is what it had at 2012-08-03 14:00:00.
This follows on through so that the average user count I am looking for is as follows:
Room 1: (2 + 3 + 6 + 3) / 4 = 3.5
Room 2: (3 + 4 + #4 + 7) / 4 = 4.5
Room 3: (1 + 1 + #1 + 8) / 4 = 2.75
Room 4: (3 + #3 + #3 + 4) / 4 = 3.25
where a # prefix marks a number assumed from the previous known check-in.
I am wondering whether it's possible to do this with SQL alone. If not, I am curious about an ingenious PHP solution that isn't just brute-force math, such as my quick, inaccurate pseudocode:
foreach ($rooms_id_array as $room_id) {
    // query() stands in for whatever DB helper returns an array of rows.
    // 15:00:00 itself is excluded so the window holds four 15-minute slots.
    $SQL = "SELECT * FROM `table` WHERE `UID` = " . (int) $room_id
         . " AND `Datetime` >= '2012-08-03 14:00:00' AND `Datetime` < '2012-08-03 15:00:00'";
    $result = query($SQL);
    if (count($result) < 4) {
        // go through each date and find what is missing, and then go to previous date and use that instead
    } else {
        $sum = 0;
        foreach ($result as $row) {
            $sum += $row['Users'];
        }
        $avg = $sum / 4;
    }
}
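To make the "use the previous known count" rule concrete, here is a minimal PHP sketch that works on an in-memory copy of the sample rows rather than on the database. The function name `averageWithCarryForward`, the `$rows` layout, and the explicit `$slots` list are all assumptions made for illustration. It also assumes that a slot with no earlier check-in inside the window is simply skipped, which never happens in the sample data because every room checks in at 14:00.

```php
<?php
// Hypothetical in-memory copy of the sample table: [UID, Datetime, Users].
$rows = [
    [4, '2012-08-03 14:00:00', 3], [2, '2012-08-03 14:00:00', 3],
    [3, '2012-08-03 14:00:00', 1], [1, '2012-08-03 14:00:00', 2],
    [3, '2012-08-03 14:15:00', 1], [2, '2012-08-03 14:15:00', 4],
    [1, '2012-08-03 14:15:00', 3], [1, '2012-08-03 14:30:00', 6],
    [1, '2012-08-03 14:45:00', 3], [2, '2012-08-03 14:45:00', 7],
    [3, '2012-08-03 14:45:00', 8], [4, '2012-08-03 14:45:00', 4],
];

// The four 15-minute slots between 2 PM and 3 PM.
$slots = [
    '2012-08-03 14:00:00', '2012-08-03 14:15:00',
    '2012-08-03 14:30:00', '2012-08-03 14:45:00',
];

// Average per room over the given slots, reusing the last known count
// whenever a room has no check-in for a slot.
function averageWithCarryForward(array $rows, array $slots): array
{
    // Index the known counts as $known[uid][datetime] = users.
    $known = [];
    foreach ($rows as [$uid, $datetime, $users]) {
        $known[$uid][$datetime] = $users;
    }

    $averages = [];
    foreach ($known as $uid => $bySlot) {
        $last = null; // last known count for this room
        $sum = 0;
        $counted = 0;
        foreach ($slots as $slot) {
            if (isset($bySlot[$slot])) {
                $last = $bySlot[$slot]; // a real check-in replaces the carried value
            }
            if ($last === null) {
                continue; // assumption: nothing to carry forward yet, skip the slot
            }
            $sum += $last;
            $counted++;
        }
        if ($counted > 0) {
            $averages[$uid] = $sum / $counted;
        }
    }
    ksort($averages); // order the result by room UID
    return $averages;
}

$averages = averageWithCarryForward($rows, $slots);
foreach ($averages as $uid => $avg) {
    echo "Room $uid: $avg\n";
}
```

On the sample rows this prints 3.5, 4.5, 2.75 and 3.25 for rooms 1 to 4, matching the hand-computed averages. It is still a loop over every slot, so it does not answer the SQL-only part of the question.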