When does a Tumbling Window Start in StreamInsight

Posted on SQLIS
Published on Thu, 17 Mar 2011 10:07:57 +0100

Whilst getting some courseware ready I was writing some code, and I decided to show very simply when a window starts and ends when you ask for a TumblingWindow of n time units in StreamInsight.  I thought this was going to be a two-second job, but what I found was something I had not seen documented anywhere until now.

 

All this code is written in C# and will slot straight into my favourite quick-win dev tool, LINQPad.

 

Let’s first create a sample dataset

 

var EnumerableCollection = new []
{
    new {id = 1, StartTime = DateTime.Parse("2010-10-01 12:00:00 PM").ToLocalTime()},
    new {id = 2, StartTime = DateTime.Parse("2010-10-01 12:20:00 PM").ToLocalTime()},
    new {id = 3, StartTime = DateTime.Parse("2010-10-01 12:30:00 PM").ToLocalTime()},
    new {id = 4, StartTime = DateTime.Parse("2010-10-01 12:40:00 PM").ToLocalTime()},
    new {id = 5, StartTime = DateTime.Parse("2010-10-01 12:50:00 PM").ToLocalTime()},
    new {id = 6, StartTime = DateTime.Parse("2010-10-01 01:00:00 PM").ToLocalTime()},
    new {id = 7, StartTime = DateTime.Parse("2010-10-01 01:10:00 PM").ToLocalTime()},
    new {id = 8, StartTime = DateTime.Parse("2010-10-01 02:00:00 PM").ToLocalTime()},
    new {id = 9, StartTime = DateTime.Parse("2010-10-01 03:20:00 PM").ToLocalTime()},
    new {id = 10, StartTime = DateTime.Parse("2010-10-01 03:30:00 PM").ToLocalTime()},
    new {id = 11, StartTime = DateTime.Parse("2010-10-01 04:40:00 PM").ToLocalTime()},
    new {id = 12, StartTime = DateTime.Parse("2010-10-01 04:50:00 PM").ToLocalTime()},
    new {id = 13, StartTime = DateTime.Parse("2010-10-01 05:00:00 PM").ToLocalTime()},
    new {id = 14, StartTime = DateTime.Parse("2010-10-01 05:10:00 PM").ToLocalTime()}
};

 

Now let’s create a stream of point events

 

var inputStream = EnumerableCollection.ToPointStream(
    Application,
    evt => PointEvent.CreateInsert(evt.StartTime, evt),
    AdvanceTimeSettings.StrictlyIncreasingStartTime);

 

Now we can create our windows over the stream.  The first window we will create is a one-hour tumbling window.  We'll count the events in the window, but what we do here is not the point; the point is our window edges.

 

var windowedStream = from win in inputStream.TumblingWindow(TimeSpan.FromHours(1),HoppingWindowOutputPolicy.ClipToWindowEnd)
                        select new {CountOfEntries = win.Count()};

 

Now we can have a look at what we get.  I am only going to show the first non-CTI event, as that is enough to demonstrate what is going on.

 

windowedStream.ToIntervalEnumerable().First(e=> e.EventKind == EventKind.Insert).Dump("First Row from Windowed Stream");

 

The results are below

 

EventKind  Insert
StartTime  01/10/2010 12:00
EndTime    01/10/2010 13:00
Payload    { CountOfEntries = 5 }
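As a sanity check on that count, here is a minimal sketch in plain LINQ that buckets the same clock times by the hour, with no StreamInsight involved. The names HourBuckets and CountPerHour are my own for illustration, and the times are typed in directly rather than parsed as in the dataset above.

```csharp
using System;
using System.Linq;

static class HourBuckets
{
    // Counts timestamps per clock hour and returns (hour start, count) pairs in time order.
    // Hours with no timestamps simply do not appear in the result.
    public static Tuple<DateTime, int>[] CountPerHour(DateTime[] times)
    {
        return times
            .GroupBy(t => new DateTime(t.Year, t.Month, t.Day, t.Hour, 0, 0))
            .OrderBy(g => g.Key)
            .Select(g => Tuple.Create(g.Key, g.Count()))
            .ToArray();
    }

    static void Main()
    {
        var day = new DateTime(2010, 10, 1);

        // The same clock times as the sample dataset, written as offsets from midnight.
        var times = new[]
            {
                "12:00", "12:20", "12:30", "12:40", "12:50", "13:00", "13:10",
                "14:00", "15:20", "15:30", "16:40", "16:50", "17:00", "17:10"
            }
            .Select(s => day + TimeSpan.Parse(s))
            .ToArray();

        // The first bucket (starting 12:00) holds 5 timestamps, matching CountOfEntries above.
        foreach (var bucket in CountPerHour(times))
        {
            Console.WriteLine("{0:HH:mm}  {1}", bucket.Item1, bucket.Item2);
        }
    }
}
```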

 

This makes sense, and one hour is quite often the window width used in examples.  So what happens if I change the windowing code to this?

var windowedStream = from win in inputStream.TumblingWindow(TimeSpan.FromHours(5),HoppingWindowOutputPolicy.ClipToWindowEnd)
                        select new {CountOfEntries = win.Count()};

Now where does your window start?  What about this one?

 

var windowedStream = from win in inputStream.TumblingWindow(TimeSpan.FromMinutes(13),HoppingWindowOutputPolicy.ClipToWindowEnd)
                        select new {CountOfEntries = win.Count()};

 

Well, for the five-hour window the first window starts at 01/10/2010 10:00:00, and for the 13-minute window it starts at 01/10/2010 11:55:00.

Surprised?

 

Here is the reason why, with thanks to the StreamInsight team for listening.

 

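Windows are aligned to the start of the time scale, DateTime.MinValue (0001-01-01 00:00:00), rather than to your first event.  Windows of the size you specified in your code are then laid end to end from that point onwards.  If a window contains no events, the engine does not produce it in the output.  This is why a window's start time can be earlier than the first event.  Counting from the start of the time scale, 12:00 on 1 October 2010 falls two hours into a five-hour window and five minutes into a 13-minute window, hence the 10:00 and 11:55 starts.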
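The arithmetic can be checked without the engine. The sketch below is my own model of the behaviour, not StreamInsight code: it assumes windows are laid end to end from tick 0 of the .NET time scale (DateTime.MinValue) and rounds an event time down to the nearest multiple of the window size. WindowAlignment and WindowStart are hypothetical names.

```csharp
using System;
using System.Globalization;

static class WindowAlignment
{
    // Start of the tumbling window that contains eventTime, ASSUMING windows
    // are laid end to end from DateTime.MinValue (tick 0).
    public static DateTime WindowStart(DateTime eventTime, TimeSpan windowSize)
    {
        long offsetIntoWindow = eventTime.Ticks % windowSize.Ticks;
        return new DateTime(eventTime.Ticks - offsetIntoWindow, eventTime.Kind);
    }

    static void Main()
    {
        // Timestamp of the first event as it appears in the results above.
        var firstEvent = new DateTime(2010, 10, 1, 12, 0, 0, DateTimeKind.Utc);

        var sizes = new[] { TimeSpan.FromHours(1), TimeSpan.FromHours(5), TimeSpan.FromMinutes(13) };
        foreach (var size in sizes)
        {
            var start = WindowStart(firstEvent, size);
            Console.WriteLine("{0} -> {1}", size,
                start.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture));
        }
        // Prints:
        // 01:00:00 -> 2010-10-01 12:00:00
        // 05:00:00 -> 2010-10-01 10:00:00
        // 00:13:00 -> 2010-10-01 11:55:00
    }
}
```

The three results line up with the window starts reported by the engine, which is what you would expect if the alignment point really is DateTime.MinValue.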

© SQLIS or respective owner
