Spreading incoming batched data into a real-time stream
- by pr1001
I would like to display some events in 'real-time'. However, I must fetch the data from another source. I can request the last X minutes, though the source is updated approximately every 5 minutes. This means that there will be a delay between the most recent data retrieved and the point in time that I make the request.
Second, because I will be receiving a batch of data, I don't want to just fire out all the events down a socket once my fetcher has retrieved it: I would like to spread out the events so that they are both accurately spaced amongst each other and in sync with their original occurrences (e.g. an event is always displayed 6 minutes after it actually happened).
My thought is to fetch the data every 5 minutes from the source, knowing that I won't get the very latest data. The original data would be then queued to be sent down the socket 7.5 minutes from its original timestamp – that is, at least ~2.5 minutes from when its batch was fetched and at most 7.5 minutes since then.
My question is this: is this the best way to approach the problem? Does this problem have any standard approaches or associated literature related to implementation best-practices and edge cases?
I am a bit worried that the frequency of my fetches and the frequency in which the source is updated will get out of sync, leading to points where no data will be retrieved from the source. However, since my socket delay is greater than my fetch frequency, the subsequent fetch should retrieve newer data before the socket queue is empty.
Is that correct? Am I missing something?
Thanks!