How to stream semi-live audio over internet
- by Thomas Tempelmann
I want to write something like Skype, i.e. I have a constant audio stream on one computer and then recompress it in a format that's suitable for a latent internet connection, receive it on the other end and play it.
Let's also assume that the internet connection is fairly modern and fast, i.e. DSL or alike, no slow connections over phone and such. The involved computers will also be rather modern (Dual Core Intel CPUs at 2GHz or more).
I know how to handle the audio on the machines. What I don't know is how to transmit the audio in an efficient way.
The challenges are:
I'd like get good audio quality across the line.
The stream should be received without drops. The stream may, however, be received with a little delay (a second delay is acceptable). I imagine that the transport software could first determine the average (and max) latency, then start the stream and tell the receiver to wait for that max latency before starting to play the audio. With that, if the latency doesn't get any higher, the entire stream will be playable on the other side without stutter or drops.
If, due to unexpected IP latencies or blockages, the stream does get cut off, I want to be able to notice this so that I can take actions (e.g. abort the stream) and eventually start a new transmission.
What are my options if I want do use ready-made software for the compression and tranmission? I have no intention to write my own audio compression engine, really. OTOH, I plan to sell the solution in a vertical market, meaning I can afford a few dollars of license fees per copy, but not $100s.
I guess the simplest solution would be to just open a TCP stream, send a few packets back and forth to determine their running time (or even use UDP for that), then use the results as the guide for my max latency value, then simply fire the audio data in its raw form (uncompressed 16 bit stereo), along with a timing code over the TCP connection. The receiver reads the data and plays it with the pre-determined delay. That might just work with the type of fast connection I expect.
I just wonder if there are better solutions to reach this goal, with better performance (lower latency) and less data (compressed).
BTW, I first try to implement this on OS X, but might want to do it on Windows, too, if it proves successful.