Sink or Swim

By Ivan Gevirtz

created: Friday, March 10, 2006
updated: Monday, July 28, 2008

All y'all know the Oscars, right?  Where all the biggest stars in the world gather to congratulate each other and show off their latest surgical enhancements?  Well, some work I did way back helped lead Avid to the Oscars.  Our Oscar was in "Technical Achievement."  Naturally, our award ceremony was kinda like Star Wars -- "A long time ago, in a galaxy far, far away" from any actual movie stars.  But, the Academy president did bring plenty of champagne to our corporate cafeteria.

Since Avid, I've played around with a bunch of other media-related companies and technologies.  And I have realized that there is an important similarity to how media processing happens.  Generally, there are a whole bunch of little things which get some media, do pieces of processing, and pass the media along.  I've seen a bunch of patterns on how to do this, and figured out a concise way to generally depict these processes.  In this essay, I'm going to discuss the basic ways to couple together media processing units.  I'm not going to discuss the processing itself, only the flow of media throughout the system.

There are two basic ways to couple two processing units together.  The coupling can be synchronous or asynchronous.  A side which gets told when to act is considered to be asynchronous.  It is waiting for the appropriate trigger or event to activate it.  Otherwise, it is dormant.  The side which tells another side to act will be called the synchronous side.  The synchronous side is what determines the timing of the interaction.  Generally, media processing is done by coupling a synchronous side with an asynchronous side.  If you want to couple two synchronous sides together, you need a buffer to mediate the interaction -- to separate the timing of the two sides.  Similarly, if you want to couple two asynchronous sides, you need a thread to drive them together.

Let me draw some pictures to help explain.  In these pictures, assume the left side happens before the right, so data flows from left to right.  Also, assume that the < and > designate the synchronicity: when the arrow points out of the box, it is a synchronous side, actively pushing data out or pulling data in.  Similarly, when it points into the box, it is a passive asynchronous side, waiting for something to pass it data or grab data out of it.

====|     |=====   ====|     |=====
    >     >            <     <
====|     |=====   ====|     |=====
synch.    asynch.  asynch.   synch.
data out  data in  data out  data in

Coupling a data in side with a data out side, we have 8 shapes, which we can name:


|====================|         |====================|
>     "Pusher"       >         <     "Puller"       <
|====================|         |====================|
asynchronous synchronous       synchronous asynchronous
data in       data out         data in      data out

|====================|         |====================|
>     "Pool"         <         <     "Driver"       >
|====================|         |====================|
asynchronous asynchronous      synchronous synchronous
data in       data out         data in      data out

|====================|         |====================|
|   "Source Pool"    <         <   "Sink Driver"    | 
|====================|         |====================|
          asynchronous         synchronous
              data out         data in

|====================|         |====================|
|   "Source Driver"  >         >   "Sink Pool"      |
|====================|         |====================|
           synchronous         asynchronous
              data out         data in

With Pusher and Puller shaped filters, the asynchronous side triggers the synchronous side.  In other words, each time the asynchronous side is called, the synchronous side follows.  For example, in the Puller, every time another filter asks the asynchronous side for data, it pulls it from its source.  In the pusher, once data is passed in, it processes it and passes it off to the next filter.
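To make the two shapes concrete, here is a minimal sketch in Python.  The class and method names (Pusher, Puller, on_data_in, on_data_out) are my own placeholders, not any particular framework's API.  In each class, the method is the passive asynchronous side, and the call it makes inside is the active synchronous side.

```python
from typing import Callable, List


class Pusher:
    """Asynchronous data in, synchronous data out."""

    def __init__(self, process: Callable[[bytes], bytes],
                 downstream: Callable[[bytes], None]) -> None:
        self.process = process
        self.downstream = downstream

    def on_data_in(self, data: bytes) -> None:
        # Asynchronous side: someone upstream calls us with data.
        # Synchronous side: we immediately push the result onward.
        self.downstream(self.process(data))


class Puller:
    """Synchronous data in, asynchronous data out."""

    def __init__(self, process: Callable[[bytes], bytes],
                 upstream: Callable[[], bytes]) -> None:
        self.process = process
        self.upstream = upstream

    def on_data_out(self) -> bytes:
        # Asynchronous side: someone downstream asks us for data.
        # Synchronous side: we pull from our source to satisfy the request.
        return self.process(self.upstream())


# A chain of two pushers: strip a made-up 2-byte header, then upper-case.
received: List[bytes] = []
upper = Pusher(bytes.upper, received.append)
strip = Pusher(lambda packet: packet[2:], upper.on_data_in)
strip.on_data_in(b"hdhello")

# A puller wrapped around a simple source.
source = iter([b"abc", b"def"])
puller = Puller(bytes.upper, lambda: next(source))
first = puller.on_data_out()
```

Notice that neither shape owns a buffer or a thread.  Each one just borrows the caller's thread for the length of one call.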

The pool and drivers each require more resources, and thus are somewhat less desirable.  The pool needs a buffer.  It needs to be able to hold on to data passed in until some other filter comes and grabs it out.  The driver needs a thread to keep passing data along.  However, these filter shapes are adapters, and can tie together disparately shaped filters.
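Here is a rough sketch of those two adapters, assuming Python's standard queue and threading modules stand in for the buffer and the thread.  The class names, method names, and the use of None as an end-of-stream marker are all assumptions I made for the illustration.

```python
import queue
import threading


class Pool:
    """Asynchronous on both sides, so it needs a buffer."""

    def __init__(self) -> None:
        self._buffer = queue.Queue()

    def on_data_in(self, data) -> None:
        # Hold on to the data until some other filter comes for it.
        self._buffer.put(data)

    def on_data_out(self):
        # Blocks the caller until data is available.
        return self._buffer.get()


class Driver:
    """Synchronous on both sides, so it needs a thread."""

    def __init__(self, pull, push) -> None:
        self._pull = pull
        self._push = push
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def join(self) -> None:
        self._thread.join()

    def _run(self) -> None:
        # Keep passing data along until the hypothetical end marker shows up.
        while True:
            item = self._pull()
            if item is None:
                break
            self._push(item)


# A Driver ties two Pools together: pull from one, push into the other.
inbox, outbox = Pool(), Pool()
for chunk in (b"a", b"b", None):
    inbox.on_data_in(chunk)

driver = Driver(inbox.on_data_out, outbox.on_data_in)
driver.start()
driver.join()
moved = [outbox.on_data_out(), outbox.on_data_out()]
```

The Pool costs memory and the Driver costs a thread, which is exactly the extra resource cost described above.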

There are some shapes which only have "synchronicity" on one side.  These shapes are your sources or sinks.

It is also important to understand the potential blocking behavior of these shapes.  Blocking behavior can be used to rate-limit processing, and also can be used to decouple timing.  This is very important to understand, for it is the key which helps eliminate polling loops and threads.  For pushers and pullers, you generally only want one side to block, the synchronous side.

On the other hand, if you have two filters which take time to do their thing, and you want to decouple the timing of one from the other, you can use a Pool to buffer between them.  Alternatively, you can use a Driver, allowing each side to block as needed.  The determination is made on the basis of the shape of the two objects.

The network is best modeled as a Source Driver (in from the network) and a Sink Pool (out to the network).  Thus, the whole network looks like a Pusher coupling the sender with the receiver.  I send stuff to you when I want to, and you then get it and have to deal with it.

Sound cards, on the other hand, are best modeled the opposite way.  Because their timing is much more precise than the computer's, they know when they need data, and thus sending out sound to the speaker is done by modeling the sound card as a Sink Driver.  The sound card will grab more data to play when it is ready.  Similarly, it is generally a Source Driver as well, generating a consistent stream of data that the reader must deal with.  Thus, a sound card is a Driver type system, driving its own timing and controlling/moderating the timing of others.

Efficient systems are composed of chains of either pushers or pullers, often with one pool somewhere in the middle.  Take, for example, passing sound from the network to a sound card.  The network is a Source Driver.  It then pairs with Pusher filters: stripping off the headers, decoding the sound bytes, performing echo cancellation, etc.  But the sound card is a Sink Driver.  So, once the sound is ready for the sound card, you pass it from the final pusher filter into a Pool.  Then the sound card Sink Driver can pull from the pool when needed.  In this way, the sound card is decoupled from the timing of network events.  Interestingly enough, if the Pool has no data, instead of blocking it can generate silence or "comfort noise," and thus handle network packet loss and things like that.  The picture looks like:

|==========| |========| |========| |========| |=============|
| network  > > filter > > filter > > buffer < < sound card  |
|src driver| | pusher | | pusher | | pool   | | sink driver |
|==========| |========| |========| |========| |=============|
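The same picture can be sketched in a few lines of Python.  Everything here is a stand-in I made up for illustration: the 2-byte header, the upper-casing "decoder", the 4-byte silence frame, and all of the names.  The point is only the shape: pushers feed a pool, and the pool hands out silence instead of blocking when it runs dry.

```python
import queue

SILENCE = b"\x00" * 4  # hypothetical frame of silence


class Pusher:
    """Asynchronous data in, synchronous data out."""

    def __init__(self, process, downstream) -> None:
        self.process = process
        self.downstream = downstream

    def on_data_in(self, data: bytes) -> None:
        self.downstream(self.process(data))


class SoundPool:
    """Pool that returns silence rather than blocking when it is empty."""

    def __init__(self) -> None:
        self._frames = queue.Queue()

    def on_data_in(self, frame: bytes) -> None:
        self._frames.put(frame)

    def on_data_out(self) -> bytes:
        try:
            return self._frames.get_nowait()
        except queue.Empty:
            # Nothing arrived in time (late or lost packet): play silence.
            return SILENCE


# Build the chain from right to left: pool, decoder, header stripper.
pool = SoundPool()
decode = Pusher(lambda payload: payload.upper(), pool.on_data_in)
strip_header = Pusher(lambda packet: packet[2:], decode.on_data_in)

# The network (Source Driver) pushes one packet whenever it feels like it.
strip_header.on_data_in(b"\x01\x02abcd")

# The sound card (Sink Driver) pulls twice, on its own clock.
played = [pool.on_data_out() for _ in range(2)]
```

The first pull gets the decoded frame and the second gets silence, so the sound card never waits on the network.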

With pushers, you always want the asynchronous input to not block.  With pullers, you generally do want the asynchronous output to block.

On the other hand, you do want the synchronous output to block until the data is passed to the next step.  To do so, and keep the asynchronous side non-blocking, you may need a buffer.

Generally, you want your asynchronous side to block if there is no data.  This limits the need for threads.  This asynchronous coupling behavior is often established via callback functions, just like OS events: OnDataIn(char* buffer, size_t len) or OnDataOut(char* buffer, size_t len)...  Having an asynchronous side return, but with no data, forces the synchronous other side to loop, which can be very inefficient.
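A small sketch of the difference, assuming a Python queue.Queue plays the asynchronous side and a short-lived thread plays the upstream filter.  The deliver_later helper and the 10 millisecond delay are invented for the demonstration.

```python
import queue
import threading
import time


def deliver_later(buffer: queue.Queue) -> threading.Thread:
    """Stand-in for an upstream filter whose data arrives a little late."""
    def run() -> None:
        time.sleep(0.01)
        buffer.put(b"frame")

    thread = threading.Thread(target=run)
    thread.start()
    return thread


# Asynchronous side that returns right away with no data: the caller loops.
buffer = queue.Queue()
producer = deliver_later(buffer)
empty_polls = 0
while True:
    try:
        polled = buffer.get_nowait()
        break
    except queue.Empty:
        empty_polls += 1  # a wasted trip around the loop
producer.join()

# Asynchronous side that blocks until data exists: no loop at all.
buffer = queue.Queue()
producer = deliver_later(buffer)
blocked = buffer.get()  # sleeps until the producer calls put()
producer.join()
```

Both versions end up with the same frame, but the first one burns CPU counting empty polls while the second simply waits.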

NOTES FOR FURTHER EXPOUNDMENT:

Input to output ratios -- is it 1 to 1?  If not, the filter is chunking or aggregating.

If 1 to 1, then the stream processing time is really only the sum of the processing times for each filter.  Note, however, that if each step keeps one item buffered, then the clock cycle for a chain is the time of the slowest filter, not the aggregate length.  This is because each time a filter passes data, it passes the one from the previous cycle, which is already processed!  It holds on to the one just passed in, and processes it during the wait time between events.  This kind of processing is ideally suited for multiprocessor or distributed processing schemes.  Each filter can be within a computer, or could be a service box on a network!
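To put some numbers on that, here is a back-of-the-envelope calculation.  The three filter times are made up, and I am assuming the filters really do run in parallel once each one has an item buffered.

```python
# Hypothetical per-filter processing times, in milliseconds.
filter_times_ms = [2, 5, 3]

# Without buffering, one item must clear every filter before the next starts.
unbuffered_cycle_ms = sum(filter_times_ms)

# With one item buffered per filter, all filters work at once,
# so the chain ticks at the pace of its slowest filter.
buffered_cycle_ms = max(filter_times_ms)

# Steady-state throughput in items per second.
unbuffered_rate = 1000 / unbuffered_cycle_ms
buffered_rate = 1000 / buffered_cycle_ms
```

With these numbers the cycle drops from 10 ms to 5 ms, so the buffered chain moves twice as many items per second.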