How to approach parallel processing of messages?

Posted by Dan on Stack Overflow See other posts from Stack Overflow or by Dan
Published on 2010-06-13T10:18:08Z Indexed on 2010/06/13 10:22 UTC
Read the original article Hit count: 278

Filed under:
|
|
|

I am redesigning the messaging system for my app to use intel threading building blocks and am stumped trying to decide between two possible approaches.

Basically, I have a sequence of message objects and for each message type, a sequence of handlers. For each message object, I apply each handler registered for that message objects type.


The sequential version would be something like this (pseudocode):

for each message in message_sequence                     <- SEQUENTIAL
    for each handler in (handler_table for message.type)
        apply handler to message                         <- SEQUENTIAL

The first approach which I am considering processes the message objects in turn (sequentially) and applies the handlers concurrently.

Pros:

  • predictable ordering of messages (ie, we are guaranteed a FIFO processing order)
  • (potentially) lower latency of processing each message

Cons:

  • more processing resources available than handlers for a single message type (bad parallelization)
  • bad use of processor cache since message objects need to be copied for each handler to use
  • large overhead for small handlers

The pseudocode of this approach would be as follows:

for each message in message_sequence                              <- SEQUENTIAL
    parallel_for each handler in (handler_table for message.type)
        apply handler to message                                  <- PARALLEL

The second approach is to process the messages in parallel and apply the handlers to each message sequentially.

Pros:

  • better use of processor cache (keeps the message object local to all handlers which will use it)
  • small handlers don't impose as much overhead (as long as there are other handlers also to be run)
  • more messages are expected than there are handlers, so the potential for parallelism is greater

Cons:

  • Unpredictable ordering - if message A is sent before message B, they may both be processed at the same time, or B may finish processing before all of A's handlers are finished (order is non-deterministic)

The pseudocode is as follows:

parallel_for each message in message_sequence                     <- PARALLEL
    for each handler in (handler_table for message.type)
        apply handler to message                                  <- SEQUENTIAL

The second approach has more advantages than the first, but non-deterministic ordering is a big disadvantage..

Which approach would you choose and why? Are there any other approaches I should consider (besides the obvious third approach: parallel messages and parallel handlers, which has the disadvantages of both and no real redeeming factors as far as I can tell)?

Thanks!

© Stack Overflow or respective owner

Related posts about multithreading

Related posts about cache