Wednesday, September 17, 2008

MSMQ Transactional Message Processing using Multiple Receive Queues

Here's the situation:
  1. You want to asynchronously process messages on a queue in a transactional manner, such that the message is only taken from the queue upon successfully processing it. This allows messages that failed to be properly processed to be processed again later with a chance at success.
  2. You want multiple workers processing these messages to improve throughput, particularly when long-running processing meets large numbers of messages.
  3. When you shut down, you want all in-progress processing to complete before allowing the process to exit.
The first item can be accomplished by using MessageQueue's BeginPeek method and PeekCompleted event. This will allow you to peek at the queue and begin a transaction before actually receiving the message. Beginning a transaction before receiving the message allows you to abort the transaction should an error occur during processing, leaving the message on the queue to be processed again later (hopefully with a higher chance of success). The following event handler shows the boilerplate code to accomplish this:

private void queue_PeekCompleted(object sender, PeekCompletedEventArgs e)
{
    var queue = (MessageQueue)sender;

    var transaction = new MessageQueueTransaction();
    transaction.Begin();
    try
    {
        var message = queue.Receive(transaction);
        // process the message here
        transaction.Commit();
    }
    catch (Exception ex)
    {
        // abort if processing fails, leaving the message on the queue
        transaction.Abort();
    }
    finally
    {
        // start watching for another message
        queue.BeginPeek();
    }
}

The second item can be accomplished by creating multiple receiving MessageQueue instances over the same queue path and telling each to start watching for incoming messages. The following snippets of code demonstrate how to accomplish this:

private readonly MessageQueue[] Receivers; // member
...
this.Receivers = Enumerable.Range(0, (count <= 0) ? 1 : count)
    .Select(i =>
    {
        var queue = new MessageQueue(path, QueueAccessMode.Receive)
        {
            Formatter = new BinaryMessageFormatter()
        };
        queue.MessageReadPropertyFilter.SetAll();
        return queue;
    })
    .ToArray();

// begin watching
foreach (var queue in this.Receivers)
{
    queue.PeekCompleted += queue_PeekCompleted;
    queue.BeginPeek();
}

...
// closing
foreach (var queue in this.Receivers)
{
    queue.PeekCompleted -= queue_PeekCompleted;
    queue.Close(); // stop peeking
}

The third item can be accomplished by incrementing a counter when processing begins and decrementing it when processing ends; then you simply block until that counter reaches zero. You'll want to place the decrement in a finally block to ensure that the counter is decremented even if processing throws an exception. Assuming you have a Counter class that implements thread-safe increment and decrement operations (see bottom), you can create a member named "ProcessingCounter". If your PeekCompleted handler does its processing with the following line,
this.Handle(queue.Receive(transaction));

your Handle method would look like this,

private void Handle(Message message)
{
    this.ProcessingCounter.Increment();
    try
    {
        // process message here
    }
    finally
    {
        this.ProcessingCounter.Decrement();
    }
}

and you could block after your MessageQueue.Close() calls like this

while (this.ProcessingCounter.Value > 0)
    Thread.Sleep(100);

The following abstract class puts it all together. Simply implement the Process method and away you go!

public abstract class MessageProcessor<TMessage>
{
    private readonly MessageQueue[] Receivers;
    private readonly Counter ProcessingCounter = new Counter();
    private volatile bool IsClosing; // written by Close, read by thread-pool callbacks

    public MessageProcessor(string path)
        : this(path, 1) { }

    public MessageProcessor(string path, int count)
    {
        if (string.IsNullOrEmpty(path))
            throw new ArgumentNullException("path");

        if (!MessageQueue.Exists(path))
            MessageQueue.Create(path, true); // true = transactional queue

        this.Receivers = Enumerable.Range(0, (count <= 0) ? 1 : count)
            .Select(i =>
            {
                var queue = new MessageQueue(path, QueueAccessMode.Receive)
                {
                    Formatter = new BinaryMessageFormatter()
                };
                queue.MessageReadPropertyFilter.SetAll();
                return queue;
            })
            .ToArray();
    }

    public void Close()
    {
        this.IsClosing = true;

        this.OnClosing();

        foreach (var queue in this.Receivers)
        {
            queue.PeekCompleted -= queue_PeekCompleted;
            queue.Close();
        }

        // block until all in-flight messages have been handled
        while (this.IsProcessing)
            Thread.Sleep(100);

        this.IsClosing = this.IsOpen = false;
        this.OnClosed();
    }

    public bool IsOpen { get; private set; }

    protected bool IsProcessing
    {
        get { return this.ProcessingCounter.Value > 0; }
    }

    protected virtual void OnClosing() { }
    protected virtual void OnClosed() { }
    protected virtual void OnOpening() { }
    protected virtual void OnOpened() { }

    public void Open()
    {
        if (this.IsOpen)
            throw new Exception("This processor is already open.");

        this.OnOpening();

        foreach (var queue in this.Receivers)
        {
            queue.PeekCompleted += queue_PeekCompleted;
            queue.BeginPeek();
        }

        this.IsOpen = true;
        this.OnOpened();
    }

    protected abstract void Process(TMessage @object);

    private void Handle(Message message)
    {
        Trace.Assert(null != message);

        this.ProcessingCounter.Increment();
        try
        {
            this.Process((TMessage)message.Body);
        }
        finally
        {
            this.ProcessingCounter.Decrement();
        }
    }

    private void queue_PeekCompleted(object sender, PeekCompletedEventArgs e)
    {
        var queue = (MessageQueue)sender;

        var transaction = new MessageQueueTransaction();
        transaction.Begin();
        try
        {
            // if the queue closes after the transaction begins,
            // but before the call to Receive, then an exception
            // will be thrown and the transaction will be aborted,
            // leaving the message to be processed next time
            this.Handle(queue.Receive(transaction));
            transaction.Commit();
        }
        catch (Exception ex)
        {
            transaction.Abort();
            Trace.WriteLine(ex.Message);
        }
        finally
        {
            if (!this.IsClosing)
                queue.BeginPeek();
        }
    }
}
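
For example, a minimal subclass and a sketch of its usage might look like the following (OrderProcessor, the queue path and the string payload are all illustrative):

// a hypothetical processor that handles string payloads
// with four receivers on a private transactional queue
public class OrderProcessor : MessageProcessor<string>
{
    public OrderProcessor()
        : base(@".\private$\orders", 4) { }

    protected override void Process(string order)
    {
        // real work goes here; throwing aborts the transaction,
        // leaving the message on the queue for a later retry
        Console.WriteLine("Processing: " + order);
    }
}

// usage: open, send a transactional test message, close
var processor = new OrderProcessor();
processor.Open();

using (var sender = new MessageQueue(@".\private$\orders", QueueAccessMode.Send))
using (var tx = new MessageQueueTransaction())
{
    sender.Formatter = new BinaryMessageFormatter();
    tx.Begin();
    sender.Send("order #42", tx); // transactional send
    tx.Commit();
}

processor.Close(); // blocks until in-flight messages finish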

Incidentally, the following is my implementation of a thread-safe counter.

public class Counter
{
    private readonly object SyncRoot = new object();
    private int value;

    public int Value
    {
        get
        {
            lock (this.SyncRoot)
            {
                return this.value;
            }
        }
    }

    public int Decrement()
    {
        lock (this.SyncRoot)
        {
            return --this.value;
        }
    }

    public int Increment()
    {
        lock (this.SyncRoot)
        {
            return ++this.value;
        }
    }
}
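
As an aside, the Interlocked class in System.Threading can achieve the same effect without taking a lock; a sketch, if you prefer that route:

public class Counter
{
    private int value;

    public int Value
    {
        // CompareExchange with identical operands is an atomic read
        get { return Interlocked.CompareExchange(ref this.value, 0, 0); }
    }

    public int Decrement() { return Interlocked.Decrement(ref this.value); }
    public int Increment() { return Interlocked.Increment(ref this.value); }
}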

Tuesday, February 5, 2008

A System of Interactions

According to Newton's Third Law of Motion, our world is a system of continuously interacting objects. Essentially, whenever an action is taken by an object, hosts of other objects react, each in turn evoking more reactions ad infinitum; and thus, our world goes round. OK, I know what you're thinking. "Is a code-monkey really going to give me a lesson in classical physics?" The answer is, of course, no. I'm using Newton's Third Law as an analogy for the execution of an object-oriented application.

You see, procedural applications tend to execute in a manner easily defined by a decision tree: take this action, check a condition, take the action associated with that condition, check another condition, take another action, so on and so forth. I associate the simplest form of this type of execution almost directly with the world's most beloved pricing game from The Price Is Right, Plinko. In Plinko, one drops some input in at the top of the board, the Plinko chip, and it makes its way down a grid of pegs until it falls into one of a number of buckets determined by its path through the pegs. As the chip falls between any two pegs, it collides with another peg, forcing it to go either left or right before it can fall to the next peg, where it will make another decision to go left or right. If one had the ability to observe the conditions of the falling chip (gravitational forces, rotation, etc.) at each peg, one could easily deduce the route the chip would take through the pegs. Thus enters the decision-tree notion I mentioned earlier; based on the conditions surrounding any given input, a procedural application's execution is easily seen to be deterministic.

Because of this deterministic nature, procedural applications don't lend themselves to pluggability or extensibility without modification. That's not to say every app should strive for these qualities; I've certainly written many apps that didn't need to be extended (sometimes you just need a script; sometimes the number of use cases for the app is finite; there could be many reasons). However, it's undeniable that in an app with this form of execution, a developer is more likely to create rigid dependencies on its deterministic nature. For example, a developer modifying or extending some action F may assume that actions A, C and E have already been taken, programming in such a way that F is very dependent on A, C and E having done their work, and in the process hampering F's ability to be reused when A, C and E have not all executed. For this reason, procedural applications can be difficult to extend or modify. In order to add a decision (or a whole new sub-tree) into the decision tree, a developer has to explicitly know the addition's position in the tree; in a worst-case scenario, where all the actions in the tree are tightly coupled to the actions taken before them, the logic for significant portions of the tree would have to be updated to take the addition into account.

Conversely, well-designed object-oriented applications execute in a manner remarkably similar to the system of objects existing in the physical reality of our world. Where objects in our reality monitor each other through physical media provided by the laws of physics, objects in OO apps monitor each other through some kind of event mechanism or messaging framework, allowing each to react to actions taken in its space of existence.

Objects in the most loosely-coupled applications don't even need to be aware of each other; they can let the application (or some mechanism therein) do that for them. Basically, the application broadcasts the fact that some action has been taken, and all those that want to react should do so now. A developer adding or modifying functionality need know nothing about the application as a whole, just how to listen for an action to be taken and what to do then. Because the application is unaware of who is listening and what actions they might take, execution of an OO application is seen to be somewhat non-deterministic (at least in the abstract).
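
To make that concrete, here's a minimal sketch of such a broadcast mechanism in C# (the names are illustrative, not any particular framework's API):

public class ActionTakenEventArgs : EventArgs
{
    public ActionTakenEventArgs(string action) { this.Action = action; }
    public string Action { get; private set; }
}

public static class ActionBroadcaster
{
    // raised whenever any object announces that an action was taken
    public static event EventHandler<ActionTakenEventArgs> ActionTaken;

    public static void Announce(object sender, string action)
    {
        var handler = ActionTaken; // snapshot to avoid a race with unsubscribers
        if (handler != null)
            handler(sender, new ActionTakenEventArgs(action));
    }
}

// a reactor need know nothing about who announces an action...
ActionBroadcaster.ActionTaken += (sender, e) =>
{
    if (e.Action == "OrderPlaced")
        Console.WriteLine("auditing the new order...");
};

// ...and the announcer knows nothing about who is listening
ActionBroadcaster.Announce(null, "OrderPlaced");

Adding or removing functionality is then a matter of adding or removing subscriptions; the announcer never changes.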

The non-deterministic nature of OO applications forces developers to decouple the action(s) on which they're working from the rest of the application. Developers can't be explicitly sure what actions have already been taken, neither can they be sure what actions will be taken after theirs executes, eliminating their ability to create a dependency on them. This significantly reduces each object's awareness of the rest of the application, making its development small and focused, enhancing readability and maintainability, not to mention the ability for new developers to contribute value right away without needing "ramp-up" time to learn how the whole app works.

So, no physics lesson. However, if you're doing OO development, you might give some thought to how the pieces of your app interact. Ask yourself, "Am I constructing my objects to interact in a similar fashion as objects do in our physical reality?" "Am I thinking about creating and using objects in my logical construct the way nature creates and uses them in her physical construct?" If so, you're probably headed down the right track from a maintainability and extensibility point of view.

Wednesday, January 30, 2008

Composite Gravitation

What do the users really want?

In application development, responsible developers realize that they are, in fact, developing for users, and we (yes, sometimes I too am responsible) regularly ask ourselves what users really want and need. In our hearts, we really do want to get it right; we want to deliver "added value"; it gives us great pleasure when we can show some users a new feature and, with growing eyes and smiles, an "Oh, cool!" is uttered appreciatively. With that said, all too often we, as developers, are given (sometimes tacitly) the task of sifting through the requirements docs, the discussions, the diagrams, the charts and all the other media through which our users are asking for something, and getting down to the heart of what they want. However, sometimes even we cannot "see the pattern". Composite gravitation is a name I've given to one such pattern.

Composite gravitation can be manifest in many ways, but is most apparent when users start to grumble about having too many different applications to use in order to complete their tasks. I can sympathize with their angst; I certainly wouldn't want to be required to leave my IDE regularly in order to complete this task or that. On top of it all, who wants to learn and use a multitude of applications, each with its own look and feel? The desires of the users begin to gravitate toward some sort of consolidation and optimization, even though they may not understand what that means or voice it accurately. In any case, it's not long before you start hearing terms like "integrated desktop", "dashboard", "container application" and "portal".

In most cases, this situation results from introducing applications separately and disparately, generally with very narrow use-cases and without attention to the look and feel of previously introduced apps. It's a natural enough progression that could occur for various reasons; for example, they could have been developed at different times, or with different technologies, or perhaps they seem functionally unrelated, 3rd-party apps could be thrown into the mix, or a combination of these.

The grumbling of users is not the only symptom of composite gravitation. It can be seen when companies begin to require all internal applications to be web applications, citing many justifications (and I use that term loosely), including that it consolidates the number of applications down to each user's browser. It can also be observed when IT and development shops begin to realize the maintenance nightmare that's been created by the existence of a myriad of disparate applications, each with separate yet remarkably similar code-bases, with run-time counterparts that soak up desktop and server resources with no mechanism for sharing them with sibling apps. In any case, the reasons are swelling, not dwindling, and something must be done.

Enter the Composite Application

A composite application is exactly what its name entails, an application composed of applications. You see composite applications in many forms, but generally each has a single extensible shell built using a modular and pluggable architecture. When new business use-cases arise, new modules can be built and plugged into the existing application in a seamless manner. Each module is built so as to maximize maintainability and efficiency, typically through code reuse and shared infrastructure: extensible menu structures and toolbars, connection management, authentication services and other common resources.
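
To illustrate, a bare-bones sketch of such a pluggable architecture might look like this (IShell, IModule and the discovery scheme are my own illustration, not any particular framework's API):

public interface IShell
{
    void AddMenuItem(string caption, Action onClick);
}

public interface IModule
{
    string Name { get; }
    void Initialize(IShell shell); // the module plugs in its menus, views, etc.
}

// the shell loads whatever modules are dropped into its modules directory
// (uses System.IO and System.Reflection)
public static void LoadModules(string modulePath, IShell shell)
{
    foreach (var file in Directory.GetFiles(modulePath, "*.dll"))
        foreach (var type in Assembly.LoadFrom(file).GetTypes())
            if (typeof(IModule).IsAssignableFrom(type) && !type.IsAbstract)
                ((IModule)Activator.CreateInstance(type)).Initialize(shell);
}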

If done correctly, composite applications greatly benefit both worlds, users and IT/Development. The users get their applications in a consolidated and cohesive package with intuitive look and feel, while the common shell and extensible, pluggable architecture eases the maintenance headaches generally involved with disparate applications.

For more information on composite applications, Atanu Banerjee of Microsoft Corporation has a great MSDN article entitled What Are Composite Applications?, and for you Windows hackers, let me also direct you to the Smart Client - Composite UI Application Block, a framework for building composite applications in .NET, graciously provided by the Microsoft Patterns and Practices group.

Tuesday, January 29, 2008

Web Applications are Overrated

For years we've been told that web apps are the way of the future. But are they really? The answer is "probably not". Think about where web applications are going. You don't know where they're going? Well, let's think about where they've been.

The Stateless Abyss
For almost 20 years now, web applications have lived in the stateless depths of the World Wide Web, thanks to that wonderful little protocol, HTTP. HTTP's request/response nature allowed for simple and straightforward communication back in the day, when all we really wanted was to ask another machine on the web for a resource.

The Dynamic Shift
As the Web grew in popularity, we started placing more and more demands on the resources for which we asked. We started making the dog do tricks. We wanted more "interactivity"; static text and images were no longer enough; we now needed the ability to customize the resource and create "dynamic content" with each request. We wanted to conduct business and stream media all over a stateless channel, where each transaction had no knowledge of previous ones.

The Stateful Revolution
Soon, having the dog roll over or shake hands was not enough either. The time came for stateful transactions. We now want to keep track of who is logged in, what he/she has done and where he/she has been. We want state. Only now, we're making the dog walk a tight-rope and juggle chainsaws. We've begun to do things like stuff variable values in responses and re-collect them on the next request, or write database records to keep track of transaction state on the web server; all to mimic stateful transactions. Dozens of web frameworks have surfaced, many having a built-in mechanism to get around the stateless nature of HTTP. We now have technologies that effectively mix web and desktop technologies, something I like to call "webtop clients", manifest in Adobe Flash Player, Java Applets and Microsoft Silverlight.

The Pattern
Do you see the pattern? Since the inception of the Web, web applications have evolved significantly; from static content to dynamic content, from statelessness to statefulness, from browser-only technology to browser-desktop mixture technologies. However, their evolution is not toward something new and unseen, but to something all too familiar, the thick client.

With today's frameworks for desktop application development and deployment, the justifications for many of today's web applications, especially web apps internal to a company, are no longer legitimate. Take a look at Apple's iTunes; everything Apple flows through that application. As for deployment, Microsoft's ClickOnce for .NET Windows applications and Sparkle for Cocoa applications make deployment and updates so easy, it's ridiculous. Don't get me wrong, web applications do have their place, but the number of scenarios for which they are the optimal choice is dwindling; sure, they can be used for nearly everything, but what's the point? It all comes down to the following question: in a particular situation, what problem does a web application solve that a desktop application does not? When you realize how often you can't find one, you'll realize that web applications are overrated.

Friday, January 18, 2008

Dataundermining

Suppose Bob is your investment agent. Bob uses many tools and algorithms to determine the best investment for your money; he always seems to make good decisions and you profit from his skill and judgment. Bob's investing ability is surpassed only by his technological ineptitude. He knows nothing of computers, yet remains meticulous with his record-keeping. After every day of investing, he places a paper log of his investment activities in the second drawer of his desk.

Suppose further that Sue is your accountant. She, too, is great at what she does, as long as she is provided with complete and accurate information. The trouble resides with you and your inability to provide her with the information she requires regarding your investments (since Bob takes care of those for you). Luckily, Bob is a trusting soul, and he has provided you with a key to his office in case of emergencies. You know about the log in his desk drawer and realize how much easier your life would be if Sue could obtain your investment activity information straight from the source. In a shady move, you give a copy of the key to Sue and inform her about the information in this log. Periodically, Sue lets herself into Bob's office and searches the log for investment activity he's done on your behalf. She uses this information to keep accurate tabs on your investments without having to badger you.

A while later, Bob completes a computer course at the local college. He now recognizes the productivity benefits of electronic record keeping and transfers the records from the log to his PC. Bob realizes that maintaining a physical copy of his records has no merit and only takes time away from doing his job; he shreds the log, and against his nutritionist's better judgment, fills the desk drawer with soda and snacks.

Later that evening, Sue comes to inspect the log. She's alarmed to find the log has been replaced with soda, candy bars and other goodies. You get a phone call from an upset accountant, demanding that you have Bob continue to put the daily log in his desk drawer, "... or we're back where we started", she says.

This is an example of dataundermining; Bob and his record-keeping system have been undermined, and by my count, you have roughly four options:

  1. Demand Bob continue with his paper logging practices so as not to keep Sue from efficiently keeping track of your investment activity. (This would also lead to a confession about your giving Sue the proverbial key to the kingdom and her sneaking data out from under Bob.)

  2. Revert back to your having to keep Sue manually apprised of Bob's investment activities on your behalf.

  3. Do nothing, letting Bob continue on oblivious to the situation and forcing Sue to attempt her job with incomplete information.

  4. Force Bob and Sue to communicate with each other.

I personally don't like any of options 1 through 3.

Option 1 requires Bob to either abandon his path of evolution or do double the work by keeping both his electronic records and his physical records up-to-date. Neither suits your needs from an investment point of view. You want Bob to be able to grow and evolve with the rest of the world; after all, he's making you bank. You also want Bob to spend the maximum amount of time and resources furthering your investment success; maintaining two forms of records detracts from his ability to do that.

Option 2 requires extra effort on your part. Your keeping Sue updated on your investment activities is just not in the cards. The speed and accuracy with which you could deliver information to Sue severely handicaps her ability to keep your financial records straight. After all, let's face it, you're no expert on your investments, that's why you have Bob.

Option 3 is what I call classic. You recognize that you have a broken system, but too much effort is involved in keeping everybody happy, so you just sit on your hands and accept the fact that one piece of the equation is just going to stay broken.

Clearly, I saved the appropriate option for last.
To me, option 4 is the option that should have been chosen in the first place. When Bob and Sue were hired, they should have been informed of the need for your financial information to be shared and communicated to other entities (including each other) upon request by one of those entities. In short, they need to open themselves to integration.

The preceding is an analogy for a problem with which I've wrestled countless times. A shop creates an application to solve a particular business problem. This application requires a database for persisting data relating to the problem domain, and consequently one is created alongside the application's development. Sooner or later, this application is deployed, beginning its life in the wild. We'll refer to this application as "Application 1".

Later, another application is developed to solve one of the many other problems the business faces. Nothing is really special about this application, which we will call "Application 2" for lack of a better name, except that in order to fulfill its use-cases efficiently, it requires information created and managed by Application 1. No problem, we'll just hook into Application 1's database and retrieve some of this data. Herein lies our problem. We've just undermined Application 1 in the same way that Sue undermined Bob in our analogy. We don't yet notice any adverse effects, and Application 2 is completed and released to the wild.

Sometime later, the business has acquired a list of new features and use-cases for Application 1, along with some obsolete use-cases that can be removed. After inspection of the requirements, a design for the new version of Application 1 has been formulated. This design calls for a few additions to existing structures and a few changes in how existing data is structured, greatly simplifying the application development work.

Suppose the development team quickly realizes that Application 2 depends on the data structures of Application 1's database. Dataundermining at its finest. Updating Application 1 is not going to be as easy as the design suggested, and the team has a very tough choice to make. Do they update Application 2 as a part of the Application 1 upgrade? Sounds like scope-creep to me, and the business may not have budgeted for multiple applications to come under the knife. Do they re-think their approach and make updates to Application 1's database and use-cases to "work around" the dependency that's been created? This would keep the data structures depended upon by Application 2 intact, but could potentially create other illogical data structures and increase the overhead and complexity of the interaction between Application 1 and its database. It would also make further evolution of Application 1 more difficult. I hope it's clear that this scenario resembles option 1 of our analogy.

Suppose again that the development team realizes the dependency on Application 1's database. Another solution consists of removing the integration piece from Application 2 and having the user consult Application 1 prior to use, but this brings about a greater potential for error and a loss of productivity. In this case, we're closely matching option 2 of our analogy.

In a different light, suppose the development team is oblivious to processes outside of Application 1. Development ensues, the structures are modified, the application update is complete and Application 1 v2.0 is released. Now, the effects of dataundermining are manifest after the fact. Application 2 crashes. What happened? It's rather simple; Application 2 went looking in Application 1's database for a data structure shaped like v1.0 and found a data structure shaped like v2.0. It follows naturally that issues should arise from a scenario like this, a scenario very closely mirroring option 3 from our analogy.

[Illustration: Application 2 reaching directly into Application 1's database, coupling itself to Application 1's data structures]
The illustration above exemplifies how dataundermining becomes a problem. Application 2 has accessed Application 1's database to fulfill its use-cases, inhibiting Application 1's ability to evolve. The most robust solution uses techniques of SOA to allow each application to communicate without acquiring illegitimate access to the other's data store. Whether that be through text-based communication, like Web Services, or binary-based communication like CORBA, DCOM or RMI, the best scenario for all involved requires letting each application be solely responsible for its own data, and building into each app the communication abilities for access to its data.
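
For instance, in .NET terms, Application 1 might expose its data through a WCF service contract along these lines (the contract and its members are purely illustrative):

[ServiceContract]
public interface IInvestmentActivityService
{
    [OperationContract]
    InvestmentActivity[] GetActivity(DateTime start, DateTime end);
}

[DataContract]
public class InvestmentActivity
{
    [DataMember] public DateTime Date { get; set; }
    [DataMember] public string Description { get; set; }
    [DataMember] public decimal Amount { get; set; }
}

Application 2 then consumes the service rather than the database, leaving Application 1's schema free to evolve so long as the contract holds.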