Thursday, March 25, 2010

Sequential Asynchronous Workflows Part 2: Simplified

I received quite a bit of feedback related to my sequential asynchronous workflows post. Most people asked if I could simplify the example even more and/or show the inner workings of the coroutine engine itself. Because the source for the library I demonstrated in the last post is not available, I've put together a very simple solution to help you better understand and take advantage of coroutines.

First, the fine print. This framework exists only to illustrate the concept and to give you a starting point for building out coroutines yourself. I know some companies simply don't allow third-party code, and it always helps to learn a framework by starting at the ground floor. Some things that belong in production code but that you won't find here include thread awareness (background or UI thread?), timeouts (if I call out and the call never returns, does my workflow die?), and exception management. All are very important, and their absence is the reason this framework is educational and not a "production-ready" implementation.

Download the source for this post

Remember the original interface we defined for illustration purposes, the ICoroutine interface? Here it is:

public interface ICoroutine
{
    void Yield();
    Action Yielded { get; set; }
}

Now, to drive a workflow, you simply need an engine. Let's build a quick and dirty engine to drive what we want. Last post I showed how to use an existing framework. This time we'll do it a little differently. Instead of queuing asynchronous calls using the asynchronous process model, we'll drive the process with the coroutine interface itself. This comes with some caveats. What I'm going to show you will work, but a production solution will have to take it a step further and deal with things like execution threads (background vs. UI) and so on, all of which is handled by the AsyncEnumerator class I showed you before.

So, here is our "bare bones" coroutine manager, and again, this is closer to Rob Eisenberg's implementation in his MIX session on building your own MVVM framework than the version I showed you last time, but I've taken it and dumbed it down as much as possible to make it easy to use and understand.

public class CoroutineManager
{
    private readonly IEnumerator<ICoroutine> _enumerator;

    public CoroutineManager(IEnumerable<ICoroutine> workflow)
    {
        _enumerator = workflow.GetEnumerator();
    }
    
    private void Yielded()
    {            
        if(!_enumerator.MoveNext())
            return;

        var next = _enumerator.Current;
        next.Yielded = Yielded; 
        next.Yield();
    }
    
    public static void Begin(object workflow)
    {
        if (workflow is ICoroutine)
        {
            workflow = new[] {workflow as ICoroutine};
        }
        
        if (workflow is IEnumerable<ICoroutine>)
        {
            new CoroutineManager(workflow as IEnumerable<ICoroutine>).Yielded();
        }            
    }     
}

Not much to it, is there? The class has a static entry point that takes either a single coroutine or a sequence of them, instantiates the manager, and kicks off the workflow. The manager simply takes the next coroutine from the enumerator, sets its Yielded delegate to call back into the coroutine manager, and then executes it. That's it!
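To watch that control flow in isolation, here is a minimal console sketch. The manager is a trimmed copy of the one above (Begin takes the typed sequence directly), while LogStep and ManagerDemo are hypothetical names made up for this illustration and are not part of the download. Each step completes synchronously, which is an assumption that keeps the output deterministic.

```csharp
using System;
using System.Collections.Generic;

public interface ICoroutine
{
    void Yield();
    Action Yielded { get; set; }
}

// Trimmed copy of the manager: same loop, typed entry point only.
public class CoroutineManager
{
    private readonly IEnumerator<ICoroutine> _enumerator;

    public CoroutineManager(IEnumerable<ICoroutine> workflow)
    {
        _enumerator = workflow.GetEnumerator();
    }

    private void Yielded()
    {
        if (!_enumerator.MoveNext())
            return;

        var next = _enumerator.Current;
        next.Yielded = Yielded;   // the step calls this when it is done
        next.Yield();             // start the step
    }

    public static void Begin(IEnumerable<ICoroutine> workflow)
    {
        new CoroutineManager(workflow).Yielded();
    }
}

// Hypothetical step: records its name, then signals completion right away.
public class LogStep : ICoroutine
{
    private readonly string _name;
    private readonly List<string> _log;

    public LogStep(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public Action Yielded { get; set; }

    public void Yield()
    {
        _log.Add("run " + _name);
        Yielded();
    }
}

public static class ManagerDemo
{
    private static IEnumerable<ICoroutine> Workflow(List<string> log)
    {
        yield return new LogStep("first", log);
        log.Add("between steps");   // runs only after "first" has yielded
        yield return new LogStep("second", log);
        log.Add("done");
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        CoroutineManager.Begin(Workflow(log));
        return log;
    }

    public static void Main()
    {
        foreach (var line in Run())
            Console.WriteLine(line);
    }
}
```

The log reads "run first", "between steps", "run second", "done". Note that a step which completes synchronously re-enters the manager from inside its own Yield call, so the call stack grows with every such step; with truly asynchronous steps the stack unwinds between callbacks.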

So how does it look? Let's make two helper classes: one that is generic and can handle any type of action, and another that is specific to our random number service.

Here's what the generic action coroutine looks like:

public class ActionCoroutine : ICoroutine 
{
    private readonly bool _immediate;

    public Action Execute { get; set; }

    public ActionCoroutine()
    {
        
    }

    public ActionCoroutine(bool immediate)
    {
        _immediate = immediate;
    }

    public ActionCoroutine(Action action)
    {
        _immediate = false;
        Execute = action;
    }

    public ActionCoroutine(Action action, bool immediate)
    {
        _immediate = immediate;
        Execute = action;
    }

    public void Yield()
    {
        Execute();
        if (_immediate)
        {
            Yielded();
        }
    }

    public Action Yielded { get; set; }  
}

Again, not much to it. I can either pass an action and set immediate to true, in which case the workflow continues as soon as the action returns, or pass an action and leave immediate as false (the default). If immediate is false, then I need to wire in something to call Yielded (remember, our coroutine manager sets this up for us to re-enter the iterator state machine). I'll show you usage in a second. Finally, here is my random number service helper:
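The difference between the two modes is easier to see without the manager in the way. The sketch below uses a trimmed copy of ActionCoroutine (two-argument constructor only) and sets Yielded by hand, which the manager would normally do. ActionCoroutineDemo and the completed delegate are hypothetical stand-ins; completed plays the role of an event such as Storyboard.Completed.

```csharp
using System;

public interface ICoroutine
{
    void Yield();
    Action Yielded { get; set; }
}

// Trimmed copy of ActionCoroutine: two-argument constructor only.
public class ActionCoroutine : ICoroutine
{
    private readonly bool _immediate;

    public Action Execute { get; set; }
    public Action Yielded { get; set; }

    public ActionCoroutine(Action action, bool immediate)
    {
        Execute = action;
        _immediate = immediate;
    }

    public void Yield()
    {
        Execute();
        if (_immediate)
        {
            Yielded();
        }
    }
}

public static class ActionCoroutineDemo
{
    // Returns two flags: had the continuation run right after Yield(),
    // and had it run after the simulated completion event fired?
    public static bool[] Run(bool immediate)
    {
        bool continued = false;
        Action completed = null;   // stand-in for an event such as Storyboard.Completed

        var step = new ActionCoroutine(() => { }, immediate);
        step.Yielded = () => continued = true;   // normally set by the manager

        if (!immediate)
            completed += () => step.Yielded();   // deferred: the event resumes the workflow

        step.Yield();
        bool afterYield = continued;

        if (completed != null)
            completed();                         // the "event" fires some time later

        return new[] { afterYield, continued };
    }

    public static void Main()
    {
        var now = Run(true);
        var later = Run(false);
        Console.WriteLine("immediate: {0}, {1}", now[0], now[1]);
        Console.WriteLine("deferred:  {0}, {1}", later[0], later[1]);
    }
}
```

With immediate set to true, the continuation has already run by the time Yield returns. With immediate set to false, nothing happens until the completion handler calls Yielded, and if nothing ever calls it the workflow simply stops there.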

public class RandomResultCoroutine : ICoroutine
{
    private readonly RandomNumberService _service;

    public RandomResultCoroutine(RandomNumberService service)
    {
        _service = service;
    }

    public int Max { get; set; }

    public int Result { get; set; }

    public void Yield()
    {
        _service.GetRandomNumber(Max,
                                 result =>
                                     {
                                         Result = result;
                                         Yielded();
                                     });
    }

    public Action Yielded { get; set; }
}

Notice how this helper "wires itself." It has a max setting, calls the random number service, and tells the random number service to call back into a lambda expression. The expression stores the returned result, then invokes Yielded to re-enter the state machine flow.

This is an example where we can make it work, but a more robust solution will have to handle the exceptional cases. For example, what if the service never calls my action? Then I'm in a bad state because Yielded will never get executed and the workflow will never continue. That's why having timeouts and other checks and balances is important for a production-ready solution.
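One way to guard against a callback that never arrives is to wrap a step in a timeout. What follows is only a sketch under stated assumptions, not something from the download: TimeoutCoroutine, NeverCompletes and TimeoutDemo are hypothetical names, and the wrapper uses System.Threading.Timer, so when the timeout wins the workflow resumes on a thread-pool thread rather than the UI thread. A real Silverlight version would have to marshal back through the dispatcher.

```csharp
using System;
using System.Threading;

public interface ICoroutine
{
    void Yield();
    Action Yielded { get; set; }
}

// Hypothetical wrapper: resumes the workflow when the inner step finishes
// or when the timeout elapses, whichever happens first.
public class TimeoutCoroutine : ICoroutine
{
    private readonly ICoroutine _inner;
    private readonly TimeSpan _timeout;
    private Timer _timer;
    private int _finished;   // 0 = still waiting, 1 = already resumed

    public TimeoutCoroutine(ICoroutine inner, TimeSpan timeout)
    {
        _inner = inner;
        _timeout = timeout;
    }

    public bool TimedOut { get; private set; }

    public Action Yielded { get; set; }

    public void Yield()
    {
        _finished = 0;
        TimedOut = false;
        _inner.Yielded = () => Finish(false);

        // Create the timer stopped, then start it, so _timer is assigned
        // before the callback can possibly run.
        _timer = new Timer(_ => Finish(true), null, Timeout.Infinite, Timeout.Infinite);
        _timer.Change(_timeout, Timeout.InfiniteTimeSpan);

        _inner.Yield();
    }

    private void Finish(bool timedOut)
    {
        // Only the first caller (inner completion or timer) may resume.
        if (Interlocked.Exchange(ref _finished, 1) != 0)
            return;

        TimedOut = timedOut;
        _timer.Dispose();
        Yielded();
    }
}

// Simulates a service call that never invokes its callback.
public class NeverCompletes : ICoroutine
{
    public Action Yielded { get; set; }

    public void Yield()
    {
    }
}

public static class TimeoutDemo
{
    public static bool Run()
    {
        var step = new TimeoutCoroutine(new NeverCompletes(), TimeSpan.FromMilliseconds(50));
        using (var done = new ManualResetEventSlim())
        {
            step.Yielded = done.Set;   // normally wired by the manager
            step.Yield();
            done.Wait(TimeSpan.FromSeconds(5));
            return step.TimedOut;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Timed out: " + Run());
    }
}
```

After the wrapped step is yielded, the workflow can inspect TimedOut and decide whether to retry or bail out. The sketch does not protect against a late callback from an attempt that already timed out arriving while the same wrapper is being yielded again, so it is safest to create a new wrapper for each attempt.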

OK, we've set up our simple helpers and coroutine manager, so let's see it in action. I'm just going to do everything in the code-behind for the main page to keep it simple. I'll set up three workflows. Two will generate shapes (circles and squares) and then feed the shapes to the third workflow, which animates colors. This means we'll actually have dozens of workflows running simultaneously, but the steps within each workflow will still fire sequentially.

Take a look at our color workflow (it has as many iterations as there are seconds in a day, just to keep it going for you to watch):

private IEnumerable<ICoroutine> ColorWorkflow(Shape element)
{
    for (int x = 0; x < 60 * 60 * 24; x++)
    {
        var randomAction = new RandomResultCoroutine(_service) { Max = 128 };

        yield return randomAction;

        int a = randomAction.Result + 128;

        randomAction.Max = 255;
        yield return randomAction;

        int r = randomAction.Result;

        yield return randomAction;

        int g = randomAction.Result;

        yield return randomAction;

        int b = randomAction.Result;

        var color = new Color {A = (byte) a, R = (byte) r, G = (byte) g, B = (byte) b};

        var fromColor = color; 

        var storyboard = new Storyboard();
        if (element.Fill != null && element.Fill is SolidColorBrush)
        {
            fromColor = ((SolidColorBrush) element.Fill).Color; 
        }
        var colorAnimation = new ColorAnimation {Duration = TimeSpan.FromSeconds(2), From = fromColor, To = color};
        Storyboard.SetTarget(colorAnimation, element);
        Storyboard.SetTargetProperty(colorAnimation, new PropertyPath("(Shape.Fill).(SolidColorBrush.Color)"));

        var storyboardAction = new ActionCoroutine(storyboard.Begin, false);
        
        storyboard.Completed += (o, e) =>
                                    {
                                        element.Fill = new SolidColorBrush(color);
                                        ((Storyboard) o).Stop();
                                        storyboardAction.Yielded();
                                    };

        yield return storyboardAction; 
    }

    yield break; 
}

Notice how straightforward it is. With our direct random number service helper, we can keep yield returning results. We will only go past the yield statement when the service call actually returns, and we can inspect the result because of the Result property we added to the helper, which is only set when the result is received and before the state machine is re-entered by calling Yielded.

What's nice about implementing the ICoroutine interface is that all you have to do to repeat the call and get a new result is simply yield the same helper instance again. The manager calls into it, the workflow pauses (without blocking the thread) until a result is received, and then execution continues with the new value available.
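Here is a small console sketch of yielding the same helper repeatedly. CountingService and NumberCoroutine are hypothetical stand-ins for the random number service and its helper: the service hands back 1, 2, 3 and so on (capped at max) and calls back synchronously, which is an assumption made purely so the output is predictable. The manager is a trimmed copy of the one shown earlier.

```csharp
using System;
using System.Collections.Generic;

public interface ICoroutine
{
    void Yield();
    Action Yielded { get; set; }
}

// Trimmed copy of the manager from earlier in the post.
public class CoroutineManager
{
    private readonly IEnumerator<ICoroutine> _enumerator;

    public CoroutineManager(IEnumerable<ICoroutine> workflow)
    {
        _enumerator = workflow.GetEnumerator();
    }

    private void Yielded()
    {
        if (!_enumerator.MoveNext())
            return;

        var next = _enumerator.Current;
        next.Yielded = Yielded;
        next.Yield();
    }

    public static void Begin(IEnumerable<ICoroutine> workflow)
    {
        new CoroutineManager(workflow).Yielded();
    }
}

// Hypothetical stand-in for the random number service.
public class CountingService
{
    private int _next;

    public void GetNumber(int max, Action<int> callback)
    {
        _next++;
        callback(Math.Min(_next, max));
    }
}

// Same shape as RandomResultCoroutine, pointed at the stand-in service.
public class NumberCoroutine : ICoroutine
{
    private readonly CountingService _service;

    public NumberCoroutine(CountingService service)
    {
        _service = service;
    }

    public int Max { get; set; }
    public int Result { get; private set; }
    public Action Yielded { get; set; }

    public void Yield()
    {
        _service.GetNumber(Max, result =>
        {
            Result = result;   // store the result first...
            Yielded();         // ...then resume the workflow
        });
    }
}

public static class ReuseDemo
{
    private static IEnumerable<ICoroutine> Workflow(List<int> results)
    {
        var number = new NumberCoroutine(new CountingService()) { Max = 100 };

        for (int i = 0; i < 3; i++)
        {
            yield return number;          // same instance every time
            results.Add(number.Result);   // a fresh value after each yield
        }
    }

    public static List<int> Run()
    {
        var results = new List<int>();
        CoroutineManager.Begin(Workflow(results));
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run().ConvertAll(n => n.ToString()).ToArray()));
    }
}
```

Each pass through the loop reads a different Result (1, then 2, then 3) even though only one helper object is ever created, because the manager calls Yield on it again every time it is yielded.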

For the storyboard, we use the generic action coroutine. The action is set to kick off the storyboard, and when the storyboard ends we set the new color and stop the storyboard. In this case I also wire in a call to Yielded and set the "immediate" flag to false because we're depending on the storyboard's completion to continue the workflow.

The square and circle workflows are exactly the same, so I'll just show the square one here (probably means they could be refactored to something simpler, too, but it works for this demonstration).

private IEnumerable<ICoroutine> SquareWorkflow()
{            
    var randomAction = new RandomResultCoroutine(_service) {Max = 20};

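    // The manager calls Yield() for us once the helper is yielded; calling it
    // here as well would start a second request and resume the workflow twice.
    yield return randomAction;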

    int iterations = randomAction.Result + 5;

    for (int x = 0; x < iterations; x++)
    {
        var rectangle = new Rectangle();
        rectangle.SetValue(NameProperty, Guid.NewGuid().ToString());

        randomAction.Max = 100; 
        yield return randomAction;
        
        var size = randomAction.Result + 10;

        rectangle.Width = size;
        rectangle.Height = size;

        yield return randomAction;

        int left = randomAction.Result;

        yield return randomAction;

        int top = randomAction.Result; 

        rectangle.Margin = new Thickness(left, top, 0, 0);

        var loadedSquare = new ActionCoroutine(() => LayoutRoot.Children.Add(rectangle), false);                                   
        rectangle.Loaded += (o, e) => loadedSquare.Yielded();

        yield return loadedSquare;

        CoroutineManager.Begin(ColorWorkflow(rectangle));
    }

    yield break;
}

This time we get a random number of squares and begin setting them up. We have a random size for the squares. Note we use the generic action coroutine to fire adding the square and loading it, and only when it is loaded do we kick off the color workflow to begin animating the colors on the square. This same workflow is repeated for circles.

You see how easy it is to kick off the routine? In fact, to kick off the main workflows, we simply do this:

public MainPage()
{
    InitializeComponent();
    Loaded += MainPage_Loaded;
}

void MainPage_Loaded(object sender, RoutedEventArgs e)
{
    CoroutineManager.Begin(SquareWorkflow());
    CoroutineManager.Begin(CircleWorkflow());
}

With those kicked off, we have two top-level workflows running side by side. Within each workflow the shapes are added and animated sequentially (as opposed to popping in all at once, as would happen if the calls were fired without waiting on each other), but the two workflows run in parallel with each other. Once a shape is loaded, its color workflow keeps animating that element and doesn't kick off a new color until the old storyboard is complete.
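The same interleaving can be reproduced in a console app. In the sketch below a plain queue stands in for the UI dispatcher, which is an assumption made so the example runs anywhere: QueuedStep (a hypothetical helper) parks its continuation in the queue, and the loop in Run pumps the queue the way the UI thread would deliver callbacks. InterleaveDemo is likewise a made-up name, and the manager is a trimmed copy of the one shown earlier.

```csharp
using System;
using System.Collections.Generic;

public interface ICoroutine
{
    void Yield();
    Action Yielded { get; set; }
}

// Trimmed copy of the manager from earlier in the post.
public class CoroutineManager
{
    private readonly IEnumerator<ICoroutine> _enumerator;

    public CoroutineManager(IEnumerable<ICoroutine> workflow)
    {
        _enumerator = workflow.GetEnumerator();
    }

    private void Yielded()
    {
        if (!_enumerator.MoveNext())
            return;

        var next = _enumerator.Current;
        next.Yielded = Yielded;
        next.Yield();
    }

    public static void Begin(IEnumerable<ICoroutine> workflow)
    {
        new CoroutineManager(workflow).Yielded();
    }
}

// Hypothetical step: completes only when the queue is pumped, mimicking
// a callback that arrives later on the UI thread.
public class QueuedStep : ICoroutine
{
    private readonly Queue<Action> _pending;

    public QueuedStep(Queue<Action> pending)
    {
        _pending = pending;
    }

    public Action Yielded { get; set; }

    public void Yield()
    {
        _pending.Enqueue(Yielded);
    }
}

public static class InterleaveDemo
{
    private static IEnumerable<ICoroutine> Workflow(string name, Queue<Action> pending, List<string> log)
    {
        for (int i = 1; i <= 2; i++)
        {
            yield return new QueuedStep(pending);
            log.Add(name + " " + i);
        }
    }

    public static List<string> Run()
    {
        var pending = new Queue<Action>();
        var log = new List<string>();

        CoroutineManager.Begin(Workflow("square", pending, log));
        CoroutineManager.Begin(Workflow("circle", pending, log));

        // Pump the fake dispatcher until both workflows have finished.
        while (pending.Count > 0)
            pending.Dequeue()();

        return log;
    }

    public static void Main()
    {
        foreach (var line in Run())
            Console.WriteLine(line);
    }
}
```

The log comes out as "square 1", "circle 1", "square 2", "circle 2": each workflow stays in order internally while the two take turns, and everything happens on a single thread.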

Here it is to play with. Due to a bug (can you find it?) the storyboard animation runs and completes but we only see the color change when it's done. That's OK because you can pick any shape on the surface and count 2 seconds and you'll notice the color changes every 2 seconds ... for every shape, proof that we have simultaneous sequential workflows all running asynchronously with each other. We also may see some CPU and memory issues over time because I'm not unhooking events.

Hopefully this helped simplify it a bit for those of you who were having trouble with the last post or wanted to see the innards of a framework so you can begin to build your own infrastructure.


Jeremy Likness

8 comments:

  1. Excellent posts Jeremy. I like the CoroutineManager, a bit less clunky than newing up a class every time. Rob's session was worth the price of admission to MIX alone.

    Joe

  2. Great posts, Jeremy! I also like coroutines, which I first saw in Rob's talk.

    But then I started to compare it with a background thread, that uses blocking calls sequentially and synchronizes the results back with the dispatcher.

    And somehow, I can't find real advantages of the coroutine solution any more.

    What are your thoughts on that?

    Thanks, best regards
    Andreas

  3. That was the point of my post: to show how these can extend and make a workflow easier to read. The background thread is a perfect example. You could easily create a helper routine that implements ICoroutine and internally creates a background worker. It would register for the RunWorkerCompleted event and call Yielded when done, and you'd have the advantage of using the thread for the work but the easy readability of the workflow.

    The problem that coroutines are designed to address is multiple sequential asynchronous calls. I believe with a background worker you'd have to spawn it, register for completed, then spawn the next, etc. This is a way to streamline by hiding the hooks of spawning/waiting in the workflow. Does that make sense?

    A coroutine in and of itself is simply a specialized state machine. It's what you do with it that makes the real "final" framework.

  4. Hi!
    I'm very interested in the topic... :)

    So I posted this
    http://social.msdn.microsoft.com/Forums/en/rx/thread/c51ef077-4433-4308-9c3d-fcafc90905da

    to get some info to achieve something similar with Reactive Framework!

    My compliments, I look forward to see async calls programming simplified! :)

    Thanks Andrea!

  5. Hi Jeremy,

    This was an excellent simplification of the use of iterators for coroutines; the plumbing is something that really shouldn't be seen by someone other than the framework developer; however, it is very powerful.

    What is really awesome about this pattern is that anyone who has written a custom iterator could create a 'workflow' which is easy to: a) read; and b) maintain. (Not to mention the fact that the compiler does a lot of the 'grunt' work for this technique.)

    I think this is a very readable way of developing code which is asynchronous --most specifically when you have no choice but to use asynchronous calls (i.e. Silverlight) and you have cases where you *need* to serialize (I have to clarify 'serialize' by meaning that each asynchronous event occurs, in turn, one after another, while still allowing all things to go on) all the steps within an asynchronous workflow and allow them to be represented within one function.

    As Andrea pointed out, there is the Rx (reactive) framework; which is a very cool idea, but to read some of that code is so lambda, lambda, lambda it makes me omega moo! Sure, it's easy if you wrote the code; however, if it is conceptually difficult from the start, it is difficult to get others to use it. Not only that, I imagine that others who need to 'maintain' it later will hope they can easily find the bug or better yet, the spot in which they need to catch the next 'Shift+Ctrl+MouseMove' event.

    Thank you for 'simplifying' the original post, I actually grok what is happening from this example; it makes a lot of sense and I can start working from what you have provided.

    Again, thank you very much Jeremy, this is some really helpful stuff and I very much appreciate it, you rock! (another bookmark added for this post and you got the two toes up!)

  6. One scenario that this solution doesn't cope with (nor does Rob's, for that matter) is the exceptional code path.

    When yield is used to create an IEnumerable the GetEnumerator returns an object that implements both IEnumerator and IDisposable. In the nominal path the IDisposable aspect can be ignored, however, in the exceptional path the object needs to be disposed.

  7. Yes, that is why I called out this example is easier to understand, perhaps, but less complete ... and hence why part 1 focuses on the AsyncEnumerator which has most of those mechanisms built-in "out of the box." Thanks!

  8. Hi Jeremy,
    Thanks for this wonderful post.
    I am currently facing a problem in my design which is somewhat similar to the case you have discussed here. See if you could help me in arriving at one of the design options below:

    I am calling a workflow service hosted in IIS from a windows service.

    This workflow service has to process data for 5000 orders within 4 minutes.
    This system should be high performing and scalable.
    I can foresee 4 options to implement this:-

    1. Have a FOR loop in windows service and create workflow service instance for each order processing in sequential way.

    Demerit of this approach -
    a) Less performing b) More calls to IIS
    Merit of this approach - If I setup webgarden or load balanced web farm at later stages, system will scale easily without any code

    2. Have a FOR loop in windows service and create one workflow service instance for each order processing in an asynchronous way.


    Merit of this approach - More performing than approach number 1
    Demerit of this approach - a) More asynchronous calls to the workflow service might result in "not able to process or request refused" from the IIS side.
    b) More calls to IIS

    3. Create multiple threads in windows service, each thread will create a workflow service instance and pass on 1(one) order to it for processing to each workflow instance

    Merit - Performance and scalability will be the best
    Demerit - a) CPU utilisation might cross danger level due to so many instance creations/deletions, which may lead to performance degradation
    b) Memory utilisation might cross danger level

    4. Create a single instance of workflow service in windows service and pass on all the order numbers to it. Workflow service instance will process these orders one by one

    Merit - Only one call to IIS

    Demerit -a) Less performing b) Less scalable as only one workflow service is processing all orders

    Which approach do you suggest to follow (it can be different from the above mentioned approaches), assuming both performance and scalability are important to me?

    Thanks again
    Bhaskar
