Wednesday, September 30, 2009

Silverlight/Prism ViewModel and DelegateCommand

In yesterday's post about Decoupled ChildWindow Dialogs in Silverlight using Prism, I demonstrated a way to use EventAggregator to decouple the implementation of a dialog from the code that requires the confirmation. In one example, I showed a code-behind click event that fired off the process, something like this:

private void Button_Delete_Click(object sender, System.Windows.RoutedEventArgs e)
{
    Button src = e.OriginalSource as Button;
    if (src != null)
    {
        _eventService.GetEvent<MessageBoxEvent>().Publish(
            MessageBoxPayload.GeneratePayload(src.DataContext, true, "Please Confirm", "Are you sure you wish to delete this item?",
                                      DeleteItem));
    }
}    

This, in my opinion, is way too much information for a view to know and understand. It has to know about my event aggregator service? Is that really necessary?

If you are not familiar with the Model-View-ViewModel (MVVM) pattern, you'll want to Google/research it now. It is important because a lot of the marshalling of data will happen in the ViewModel, and the view will simply be bound to the view model.

Prism provides a commanding interface that works well with buttons. You can use an attached property to bind the click to a command (which is in turn a DelegateCommand). The command will disable the button if it can't execute and fire when the button is clicked. The button becomes a binding, like this:

...
<Button Commands:Click.Command="{Binding SearchCommand}" .../> 
...

In the view model, the SearchCommand is defined like this:

...
public DelegateCommand<object> SearchCommand { get; set; }
...
SearchCommand = new DelegateCommand<object>(o => 
   _service.GetEventService().GetEvent<SearchEvent>().Publish(
      SearchCriteria),o => _CanSearch());

In this example, we are assuming the form fields are bound to a search criteria object in the view model. Presumably once the criteria meet our validation requirements, _CanSearch will return true, the button will be enabled, and the command will publish an event to begin the search using the populated criteria entity.
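
To make the commanding mechanics concrete, here is a stripped-down sketch of the idea behind DelegateCommand<T>. This is not Prism's actual class (the real one implements ICommand and raises CanExecuteChanged so bound buttons can refresh their enabled state); it just shows the execute/can-execute pairing:

```csharp
using System;

// Minimal sketch of the DelegateCommand<T> idea: pair an action with a
// predicate that decides whether the action may run. Not Prism's code.
public class SimpleDelegateCommand<T>
{
    private readonly Action<T> _execute;
    private readonly Func<T, bool> _canExecute;

    public SimpleDelegateCommand(Action<T> execute, Func<T, bool> canExecute)
    {
        _execute = execute;
        _canExecute = canExecute;
    }

    public bool CanExecute(T parameter) { return _canExecute(parameter); }

    public void Execute(T parameter)
    {
        // Guard so the action only fires when the predicate allows it.
        if (CanExecute(parameter)) { _execute(parameter); }
    }
}

public class Program
{
    public static void Main()
    {
        bool searched = false;
        bool criteriaValid = false;

        SimpleDelegateCommand<object> searchCommand =
            new SimpleDelegateCommand<object>(o => searched = true, o => criteriaValid);

        searchCommand.Execute(null);   // ignored: CanExecute is false
        criteriaValid = true;          // criteria now pass validation
        searchCommand.Execute(null);   // fires the search delegate

        Console.WriteLine(searched);   // True
    }
}
```

In the real view model, the two lambdas would be the event publication and _CanSearch, exactly as in the snippet above.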

While I intend to dig more in depth in that pattern, I wanted to present something I believe is a common issue and describe how I addressed it. In this case, the view model contains the search criteria as well as a collection of objects for the results grid. This becomes problematic with binding, because my grid row is bound to the item in the list, not the view model itself. Unfortunately, Silverlight does not support (to my knowledge) the relative binding syntax available in WPF, so it can be troublesome to try to bind the click in the grid to a command.

If there is a simple way to do this I'd love to learn more, but any examples I've seen were fairly complex to maintain the "purity" of leaving everything out of the code behind. Me? I'm a bit more pragmatic. I'd rather go ahead and bind the click event (I know, some of you are shuddering ... that means a code behind ...) like I did above. However, there is a compromise!

Instead of making my view aware of the event aggregator service, I can do a bit better. The view model already knows about the service. Let's assume we could bind directly to a command. We'll create a command like this, in the view model:

public DelegateCommand<MyEntity> DeleteCommand { get; set; }

We can then wire the command to publication of a delete event:

DeleteCommand = new DelegateCommand<MyEntity>(
                _service.GetEventService().GetEvent<DeleteMyEntityEvent>().Publish, o => true);

Notice we don't supply parameters to Publish. That's because our event is already wired to act on an instance of MyEntity, and so the event will fire and pass that entity as a parameter.
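
The reason the bare method group works is C#'s method group conversion: Publish matches the Action<MyEntity> signature the DelegateCommand constructor expects, so the command parameter flows straight through to it. A standalone illustration (Bus is a hypothetical stand-in, not a Prism type):

```csharp
using System;
using System.Collections.Generic;

// Illustrates the method group conversion that makes 'Publish'
// (no parentheses, no lambda) a valid Action<T> argument.
public class Bus
{
    public readonly List<string> Log = new List<string>();
    public void Publish(string message) { Log.Add(message); }
}

public class Program
{
    public static void Main()
    {
        Bus bus = new Bus();

        // The compiler converts the method group to the delegate type,
        // so whatever argument the delegate receives goes to Publish.
        Action<string> command = bus.Publish;
        command("delete requested");

        Console.WriteLine(bus.Log[0]);  // delete requested
    }
}
```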

Now we can clean up our code behind to look like this:

private void Button_Delete_Click(object sender, System.Windows.RoutedEventArgs e)
{
    Button src = e.OriginalSource as Button; 
    if (src != null)
    {
        MyEntity entity = src.DataContext as MyEntity;
        if (entity != null)
        {
            if (_ViewModel.DeleteCommand.CanExecute(entity))
            {
                _ViewModel.DeleteCommand.Execute(entity);
            }
        }
    }
}

So, it's not "pure" in the sense that I do end up with code behind ... but ask me, do I really care? Is it all that bad? The view is aware of the view model, so why not allow it to respond to what views do well (view events) and marshal them to the view model? The implementation of the command is still hidden from the view.

With some clever attached properties, we could also make the button disable if CanExecute is false, and because of data virtualization we'd only be evaluating it for the grid items that are in the view.

Jeremy Likness

Tuesday, September 29, 2009

Decoupled ChildWindow Dialogs with Prism in Silverlight 3

A common user interface component is the confirmation or message box, which is often presented as a dialog that returns a boolean (OK/Cancel). There are a variety of ways to achieve this, but how can you decouple the implementation of the popup from the request itself? This is necessary, for example, for unit testing when you may not have a UI available. This article demonstrates one solution using Silverlight and the Composite Application Guidance library, also known as Prism.

The first piece we are going to build is a payload that allows us to deliver the popup message. The payload looks like this:

public class MessageBoxPayload
{
    public object DataContext { get; set; }

    public bool AllowCancel { get; set; }

    public string Title { get; set; }

    public string Message { get; set; }

    public Action<MessageBoxPayload> ResultHandler { get; set; }

    public bool Result { get; set; }

    private MessageBoxPayload()
    {            
    }

    public static MessageBoxPayload GeneratePayload(object dataContext, bool allowCancel, 
        string title, string message, Action<MessageBoxPayload> resultHandler)
    {
        MessageBoxPayload retVal = new MessageBoxPayload {AllowCancel = allowCancel, Title = title, Message = message, 
            ResultHandler = resultHandler,
        DataContext = dataContext};
        return retVal; 
    }
}

Because this implementation is completely decoupled, we cannot make any assumptions about state. Therefore, the payload carries a data context that can be passed back to the requestor. This is done using an Action of the payload type, allowing the requestor to provide a method that gets called when the dialog closes.
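
Stripped of the Prism types, the mechanism is just a delegate riding along in the payload. A minimal, framework-free sketch (the names ConfirmPayload and "Order #42" are illustrative, not from the article's code):

```csharp
using System;

// Framework-free sketch of the callback-in-the-payload idea.
public class ConfirmPayload
{
    public object DataContext { get; set; }
    public bool Result { get; set; }
    public Action<ConfirmPayload> ResultHandler { get; set; }
}

public class Program
{
    public static void Main()
    {
        string deleted = null;

        ConfirmPayload payload = new ConfirmPayload
        {
            DataContext = "Order #42",
            // The requestor supplies the callback without knowing
            // who will eventually display the dialog.
            ResultHandler = p => { if (p.Result) { deleted = (string)p.DataContext; } }
        };

        // Whoever owns the dialog flips Result and invokes the handler.
        payload.Result = true;
        payload.ResultHandler(payload);

        Console.WriteLine(deleted);  // Order #42
    }
}
```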

Next, we'll define the event we use to request the message box. That is based on the event mechanism supplied by Prism:

public class MessageBoxEvent : CompositePresentationEvent<MessageBoxPayload>
{
}

Notice that we derive from the CompositePresentationEvent. This is a base class for all events that will participate in the event aggregator service. Now that we have our payload and our service defined, we can easily begin to publish to the event. If you had a data grid with a delete button, the event would look something like this (for simplicity's sake I'm not using DelegateCommand):

private void Button_Delete_Click(object sender, System.Windows.RoutedEventArgs e)
{
    Button src = e.OriginalSource as Button;
    if (src != null)
    {
        _eventService.GetEvent<MessageBoxEvent>().Publish(
            MessageBoxPayload.GeneratePayload(src.DataContext, true, "Please Confirm", "Are you sure you wish to delete this item?",
                                      DeleteItem));
    }
}      

public void DeleteItem(MessageBoxPayload payload)
{
    if (payload.Result) 
    {
       MyItem item = payload.DataContext as MyItem;
       Delete(item);
    }
}

For unit tests, you can now simply build an object that subscribes to the delete event and returns the desired results.

As you can see, the delete click wraps the message into a payload and publishes the event. A delegate is provided to call back to "DeleteItem" with the result of the dialog. If the user confirmed, then the data context is used to pull the entity for the given row and the delete command is performed.

The _eventService is defined like this:

...
private readonly IEventAggregator _eventService;
...

Because IEventAggregator is supplied as a parameter in the constructor for the view, it is automatically wired in by the dependency injection framework. There are two steps required to get the event aggregator to your objects.

First, in your bootstrapper, you'll want to register a single instance:

protected override void ConfigureContainer()
{
    base.ConfigureContainer();

    // provide a reference to the event aggregator
    Container.RegisterInstance<IEventAggregator>(Container.Resolve<EventAggregator>());

}

Notice that I use "Resolve" to get the instance. This is good practice when using a dependency injection/inversion of control framework. When you new up your objects yourself, you must select the appropriate constructor and inject the appropriate values. By calling Resolve, you ask the container to read the signature of the constructors and provide implementations based on your current configuration and registrations.
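
To see what that means in miniature, here is a toy container, nothing like Unity's real implementation, that resolves a type by reflecting over its constructor and supplying registered instances:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Toy illustration of what a container's Resolve does. All types here
// (IGreeter, TinyContainer) are hypothetical, not Unity APIs.
public interface IGreeter { string Greet(); }
public class Greeter : IGreeter { public string Greet() { return "hello"; } }

public class Consumer
{
    public readonly IGreeter Greeter;
    public Consumer(IGreeter greeter) { Greeter = greeter; }
}

public static class TinyContainer
{
    private static readonly Dictionary<Type, object> Instances = new Dictionary<Type, object>();

    public static void RegisterInstance<T>(T instance) { Instances[typeof(T)] = instance; }

    public static T Resolve<T>()
    {
        object existing;
        if (Instances.TryGetValue(typeof(T), out existing)) { return (T)existing; }

        // Read the constructor signature and satisfy each parameter
        // from the registrations: the heart of Resolve.
        ConstructorInfo ctor = typeof(T).GetConstructors()
            .OrderByDescending(c => c.GetParameters().Length)
            .First();
        object[] args = ctor.GetParameters()
            .Select(p => Instances[p.ParameterType])
            .ToArray();
        return (T)ctor.Invoke(args);
    }
}

public class Program
{
    public static void Main()
    {
        TinyContainer.RegisterInstance<IGreeter>(new Greeter());
        // Consumer was never registered; the container builds it by
        // inspecting its constructor and injecting the IGreeter.
        Consumer consumer = TinyContainer.Resolve<Consumer>();
        Console.WriteLine(consumer.Greeter.Greet());  // hello
    }
}
```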

The second step is to inject the service to the appropriate handler for the popup. Let's focus on implementing the actual popup. The easiest way to handle common infrastructure items is to create a common module. That module can contain elements that are shared between projects. First, we'll create a new child window and call it "Popup." The XAML looks like this (in my case, the project is Modules.Common and the Popup is in a subfolder called Views).

<controls:ChildWindow x:Class="Modules.Common.Views.Popup"
           xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation" 
           xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml" 
           xmlns:controls="clr-namespace:System.Windows.Controls;assembly=System.Windows.Controls"
           Width="Auto" Height="Auto" 
           Title="{Binding Title}">
    <Grid x:Name="LayoutRoot" Margin="2">
        <Grid.RowDefinitions>
            <RowDefinition Height="Auto" />
            <RowDefinition Height="Auto" />
        </Grid.RowDefinitions>
        <TextBlock TextWrapping="Wrap" Text="{Binding Message}" FontFamily="Arial" FontSize="12" TextAlignment="Center" Grid.Row="0"/>
        <Button x:Name="CancelButton" Content="Cancel" Click="CancelButton_Click" Width="75" Height="23" HorizontalAlignment="Right" Margin="0,12,0,0" Grid.Row="1" />
        <Button x:Name="OKButton" Content="OK" Click="OKButton_Click" Width="75" Height="23" HorizontalAlignment="Right" Margin="0,12,79,0" Grid.Row="1" />
    </Grid>
</controls:ChildWindow>

The main thing to note is the use of Auto to ensure the dialog resizes based on the content. Now for the code behind.

public partial class Popup
{   
    public Popup()
    {
        InitializeComponent();
    }

    public Popup(MessageBoxPayload payload)
    {
        InitializeComponent();
        DataContext = payload;
        CancelButton.Visibility = payload.AllowCancel ? Visibility.Visible : Visibility.Collapsed;
    }                

    private void OKButton_Click(object sender, RoutedEventArgs e)
    {
        DialogResult = true;
        MessageBoxPayload result = (MessageBoxPayload) DataContext;
        result.Result = true; 
        result.ResultHandler(result);
    }

    private void CancelButton_Click(object sender, RoutedEventArgs e)
    {
        DialogResult = false;
        MessageBoxPayload result = (MessageBoxPayload)DataContext;
        result.Result = false;
        result.ResultHandler(result);
    }
}

The key here is that we have a constructor that takes in the payload and sets the data context, then sets the visibility of the cancel button. Then there are events bound to the buttons that will set the appropriate result then call the handler specified for the dialog close event.

One confusing topic here may be how to get the view into the application. If this is truly a decoupled module, how will it "know" about the regions you've defined? Furthermore, even if you iterated the region collection and injected it to the first one you found, you will find a goofy, empty popup window just hanging out. Not exactly what we want! In order to manage this popup, we'll use a controller.

The popup controller looks like this:

public class PopupController
{
    public PopupController(IEventAggregator eventService)
    {
        eventService.GetEvent<MessageBoxEvent>().Subscribe(PopupShow);
    }

    public void PopupShow(MessageBoxPayload payload)
    {
        Popup popupWindow = new Popup(payload);
        popupWindow.Show();          
    }
}

Now we can set up our module to invoke the controller.

public class CommonModule : IModule
{               
    private readonly IUnityContainer _container;

    public CommonModule(IUnityContainer container)
    {
        _container = container;
    }

    public void Initialize()
    {
        _container.RegisterInstance(_container.Resolve(typeof (PopupController)));         
    }
}

Notice we aren't registering with a region. Instead, we simply resolve a single instance of the controller. This will subscribe to the popup event. Because we use the container to resolve the controller, the container will automatically reference the EventAggregator and inject it into the constructor. Last but not least, this module simply needs to get registered with the module catalog:

protected override IModuleCatalog GetModuleCatalog()
{
    ModuleCatalog catalog = new ModuleCatalog();
    catalog.AddModule(typeof (MySpecificModule));
    catalog.AddModule(typeof (CommonModule)); 
    return catalog; 
}

Again, because we registered the EventAggregator earlier, it will pass the container along to the module and inject the aggregator into the controller. Now, when we publish the event, a nice child window will appear for us:

[Image: the Silverlight popup child window]

Of course, you can extend this to have the controller manage multiple windows, customize the button text or add nice images as well. This is just one of the many ways that dependency injection and the event aggregator pattern can help separate the need (give me a response) from the implementation (show a popup) and provide an easy way to reuse components across applications.

Jeremy Likness

Saturday, September 26, 2009

A Twist on the Twist of the Observer Pattern

Thanks to those of you who read my Twist on the Observer pattern and gave me the feedback. You said,

"Hey, Jeremy, that's neat, but there is already a pattern established for what you're talking about, and a few great solutions ready to use. Besides, they are much, much more powerful..."

Thanks to Microsoft MVP Jason Rainwater for taking the time to give me an excellent explanation and for really delving into the inner workings of the solution.

The solution is the event aggregator pattern. For a good introduction, check out Jeremy Miller's brain dump on the topic.

PRISM/CAL comes with its own event aggregator. It supplies an IEventAggregator that you can wire into your dependency injection container, then reference throughout the project.

In my base service class (seen by all) I create an event like this:

...
public class EntitySelectedEvent : CompositePresentationEvent<MyEntity>
{
}
...

In the view model for the module that has to "wake up" when the entity is selected, I inject the aggregator into the constructor and then do this:

...
eventAggregator.GetEvent<EntitySelectedEvent>().Subscribe(u=>this.entity=u); 
...

When the entity is selected (in a completely different module), I can simply publish:

...
eventAggregator.GetEvent<EntitySelectedEvent>().Publish(entity);
...

That's it! One module listens for the selection and reacts, not caring where/who or how it is published (which means we can publish a test entity in our unit test and test the subscription mechanism) ... while another module publishes the event and doesn't care who is out there listening.
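
If you want to see the moving parts without pulling in Prism, the pattern can be boiled down to a few lines. This is a toy sketch only; Prism's version adds thread dispatch options, weak references, and subscription filters:

```csharp
using System;
using System.Collections.Generic;

// Toy version of the event aggregator pattern (not Prism's implementation).
public class PubSubEvent<TPayload>
{
    private readonly List<Action<TPayload>> _subscribers = new List<Action<TPayload>>();

    public void Subscribe(Action<TPayload> handler) { _subscribers.Add(handler); }

    public void Publish(TPayload payload)
    {
        foreach (Action<TPayload> handler in _subscribers) { handler(payload); }
    }
}

public class TinyEventAggregator
{
    private readonly Dictionary<Type, object> _events = new Dictionary<Type, object>();

    // One shared instance per event type: publisher and subscriber meet
    // here without ever referencing each other.
    public TEvent GetEvent<TEvent>() where TEvent : new()
    {
        object evt;
        if (!_events.TryGetValue(typeof(TEvent), out evt))
        {
            evt = new TEvent();
            _events[typeof(TEvent)] = evt;
        }
        return (TEvent)evt;
    }
}

public class EntitySelectedEvent : PubSubEvent<string> { }

public class Program
{
    public static void Main()
    {
        TinyEventAggregator aggregator = new TinyEventAggregator();
        string selected = null;

        // "Module A" subscribes...
        aggregator.GetEvent<EntitySelectedEvent>().Subscribe(e => selected = e);

        // ...and "Module B" publishes, with no reference to Module A.
        aggregator.GetEvent<EntitySelectedEvent>().Publish("Customer 7");

        Console.WriteLine(selected);  // Customer 7
    }
}
```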

For those of you who weren't familiar with the pattern or the implementation, I hope this helps get you excited and adds another element to your arsenal of coding techniques!

Jeremy Likness

Friday, September 25, 2009

Silverlight on IE6: Nagging Security Bug

Do you have a wonderful Silverlight application that you enjoy running in all of the glorious browser flavors available, only to find out that some weird quirky issue pops up in our old friend, Internet Explorer version 6.0 (IE6)?

Perhaps you were as puzzled as we were when IE6 would complain with a "Security warning: nonsecure items on secure page ... blah, blah" (even though we scoured the source and even ran a packet inspector and no non-secured pages were being requested). No matter what your reply, it would pop to some random page and show "page not found."

The solution is quite simple. Most Silverlight examples include an iframe to interact with for browser history, etc. It is just empty (doesn't link anywhere). Many people might include it without even really using it, but the interesting thing is that even though it points nowhere, IE6 automatically assumes that you are going somewhere insecure!

There are two solutions.

  • If you aren't using it, get rid of it. No iframe in the control that hosts it, and you're golden.
  • If you are using it, point it to a dummy page with a secure URL. Something like:
    ...
    <iframe src="https://myserver.com/transparentgif.GIF"...>
    ...
    
    should work just fine!

Jeremy Likness

IUnityContainer and the Rabbit Hole

I'm working on an exciting project using Silverlight and the PRISM framework/Composite Application Library. It is perfect for our use as a rich, interactive line of business application that has a lot of features. We want to be able to quickly and easily extend areas of functionality (which the dependency injection and region management features of the framework provide very nicely) and have concerns about performance and size of the plugin downloads (which the load-on-demand for modules satisfies quite well).

One thing I noticed, however, was that in some of my layers of abstractions I was creating distractions. Consider something like this:

public class MyModule
{

   private IService _myService;
   private IEntityBroker _myEntityBroker;

   public MyModule(IUnityContainer container) 
   {
      _myService = container.Resolve<IService>();
      _myEntityBroker = container.Resolve<IEntityBroker>(); 
   }
   ...
}

Can you see what's wrong with this right away? Does it leave you with a sinking feeling that we're not in Kansas anymore?

Fundamentally, the problem is that injecting the container was easy, and then it makes sense to resolve my references, but now I'm violating several principles of sound design. Primarily, why should my module even know or care that there is a dependency injection framework? That is the whole point of having a bootstrapper ... someone "up there" can see the land and wire everything in. Those of us "down here" just need to focus on what we do and the minimal tools we need to do it with.

In other words, what I really need in my module is a service and an entity broker. Forget about the container.

So now I simply removed the references to the unity framework from my module. I refactored the constructor to look like this instead:

...
public MyModule(IService service, IEntityBroker broker)
{
   _myService = service;
   _myEntityBroker = broker;
}

There, isn't that nicer? Now I don't have to have an inversion of control framework. I can just as easily send in a concrete instance, use a factory, or whatever else I want. If I do happen to use a framework, it will happily resolve the interfaces for me when I ask for an instance because it recognizes the signature of the constructor.
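
The payoff shows up immediately in a unit test: with the container gone, stubs can be handed straight to the constructor. A hypothetical sketch (all names are illustrative stand-ins):

```csharp
using System;

// With the container out of the constructor, a test supplies stubs directly.
public interface IService { string Fetch(); }
public interface IEntityBroker { }

public class StubService : IService
{
    public string Fetch() { return "stub data"; }
}

public class StubBroker : IEntityBroker { }

public class MyModule
{
    private readonly IService _myService;
    private readonly IEntityBroker _myEntityBroker;

    public MyModule(IService service, IEntityBroker broker)
    {
        _myService = service;
        _myEntityBroker = broker;
    }

    public string Run() { return _myService.Fetch(); }
}

public class Program
{
    public static void Main()
    {
        // No container, no registration: just plain construction.
        MyModule module = new MyModule(new StubService(), new StubBroker());
        Console.WriteLine(module.Run());  // stub data
    }
}
```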

Unity is a great tool, but just be careful how far down the rabbit hole you decide to shove it. A great design will define all of the interfaces and components everything needs, and choose how to inject the concrete implementations later in the game. Never dirty your classes with knowledge of how they are being managed.

Jeremy Likness

Wednesday, September 23, 2009

A Twist on the Observer Pattern

The observer pattern is well established and widely used. The typical scenario is to register with a class and then allow your Notify method to be called. This often involves keeping an internal list of observers and then iterating them to notify that something has changed. I had a situation recently that warranted a lighter weight approach that allows a many-to-many observation (multiple listeners, multiple objects to observe).
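
For reference, the "typical scenario" just described, a subject holding its own observer list and iterating it, can be sketched like this (a minimal, hypothetical example):

```csharp
using System;
using System.Collections.Generic;

// Classic observer pattern: the subject keeps a list of observers
// and iterates it to announce changes.
public interface IObserver { void Notify(string change); }

public class RecordingObserver : IObserver
{
    public readonly List<string> Received = new List<string>();
    public void Notify(string change) { Received.Add(change); }
}

public class Subject
{
    private readonly List<IObserver> _observers = new List<IObserver>();

    public void Register(IObserver observer) { _observers.Add(observer); }

    public void SomethingChanged(string change)
    {
        foreach (IObserver observer in _observers) { observer.Notify(change); }
    }
}

public class Program
{
    public static void Main()
    {
        Subject subject = new Subject();
        RecordingObserver observer = new RecordingObserver();
        subject.Register(observer);

        subject.SomethingChanged("data loaded");
        Console.WriteLine(observer.Received[0]);  // data loaded
    }
}
```

Note that the subject must track every observer itself, which is exactly the bookkeeping the lighter-weight approach below avoids by leaning on .NET events.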

I was working with the Composite WPF/PRISM framework and had an interesting situation. One of my regions is implemented as a TabControl, so modules that register with the RegionManager inject their views into the tabs. This is a very powerful model because now I can add as many modules/views as I like and not worry about the implementation details of wiring up the tab control. I created a view interface IViewable to implement with my controls that exposes the name of the view along with a boolean indicating whether the view should be enabled (this allows me to restrict going to certain tabs if they depend on data from another tab).

The issue was figuring out the best way to handle the tab focused event. By default, the views are wired up and ready to go when you click the tab. I tried searching for a solution to allowing my view to know when it was "active" but didn't find much that was satisfactory. It could be I'm completely off on my approach because there is some pattern already existing, but I decided to go with something very lightweight ... even lighter than the observer pattern. I don't want to know the details of what the other views or modules are, I just want to know when I'm getting focus and that is controlled by my parent container (I also shouldn't know that my parent container is a TabControl - it could be any ItemsControl for that matter).

The pattern was quite simple, really. First, I created an interface:

public interface IObservableHelper 
{
   void Notify(object obj); 
   event EventHandler<EventArgs> Notified; 
}

The implementation was easy too:

public class ObservableHelper : IObservableHelper
{
   public void Notify(object obj) 
   {
      if (Notified != null) { Notified(obj, EventArgs.Empty); }
   }

   public event EventHandler<EventArgs> Notified;
}

See, very easy. I don't have to keep track of the objects observing because objects who want to know about this event simply register to the event. In my boot strapper class I make sure there is exactly one instance of the helper:

...
Container.RegisterInstance<IObservableHelper>(new ObservableHelper()); 
...

With the dependency injection, the helper can be injected simply by asking for it in the constructor. That means in my view, I can do this:

public MyView(IObservableHelper helper)
{
   helper.Notified += (obj,args) =>
   {
      if (obj == this) { DoSomething(); }
      else if (obj is ISomeInterface) { DoSomethingElse((ISomeInterface)obj); }
   }; 
}

// observe "me" 
public void DoSomething()
{
} 

// observe "ISomeInterface" 
public void DoSomethingElse(ISomeInterface someInterface)
{
}

I've just registered to, eh, observe "myself." That's OK ... as a view, what I really want to know is when I am activated. So in the shell, I simply tap into the selection changed event for the TabControl. That handler resolves IObservableHelper and then calls Notify with the element that was selected. Notified will fire, I'll see it's "me" and then call DoSomething. Very lightweight and decoupled ... and if there are other objects participating, I can interact with them as well!

Jeremy Likness

More Confusion over "by ref" versus "Reference Types"

Apparently my prior blog post about "by ref" versus "reference type" caused quite a stir and a bit of confusion. I think the underlying issue is that people confuse "by ref" as having something to do with "reference types." They are not the same.

Reference types and value types are about instances, passing by ref is about how those instances are passed in a call.
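
A short example makes the distinction concrete: mutating a reference-type parameter is always visible to the caller, but reassigning the parameter only sticks when it is passed by ref:

```csharp
using System;

public class Person { public string Name; }

public class Program
{
    // Without ref: the *reference* is copied. Reassigning the parameter
    // points the copy at a new object; the caller's variable is untouched.
    public static void Reassign(Person p) { p = new Person { Name = "Replaced" }; }

    // With ref: the caller's variable itself is passed, so the
    // reassignment is visible after the call returns.
    public static void ReassignByRef(ref Person p) { p = new Person { Name = "Replaced" }; }

    public static void Main()
    {
        Person person = new Person { Name = "Original" };

        Reassign(person);
        Console.WriteLine(person.Name);   // Original

        ReassignByRef(ref person);
        Console.WriteLine(person.Name);   // Replaced
    }
}
```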

If you'd like to weigh in (and even vote on the post if you like) the discussion is taking place on a thread over at the CodeProject where they syndicate my blog. You can access it by clicking here, then scroll to the bottom. Feel free to add comments here as well if you like!

Jeremy Likness

Tuesday, September 22, 2009

What's in Your Collection? Part 3 of 3: Custom Collections

This is the last installment in a three part series about using collections in C#.

The entire series can be accessed here:

We've now covered the interfaces and some concrete instances of collections provided by the .NET Framework. Now you are interested in moving things to the next level. What if the provided collections simply don't meet your business requirements? What are some ways you can use the collections concept to build your own classes to solve business problems?

Yield to Iterators

The first important thing to understand when you begin building your custom collection is the concept of iterators in .NET and the yield statement. I'm surprised that many people use the language without truly understanding this statement, why it exists and how it can be used.

You might have encountered yield in your journeys. If you've built custom AJAX client controls, you probably implemented IScriptControl. One method asks for IEnumerable<ScriptReference>. The implementation is usually presented as:

...
ScriptReference sr = new ScriptReference("~/MyUserControl.js");
yield return sr;
...

You could alternatively have created a List or any other collection of ScriptReference and returned that. What does yield really do for us?

To better understand, I've created a short little console application. You can create a new console project and simply paste this code to build and run.

using System;
using System.Collections;
using System.Collections.Generic;

namespace Yield
{
    internal class Program
    {
        private delegate bool DoSomething();

        private sealed class Doer
        {
            private readonly DoSomething _doSomething;
            private readonly string _msg;

            public Doer(DoSomething doSomething, string message)
            {
                _doSomething = doSomething;
                _msg = message;
                Console.WriteLine(string.Format("{0}: Ctor()", _msg));
            }

            public bool Do()
            {
                Console.WriteLine(string.Format("{0}: Do()", _msg));
                return _doSomething();
            }
        }

        private sealed class DoerCollection : IEnumerable<Doer>
        {
            public IEnumerator<Doer> GetEnumerator()
            {
                yield return new Doer(() => true, "1");
                yield return new Doer(() => false, "2");
                yield return new Doer(() => true, "3");
                yield break;
            }

            IEnumerator IEnumerable.GetEnumerator()
            {
                return GetEnumerator();
            }
        }

        private static void _DoIt(IEnumerable<Doer> doerCollection)
        {
            foreach (Doer doer in doerCollection)
            {
                if (!doer.Do())
                {
                    break;
                }
                Console.WriteLine(".");                
            }
            Console.WriteLine("..");                
        }

        private static void Main(string[] args)
        {
            _DoIt(new DoerCollection());
            _DoIt(new List<Doer>
                      {
                          new Doer(() => true, "4"),
                          new Doer(() => false, "5"),
                          new Doer(() => true, "6")
                      });

            Console.ReadLine();
        }
    }
}

So let's walk through the code.

First, I define a delegate called DoSomething that simply states "I want a method that takes no parameters and returns a boolean." This is a contrived example, of course, but in the "real world" you may have a pipeline or chain of responsibility that performs actions and then returns a status indicating that the process should continue or there is another node to consider, etc. I encapsulated the delegate in the class Doer. The constructor takes an implementation of the delegate and a "message." The only reason I pass in the message is to track which object is doing what. What's important here is to see when the classes are created compared to when the Do method is called, which simply invokes the delegate.

Next, I created my custom collection, DoerCollection. This is a collection of "activities" to perform. Obviously I am simply returning true or false in the example, but again, in a real-world scenario this could be a file system processor that iterates through a directory and returns files until no more can be found, or calls a web service and returns the status ... you get the idea. Notice that I simply yield return different instances of Doer that I pass the delegate implementation and a unique message identifier. If you recall from the first article in this series, this class is a collection because it implements IEnumerable.

The _DoIt method takes any collection typed to the Doer class, and loops through the items calling their Do method until false is returned. It also emits some output just to demonstrate how it is looping, etc.

Finally, we get to implementation. The whole point of this example is to demonstrate how the yield command operates. We perform the exact same function on two very similar collections. The first pass uses an instance of my custom collection. The second pass creates a list and passes that into the method. What do you expect the output to look like? Compile the program and run it, and if you guessed correctly, you have a strong grasp of IEnumerator and yield.

Both collections were wired to contain three instances. Both had an instance return true, then false, then true, so the expected result would be to make it through two items and then break out of our loop. This is exactly what happens, but it's the output that is interesting. It turns out that using the List forced me to create everything up front, even if I wasn't going to use it (and who knows when that garbage collector will come by). The custom class using yield however only created two instances. The third class was never created!

Yield is nothing more than syntactic sugar for a state engine.

But wait, we are simply spinning through a collection. What do I mean by state engine??

If you recall in part 1, the key to collections is the Enumerator. An enumerator is a state engine. The "current state" represents either nothing (empty collection or already iterated through the entire collection) or an instance within the collection. The only transition this state engine can make is to move to the next item or end up in an uninitialized state.
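
To make that concrete, here is a hand-written (and heavily simplified) version of the kind of state engine the compiler emits for a three-item iterator. It is a sketch of the concept, not the actual generated code:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Roughly what 'yield return 1; yield return 2; yield return 3;' becomes:
// a class whose MoveNext switches on a state field and produces one
// item per transition, on demand.
public class OneTwoThree : IEnumerator<int>
{
    private int _state;   // which yield we are parked at

    public int Current { get; private set; }
    object IEnumerator.Current { get { return Current; } }

    public bool MoveNext()
    {
        // Each call produces exactly one item, then records where to
        // resume. Nothing is built up front.
        switch (_state)
        {
            case 0: Current = 1; _state = 1; return true;
            case 1: Current = 2; _state = 2; return true;
            case 2: Current = 3; _state = 3; return true;
            default: return false;  // exhausted
        }
    }

    public void Reset() { _state = 0; }
    public void Dispose() { }
}

public class Program
{
    public static void Main()
    {
        OneTwoThree e = new OneTwoThree();
        while (e.MoveNext())
        {
            Console.Write(e.Current);  // 123
        }
    }
}
```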

This is what the program outputs to the console:

[Image: console output of the Yield program]

Now we'll pull out ildasm to peek beneath the hood. I've highlighted the DoerCollection class.

[Image: ildasm view showing the compiler-generated enumerator class nested in DoerCollection]

You'll notice that the GetEnumerator implementation actually creates a nested class behind the scenes. That class is our state engine. In red you can see the key pieces of that engine: a state, a current Doer instance, and the reference to the parent class. Highlighted is the key method called to transition state, MoveNext.

What is really interesting is pulling open the MoveNext method. I've used RedGate's free Reflector tool to reverse engineer the code. This will take the generated IL and provide a C# representation, so we can see what the actual underlying algorithm for the enumerator is.

private bool MoveNext()
{
    switch (this.<>1__state)
    {
        case 0:
            this.<>1__state = -1;
            if (Program.DoerCollection.CS$<>9__CachedAnonymousMethodDelegatea == null)
            {
                Program.DoerCollection.CS$<>9__CachedAnonymousMethodDelegatea = new Program.DoSomething(Program.DoerCollection.<GetEnumerator>b__7);
            }
            this.<>2__current = new Program.Doer(Program.DoerCollection.CS$<>9__CachedAnonymousMethodDelegatea, "1");
            this.<>1__state = 1;
            return true;

        case 1:
            this.<>1__state = -1;
            if (Program.DoerCollection.CS$<>9__CachedAnonymousMethodDelegateb == null)
            {
                Program.DoerCollection.CS$<>9__CachedAnonymousMethodDelegateb = new Program.DoSomething(Program.DoerCollection.<GetEnumerator>b__8);
            }
            this.<>2__current = new Program.Doer(Program.DoerCollection.CS$<>9__CachedAnonymousMethodDelegateb, "2");
            this.<>1__state = 2;
            return true;

        case 2:
            this.<>1__state = -1;
            if (Program.DoerCollection.CS$<>9__CachedAnonymousMethodDelegatec == null)
            {
                Program.DoerCollection.CS$<>9__CachedAnonymousMethodDelegatec = new Program.DoSomething(Program.DoerCollection.<GetEnumerator>b__9);
            }
            this.<>2__current = new Program.Doer(Program.DoerCollection.CS$<>9__CachedAnonymousMethodDelegatec, "3");
            this.<>1__state = 3;
            return true;

        case 3:
            this.<>1__state = -1;
            break;
    }
    return false;
}

You can quickly see that what is generated is really a massive switch statement. Based on the current state, it updates the current reference and changes the state. Most important, however, is the fact that the results of the yield are produced "on demand." In other words, it is not creating a large list, filling it with instances, and then iterating. Instead, the instances are created "on demand" (with the anonymous-method delegates cached for re-use in case the collection is iterated again).

The whole key to this process is that the enumerator hides the underlying implementation. The consuming code simply knows there is a collection to iterate through. How that collection is built is up to the enumerator, which leads to very interesting possibilities. In the case of the ASP.NET page, this means that controls can be called iteratively and yield their script references and descriptors. The "master" code is simply iterating through the collection and wiring up the script references.

Thinking of collections as different ways of grouping objects is certainly valuable and can pertain to many different business situations. Understanding that Enumerator is really a state machine, however, allows you to start thinking of collections as processes. They aren't necessarily pools of instances, but can be algorithms or other processes as well. The key is that the use of the enumerator hides the implementation so that the consuming code simply iterates through something without having to understand the underlying implementation of how something is provided.
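To make "collections as processes" concrete, here is a minimal sketch of my own (the Fibonacci sequence is not part of the original sample): the enumerator the compiler generates for this method is exactly the kind of state machine shown above, and no value is computed until the consumer asks for it.

```csharp
using System;
using System.Collections.Generic;

static class Sequences
{
    // An "infinite collection": there is no backing list at all.
    // Each MoveNext() advances the compiler-generated state machine one step.
    public static IEnumerable<long> Fibonacci()
    {
        long previous = 0, current = 1;
        while (true)
        {
            yield return current;
            long next = previous + current;
            previous = current;
            current = next;
        }
    }

    static void Main()
    {
        int count = 0;
        foreach (long value in Fibonacci())
        {
            Console.WriteLine(value);
            if (++count == 5) break;   // only five values are ever computed
        }
    }
}
```

Because the sequence is produced lazily, breaking out of the loop after five iterations means only five values ever exist, which is the same effect the custom DoerCollection achieved.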

Jeremy Likness

Sunday, September 20, 2009

Editing WPF Controls in Expression Blend

This is one of those quirks that until you try it, you may not know it exists or what the answer is.

I am working on a WPF project and have a separate control library (actually, a module, because I am using the Composite Application Guidance/Prism pattern). I pulled it into Expression Blend and was surprised to see that I had no "design" view. The option was simply grayed out in the menu.

So I did some digging around, and it turns out that because my controls project is just a C# class library, Expression Blend doesn't know it is "allowed" to design the controls. To tell it, simply open the .csproj file for the project that contains the XAML, then add this to the first PropertyGroup tag that you find:

<PropertyGroup>
...
<ProjectTypeGuids>{60dc8134-eba5-43b8-bcc9-bb4bc16c2548};{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}</ProjectTypeGuids>
...
</PropertyGroup>

Enter it exactly as shown, with the same GUIDs, then close Expression Blend and reload the project: voilà, you can now design.

Jeremy Likness

Tuesday, September 15, 2009

Ref Keyword for Reference Types

The ref keyword is well known. It indicates that you are passing a reference, not a value, to a method. That means that if the method modifies the value, the change will be visible to the calling method as well.

Where I see a lot of confusion, however, is what happens when dealing with reference types. It is common to say that methods pass objects by reference, but that's not entirely true.

First, a pop quiz. Without actually running the code, what do you think this code snippet will produce?

using System;

namespace ByRef
{
    internal sealed class MyClass
    {
        public MyClass(int value)
        {
            Value = value;
        }

        public int Value { get; set; }
    }

    internal class Program
    {
        private static void _SwapByValue(MyClass myClass)
        {
            myClass = new MyClass(5);
        }
   
        private static void _SwapByRef(ref MyClass myClass)
        {
            myClass = new MyClass(5);
        }
        
        private static void Main(string[] args)
        {
            MyClass testclass = new MyClass(4);
            _SwapByValue(testclass);
            Console.WriteLine(testclass.Value);
           
            MyClass testclass2 = new MyClass(4);
            _SwapByRef(ref testclass2);
            Console.WriteLine(testclass2.Value);            

            Console.ReadLine();
        }
    }
}

We'll come back to that in a minute.

When you make a method call, all of the variables you pass are copied to the stack. This is the first place some people get confused. If I have an integer:

int x = 5;
CallMethod(x);

x is on my stack. A copy of the value of x ("5") is made and also placed on the stack. We now have two stack entries: "my x" and "the method's x."

But what about this?

MyClass myClass = new MyClass(5);
CallMethod(myClass);

The important thing to remember is that myClass is really a reference to the instance. So the first line gives us two allocations in memory: a block of heap that contains the class, and a local stack pointer to the class. So when we call the method, a copy is made just as in the value type example. In this case, it is a copy of the reference. So now I have my class in the heap, my local reference, and a copy of my local reference being passed to the method.

This is why the first case in the above example will print "4". All the method did was to change the reference on the method's stack to point to a new allocation in the heap - the new instance of MyClass. When the method returns, the copy is forgotten. The new instance (5) becomes orphaned, and is eventually garbage collected. The local reference still points to (4).

The second case is more interesting. As we mentioned, the ref keyword forces a reference, not a copy, to be passed. So this:

int x = 5;
MyMethod(ref x);

Skips making the copy. It simply gives the method access to the "5" value on the stack (some people mistakenly believe that the 5 is somehow boxed or unboxed, but that doesn't happen ... it simply isn't copied). If you change it, you change the same stack location the calling program uses, and therefore the caller will see the change.

What about with our object? This is where understanding the ref keyword is, well, key. Remember that when we new the (4) class, we have two important pieces: the class on the heap, and the pointer to the class (reference) on the stack. When we call the method with the ref keyword, it is not allowed to make a copy. Therefore, the method gains access to the same reference on the stack as the calling program. When it makes a new class (5), the stack reference is changed to point to this new instance. Now, the references point to (5) and (4) is orphaned and will be subject to garbage collection. This is why the second example shows "5" — the reference has been updated.

So, as you can see, objects are not "passed by reference" by default. Instead, when a reference type is passed in a parameter list, the reference is passed by value. This allows you to modify the object. If you want to change the reference, then you must ref it!
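All three behaviors can be seen side by side in a short sketch (the class and method names here are mine, not from the quiz above): mutating through a copied reference is visible to the caller, reassigning a copied reference is not, and reassigning through ref is.

```csharp
using System;

class Person
{
    public string Name = "original";
}

class RefDemo
{
    // The reference is copied, but both copies point at the same heap object,
    // so a mutation through the copy is visible to the caller.
    public static void Mutate(Person p) { p.Name = "mutated"; }

    // Reassigning the copy only repoints the method's local reference;
    // the caller's reference is untouched and the new object is orphaned.
    public static void Reassign(Person p) { p = new Person { Name = "lost" }; }

    // With ref, the method shares the caller's reference itself,
    // so the reassignment sticks.
    public static void ReassignByRef(ref Person p) { p = new Person { Name = "replaced" }; }

    static void Main()
    {
        Person person = new Person();

        Mutate(person);
        Console.WriteLine(person.Name);  // mutated

        Reassign(person);
        Console.WriteLine(person.Name);  // still mutated

        ReassignByRef(ref person);
        Console.WriteLine(person.Name);  // replaced
    }
}
```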

Jeremy Likness

Thursday, September 10, 2009

Pragmatic Reflection on Singletons

So today I was wading through some code that gets called quite a bit. It is in a process that might be hit thousands of times per second. It uses a pipeline pattern so there are several objects to "new up" and place in the pipeline.

Being performance-minded I originally was tempted to follow the singleton pattern so that I didn't have to create those objects every call and then chain them together. This would be quite expensive, right?

Of course, singletons done right can be powerful, but done wrong can create issues ... you must be very careful about wiring up your properties correctly and understand synchronization between threads, etc. Unfortunately, more often than not (and I've been guilty of this myself) I see the pattern used as a poor man's cache rather than for a real business purpose. The question really becomes, when does this pattern truly make sense? ... and when is it misused or even overkill?
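For reference, the variant of "done right" I have in mind is a minimal sketch like this (not code from this post): a static initializer, which the CLR guarantees runs exactly once and in a thread-safe manner, so no explicit locking is required.

```csharp
using System;

public sealed class SettingsCache
{
    // The CLR runs the static initializer exactly once, before first use,
    // and guarantees thread safety, so no lock is needed here.
    private static readonly SettingsCache _instance = new SettingsCache();

    public static SettingsCache Instance
    {
        get { return _instance; }
    }

    // Private constructor prevents outside instantiation.
    private SettingsCache() { }

    public string Describe()
    {
        return "singleton instance";
    }
}
```

Every call to SettingsCache.Instance returns the same reference, which is the property the benchmark below exploits.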

The first thing to do is address the idea that creating objects is expensive. I created a simple console application that iterates millions of times and simply creates an array out of two properties accessed from a class. This is purely an artificial construct, but it gets to the root of what we do every day: we get a class somehow (through a factory, an inversion of control container, a singleton, or even the good old-fashioned new keyword), then we do something with that class (access properties, execute methods, etc.).

One pass follows the singleton pattern. I do something ugly on purpose, but it essentially makes a singleton, then makes a nested singleton that is referenced by the parent (it's ugly how I do it, but this is a quick and dirty example). The point is that every iteration, when I get the singleton, I'm getting the same set of nested objects.

The next pass news up the objects every time: I'm creating a class with a nested class, referencing their properties, and making a new copy of both classes each pass. Just for kicks and giggles I even added a third pass that uses System.Reflection to activate the classes, to show the expense of reflection.

Again, keep in mind this is an artificial case, but it helps us better understand the cost of either accessing a singleton or newing up an object each time.

Here is my test:

using System;

namespace Activation
{
    public interface INestedWidget
    {
        INestedWidget NestedWidget { get; set; }
        string Identifier { get; }
    }

    public class WidgetA : INestedWidget
    {
        private static readonly WidgetA _instance; 

        static WidgetA()
        {
            _instance = new WidgetA();
        }

        public virtual INestedWidget GetInstance()
        {
            return _instance;
        }

        public INestedWidget NestedWidget { get; set; }
        
        public virtual string Identifier
        {
            get { return "Widget A"; }
        }        
    }

    public class WidgetB : WidgetA
    {
        private static readonly WidgetB _instance; 

        static WidgetB()
        {
            _instance = new WidgetB();
        }

        public override INestedWidget GetInstance()
        {
            return _instance;
        }
        public override string Identifier
        {
            get
            {
                return "Widget B";
            }
        }
    }

    class Program
    {
        static void Main()
        {
            const int ITERATIONS = 99999999; 
            Console.WriteLine("Here we go...");

            DateTime start = DateTime.UtcNow;
            WidgetA parent = new WidgetA();         
            parent.GetInstance().NestedWidget = new WidgetB().GetInstance();
            
            for (int x = 0; x <= ITERATIONS; x++)
            {
                INestedWidget widget = parent.GetInstance();
                string[] identifier = new string[] {widget.Identifier, widget.NestedWidget.Identifier};                
            }
            DateTime finish = DateTime.UtcNow;
            TimeSpan interval = finish - start;

            long ms = interval.Ticks/TimeSpan.TicksPerMillisecond;

            Console.WriteLine("Took me {0} using singletons, {1} per ms", interval, ITERATIONS/ms);                

            start = DateTime.UtcNow; 
            for (int x = 0; x <= ITERATIONS; x++)
            {
                INestedWidget iParent = new WidgetA() {NestedWidget = new WidgetB()};
                string[] identifier = new string[] { iParent.Identifier, iParent.NestedWidget.Identifier };                
            }
            finish = DateTime.UtcNow;
            TimeSpan secondInterval = finish - start;

            ms = secondInterval.Ticks/TimeSpan.TicksPerMillisecond;

            Console.WriteLine("Took me {0} using new objects, {1} per ms", secondInterval, ITERATIONS/ms);

            start = DateTime.UtcNow;
            for (int x = 0; x <= ITERATIONS; x++)
            {
                INestedWidget iParent = (WidgetA) Activator.CreateInstance(typeof (WidgetA));
                iParent.NestedWidget = (WidgetB) Activator.CreateInstance(typeof(WidgetB));
                string[] identifier = new string[] { iParent.Identifier, iParent.NestedWidget.Identifier };
            }
            finish = DateTime.UtcNow;
            TimeSpan thirdInterval = finish - start;

            ms = thirdInterval.Ticks/TimeSpan.TicksPerMillisecond;

            Console.WriteLine("Took me {0} using activation, {1} per ms", thirdInterval, ITERATIONS/ms);

            Console.ReadLine();
        }
    }
}

When I run it, it's pretty much as expected. On my machine, it takes 4 seconds to spin through the singletons, 7 seconds (wow, over 50% longer) to new up the objects, and a whopping 30 seconds (over 6 times as long) to use reflection. That proves without a doubt that creating the objects is way too expensive and I should use that singleton, right?

Well ... maybe not.

The real question here is whether or not 99,999,999 objects is a realistic test case.

For me, the more important measure is frequency. Remember, I'm getting thousands of requests per second, so what can I do in a millisecond? As it turns out, quite a bit. Even when creating a new object every time, I can create over 13,000 of those nested instances every millisecond. That means if my test case ended there, I should be able to handle millions of requests per second without faltering ... even without resorting to the singleton.

Even the activator gives me a good run ... and I included that for the dependency injection fans because ultimately what a lot of the engines are doing is reflection-based.

So here's the crunch ... what I really need to worry about when I'm managing my requests is what I'm doing from the request coming in to the point I'm done with that. This might be code logic, making calls to my data access layer, etc. In fact, if I truly break down what's inside of a "request" I might find that newing the object or even activating it is really 1% of the entire request. So why am I using static classes just to address 1% of the problem when there are so many other potential pitfalls?

This is where I see the common mistake ... most likely, I have some semi-static data I've loaded and want to keep, so the singleton gives me a convenient cache. If the database calls take 500ms then suddenly I get REAL big savings keeping a copy of the class around rather than making it new every time.

But there is the real rub ... is that the right solution? I would say ... NO.

The cache is a concern that belongs somewhere else. Depending on my architecture it may "live" in the data access layer or in one of my providers, but the point is that my "consuming" class shouldn't be concerned and shouldn't have to be implemented differently out of a concern for how the underlying data is persisted and retrieved. In other words, I should be able to make a new class every time if my class is the "do something fantastic" class and simply leans on the "do something with the database class."

I really have a few options here. My ORM framework may supply a cache layer, in which case I will always go out and make my data request but sometimes it will come from the local cache and sometimes it will come from disk and take longer. My provider layer that sits on top of the data layer might manage this for me.

Finally, I might even take a look at aspect-oriented programming and think about the cache as a "cross-cutting" concern. Perhaps I have a policy that controls the caching ... but that's a story for a different day.

The bottom line is, if I am using the pattern, I better well know what and why ... using it for a cache when it's not the class that is fetching the data violates separation of concerns, because now I'm not concerned about the data I am working with, but how I actually get and hold onto the data. That belongs in the data layer. Perhaps THAT layer might use a singleton somewhere.

I'm very interested in your comments/feedback about where you feel the singleton pattern makes sense and why, and more importantly ... what are some ways you manage the concept of a singleton? Is that a design aspect, or an implementation aspect? In other words, does it make sense to have a "GetInstance" method on my interface? Or should I make everything "singleton-ignorant" and then give it a lifetime policy using a dependency injection framework? If you are mostly using it for caching/performance reasons, who really should own the cache, and how does one truly separate caching of data as a concern?

Jeremy Likness

Monday, September 7, 2009

What's in Your Collection? Part 2 of 3: Concrete

The collection is a powerful construct that allows a developer to logically group related elements and navigate through them. In this article, we'll explore some concrete implementations of collections that are part of the base .NET framework.


Two basic namespaces provide rich collection functionality out of the box: System.Collections for non-generic collections and System.Collections.Generic for their strongly typed counterparts. For more specialized collections, we'll also look at System.Collections.ObjectModel. (Extra credit: we won't cover it here, but after reading this article you may want to investigate System.Collections.Specialized.)

An Out of the Box Interview Answer

A very common interview question is to explain the difference between ArrayList and List. If you got that one correct, you probably mentioned something about boxing, or taking a value type and converting it to an object so it essentially becomes part of the heap instead of the local stack. This operation is expensive. Because ArrayList is not generically typed, it must box and unbox value types. For this reason, any type of collection that deals with value types (and for that matter, structs) should focus on the List<T> implementation. Just how expensive is the boxing operation? Try this little console program and see for yourself:

using System;
using System.Collections;
using System.Collections.Generic;

namespace Arrays
{
    internal class Program
    {
        private static void Main()
        {
            const int ITERATIONS = 9999999;

            DateTime startBuild = DateTime.UtcNow;

            ArrayList integers = new ArrayList();

            for (int x = 0; x < ITERATIONS; x++)
            {
                integers.Add(x);
            }

            DateTime endBuild = DateTime.UtcNow;

            for (int x = 0; x < ITERATIONS; x++)
            {
                int y = (int) integers[x];
            }

            DateTime endParse = DateTime.UtcNow;

            TimeSpan buildArray = endBuild - startBuild;
            TimeSpan parseArray = endParse - endBuild;

            startBuild = DateTime.UtcNow;

            List<int> integerList = new List<int>();

            for (int x = 0; x < ITERATIONS; x++)
            {
                integerList.Add(x);
            }

            endBuild = DateTime.UtcNow;

            for (int x = 0; x < ITERATIONS; x++)
            {
                int y = integerList[x];
            }

            endParse = DateTime.UtcNow;

            TimeSpan buildList = endBuild - startBuild;
            TimeSpan parseList = endParse - endBuild;

            double build = (double) buildArray.Ticks/(double) buildList.Ticks;
            double parse = (double) parseArray.Ticks/(double) parseList.Ticks;
            double total = (double) (buildArray.Ticks + parseArray.Ticks)/(double) (buildList.Ticks + parseList.Ticks);

            Console.WriteLine(string.Format("Build Array: {0} List: {1} {2}", buildArray, buildList, build));
            Console.WriteLine(string.Format("Parse Array: {0} List: {1} {2}", parseArray, parseList, parse));
            Console.WriteLine(string.Format("Total Array: {0} List: {1} {2}", buildArray + parseArray, buildList + parseList, total));

            Console.ReadLine();
        }
    }
}

It basically spins through a list of integers, storing them in both an ArrayList and a List. On my machine, the ArrayList takes over 7 times longer to load, and 1.2 times longer to retrieve values, than the strongly typed List implementation. That is something important to keep in mind when considering collections.

I'm Just Not Your Type

The first collections we'll look at are not generically typed. That doesn't mean they aren't typed ... some, in fact, are designed for explicit types, but they don't support generics. We already covered the ArrayList, which I believe remains for backward compatibility with framework versions that predate generics, as I cannot imagine a situation where I would use it over a List<T>.

This namespace also provides CollectionBase and DictionaryBase, abstract base classes for building custom collections that implement ICollection and, in the case of DictionaryBase, IDictionary.

BitArray

Use this class when manipulating bits. It exposes the bits as an array of bool, so you can do something fun like:

BitArray bits = new BitArray(8);
bits.Set(2, true);
if (bits[2])
{
    Console.WriteLine("Bit 2 is set");
}

The underlying storage is done at a bit level for compact storage. What's nice is that you can initialize the collection with a byte array and perform bitwise operations (logical NOT, AND, OR, XOR) between two arrays (great for masks, etc).

Hashtable

The Hashtable serves an important function. It makes large collections of objects easier to parse and search, based on each key's hash code. One important decision to make is whether you will use a Hashtable or a Dictionary. What's the difference?

The dictionary maps a key to a value. Each key is unique. Different keys might have the same value, but if you are searching for a specific key, you will get exactly one entry. What's more important to note with the Dictionary type is that it is defined with a generic type. Therefore, there is no boxing or unboxing and it will, in general, perform faster and better than a hash table when you are using value types.

The hash table requires that its keys produce a hash code via GetHashCode (or that a hash code provider is injected into the constructor). The idea is that keys will have a "mostly" unique hash per that algorithm. However, hash tables allow multiple objects to exist for the same hash code because the algorithm does not guarantee uniqueness. Hash tables are most often used when there is not a well-defined key to map to the value: the hash code resolves a "region" of objects, and that subset can then be scanned to complete the lookup. Using a Hashtable when you do have a well-defined key is also more expensive because it stores everything as object, not as a generic type, so boxing and unboxing will occur if the targets are value types.
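A small comparison of my own illustrates the difference: the Hashtable stores everything as object, so the value type is boxed on the way in and must be cast back out, while the generic Dictionary is strongly typed end to end.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class LookupDemo
{
    static void Main()
    {
        // Hashtable: the int 42 is boxed to object when stored,
        // and must be explicitly unboxed with a cast on retrieval.
        Hashtable table = new Hashtable();
        table["answer"] = 42;
        int fromTable = (int)table["answer"];

        // Dictionary<TKey, TValue>: strongly typed, no boxing, no cast.
        Dictionary<string, int> map = new Dictionary<string, int>();
        map["answer"] = 42;
        int fromMap = map["answer"];

        Console.WriteLine(fromTable == fromMap);
    }
}
```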

Queue: First in Line, First to Eat!

The Queue is often compared to a physical line. In the lunch line, the first person in the line is also the first person to leave the line (usually). The queue functions this way. To put someone in line, you call the Enqueue method. To get the person at the front of the line (the next one to "leave") you call the Dequeue method.

For an idea of how the Queue collection could be used, consider this practical example: syslog. Syslog is a standard way for network equipment to broadcast status. By default, syslog messages are sent to a host via the UDP protocol on port 514. UDP, unlike TCP, is a connectionless protocol: it neither waits for nor requires a response, and it does not handle breaking large payloads into chunks and reassembling them. While you can configure what hardware sends for syslog, some equipment can be incredibly verbose and send out dozens of status updates every second.

Imagine writing a syslog server that retrieves these values from a listening UDP port. The thread listening to the port must be incredibly fast or it will block the port and miss important messages. In order to keep the listen port open, you could implement a synchronized queue. The listener would simply Enqueue the incoming message, then go back and listen to the next message. A background thread (or even several threads running simultaneously) could then call Dequeue to perform processing on those messages.

Most of the time you'll want to use the generically typed equivalent for the Queue to avoid boxing and unboxing.
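The listener/worker split described above can be sketched like this (all names here are hypothetical): the UDP listener thread only ever enqueues under a short lock and returns immediately, while background worker threads drain the queue at their own pace.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical buffer between a fast UDP listener and slower workers.
class SyslogBuffer
{
    private readonly Queue<string> _messages = new Queue<string>();
    private readonly object _sync = new object();

    // Called by the listener thread: store the raw message and return.
    public void Enqueue(string message)
    {
        lock (_sync)
        {
            _messages.Enqueue(message);
        }
    }

    // Called by worker threads: pull the oldest message, if any.
    public bool TryDequeue(out string message)
    {
        lock (_sync)
        {
            if (_messages.Count > 0)
            {
                message = _messages.Dequeue();
                return true;
            }
            message = null;
            return false;
        }
    }
}
```

In a real server you would likely add a wait handle so workers can block when the queue is empty rather than polling, but the FIFO hand-off is the essential part.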

SortedList

The sorted list is a hybrid between the List and the Dictionary. The keys in the list are kept sorted, so after adding values you can enumerate the keys and retrieve the values in the sort order of the key. This might be useful for enumerating countries sorted by name, or files sorted by a key that includes their directory.
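A quick sketch of the country scenario (my own example): no matter what order the entries are added, enumeration walks them in key order.

```csharp
using System;
using System.Collections.Generic;

class SortedListDemo
{
    static void Main()
    {
        SortedList<string, string> countries = new SortedList<string, string>();
        countries.Add("US", "United States");
        countries.Add("BR", "Brazil");
        countries.Add("DE", "Germany");

        // Enumeration is always in key order: BR, DE, US.
        foreach (KeyValuePair<string, string> pair in countries)
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }
    }
}
```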

Just Put it on the Stack

The stack is a very popular pattern for collections. It is a last-in, first-out (LIFO) collection, compared to the queue, which is FIFO (first-in, first-out). Stacks are important for composite operations that require a history of state. Calculators work by pushing operands and operators onto the stack, computing the values, then popping those values to integrate into the next operation. Stacks are also important in recursive algorithms: if you wanted to recurse without making method calls, you would loop instead, place your values on a stack, then pop them off until the stack is empty.
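The calculator idea fits in a few lines (a hypothetical postfix evaluator of my own, supporting only + and - to keep it short): each operand is pushed, and each operator pops two values and pushes the result.

```csharp
using System;
using System.Collections.Generic;

class PostfixDemo
{
    // Evaluates a postfix (RPN) expression such as "4 5 + 2 -".
    public static int EvaluatePostfix(string expression)
    {
        Stack<int> stack = new Stack<int>();
        foreach (string token in expression.Split(' '))
        {
            int operand;
            if (int.TryParse(token, out operand))
            {
                stack.Push(operand);     // operands go on the stack
            }
            else
            {
                // operators consume the top two values
                int right = stack.Pop();
                int left = stack.Pop();
                stack.Push(token == "+" ? left + right : left - right);
            }
        }
        return stack.Pop();
    }

    static void Main()
    {
        Console.WriteLine(PostfixDemo.EvaluatePostfix("4 5 + 2 -")); // 7
    }
}
```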

Many years ago in the days of VB6 I helped build a complex web application that had many multi-page transactions. To enable the user to navigate these transactions, we used a custom stack. Each navigation involved pushing the parameters and page directives onto the stack, then the target pages would pop these values and use them. A multi-page transaction would only pop the final values when the transaction was complete. This allowed us to rollback transactions, as well as nest transactions (for example, if you were in the middle of transaction A, then navigated to "B" and hit cancel, you'd pop back into A instead of some generic menu).

Again, you will more often than not use the generically-typed version of the Stack to get the job done.

Generics are Less Expensive

Many of the collections we discussed have generically-typed equivalents that eliminate the need for boxing and un-boxing. When it comes to value types, generically typed classes are almost always less expensive and provide better performance. In addition to generically typed versions of the collections we've already discussed, System.Collections.Generic provides some unique collections only available as strongly-typed implementations.

Dictionary Lookup

By far one of the more commonly used collections, the dictionary has a strongly typed key that maps to a strongly typed value. This is the classic use for mapping one item to another, whether it's an image name to the bytes of the actual image or a security key to a security context object.

Ready, HashSet, Go!

The HashSet class does what the name implies: it manages sets. Sets differ from typical lists in a few ways. Sets are loose collections of objects: order is not important, and each object must be unique. If you do not require the objects you are collecting to be in a particular order, hash sets exhibit very good performance compared to indexed and ordered lists. The hash set also provides set operations such as union and intersection. According to Kim Hamilton's article introducing HashSet, the preferred name would have been simply Set (see the article to learn why the "Hash" part was added).
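A quick sketch of those set operations (my own example): note that UnionWith and IntersectWith mutate the set in place, so we copy first to preserve the originals.

```csharp
using System;
using System.Collections.Generic;

class HashSetDemo
{
    static void Main()
    {
        HashSet<int> evens = new HashSet<int> { 2, 4, 6, 8 };
        HashSet<int> small = new HashSet<int> { 1, 2, 3, 4 };

        // IntersectWith reduces the set to the members both sets share.
        HashSet<int> common = new HashSet<int>(evens);
        common.IntersectWith(small);   // { 2, 4 }

        // UnionWith adds every member of the other set; duplicates are ignored.
        HashSet<int> all = new HashSet<int>(evens);
        all.UnionWith(small);          // { 1, 2, 3, 4, 6, 8 }

        Console.WriteLine(common.Count); // 2
        Console.WriteLine(all.Count);    // 6
    }
}
```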

LinkedList

LinkedList<T> is a doubly linked list: each node links forward (to the next node) and backward (to the previous node). The list maintains an internal count, so inserting a node, deleting a node (given a reference to it), and getting the count are all O(1) operations. An O(1) operation takes the same amount of time regardless of the size of the data it is performed against ... this means the list performs just as well when adding or removing nodes whether the list is small or large.
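A short sketch (my own example) of why node-relative inserts are O(1): holding a node reference lets you splice in a new node by rewiring pointers, without shifting elements the way a List<T>.Insert would.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

class LinkedListDemo
{
    static void Main()
    {
        LinkedList<string> list = new LinkedList<string>();
        LinkedListNode<string> middle = list.AddFirst("middle");
        list.AddFirst("first");              // O(1): rewires node pointers
        list.AddLast("last");                // O(1)
        list.AddAfter(middle, "inserted");   // O(1): no elements are shifted

        StringBuilder path = new StringBuilder();
        foreach (string item in list)
        {
            if (path.Length > 0) path.Append(" -> ");
            path.Append(item);
        }
        Console.WriteLine(path); // first -> middle -> inserted -> last
    }
}
```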

Type<T>

The remaining items in this namespace are counterparts to the collections and implementations of the interfaces we've discussed. The caveat is that they are all strongly typed which means better performance in almost all cases involving value types and often for reference types as well. This really leads us to the last collection to be discussed (remember, I left the specialized namespace for homework). This also takes us into a new namespace!

Just an Observation

The System.Collections.ObjectModel namespace contains collections intended for reusable libraries: classes with methods that return or consume collections. Perhaps the most often used collection here is the ObservableCollection.

The ObservableCollection provides a collection that implements INotifyCollectionChanged, which is similar to INotifyPropertyChanged but at the collection level. In short, whenever an object is added to or removed from the collection, or items within the collection are refreshed, the collection will raise the CollectionChanged event. This is important when there is a dependency on the collection that should be notified whenever the underlying collection changes.

Of course, the most common implementation of this is for databound user interface elements. Objects like lists and grids need to refresh when the underlying lists change. Technologies like Windows Presentation Foundation (WPF) and Silverlight rely on observable collections to optimize the UI and only refresh the elements when there is a requirement, such as the list changing. In fact, these frameworks automatically hook into the events when databound to refresh, so whenever you are dealing with lists that change, you should consider using the observable collection instead of one of the other collection types for binding.
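Outside of a XAML binding you can watch those notifications directly; a minimal console sketch:

```csharp
using System;
using System.Collections.ObjectModel;
using System.Collections.Specialized;

class ObservableDemo
{
    static void Main()
    {
        ObservableCollection<string> items = new ObservableCollection<string>();

        // WPF/Silverlight bindings subscribe to this same event under the covers.
        items.CollectionChanged += delegate(object sender, NotifyCollectionChangedEventArgs e)
        {
            Console.WriteLine("Collection changed: {0}", e.Action);
        };

        items.Add("first");  // Collection changed: Add
        items.RemoveAt(0);   // Collection changed: Remove
    }
}
```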

Conclusion

That is a lot of information to cover, but hopefully it provided insight into the various types of collections and some uses for them. In the next and final installment, we'll consider custom collections and how to tap into IEnumerable for more advanced functionality.

Jeremy Likness

Thursday, September 3, 2009

Lambda Expressions, Anonymous Methods, and Syntactic Sugar

What is the relationship between lambda expressions, anonymous methods, and delegates? The answer is, in two words: syntactic sugar ... or is it?

To show this, I put together a really simple console program that declares a delegate which returns a string. I then declare a concrete method that matches the delegate signature, and another method that takes the delegate and writes the return value to the console. I then call it with the concrete method, an anonymous method, and a lambda expression. The code looks like this (just start a new C# console project in Visual Studio, then paste this over everything and run it):

using System;

namespace SyntacticSugar
{
    /// <summary>
    ///     Main program
    /// </summary>
    class Program
    {
        /// <summary>
        ///     Delegate that returns a string
        /// </summary>
        /// <returns>The string</returns>
        private delegate string WriteSomething();       

        /// <summary>
        ///     Concrete implementation of the delegate
        /// </summary>
        private static string _Concrete()
        {
            return "Concrete method";
        }

        /// <summary>
        ///     Main method
        /// </summary>
        static void Main()
        {
            // call using a concrete method
            _DoWrite(_Concrete);

            // call using an anonymous method
            _DoWrite(delegate() { return "Anonymous method"; });

            // call using lambda expression
            _DoWrite(()=>"Lambda expression");

            Console.ReadLine();
        }

        /// <summary>
        ///     Takes the delegate and writes the string to the console
        /// </summary>
        /// <param name="something">The delegate to invoke</param>
        private static void _DoWrite(WriteSomething something)
        {
            Console.WriteLine(something());
        }
    }
}

When you run the program, you get what we expect: three lines of text, concrete, then anonymous, then lambda. That's all great, but what happens under the covers? Again, we'll go to ildasm.exe to pick apart the code that was generated.

The first thing of note is that the compiler generated two delegates for us, one is highlighted below ... also note the <Main>b__ methods generated for the expressions:

[ildasm screenshot: Syntactic Sugar Example]

What's even more interesting is when we look at the IL generated for the methods. I have put them together here for easy reference ... you can tell which is the concrete method, anonymous method, and lambda expression by the string literal:

  IL_0001:  ldstr      "Concrete method"
  IL_0006:  stloc.0
  IL_0007:  br.s       IL_0009
  IL_0009:  ldloc.0
  IL_000a:  ret
...
  IL_0001:  ldstr      "Anonymous method"
  IL_0006:  stloc.0
  IL_0007:  br.s       IL_0009
  IL_0009:  ldloc.0
  IL_000a:  ret
...
  IL_0000:  ldstr      "Lambda expression"
  IL_0005:  stloc.0
  IL_0006:  br.s       IL_0008
  IL_0008:  ldloc.0
  IL_0009:  ret

It's interesting to note that, other than the string literal (and a "nop" at the top of the anonymous method), the code is exactly the same in each case.

Next, let's dig into the actual calls from the Main method ... I've truncated the code to focus on the important pieces:

// load a pointer to the method, note the pointer "points to" _Concrete
  IL_0002:  ldftn      string SyntacticSugar.Program::_Concrete()
  IL_0008:  newobj     instance void SyntacticSugar.Program/WriteSomething::.ctor(object,
                                                                                  native int)
  IL_000d:  call       void SyntacticSugar.Program::_DoWrite(class SyntacticSugar.Program/WriteSomething)
// now we're a little more involved ... note the "difference" between
// the anonymous method and the lambda expression ... none! 
// 
// anonymous:
  IL_0013:  ldsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate2'
  IL_0018:  brtrue.s   IL_002d
  IL_001a:  ldnull
  IL_001b:  ldftn      string SyntacticSugar.Program::'<Main>b__0'()
  IL_0021:  newobj     instance void SyntacticSugar.Program/WriteSomething::.ctor(object, native int)
  IL_0026:  stsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate2'
  IL_002b:  br.s       IL_002d
  IL_002d:  ldsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate2'
  IL_0032:  call       void SyntacticSugar.Program::_DoWrite(class SyntacticSugar.Program/WriteSomething)
  IL_0037:  nop
// now the lambda ... deja vu?
  IL_0038:  ldsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate3'
  IL_003d:  brtrue.s   IL_0052
  IL_003f:  ldnull
  IL_0040:  ldftn      string SyntacticSugar.Program::'<Main>b__1'()
  IL_0046:  newobj     instance void SyntacticSugar.Program/WriteSomething::.ctor(object, native int)
  IL_004b:  stsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate3'
  IL_0050:  br.s       IL_0052
  IL_0052:  ldsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate3'
  IL_0057:  call       void SyntacticSugar.Program::_DoWrite(class SyntacticSugar.Program/WriteSomething)

As you can see, all of these methods (pardon the pun) use delegates and method pointers under the covers. A lambda expression is really just what some consider a more elegant way of generating an anonymous method, which in turn becomes the "value" of the delegate "variable."

But wait! There's more ...

So this example is compelling, but according to Eric Lippert in his blog post, there is more than just syntactic sugar taking place. Here is the important bit:

The problem is that since we do not know the types of the parameters until the target type is determined, it means that we cannot aggressively bind (by "bind" I mean "do full semantic analysis") the body of the lambda when the binder encounters the lambda. Rather, we have to put the lambda aside and say "come back to this thing later when we know what the target type is". In C# 2.0 anonymous method bodies were bound eagerly because we always had enough information to determine if there was an error inside the anonymous method even if we didn't know the target type. We could bind the body first, and then later on double-check during convertibility checking to make sure that the parameter types and return type were compatible with the delegate. Every expression type in the compiler worked this way: you do a full analysis of the expression, and then you see if it is compatible with the type that it is being converted to. With lambdas, the information flows in the opposite direction through the binder; first we have to know where we're going, and that then influences how the body is bound during the convertibility checking.

He goes into more detail, and I encourage you to follow his blog for the full discussion.
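Eric's point is easy to see in code: the same lambda text binds differently depending on the target delegate type, so the compiler cannot fully analyze the body until it knows where the lambda is going. A small sketch (the variable names are mine) using the framework's Func delegates:

```csharp
using System;

class TargetTypingDemo
{
    static void Main()
    {
        // The identical body "x => x * x" means different things
        // depending on the target type the lambda is converted to:
        Func<int, int> squareInt = x => x * x;          // x is inferred as int
        Func<double, double> squareDouble = x => x * x; // same text, but x is double

        // With no target type there is nothing to infer from, so a line like
        //    var square = x => x * x;
        // would not compile: the parameter type of x cannot be determined.

        Console.WriteLine(squareInt(5));      // 25
        Console.WriteLine(squareDouble(1.5)); // 2.25
    }
}
```

The body is only bound once the conversion target supplies the parameter types, which is exactly the "information flowing in the opposite direction" Eric describes.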

Jeremy Likness