Thursday, September 3, 2009

Lambda Expressions, Anonymous Methods, and Syntactic Sugar

What is the relationship between lambda expressions, anonymous methods, and delegates? The answer is, in two words: syntactic sugar ... or is it?

To show this, I put together a very simple console program that declares a delegate which returns a string. I then declare a concrete method that matches the delegate signature, and another method that takes the delegate and writes the return value to the console. I then call that method with the concrete method, an anonymous method, and a lambda expression. The code looks like this (just go into Visual Studio, start a new C# console project, then paste this over everything and run it):

using System;

namespace SyntacticSugar
{
    /// <summary>
    ///     Main program
    /// </summary>
    class Program
    {
        /// <summary>
        ///     Delegate that returns a string
        /// </summary>
        /// <returns>The string</returns>
        private delegate string WriteSomething();       

        /// <summary>
        ///     Concrete implementation of the delegate
        /// </summary>
        private static string _Concrete()
        {
            return "Concrete method";
        }

        /// <summary>
        ///     Main method
        /// </summary>
        static void Main()
        {
            // call using a concrete method
            _DoWrite(_Concrete);

            // call using an anonymous method
            _DoWrite(delegate() { return "Anonymous method"; });

            // call using lambda expression
            _DoWrite(()=>"Lambda expression");

            Console.ReadLine();
        }

        /// <summary>
        ///     Takes the delegate and writes the string to the console
        /// </summary>
        /// <param name="something">The delegate that returns the string to write</param>
        private static void _DoWrite(WriteSomething something)
        {
            Console.WriteLine(something());
        }
    }
}

When you run the program, you get what we expect: three lines of text, first the concrete method, then the anonymous method, then the lambda expression. That's all great, but what happens under the covers? Again, we'll go to ildasm.exe to pick apart the code that was generated.

The first thing of note is that the compiler generated two cached delegate fields for us (one is highlighted below) ... also note the <Main>b__ methods generated for the anonymous method and the lambda expression:

[Screenshot: Syntactic Sugar Example]

What's even more interesting is the IL generated for the methods. I have put them together here for easy reference ... you can tell which is the concrete method, the anonymous method, and the lambda expression by the string literal:

  IL_0001:  ldstr      "Concrete method"
  IL_0006:  stloc.0
  IL_0007:  br.s       IL_0009
  IL_0009:  ldloc.0
  IL_000a:  ret
...
  IL_0001:  ldstr      "Anonymous method"
  IL_0006:  stloc.0
  IL_0007:  br.s       IL_0009
  IL_0009:  ldloc.0
  IL_000a:  ret
...
  IL_0000:  ldstr      "Lambda expression"
  IL_0005:  stloc.0
  IL_0006:  br.s       IL_0008
  IL_0008:  ldloc.0
  IL_0009:  ret

It's interesting to note that, other than the string literal (and a "nop" at the top of the concrete and anonymous methods, which is why those two listings start at IL_0001), the code is exactly the same in each case.

Next, let's dig into the actual calls from the Main method ... I've truncated the code to focus on the important pieces:

// load a pointer to the method, note the pointer "points to" _Concrete
  IL_0002:  ldftn      string SyntacticSugar.Program::_Concrete()
  IL_0008:  newobj     instance void SyntacticSugar.Program/WriteSomething::.ctor(object,
                                                                                  native int)
  IL_000d:  call       void SyntacticSugar.Program::_DoWrite(class SyntacticSugar.Program/WriteSomething)
// now we're a little more involved ... note the "difference" between
// the anonymous method and the lambda expression ... none! 
// 
// anonymous:
  IL_0013:  ldsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate2'
  IL_0018:  brtrue.s   IL_002d
  IL_001a:  ldnull
  IL_001b:  ldftn      string SyntacticSugar.Program::'<Main>b__0'()
  IL_0021:  newobj     instance void SyntacticSugar.Program/WriteSomething::.ctor(object, native int)
  IL_0026:  stsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate2'
  IL_002b:  br.s       IL_002d
  IL_002d:  ldsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate2'
  IL_0032:  call       void SyntacticSugar.Program::_DoWrite(class SyntacticSugar.Program/WriteSomething)
  IL_0037:  nop
// now the lambda ... deja vu?
  IL_0038:  ldsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate3'
  IL_003d:  brtrue.s   IL_0052
  IL_003f:  ldnull
  IL_0040:  ldftn      string SyntacticSugar.Program::'<Main>b__1'()
  IL_0046:  newobj     instance void SyntacticSugar.Program/WriteSomething::.ctor(object, native int)
  IL_004b:  stsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate3'
  IL_0050:  br.s       IL_0052
  IL_0052:  ldsfld     class SyntacticSugar.Program/WriteSomething SyntacticSugar.Program::'CS$<>9__CachedAnonymousMethodDelegate3'
  IL_0057:  call       void SyntacticSugar.Program::_DoWrite(class SyntacticSugar.Program/WriteSomething)

As you can see, all of these methods (pardon the pun) use delegates and method pointers, but when it comes to lambda expressions, you're really just using what some consider to be a more elegant way of generating an anonymous method, which in turn is really the "value" of the delegate "variable."
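To make the IL a little easier to read, here is a rough sketch of what the lambda call site amounts to if you write it by hand in C#. The names _cachedLambda and _LambdaBody are my own stand-ins (assumptions) for the compiler-generated 'CS$<>9__CachedAnonymousMethodDelegate3' field and '<Main>b__1' method, which are not legal C# identifiers. Treat this as an approximation of the pattern, not the exact code the compiler emits:

```csharp
using System;

namespace SyntacticSugarSketch
{
    class Program
    {
        private delegate string WriteSomething();

        // Hypothetical stand-in for the compiler-generated static cache field
        private static WriteSomething _cachedLambda;

        // Hypothetical stand-in for the compiler-generated '<Main>b__1' method
        private static string _LambdaBody()
        {
            return "Lambda expression";
        }

        static void Main()
        {
            // ldsfld + brtrue.s: skip creation if the delegate is already cached
            if (_cachedLambda == null)
            {
                // ldnull + ldftn + newobj + stsfld: create the delegate once and store it
                _cachedLambda = new WriteSomething(_LambdaBody);
            }

            // ldsfld + call: pass the cached delegate along
            _DoWrite(_cachedLambda);
        }

        private static void _DoWrite(WriteSomething something)
        {
            Console.WriteLine(something());
        }
    }
}
```

The anonymous method call site follows the same shape with its own field and method, which is why the two IL listings line up instruction for instruction.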

But wait! There's more ...

So this example is compelling, but according to Eric Lippert in his blog post, there is more than just syntactic sugar taking place. Here is the important bit:

The problem is that since we do not know the types of the parameters until the target type is determined, it means that we cannot aggressively bind (by "bind" I mean "do full semantic analysis") the body of the lambda when the binder encounters the lambda. Rather, we have to put the lambda aside and say "come back to this thing later when we know what the target type is". In C# 2.0 anonymous method bodies were bound eagerly because we always had enough information to determine if there was an error inside the anonymous method even if we didn't know the target type. We could bind the body first, and then later on double-check during convertibility checking to make sure that the parameter types and return type were compatible with the delegate. Every expression type in the compiler worked this way: you do a full analysis of the expression, and then you see if it is compatible with the type that it is being converted to. With lambdas, the information flows in the opposite direction through the binder; first we have to know where we're going, and that then influences how the body is bound during the convertability checking.

He goes into more detail in the rest of the post, and I encourage you to follow the blog thread.
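Here is a small sketch of my own (not taken from Eric's post) that shows why the lambda body has to wait for the target type. The same lambda text, x => x + x, means integer addition in one assignment and string concatenation in the other, so the compiler cannot analyze the body until it knows which delegate type it is converting to. The anonymous method, by contrast, spells out its parameter type up front. The class and variable names are just for illustration:

```csharp
using System;

class TargetTypingSketch
{
    static void Main()
    {
        // Identical lambda text, two different target types:
        // here "+" binds to integer addition ...
        Func<int, int> doubleNumber = x => x + x;

        // ... and here "+" binds to string concatenation
        Func<string, string> doubleText = x => x + x;

        Console.WriteLine(doubleNumber(21));   // prints 42
        Console.WriteLine(doubleText("ab"));   // prints abab

        // The anonymous method declares its parameter type itself,
        // so its body can be analyzed before the target type is known
        Func<int, int> explicitTypes = delegate(int x) { return x + x; };
        Console.WriteLine(explicitTypes(4));   // prints 8
    }
}
```

Once binding is done, all three end up as ordinary delegates pointing at generated methods, just like the IL above. The difference is in how the compiler gets there, not in what it emits.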

Jeremy Likness