Reflection, Lambda, and Expression Magic

Sometimes I love how a little reflection can work magic. In this case I was building what I'll call a "non-intrusive" validation system. The project contains entities that are generated by templates, and it would be extremely difficult to crack open those templates to put in a lot of custom validation code. So I decided to approach it using a helper class that would attach to the generated entities.

As always, I work backwards. I like to have readable, fluent interfaces, so I decide what I want them to look like and then implement the classes and interfaces to make them work. In this case the entities implemented INotifyPropertyChanged and IDataErrorInfo. This makes it easy on the validation class because if it needs to validate a property, it can simply hook into the property change notification and pick up the new value for validation. Problem solved.

The validation mechanism ended up looking like this:

myClass.GetValidationHelper()
   .WithValidation<FieldIsRequired>(()=>MyField)
   .WithValidation<FieldIsInRange>(()=>MyField,0,999);

So, how do we get there? I'm not going to give you all of the code but I'll touch on a few salient points. First, creating a common interface for validations makes it easy for the validation helper to look them up in a dictionary keyed by property name (i.e. if property "x" comes in, run all validations for "x"). Giving the validators constructors makes it easy to pass parameters. FieldIsInRange, for example, has a constructor that takes the minimum and maximum values. When you create an instance through reflection, you pass in the arguments and it finds the matching constructor for you. This means I can have an extension method like this:

public static ValidationHelper WithValidation<TValidator, TProperty>(
   this ValidationHelper helper,
   Expression<Func<TProperty>> propertyExpression,
   params object[] args)
   where TValidator : IMyValidator

And I can simply pass the arguments through to reflection and it will match the appropriate constructor like this:

var validator = (TValidator)Activator
    .CreateInstance(typeof(TValidator), args);
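To make the constructor matching concrete, here is a minimal sketch under a few assumptions. The article does not show the members of IMyValidator or the bodies of the validators, so the IsValid method, the validator implementations, and the ValidatorFactory class below are hypothetical stand-ins rather than the original code. Only the Activator.CreateInstance call mirrors the snippet above.

```csharp
using System;

// Hypothetical contract: the real IMyValidator members are not shown in the article.
public interface IMyValidator
{
    bool IsValid(object value);
}

// No constructor arguments, so Activator uses the default constructor.
public class FieldIsRequired : IMyValidator
{
    public bool IsValid(object value)
    {
        var text = value as string;
        // Non-null, and if it is a string it must contain something besides whitespace.
        return value != null && (text == null || text.Trim().Length > 0);
    }
}

public class FieldIsInRange : IMyValidator
{
    private readonly int _min;
    private readonly int _max;

    // Activator.CreateInstance(typeof(FieldIsInRange), 0, 999) binds to this constructor.
    public FieldIsInRange(int min, int max)
    {
        _min = min;
        _max = max;
    }

    public bool IsValid(object value)
    {
        return value is int && (int)value >= _min && (int)value <= _max;
    }
}

// Hypothetical helper that isolates the reflection call.
public static class ValidatorFactory
{
    public static TValidator Create<TValidator>(params object[] args)
        where TValidator : IMyValidator
    {
        // Throws MissingMethodException when no constructor matches the arguments.
        return (TValidator)Activator.CreateInstance(typeof(TValidator), args);
    }
}
```

With these types, ValidatorFactory.Create<FieldIsInRange>(0, 999) builds a range validator and ValidatorFactory.Create<FieldIsRequired>() builds the parameterless one. One caveat on the fluent call shown earlier: C# either infers all of a method's type arguments or none of them, so a call that supplies only the validator type needs an overload with a single type parameter, or both type arguments spelled out.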

The first parameter after the "this" parameter (the one that makes the method an extension of the helper) is a property expression. It may look familiar because that's the guidance that Prism provides for strongly-typed property names, and it has been the topic of much debate. I settled on that format for my Jounce framework and while I initially used it to get property names, it actually comes in very handy for validation.

I won't rehash extracting the property name because it's been done. There are lots of debates over the relative performance of different methods. Here's a hint: for me, most property change notification exists so I know when the user changes a field. Last time I checked, a user can't type fast enough to make my property changes fire so often that I'm going to lose sleep over milliseconds here or there. I'd rather put in some plumbing and have something strongly typed, so I don't have refactoring nightmares, than have something that is convoluted to implement and code but saves a tiny fraction of time that the user won't even notice.

You can see the source code in the ExtractPropertyName method by navigating to the BaseNotify class in Jounce - here is a direct link to the source for your viewing pleasure.
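For readers without the Jounce source at hand, the usual technique can be sketched roughly as follows. This is not Jounce's ExtractPropertyName implementation. PropertyNames, Extract, and the Customer class are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity used only to exercise the helper.
public class Customer
{
    public string Name { get; set; }
}

public static class PropertyNames
{
    public static string Extract<TProperty>(Expression<Func<TProperty>> propertyExpression)
    {
        if (propertyExpression == null)
        {
            throw new ArgumentNullException("propertyExpression");
        }

        var body = propertyExpression.Body;

        // A conversion (for example, boxing to object) wraps the member access
        // in a Convert node, so unwrap it first.
        var unary = body as UnaryExpression;
        if (unary != null && unary.NodeType == ExpressionType.Convert)
        {
            body = unary.Operand;
        }

        var member = body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException(
                "Expression must be a simple property access.", "propertyExpression");
        }

        return member.Member.Name;
    }
}
```

Given var customer = new Customer(), the call PropertyNames.Extract(() => customer.Name) returns "Name", and renaming the property in the IDE updates the lambda along with it.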

In the validation engine, we get an added benefit. What's passed in with this format:

CallSomeMethod(()=>myEntity.MyProperty);

has a specific type. It's an expression that wraps a function returning the property value, and it looks like this: Expression<Func<TProperty>>. Pretty cool. So we can walk the expression tree and get the property name. Once the property change notification fires, we can inspect the event arguments to see the property name, and if it is one we want to validate, we go ahead and perform the validation. But of course, to validate it, we need to get the value. How do we do it?

Often I see the solution of taking the property name that's been extracted from the expression and then using reflection to find the getter and invoke it, like this:

var value = (TProperty) typeof(T)
                  .GetProperty(propertyName)
                  .GetGetMethod()
                  .Invoke( instance, null );

That will work, but it's like tying your arms into a pretzel and reaching around your back to scratch your nose. Doesn't really make sense when you've got the expression already — what was passed in? It's a function that resolves the value! So we use some expression magic to run the lambda expression.

Given the propertyExpression passed in with the signature Expression<Func<TProperty>>, all you have to do is compile the expression and invoke the result. The following code extracts the value, first by compiling the expression into a delegate, then by calling that delegate:

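// Compile the expression tree into a Func<TProperty> delegate, then invoke it.
// The delegate already returns TProperty, so no cast is needed.
var lambdaExpression = Expression.Lambda<Func<TProperty>>(
   propertyExpression.Body).Compile();
var value = lambdaExpression();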
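Because compiling an expression tree costs far more than invoking the resulting delegate, a validation helper that re-reads the same property on every change notification would probably want to compile once and keep the delegate. The sketch below shows that idea. PropertyReader and Person are hypothetical names, and it calls Compile() directly on the expression, which is possible because the expression is already a parameterless lambda.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity used only for the demonstration.
public class Person
{
    public int Age { get; set; }
}

// Hypothetical wrapper: compiles the property expression once and reuses the delegate.
public class PropertyReader<TProperty>
{
    private readonly Func<TProperty> _getter;

    public PropertyReader(Expression<Func<TProperty>> propertyExpression)
    {
        if (propertyExpression == null)
        {
            throw new ArgumentNullException("propertyExpression");
        }

        // Compile once here; each Read() is then an ordinary delegate call.
        _getter = propertyExpression.Compile();
    }

    // Returns the property's value at the time of the call, not at construction.
    public TProperty Read()
    {
        return _getter();
    }
}
```

For example, after var person = new Person { Age = 30 } and var reader = new PropertyReader<int>(() => person.Age), setting person.Age = 31 makes reader.Read() return 31, because the delegate evaluates the property each time it is invoked.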

Now you can pass that value (which was resolved to the most current value of the property, without using reflection) into the validation routines. One of those validation routines is a generic routine to compare values. Why write this multiple times when the framework provides a nice IComparable interface for things that can be compared? The framework I built works with strings, so my first challenge was parsing the strings to the target type before comparing them. This is where we can have some fun.

Here is the code to parse any string to any type ... provided the type supports the TryParse method. First, you get the type and you create a variable to hold the output of the parse call:

public static bool TryParse<T>(string input, out T value)
{
   var type = typeof(T);
   value = default(T);
   ...
}

Next, you build the parameter types used to look up the method. You want the overload that takes a string as the first parameter and a by-reference parameter of the target type as the second. Here's where it is important to know your C# and your CLR. The "out" keyword is a language construct enforced by the compiler. As far as the reflection lookup is concerned, an out parameter is simply a parameter passed by reference, just like ref. You might not know that there are special types that represent "by reference" - read the documentation here for more information. If you suffix a type name with an ampersand (&) you are flagging that type to be passed by reference. (Type.MakeByRefType() returns the same by-reference type without going through a name lookup.)

So, here's the rest of the method down to the call:

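var parameters = new[]
{
   typeof(string),
   // The by-reference form of T (its name is the type name suffixed with "&").
   // MakeByRefType avoids Type.GetType, which returns null for names it
   // cannot resolve without an assembly-qualified name.
   type.MakeByRefType()
};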

var method = type.GetMethod(
   "TryParse",
   parameters);

if (method == null)
{
   return false;
}

var invokeParameters = new object[] { input, value };
var returnValue = (bool) method.Invoke(null, invokeParameters);

Now comes a slight trick that will cause problems if you're not aware of it. You tagged the parameter as by-reference, but Invoke takes its arguments as an object array, and most of the time your comparisons will be dealing with value types (integers, decimals, etc.), so the value is boxed and copied into that array. It's a common mistake to make the call above and assume your value variable will be populated when returnValue is true. Imagine your frustration when you run this and it is always set to the default value instead! What gives? It turns out the call does write the result back, but not to the variable you provided. It updates the element in the array you passed in!

The last step in the generic TryParse method is therefore to pull that value back out of the array and return the result:

public static bool TryParse<T>(string input, out T value)
{
   ...
   if (returnValue)
   {
      // Invoke wrote the parsed result into the argument array, not into value.
      value = (T)invokeParameters[1];
   }

   return returnValue;
}

And there you have it ... the generic try parse. Now if you want to compare the values, you can constrain the type to IComparable and simply use:

public int ParseAndCompare<T>(string valueOne, string valueTwo)
   where T: IComparable
{
   T parsedOne, parsedTwo;
   if (TryParse(valueOne, out parsedOne) &&
       TryParse(valueTwo, out parsedTwo))
   {
       return parsedOne.CompareTo(parsedTwo);
   }

   throw new ArgumentException("One of the values didn't parse.");
}
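Putting the fragments above together, the whole thing might look something like the sketch below. It follows the steps described in this post, with a few assumptions of mine: the GenericParser class name is hypothetical, the by-reference type comes from MakeByRefType rather than a name lookup, and the GetMethod overload with explicit binding flags is used so that only public static methods are considered.

```csharp
using System;
using System.Reflection;

// Hypothetical container class for the two generic helpers.
public static class GenericParser
{
    public static bool TryParse<T>(string input, out T value)
    {
        var type = typeof(T);
        value = default(T);

        // Look for: public static bool TryParse(string, out T)
        var method = type.GetMethod(
            "TryParse",
            BindingFlags.Public | BindingFlags.Static,
            null,
            new[] { typeof(string), type.MakeByRefType() },
            null);

        if (method == null || method.ReturnType != typeof(bool))
        {
            return false;
        }

        // The second slot receives the parsed result when Invoke returns.
        var invokeParameters = new object[] { input, value };
        var returnValue = (bool)method.Invoke(null, invokeParameters);

        if (returnValue)
        {
            // Read the result from the array; the local was only copied in.
            value = (T)invokeParameters[1];
        }

        return returnValue;
    }

    public static int ParseAndCompare<T>(string valueOne, string valueTwo)
        where T : IComparable
    {
        T parsedOne, parsedTwo;
        if (TryParse(valueOne, out parsedOne) &&
            TryParse(valueTwo, out parsedTwo))
        {
            return parsedOne.CompareTo(parsedTwo);
        }

        throw new ArgumentException("One of the values didn't parse.");
    }
}
```

As a quick check, GenericParser.ParseAndCompare<int>("10", "9") returns a positive number because the values are compared as integers. Compared as strings, "10" would sort before "9", which is exactly why the values are parsed first.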

I will admit the details of the implementation can get a little complex if you're not familiar with expression trees or reflection. However, the beauty of this approach is that the core pieces form the basic infrastructure the validations are built on, and they are not likely to change much. The payoff is enormous, because now a developer can write a new validation by implementing a simple interface and then add the validation using the fluent syntax I showed earlier. That makes the code both readable and easy to maintain.

C#er : IMage

Friday, July 29, 2011
