Wednesday, April 27, 2011

Supporting Non-Supported Entities like IEnumerable in Sterling

While the Sterling NoSQL database supports many types out of the box, there are some it simply doesn't know enough about. For example, consider a property defined as IEnumerable. Unlike a list, an enumerable is not bounded. It simply exposes an iterator, and for all we know, that iterator might come from an iterator method that produces random numbers in an infinite loop. The property doesn't reveal what the backing store is, and while Sterling could guess that it is a list, an array, or some other structure, guessing doesn't cut it when faithfully persisting an object graph.
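To make the "not bounded" point concrete, here is a minimal sketch of an iterator method that satisfies IEnumerable<int> yet never ends. The names UnboundedExample, RandomNumbers and TakeFirst are made up for illustration and are not part of Sterling:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnboundedExample
{
    // A perfectly valid IEnumerable<int> that never finishes enumerating.
    public static IEnumerable<int> RandomNumbers(int seed)
    {
        var random = new Random(seed);
        while (true)
        {
            yield return random.Next();
        }
    }

    // The caller has to impose the bound; the type itself gives no hint of one.
    public static List<int> TakeFirst(int count, int seed)
    {
        return RandomNumbers(seed).Take(count).ToList();
    }
}
```

A serializer that tried to enumerate RandomNumbers to the end would never return, which is why a type that only promises IEnumerable cannot be persisted by guesswork.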

Fortunately, Sterling was designed with extensibility in mind. If it can't infer how to handle a structure, you can train it using custom serializers. The latest changeset (Sterling Changeset 76881, part of the path from the 1.4 beta to 1.5 RTM) exposes some key Sterling internals that help with this process. In the 1.0 version of Sterling, a custom serializer had to manage all serialization from its node down. There was no "re-entry" into the recursive model that Sterling uses to walk object graphs, handle foreign keys and manage lists, dictionaries, and other structures. This limited the flexibility of custom serializers. The latest changeset, however, exposes the "serialization helper" that performs most of Sterling's heavy lifting. That makes it easy to create extension points to save IEnumerable lists or even teach Sterling how to populate private collections (by default, Sterling only handles public fields and public properties with getters and setters).

Here is an example taken directly from the core Sterling tests. The hypothetical "non-supported" class is based on IEnumerable:

public class NotSupportedList : IEnumerable<TestModel>
{
    private readonly List<TestModel> _list = new List<TestModel>();

    public void Add(IEnumerable<TestModel> newItems)
    {
        _list.AddRange(newItems);
    }

    public IEnumerator<TestModel> GetEnumerator()
    {
        return _list.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Obviously this is a contrived example, but it is common to use this pattern to provide read-only lists and hide the management of the list from the processing or iterating of it. Here is a non-supported class that holds the list (non-supported because the list property would ordinarily cause Sterling to choke):

public class NotSupportedClass
{
    public NotSupportedClass()
    {
        InnerList = new NotSupportedList();
    }

    public int Id { get; set; }

    public NotSupportedList InnerList { get; set; }
}

The TestModel is a helper class defined for Sterling tests that demonstrates a lot of the built-in features. It contains nested classes, various data types, sub-lists and more. Serializing it means traversing a rich object graph. In addition, it is defined as a table in the database, so after saving a class that holds these instances, it should be possible to query the test models directly and load them individually.

Now that the class is defined, Sterling just needs to "learn" how to manage the non-supported list. This is done by a custom serializer. Custom serializers derive from BaseSerializer. They indicate the types they support by overriding the CanSerialize method. A serializer can handle one or many types. For example, if you wanted to create a serializer that manages any instance that implements an interface ICustomInterface, you would specify support by returning:

return typeof(ICustomInterface).IsAssignableFrom(targetType);
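As a standalone illustration of how that check behaves (no Sterling types involved), the sketch below applies the same expression to a few types. CustomerRecord, Unrelated and CanSerializeCheck are hypothetical names invented for this illustration:

```csharp
using System;

public interface ICustomInterface { }

// Hypothetical types, used only to exercise the check.
public class CustomerRecord : ICustomInterface { }
public class Unrelated { }

public static class CanSerializeCheck
{
    // Same expression a CanSerialize override would return.
    public static bool CanSerialize(Type targetType)
    {
        return typeof(ICustomInterface).IsAssignableFrom(targetType);
    }
}
```

With these definitions the check accepts CustomerRecord and the interface type itself, and rejects Unrelated. Note the direction of the call: the interface type is the receiver and the candidate type is the argument.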

The serializer for this walkthrough only needs to support the "non-supported" list. To serialize the property, its contents are simply copied into a list (which Sterling understands) and the list is passed off to the serialization helper. This is where the power of the recursive serialization comes into play: if any other classes or tables exist in the list, Sterling will automatically handle those directly without further intervention from the developer. This will be evidenced by the recursive parsing of the TestModel object graph. Here is the custom serializer:

public class SupportSerializer : BaseSerializer
{
    public override bool CanSerialize(Type targetType)
    {
        // only support the "non-supported" list
        return targetType.Equals(typeof (NotSupportedList));
    }

    public override void Serialize(object target, BinaryWriter writer)
    {
        // turn it into a list and save it 
        var list = new List<TestModel>((NotSupportedList) target);

        // this takes advantage of the special save wrapper for injecting into the stream
        TestCustomSerializer.DatabaseInstance.Helper.Save(list, writer);
    }

    public override object Deserialize(Type type, BinaryReader reader)
    {
        // grab it as a list - again, unwrapped from a node and returned
        var list = TestCustomSerializer.DatabaseInstance.Helper.Load<List<TestModel>>(reader);
        return new NotSupportedList {list};
    }
}

Note that collection initializers call the type's Add method for each element listed. That is why the NotSupportedList can be returned using collection initializer syntax: the loaded list is passed to the public Add method, which stores its items internally.
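The initializer behavior can be seen in isolation with a small sketch. Batch and InitializerDemo are hypothetical names that mirror the shape of NotSupportedList without depending on Sterling or TestModel:

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical type: enumerable, with an Add method that accepts a whole sequence.
public class Batch : IEnumerable<string>
{
    private readonly List<string> _items = new List<string>();

    public void Add(IEnumerable<string> newItems)
    {
        _items.AddRange(newItems);
    }

    public IEnumerator<string> GetEnumerator()
    {
        return _items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class InitializerDemo
{
    public static Batch Build()
    {
        var source = new List<string> { "a", "b", "c" };

        // The compiler turns this into: var b = new Batch(); b.Add(source);
        return new Batch { source };
    }
}
```

The type must implement IEnumerable and expose an applicable Add method for the initializer syntax to compile. Here a single initializer element supplies all three strings because Add takes a sequence.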

As you can see, this is very straightforward: the only requirement is to convert the non-supported enumerable to a list for saving, then load that list and wrap it back in the original type on deserialization. The serialization helper exposes the functionality needed. Sterling does not require you to modify your types in any way. Instead, you create a database instance and describe the objects and keys to Sterling. Here is the test database definition:

public class CustomSerializerDatabase : BaseDatabaseInstance
{        
    protected override List<ITableDefinition> RegisterTables()
    {
        return new List<ITableDefinition>
                        {
                            CreateTableDefinition<NotSupportedClass, int>(t=>t.Id),
                            CreateTableDefinition<TestModel,int>(t=>t.Key)
                        };
    }
}

The definition includes the type of the class and the type of the key, then a lambda expression that exposes how to access the key. Saving and loading are simple, as you will see in the test. Setting up the database only requires registering the custom serializer and the database type:

[TestInitialize]
public void TestInit()
{
    _engine = new SterlingEngine();
    _engine.SterlingDatabase.RegisterSerializer<SupportSerializer>();            
    _engine.Activate();
    DatabaseInstance = _engine.SterlingDatabase.RegisterDatabase<CustomSerializerDatabase>();
    DatabaseInstance.Purge();
}

The purge is just there to clear any previous data from prior test runs and would not be part of a normal production application. Now for the test. The test creates the object and stores several test models, then saves it. It confirms that not only was the main class saved, but that independent test model instances were automatically saved as well using Sterling's built-in foreign key functionality. It then reloads the class and confirms that it reloaded correctly - in this case just validating the keys on the test models, as the full object graph is validated with other tests.

[TestMethod]
public void TestCustomSaveAndLoad()
{
    var expectedList = new[] {TestModel.MakeTestModel(), TestModel.MakeTestModel(), TestModel.MakeTestModel()};
    var expected = new NotSupportedClass {Id = 1};
    expected.InnerList.Add(expectedList);

    var key = DatabaseInstance.Save(expected);

    // confirm the test models were saved as "foreign keys" 
    var count = DatabaseInstance.Query<TestModel, int>().Count();

    Assert.AreEqual(expectedList.Length, count, "Load failed: test models were not saved independently.");

    var actual = DatabaseInstance.Load<NotSupportedClass>(key);
    Assert.IsNotNull(actual, "Load failed: instance is null.");
    Assert.AreEqual(expected.Id, actual.Id, "Load failed: key mismatch.");

    // copy the enumerable into a list so it can be counted and searched
    var actualList = new List<TestModel>(actual.InnerList);

    Assert.AreEqual(expectedList.Length, actualList.Count, "Load failed: mismatch in list.");

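    // look up each expected key in the loaded list; FirstOrDefault yields 0 when
    // no match exists, so the check relies on the test model keys being 1 or greater
    foreach (var matchingItem in
        expectedList.Select(item => (from i in actualList where i.Key.Equals(item.Key) select i.Key).FirstOrDefault())
                    .Where(matchingItem => matchingItem < 1))
    {
        Assert.Fail("Test failed: matching models not loaded.");
    }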
}        
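The key check at the end of the test could also be written as a set comparison, which does not depend on the default value of the key type. This is only a sketch of an alternative, not how the Sterling tests do it. KeyedItem is a hypothetical stand-in for TestModel, and SameKeys is a made-up helper:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for TestModel; only the key matters here.
public class KeyedItem
{
    public int Key { get; set; }
}

public static class KeyComparison
{
    // True when both sequences contain exactly the same set of keys.
    public static bool SameKeys(IEnumerable<KeyedItem> expected, IEnumerable<KeyedItem> actual)
    {
        var expectedKeys = new HashSet<int>(expected.Select(item => item.Key));
        return expectedKeys.SetEquals(actual.Select(item => item.Key));
    }
}
```

Because it compares sets, this version ignores ordering and duplicates, so it would sit alongside the count assertion rather than replace it.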

That's it. As you can see, the current Sterling model makes it simple to save even complicated object graphs and retrieve them with straightforward queries and load statements. The 1.5 release will be available before the end of May.

Jeremy Likness

1 comment:

  1. very cool. I can't wait to get some time to use Sterling.

    chadbr
