Friday, July 2, 2010

Introducing Sterling, the Object-Oriented Database for Silverlight and Windows Phone 7

Today, I'm excited to share a project that I've been working on for some time now. The project, called Sterling (for Sterling Silverlight, of course!) is a very light weight approach to handling serialization and deserialization to and from isolated storage. I call it an object-oriented database because it provides LINQ to Object queries over keys and indexes.

I guess the timing works well ... I was recently honored with the 2010 Microsoft Most Valuable Professional (MVP) award for my work with Silverlight. Much of this award is about what we do for the community, and this is one of my first major efforts to put a utility and open source project out there. Lots of hours went into this but I hope there will be great benefits for those of you who find value in using it.

Sterling is not yet released and I don't anticipate we'll have an alpha ready until late July. I work on many projects for my company, so this effort has been primarily late nights or snatches during lunch and breaks and will probably continue to be so. I'm excited that some members of the community have offered to reach out and help push development forward.

First, let me emphasize that Sterling is not intended as a replacement for a robust, transactional database system. There are some commercial and non-commercial solutions that are optimized for performance and massive scale. I wrote Sterling to be a simple serialization engine to avoid having to deal with the goop of writing my own serializers all of the time. I also know that even if you have a simple list of contacts, querying the contact names is going to be a priority over deserializing all of them, so I built a key and index infrastructure to facilitate storing key items in memory so you can bind to lists and combo-boxes without de-serializing the entire object.

Having said all of this, while a lot of more powerful features may not exist, my goal was to hit 80% of the needs with 20% of the footprint. Not only is Sterling incredibly lightweight, but extensible and available on Windows Phone 7.

Here are the key goals I had in mind:

Non-intrusive

I don't like mangling my classes to persist them. I didn't want to force anyone to inherit from a base class or decorate the class with attributes to make it work. I believe I've achieved this goal. Defining a "table" to sterling is as simple as passing a type and a lambda expression that returns the key:

public override List<ITableDefinition> _RegisterTables()
{
 return new List<ITableDefinition>
                       {
                           CreateTableDefinition<Contact,string>(c => c.Email)
                       }
}

I can even do this for sealed classes or third-party classes - whatever your code can see, Sterling can, too. Sterling automatically serializes the same values that the BinaryWriter class supports. If you need something custom, no problem - just define a serializer of your own and register it with Sterling.

Lightweight

I wanted Sterling to fit on the Windows Phone 7 and also to facilitate building projects without bloating them. I purposefully kept it lightweight. Currently the DLL weighs in at only 70 kilobytes, which I believe is very trivial compared to some solutions I've seen.

Flexible and Portable

Again, these are both features that collapse into the previous items. By keeping it flexible, I can accommodate needs I didn't know at design time. There is a very loose interface for logging and for extending serialization. This allows you to do pretty much whatever you like, without having to think about the underpinnings of setting up tables and folders and checking if they exist, etc. Portability means it was easy to build on Windows Phone 7 and should be very easy to bring forward into future versions of Silverlight.

The Reference Application

The easiest way to get to know Sterling is by the reference application. I will warn you that it takes a long time to build because of the time it takes to initially serialize. I decided a good test case would be the USRDA nutrient database. It contains over 500,000 data elements. I had to write some parsers to take the source text-based database files and turn them into data objects that Sterling could deal with, but once the conversion is done, you can see the power of how Sterling operates.

The left column shows a custom logger I made that is spitting out Sterling log information. The data models were straightforward. A food group is a general "category" for foods to fall under. The food description is an actual food item, and contains a list of nutrient data elements (things like calories, protein, vitamin A, etc). These point to a nutrient definition.

public class FoodGroup
{
    public int Id { get; set; }
    public string GroupName { get; set; }
}

public class NutrientDefinition
{
    public int Id { get; set; }

    public string UnitOfMeasure { get; set; }

    public string Tag { get; set; }

    public string Description { get; set; }

    public int SortOrder { get; set; }
}

public struct NutrientDataElement
{
    public int NutrientDefinitionId { get; set; }
    public double AmountPerHundredGrams { get; set; }
}

public class FoodDescription
{
    public FoodDescription()
    {
        Nutrients = new List();
    }

    public int Id { get; set; }

    public int FoodGroupId { get; set; }

    public string Description { get; set; }

    public string Abbreviated { get; set; }

    public string CommonName { get; set; }

    public string Manufacturer { get; set; }

    public string InedibleParts { get; set; }

    public double PctRefuse { get; set; }

    public string ScientificName { get; set; }

    public double NitrogenFactor { get; set; }

    public double ProteinCalories { get; set; }

    public double FatCalories { get; set; }

    public double CarbohydrateCalories { get; set; }

    public List&t;NutrientDataElement> Nutrients { get; set; }
}

Because I have a struct for the nutrient information, I had to provide a custom serializer:

public class FoodSerializer : BaseSerializer  
{
    public override bool CanSerialize(Type targetType)
    {
        return targetType.Equals(typeof (NutrientDataElement));                
    }

    public override void Serialize(object target, BinaryWriter writer)
    {
        var data = (NutrientDataElement)target;
        writer.Write(data.NutrientDefinitionId);
        writer.Write(data.AmountPerHundredGrams);
    }

    public override object Deserialize(Type type, BinaryReader reader)
    {
        return new NutrientDataElement
                    {
                        NutrientDefinitionId = reader.ReadInt32(),
                        AmountPerHundredGrams = reader.ReadDouble()
                    };
    }
}

As you can see, fast and easy to do. Defining the tables took a little bit of thought. I wanted a "covered index" for food groups so I wouldn't have to de-serialize them at all. For the food descriptions, I needed an index on description and food group for fast filtering and sorting. Finally, the nutrient definitions provided an index for unit of measure and sort order (so they sort consistently in each food item).

public class FoodDatabase : BaseDatabaseInstance
{
    public override string Name
    {
        get { return "Type Database"; }
    }

    public const string FOOD_GROUP_NAME = "FoodGroup_GroupName";
    public const string FOOD_DESCRIPTION_DESC_GROUP = "FoodDescription_Description_Group";
    public const string NUTR_DEFINITION_UNITS_DESC = "NutrientDefinition_Units_Description";
    public const string NUTR_DEFINITION_SORT = "NutrientDefinition_Sort";
        
    protected override List<ITableDefinition> _RegisterTables()
    {
        return new List<ITableDefinition>
                    {
                        CreateTableDefinition<FoodGroup, int>(fg => fg.Id)
                            .WithIndex<FoodGroup, string, int>(FOOD_GROUP_NAME, fg => fg.GroupName),
                        CreateTableDefinition<FoodDescription, int>(fd => fd.Id)
                            .WithIndex<FoodDescription, string, int, int>(FOOD_DESCRIPTION_DESC_GROUP,
                                                                            fd =>
                                                                            Tuple.Create(fd.Description, fd.FoodGroupId)),
                        CreateTableDefinition<NutrientDefinition,int>(nd=>nd.Id)
                            .WithIndex<NutrientDefinition,string,string,int>(NUTR_DEFINITION_UNITS_DESC,
                            nd=>Tuple.Create(nd.UnitOfMeasure,nd.Description))
                            .WithIndex<NutrientDefinition,int,int>(NUTR_DEFINITION_SORT,
                            nd=>nd.SortOrder)
                    };
    }
}

The main view model allows you to select groups and enter search terms, displays the food items, and then will show a chart breaking out nutrition information when you click on the food item. The food group list is queried using the index like this:

public IEnumerable<FoodGroup> FoodGroups
{
    get
    {
        return DesignerProperties.IsInDesignTool
                    ? _samples.AsEnumerable()
                    : from fg in
                            SterlingService.Current.Database.Query<FoodGroup, string, int>(
                                FoodDatabase.FOOD_GROUP_NAME)
                        select new FoodGroup {Id = fg.Key, GroupName = fg.Index};
    }
}

Note that I have a default list for design-time, otherwise I run the actual query.

When you hover over a food group, I supply a tool-tip to show the number of food items in that group. I decided to go ahead and store these in a dictionary after querying the first time, but only because I know they won't change. The converter looks like this:

private readonly Dictionary<int,int> _foodCounts = new Dictionary<int,int>();

public object Convert(object value, Type targetType, object parameter, CultureInfo culture)
{
    var count = DesignerProperties.IsInDesignTool ? 500 : 0;
            
    var foodGroup = value as FoodGroup;
    if (foodGroup != null && !DesignerProperties.IsInDesignTool)
    {
        if (_foodCounts.ContainsKey(foodGroup.Id))
        {
            count = _foodCounts[foodGroup.Id];
        }
        else
        {
            count =
                (from index in
                        SterlingService.Current.Database.Query<FoodDescription, string, int, int>(
                            FoodDatabase.FOOD_DESCRIPTION_DESC_GROUP)
                    where index.Index.Item2.Equals(foodGroup.Id)
                    select index).Count();
            _foodCounts.Add(foodGroup.Id,count);
        }
    }

    return string.Format("There are {0} food items in this food group.", count);
}

Here you can see I'm providing a design-time default. If I'm not in the designer, then I go ahead and calculate the amount and save it using the index I created. The index has two values (description and food group key) so I use a Tuple to access the value. None of this requires any de-serialization because I'm only touching the index.

Food group counts

The search view model uses the same index to pull food items. The query is a bit more complex. I'm not allowing you to search all 10,000 food items. You must narrow it to a category or enter at least three characters of a search (the search is a containing search, not a "starts with" or "ends with").

public IEnumerable<FoodDescriptionIndex> SearchResults
{
    get
    {
        if (DesignerProperties.IsInDesignTool)
            return _sampleDescriptions.AsEnumerable();

        if (_currentGroup != null)
        {
            if (string.IsNullOrEmpty(_searchText) || _searchText.Length < 3)
            {
                var query1 = from fg in
                                    SterlingService.Current.Database.Query
                                    <FoodDescription, string, int, int>(
                                        FoodDatabase.FOOD_DESCRIPTION_DESC_GROUP)
                                where
                                    fg.Index.Item2.Equals(_currentGroup.Id)
                                select
                                    new FoodDescriptionIndex {Id = fg.Key, Description = fg.Index.Item1};

                return query1.Count() == 0 ? _noResults.AsEnumerable() : query1;
            }

            // group and search text)
            var query2 = from fg in
                        SterlingService.Current.Database.Query
                        <FoodDescription, string, int, int>(
                            FoodDatabase.FOOD_DESCRIPTION_DESC_GROUP)
                    where
                        fg.Index.Item2.Equals(_currentGroup.Id) &&
                        fg.Index.Item1.ToUpperInvariant().Contains(_searchText.ToUpperInvariant())
                    select
                        new FoodDescriptionIndex {Id = fg.Key, Description = fg.Index.Item1};
                        
            return query2.Count() == 0 ? _noResults.AsEnumerable() : query2;
        }

        if (string.IsNullOrEmpty(_searchText) || _searchText.Length < 3)
            return _noResults.AsEnumerable();

        var query3 = from fg in
                    SterlingService.Current.Database.Query
                    <FoodDescription, string, int, int>(
                        FoodDatabase.FOOD_DESCRIPTION_DESC_GROUP)
                where
                    fg.Index.Item1.ToUpperInvariant().Contains(_searchText.ToUpperInvariant())
                select
                    new FoodDescriptionIndex { Id = fg.Key, Description = fg.Index.Item1 };
        return query3.Count() == 0 ? _noResults.AsEnumerable() : query3; 
    }
}

Food items

Notice how I'm using a "covered index" (covered for the items I need, which include the key and description) to query and return a list of types that are bound to the list box. You can see when you run the example this happens very fast. Finally, when you click on the item, I am always deserializing rather than trying to cache a ton of objects.

The food description context class synchronizes the current food item between view models. Notice when it is passed a value, it loads the new item:

public class FoodDescriptionContext : BaseNotify 
{
    public static FoodDescriptionContext Current = new FoodDescriptionContext();

    public FoodDescription CurrentFoodDescription { get; private set; }

    private int _foodDescriptionId; 

    public int FoodDescriptionId
    {
        get { return _foodDescriptionId; }
        set
        {
            _foodDescriptionId = value;
            CurrentFoodDescription = SterlingService.Current.Database.Load<FoodDescription>(value);
            RaisePropertyChanged(()=>CurrentFoodDescription);
            RaisePropertyChanged(()=>FoodDescriptionId);
        }
    }

}

This isn't thread-safe, but does it need to be? The user can only click on one item at a time.

Food detail

Finally, the whole engine is configured using an application service.

There's a lot more to it but I wanted to get this out there and have people start looking at it to provide me with feedback. While there is not yet a release, you can visit the Sterling Codeplex site to download the code (pre-alpha, so use at your own risk, right?) and build/test/integrate on your own. Let me know what you like and don't like and if you are interested in being a serious beta tester, and we'll see what we can do to release a solid version 1.

(PS, I took a simple list project for the Windows Phone 7 just to prove the engine works there - I simply save the list to the database then bind to the query - but I'm looking for a more comprehensive example there, so volunteers are welcome to work on that as well!)

Jeremy Likness

27 comments:

  1. Congratulations on the MVP award, Jeremy... well deserved!

    -Dave

    ReplyDelete
  2. Sterling sounds very interesting Jeremy, are you planning to release it as a commercial or free product?

    I am working on an offline app that needs to read and write XML files and serialize objects to local storage but I may hold off if this would suit the purpose better.

    ReplyDelete
  3. Rodney, Sterling is a free, open source product. In fact, you can pull down the bits and start working with it right now at the CodePlex site (http://sterling.codeplex.com/) ... there is not an official release yet but we should be alpha in July and hopefully have a solid RTM soon thereafter. In the meantime, feel free to kick the tires and see how well it integrates with your project and your feedback will be very valuable to me for finalizing the release.

    ReplyDelete
  4. Jeremy,
    Congrats on your MVP award. Nice to see a new contribution in the Silverlight IsolatedStorage object database space. I've followed your "Stuff" and it's first class.

    ReplyDelete
  5. A very interesting project. Looking forward to watching this progress. Very best of luck too!

    ReplyDelete
  6. Cool.

    Just I'm looking for the link to download the USDA (not USRDA right?) database as I'm interested too...

    Thanks

    ReplyDelete
  7. Here is the link to download the databases from the USDA (yes, you are right, not USRDA, the USRDA is a set of guidelines establish by the USDA).

    http://www.ars.usda.gov/Services/docs.htm?docid=5121

    ReplyDelete
  8. Hey Jeremy,

    Quick question -- is it possible to populate the database in one app, and somehow transfer that database to another app?

    I have an application I need to build shortly that has a complex schema; I can't manually insert records/objects, so I need to build a tool to create and populate objects into the database. Since Sterling uses IsolatedStorage, that hints at "no, you can't pre-populate and move the DB."

    Is it possible?

    ReplyDelete
  9. @ashiq: Programmatically it would be possible, with a bit of work. Basically, from Application "A" you can access the isolated storage, and stream it into a big chunk that you use the save file dialog to allow the user to "download." Then, in the other application, you allow an upload and parse out the file to save to disk. Thanks for the idea, I may extend Sterling to make this a native feature (import/export database). Manually, you simply navigate to the Sterling directory in isolated storage, physically copy the files to the new app, and Sterling will parse them in as if they were native.

    ReplyDelete
  10. Hi Jeremy,

    I was taking a look at this again -- I think I misrepresented my original requirements and your subsequent answer. The short of it is that I would like to include the database WITH the application as an Embedded Resource type, so that whoever gets hold of it gets the "startup" database -- I can't control isolated storage outside of my own PC :)

    It seems to me that this boils down to being able to serialize/deserialize from an (embedded) file, instead of isolated storage.

    I'm currently looking at a different DB, but if that one can't easily do what I want, I'll be back ... !

    ReplyDelete
  11. Hi Jeremy!

    Is the full source code for this project available for download at all? This project links well into a project that I am currently working on.

    Thanks.

    ReplyDelete
  12. Yes, full source code is available at the site:
    http://sterling.codeplex.com/

    ReplyDelete
  13. Sorry, I was meaning the source code for the above nutrient application?

    Thanks.

    ReplyDelete
  14. Yes, full source for the example is available. Just navigate to this link:

    http://sterling.codeplex.com/SourceControl/list/changesets

    And choose the "Download" in the upper right (latest version) and that includes the code for the nutrient database example.

    ReplyDelete
  15. Excellent! :) Which solution package is it packaged into? I'm guessing Wintellect.Sterling but I can't seem to locate the nutrient frontend?

    Thanks.

    ReplyDelete
  16. So it just runs as a standalone project, that's fine....I thought I had to add it to the Wintellect.Sterling solution for access reasons etc...DOH!

    Thanks.

    ReplyDelete
  17. Is it possible to write a service to load data into Sterling from a SharePoint 2010 foundation site? I'm hoping to load data just like your nutrient example but with the source being multiple SharePoint lists?

    Thanks Jeremy.

    ReplyDelete
  18. Yes, very possible. Using the Silverlight client you could query the CAML from SharePoint and inject into the Silverlight database.

    ReplyDelete
  19. Any tips on how I would do this injection from CAML to the silverlight db?

    Thanks.

    ReplyDelete
  20. Hi Jeremy,

    I've copied your FoodDescription.xaml page and named it FoodGroupDescription.xaml. I'm trying to show another chart view when selecting a food group, not just when selecting a food description. I have called the FoodGroupDescription view within the MainView.xaml page and ammended the FoodGroup class to accomodate for the extra data that the chart requires and physically added the data to the .txt file but upon running the project my new chart doesn't appear? Do I have to instanciate the chart anywhere, how is the chart physically called to appear? Thanks.

    ReplyDelete
  21. I'd have to see some more of your code to really break it down. The TXT is parsed to entities, so those entities would need to hold the data, and then there is a another view that holds the chart - you'd want to swap that out and also listen for the selection event in the group.

    ReplyDelete
  22. Hi Jeremy!

    I have now altered the parser class to build the correct food group entity from the TXT. I have also copied out "FoodDescriptionContext.cs" and "FoodDescriptionViewModel.cs" into "FoodGroupDescriptionContext.cs" and "FoodGroupDescriptionViewModel.cs" and made sure that the selections reflect the data from their corresponding entities. From what I can see the views are binding up correctly also but the chart still doesn't show once I run the project? :(

    Thanks.

    ReplyDelete
  23. Can I implement your SterlingExample into a Silverlight Business Application which uses WCF RIA Services?

    Greatful.

    ReplyDelete
  24. Would I be best to copy all the Models and Views etc from your SterlingExample project into my new Silverlight Business Application structure or should I just add your SterlingExample project to the solution? I'm wanting to take advantage of the default RIA login pages etc and the easy WCF services integration for injecting data into the app.

    Thanks.

    ReplyDelete