Tuesday, January 12, 2010

Quick Tip: Embedding and Accessing Schemas in .NET

I'm working on a project that requires extensive XML manipulation. While XML can be very powerful, XML without a schema is like JavaScript objects: no strong typing, and the wild west as far as standards are concerned. A good, solid XML document will have a schema to validate against.

It's really not that difficult, either. There are several online tools that can generate a schema template from a sample XML document, and Visual Studio provides full IntelliSense for building out your schema.

You can choose to publish the schema to a website for validation if it's an "open schema." If it's closed and you are simply using it for internal validation, then it makes sense to embed it as a resource.

A best practice is to validate your method parameters before acting on them. In my methods that take in the XML document, I want to validate that it maps to the schema or throw an exception right away.

First, I'll add the schema to the project, then right-click it, open Properties, and set its Build Action to Embedded Resource. This ensures it is compiled into the distributable binary.

Next, I create a helper class to access the schema. I'm almost always going to want the schema wrapped in a schema set for validation, so instead of having to remember how to parse the manifest and get the resource stream each time, why not do it like this:


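using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaAccess
{
    private const string SCHEMA = "MySchema.xsd";
    private const string SCHEMANAMESPACE = "http://mycompany.com/MySchema.xsd";

    public static XmlSchemaSet GetSchemaSet()
    {
        // swap the class name in this type's fully qualified name for the schema file name
        string schemaPath = typeof(SchemaAccess).FullName.Replace("SchemaAccess", SCHEMA);
        XmlSchemaSet schemas = new XmlSchemaSet();
        using (Stream stream = typeof(SchemaAccess).Assembly.GetManifestResourceStream(schemaPath))
        {
            // GetManifestResourceStream returns null when no resource has that name
            if (stream == null)
            {
                throw new InvalidOperationException("Embedded schema not found: " + schemaPath);
            }
            schemas.Add(SCHEMANAMESPACE, XmlReader.Create(stream));
        }
        return schemas;
    }
}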

Instead of hard-coding the path to the schema, I make sure my helper class sits at the same level as the schema file. By default, an embedded resource is named with the project's default namespace, then any folder path, then the file name. As long as the helper class's namespace follows that same convention, I can take the fully qualified name of the helper class and replace the class name with the resource name. If my helper class is in "Foo.Bar", its full name is "Foo.Bar.SchemaAccess". The replace turns that into "Foo.Bar.MySchema.xsd", which is the name I need to load the schema.
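If the schema fails to load, the usual culprit is a resource name that doesn't match what was compiled in. A quick way to check is to list the names the assembly actually contains. This is only a minimal sketch: ResourceNameDump and GetEmbeddedNames are hypothetical names I made up for illustration, not part of the helper above.

```csharp
using System;
using System.Reflection;

public static class ResourceNameDump
{
    // Returns the manifest names of every resource embedded in the assembly.
    // An assembly with no embedded resources yields an empty array.
    public static string[] GetEmbeddedNames(Assembly assembly)
    {
        if (assembly == null)
        {
            throw new ArgumentNullException("assembly");
        }
        return assembly.GetManifestResourceNames();
    }

    // Writes each embedded resource name to the console, one per line.
    public static void Print(Assembly assembly)
    {
        foreach (string name in GetEmbeddedNames(assembly))
        {
            Console.WriteLine(name);
        }
    }
}
```

Calling ResourceNameDump.Print(typeof(SchemaAccess).Assembly) from a test or scratch program should show an entry ending in "MySchema.xsd" if the Build Action was set correctly.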

Then, I simply add it to the set and return it. Of course, if I'm dealing with multiple schemas, I can add those, too.
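For the multiple-schema case, one possible shape is a helper that takes a map of target namespaces to resource names and adds each one to the same set. Treat this as a sketch under assumptions: MultiSchemaAccess and Load are hypothetical names, and the caller is assumed to pass full manifest resource names (for example "Foo.Bar.MySchema.xsd").

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Xml;
using System.Xml.Schema;

public static class MultiSchemaAccess
{
    // Builds one schema set from several embedded resources.
    // Keys are target namespaces; values are full manifest resource names.
    public static XmlSchemaSet Load(Assembly assembly, IDictionary<string, string> resources)
    {
        XmlSchemaSet schemas = new XmlSchemaSet();
        foreach (KeyValuePair<string, string> entry in resources)
        {
            using (Stream stream = assembly.GetManifestResourceStream(entry.Value))
            {
                // A null stream means no resource with that name was embedded
                if (stream == null)
                {
                    throw new InvalidOperationException("Embedded schema not found: " + entry.Value);
                }
                schemas.Add(entry.Key, XmlReader.Create(stream));
            }
        }
        return schemas;
    }
}
```

Failing fast on a missing resource keeps a typo in a resource name from surfacing later as a confusing validation result.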

With LINQ to XML, it's very easy to validate a document against the schema. Simply add System.Xml.Schema to your usings, then call the Validate extension method on your XDocument. Here is a sample unit test that prints the errors to the debug console:


XmlSchemaSet schemas = SchemaAccess.GetSchemaSet();
bool validationErrors = false;
myDoc.Validate(schemas, (o, e) =>
    {
        validationErrors = true;
        Debug.Print(e.Message);
    });
Assert.IsFalse(validationErrors, "The xml was invalid. View the debug console for details."); 

In production, I'll just throw the exception. That's it!
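For the production case, here is a rough sketch of what a guard at the top of a method might look like. DocumentGuard and EnsureValid are hypothetical names for illustration. It relies on the documented behavior of the Validate extension method: when the handler argument is null, validation errors are thrown as an XmlSchemaValidationException (warnings are not thrown).

```csharp
using System;
using System.Xml.Linq;
using System.Xml.Schema;

public static class DocumentGuard
{
    // Validates the document against the schema set and throws
    // XmlSchemaValidationException on the first validation error.
    public static void EnsureValid(XDocument doc, XmlSchemaSet schemas)
    {
        if (doc == null)
        {
            throw new ArgumentNullException("doc");
        }
        if (schemas == null)
        {
            throw new ArgumentNullException("schemas");
        }
        // A null handler tells Validate to throw instead of reporting via callback
        doc.Validate(schemas, null);
    }
}
```

A method that accepts the XML would then start with DocumentGuard.EnsureValid(myDoc, SchemaAccess.GetSchemaSet()) before doing any real work.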

Jeremy Likness