Friday, June 8, 2012

Compressing Data in Windows 8 Metro Applications

The following post is an excerpt from Chapter 6 of my upcoming book, Designing Windows 8 Metro Applications with C# and XAML. Keep reading to learn how you can receive a free copy of the full chapter.

Storing large amounts of data can take up a large amount of disk space. Data compression encodes information in a way that reduces its overall size. There are two general types of compression. Lossy compression may not preserve all of the original information and is often used in image, video, and audio compression. Lossless compression preserves the full fidelity of the original data set.

The Windows 8 runtime exposes the Compressor and Decompressor classes for compression. The Compression project provides an active example of compressing and decompressing a data stream. The project contains a text file that is almost 100 kilobytes in size. It loads that text and displays it with a dialog showing the total bytes. You can then click a button to compress the text, and click another button to decompress it back.

The compression step performs several tasks. First, a local file is opened for output to store the compressed text. Because text can be encoded in various ways, the code then uses the Encoding class to convert the text to a UTF-8 encoded byte array:

var storage = await ApplicationData.Current.LocalFolder
   .CreateFileAsync("compressed.zip", 
   CreationCollisionOption.ReplaceExisting);
var bytes = Encoding.UTF8.GetBytes(_text);
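To illustrate why the encoding step matters, here is a minimal sketch that is not part of the sample project. The class name, method name, and sample string are made up for illustration. It shows that the number of UTF-8 bytes can differ from the number of characters, and that decoding with the same encoding restores the original text.

```csharp
using System.Text;

public static class EncodingSketch
{
    // Hypothetical helper: returns true when the text survives a
    // UTF-8 encode/decode round trip unchanged.
    public static bool RoundTrips(string text, out int byteCount)
    {
        // UTF-8 uses one byte for ASCII characters and more for others.
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        byteCount = bytes.Length;

        // Decoding with the same encoding restores the original string.
        string decoded = Encoding.UTF8.GetString(bytes, 0, bytes.Length);
        return decoded == text;
    }
}

// Example: "Caf\u00e9" has 4 characters but encodes to 5 bytes,
// because the accented letter needs two bytes in UTF-8.
// int count;
// bool same = EncodingSketch.RoundTrips("Caf\u00e9", out count);
```

The byte count, not the character count, is what ends up on disk, so it is the right number to compare against the compressed size.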

You learned earlier in this chapter how to locate the folder for a specific user and application. You can examine the folder for the sample application to view the compressed file after you click the button to compress the text. The file is saved with a zip extension to illustrate that it was compressed, but it doesn’t contain a true archive, so you will be unable to decompress the file from Explorer.

The next lines of code open the file for writing, create an instance of the Compressor, and write the bytes. The code then flushes the compressor, completes the compression operation, and flushes the underlying file stream.

var stream = await storage.OpenStreamForWriteAsync();
var compressor = new Compressor(stream.AsOutputStream());
await compressor.WriteAsync(bytes.AsBuffer());
await compressor.FlushAsync();
await compressor.FinishAsync();
await stream.FlushAsync();
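The steps above can be gathered into one helper. This is only a sketch under stated assumptions, not the sample project's actual code: the class name, method name, and parameters are hypothetical, and the using blocks are an addition so that the compressor and file stream are closed even if a write fails.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices.WindowsRuntime;
using System.Text;
using System.Threading.Tasks;
using Windows.Storage;
using Windows.Storage.Compression;

public static class CompressionSketch
{
    // Hypothetical helper: compresses text into a file in local storage
    // and returns the file so the caller can inspect it.
    public static async Task<StorageFile> CompressTextAsync(
        string text, string fileName)
    {
        var file = await ApplicationData.Current.LocalFolder
            .CreateFileAsync(fileName,
            CreationCollisionOption.ReplaceExisting);
        var bytes = Encoding.UTF8.GetBytes(text);

        using (var stream = await file.OpenStreamForWriteAsync())
        using (var compressor = new Compressor(stream.AsOutputStream()))
        {
            // Same call order as the excerpt: write, flush, finish.
            await compressor.WriteAsync(bytes.AsBuffer());
            await compressor.FlushAsync();
            await compressor.FinishAsync();
            await stream.FlushAsync();
        }

        return file;
    }
}
```

A caller might write `var file = await CompressionSketch.CompressTextAsync(_text, "compressed.zip");` and then query the file's properties to compare sizes.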

Once the compression operation is complete, the bytes are read back from disk so that they can be stored and the compressed size can be displayed. You’ll find the default algorithm cuts the text file down to almost half of its original size. The decompression operation uses the Decompressor class to perform the reverse operation and retrieve the decompressed bytes in a buffer (it then saves these to disk so you can examine the result).

var decompressor = new Decompressor(stream.AsInputStream());
var bytes = new Byte[100000];
var buffer = bytes.AsBuffer();
// Request no more bytes than the buffer can hold.
var buf = await decompressor.ReadAsync(buffer, buffer.Capacity, 
   InputStreamOptions.None);
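The snippet above relies on a single fixed-size buffer, which works because the sample text is known to be under 100 kilobytes. When the decompressed size is not known in advance, one option is to read in chunks. This is a sketch only: the class name, method name, and chunk size are hypothetical, and it assumes a zero-length result marks the end of the data, which is the usual input stream convention.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices.WindowsRuntime;
using System.Text;
using System.Threading.Tasks;
using Windows.Storage;
using Windows.Storage.Compression;
using Windows.Storage.Streams;

public static class DecompressionSketch
{
    private const int ChunkSize = 16 * 1024; // arbitrary size for illustration

    // Hypothetical helper: decompresses a file written by a Compressor
    // and decodes the result as UTF-8 text.
    public static async Task<string> DecompressTextAsync(StorageFile file)
    {
        using (var stream = await file.OpenStreamForReadAsync())
        using (var decompressor = new Decompressor(stream.AsInputStream()))
        using (var result = new MemoryStream())
        {
            var chunk = new byte[ChunkSize];
            while (true)
            {
                var read = await decompressor.ReadAsync(
                    chunk.AsBuffer(), (uint)chunk.Length,
                    InputStreamOptions.None);

                // Assumption: an empty buffer means no more data.
                if (read.Length == 0)
                {
                    break;
                }

                // Copy from the returned buffer, which holds the bytes read.
                var data = read.ToArray();
                result.Write(data, 0, data.Length);
            }

            var all = result.ToArray();
            return Encoding.UTF8.GetString(all, 0, all.Length);
        }
    }
}
```

Reading in a loop avoids guessing an upper bound for the decompressed data, at the cost of an extra copy into the MemoryStream.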

When you create a Compressor, you can pass a parameter to determine the compression algorithm that is used. Table 6.4 lists the possible values.

Table 6.4: Compression Algorithms

CompressAlgorithm Member    Description

InvalidAlgorithm            Invalid algorithm. Used to generate exceptions for testing.
NullAlgorithm               No compression is applied and the buffer is simply passed through. Used primarily for testing.
Mszip                       Uses the MSZIP algorithm.
Xpress                      Uses the XPRESS algorithm.
XpressHuff                  Uses the XPRESS algorithm with Huffman encoding.
Lzms                        Uses the LZMS algorithm.
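As a minimal sketch of selecting one of these values, the Compressor constructor has an overload that accepts the algorithm and a block size. The class and method names here are hypothetical, and the block size of 0 is an assumption made so that the runtime picks its default.

```csharp
using Windows.Storage.Compression;
using Windows.Storage.Streams;

public static class AlgorithmSketch
{
    // Hypothetical factory: wraps an output stream in a Compressor
    // that uses the requested algorithm.
    public static Compressor CreateCompressor(
        IOutputStream output, CompressAlgorithm algorithm)
    {
        // A block size of 0 requests the algorithm's default block size.
        return new Compressor(output, algorithm, 0);
    }
}

// Example: var compressor = AlgorithmSketch.CreateCompressor(
//    stream.AsOutputStream(), CompressAlgorithm.XpressHuff);
```

Swapping the enumeration value and comparing the resulting file sizes is a simple way to see which algorithm suits your data.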

The Windows Runtime makes compression simple and straightforward. Use compression when you have large amounts of data to store and are concerned about the amount of disk space your application requires. Experiment to find the algorithm that provides the best compression ratio for the type of data you are storing. The Decompressor constructor takes only the input stream, so you do not specify the algorithm again when you decompress the data.

This excerpt is from Chapter 6, “Data” from my upcoming book, Designing Windows 8 Metro Applications with C# and XAML. The full chapter covers topics including application settings, local and roaming storage, serialization, async and await, IO helpers available in the Windows Runtime, accessing embedded resources, dealing with collections, loading content from the web (including web pages and syndicated feeds), streams, buffers, byte arrays, compression, encryption, and signing. If you’re interested, there are two ways you can get access to this chapter next week:

1. Attend one of my Codestock Sessions next week and receive a free hard copy of the chapter.

2. Visit the Wintellect booth at TechEd and have your badge scanned. You will receive a PDF copy of the full chapter.

If you’re interested in early access to the book (you’ll be able to read chapters even before they are edited and technically reviewed, so that you can provide insights and request additional content as I am writing it) you can obtain a copy through Safari Rough Cuts. Thanks!