Thursday, April 7, 2011

The Art of Debugging

Mastery of the art of debugging is rare. I know this from years of experience working on enterprise systems. If it were simple, more people would be doing it and everyone would be able to track down bugs. The reality is that most shops have that one "go to person" known as "The Exterminator" who is called in to sweep the place for those bugs no one else was able to track down. At Wintellect, our "Chief Bugslayer" is John Robbins, and with him and the rest of our team, a significant part of our business revolves around finding nasty bugs and fixing them.

I've been working on bug fixes for decades now. One of my first feature articles published in print was a piece called "The Exterminator's Guide." One thing I've found is that effective bug hunting involves a combination of skills. It's not enough to know the technology. There is a method to the madness. There are certain steps that can be learned, and as you encounter more systems during your career, experience only adds to the mix. What has always amazed me is the gap between those who are good at finding defects and those who aren't. You'd think it would be a continuous spectrum of skills but what I've found is either people get it, or they don't - the ones who do, do it quickly and consistently. So what is the secret?

Train Your Eyes

Do me a favor and take a quick pop quiz. Read the quote below and quickly count the number of F's in the passage.

Finished files are the result
of years of scientific study
combined with the experience
of years.

I'll come back to the answer for that in a second. It would be too easy if I put it there. Just note down what you thought it was, and then let's take something a little more involved. Here's another set of instructions, and trust me, this is all leading up to something. Are you ready for another contest?

I want you to watch a very short movie. Don't click before reading these important instructions! I want you to watch the movie once. Only once - that's it. No more. Otherwise, you're cheating, and we don't like cheaters. But when you watch it, you'll have an assignment.

In the video you are going to watch, you're going to see a group of basketball players. Some are wearing white. Some are wearing black. They are passing two balls around. Got it? White, black, and balls being passed. Here's your mission:

Count how many times the ball is passed by the players wearing white.

That's it. If you think this is an exercise about focus and attention to detail, you're right. So again, when you click the link, watch it only once and count the number of passes by players wearing white.

Here is the link. Go ahead and watch it, then write down your score.

Step One: Click Here and Start Counting!

By now I hope you are starting to see my point, and the first step to mastering the art of debugging. In my experience, the majority of developers don't debug code the right way. When they hit F5 and start stepping through the program, they're not watching what is going on.

What? Am I kidding? They've got break points set. There are watch windows. They are dutifully hammering F10 and F11 to step into and out of subroutines. What do I mean? Here's the problem:

They are waiting for the program to do what they expect it to do. And it's hard not to, especially when you are the one who wrote the program! So when you step through that block of code and go, "Yeah, yeah, I'm just initializing some variables here" and quickly hit F10, you've just missed it because that one string literal was spelled incorrectly or you referenced the wrong constant.

The answer to the "F" quiz is 6. Most people count 3 because they sound the words in their head and listen for the "f" sound, rather than just looking at the letters. And that's what people do when they debug - they feel out the program, rather than watching what it is really doing.

Seriously, Train Your Eyes

Did you see the gorilla? Most people don't the first time. It's because they are following instructions. They are counting passes, which is exactly what the exercise was about. But can you believe how obvious it is (and yes, now you have permission to watch the video again) when you see it and know what you're looking for? How could you miss something like that?

Hopefully by now we've established that your mind has a pretty good filter and is going to try to give you what you want. So when you step through code with expectations, guess what? You'll see the debugger doing what you expect, and miss out on what is really happening that may be causing the bug.

So What's the Next Step?

There are several things you can do to help hone your debug skills, and I encourage you to try these all out.

Have someone else debug your code, and offer to debug theirs. The best way to understand how to look at code and see what it is doing is to step through code you're not familiar with. It may seem tedious at first, but it's a discipline and skill that can help you learn how to walk through the code the right way and not make any assumptions.

Try not to take in the code as blocks. In other words, when you have a routine that is initializing variables, don't step over it as the "block of initialization stuff." Step through and consider each statement. Don't look at the statements as sentences, but get back to your programming roots and see a set of symbols to the left of the equal sign and a set of symbols to the right of the equal sign. You'll be surprised how this can help you home in quickly on a wrong or duplicate assignment. It's common in MVVM, for example, for developers to cut and paste and end up with code like this:

private string _lastName; 
private string _firstName; 

public string FirstName 
{
   get { return _firstName; }
   set { _firstName = value; RaisePropertyChanged(()=>FirstName); }
}

public string LastName 
{
   get { return _firstName; }
   set { _lastName = value; RaisePropertyChanged(()=>LastName); }
}

Did you spot the bug? If not, take some time and you will. This is far more difficult when it's code you've written because that expectation is there for it to "just work."

Get Back to the Basics

With all of the fancy tools that tell us how to refactor code and scan classes for us, sometimes we forget about the basic tools we have to troubleshoot.

I was working with a client troubleshooting a memory leak issue and found myself staring at huge graphs of dependencies, handles, and instances. I could see certain objects were being created too many times, but looking at the code, it just looked right. Where were the other things coming from?

So, I got back to the basics. I put a debug statement in the constructor and ran it again. Suddenly I realized that some of the instances were faithfully reporting themselves, and others weren't. How on earth? Ahhh ... the class was derived from a base class. So I put another debug statement in the base class. Sure enough, it was getting instantiated as well. A quick dump of the call stack and the problem was resolved ... not by graphs and charts and refactoring tools, but good old detective work.
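Here is a minimal sketch of that kind of detective work. The class names (WidgetBase, Widget, OtherWidget) and the CtorLog helper are hypothetical and only stand in for whatever types you are chasing. The log is kept in memory so the example is self-contained; in a real application you would more likely write to System.Diagnostics.Debug.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: records constructor calls so they can be inspected later.
public static class CtorLog
{
    public static readonly List<string> Entries = new List<string>();

    public static void Record(string message)
    {
        Entries.Add(message);
        Console.WriteLine(message);
    }
}

// Hypothetical base class with a debug statement in its constructor.
public class WidgetBase
{
    public WidgetBase()
    {
        // GetType() returns the runtime type, so derived instances identify themselves here.
        CtorLog.Record("WidgetBase ctor, runtime type: " + GetType().Name);

        // Dump the call stack to see who is creating this instance.
        Console.WriteLine(Environment.StackTrace);
    }
}

public class Widget : WidgetBase
{
    public Widget()
    {
        CtorLog.Record("Widget ctor");
    }
}

public class OtherWidget : WidgetBase
{
    // No debug statement here: only the base class reports this instance.
}
```

Creating a Widget and then an OtherWidget produces three log entries. The second instance shows up only because the base class reports it, which is the sort of "silent" instance the story above is about.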

Make it Unique

Simple little steps can go a long way. If you are dealing with multiple instances of the same object and all of the properties and fields are the same, don't pull your hair out with frustration (you see what it did to me). Instead, do something simple and easy: put a Guid inside the class and then override the ToString() to print the Guid, or use it in your debug statements. Now you'll be able to trace where each statement is coming from.
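As a rough sketch of the Guid trick, assuming a hypothetical Customer class (the class, its Name property, and the _debugId field are made up for illustration):

```csharp
using System;

// Hypothetical class for illustration.
public class Customer
{
    // Assigned once per instance; two objects with identical data still differ here.
    private readonly Guid _debugId = Guid.NewGuid();

    public string Name { get; set; }

    public override string ToString()
    {
        return "Customer[" + _debugId + "] Name=" + Name;
    }
}
```

Two Customer objects with the same Name now print differently, so a debug statement that includes the object tells you exactly which instance it came from.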

The First Debugging Tool is Your Mind

Finally, I'm going to give you the same advice my mentor gave me so many years ago when I started troubleshooting my first enterprise issues. He told me the goal should be to never have to fire up the debugger. Every debugging session should start with a logical walkthrough of the code. You should analyze what you expect it to do, and walk through it virtually ... if I pass this here, I'll get that, and then that goes there, and this will loop like that ... this exercise will do more than help you comprehend the code. Nine times out of ten I squash bugs by walking through source code and never have to hit F5.

When I do hit F5, I now have an expectation of what the code should do. When it does something different, it's often far easier to pinpoint where the plan went wrong and how the executing code went off script. This skill is especially important in many production environments that don't allow you to run the debugger at all. I was taught and have since followed the philosophy that the combination of source code, well placed trace statements and deep thought is all that is needed to fix even the ugliest of bugs.
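To make "well placed trace statements" concrete, here is a small sketch using System.Diagnostics.Trace. The InvoiceMath class and its Total method are hypothetical. The idea is simply to record the inputs, one key intermediate value, and the result, so the log can be compared against what your mental walkthrough predicted.

```csharp
using System;
using System.Diagnostics;

// Hypothetical routine for illustration.
public static class InvoiceMath
{
    public static decimal Total(decimal[] prices, decimal discountRate)
    {
        // Trace the inputs on entry.
        Trace.WriteLine("Total: enter, items=" + prices.Length + ", discountRate=" + discountRate);

        decimal subtotal = 0m;
        foreach (decimal price in prices)
        {
            subtotal += price;
        }

        // Trace the intermediate value the walkthrough predicted.
        Trace.WriteLine("Total: subtotal=" + subtotal);

        decimal total = subtotal - (subtotal * discountRate);

        // Trace the result on exit.
        Trace.WriteLine("Total: exit, total=" + total);
        return total;
    }
}
```

Trace calls are compiled in only when the TRACE symbol is defined, and the output goes to whatever listeners are attached (for example a TextWriterTraceListener), so the same statements can keep working on a machine where no debugger is allowed.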

It's an Art you CAN Learn!

While I can't guarantee you'll be able to squash defects like the Wintellect Bugslayer himself, debugging is an art that can be learned with patience, focus, and experience. I hope the earlier exercises helped you understand the filters that sometimes block your efforts to fix code, and that these tips will help you think differently the next time you are faced with an issue. Remember, there is no defect that can't be fixed ... and no debugger more powerful than the one between your ears.
Jeremy Likness

6 comments:

  1. I knew both of your initial exercises already, but I still remember how I failed when I came across them for the first time.

    Trying to "avoid F5" as long as possible when you come across a problem really is one of the simplest methods to become a true expert in analyzing code, and it will also train other aspects of your coding skills better than just firing up the debugger every other minute and stepping through the code (which I unfortunately see a lot).

    I like that approach very much; another great (and entertaining) article!

  2. I found it very easy to see the gorilla. Your instructions said to watch for people who are wearing white SHORTS. (Not just white as the video said.) Once I saw that none of them were wearing shorts, I just sat back and watched the video. :)

    But again, that speaks to the same point about debugging.

  3. Heh. Yeah. I fixed that little nuance but you make a good point!

  4. There is only one 'F'

  5. Thanks Jeremy,

    A great article. Although it's seem very simple routines but not everybody pay enough attention to it :)

  6. Another great post, useful for any developer.
