Daniel Pitts’ Tech Blog

Archive for May, 2008

New Server

Tuesday, May 27th, 2008

About a month ago, I bought a new server.  This weekend, I finally had time to set it up.  Only a few minor issues on the way :-)

Fixing Bugs

Tuesday, May 20th, 2008

The best way to fix a bug is to prevent the bug in the first place. There are several fundamental causes of bugs which we can explore.

One very common cause of bugs is miscommunication. Strictly speaking, the result is not a bug but an unimplemented feature: when a requirement isn’t met, the gap is often labeled a bug. The real cause could have been a misunderstanding of what the requirement actually was, or even of whether there was a requirement at all. No one is to blame, but that doesn’t mean no one could have prevented it. If you think there is a missing requirement, bring it up. If you don’t understand a requirement, get it clarified.

Bad assumptions can also lead to bugs. Specifically, if you assume that some external API will handle all the cases you need it to, but for at least one case the behavior isn’t what you assumed, you have a bug. In that situation it can be hard to tell whether the bug is in your code or in the external code. Usually the documentation of the third-party code settles the question. If it’s not documented, it’s undefined.

Ambiguity can lead to bad assumptions. If the documentation is ambiguous, try looking in the source, or writing a unit test that shows what the code actually does. If you have access, update the documentation to be less ambiguous. Also make sure to add assertions to your code that check the assumptions you are relying on.
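When the documentation leaves a behavior open, a small test that pins down what the code actually does can settle the question. The sketch below uses Python’s built-in str.split as a stand-in for an external API; the helper name parse_fields and the questions being tested are assumptions chosen for illustration.

```python
def parse_fields(line):
    """Split a comma-separated line into fields (hypothetical helper)."""
    fields = line.split(",")
    # Assumption made explicit: even an empty line yields one (empty) field.
    assert len(fields) >= 1, "split(',') should never return an empty list"
    return fields


def test_split_keeps_empty_fields():
    # Question answered: are empty fields between separators kept?
    assert "a,,b".split(",") == ["a", "", "b"]


def test_split_on_empty_string():
    # Question answered: what does splitting an empty string return?
    assert "".split(",") == [""]


test_split_keeps_empty_fields()
test_split_on_empty_string()
```

If a later version of the external code changes the behavior, the test fails and points straight at the broken assumption.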

Mistakes happen. A bug can be the result of a simple mistake. Sometimes you type one thing while you’re thinking something else. If you were (un)lucky enough to type something the compiler was okay with, a bug is born. For example, a common bug in C and C++ is the =/== bug, where “if (i = 1)” is an assignment (=) but was probably meant as a comparison (==).
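The same class of mistake can be sketched in Python. Python rejects “if i = 1” as a syntax error, but since version 3.8 the assignment expression operator := is accepted inside a condition, so a similar slip still gets past the interpreter. The function names below are hypothetical.

```python
def is_one_buggy(i):
    # Bug: ":=" assigns 1 to i, so the condition is always truthy.
    if (i := 1):
        return True
    return False


def is_one(i):
    # Intended behavior: "==" compares i with 1.
    if i == 1:
        return True
    return False
```

Calling is_one_buggy(5) returns True, while is_one(5) returns False.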

What was I writing about? Forgetfulness might also lead to bugs. “Oh, I’ll just fill this method out later,” or “I’ll write the ‘else’ case tomorrow.” Judicious use of “TODO” comments and the like can help with this problem.
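One possible complement to a “TODO” comment, sketched here under the assumption that failing loudly is acceptable, is to make the unfinished branch raise an error so that it can’t be silently forgotten. The function shipping_cost and its rates are made up for illustration.

```python
def shipping_cost(region):
    """Return the shipping cost for a region (hypothetical example)."""
    if region == "domestic":
        return 5
    # TODO: add international rates.
    # Until then, fail loudly instead of silently returning None.
    raise NotImplementedError(f"no shipping rate for region {region!r}")
```

Anyone who hits the missing ‘else’ case gets an error naming the gap, rather than a wrong value that surfaces somewhere else later.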

“That does not compute.” Sometimes, there isn’t a bug, per se, but your algorithm fails in certain cases. Maybe the case is impossible to solve in human-scale times, maybe you’re using a heuristic that has a flaw. These kinds of “bugs” are harder to solve, since the programmer has done everything right. Research other algorithms, or put in checks to verify (as much as possible) the input will work with the algorithm.
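As a minimal sketch of such an input check, consider binary search, which only gives correct answers on sorted input. The wrapper name index_of is hypothetical; bisect_left comes from Python’s standard library.

```python
from bisect import bisect_left


def index_of(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent."""
    # Precondition check: binary search is only correct on sorted input.
    # The check is O(n), so it might be dropped where speed matters more.
    if any(a > b for a, b in zip(sorted_items, sorted_items[1:])):
        raise ValueError("index_of requires input sorted in ascending order")
    i = bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1
```

With the check in place, unsorted input produces an immediate, descriptive error instead of a plausible but wrong index.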

And so these bugs get into the codebase. The next phase, testing, should filter them out. How do bugs progress from the codebase into production systems?

Murphy’s law applies a few ways to testing. There is always “the tiniest little change that couldn’t possibly cause any problems.” If you don’t test after making it, it will cause a problem.

Unit testing can do a lot to verify that one unit works for the tested use cases. Sometimes the unit test doesn’t find bugs that come from complex interactions between two separate units, or the unit test doesn’t actually test all cases.

Putting your product in front of a user can help you find more bugs. Murphy’s law applies here too: the first thing a user will do is the one combination of things you didn’t test, and that combination will fail. This is not a bad thing, though. It tells you exactly which unit test you forgot to write. Don’t just fix the bug, fix the test. Write a test that fails for that series of actions, and then make the code work correctly.
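The “fix the test” step might look like the sketch below. The function average, the reported failure (an empty list), and the decision to return 0.0 in that case are all assumptions made for illustration.

```python
def average(values):
    """Return the arithmetic mean of values, or 0.0 for an empty list."""
    # Fixed version: the hypothetical original divided by len(values)
    # unconditionally and raised ZeroDivisionError on an empty list.
    if not values:
        return 0.0
    return sum(values) / len(values)


def test_average_of_empty_list():
    # Regression test written from the user's report; it failed before the fix.
    assert average([]) == 0.0


def test_average_of_values():
    # The case that was already covered before the bug report.
    assert average([2, 4, 6]) == 4.0


test_average_of_empty_list()
test_average_of_values()
```

Writing test_average_of_empty_list first and watching it fail confirms that the test really reproduces the user’s problem before the fix goes in.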

No code will ever be 100% bug free. If it is 100% bug free, it probably isn’t very useful. Of course, this is a generalization, but it is often the case. Don’t become discouraged because of bugs. Fix them one at a time, and document the ones you won’t fix in this release as “unsupported features”. If you think a poor design is making bugs easy to create, or that a better design would prevent them, then go ahead and consider refactoring. Just know that a new design will have new bugs too. They may be easier to spot and easier to fix, so don’t rule the redesign out flat.

Now that we know the reasons bugs get created, how do we find their cause?

Causality. We know of the existence of a bug because of its effect. A number is wrong, or we get a NullPointerException, or Mr. Smith gets Mrs. Robinson’s bill. Where in the codebase should the “correct” behavior be? Look there, and add debug logging or breakpoints of some sort. Assert your assumptions and watch what happens. What if that code appears to be doing the right thing? Maybe the bug happens before that point. From where in the codebase does this behavior get invoked? Repeat the process there. Feel free to use “gut” intuition to find where you think the bug is, but verify that the bug is there before “fixing” it.
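Here is a rough sketch of that process for the misdirected bill, using Python’s standard logging module. The data layout (a list of dicts) and the function names find_bill and send_bill are hypothetical.

```python
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("billing")


def find_bill(bills, customer_id):
    """Return the bill dict for customer_id, or None if there is none."""
    log.debug("looking up customer_id=%r among %d bills",
              customer_id, len(bills))
    for bill in bills:
        if bill["customer_id"] == customer_id:
            log.debug("found bill %r", bill)
            return bill
    return None


def send_bill(bills, customer_id):
    """Format the bill text for one customer."""
    bill = find_bill(bills, customer_id)
    # Assert the assumptions instead of trusting them.
    assert bill is not None, f"no bill for customer {customer_id!r}"
    assert bill["customer_id"] == customer_id, "bill belongs to someone else"
    return f"Bill for {bill['name']}: {bill['amount']}"
```

If the assertions hold and the log output looks right, the problem is probably upstream, for example in whatever code builds the list of bills or picks the customer_id, and the same logging and assertions can be moved there.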

Also, no bug is so old that it can’t be fixed. Even 25-year-old BSD bugs can be found and fixed.