New Server
May 27th, 2008
About a month ago, I bought a new server. This weekend, I finally had time to set it up. Only a few minor issues on the way.
The best way to fix a bug is to prevent the bug in the first place. There are several fundamental causes of bugs which we can explore.
One very common cause of bugs is miscommunication. This is really not a bug, but an unimplemented feature; if there was a requirement that wasn’t met, it is often labeled as a bug. It really could have been a misunderstanding of what the requirement actually was, or even that there was a requirement at all. No one is to blame, but that doesn’t mean someone couldn’t have prevented it. If you think there is a missing requirement, bring it up. If you don’t understand a requirement, get it clarified.
Bad assumptions can also lead to bugs. Specifically, if you assume that some external API will handle all the cases you need it to, but for at least one case the behavior isn’t what you assumed, you have a bug. It can be hard to tell whether such a bug is in your code or in the external code. Usually the documentation of the third-party code settles it. If it’s not documented, it’s undefined.
Ambiguity can lead to bad assumptions. If the documentation is ambiguous, try looking in the source, or writing a unit test. Update the documentation to be less ambiguous if you have access. Also make sure to add assertions to your code.
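As a small illustration of pinning down ambiguous behavior with a test, here is a sketch that records what String.split does with trailing empty fields. The class and method names are made up for this example, and the plain checks stand in for whatever test framework you actually use.

```java
import java.util.Arrays;

class SplitBehaviorCheck {
    // Hypothetical helper: the call whose behavior we want to pin down.
    static String[] fields(String line) {
        return line.split(",");
    }

    // Same split, but a negative limit asks split to keep trailing empty strings.
    static String[] fieldsKeepingEmpties(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        // Question: does split keep the two empty fields at the end of "a,b,,"?
        String[] parts = fields("a,b,,");
        if (parts.length != 2) {
            throw new AssertionError("unexpected: " + Arrays.toString(parts));
        }
        // With the negative limit, all four fields survive.
        if (fieldsKeepingEmpties("a,b,,").length != 4) {
            throw new AssertionError("expected four fields");
        }
        System.out.println("split behaves as recorded");
    }
}
```

Once the behavior is written down as a check like this, the assumption stops being a guess, and the check will complain if the behavior ever changes.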
Mistakes happen. A bug can be the result of a simple mistake. Sometimes you type one thing while you’re thinking something else. If you were (un)lucky enough to type something the compiler was okay with, a bug is born. For example, a common bug in C and C++ is the =/== bug, where “if (i = 1)” is an assignment (=) but was probably meant as a comparison (==).
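Java rejects “if (i = 1)” for an int, because the condition must be a boolean, but the same slip still compiles when the variable is a boolean. The sketch below shows the buggy and the intended form side by side; the class and method names are made up for illustration.

```java
class AssignmentSlip {
    // Buggy: "=" assigns true to done, so the condition is always true.
    static boolean buggyIsDone(boolean done) {
        if (done = true) {
            return true;
        }
        return false;
    }

    // Intended: test the value instead of overwriting it.
    static boolean fixedIsDone(boolean done) {
        if (done) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("buggy(false) = " + buggyIsDone(false));
        System.out.println("fixed(false) = " + fixedIsDone(false));
    }
}
```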
What was I writing about? Forgetfulness might also lead to bugs. “Oh, I’ll just fill this method out later,” or “I’ll write the ‘else’ case tomorrow.” Judicious use of “TODO” comments and the like can help with this problem.
“That does not compute.” Sometimes, there isn’t a bug, per se, but your algorithm fails in certain cases. Maybe the case is impossible to solve in human-scale times, maybe you’re using a heuristic that has a flaw. These kinds of “bugs” are harder to solve, since the programmer has done everything right. Research other algorithms, or put in checks to verify (as much as possible) the input will work with the algorithm.
And as such, these bugs get into the code base. The next phase should filter out bugs. The testing phase. How do bugs progress from codebase into production systems?
Murphy’s law applies a few ways to testing. There is always “the tiniest little change that couldn’t possibly cause any problems.” If you don’t test after making it, it will cause a problem.
Unit testing can do a lot to verify that one unit works for the tested use cases. Sometimes the unit test doesn’t find bugs that come from complex interactions between two separate units, or the unit test doesn’t actually test all cases.
Putting your product in front of a user can help you find more bugs. Murphy’s law as it applies: The first thing a user will do is the only combination of things you didn’t test. That combination will fail. This is not a bad thing though. It lets you know exactly what unit-test you forgot to write. Don’t just fix the bug, fix the test. Write a test that fails for that series of actions, and then make the code work correctly.
No code will ever be 100% bug free. If it is 100% bug free, it probably isn’t very useful. Of course, this is a generalization, but it is often the case. Don’t become discouraged because of bugs. Fix them one at a time, and document the ones you won’t fix in this release as “unsupported features”. If you think a poor design is making it easy to create bugs, or a better design would prevent them, then go ahead and consider refactoring. Just know that a new design will have new bugs too. They may be easier to spot and easier to fix, so don’t rule the redesign out flat.
So now we know the reasons bugs get created, how do we find their cause?
Causality. We know of the existence of a bug because of its effect. A number is wrong, or we get a NullPointerException, or Mr. Smith gets Mrs. Robinson’s bill. Where in the codebase should the “correct” behavior be? Look there, add debug logging or break points of some sort. Assert your assumptions and watch what happens. What if that code appears to be doing the right thing? Maybe the bug happens before that point. From where in the codebase does this behavior get invoked? Repeat the process there. Feel free to use “gut” intuition to find where you think the bug is, but verify that the bug is there before “fixing” it.
Also, no bug is so old that it can’t be fixed. Even 25 year old BSD bugs can be found and fixed.
I went to post a comment on an old acquaintance’s blog, and it asked me for either a Blogger account, or an OpenID. I didn’t have the former, and hadn’t heard of the latter. I checked it out, downloaded and installed phpMyID, and now my OpenID will be http://virtualinfinity.net/openid/daniel_pitts/. Pretty awesome if you ask me.
Sometimes when dealing with OO design in conjunction with relational persistence (RDBMS), it becomes difficult to reconcile the differences between the two and still maintain a consistent approach.
The problem that I seem to run into time and again is that I’m an OO programmer first. I tend to think of things top-down. What’s the overall system look like? What are the large pieces? What fits “in” those large pieces?
When you work this way, you tend to end up with a tree of objects or even more complex graph of objects. The interesting thing about a graph of objects is that it is, above all else, a graph, and there are many ways to represent graphs.
Fundamentally, all graphs are a set of Nodes and (directed) Edges. Let’s examine a few common ways to represent nodes and edges.
Any graph can be represented in any one of the above ways (as well as many others). This is important to realize, because it gives you the ability to refactor an OO design into something that better fits the relational model. There will always be a disconnect between OO and RDBMS, because they look at design in fundamentally different ways, but that doesn’t mean you can’t work with both of them in the same system, leveraging their individual strengths.
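As one illustration of moving between representations, the sketch below holds a small graph as objects that reference each other, then flattens it into a list of (from, to) edges, which is the shape a two-column relational table would take. All class, field, and node names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class GraphForms {
    // Object-graph form: each node holds direct references to its children.
    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Edge-list form: one row per directed edge, like a two-column table.
    static class Edge {
        final String from;
        final String to;
        Edge(String from, String to) { this.from = from; this.to = to; }
        @Override public String toString() { return from + "->" + to; }
    }

    // Walk the object graph and emit one edge per reference.
    static List<Edge> toEdges(Node root) {
        List<Edge> edges = new ArrayList<>();
        collect(root, edges, new HashSet<>());
        return edges;
    }

    private static void collect(Node node, List<Edge> edges, Set<Node> seen) {
        if (!seen.add(node)) {
            return; // already visited; this also guards against cycles
        }
        for (Node child : node.children) {
            edges.add(new Edge(node.name, child.name));
            collect(child, edges, seen);
        }
    }

    public static void main(String[] args) {
        Node system = new Node("system");
        Node ui = new Node("ui");
        Node db = new Node("db");
        system.children.add(ui);
        system.children.add(db);
        ui.children.add(db);
        System.out.println(toEdges(system));
    }
}
```

The edge list carries the same information as the object references, so either form can be rebuilt from the other.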
Escaping and Encoding, two things that most web developers need to do quite often. Unfortunately, most people are never taught when (or even how) to do so. Depending on what you’re doing, it can be a security risk and a bad user experience.
Encoding and escaping are both similar in concept, and what they ultimately try to do is represent values (in our context, characters) that either have special meanings, or are otherwise not representable in the underlying format.
For example, the “<” character has special meaning in HTML/XML, so if you want it to actually show up, you have to escape it. To do that, you use “&lt;” in its place.
Another example is in URLs, where “&” has special meaning (it separates query parameters). It is replaced by “%26”. “%” itself also has to be encoded (as “%25”).
If you wanted to link to “http://virtualinfinity.net/dictionary?word=%nfinity&fun=true” in HTML, you’d first want to URL encode “%nfinity” to “%25nfinity”, and then you’d want to HTML encode the full URL to “http://virtualinfinity.net/dictionary?word=%25nfinity&amp;fun=true”.
Your final output would look something like <a href="http://virtualinfinity.net/dictionary?word=%25nfinity&amp;fun=true">Words ending in nfinity &amp; having fun</a>. Notice the “&amp;” in the href. Most web browsers tolerate a bare “&” there, but such mistakes can cause you problems down the road.
A good way to know what to encode, and which method to use, is to think of each encoding as a layer. You want to put the string “%nfinity” in the URL query parameter layer, so you need to encode it with the URL encoding. You want to put the URL into an HTML document, so you need to HTML escape it. And so on and so forth.
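The layering can be sketched in Java roughly as follows. URLEncoder is the standard library’s form-style encoder (it turns spaces into “+”), and htmlEscape is a minimal hand-written helper assumed here for illustration, not a complete HTML escaper.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class EncodingLayers {
    // Layer 1: encode a value for use as a URL query parameter.
    static String urlEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Layer 2: minimal HTML escaping. "&" goes first so the entities
    // produced by the later replacements are not escaped again.
    static String htmlEscape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Build the URL from an encoded parameter, then escape it for HTML.
    static String link(String word, String label) {
        String url = "http://virtualinfinity.net/dictionary?word="
                + urlEncode(word) + "&fun=true";
        return "<a href=\"" + htmlEscape(url) + "\">" + htmlEscape(label) + "</a>";
    }

    public static void main(String[] args) {
        System.out.println(link("%nfinity", "Words ending in nfinity & having fun"));
    }
}
```

Each value is encoded exactly once per layer it passes through, at the moment it enters that layer.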
Things can get even more interesting with RSS feeds. The <description> elements’ text values can contain HTML within them. A naive first attempt might be something like:
<description> <a href="http://virtualinfinity.net/dictionary?word=%25nfinity&amp;fun=true">Words ending in nfinity &amp; having fun</a>. </description>
Unfortunately, this doesn’t work quite as expected. In XML, this actually creates an “a” element within the “description” element. This is *not* valid RSS. So you need to escape the contents of the description element.
<description> &lt;a href="http://virtualinfinity.net/dictionary?word=%25nfinity&amp;amp;fun=true"&gt;Words ending in nfinity &amp;amp; having fun&lt;/a&gt;. </description>
This may look funny, but it’s actually correct. It is not a typo to have “&amp;amp;”. The first “&amp;” will be converted back into “&” by the XML parser, so “&amp;amp;” becomes simply “&amp;”. After that, the HTML parser gets a hold of it, and converts it to the expected “&”.
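The extra layer for RSS can be sketched the same way: take the finished HTML fragment and escape it once more before placing it in the element. The xmlEscape helper and the class name below are assumptions for illustration only.

```java
class RssDescription {
    // Minimal XML text escaping; "&" goes first so new entities stay intact.
    static String xmlEscape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Wrap already-valid HTML in a description element as escaped text.
    static String description(String html) {
        return "<description>" + xmlEscape(html) + "</description>";
    }

    public static void main(String[] args) {
        // This fragment is already correct HTML (note the "&amp;" entities).
        String html = "<a href=\"http://virtualinfinity.net/dictionary"
                + "?word=%25nfinity&amp;fun=true\">"
                + "Words ending in nfinity &amp; having fun</a>.";
        System.out.println(description(html));
    }
}
```

An XML parser reading that element hands back the original HTML fragment, which an HTML parser can then decode in turn.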
So, there you have it. A brief explanation of when and what to encode. How to encode is left as an exercise for the reader. (Hint: Google is your friend.)
So many Java programmers get lured into the Swing trap. I’m not saying Swing is bad, but it has some pitfalls that many programmers fall into simply because they weren’t told otherwise.
The biggest pitfall is that most Swing classes are not thread safe, and there is a specific approach you must take to communicate with these classes.
While there are many ways to approach thread-safety, the designers of Swing chose the “thread-partition” approach. What this means in simple terms is that any time you want to access data or methods in a Swing class, you should do it on a specific thread. This is much easier done than said.
Normally when you access an object, you use “ob.member”, for example: textField.setText("foo"). In Swing, you need to make sure your code gets called only from the Event Dispatch Thread (that’s the “single thread” as designated by Swing).
In order to get your code to run on that thread, you can pass a Runnable to the java.awt.EventQueue. If you don’t know, Runnable is an interface in the standard Java API that has one method, public void run(). For example, in the above “foo” example, I could do this:
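public void setFoo() {
    // Wrap the update in a Runnable so it can be handed to the EDT.
    Runnable setFooRunnable = new Runnable() {
        public void run() {
            textField.setText("foo");
        }
    };
    // Queue the Runnable; the EDT runs it at its earliest convenience.
    EventQueue.invokeLater(setFooRunnable);
}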
EventQueue.invokeLater tells the Event Dispatch Thread (EDT) to execute that code at its earliest convenience. invokeLater returns immediately, which may have consequences if you expect the work to be done by the time it returns. There is another method, EventQueue.invokeAndWait, that lets you wait until the EDT has completed your task, but using it is strongly discouraged.
There was a time in the past when Sun suggested that you could initialize your GUI outside of the EDT. They were wrong. Yes, I said it. Sun made a mistake. They have since sent out a correction, but unfortunately the harm was done and you’ll see some tutorials exemplifying this dangerous practice. If you so much as create an instance of a GUI object, do so on the EDT, on pain of glitches. It may appear to work, but it will crash sometime, somewhere, and you might not ever hear about it from the customer who decided not to buy.
There is another consequence of this single thread approach. If you do something that takes a long time, and you do it on the EDT, you’ll make the whole application unresponsive. Most GUIs, Swing included, are event driven, so instead of waiting for something, you should listen for that something. Most Swing components allow you to add listeners to them. The methods on your listeners get called when something they care about happens. This lets you avoid things like “while (!isFooReady()); doSomethingWithFoo(getFoo());” and instead you have “public void fooIsReady(Foo foo) { doSomethingWithFoo(foo); }”.
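Here is a minimal sketch of that listener idea in plain Java, with no Swing involved. FooSource, FooListener, and publish are hypothetical names, and a String stands in for the Foo value.

```java
import java.util.ArrayList;
import java.util.List;

class FooSource {
    // Callback interface: implementors are told when a value is ready.
    interface FooListener {
        void fooIsReady(String foo);
    }

    private final List<FooListener> listeners = new ArrayList<>();

    void addFooListener(FooListener listener) {
        listeners.add(listener);
    }

    // Called by whatever produces the value; nobody has to poll for it.
    void publish(String foo) {
        for (FooListener listener : listeners) {
            listener.fooIsReady(foo);
        }
    }

    public static void main(String[] args) {
        FooSource source = new FooSource();
        source.addFooListener(foo -> System.out.println("got " + foo));
        source.publish("foo");
    }
}
```

The code that wants the value registers once and then does nothing until it is called, which is what keeps an event thread free.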
For the same reason you shouldn’t block the EDT, you should move time-consuming tasks (IO, calculations, etc…) off of the EDT. Every millisecond you take on the EDT is another millisecond that a mouse-click doesn’t get registered, or the window doesn’t repaint. This will give your program the appearance of being slow and unresponsive.
Java 1.6 added a new class called SwingWorker that helps you pass work between a worker thread and the EDT in a clean and “correct” manner.
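A rough sketch of the SwingWorker shape follows. SumWorker and its computeBlocking helper are hypothetical; the helper exists only so the example can run from the command line, and it blocks its caller, so it should never be called from the EDT.

```java
import java.util.concurrent.ExecutionException;
import javax.swing.SwingWorker;

class SumWorker extends SwingWorker<Long, Void> {
    private final int upTo;

    SumWorker(int upTo) {
        this.upTo = upTo;
    }

    // Runs on a background worker thread, never on the EDT.
    @Override
    protected Long doInBackground() {
        long sum = 0;
        for (int i = 1; i <= upTo; i++) {
            sum += i;
        }
        return sum;
    }

    // Runs on the EDT once doInBackground has finished, so this is
    // the place where a real GUI would update its components.
    @Override
    protected void done() {
        try {
            System.out.println("sum = " + get());
        } catch (InterruptedException | ExecutionException e) {
            System.err.println("worker failed: " + e);
        }
    }

    // Command-line helper only: starts the worker and waits for its result.
    static long computeBlocking(int upTo) {
        SumWorker worker = new SumWorker(upTo);
        worker.execute();
        try {
            return worker.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a real Swing program you would call execute() from an event handler and let done() push the result into a component, rather than waiting on get().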
Hopefully at this point you’ve learned the basics of what it takes to write a proper Swing program. From here, I suggest reading Concurrency in Swing from the Java tutorials. For a much more complete understanding of the why’s and how’s of Java concurrency, I suggest buying and reading Java Concurrency in Practice.
If you decide to go the route of using Inversion of Control and Dependency Injection, it can make your program a lot more flexible. If you do this manually, you’ll find that even a complex system can be managed well this way.
A common idiom is to use Dependency Injection to separate the creation of the object from the use of the object. The client object has some way to receive other objects it needs to “talk” to, but the client object doesn’t care how they were created:
class MyClient {
    MyFooService fooService;
    MyBarService barService;

    // The collaborators are handed in; MyClient never creates them.
    public MyClient(MyFooService fooService, MyBarService barService) {
        this.fooService = fooService;
        this.barService = barService;
    }

    public void doStuff() {
        fooService.iLikeFooFighters();
        barService.buyRoundsForEveryone();
    }
}
In this code snippet, MyFooService and MyBarService could be interfaces, abstract classes, or even concrete classes. If you change your mind later about them, MyClient doesn’t have to change at all.
Another common approach for DI is to use setter based injection. This approach is useful for complex object graphs that have cyclic dependencies. I’ll leave it as an exercise for the reader to re-write MyClient with setters.
So, in this phase one, you’ve separated the concern of object creation and wiring from object usage. It is perfectly acceptable to stop here. However, depending on how complicated your object system is, you may wish to separate the concerns even further. You can do this via the Builder pattern (nb. The current Wikipedia entry on Builder pattern is incorrect and misleading.)
With the builder pattern, you separate the concern of creation from the concern of “wiring”. The builder accepts a series of objects that belong in a graph. The builder knows how to connect the objects through DI, but doesn’t care where the objects came from or how they were created/configured.
In a recent project of mine, I had to create a Robot instance, and wire it with a lot of components. To make things worse, some of the components also needed sub-components, and some components needed to know about other components, and some needed to know about the robot itself. I had successfully created a class that would create the components and wire them together, but it was a bit fragile (do things in the wrong order and you get an NPE, since not all of the components were created yet).
To solve this design issue, I created a class (HardwareContext), which held a reference to every component that it takes to build a robot. Those fields were populated using setter based dependency injection. Once those references are all created, hardwareContext.wireRobot() is called, and that is where all of the object graph is connected. This would allow me to create a new hardware context that wired the components differently if I needed to. It also allows me to create a new HardwareSpecification (the class that creates components), which creates different configurations for the robot.
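The split described above might look roughly like this. Only the names HardwareContext, HardwareSpecification, and wireRobot come from the project; Robot’s fields, the Motor and Sensor components, and the createRobot method are invented here to keep the sketch small.

```java
class RobotWiringSketch {
    // Hypothetical components that need to know about each other.
    static class Robot { Motor motor; Sensor sensor; }
    static class Motor { Robot owner; }
    static class Sensor { Motor watched; }

    // Holds every component, then connects them all in one place.
    static class HardwareContext {
        private Robot robot;
        private Motor motor;
        private Sensor sensor;

        void setRobot(Robot robot) { this.robot = robot; }
        void setMotor(Motor motor) { this.motor = motor; }
        void setSensor(Sensor sensor) { this.sensor = sensor; }

        // Wiring happens only after everything exists, so the order in
        // which the setters were called no longer matters.
        Robot wireRobot() {
            if (robot == null || motor == null || sensor == null) {
                throw new IllegalStateException("set all components before wiring");
            }
            robot.motor = motor;
            robot.sensor = sensor;
            motor.owner = robot;
            sensor.watched = motor;
            return robot;
        }
    }

    // Creates the components; knows nothing about how they connect.
    static class HardwareSpecification {
        Robot createRobot() {
            HardwareContext context = new HardwareContext();
            context.setSensor(new Sensor());
            context.setMotor(new Motor());
            context.setRobot(new Robot());
            return context.wireRobot();
        }
    }

    public static void main(String[] args) {
        Robot robot = new HardwareSpecification().createRobot();
        System.out.println("wired: " + (robot.motor.owner == robot));
    }
}
```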
This leads to a highly extensible and configurable system. It also makes it easy to figure out where to add a line of code. If the line is doing wiring, it goes into the Context, if it is creating an object, it belongs in the Spec. Easy, clean, clear.
Imagine this. You develop a Java application that uses only standard Java features. It only uses a handful of classes, so you just distribute them individually. It works great until someone using Windows runs your program.
“Why am I getting ‘Exception in thread “main” java.lang.NoClassDefFoundError: FOO (wrong name: Foo)’?”
Remember, Java is case sensitive, but the Windows file system is *not*.
public class Foo {
    public static void main(String... args) {
        new FOO();
    }
}

class FOO {
    public FOO() {
        System.out.println("Works like a charm.");
    }
}
When you compile this on Linux, you get “Foo.class” and “FOO.class”, but on Windows, you only get one file. Oops!
So, it’s always a good idea to follow good conventions with regard to capitalization. Use CamelCase, and treat abbreviations as words. Use XmlParser rather than XMLParser. Note that some of the standard Java API doesn’t do that (URL for instance).
Work’s been crazy, we just moved, and we have a baby on the way (Due date 2008-07-04). I promise I’ll come up with something to write about soon.
As you can see, the site is back up on schedule. Yippy.
One of these days, when I can afford it, I’ll get a dedicated server somewhere.