Daniel Pitts’ Tech Blog

Archive for October, 2007

Avoiding Primitive Obsession

Sunday, October 28th, 2007

Programmers, particularly (but not only) those with a background in C, tend to think of simple data as primitive data. Measurements are one of the simplest types of data, so why not do something like this:

public class Bob {
  double distance;
  double duration;
  double weight;
  /* ... rest of code which handles calculation based on distance,
          duration, and weight, based on some predefined units for each. */
}

While there isn’t anything broken about this code, it needlessly couples the algorithms used in Bob with the units used to express distance, duration, and weight. A more robust design would give each of these its own type, which could be unit-independent. We’ll deal with just Distance. I’ve decided to implement the Distance class using meters as its underlying representation, but this fact is hidden from all clients of the class. Furthermore, I’ve tried to hide that fact from as many internal methods as possible.

public final class Distance {
    private final double meters;

    private  Distance(double meters) {
        this.meters = meters;
    }

    public double getMeters() {
        return meters;
    }

    public static Distance fromMeters(double meters) {
        return new Distance(meters);
    }

    public Distance times(double scalar) {
        return fromMeters(getMeters()*scalar);
    }

    public Area times(Distance distance) {
        return Area.fromSquareMeters(getMeters() * distance.getMeters());
    }

    public Distance plus(Distance distance) {
        return fromMeters(getMeters() + distance.getMeters());
    }
}

So, what do we have here? Distance now has methods that give you more meaningful operations. A Distance multiplied by another Distance is specifically an Area, but a Distance multiplied by a scalar (a simple number) is just a scaled Distance. You can’t add an arbitrary scalar to a Distance by accident.

If we were using a simple double distance with the convention that distances are measured in meters, we might see code like distance += 3; // add three meters. Unfortunately, if we switch to a different unit, we just broke that line. Using the non-primitive Distance, the same line would be distance = distance.plus(Distance.fromMeters(3)); Now, no one is constrained to use meters. We could easily add fromInches, getInches, fromYards, getMicrons, etc… We could also decide to change the internal representation of Distance to use BobUnits (3.14 Bob Units is one Meter), and no client code need be touched!
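Here is a minimal sketch of what adding an inch-based factory and accessor might look like. The fromInches and getInches methods are my assumption of how they could be written (they are not in the class above), and the class is trimmed down so the sketch stands alone.

```java
// Hypothetical sketch: a trimmed-down Distance with inch support added.
final class Distance {
    // One inch is defined as exactly 0.0254 meters.
    private static final double METERS_PER_INCH = 0.0254;

    private final double meters;

    private Distance(double meters) {
        this.meters = meters;
    }

    static Distance fromMeters(double meters) {
        return new Distance(meters);
    }

    // Assumed addition: converts on the way in.
    static Distance fromInches(double inches) {
        return new Distance(inches * METERS_PER_INCH);
    }

    double getMeters() {
        return meters;
    }

    // Assumed addition: converts on the way out.
    double getInches() {
        return meters / METERS_PER_INCH;
    }

    Distance plus(Distance other) {
        return fromMeters(getMeters() + other.getMeters());
    }
}

class DistanceDemo {
    public static void main(String[] args) {
        Distance distance = Distance.fromMeters(1.0);
        // Add three meters without relying on the internal unit.
        distance = distance.plus(Distance.fromMeters(3));
        // Mixed units are fine: the conversion happens inside Distance.
        distance = distance.plus(Distance.fromInches(100));
        System.out.println(distance.getMeters() + " meters");
    }
}
```

The only place that knows how inches relate to meters is the Distance class itself, so changing the internal representation would touch the two conversion methods and nothing else.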

Another useful example is a measurement that usually has other values associated with it. For example, Angles. Angles can be measured in Degrees, Radians, and many other units. They also have an associated sine, cosine, and tangent. So, let’s take a look at an Angle class:

public final class Angle {
    private final double radians;

    private Angle(double radians) {
        this.radians = radians;
    }

    public double cosine() {
        return Math.cos(getRadians());
    }

    public double sine() {
        return Math.sin(getRadians());
    }

    public static Angle fromCartesian(Distance x, Distance y) {
        return fromRadians(Math.atan2(y.getMeters(), x.getMeters()));
    }

    public static Angle fromRadians(double radians) {
        return new Angle(radians);
    }

    public Angle plus(Angle angle) {
        return fromRadians(getRadians() + angle.getRadians());
    }

    public double getRadians() {
        return radians;
    }
}

Again, we arbitrarily choose radians as the implementation’s unit, and again we don’t expose this to the client, and we don’t rely on it internally unless we have to. An Angle plus an Angle is an Angle. We also have a way to get an angle from a Cartesian coordinate.
What’s more, we have sine() and cosine(). That’s the big deal with Angle! Now our clients don’t need to use ugliness such as “Math.cos(angle * degreeToRadians)”. If they have an Angle, they can get the sin, cos, degrees, radians, etc… without having to know the details of the underlying units.
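A rough sketch of the degree support hinted at above. The fromDegrees and getDegrees methods are assumptions of mine (the Angle class shown earlier only has the radian versions); they lean on the standard Math.toRadians and Math.toDegrees.

```java
// Hypothetical sketch: a trimmed-down Angle with degree support added.
final class Angle {
    private final double radians;

    private Angle(double radians) {
        this.radians = radians;
    }

    static Angle fromRadians(double radians) {
        return new Angle(radians);
    }

    // Assumed addition: not part of the class shown above.
    static Angle fromDegrees(double degrees) {
        return new Angle(Math.toRadians(degrees));
    }

    double getRadians() {
        return radians;
    }

    // Assumed addition: not part of the class shown above.
    double getDegrees() {
        return Math.toDegrees(radians);
    }

    double cosine() {
        return Math.cos(getRadians());
    }

    double sine() {
        return Math.sin(getRadians());
    }
}

class AngleDemo {
    public static void main(String[] args) {
        Angle angle = Angle.fromDegrees(60);
        // No degree-to-radian arithmetic leaks into the client code.
        System.out.println(angle.cosine());
    }
}
```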

Now, any of these can be turned into interfaces or abstract classes, and then you could have different implementations based on the system’s needs. For instance, if you’re dealing with subatomic measurements, meters don’t make sense for distances, and seconds don’t make sense for durations. You could have a MicronDistanceImpl implementation of Distance, and a NanosecondDurationImpl implementation of Duration. Similarly, for a cosmic simulation, you could have light-years for Distance and millennia for Duration.
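To make that concrete, here is one way the interface version could look. This is only a sketch: MeterDistance is a name I made up for the default implementation, and both implementations are cut down to two methods.

```java
// Hypothetical sketch: Distance as an interface with two storage choices.
interface Distance {
    double getMeters();

    Distance plus(Distance other);
}

// Stores the value in meters (assumed name, not from the classes above).
final class MeterDistance implements Distance {
    private final double meters;

    MeterDistance(double meters) {
        this.meters = meters;
    }

    public double getMeters() {
        return meters;
    }

    public Distance plus(Distance other) {
        return new MeterDistance(meters + other.getMeters());
    }
}

// Stores the value in microns; clients still only see the Distance interface.
final class MicronDistanceImpl implements Distance {
    private static final double MICRONS_PER_METER = 1e6;

    private final double microns;

    MicronDistanceImpl(double microns) {
        this.microns = microns;
    }

    public double getMeters() {
        return microns / MICRONS_PER_METER;
    }

    public Distance plus(Distance other) {
        // Convert the other operand through the unit-neutral accessor.
        return new MicronDistanceImpl(microns + other.getMeters() * MICRONS_PER_METER);
    }
}

class DistanceInterfaceDemo {
    public static void main(String[] args) {
        Distance small = new MicronDistanceImpl(500000);
        Distance sum = small.plus(new MeterDistance(0.5));
        System.out.println(sum.getMeters() + " meters");
    }
}
```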

Oh, almost forgot. This also gives you the ability to have nice toString() representations. I haven’t added those to the above classes, but I could imagine the toString printing “12.0 meters” and “32.6 degrees”. This will help when presenting these values to a user as well.
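A toString along those lines could be as small as this. It is a sketch on a trimmed-down Distance, and the “meters” wording is just my choice of format.

```java
// Hypothetical sketch: Distance with a human-readable toString().
final class Distance {
    private final double meters;

    private Distance(double meters) {
        this.meters = meters;
    }

    static Distance fromMeters(double meters) {
        return new Distance(meters);
    }

    double getMeters() {
        return meters;
    }

    @Override
    public String toString() {
        // The unit name lives next to the representation it describes.
        return getMeters() + " meters";
    }
}

class DistanceToStringDemo {
    public static void main(String[] args) {
        System.out.println(Distance.fromMeters(12)); // prints "12.0 meters"
    }
}
```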

This kind of abstraction is what makes good design. Next time you declare something as a primitive (or a dumb primitive wrapper such as Double or Integer), think about what type and units you might assign to it. You’ll find that your classes become simpler as well, since they don’t get bogged down in the formulas needed to convert between units. It also gives you much more expressive power. Think of a Speed class which has Distance and Duration fields. Maybe even a method on Distance: public Speed divide(Duration duration);
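Here is a sketch of that Speed idea. Everything except the divide signature is an assumption: the Duration class, its seconds representation, and the getMetersPerSecond accessor are made up for illustration.

```java
// Hypothetical sketch: Distance divided by Duration yields a Speed.
final class Duration {
    private final double seconds;

    private Duration(double seconds) {
        this.seconds = seconds;
    }

    static Duration fromSeconds(double seconds) {
        return new Duration(seconds);
    }

    double getSeconds() {
        return seconds;
    }
}

final class Speed {
    // Speed keeps the two measurements it was built from.
    private final Distance distance;
    private final Duration duration;

    Speed(Distance distance, Duration duration) {
        this.distance = distance;
        this.duration = duration;
    }

    double getMetersPerSecond() {
        return distance.getMeters() / duration.getSeconds();
    }
}

final class Distance {
    private final double meters;

    private Distance(double meters) {
        this.meters = meters;
    }

    static Distance fromMeters(double meters) {
        return new Distance(meters);
    }

    double getMeters() {
        return meters;
    }

    // The type system now says what "distance over time" means.
    Speed divide(Duration duration) {
        return new Speed(this, duration);
    }
}

class SpeedDemo {
    public static void main(String[] args) {
        Speed speed = Distance.fromMeters(100).divide(Duration.fromSeconds(10));
        System.out.println(speed.getMetersPerSecond()); // prints "10.0"
    }
}
```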

HTML and ShortTags: It’s valid but doesn’t work, or it’s invalid but it works

Thursday, October 25th, 2007

I discovered something interesting the other day. There are some HTML syntaxes that are surprising, but most browsers don’t support them anyway…

Specifically, <p<a href="/">Some Text</> some other text is technically the same as <p><a href="/">Some Text</a> some other text. Unfortunately, most browsers don’t properly support SGML short-tags, so the outcome is likely different than the spec says it should be.

Read more about this at w3c html blog.

Using enums as a flyweight pattern.

Monday, October 22nd, 2007

When the Java people introduced enums, they went out of their way to support the switch statement with enums. Good OO design tends to avoid switches and instead use polymorphism to “decide” on behavior. While you can switch on enums, you also can add behavior to enums. As a matter of fact, this is useful for the flyweight pattern.

First, for those who don’t know, a Flyweight is a stateless object that represents some behavior for a set of other objects. For example, if you had to have an ‘e’ Glyph object for every letter ‘e’ in this article, you’d have a lot of instances of the ‘e’ Glyph object. Using a flyweight, you have one instance of the ‘e’ Glyph, and whenever you need to invoke the behavior, you pass in the “state” (e.g. the position on the screen) to the appropriate method.
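As a rough illustration of the Glyph case, here is a sketch with a shared instance per character and the position passed in on each call. The Glyph.of factory, the cache, and drawAt are all names I invented, and the cache is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one shared Glyph per character.
final class Glyph {
    // Shared instances, keyed by character (not thread-safe in this sketch).
    private static final Map<Character, Glyph> CACHE = new HashMap<>();

    // The only data the glyph holds is what every 'e' has in common.
    private final char character;

    private Glyph(char character) {
        this.character = character;
    }

    static Glyph of(char character) {
        return CACHE.computeIfAbsent(character, key -> new Glyph(key));
    }

    // The per-use state (the position) is passed in, never stored.
    String drawAt(int x, int y) {
        return "'" + character + "' at (" + x + ", " + y + ")";
    }
}

class GlyphDemo {
    public static void main(String[] args) {
        Glyph first = Glyph.of('e');
        Glyph second = Glyph.of('e');
        System.out.println(first == second); // prints "true"
        System.out.println(first.drawAt(1, 2)); // prints "'e' at (1, 2)"
    }
}
```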

For small sets of Flyweights, where the difference between any two instances is mostly behavioral, enums are very useful. They wouldn’t be so useful for the ‘e’ Glyph example, since we’d have to write an enum with every character we wished to support. Glyphs also mostly differ in shape, which can be broken down into data rather than behavior.

So, for our example, we’ll be using a Flyweight to mimic animals. Let’s start out with our different types of animals.

public enum AnimalType {
    Cow,
    Chicken,
    Pig,
    Lion
}

Now suppose we want an Animal class. Let’s assume that we have a “World” class that contains information about our animal’s world, including food sources, animal positions, etc…

public final class Animal {
   private final AnimalType type;
   private final World world = World.getWorld();
   private int hungerLevel;
   public Animal(AnimalType type) {
      this.type = type;
   }
}

Now, objects aren’t generally useful unless they have behavior. We’re going to add simple delegation from the Animal class to its AnimalType instance.

public final class Animal {
   private final AnimalType type;
   private final World world = World.getWorld();
   private int hungerLevel;
   private int restLevel;
   public Animal(AnimalType type) {
      this.type = type;
   }

   public void eatFood() { type.eatFood(this);}

   public void sleep() {
     type.sleep(this);
   }

   public World world() { return world; }
}

public enum AnimalType {
    Cow {
        public void eatFood(Animal animal) {
            animal.world().findGrass().consumeBy(animal);
        }
        public void sleep(Animal animal) {
            animal.world().findBarn().sleptInBy(animal);
        }
    },
    Chicken{
        public void eatFood(Animal animal) {
            animal.world().findCorn().consumeBy(animal);
        }
        public void sleep(Animal animal) {
            animal.world().findCoop().sleptInBy(animal);
        }
    },
    Pig{
        public void eatFood(Animal animal) {
            animal.world().findTrough().getContents().consumeBy(animal);
        }
        public void sleep(Animal animal) {
            animal.world().findBarn().sleptInBy(animal);
        }
    },
    Lion {
        public void eatFood(Animal animal) {
            animal.world().findPrey().consumeBy(animal);
        }
        public void sleep(Animal animal) {
            animal.world().findSavana().sleptInBy(animal);
        }
    },
;
    public abstract void eatFood(Animal animal);
    public abstract void sleep(Animal animal);
}

So, as you can see, each animal has its own behavior. This particular flyweight implements the Strategy pattern for both eatFood() and sleep().
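Since the World class above is left to the imagination, here is a self-contained sketch of the same delegation that actually runs. The feed method, the nourishment numbers, and the two-animal enum are simplifications I made up so there is something observable.

```java
// Hypothetical sketch: the enum constants carry the behavior,
// the Animal instance carries the state.
enum AnimalType {
    Cow {
        void eatFood(Animal animal) {
            animal.feed("grass", 3);
        }
    },
    Lion {
        void eatFood(Animal animal) {
            animal.feed("prey", 8);
        }
    };

    abstract void eatFood(Animal animal);
}

final class Animal {
    private final AnimalType type;
    private int hungerLevel = 10;
    private String lastMeal = "nothing";

    Animal(AnimalType type) {
        this.type = type;
    }

    // Delegates to the shared flyweight, passing this animal's state along.
    void eatFood() {
        type.eatFood(this);
    }

    void feed(String food, int nourishment) {
        lastMeal = food;
        hungerLevel = Math.max(0, hungerLevel - nourishment);
    }

    int hungerLevel() {
        return hungerLevel;
    }

    String lastMeal() {
        return lastMeal;
    }
}

class AnimalDemo {
    public static void main(String[] args) {
        Animal cow = new Animal(AnimalType.Cow);
        cow.eatFood();
        System.out.println(cow.lastMeal() + ", hunger " + cow.hungerLevel()); // prints "grass, hunger 7"
    }
}
```

However many cows exist, they all share the single AnimalType.Cow instance, and there is no switch statement anywhere.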

Shrinking Source Code: Java initialization

Saturday, October 20th, 2007

There have been a few discussions on how to do a particular task with the smallest amount of “code”. Some people talk about this with regard to source code, and others with regard to object code (a.k.a. machine instructions or byte code). While the latter has some actual application, it’s often more “fun” to talk about the former.

Shrinking source code down for no other reason is generally bad practice, but it is an interesting exercise. I think that in this article we can distill the basic concept down to what is the smallest valid (in characters) Java source file that will compile, and when run does absolutely nothing.

For our first attempt, let’s try the straightforward approach, not bending any rules.

class C{public static void main(String[]a){}}

That is 45 characters long. This compiles (javac C.java) and executes (java C). Nothing spectacular, and there doesn’t appear to be anything superfluous there, but I assure you there is.

Think back to the JLS. Specifically, before the JVM can execute main() on a class, it must initialize it first (JLS 12.4.1). This gives us another way to execute code.

class C{static {} public static void main(String[]a){}}

This is a little longer, but bear with me. It still compiles and executes, just like our previous version. What about removing main now?

class C{static{}}

This is indeed very short, and compiles just fine, but unfortunately we get Exception in thread “main” java.lang.NoSuchMethodError: main. Well, that doesn’t exactly do nothing, which is what our goal is. Note that we can put code into the static initializer.

class C{static{System.exit(0);}}

Now we’re down to 32 characters, and it compiles and does nothing. Sweet. What happens is that the JVM executes the static initializer before looking for the main method. The initializer tells the JVM to terminate (JLS 12.8), so it complies. Hence, no exception.

Is there anything else we can rid ourselves of? With the advent of Enums in Java 5, the answer is yes! We don’t need to explicitly create a static initializer, because enums will do that for us.

enum C{A;{System.exit(0);}}

27 characters long, and it compiles and does nothing. Amazing. So what’s happening here? Enum types in Java are actually classes, and each constant is a single shared instance. In this case, the compiler creates a class C that extends Enum, with A as a static final field of type C. When we run java C, the JVM loads the class C, whose static initializer sets C.A = new C(…), which starts the instance initialization (JLS 12.5) process. Part of this process is to run the instance initializers declared in C. We added an instance initializer in C which kills the JVM.
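If you want to watch that ordering without killing the JVM, a sketch like the following records each step instead. InitLog and Token are names I made up; the log lives in a separate class because an enum’s instance initializer may not touch the enum’s own non-constant static fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a log that lives outside the enum.
final class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

enum Token {
    A;

    // Instance initializer: runs while the constant A is being constructed.
    {
        InitLog.EVENTS.add("instance initializer");
    }

    // Explicit static initializer: runs after the constants have been created.
    static {
        InitLog.EVENTS.add("static initializer");
    }
}

class InitOrderDemo {
    public static void main(String[] args) {
        InitLog.EVENTS.add("before first use");
        Token token = Token.A; // first use triggers initialization of Token
        InitLog.EVENTS.add("after first use");
        // prints "[before first use, instance initializer, static initializer, after first use]"
        System.out.println(InitLog.EVENTS);
    }
}
```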

So there you have it. The smallest possible Java program which does absolutely nothing (as far as I know). When trying to create the smallest possible Java program which does something, you would probably be wise to start from this template, unless you find a tricky way of getting existing library code to close the JVM after doing your bidding. Not terribly useful, but somewhat entertaining, and it might help give you a better understanding of what goes on under the hood.

As with most of my esoteric Java features, don’t try this in production code.

Learning to code with a speech impediment: Or, What’s all the buzz about Lisp?

Monday, October 8th, 2007

Although I started my journey through Computer Science much later than many people (in 1989), I feel that I’ve experienced the evolution of paradigms, recreated just for me.

I started out when I was 8, programming in BASIC on the Atari 800. The paradigm there was “unstructured programming”. I used line numbers and *gasp* GOTO to manage flow control, had no idea what a subroutine was, and used one-letter variable names. I even learned the hard way that you need to preload DOS before you start your project, or you’ll never be able to save it.

Then I learned structured programming in QBasic. These things called for/next and do/while and even subroutines! Line numbers were no longer necessary or desirable, and labels became commonplace. I even got into some basic data encapsulation (not abstraction), by using the basic structure mechanism QBasic offered.

After that, I learned to program in C. Not much of a difference though. Sure it was lower level, and ran 50 times faster. I could access the VGA hardware directly and do some pretty nifty things. Around the same time I also began to learn Assembly. This brought me back to GOTO to manage flow control, but it fit for the language.

Eventually, I learned C++. Wow, this is great! A much better C! I didn’t understand most OO principles at the time, so I was still stuck in Structured/Procedural land.

I started studying Java. One day, an epiphany! I (thought I) understood what OOP was all about. Combine related data into one place, and the procedures that affect that data also go in one place. It made so much sense, I couldn’t imagine how I thought it was any other way.

Well, then I started to get really complicated programs, and couldn’t figure out why they were so brittle. Oh, what’s this about Patterns and Refactoring? Ah, I see. I should abstract the data and behavior more fully. I should use polymorphism! Duh, why didn’t I understand this before! Strategy and State, Dependency Injection, Command, Flyweight, how I missed you and didn’t even know it!

That was my most recent epiphany. I realized even as it occurred that there had to be more. There had to be a better way to do things. Guess what. There is.

Enter: Lisp. Keep in mind that I’m still very new at Lisp, and this is about my journey and impressions.

The attention to Lisp has been centered in academic circles for a long time. It was originally conceived in the 1950s by John McCarthy. It is an elegant language that focuses on possibilities, not syntax. Many people see some example Lisp, and their first thought is generally “What’s with all those parentheses?” I have a counter, though. Look at some Java code, and ask “What’s with all the dots, commas, brackets, braces, ambiguities, and weird structure?”

The core of Lisp is to provide very small building blocks to work with. In this regard it’s much like a low-level language, except for the fact that it doesn’t mirror machine language very closely. With these building blocks, it’s surprisingly easy to adopt whatever paradigm best fits your problem.

You can choose between Functional and Procedural programming (and even take the hybrid approach). You can use OO design or not. You have first-class functions and Closures (passing and returning Functions/Procedures as values).

You can even treat code as data. This might not sound so impressive, but think about all of the repetitive code you’ve had to write or copy’n’paste. Think about how often you’ve thought “Maybe I could write a program to help me write this program.” For those of you who’ve used C, think about all the #define hacks you’ve created. Now, realize that Lisp lets you do this inline, and in Lisp! Using something called “defmacro”, you can create a macro that writes your program for you! And most of the time, it’s as easy as simple list manipulation.

So, what brought me to Lisp in the first place? I was thinking about creating a programming language that handled all of the special things that I wanted. I wrote a post on the newsgroup comp.object that detailed a brain dump of all the features I wanted. I had a few replies that either criticized a feature, or pointed out that Lisp supported a feature already. I did a little investigation and found that not only did Lisp support all the features I wrote about, it supported more! And on top of that it was extensible (with defmacro).

Well, I’m on this journey for better or worse. If I find that Lisp is best left to the Academics, so be it. I enjoy academia myself, and I’m sure that I’ll find a way to apply the principles of Lisp to write better code in other languages. If I find out that Lisp is *the* language, then all the better.

Off Topic: Separation of Church and State.

Tuesday, October 2nd, 2007

I saw an interesting article in the paper today.  Apparently separation of Church and State means that you should ban prayer in the library.  Somehow, I disagree with that interpretation.  I’ll admit that I’m not religious, but I believe that people should have the right to worship however they see fit, without interfering with others’ rights.

If a space is open to the public (as this library is), why should people be allowed to view pornography in a library, but not hold a religious meeting? Does the library not have a copy of the Bible, Koran, or Torah?

My interpretation of the doctrine of separation of Church and State is that the state must not endorse or favor one religion over another.  The doctrine should not persecute Christians who want to pray in school or the library.  It should be an argument to allow it.  By telling people what religious activities they can do and where, you are (to coin a programming term) coupling church and state.  The church is now dependent on what the state says is allowable.

Writing as an artform.

Tuesday, October 2nd, 2007

When we write, we seek an elegance in our expression. We intend the writing to be read either by ourselves or others. Stylistic devices help us convey our intent. Rewriting and moving parts around comes naturally, if we think it’s better.

There are also different levels of writing. I feel we label them backwards, when we say “reading level”. Some writings are much more difficult to read, due to the structure used or the difficulty encountered while trying to express your intent with the written language. The product of some authors is easier to read because it is better organized in both its structure and expressions.

It can be difficult to express certain things unless we assume a reading level of our audience, and by all rights we should assume the audience is capable of understanding, provided the structure is as elegant as possible.

Usually, the same meaning can be expressed in many different ways. Sometimes we strive to find a concise expression, even if that conciseness of expression increases the reading level.

You might feel like asking me “So, what’s with the essay on writing? Isn’t this a Tech Blog? It says so right in the title!” I never said I was talking about writing in English. All of those statements apply to writing a program as much as (if not more than) writing an essay. Why is it that we spend so much time teaching programmers-to-be about syntax and making the compiler understand, and so little time on teaching them how to make an understandable and expressive source file for humans?

We all look back on our early projects and think “WTF was I thinking?” Even our comments were unhelpful. /* this statement tells the readers that the comments didn’t help */

For people who are passionate about developing software, the realization comes eventually that code can be expressive without the need for comments. Comments become useful only to explain the justification for choosing one approach over another, instead of reiterating what the code says it does. Breaking a procedure down into its component parts and naming them appropriately not only allows a person to read the code in a more natural manner, it helps the original author not repeat themselves.

It could be that part of the problem is that textbooks use comments to explain syntax to the students.

/* The following line assigns the value “foo” to the variable bar. */

This may be useful as a teaching tool, but when you’re writing software in a professional environment, you’re (usually) not trying to teach your colleagues the syntax of that particular programming language. Computer Science teachers often compound the problem by setting a comment quota. The only time I felt that was at all justified was with my assembly language teacher, who required more than one comment per line of actual code.

The moral of the story? Separate the teaching of syntax and the teaching of how to write a program. Teach people that using a meaningful name can be more powerful than two comments (one at the call, and the other at the definition). Help people understand that your writing isn’t meant only for the compiler, but as much (if not more) for the human.

Make the world a better place. If you feel the need to comment on what a piece of code does, turn it into a function/method/procedure call whose name implies your would-be comment. Do this even for conditions. “if (cans * pricePerCan < goal)” can be rewritten as “if (belowPriceGoalForCans())”.
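Spelled out in Java, that refactoring might look something like this. The FundraiserProgress class, its fields, and the status strings are invented for the sketch; only the condition and the method name come from the example above.

```java
// Hypothetical sketch: a condition pulled out into a named method.
final class FundraiserProgress {
    private final int cans;
    private final double pricePerCan;
    private final double goal;

    FundraiserProgress(int cans, double pricePerCan, double goal) {
        this.cans = cans;
        this.pricePerCan = pricePerCan;
        this.goal = goal;
    }

    // The name says what the would-be comment would have said.
    boolean belowPriceGoalForCans() {
        return cans * pricePerCan < goal;
    }

    String status() {
        if (belowPriceGoalForCans()) {
            return "keep collecting";
        }
        return "goal reached";
    }
}

class FundraiserDemo {
    public static void main(String[] args) {
        System.out.println(new FundraiserProgress(10, 0.5, 6.0).status()); // prints "keep collecting"
        System.out.println(new FundraiserProgress(20, 0.5, 6.0).status()); // prints "goal reached"
    }
}
```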

Food for thought: What does “the business layer” really mean?

Tuesday, October 2nd, 2007

The other day I read a blog post titled The Mythical Business Layer. The post talks about the dangers of trying to separate the “Business Logic” from all the different layers of a system.

One of the more important points that this article makes is:

There is absolutely nothing wrong with having a multi-layered application. In many cases, anything but that would be a bad design. It’s absolutely critical, however, to not think of these layers as persistence, business, and presentation. Database, processing, and user interface are much more appropriate terms.

I think that all of us can get stuck on trying to move some logic into a place where it doesn’t really belong. If moving the logic into another location causes your system to be more complicated, then it’s probably the wrong place to move it.