Daniel Pitts’ Tech Blog

Posts Tagged ‘Java’

Why is C so slow? Java vs. C benchmark.

Saturday, December 8th, 2007

Recently I’ve seen a few attacks on Java’s performance on comp.lang.java.programmer, so I decided to write my own benchmark and test it myself. I expected the C version to perform slightly better than the Java version, or at least to land in the same range. I was surprised that the Java version performed better, on both the client VM and the server VM.

I did my own benchmarks using these files:

bench.c

#include <stdio.h>
#include <time.h>

void bench() {
  long foo = 0;
  clock_t start = clock();
  for (long i = 1; i < 5000; ++i) {
    for (long j = 1; j < i; ++j) {
      if ((i % j) == 0) {
        foo ++;
      }
    }
  }
  clock_t end = clock();
  printf("%ld %dms\n", foo,
     (int) ((end - start) * 1000 / CLOCKS_PER_SEC));
}

int main() {
  for (long i = 1; i < 10; ++i) {
    printf("%ld: ", i);
    bench();
  }
}

Bench.java

public class Bench {
  static final long CLOCKS_PER_SEC = 1000;
  static void bench() {
    int foo = 0;
    long start = System.currentTimeMillis();
    for (int i = 1; i < 5000; ++i) {
      for (int j = 1; j < i; ++j) {
        if ((i % j) == 0) {
          foo ++;
        }
      }
    }
    long end = System.currentTimeMillis();
    System.out.printf("%d %dms\n", foo,
       (int) ((end - start) * 1000 / CLOCKS_PER_SEC));
  }

  public static void main(String[] args) {
    for (int i = 1; i < 10; ++i) {
      System.out.printf("%d: ", i);
      bench();
    }
  }
}

Then I ran these:

-bash-3.00$ java -version
java version "1.5.0_09"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_09-b03)
Java HotSpot(TM) Client VM (build 1.5.0_09-b03, mixed mode, sharing)
-bash-3.00$ javac Bench.java
-bash-3.00$ g++ --version
g++ (GCC) 3.3.3 (NetBSD nb3 20040520)
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-bash-3.00$ g++ bench.c -o bench

Now I’m ready to run the individual tests:

-bash-3.00$ java -server Bench
1: 38357 457ms
2: 38357 416ms
3: 38357 401ms
4: 38357 394ms
5: 38357 394ms
6: 38357 401ms
7: 38357 395ms
8: 38357 401ms
9: 38357 394ms
-bash-3.00$ java -client Bench
1: 38357 421ms
2: 38357 400ms
3: 38357 394ms
4: 38357 400ms
5: 38357 393ms
6: 38357 393ms
7: 38357 400ms
8: 38357 394ms
9: 38357 401ms
-bash-3.00$ ./bench
1: 38357 450ms
2: 38357 440ms
3: 38357 450ms
4: 38357 430ms
5: 38357 450ms
6: 38357 440ms
7: 38357 450ms
8: 38357 440ms
9: 38357 450ms

As you can see, the Java version is approximately 10% faster than the C version. So, here is my challenge. Why is C so slow? I thought it was supposed to be faster than Java.
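
For anyone who wants to repeat the measurement, here is a minimal sketch of the same kernel pulled out into its own method and timed with System.nanoTime() instead of System.currentTimeMillis(). The class name DivisorBench and the method name countProperDivisors are my own for illustration, and I make no claim about what numbers it will print on your machine.

```java
public class DivisorBench {
    // Counts pairs (i, j) with 1 <= j < i < limit and i % j == 0,
    // which is the same work as the nested loops in bench() above.
    static int countProperDivisors(int limit) {
        int count = 0;
        for (int i = 1; i < limit; ++i) {
            for (int j = 1; j < i; ++j) {
                if (i % j == 0) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Repeat the run so that later iterations measure JIT-compiled code.
        for (int run = 1; run < 10; ++run) {
            long start = System.nanoTime();
            int result = countProperDivisors(5000);
            long elapsedMicros = (System.nanoTime() - start) / 1_000;
            System.out.printf("%d: %d %dus%n", run, result, elapsedMicros);
        }
    }
}
```

Keeping the kernel in a method that returns its result makes it easy to check that both languages compute the same count (38357 in the runs above) before comparing times.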

More Discussion On Operator Overloading

Wednesday, December 5th, 2007

Updated: See notes below.

I was surprised to see that within one day of posting my previous entry on Operator Overloading, I received several comments. Aviad Ben Dov from Chaotic Java even took my idea and ran with it. Ricky Clarkson suggested using Haskell’s approach of allowing anything that is of the “Num” type to define +, -, /, *, etc. I have a few things to add to this discussion.

Aviad’s idea for operators by interface is not a bad one; it works well for overloading “[]” but it breaks down on a few use cases (such as ‘+’, ‘*’, etc.) that are important to me. Ricky’s idea for subtypes of a specific class getting to have operator overloading isn’t bad either, but for physical unit manipulation it is too inflexible. The core concept that both of them seem to suggest is that only a limited selection of types can have overloaded operators, but the operations I want aren’t limited to the scalar quantities that such a restriction would allow.

Suppose I have the classes Distance and Area, and the built-in “scalar” type Double. I would expect at least these operations:
Distance * Distance => Area
Distance * Double => Distance

If I had to implement the Multipliable<T> interface, I wouldn’t be able to handle both Distance * Distance and Distance * Double. You can’t implement an interface twice, even with different type parameters. I don’t know if this is something that reified generics would fix, but it feels like it might be. Maybe someone could comment on that.
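
Here is a rough sketch of that limitation. The two-parameter Multipliable<T, R> interface is my own hypothetical variant (the result type has to go somewhere), and Distance and Area are bare-bones stand-ins rather than a real units library.

```java
public class MultipliableDemo {
    // Hypothetical operator interface: "this * T yields R".
    interface Multipliable<T, R> {
        R times(T other);
    }

    static final class Area {
        final double squareMeters;
        Area(double squareMeters) { this.squareMeters = squareMeters; }
    }

    // Implementing Multipliable<Distance, Area> once is fine. Adding
    // ", Multipliable<Double, Distance>" to the implements clause is
    // rejected by javac: a class cannot inherit the same generic
    // interface with two different sets of type arguments.
    static final class Distance implements Multipliable<Distance, Area> {
        final double meters;
        Distance(double meters) { this.meters = meters; }

        @Override
        public Area times(Distance other) {
            return new Area(meters * other.meters);
        }

        // Still possible as a plain overload, but it is no longer
        // visible through the Multipliable interface.
        public Distance times(double factor) {
            return new Distance(meters * factor);
        }
    }

    public static void main(String[] args) {
        Distance d = new Distance(3.0);
        System.out.println(d.times(new Distance(4.0)).squareMeters);
        System.out.println(d.times(2.0).meters);
    }
}
```

The overload keeps both operations available, but only one of them can be the one an interface-driven operator would find.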

Also, if Distance had to extend Number, what would doubleValue return? Meters? Inches? Smoots? There might be some way to solve these problems, but I can’t think of a way to prevent abuse while allowing good use.

Actually, now that I have thought a little about it…

The semantics of plus (+), minus (-), times (*), dividedBy (/), moduloOf (%), shiftLeft (<<), shiftRight (>>), unsignedShiftRight (>>>), or (|), and (&), xor (^), negative (-), and inverse (~) are all well-defined enough for so many not-necessarily-numeric types that allowing, even if only through naming conventions, the overloading of those operations seems like a good idea.

I think a good way to go would be to convert at compile time a * b to the method call a.times(b). Assignment operators like a += b would be replaced with a = a.plus(b). This would help reduce abuse while creating a more expressive language. The assignment operator rule is important, as it will help prevent the “clever” idiom of using += for appending elements to a collection.
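
To make the rewrite rule concrete, here is a small sketch written in today’s Java, with the operator form shown in comments next to the method call it would turn into. Distance is a hypothetical immutable type; nothing here is an existing API.

```java
public class OperatorDesugarDemo {
    // Hypothetical immutable unit type following the naming convention.
    static final class Distance {
        final double meters;
        Distance(double meters) { this.meters = meters; }
        Distance plus(Distance other) { return new Distance(meters + other.meters); }
        Distance times(double factor) { return new Distance(meters * factor); }
    }

    // Returns {original, a after "+=", product}, all in meters.
    static double[] run() {
        Distance a = new Distance(2.0);
        Distance original = a;          // second reference to the same object
        Distance b = new Distance(3.0);

        a = a.plus(b);                  // what "a += b" would compile to
        Distance c = a.times(2.0);      // what "a * 2.0" would compile to

        return new double[] { original.meters, a.meters, c.meters };
    }

    public static void main(String[] args) {
        // prints [2.0, 5.0, 10.0]
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```

Because a += b only rebinds the variable a, the object that original points to is untouched. That is exactly why the rule discourages using += as an in-place “append”.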

Note on updates: I previously misspelled Aviad as “Avaid”. I also have added clarification for which use-cases Aviad’s Indexer doesn’t work for me, namely for algebraic operators.

Almost Useful: Operator overloading

Tuesday, December 4th, 2007

On Sun’s site, there is an open bug for operator overloading. Many people have pointed out that Java has one special case of operator overloading (String + String), so why not allow the programmer to overload operators?

Operator overloading would become especially useful with the addition of the units and measures API, or of similar custom libraries. It is particularly helpful when trying to avoid primitive obsession by creating numeric-like types.

Imagine this case:

Speed s = endDistance.minus(startDistance).divide(duration);

could be simplified to

Speed s = (endDistance - startDistance) / duration;

This of course is a simple example, and yet one that I would love to use in some of my existing code-bases.

Another use case would be a cleaner syntax for lists/maps:

myMap["Hey"] = "There";
System.out.println(myList[10]);

And hey, what about a special case for compareTo? Although it might be too dangerous to overload =/==, I could see overloading <, >, <=, and >=. It might be nice to add a couple of operators to the mix. I’m officially suggesting “#” for concatenation. Maybe “:=” as shorthand for .equals().

Almost Useful: Java Type Intersection.

Friday, November 23rd, 2007

First, for my subscribers that celebrate it, Happy Thanksgiving!

I’ve written about it before, but I think it’s worth revisiting. Type intersection would be a highly useful feature if not for one thing. “It is not possible to write an intersection type directly as part of a program; no syntax supports this.” - JLS (§4.9). That seems like a poor excuse to not support a feature as potentially powerful as type intersection could have been.

They were bold enough to add syntax all over the place for Generics. They added new syntactical meanings for ‘?’, ‘<’, and ‘>’. As a matter of fact, they added a syntax within that construct for handling Type Intersections. Would it really be that difficult to reuse that syntax outside of capture conversions and type inference? Heck, maybe even make it reifiable, although that’s not *as* important.

One example others have used in the past where it would be useful to have this type intersection is with the marker interface RandomAccess. While marker interfaces are less useful now that we have annotations, RandomAccess nonetheless exists, and can be useful for ensuring that the user of an algorithm passes in a compatible list.

For example, it’s quite possible to do the following, even with the currently crippled implementation of type intersection.
<T extends List<String> & RandomAccess> void foo(T list) { ... }
You know that you’re getting a random access list. The pain point is that you cannot do the following:
<T extends List<String> & RandomAccess> T foo() { return new ArrayList<String>(); }
The reason that isn’t legal is quite simple, even if it’s not obvious. T is any type that satisfies List<String>&RandomAccess, and the caller picks it, so you don’t know that it is an ArrayList. The caller might write MyNonArrayList<String> list = foo(); Oops, that would be an incompatible assignment.
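
Here is a small sketch of the part that does work today, the intersection bound on a parameter. The method name last and the class name are mine, chosen just for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.RandomAccess;

public class IntersectionBoundDemo {
    // Accepts only lists whose static type is also RandomAccess,
    // so get(index) can be assumed to be cheap.
    static <T extends List<String> & RandomAccess> String last(T list) {
        return list.get(list.size() - 1);
    }

    public static void main(String[] args) {
        ArrayList<String> names = new ArrayList<>();
        names.add("first");
        names.add("second");
        System.out.println(last(names)); // prints "second"

        // java.util.LinkedList<String> linked = new java.util.LinkedList<>();
        // last(linked); // does not compile: LinkedList is not RandomAccess
    }
}
```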

The better approach would be to have the return type be an explicit type that is List<String>&RandomAccess. As a matter of fact, my suggestion is to use that syntax exactly, unless there is a compiler-grammar reason not to. So, our T foo() line becomes:
List<String>&RandomAccess foo() { return new ArrayList<String>(); }
So then we can do: List<String>&RandomAccess list = foo(); Actually, we could just use List<String> list=foo() if we don’t care about RandomAccess.

An important addition to make to this would be casting. For legacy support, if I have a List<String>, but I know that it should be an ArrayList (or some other RandomAccess), I should be able to cast: foo((List<String>&RandomAccess)list);
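
For what it’s worth, Java 8 and later do accept an intersection type in a cast expression, which covers this legacy case. The sketch below assumes such a compiler; the helper names first and firstOfLegacy are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.RandomAccess;

public class IntersectionCastDemo {
    // Same kind of bound as foo(T list) above.
    static <T extends List<String> & RandomAccess> String first(T list) {
        return list.get(0);
    }

    // The caller only has a List<String>; the cast asserts RandomAccess.
    // It throws ClassCastException at run time if the assertion is wrong.
    static String firstOfLegacy(List<String> legacy) {
        return first((List<String> & RandomAccess) legacy);
    }

    public static void main(String[] args) {
        List<String> legacy = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println(firstOfLegacy(legacy)); // prints "a"
    }
}
```

The cast is checked at run time, so passing a LinkedList fails loudly instead of silently running a slow algorithm.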

Shrinking Source Code: Java initialization

Saturday, October 20th, 2007

There have been a few discussions on how to do a particular task with the smallest amount of “code”. Some people talk about this with regard to source code, and others with regard to object code (a.k.a. machine instructions or byte code). While the latter has some actual application, it’s often more “fun” to talk about the former.

Shrinking source code down for its own sake is generally bad practice, but it is an interesting exercise. In this article, I distill the idea down to one question: what is the smallest valid Java source file (in characters) that compiles and, when run, does absolutely nothing?

For our first attempt, let’s try the straightforward approach, not bending any rules.

class C{public static void main(String[]a){}}

That is 45 characters long. This compiles (javac C.java) and executes (java C). Nothing spectacular, and there doesn’t appear to be anything superfluous there, but I assure you there is.

Think back to the JLS. Specifically, before the JVM can execute main() on a class, it must initialize it first (JLS 12.4.1). This gives us another way to execute code.

class C{static {} public static void main(String[]a){}}

This is a little longer, but bear with me. It still compiles and executes, just like our previous version. What about removing main now?

class C{static{}}

This is indeed very short, and compiles just fine, but unfortunately we get Exception in thread “main” java.lang.NoSuchMethodError: main. Well, that doesn’t exactly do nothing, which is our goal. Note that we can put code into the static initializer.

class C{static{System.exit(0);}}

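Now we’re down to 32 characters, and it compiles and does nothing. Sweet. What happens is that the JVM executes the static initializer before looking for the main method. The initializer tells the JVM to terminate (JLS 12.8), so it complies. Hence, no exception. This depends on the launcher of the Java 5 and 6 era; from Java 7 on, the launcher checks for main before initializing the class, so the trick stops working there.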
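
If you want to see the ordering for yourself without killing the JVM, here is a minimal sketch that records events instead of calling System.exit. The class name and the EVENTS list are just for illustration; it shows that the static initializer runs before the body of main, not the missing-main trick itself.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderDemo {
    // Filled in the order in which the JVM runs each piece of code.
    static final List<String> EVENTS = new ArrayList<>();

    static {
        EVENTS.add("static initializer");
    }

    public static void main(String[] args) {
        EVENTS.add("main");
        System.out.println(EVENTS); // prints [static initializer, main]
    }
}
```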

Is there anything else we can rid ourselves of? With the advent of Enums in Java 5, the answer is yes! We don’t need to explicitly create a static initializer, because enums will do that for us.

enum C{A;{System.exit(0);}}

27 characters long, and it compiles and does nothing. Amazing. So what’s happening here? Enum types in Java are actually classes, and each constant is a single shared instance. In this case, the compiler creates one class, C, which extends Enum, and A is a static field holding an instance of C (a constant only gets its own subclass when it has a class body). When we run java C, the JVM loads the class C, which has a static initializer that sets C.A = new C(...), which starts the instance initialization (JLS 12.5) process. Part of this process is to run the instance initializers of C. We added an instance initializer in C which kills the JVM.
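
The same idea can be traced without System.exit. This is only a sketch: EVENTS and trace() are names I made up, and the enum is nested inside a holder class so that its instance initializer has somewhere legal to write to (an enum’s initializers may not touch the enum’s own non-constant static fields).

```java
import java.util.ArrayList;
import java.util.List;

public class EnumInitDemo {
    static final List<String> EVENTS = new ArrayList<>();

    enum C {
        A;

        // Instance initializer: runs once for each constant as it is created.
        { EVENTS.add("instance initializer, class " + getClass().getSimpleName()); }

        // Static initializer: runs after the constants have been created.
        static { EVENTS.add("static initializer"); }
    }

    static List<String> trace() {
        C first = C.A; // first use of C triggers its initialization
        return EVENTS;
    }

    public static void main(String[] args) {
        // prints [instance initializer, class C, static initializer]
        System.out.println(trace());
    }
}
```

The instance initializer fires during class initialization, before any explicit static block, which is what lets the 27-character version get away without writing static{}.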

So there you have it. The smallest possible Java program which does absolutely nothing (as far as I know). When trying to create the smallest possible Java program which does something, you would probably be wise to start from this template, unless you find a tricky way of having existing library code close the JVM for you after doing your bidding. Not terribly useful, but somewhat entertaining, and it might help give you a better understanding of what goes on under the hood.

As with most of my esoteric Java features, don’t try this in production code.