For performance, write it in C - Part 2, comparing C, Ruby and Java

Peter Hickman

This is the follow-up to my "Write it in C" post and is intended to
report the timings for the Java implementation that I said I would write
for Charles O Nutter and the Ruby version by Simon Kroeger. First let us
deal with the Ruby version.

The program differs from the Perl and C versions in that the various
values it requires are not precomputed. Simon's program is completely
self contained.

[Latin]$ time ruby latin.rb 5 > r5

real 0m35.793s
user 0m32.081s
sys 0m0.843s

This quite clearly pisses all over the Perl version, and yes the results
were correct. Both faster than the Perl version and considerably less
code, a testament to the power and expressiveness of Ruby.

Now the Java version. I will be honest here, I might be paid to program
in Java but it hasn't been my language of choice since around 1992. I
find it gets in my way and today it found yet another way to do it.

A straight translation like the C version worked fine for a 4 x 4 grid
but when I got to the 5 x 5 grid I got the following error 'code too
large'. Yes Java has hard coded limits as to the allowed size of various
data structures within class files and the Compared array of 120 x 120
boolean values could not be initialised with the following code:

    private static boolean[][] Compared = {
        {false, false, ...
        ...
        {true, true, ...
    };

I had to use a whole load of 'Compared[0][44] = true;' statements and the
like to get the data in. This got the 5 x 5 grid to run but the 6 x 6
grid blew up even that. Java has a 64KB limit for various structures in
the class file, including the bytecode of a single method (see
http://java.sun.com/docs/books/vmspec/2nd-edition/html/ClassFile.doc.html).
The last time that I had to work round such mind-numbingly arbitrary
limits was when I was programming Quick Basic. Now the timings.

[Latin]$ time ./j_version.sh 5 > j5

real 0m29.553s
user 0m13.813s
sys 0m10.745s

Sorry Java fans but "as fast as C" or "faster than C" it is not. It's
only a bit faster than Ruby despite far more resources having been
dedicated to speeding it up.

The really odd thing here is that Java should actually be much faster
than this. I did manage to get the 4 x 4 grid to be written with the
same initialisation method as the C version and the timings (admittedly
on a much smaller problem) were much closer to the C version for the
same 4 x 4 grid. The solution just didn't scale because of the 64KB
limit in the class files, which is probably not going to be changed any
time in the near future.
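
One possible way around the limit, offered as a sketch and not as what was benchmarked here, is to compute the Compared table at start-up instead of spelling it out in the source. The rule used below (two rows are compatible when they differ in every column) is an assumption inferred from the 3 x 3 listing posted later in this thread, and the class and method names are made up for illustration.

```java
public class ComparedTable {
    // Assumed rule: two rows are compatible when no column holds the same symbol.
    static boolean compatible(String a, String b) {
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) == b.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    // Builds the full table at runtime, so no large initialiser ends up in the class file.
    static boolean[][] build(String[] rows) {
        boolean[][] compared = new boolean[rows.length][rows.length];
        for (int i = 0; i < rows.length; i++) {
            for (int j = 0; j < rows.length; j++) {
                compared[i][j] = compatible(rows[i], rows[j]);
            }
        }
        return compared;
    }

    public static void main(String[] args) {
        // The six permutations used by the 3 x 3 version.
        String[] rows = { "321", "231", "213", "312", "132", "123" };
        boolean[][] compared = build(rows);
        for (int i = 0; i < rows.length; i++) {
            for (int j = 0; j < rows.length; j++) {
                if (compared[i][j]) {
                    System.out.println("Compared[" + i + "][" + j + "] = true");
                }
            }
        }
    }
}
```

The cost of building the table would then show up in the run time, so any timings taken this way are not directly comparable with the precomputed versions.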

In the interest of fairness I also looked at the timings of just the
execution of the C and Java versions so that the performance of the
compilers was not impacting the times. So here are the C and Java
versions without the precomputing phase and without the compiling.

[Latin]$ time ./latin > /dev/null 2>&1

real 0m1.961s
user 0m1.680s
sys 0m0.051s

[Latin]$ time java Latin > /dev/null 2>&1

real 0m15.483s
user 0m9.641s
sys 0m4.280s

There you have it, C is still faster by an order of magnitude.
Performance is yours for the asking, but it comes at a price - you have
to write it in C. Ease of development also comes at a price, you don't
get the same performance as C. Of course if you have a fear of C this
does show that you can go some of the way by converting to Java, if that
is fast enough for you then well and good but know this, C is faster.
 
Isak Hansen

Peter Hickman wrote:

*snip*
Now the Java version. I will be honest here, I might be paid to program
in Java but it hasn't been my language of choice since around 1992. I
find it gets in my way and today it found yet another way to do it.
*snip*

In the interest of fairness I also looked at the timings of just the
execution of the C and Java versions so that the performance of the
compilers was not impacting the times. So here are the C and Java
versions without the precomputing phase and without the compiling.

[Latin]$ time ./latin > /dev/null 2>&1

real 0m1.961s
user 0m1.680s
sys 0m0.051s

[Latin]$ time java Latin > /dev/null 2>&1

real 0m15.483s
user 0m9.641s
sys 0m4.280s

There you have it, C is still faster by an order of magnitude.
Performance is yours for the asking, but it comes at a price - you have
to write it in C. Ease of development also comes at a price, you don't
get the same performance as C. Of course if you have a fear of C this
does show that you can go some of the way by converting to Java, if that
is fast enough for you then well and good but know this, C is faster.

When people want 'speed' they care about how fast the code runs, not
JVM startup time..

Your benchmarks are utterly irrelevant, sorry.


Isak
 
Peter Hickman

Isak said:
When people want 'speed' they care about how fast the code runs, not
JVM startup time..

Your benchmarks are utterly irrelevant, sorry.


Isak

Interesting, so just how do you run a Java program without the JVM start-up time?

And if you can't run a Java program without the JVM start-up then your point is
what exactly?
 
Pedro Côrte-Real

Interesting, so just how do you run a Java program without the JVM start-up time?

You can time it inside java by fetching the system clock before and after.
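
A minimal sketch of that idea, assuming the work to be measured can be wrapped in a single call. The work() method below is a made-up stand-in, not the Latin squares code; System.nanoTime() is the standard library clock intended for measuring elapsed time.

```java
public class WallClockTiming {
    // Hypothetical stand-in for the real computation being measured.
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Read the clock before and after the call; JVM start-up is excluded
        // because the clock is first read once main() is already running.
        long start = System.nanoTime();
        long result = work(10_000_000);
        long elapsedNanos = System.nanoTime() - start;

        System.out.println("result = " + result);
        System.out.printf("elapsed: %.3f ms%n", elapsedNanos / 1e6);
    }
}
```

This reports elapsed wall-clock time only, so anything else the machine is doing during the run is included in the figure.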

And if you can't run a Java program without the JVM start-up then your point is
what exactly?

It makes sense to do that for long running applications but in this
case it doesn't. If what you really wanted to do was calculate this
latin squares thing then the startup time matters. Are there any JVMs
around that keep a shared daemon running for all processes to share so
as to avoid some of the startup time?

Pedro.
 
Roland Schmitt

Peter said:
Interesting, so just how do you run a Java program without the JVM
start-up time?

And if you can't run a Java program without the JVM start-up then your
point is what exactly?

It's like measuring database performance and including the startup time
of the database server. Sure, MySQL's startup is faster than Oracle's,
but does it make sense? I don't know...

Regards,
Roland
 
Kroeger, Simon (ext)

From: Isak Hansen [mailto:[email protected]]
Sent: Friday, July 28, 2006 1:15 PM
When people want 'speed' they care about how fast the code runs, not
JVM startup time..

Of course each Benchmark is to be taken with a lot of caution, but I
doubt the JVM takes more than 13 seconds to start, even on a slow
machine.

cheers

Simon
 
Peter Hickman

That would only get me the elapsed time of the execution which does not
reflect the actual time taken to run the program. Some higher priority
task may switch in and suspend the Java program before it completes and
thus give it a much worse timing. And then you can bet they would be
pointing this out and saying that the timings were 'utterly irrelevant'.
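
For what it is worth, the JVM can also report CPU time for the current thread, which should be less sensitive to other tasks being switched in than the wall clock is. The following is only a sketch, assuming a JVM that supports thread CPU timing: work() and cpuNanosFor() are made-up names, and the figure covers the calling thread only, not the garbage collector or JIT compiler threads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CpuTiming {
    // Hypothetical stand-in for the real computation being measured.
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i % 7;
        }
        return sum;
    }

    // Returns the CPU nanoseconds the calling thread spent in work(n),
    // or -1 if this JVM cannot measure thread CPU time.
    static long cpuNanosFor(int n) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported()) {
            return -1;
        }
        long start = threads.getCurrentThreadCpuTime();
        long result = work(n);
        long used = threads.getCurrentThreadCpuTime() - start;
        if (result < 0) { // use the result so the call cannot be dropped as dead code
            throw new IllegalStateException("unexpected negative sum");
        }
        return used;
    }

    public static void main(String[] args) {
        long wallStart = System.nanoTime();
        long cpuNanos = cpuNanosFor(10_000_000);
        long wallNanos = System.nanoTime() - wallStart;

        System.out.printf("wall clock: %.3f ms%n", wallNanos / 1e6);
        if (cpuNanos >= 0) {
            System.out.printf("thread CPU: %.3f ms%n", cpuNanos / 1e6);
        } else {
            System.out.println("thread CPU time is not supported on this JVM");
        }
    }
}
```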

Most people have to start up the JVM to run a Java program, so I
cannot in all honesty see this as being 'utterly irrelevant'. Hey, the
executable created from the C source has to be loaded into memory before
it is run, so this means that the timings for the C version are also
'utterly irrelevant'.

It looks like a troll. It posts like a troll. It is a troll.
 
Peter Hickman

I'm not too sure that your analogy holds. It's not like I included the
time to turn on my computer, log in, get the command prompt and type in
the commands. Do you really think that it is unreasonable to include the
start-up for the JVM when that is what you have to do to run the program?
 
Karl von Laudermann

Peter said:
Now the Java version. I will be honest here, I might be paid to program
in Java but it hasn't been my language of choice since around 1992. I
find it gets in my way and today it found yet another way to do it.

Um, just a nitpick, but Java didn't exist in 1992. Unless you count Oak.
 
Roland Schmitt

Peter said:
I'm not too sure that your analogy holds. It's not like I included the
time to turn on my computer, log in, get the command prompt and type in
the commands. Do you really think that it is unreasonable to include the
start-up for the JVM when that is what you have to do to run the program?

But the startup time for the computer and so on is the same regardless
of whether you execute a Java, C or Ruby program next... Sorry, my
intention was not to criticise your statements, and I think Robert is
right here when he says:
I feel he gets criticised a little bit too much for his post.

But I think there are two valid viewpoints.
I'm a professional Java developer, so at 8.00am I start the Eclipse
IDE and use it until 6.00pm, when I turn my computer off. In this case
the startup time of the Java VM is not of interest as long as the
overall performance of Eclipse "is good enough".

The second scenario is yours: a relatively small program where the
startup time of the VM is significant in relation to the overall time
the program is running.

Regards,
Roland
 
pat eyler

This is the follow-up to my "Write it in C" post and is intended to
report the timings for the Java implementation that I said I would write
for Charles O Nutter and the Ruby version by Simon Kroeger. First let us
deal with the Ruby version.

Was the code for the Java and Ruby versions posted somewhere in
the other thread? (I must admit, I tuned a lot of that thread out once it
became a shouting match about the relative speeds of C and Java.)
 
Peter Hickman

Hmm. I left Uni in 1992 and it was around then that my, obviously flaky,
memory says I was reading the O'Reilly Java in a Nutshell.

Digs out the brown book. Oh yes it is dated 1996, what the hell was I
doing for four years?

Good catch.
 
Peter Hickman

pat said:
Was the code for the Java and Ruby versions posted somewhere in
the other thread? (I must admit, I tuned a lot of that thread out once it
became a shouting match about the relative speeds of C and Java.)

The Ruby version was posted in the previous thread by Simon. I didn't
post the Java version because, code wise, it is pretty much a line for
line translation of the C version. But so you can see what the code was
like here is the 3 x 3 version. The 5 x 5 version is just too damn big
for a post, being as it is 5449 lines long!

public class Latin {
    private static int WidthOfBoard = 3;

    private static int NumberOfPermutations = 6;

    private static String[] OutputStrings = {
        "321",
        "231",
        "213",
        "312",
        "132",
        "123"
    };

    private static boolean[][] Compared = new boolean[6][6];

    private static int[] work = { 0, 0, 0 };

    private static void addARow(int row) {
        if (row == WidthOfBoard) {
            // The board is complete: print its rows separated by colons.
            for (int x = 0; x < WidthOfBoard; x++) {
                if (x == 0) {
                    System.out.print(OutputStrings[work[x]]);
                } else {
                    System.out.print(":" + OutputStrings[work[x]]);
                }
            }
            System.out.println();
        } else {
            // Try each permutation in this row and recurse only when it is
            // compatible with every row already placed.
            for (int x = 0; x < NumberOfPermutations; x++) {
                work[row] = x;

                boolean is_ok = true;
                if (row != 0) {
                    for (int y = 0; y < row; y++) {
                        if (!Compared[work[row]][work[y]]) {
                            is_ok = false;
                            break;
                        }
                    }
                }
                if (is_ok) {
                    addARow(row + 1);
                }
            }
        }
    }

    public static void main(String[] args) {
        // This nonsense is to get around the fact that Java will not allow
        // me to initialise a large array in the declaration ('code too large').

        Compared[0][2] = true;
        Compared[0][4] = true;
        Compared[1][3] = true;
        Compared[1][5] = true;
        Compared[2][0] = true;
        Compared[2][4] = true;
        Compared[3][1] = true;
        Compared[3][5] = true;
        Compared[4][0] = true;
        Compared[4][2] = true;
        Compared[5][1] = true;
        Compared[5][3] = true;

        addARow(0);
    }
}
 
M. Edward (Ed) Borasky

Peter said:
There you have it, C is still faster by an order of magnitude.
Performance is yours for the asking, but it comes at a price - you
have to write it in C. Ease of development also comes at a price, you
don't get the same performance as C. Of course if you have a fear of C
this does show that you can go some of the way by converting to Java,
if that is fast enough for you then well and good but know this, C is
faster.

In my younger days, I did a lot of development in assembler languages,
and for many years my main high-level language was FORTRAN. Towards the
end of my FORTRAN days (about 1990) I was still dropping into assembler
for speed, even though the (FORTRAN) compilers were quite good by that
time. C compilers really sucked, especially for numerical applications.

Now here's where I'm going to put on my asbestos suit. I think the
difficulty of C development is *vastly* exaggerated by the fans of
"dynamic/scripting/interpreted" languages! In addition, I think the
difficulty of *assembler* development is vastly exaggerated, except in
bizarre architectures. (Of course, x86 does border on bizarre, until you
get to 64-bit addressing). :)

So what is the source of "fear of C?"
 
John Gabriele

M. Edward (Ed) Borasky wrote:
[snip]

Now here's where I'm going to put on my asbestos suit. I think the
difficulty of C development is *vastly* exaggerated by the fans of
"dynamic/scripting/interpreted" languages! [snip]

So what is the source of "fear of C?"

Well, there are a number of "shoot yourself in the foot" and "C
pitfalls" type books out there which can give you a few ideas. My
guess is that, often, new C programmers get tripped up regularly on
things like:

* arrays vs. pointers, extern vs. static, and other possibly tricky
spots in the language,
* build issues, like dealing with cryptic makefiles and gcc args (e.g.
passing in -lfoo args in the right order),
* discipline with conventions on memory management.

But I agree with you that it's not so bad if you use it for what it's
good at. Maybe what's happened is, folks have a bad taste in their
mouth from trying to use C to write end-user apps, when it's really
best at lower-level libs, drivers, and number crunching.

---John
 
Chad Perrin

Nevertheless startup time matters for some programs, so Peter's post was
not useless; I feel he gets criticised a little bit too much for his post.

. . . and timings without JVM startup time are at least as "useless"
since there are similarly "irrelevant" parts of the total time for
completion of C, Ruby, Perl, Python, PHP, and other-language programs
that might be benchmarked. Unless we're going to eliminate all time
from all benchmarks that isn't strictly related to execution, we'd
better just admit "defeat" on this one, and include JVM time (especially
since there's no sane way to eliminate all time not strictly related to
execution in a "true" interpreted language -- making any possible Ruby
benchmarks "irrelevant" and "useless" by that argument).
 
Isaac Gouy

Kroeger said:
From: Isak Hansen [mailto:[email protected]]
Sent: Friday, July 28, 2006 1:15 PM
When people want 'speed' they care about how fast the code runs, not
JVM startup time..

Of course each Benchmark is to be taken with a lot of caution, but I
doubt the JVM takes more than 13 seconds to start, even on a slow
machine.

cheers

Simon

Startup time varies with the classes that are being loaded, so this
hello world comparison is only a wild-assed guess:
http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=hello&lang=all
 
Isak

Kroeger said:
From: Isak Hansen [mailto:[email protected]]
Sent: Friday, July 28, 2006 1:15 PM
When people want 'speed' they care about how fast the code runs, not
JVM startup time..

Of course each Benchmark is to be taken with a lot of caution, but I
doubt the JVM takes more than 13 seconds to start, even on a slow
machine.

This may seem like a reasonable assumption, but it really isn't that
simple. Read up on how the JVM/Hotspot works, it's interesting stuff really.


Isak
 
Mike Harris

Chad said:
. . . and timings without JVM startup time are at least as "useless"
since there are similarly "irrelevant" parts of the total time for
completion of C, Ruby, Perl, Python, PHP, and other-language programs
that might be benchmarked. Unless we're going to eliminate all time
from all benchmarks that isn't strictly related to execution, we'd
better just admit "defeat" on this one, and include JVM time (especially
since there's no sane way to eliminate all time not strictly related to
execution in a "true" interpreted language -- making any possible Ruby
benchmarks "irrelevant" and "useless" by that argument).

Uh, you start your interpreter/JVM/whatever, get everything started,
then time an operation. It's certainly not "impossible".
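
As a rough sketch of what that could look like in Java: run the operation a few times first so that class loading and JIT compilation have had a chance to happen, then time one more run. The work() method, the run count of five and the problem size are all arbitrary choices made for illustration, not anything from the benchmark discussed above.

```java
public class WarmedUpTiming {
    // Hypothetical stand-in for the operation being measured.
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i % 7;
        }
        return sum;
    }

    // Times a single call to work(n) and returns the elapsed nanoseconds.
    static long timeOnce(int n) {
        long start = System.nanoTime();
        long result = work(n);
        long elapsed = System.nanoTime() - start;
        if (result < 0) { // use the result so the call cannot be dropped as dead code
            throw new IllegalStateException("unexpected negative sum");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 5_000_000;

        // Warm-up runs: timed and printed only so the difference is visible.
        for (int i = 0; i < 5; i++) {
            System.out.printf("warm-up run %d: %.3f ms%n", i, timeOnce(n) / 1e6);
        }

        // The run that would be reported once everything is started.
        System.out.printf("measured run: %.3f ms%n", timeOnce(n) / 1e6);
    }
}
```

Whether the warmed-up figure or the cold one is the right number to quote is exactly what this thread is arguing about; the sketch only shows that both can be obtained.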
 
Chad Perrin

The limit is not arbitrary; it's to allow the JVM to maintain certain
constraints over the memory used by incoming class definitions, since
they're typically not garbage collected. It would not be advisable to allow
loading an extremely large class definition into permanent memory space,
eating up the entirety of the heap. Put your gigantic data in a separate
file and load it at runtime.
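
A sketch of what loading the table at runtime might look like. The file format (one line per row, '1' for true and '0' for false) and the names ComparedLoader, parse and load are assumptions made for this example; nothing in the thread specifies them.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class ComparedLoader {
    // Assumed format: each line is one row of the table, '1' = true, '0' = false.
    static boolean[][] parse(List<String> lines) {
        boolean[][] table = new boolean[lines.size()][];
        for (int r = 0; r < lines.size(); r++) {
            String line = lines.get(r).trim();
            table[r] = new boolean[line.length()];
            for (int c = 0; c < line.length(); c++) {
                table[r][c] = line.charAt(c) == '1';
            }
        }
        return table;
    }

    // Reads the whole file and converts it to a boolean table.
    static boolean[][] load(Path file) throws IOException {
        return parse(Files.readAllLines(file));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ComparedLoader <table-file>");
            return;
        }
        boolean[][] compared = load(Paths.get(args[0]));
        System.out.println("loaded " + compared.length + " rows");
    }
}
```

Because the data lives outside the class file, the size of the table no longer counts against any class file limit, at the price of some file I/O at start-up.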

I think you might be misunderstanding the usage of "arbitrary" here. An
arbitrary limit, in uses such as this, is one where someone picks a
"magic number" as a limit.

I have no fear of C. I have fear of making C work everywhere, which I do not
have to worry about with either Ruby or Java. I also have a fear of C
fanboys giving up on improving Ruby and always advising that people drop to
C for their problems.

If you really believe there's no worry about portability of Java code,
you haven't been dealing with multiplatform, multi-VM Java deployments
enough.
 
