Java vs C++ speed (IO & Sorting)

Jerry Coffin

[ ... ]
This shows that you are either lying or have a bad memory. The last
time I checked the group, many here were claiming that Java IO is much
slower.

Have a look at this post by Pete Becker Dinkumware, Ltd.
(http://www.dinkumware.com) posted to this very newsgroup...

http://groups.google.com/group/comp.lang.java.programmer/msg/1313c62be872ba7c?dmode=source


What is he trying to prove with these fake benchmarks?

Since you have been proven a liar by claiming no one has ever said IO
is slow in Java, I will ignore the rest of your post without bothering
to read.

In other words, you really DID read the rest, and since it proved you
(badly) wrong, you chose to find any kind of excuse you could to ignore
it!

Your logic is lousy in any case -- even if my doubt was proven wrong, it
means I was _wrong_, not that I'm lying. Looking at Pete's earlier post,
I find that I wasn't wrong either. While Pete posted some numbers and
you might _infer_ from those numbers that he claimed C++ would have a
major speed advantage on an I/O bound application, that's purely an
inference on your part. Pete did not say any such thing.

I'm left wondering if you don't really work on behalf of a C++ compiler
vendor, trying to smear the Java community by associating yourself with
them, and by that association trying to make it look like Java
programmers as a whole are stupid, illogical and dishonest.
 
Razii

#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <ctime>
#include <iterator>
#include <set>

// the main modification is here
#ifdef CSTDIO
#include "cstdio.h"
namespace s = JVC;
#else
namespace s = std;
#endif

// I've also gotten rid of the "using namespace std;" and explicitly
// qualified the names below.
//

int main() {

    typedef std::multiset<std::string> mss;
    mss buf;
    std::string linBuf;

    s::ifstream inFile("bible.txt");

    clock_t start = clock();

    while (s::getline(inFile, linBuf)) buf.insert(buf.end(), linBuf);


    s::ofstream outFile("output.txt");

    std::copy(buf.begin(), buf.end(),
              s::ostream_iterator<std::string>(outFile, "\n"));

    clock_t endt = clock();
    std::cout << "Time for reading, sorting, writing: "
              << double(endt - start) / CLOCKS_PER_SEC * 1000 << " ms\n";
    return 0;
}


Cut and pasted above to IOSort.cpp. Didn't do anything...


C:\>cl IOSort.cpp /O2
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

IOSort.cpp
C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\xlocale(342) : warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc
Microsoft (R) Incremental Linker Version 9.00.21022.08
Copyright (C) Microsoft Corporation. All rights reserved.


C:\>IOSort
Time for reading, sorting, writing: 328 ms
C:\>IOSort
Time for reading, sorting, writing: 328 ms
C:\>IOSort
Time for reading, sorting, writing: 312 ms
C:\>IOSort
Time for reading, sorting, writing: 328 ms
C:\>IOSort
Time for reading, sorting, writing: 312 ms
C:\>IOSort
Time for reading, sorting, writing: 312 ms
C:\>IOSort
Time for reading, sorting, writing: 328 ms

That's the same as I got before. Changing the file to bible2.txt (43 meg file):


C:\>IOSort
Time for reading, sorting, writing: 6938 ms

C:\>IOSort
Time for reading, sorting, writing: 3812 ms

C:\>IOSort
Time for reading, sorting, writing: 3828 ms

C:\>IOSort
Time for reading, sorting, writing: 4250 ms

C:\>IOSort
Time for reading, sorting, writing: 4750 ms


No improvement
 
Jerry Coffin

[ ... ]
Cut and pasted above to IOSort.cpp. Didn't do anything...

Now you've proven that you really did lie. You previously claimed that
you didn't look at this at all, but now you've proven that you really
have looked at it!
C:\>cl IOSort.cpp /O2

Perhaps you need to work on your reading skills as well. As I clearly
pointed out in my previous post, for this code to make any difference,
you have to define "CSTDIO" when you compile it. That would be done with
something like:

cl /DCSTDIO /O2 IOSort.cpp

As it stands, you're effectively compiling exactly the same code as
before, so it's no surprise that you got the same result.
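
For anyone following along, here is a minimal sketch of how a macro-selected namespace alias behaves. The namespaces `fast_io` and `std_io` and the function `selected_backend` are hypothetical stand-ins made up for illustration; they are not the contents of the `cstdio.h` header discussed in this thread.

```cpp
#include <string>

// Hypothetical stand-ins for two interchangeable I/O implementations.
// The real header from the thread (cstdio.h, namespace JVC) is not shown here.
namespace fast_io {
inline std::string name() { return "custom"; }
}
namespace std_io {
inline std::string name() { return "standard"; }
}

// The alias `s` is bound to one namespace or the other at compile time,
// depending on whether CSTDIO was defined on the command line.
#ifdef CSTDIO
namespace s = fast_io;
#else
namespace s = std_io;
#endif

// Every use of s:: picks up whichever namespace the alias refers to.
std::string selected_backend() { return s::name(); }
```

Built without the macro, `selected_backend()` returns "standard"; built with `/DCSTDIO` (or `-DCSTDIO` on GCC and Clang) it returns "custom". The source file is identical in both cases, which is why forgetting the define silently gives the old behaviour.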
 
Razii

Now you've proven that you really did lie. You previously claimed that
you didn't look at this at all, but now you've proven that you really
have looked at it!

You have proven that you are an idiot. When I posted the response, I
said I am not going to bother reading the rest of it. Like right now I
am responding to you without any clue what's below the text. After a
quick scroll when I saw some code, I changed my mind and read what you
are babbling about. How does that prove that I lied?
cl /DCSTDIO /O2 IOSort.cpp

As it stands, you're effectively compiling exactly the same code as
before, so it's no surprise that you got the same result.

No improvement...

C:\>cl /DCSTDIO /O2 IOSort.cpp
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

IOSort.cpp
C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\xlocale(342) : warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc
IOSort.cpp(12) : fatal error C1083: Cannot open include file: 'cstdio.h': No such file or directory

C:\>IOSort
Time for reading, sorting, writing: 6750 ms

C:\>IOSort
Time for reading, sorting, writing: 3781 ms

C:\>IOSort
Time for reading, sorting, writing: 3859 ms

C:\>IOSort
Time for reading, sorting, writing: 3828 ms
 
Razii

C:\>cl /DCSTDIO /O2 IOSort.cpp
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.

IOSort.cpp
C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\xlocale(342) : warning C4530: C++ exception handler used, but unwind semantics are not enabled. Specify /EHsc
IOSort.cpp(12) : fatal error C1083: Cannot open include file: 'cstdio.h': No such file or directory

Oops... I missed that
 
dave_mikesell

Complete nonsense. If you are using the VM to load and run your class
file, the time must count.

Do you count the time to boot the computer when running C++? :)
[Yes, I agree that if you did you'd have to do the same for Java!]

I have an environment where I can leave the JVM running and
load/run/unload class files. Why should the time to start
up that JVM matter in this test?

OK, I'll cede the point. When you carve away all of the costs of
running a Java program, like startup and GC (as Jerry pointed out),
and though not on-topic in this particular benchmark - memory
footprint (which I found to be on average 300% greater in the Java
version) - Java can perform about the same as C++.

But in real world applications, those things matter, which is why
seven years later you still see Java and C++ largely being used in the
same application domains that they were then.
 
Michael.Boehnisch

It's not that "natural". If you go back and read the old thread, you
will see that the claim by many people was that IO is really slow in
Java compared to C++. I posted this (that was 2001) and asked why I
see no great speed advantage for C++.

The choice of language should not make a big difference, at least no
longer. In ancient times, Java was executed as byte code only and
there was no JIT-Compiler. Major parts of the Java library also moved
to machine code - I believe that is why so many elementary classes in
the library are "final"; they are not byte code anymore and lose their
ability to be subclassed - a trade off from language purity to
efficiency. There may be other reasons, though, I do not claim to be a
Java wizard.

I can only assume, the complaints in the ancient thread about slow I/O
in java are just unchecked repetitions of "wisdom of the old", but
things changed while they were not looking.
I noticed that. The C++ version was posted to this group by someone
named Pete Becker (from dinkumware). My version was using set. It
didn't make much difference in Java (or even C++) whether I use 5000
or 50000 so I ignored that part.

Agreed. I removed the line from the C++ code completely and saw no
effect, too.
It's irrelevant to reading, writing and sorting. If you are going to
include everything like Java virtual machine load time, how about the
fact that IOSort.exe is 135 KB and IOSort.class is only 1 kb. Count
that as an advantage for Java :) The file size is 135 times smaller.

Eh? I used Visual Studio 2005 Pro for my peek on the code - 17kB size
of the optimized executable. Did you forget to strip debug
information?

The C++ program's size is a little burdened by the use of STL style
programming. Templates like this are not part of a link library, they
need to be compiled and linked into the executable, increasing the
size compared to a mere reference into a dynamic link library. [C++
fans, don't misunderstand me here - I really appreciate the STL and
think it's one of the most versatile parts of the language, well worth
the little increase in exe file size.]

I do not think the global memory footprint is favouring the Java side,
though - the Java interpreter and JIT compiler and libraries are
loaded in addition to your .class file. Also, have a look at the JIT
cache in the filesystem, where your .class file gets a way larger
machine language duplicate.
Disregarding the cache, you need to have large programs, or a lot of
them, to compensate for the runtime environment. Asymptotically, Java
wins here, of course :)

best,

Michael.
 
Jerry Coffin

And since your code didn't compile, that's the end of that.

Go back and re-read what was posted. The code is in two parts, one a
modified version of your program, the other the header file you need
to save (with the correct name). That header is conditionally included
into the modified version of your program, and used when CSTDIO is
defined.
 
Eric.Malenfant

That's the real problem. It makes for a lot of extra work when
using Java in large applications. (For small applications,
Java's actually not too bad. Although I find that once you've
gotten used to programming cleanly, it's frustrating to not be
able to.)

This makes me curious: Could you elaborate?
 
Christopher

This topic was on these newsgroups 7 years ago :)

http://groups.google.com/group/comp.lang.c++/msg/695ebf877e25b287

I said then: "How about reading the whole Bible, sorting by lines, and
writing the sorted book to a file?"

Who remembers that from 7 years ago? It was one of the longest threads
on this newsgroup :)

The text file used for the bible is here: ftp://ftp.cs.princeton.edu/pub/cs126/markov/textfiles/bible.txt

Back to see if anything has changed

(downloaded whatever is latest version from sun.java.com)

Time for reading, sorting, writing: 359 ms (Java)
Time for reading, sorting, writing: 375 ms (Java)
Time for reading, sorting, writing: 375 ms (Java)

Visual C++ express and command I used was cl IOSort.cpp /O2

Time for reading, sorting, writing: 375 ms (c++)
Time for reading, sorting, writing: 390 ms (c++)
Time for reading, sorting, writing: 359 ms (c++)

The question still is (7 years later), where is the great speed
advantage you guys were claiming for c++?

------------------- Java Code -------------- (same as 7 years ago :)

import java.io.*;
import java.util.*;

public class IOSort
{
    public static void main(String[] arg) throws Exception
    {
        ArrayList ar = new ArrayList(5000);

        String line = "";

        BufferedReader in = new BufferedReader(
            new FileReader("bible.txt"));
        PrintWriter out = new PrintWriter(new BufferedWriter(
            new FileWriter("output.txt")));

        long start = System.currentTimeMillis();
        while (true)
        {
            line = in.readLine();
            if (line == null)
                break;
            if (line.length() == 0)
                continue;
            ar.add(line);
        }

        Collections.sort(ar);
        int size = ar.size();
        for (int i = 0; i < size; i++)
        {
            out.println(ar.get(i));
        }
        out.close();
        long end = System.currentTimeMillis();
        System.out.println("Time for reading, sorting, writing: "+
            (end - start) + " ms");
    }
}

--------- C++ Code ---------------

#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>  // for ostream_iterator
#include <ctime>
using namespace ::std;

int main()
{
    vector<string> buf;
    string linBuf;
    ifstream inFile("bible.txt");
    clock_t start = clock();
    buf.reserve(50000);

    while (getline(inFile, linBuf)) buf.insert(buf.end(), linBuf);
    sort(buf.begin(), buf.end());
    ofstream outFile("output.txt");
    copy(buf.begin(), buf.end(), ostream_iterator<string>(outFile, "\n"));
    clock_t endt = clock();
    // clock() counts ticks, so convert to milliseconds before printing
    cout << "Time for reading, sorting, writing: "
         << double(endt - start) / CLOCKS_PER_SEC * 1000 << " ms\n";
    return 0;
}

I like how you start the time _after_ allocations and initializations
for Java, but _before_ them in C++.
Also, clock has a granularity as big as my toe. I've seen it skew as
much as a second on some machines. There are several pages on Google
on how to do _REAL_ performance time measurements. It gets even worse
on multi-core systems.

To be completely fair, I'd start a performance timer, start another
process to do the work, let the process exit, stop the timer. I'd also
limit both to a single core, and use a timer that has a guaranteed
granularity of 1 ms. Only then could you say, "Java parsed and sorted
this particular example faster than C++" and that would be all you
could say.

I'm not trying to say one language performs better than the other.
Frankly, I don't care. But your experiment defies several laws of
scientific testing.
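
A rough sketch of the kind of timing helper this implies, assuming a C++11 compiler (which postdates the compiler used in this thread). The name `best_time_ms` is made up for illustration. It uses `std::chrono::steady_clock`, a monotonic clock whose actual resolution is still platform-dependent, and it repeats the work and keeps the fastest run. It does not do the separate-process, single-core measurement suggested above.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` `repetitions` times and returns the fastest wall-clock time
// in milliseconds. Anything that should count toward the measurement
// (allocation, opening files, and so on) has to happen inside `work`.
template <typename Work>
double best_time_ms(Work work, int repetitions) {
    assert(repetitions > 0);
    double best = -1.0;
    for (int i = 0; i < repetitions; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (best < 0.0 || ms < best) best = ms;
    }
    return best;
}
```

Something like `best_time_ms([] { /* read, sort, write */ }, 5)` would then put setup and teardown inside the timed region for every run, which keeps the comparison symmetric between the two programs.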
 
Mark Space

James said:
That's the real problem. It makes for a lot of extra work when
using Java in large applications. (For small applications,
Java's actually not too bad. Although I find that once you've
gotten used to programming cleanly, it's frustrating to not be
able to.)


I'd like to ditto Eric's request. Can you elaborate on what you are
referring to here? Is it the lack of multiple inheritance in Java that
you feel prevents designing clean abstraction layers? Maybe it's the
lack of a direct interface to system/external libraries (i.e., one has
to use the Java Native Interface to call libraries with C bindings)?

Something else maybe?
 
Razii

Go back and re-read what was posted. The code is in two parts, one a
modified version of your program,

I know and already tried that ... didn't compile...


C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\xutility(764) : error C2039: 'iterator_category' : is not a member of 'JVC::ostream_iterator<T>'
with
[
T=std::string
]
C:\Program Files\Microsoft Visual Studio 9.0\VC\INCLUDE\xutility(2553) : see reference to class template instantiation 'std::iterator_traits<_Iter>' being compiled
with
[
_Iter=JVC::ostream_iterator<std::string>
]
IOSort.cpp(37) : see reference to function template instantiation 'JVC::ostream_iterator<T> std::copy<std::_Tree<_Traits>::iterator,JVC::ostream_iterator<T>>(_InIt,_InIt,_OutIt)' being compiled
with
[
T=std::string,
_Traits=std::_Tset_traits<std::string,std::less<std::string>,std::al
locator said:
,std::allocator<std::string>,true>>::iterator,
_OutIt=JVC::ostream_iterator<std::string>
]
<snip>
 
Razii

I like how you start the time _after_ allocations and initializations
for Java, but _before_ them in C++.

That was actually some kind of typo when I posted it here. In any case,
the result didn't change. It had no effect on the time.
 
ÎÌ

I made bible.txt 10 times and made it a 43 meg file

C++ is doing far worse now (the code used was multiset version)

Time for reading, sorting, writing: 2047 ms (java)
Time for reading, sorting, writing: 2016 ms (java)
Time for reading, sorting, writing: 2016 ms (java)
Time for reading, sorting, writing: 2015 ms (java)

and for c++

Time for reading, sorting, writing: 5281 ms (c++)
Time for reading, sorting, writing: 5703 ms (c++)
Time for reading, sorting, writing: 3921 ms (c++)
Time for reading, sorting, writing: 3718 ms (c++)

How come? C++ is at least 45% slower (if using 3718 ms)
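
A note on the arithmetic, since a percentage like this can be read two ways. The helper names below are made up for illustration; the inputs are the best Java run (2015 ms) and the best C++ run (3718 ms) from the timings above.

```cpp
#include <cassert>

// Extra time taken by the slower run, as a percentage of the faster run.
double percent_longer(double faster_ms, double slower_ms) {
    assert(faster_ms > 0.0);
    return (slower_ms - faster_ms) / faster_ms * 100.0;
}

// Time saved by the faster run, as a percentage of the slower run.
double percent_saved(double faster_ms, double slower_ms) {
    assert(slower_ms > 0.0);
    return (slower_ms - faster_ms) / slower_ms * 100.0;
}
```

With those inputs, `percent_longer(2015, 3718)` is between 84 and 85, and `percent_saved(2015, 3718)` is between 45 and 46. So the 45% figure matches "Java needs about 45% less time", while "C++ takes about 84% longer" describes the same gap from the other side.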

I have to say, your program is not strong enough to show that Java is
faster than C++.
For one thing, your C++ code is not very good.
 
Ian Collins

Very good. You've proven beyond a shadow of a doubt that Java is the
better language for this particular toy benchmark, at least on your
machine.
Without any mention of the C++ compiler options used to build it.
 
Ian Collins

Razii said:
More results this time java.class compiled to native Windows (instead
of using VM) by using JET compiler.

http://www.excelsior-usa.com/ (JET compiler can be found here)

10 bibles (43 meg file)

Time for reading, sorting, writing: 2453 ms (Java with JET)
Time for reading, sorting, writing: 2391 ms (Java with JET)
Time for reading, sorting, writing: 2344 ms (Java with JET)
Time for reading, sorting, writing: 2437 ms (Java with JET)

Time for reading, sorting, writing: 5281 ms (c++)
Time for reading, sorting, writing: 5703 ms (c++)
Time for reading, sorting, writing: 3921 ms (c++)
Time for reading, sorting, writing: 3718 ms (c++)
Every action has an equal and opposite reaction:

javac IOSort.java
java IOSort
Time for reading, sorting, writing: 2952 ms

CC IOSort.cc -library=stlport4 -fast
./a.out
Time for reading, sorting, writing: 1100ms

So either my system has a poor Java implementation, or yours has a poor
C++ one. Which proves nothing.
 
Razii

You didn't tell us how you compile the C++ code in Visual C++.

cl /O2 filename.cpp

Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08
for 80x86 Copyright (C) Microsoft Corporation. All rights reserved.
 
courpron

The question still is (7 years later), where is the great speed
advantage you guys were claiming for c++?

First, you (and the guys claiming C++ had a speed advantage over
Java) are not comparing the inherent performance of C++ and Java.
You are measuring the performance of particular C++ and Java
implementations (compilers). For example, most STL implementations are
known for using rather good algorithms but having poor I/O
performance.

Second, if, for some project involving I/O, networking or
multithreading, my primary goal was to get extreme performance, ahead
of any other consideration such as portability or maintenance, I would
almost always choose C++ over Java, not because of an inherent
language performance advantage, but because I would have to use
fine-tuned system calls to get maximal performance, and this can't be
done entirely in Java (you must use JNI, and 100% Java solutions
allowing API calls can't be used on a non-supported platform). I also
have more control over memory usage in C++ and can generally develop a
less memory-consuming program than with Java. On the other hand,
without resorting to system calls, pure Java programs can be faster
than pure C++ ones... or the opposite; that depends on the program,
the compiler, the architecture, etc.

At first sight, a benchmark that sticks more to the direct performance
capabilities of each language would test :
- memory allocation ( including garbage collection, small object
allocation, heap / stack based allocation allowed by the C++ or Java
compiler , ...)
- array operations and access, in order to test 1. the pointer
aliasing handling in C++ ( although most C++ compilers propose
specific "noaliasing" keywords to deal with this issue ) and 2. the
bounds checking overhead in Java ( note that most of the benchmarks
measure array accesses in a "for" loop where a Java compiler can apply
bounds checking elimination )
- dynamic dispatching ( polymorphic inline caches vs virtual call
speculation and such)
- [...]
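
To make the bounds-checking item concrete, here is a small C++ sketch of the two access styles. It is only an analogue of the Java situation, not a benchmark, and the function names are made up for illustration. `std::vector::at` checks every index and throws `std::out_of_range` on a bad one, while `operator[]` is not required to check anything.

```cpp
#include <cstddef>
#include <vector>

// Checked access: each at(i) validates the index before reading.
long long sum_checked(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v.at(i);
    return total;
}

// Unchecked access: operator[] performs no validation in a release build,
// so the loop condition is the only thing keeping the index in range.
long long sum_unchecked(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}
```

In a simple loop like this the index is provably in range, so an optimizer may be able to drop the check in the first version as well. That is the same caveat made above about Java benchmarks that only measure array access inside a "for" loop.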

But even then, those are not directly related to the language itself.
For example, PICs and their ability to inline virtual calls is a
feature of a JIT compiler, no matter the compiled language. In fact, a
C++ compiler could be a JIT (see .NET framework and C++/CLI).
You can only compare the performance of current compilers, not the
languages themselves, and very few developers have the skills to write
the best-performing program in either language while taking current
compiler optimizations into account.

As a last note, I would say that template metaprogramming IS a C++
language feature that can speed up runtime performance; Java and its
generics, while powerful, don't allow such metaprogramming
computations, mainly because of the impossibility of writing recursive
structure definitions with specializations.
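
The textbook illustration of a recursive structure definition with a specialization is a compile-time factorial. This is a minimal sketch assuming a C++11 compiler (for `constexpr` and `static_assert`); the name `Factorial` is just the conventional one for this example.

```cpp
// Primary template: the structure is defined in terms of itself.
template <unsigned N>
struct Factorial {
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

// Explicit specialization for 0 terminates the recursion.
template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

// The whole computation happens during compilation; nothing runs at runtime.
static_assert(Factorial<10>::value == 3628800ULL,
              "10! is computed by the compiler");
```

The result is an ordinary compile-time constant, so it can be used as an array bound or a template argument, and the running program pays nothing for it.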

Alexandre Courpron.
 
