Juha Nieminen said:
I don't see how this is so much different from what Java does.
»[A]llocation in modern JVMs is far faster than the best
performing malloc implementations. The common code path
for new Object() in HotSpot 1.4.2 and later is
approximately 10 machine instructions (data provided by
Sun; see Resources), whereas the best performing malloc
implementations in C require on average between 60 and 100
instructions per call (Detlefs, et. al.; see Resources).
If somebody wanted to make an equally meaningless claim in the opposite
direction, they could just as accurately claim that "freeing a block
of memory with free() typically consumes no more than 4 machine
instructions, while a single execution of a garbage collector typically
consumes at least 10,000 clock cycles."
And allocation performance is not a trivial component of
overall performance -- benchmarks show that many
real-world C and C++ programs, such as Perl and
Ghostscript, spend 20 to 30 percent of their total
execution time in malloc and free -- far more than the
allocation and garbage collection overhead of a healthy
Java application (Zorn; see Resources).«
If you're using the exact versions of Ghostscript and Perl they tested,
compiled with the exact C++ compiler they used, running the exact
scripts they used for testing, this comparison probably means a lot.
Changing any of these will reduce the meaning of the tests -- and with
any more than minimal changes, there's likely to be no meaning left at
all.
Again, I could equally easily exchange "Java" and "C++", by merely
replacing "Perl and Ghostscript" with a couple of carefully chosen Java
programs.
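
The only allocation numbers that mean anything for your program are the
ones you measure on your program, on your machine, with your compiler.
As a minimal sketch of such a micro-benchmark (the 64-byte block size
and loop count are arbitrary assumptions; a real test would mimic the
target program's actual allocation sizes and lifetimes):

    #include <chrono>
    #include <cstdio>
    #include <cstdlib>

    int main() {
        // Arbitrary choices: a real test would reproduce the target
        // program's allocation pattern, not a fixed-size tight loop.
        constexpr long kIters = 1000000;
        constexpr std::size_t kSize = 64;

        auto start = std::chrono::steady_clock::now();
        for (long i = 0; i < kIters; ++i) {
            void* p = std::malloc(kSize);
            if (!p) return 1;  // out of memory: abort the benchmark
            // Touch the block so the compiler can't elide the allocation.
            *static_cast<volatile char*>(p) = 1;
            std::free(p);
        }
        auto stop = std::chrono::steady_clock::now();

        double ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                        stop - start).count();
        std::printf("avg malloc+free: %.1f ns\n", ns / kIters);
        return 0;
    }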
To put things in perspective, consider an application I recently
profiled, one that I wrote and maintain for my real work. According to
the profiler, the combined total of time spent in operator new and
operator delete (including everything else they called) was 0.115%. The
very best Java (or anything else) could hope to do is beat that by
0.115%, which would hardly be enough to measure, let alone care about.
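
The profiler did that accounting for me, but you can get a crude
version of the same data yourself by replacing the global allocation
operators. A hedged sketch (it counts calls rather than time, and the
vector loop is just a stand-in workload, not my program):

    #include <atomic>
    #include <cstdio>
    #include <cstdlib>
    #include <new>
    #include <vector>

    // Crude instrumentation: count every pass through the global
    // allocation operators.
    static std::atomic<unsigned long long> g_news(0), g_deletes(0);

    void* operator new(std::size_t size) {
        ++g_news;
        if (void* p = std::malloc(size)) return p;
        throw std::bad_alloc();
    }

    void operator delete(void* p) noexcept {
        ++g_deletes;
        std::free(p);
    }

    void operator delete(void* p, std::size_t) noexcept {
        ++g_deletes;
        std::free(p);
    }

    int main() {
        {
            // Stand-in workload: a vector grown without reserve().
            std::vector<int> v;
            for (int i = 0; i < 1000000; ++i) v.push_back(i);
        }
        std::printf("operator new calls:    %llu\n", g_news.load());
        std::printf("operator delete calls: %llu\n", g_deletes.load());
        return 0;
    }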
Of course, you don't know the exact nature of that program or even
generally what it does. You don't have the source code, so you have no
idea how it works, or whether it normally uses dynamic allocation at all
-- IOW, you know exactly as much about it as you do about the tests
cited by IBM.
Unlike them, I'll tell you at least a bit about the code. Like most of
my code, it uses standard containers where they seem useful. Unlike
some, it makes no real attempt at optimizing their usage either (e.g.
by reserving space; a sketch of that follows below). Essentially all
the data it loads (typically
at least a few megabytes, sometimes as much as a few gigabytes) goes
into dynamically allocated memory (mostly vectors). OTOH, after loading
that data, it does multidimensional scaling and then displays the result
in 3D (using OpenGL). It allocates a lot of memory dynamically, but most
of its time is spent on computation.
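
To be concrete about what "reserving space" means, here's the kind of
one-line change I'm deliberately not bothering with (the loader and its
element count are hypothetical, not my actual code):

    #include <cstddef>
    #include <vector>

    // Hypothetical loader: with the expected element count known up
    // front, reserve() buys one allocation instead of a series of
    // geometric regrowths (and the copying they entail).
    std::vector<double> load_values(std::size_t n_expected) {
        std::vector<double> values;
        values.reserve(n_expected);
        for (std::size_t i = 0; i < n_expected; ++i)
            values.push_back(static_cast<double>(i));  // stand-in input
        return values;
    }

    int main() {
        std::vector<double> v = load_values(1000000);
        return v.empty();  // keep the optimizer honest
    }

Not that it would matter much here: with allocation at 0.115% of the
runtime, even a perfect version of that optimization is noise.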