Do you use a garbage collector?


gpderetta

Well, now you see, this is nearly the same problem as with your Java
flags...

Exactly my point :).

And btw, you can make N up to the max virtual memory size on your
machine. No need to fine-tune for the exact N of your program.
Also, no memory is wasted: any modern OS (read: written in the last 30
years) will fault in memory on demand, so you waste 0 bytes (modulo
the page granularity).

Razii, did you try my version with __attribute__((pure))
(http://pastebin.com/m76543e97)?
A good compiler with powerful interprocedural optimizations could
figure out the attribute by itself; for real-life compilers we have to
help them a bit.
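To make the point about the attribute concrete, here is a minimal sketch. It is not the pastebin code: count_nodes and benchmark_that_measures_nothing are hypothetical names, and __attribute__((pure)) is a GCC/Clang extension, so other compilers need a different spelling. The attribute only permits the compiler to drop calls whose result is unused; it does not force it to.

```cpp
#include <cstddef>

// Hypothetical helper (not from the pastebin): the number of nodes in a
// complete binary tree of the given depth. It reads no global state and has
// no side effects, which is what the "pure" attribute promises the compiler.
std::size_t count_nodes(unsigned depth) __attribute__((pure));

std::size_t count_nodes(unsigned depth) {
    return depth == 0 ? 1 : 1 + 2 * count_nodes(depth - 1);
}

// The result of every call is discarded. Since count_nodes is declared pure,
// the optimizer is allowed (but not required) to remove the calls entirely,
// which would leave a loop that times nothing at all.
void benchmark_that_measures_nothing(unsigned iterations, unsigned depth) {
    for (unsigned i = 0; i < iterations; ++i) {
        (void)count_nodes(depth);
    }
}
```

Whether a given compiler really removes the loop has to be checked in the generated assembly; the attribute by itself is only a hint.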

If you tried the same program in Haskell, you would also get Time: 0 for
whatever n you could imagine (even some orders of magnitude more than
your physical memory). That goes to show that testing how fast you can
make a program that does nothing makes absolutely no sense.

My purpose (other than having fun) is not to show that C++ is faster
than Java or vice versa, but simply that benchmarking is hard. This
benchmark is simply useless: it does not measure how costly dynamic
allocation is at all, only how good a specific compiler is at
interprocedural optimizations.
 

gpderetta

Well, IMVHO, I would expect a dynamic memory test to at least free some
memory here and there...

In a dynamic memory test yes, maybe. But this test is not it. This
test is just showing how fast a language is at doing nothing.
A real test would do something that must be computed at runtime (for
example by reading an input file containing the specific create/delete
operations).
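A rough sketch of what such an input-driven test could look like follows. The script format ("create <id> <bytes>" and "delete <id>", one operation per line) and the name run_script are assumptions made up for illustration; nothing like this was posted in the thread. Because the operations only become known when the input is read, the compiler cannot fold the allocations away at compile time.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Assumed input format: "create <id> <bytes>" or "delete <id>" per line.
// Returns the number of blocks that are still live when the script ends.
std::size_t run_script(std::istream& in) {
    std::map<int, std::unique_ptr<char[]>> live;
    std::string op;
    int id = 0;
    while (in >> op >> id) {
        if (op == "create") {
            std::size_t bytes = 0;
            if (!(in >> bytes)) break;                  // malformed line: stop
            live[id] = std::unique_ptr<char[]>(new char[bytes]);
        } else if (op == "delete") {
            live.erase(id);                             // frees the block
        }
    }
    return live.size();
}

// Convenience wrapper for trying the parser on an in-memory script.
std::size_t run_script(const std::string& text) {
    std::istringstream in(text);
    return run_script(in);
}
```

A real benchmark would time the loop and feed it a file large enough to stress the allocator; this only shows the shape of the idea.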
You can't really free any memory when you define it
as:

static char buf[5000000];

Some might say that this could possibly be a form of "cheating" within the
narrow scope of a benchmark that deals with "dynamic" memory.

As James Kanze said, never trust a benchmark you didn't rig
yourself :).
I'll have to try this benchmark with LLVM to see if it can figure out
that it does nothing even without __attribute__((pure)).
Who knows. ;^)


There is the __attribute__ ((malloc))

Unfortunately this simply tells gcc that the result doesn't alias with
any other pointer.
GCC is, by design, incapable of removing calls to malloc. The
maintainers believe that, since you can legally replace the standard
malloc, every malloc invocation is a visible side effect and cannot be
safely removed. If gcc could do link-time optimizations, it could of
course detect whether the standard malloc was being overridden.
... Anyway, I do agree that C++ does
not have that many opportunities to optimize calls into malloc.

Actually it is just that current implementations refrain from doing
that. The standard certainly allows those optimizations, if they
cannot be detected (as-if rule).
You can certainly implement a dynamic region allocator. However, IMVHO, I
would classify your benchmark code as a static region allocator.

Yes that's true. I could do dynamic region allocation. It would
probably run a little slower than the current program, but it could
make Razii happy. Unfortunately nobody is paying me to do that :)
You're basically simulating dynamic memory. I would expect that a dynamic version
could allocate and free multiple regions to/from the underlying allocator
(e.g., malloc/free) or OS (e.g., mmap/munmap, VirtualAlloc/VirtualFree).

I do not think that doing any kind of syscall has anything to do with
memory allocation.
Using a syscall instead of preallocating a static buffer might mean
that your program is a good OS citizen,
but for specific high-performance applications, you do not care about
that.
The only problem I have with region allocators is that you need to find a
good enough granularity. The code you posted is extremely coarse.

Sure, I spent 10 minutes writing it :) What would you expect? As I
said, it is nothing that any sane C++ programmer would do in practice.
 

gpderetta

Well, that's not true for this version of yours, since you overloaded
delete and the memory is never released.

If it is never acquired in the first place (i.e. faulted in), it is not
a problem.
Let's say you make N max memory. The user enters CreateTree(25). The
memory goes all the way up to, say, 400 MB in your case. After the for
loop ends, the application must perform some different kind of
calculation. However, in your version the 400 MB of memory will never go
down.

If it is faulted in once and then never reused, it is not a problem
either: it will move to swap and the OS will just forget about it.
In a sense, you are leaking memory.

The leak is 'bounded'. The test program will never run out of memory.
Isn't that right for the
version that you have right now?

How about if I change the benchmark and have two Tree classes? Tree1
and Tree2.
Once the first for loops ends with Tree1, the second for
loop with Tree2 starts.

You would use a typeless region allocator: the two tree types can
reuse the same heap region. Anyway, this is another test. There is no
way to win a contest if the rules keep changing.
However, your version is not releasing memory
from the first for loop.

yes, it is: first_free = 0 does exactly that.
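For readers without the pastebin at hand, this is roughly how a typeless static region with a first_free cursor can work. Only the name first_free comes from the discussion; the buffer size, region_alloc, region_reset, create and the two tree structs are assumptions made up for this sketch. Releasing everything is a single store, and two unrelated types can then reuse the same bytes.

```cpp
#include <cstddef>
#include <new>

// Assumed size; the real program used its own constant.
const std::size_t region_size = 1 << 20;
alignas(std::max_align_t) char buf[region_size];
std::size_t first_free = 0;   // offset of the first unused byte in buf

// Bump allocation. Each request is rounded up to the strictest fundamental
// alignment, which wastes a few bytes but lets any type share the region.
void* region_alloc(std::size_t bytes) {
    const std::size_t a = alignof(std::max_align_t);
    if (bytes > region_size - first_free) return nullptr;    // cannot fit
    const std::size_t rounded = (bytes + a - 1) / a * a;
    if (rounded > region_size - first_free) return nullptr;  // cannot fit once padded
    void* p = buf + first_free;
    first_free += rounded;
    return p;
}

// Releases every object at once. No destructors run, so this is only
// suitable for trivially destructible types.
void region_reset() { first_free = 0; }

// Two unrelated node types (hypothetical stand-ins for Tree1 and Tree2).
struct Tree1 { Tree1* left; Tree1* right; };
struct Tree2 { Tree2* left; Tree2* right; int payload; };

// Constructs a T inside the region, or returns nullptr if it is full.
template <typename T>
T* create() {
    void* p = region_alloc(sizeof(T));
    return p ? new (p) T() : nullptr;
}
```

After region_reset(), the next create<Tree2>() lands on the same address that the first create<Tree1>() used, which is the sense in which the memory from the first loop is released.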
It's leaking memory.

How could it, if it isn't calling malloc in the first place?
How about you fix that problem first?

In real life I wouldn't have this problem in the first place.
 

Matthias Buelow

Jerry said:
Offhand, I can't
think of any other language I've used that has nearly as good of support
for domain-specific languages as C++.

Well, I've been using Lisp in the past, maybe you haven't. C++ templates
vs. Lisp macros? Not really a comparison.
Also, correct me if I'm wrong, but the Boost MPL (and similar libraries)
seem to be strictly compile-time extensions. What if I need it at
runtime? That's maybe the biggest problem with languages like C++: Once
the stuff is compiled, the language is gone.
 

gpderetta

But isn't the problem that this allocator works with only one object,
only this tree? You have overloaded both the delete and new. How would
this work with applications that have dozens of classes and objects
that need to be created dynamically?

This (http://pastebin.com/m7765502c) is another arena based version.
It doesn't use typed pools (different types can freely use the same
pool), it takes memory from the system allocator and returns it to it
when it is done. It supports allocation of any size (it falls back to
malloc for huge sizes). It runs within 110% of the time of my previous
static typed region allocator.
The only problem is that the user must be careful with alignment (do
not allocate objects with different alignment without manually
padding). I could add automatic alignment, but it could waste a little
space and it isn't worth it for this test.
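The automatic alignment mentioned here usually comes down to rounding the current offset up before each allocation. A minimal sketch, assuming the alignment is a power of two (align_up and padding_for are hypothetical names, not taken from the pastebin code):

```cpp
#include <cstddef>

// Rounds offset up to the next multiple of alignment.
// Assumes alignment is a power of two (true for every alignof result).
std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Number of bytes an arena would skip (and therefore waste) in order to
// place an object with the given alignment at or after offset.
std::size_t padding_for(std::size_t offset, std::size_t alignment) {
    return align_up(offset, alignment) - offset;
}
```

For example, an arena whose cursor sits at offset 13 would skip 3 bytes before placing an 8-byte-aligned object at offset 16. That skipped space is the small waste referred to above.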

Ah, it uses tr1::array. If your compiler doesn't have it, upgrade or
use boost::array instead. I could have coded a replacement (10 lines
maybe?) but didn't bother.

Does this satisfy your (arbitrary) constraints?
 

gpderetta

That thread became too big for me to follow, so we continue here.

C++ version by Chris Thomasson: http://pastebin.com/m45f642a5

http://pastebin.com/f3f559ae2 (java version)
[...]
g++ -O2 -fomit-frame-pointer -finline-functions "new.cpp" -o "new.exe"

I get much better timing with Chris's version if I use -O3:
g++ -v:
gcc version 4.3.0 20070916 (experimental)

/usr/lib/gcc-snapshot/bin/g++ new.cc -O3 -fomit-frame-pointer

new 22:
Time: 1140 ms

/usr/lib/gcc-snapshot/bin/g++ new.cc -O2 -fomit-frame-pointer

new 22:
Time: 2040 ms
 

gpderetta

Yes, I do too (1600 ms). What happens when you do n=23 though (since
you didn't have -DNDEBUG)?

Without -DNDEBUG I get; n=23

Modified Chris version with printf unconditionally compiled in:
g++ create_tree.cc -O3 -fomit-frame-pointer -DNDEBUG

n=23:

<lotsa debug info>
Time: 2890 ms

Compiling out the printf I get

Time: 3430 ms

Weird! BTW, did you try my version (http://pastebin.com/m7765502c)?
 

gpderetta


Very interesting C++ conversation.

Agree on the interesting :), but it is actually C++: we are trying to
see which is the fastest C++ useless [1] allocator!
 

gpderetta

Weird! BTW, did you try my version (http://pastebin.com/m7765502c)?

yes, didn't compile...

new.cpp:26: error: 'buffer__size' was not declared in this scope
new.cpp:30: error: '(-1073741824u / ((unsigned int)region::buffer_byte_size))' is not a valid template argument for type 'unsigned int' because it is a non-constant expression
new.cpp: In member function 'void region::add_buffer()':
new.cpp:51: error: invalid types 'int[size_t]' for array subscript
new.cpp:52: error: invalid types 'int[size_t]' for array subscript
new.cpp:54: error: invalid types 'int[size_t]' for array subscript
new.cpp: In member function 'void region::free()':
new.cpp:70: error: request for member 'size' in '((region*)this)->region::m_buffers', which is of non-class type 'int'
new.cpp:71: error: invalid types 'int[size_t]' for array subscript
new.cpp:73: error: request for member 'size' in '((region*)this)->region::m_buffers', which is of non-class type 'int'
new.cpp:74: error: invalid types 'int[size_t]' for array subscript
new.cpp: In constructor 'region::region()':
new.cpp:95: error: request for member 'size' in '((region*)this)->region::m_buffers', which is of non-class type 'int'
new.cpp:95: error: invalid types 'int[size_t]' for array subscript

Sorry, I keep posting broken versions :(.
Try this http://pastebin.com/m2649e007
 

niklasb

True memory leaks are impossible, other than by holding
references to objects that are not needed.

If I'm allowed to define what is meant by "true memory leaks" I could
say the same for any language. I've never seen a puddle of memory
under my computer after running any piece of software. :)
 

Ian Collins

Razii said:
Yes, it worked. There was not much difference between struct and class
in this case.

Why did you expect there to be? There should be no difference at all.

Do you actually know any C++?
 

Ian Collins

Razii said:
Well, at least C does not have actual objects, and structs are much
lighter weight at the cost of being less useful than objects.

Now what are you blithering on about?

A valid C struct is a valid C++ struct.
 

Bill Butler

Ian Collins said:
Now what are you blithering on about?

A valid C struct is a valid C++ struct.

That is an incorrect statement.

The creators of C++ went to great lengths to make C++ structs look and
feel like C structs but there are differences.
A struct in C++ is nothing more than a class with default public access.
C++ structs have constructors/destructors. If you don't define them the
compiler will do it for you.
You can subclass a struct.
You can have virtual methods (you can actually have methods in the
struct)

Just because you can use the same C syntax when using C++ structs, don't
make the mistake of thinking that they are one and the same.

In C you could assign one struct to another and you got a straight
memory copy.
In C++ it calls the assignment operator.
Of course, if you forgot to write one, the compiler will create one for
you (that does a straight memory copy).

A C++ struct is NOT a "lightweight class" as many like to use it.
A class/struct can be equally heavy/light depending on usage.
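A small sketch of the "depends on usage" point: struct and class differ only in default access, and the extra weight appears only once a virtual function is added. The type names are hypothetical. The size equality holds on the common ABIs but is an expectation rather than a guarantee of the standard, so it is left as a comment.

```cpp
#include <type_traits>

// Same members, different keyword: only the default access differs.
struct PlainStruct { int x; int y; };
class  PlainClass  { public: int x; int y; };

// Adding a virtual function makes the type polymorphic; typical
// implementations then store a hidden vtable pointer in every object.
struct VirtualStruct {
    int x;
    int y;
    virtual ~VirtualStruct() {}
};

static_assert(!std::is_polymorphic<PlainStruct>::value, "no vtable needed");
static_assert(!std::is_polymorphic<PlainClass>::value, "no vtable needed");
static_assert(std::is_polymorphic<VirtualStruct>::value, "needs dynamic dispatch");

// On the usual ABIs one would also expect
//   sizeof(PlainStruct) == sizeof(PlainClass)   and
//   sizeof(VirtualStruct) > sizeof(PlainStruct).
```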

C structs on the other hand ARE lighter weight.
No Vtable
No inheritance
No constructor/destructor


Bill
 

Kevin McMurtrie

Razii said:
Well, at least C does not have actual objects, and structs are much
lighter weight at the cost of being less useful than objects.

You need to learn how a compiler works. Many OO features attributed to
poor performance actually have very low or no overhead. Modern OO
coding styles usually scale much better than older coding styles. Take
a look at some old code - linear searches, insertion sorts, globals
preventing concurrency, high complexity due to low code reuse, lots of
buffering to protect data exposed in APIs, etc. Old apps work because
they're small.

Bloat and poor performance are caused by the way software is designed
today. Code efficiency requires more money spent on software
engineering. Management weighs that cost against the cost of purchasing
faster hardware or the cost of losing some customers. The economical
balance is somewhere in the middle for many companies. If you're Apple
or Microsoft, that economical balance is probably found in extremely
sloppy and inefficient code. You can't blame the language and you can't
blame the engineers. It's the way the software was designed to be.
 

Ian Collins

Bill said:
That is an incorrect statement.

No, it was a correct statement. Otherwise it would be impossible to use
any C++ code that used the C standard library.

The creators of C++ went to great lengths to make C++ structs look and
feel like C structs but there are differences.

There are additions the programmer can make, but a vanilla C struct is
the same in C as it is in C++. Otherwise we would not be able to mix C
and C++ code using the same struct declarations.

A struct in C++ is nothing more than a class with default public access.

Did I say otherwise?

C++ structs have constructors/destructors. If you don't define them the
compiler will do it for you.

A POD (C) struct does not require any. There is nothing to initialise
or destroy.

You can subclass a struct.

So?

You can have virtual methods (you can actually have methods in the
struct)

Not if it's a legal C struct.

Just because you can use the same C syntax when using C++ structs, don't
make the mistake of thinking that they are one and the same.

It's not a mistake.

In C you could assign one struct to another and you got a straight
memory copy.
In C++ it calls the assignment operator.

If you define one.

A C++ struct is NOT a "lightweight class" as many like to use it.

I don't know any that do.

A class/struct can be equally heavy/light depending on usage.

I never said otherwise.

C structs on the other hand ARE lighter weight.
No Vtable
No inheritance
No constructor/destructor

As they are when compiled as C++.
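The claim that a vanilla C struct stays just as light when compiled as C++ can be checked with the standard type traits. A minimal sketch, where point and copy_bytes are hypothetical names chosen for illustration:

```cpp
#include <cstring>
#include <type_traits>

// Declared exactly as it would appear in a C header.
struct point {
    int x;
    int y;
};

// Trivial: no user-provided constructors, destructor, or assignment,
// so there is nothing to initialise or destroy.
static_assert(std::is_trivial<point>::value, "point is trivial");

// Standard layout: the members are laid out in a C-compatible way,
// which is what lets C and C++ code share the same declaration.
static_assert(std::is_standard_layout<point>::value, "point has standard layout");

// Because point is trivially copyable, a raw byte copy is a valid copy.
point copy_bytes(const point& src) {
    point dst;
    std::memcpy(&dst, &src, sizeof dst);
    return dst;
}
```

The static_asserts would fail to compile as soon as someone added a virtual function or a user-provided constructor to point, which is exactly the moment it stops being a legal C struct.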
 

EJP

Bill said:
In C++ it calls the assignment operator.
Of course, if you forgot to write one the Compiler will create one for
you (that does a straight memory copy)

Actually the generated assignment operator calls the assignment operator
on all the members, and so on recursively, which may or may not
ultimately result in a straight memory copy depending on whether any
application-defined assignment operators are reached.
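A short sketch of that memberwise behaviour. Counted and Outer are hypothetical types: Outer declares no assignment operator of its own, yet assigning one Outer to another still runs the user-defined operator= of its Counted member.

```cpp
// A member type with an application-defined assignment operator that
// counts how often it runs.
struct Counted {
    int value;
    static int assignments;
    Counted& operator=(const Counted& other) {
        value = other.value;
        ++assignments;
        return *this;
    }
};
int Counted::assignments = 0;

// No operator= declared here: the compiler generates one that assigns
// each member in turn, using Counted::operator= for the tracked member.
struct Outer {
    int plain;
    Counted tracked;
};
```

After `Outer a = {1, {2}}; Outer b = {0, {0}}; b = a;` the counter Counted::assignments is 1, so the generated operator for Outer is not a raw memory copy. For a struct containing only primitive members no such operator is ever reached, and the compiler is free to implement the assignment as a plain copy of the bytes.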
 

Bill Butler

EJP said:
Actually the generated assignment operator calls the assignment
operator on all the members, and so on recursively, which may or may
not ultimately result in a straight memory copy depending on whether
any application-defined assignment operators are reached.

Correct, I was thinking about structs with only primitive members.
Thanks for the correction.
 
