Garbage collection in C++

C

Chris M. Thomasson

Thomas J. Gritzan said:
It's a bit unfair to compare one algorithm in one language and another
algo in another language. You can find a test program that shows that
whatever you want is better than whatever you want.

To make it a little bit more fair, you could run the Java GC the same
time you flush the region allocator in C++:
http://pastebin.com/m40d757b7

Okay. Sadly, it still hits a heap exception on my old platform; it just
takes longer to. When I change the line `create_tree(22);' to
`create_tree(15);' it works, but its a lot slower than the version without
explicit triggering of GC. The version without gets:


Java:
________________________________________
Time: 450 ms


with:


Java:
________________________________________
Time: 1653 ms




Compared to




C++ (MSVC 2005):
________________________________________
Time: 64 ms


C++ (MINGW):
________________________________________
Time: 216 ms



By the way, how is your region allocator different from the pool
allocator in boost?

I guess it would be more similar to:


http://www.boost.org/doc/libs/1_37_0/libs/pool/doc/interfaces/object_pool.html


However, it seems like Boost pool is using a Reaps algorithm described here:

http://www.cs.umass.edu/~emery/pubs/berger-oopsla2002.pdf


which combines region and heap allocation. Anyway, the similarity is my
region allocation can destroy all object in single call, so can Boost
object_pool. The difference is that Boost object_pool seems to be able to
allow one to destroy individual objects, my region allocator as-is simply
cannot do this. I am in the process of tinkering around with the design to
see if I can indeed pull this off.
 
J

James Kanze

On 21 Nov., 11:26, James Kanze wrote:
Well, the term "scope guard" is just much more popular for
this C++ idiom than "cleanup".

The only use of scope guard I've heard to date has referred to a
complex mixture of macros and templates; innovative, in its way,
but something that really could be handled better with better
language support.
No. By scope guard I refer to a common C++ idiom.

See above. The only "scope guard" I've seen is Andrei's. So I
There's nothing complex about it. It won't require macros or
any compiler specific hacks. The scope_guard template class
doesn't even need any C++0x features. But what does the
implemtation matter? This scope_guard thingy is just an
#include away ...

The implementation matters a lot if I have to write it. Or if
I'm picking it up off the net, and it's not part of my compiler.
   double* example()
   {
      double* p = new double[123];
      auto && sg = make_scope_guard( [p]{delete[] p;} );
      // compute data or throw exception here
      sg.cancel(); // makes dtor do nothing
      return p;
   }
There's no derivation, no virtual functions and no dynamically
allocated function object involved.
But there's library magic required, in order for the scoper
guard to be able to access local variables.  If this is
possible as you've written it with the current lambda
proposal, fine.
The scope_guard template class implementation doesn't care
about what the function object knows or does. It shouldn't
throw an exception though :) The scope_guard contains this
function object that happens to be a lambda function object in
this case. The lambda function object captured the local
pointer p by value. That's where a C++0x feature comes into
play.

Along with the rvalue reference. And auto. It looks to me like
it vitally depends on C++0x. And has a very unintuitive way of
being written. And requires a named variable. It's still far
from ideal.
Haven't you noticed that the ability to refer to the scope
guard later on can be very useful? Think of transactions for
example.

If it's useful, then what I need isn't cleanup. If it's useful,
and I need to declare a variable, I have no problems with the
current situation, really; the case comes up rarely enough that
a little extra effort doesn't matter. What I'm looking for is a
simple replacement for finally (which, of course, would be just
as good a name as cleanup).
   unique_ptr<double[]> example2()
[...]
It comes with the benefit of the function's declaration being
self- explanatory w.r.t. the life-time management of the
allocated array. It gets even better if you stick with
std::vector which will become movable in C++0x:
   std::vector<double> example3();
This works in C++99.  I do it all the time.
The difference is that the language standard doesn't force
compilers to do return value optimization as far as I know.
Also, it works only for the first one or two levels of passing
that vector around -- assuming RVO + function call that takes
its argument by value: func (example3()).

And how is that relevant? I have no idea whether my compiler
does RVO or not in the contexts I do it in. I don't care,
either. Until the profiler says I have a problem, I'll stick
with clean code.
I sketched one. I'm 100% certain that the example usage code
will be possible in C++0x.

You left out the most important part: the "library magic".
Actually it's not at all magic. I just didn't consider it to
be interesting enough for this thread. I'm confident that you
-- as a professional -- are able to implement it by yourself
in no time.

I doubt it.
 
J

James Kanze

[snipped debate re. garbage collection]
The "garbage" is by definition unused memory and will be
reclaimed if more memory is needed
That's the point.  A mark-and-sweep garbage collector waits
until more memory is "needed," or until a hint is provided to
indicate that the time is ripe for cleanup.  GC is guaranteed
not to free anything that's still in use by the program, but
is not guaranteed to free resources as soon as they cease to
be in use.  In contrast, the latter guarantee is supported
(though not imposed) by resource management techniques based
on deterministic destruction.

I'm trying to figure out how this is relevant to anything. If
you're managing resources other than memory, you need to manage
them. That would seem clear. Destructors are fine when the
resource obeys the rules of scope, but in that case, whether you
use garbage collection or not doesn't change anything.
Otherwise, you need some sort of explicit management to call the
destructor, or whatever other function you want to use to free
the resources.
Non-GC C++ programs will therefore tend to have better
resource retention times than programs that use GC but lack
deterministic destruction.

That is, of course, total bullshit, since you manage resources
in exactly the same way with or without garbage collection.

[...]
You may get a sort of compromise by using GC with multiple,
independent pools of memory, but honestly, I don't see any
reason to manage memory in a totally different way from every
other resource in a program.

Maybe because you have to. In practice, no two resources are
managed in the same way anyway; I certainly manage locks
differently than I do file descriptors.
Tying the management of non-memory resources to the automatic,
scope-based memory management that was already present in C
was, IMO, a stroke of genius on Bjarne Stroustrup's part.

You're not making much sense. Automatic, scope-based memory
management is fine for objects which have automatic scope-based
lifetimes. Nothing is going to change there. We're talking
here about dynamically allocated objects.
 
S

SG

The only use of scope guard I've heard to date has referred to a
complex mixture of macros and templates; innovative, in its way,
but something that really could be handled better with better
language support.

That's your opinion. I stated mine.
The only "scope guard" I've seen is Andrei's.

There you go! BTW: It doesn't contain any macros. With lambdas you can
slim this design down (you only need one "implementation class" that
takes a functor) and rvalue references can make the "mutable hack" go
away.
The implementation matters a lot if I have to write it.  Or if
I'm picking it up off the net, and it's not part of my compiler.

Well, you don't have to write it yourself (like any other library that
doesn't come as part of the std library).
Along with the rvalue reference.  And auto.  It looks to me like
it vitally depends on C++0x.  And has a very unintuitive way of
being written.  And requires a named variable.  It's still far
from ideal.

Far from ideal? Well, I suppose it's a matter of taste. Still, you
seem to be exaggerating.

'auto' and rvalue references are not vital to this. But with 'auto'
you don't need a common base clase like it's done in the scope guard
version by Andrei Alexandrescu and Petru Marginean. And with rvalue
references you don't need to declare the 'dismissed' flag mutable.
(This mutably flag is a bit of a move semantics emulation hack.)
[...]. It gets even better if you stick with
std::vector which will become movable in C++0x:
   std::vector<double> example3();
This works in C++99.  I do it all the time.
The difference is that the language standard doesn't force
compilers to do return value optimization as far as I know.
[...]
And how is that relevant?

Let me explain it to you: You said "works in C++99". It's nice that it
works for you but I like to know what happens when I write stuff like
this. C++0x guarantees that the vector's data won't be copied because
either RVO kicks in or the vector is just moved to its new
destination. I'm sorry If I wasn't clear enough. For me "it works"
implies no copying of the vector's data in this case. That is why it's
relevant.
You left out the most important part: the "library magic".
Actually it's not at all magic. I just didn't consider it to
be interesting enough for this thread. I'm confident that you
[...] are able to implement it by yourself in no time.
I doubt it.

Then, you should read that Dr Dobbs article from 2000 again.
http://www.ddj.com/cpp/184403758

Cheers!
SG
 
G

George Kettleborough

I'm not sure that this is quite relevant to the original
question. The poster said that he didn't get "why *you* would
ever want to create objects on the stack".

Actually I made a mistake in that post (which was obvious if you read
the whole post, but not obvious what I really meant). What I meant to
say was why would you want to create objects on the heap ie. *not*
create them on the stack. Objects on the stack is what you can't do in
Java and it's much quicker than objects on the heap afaik (due to
overhead of new and delete).
 
C

Chris M. Thomasson

The only use of scope guard I've heard to date has referred to a
complex mixture of macros and templates; innovative, in its way,
but something that really could be handled better with better
language support.
See above. The only "scope guard" I've seen is Andrei's. So I

Don't you mean Andrei's and Petru's? AFAICT, the original idea was Petru's.
I like the technique a lot. I have even converted it into a generic delegate
scheme for a multi-threaded message-passing system.

[...]
 
J

James Kanze

[...]
Actually I made a mistake in that post (which was obvious if
you read the whole post, but not obvious what I really meant).
What I meant to say was why would you want to create objects
on the heap ie. *not* create them on the stack. Objects on the
stack is what you can't do in Java and it's much quicker than
objects on the heap afaik (due to overhead of new and delete).

Speed isn't really the issue; semantics are. The most obvious
reason for dynamically allocating an object is that it needs an
arbitrary lifetime, not linked to scope. Other reasons are
linked to language limitations---if you don't know the type or
size at compile time, for example, C++ doesn't allow you to
allocate on the stack, and sometimes (rarely, I think, in
practice) to runtime considerations; copying an object is often
more expensive than copying a pointer. (You have to be very,
very careful with this latter, however, since copying an object
and copying a pointer have different semantics.)
 
G

George Kettleborough

On Nov 21, 12:37 am, (e-mail address removed)-berlin.de (Stefan Ram) wrote:
[...]
Actually I made a mistake in that post (which was obvious if
you read the whole post, but not obvious what I really meant).
What I meant to say was why would you want to create objects
on the heap ie. *not* create them on the stack. Objects on the
stack is what you can't do in Java and it's much quicker than
objects on the heap afaik (due to overhead of new and delete).

Speed isn't really the issue; semantics are. The most obvious
reason for dynamically allocating an object is that it needs an
arbitrary lifetime, not linked to scope. Other reasons are
linked to language limitations---if you don't know the type or
size at compile time, for example, C++ doesn't allow you to
allocate on the stack, and sometimes (rarely, I think, in
practice) to runtime considerations; copying an object is often
more expensive than copying a pointer. (You have to be very,
very careful with this latter, however, since copying an object
and copying a pointer have different semantics.)

In that case wouldn't you use a counted pointer such as
boost::shared_ptr ? Same effect as garbage collection but without the GC
engine overhead...
 
J

Jean-Marc Bourguet

George Kettleborough said:
In that case wouldn't you use a counted pointer such as
boost::shared_ptr? Same effect as garbage collection but without the GC
engine overhead...

Counted pointer is a GC algorithm. With its own overhead -- the counter,
the fact that it has to be incremented and decremented -- with its
advantages -- main one is probably that it can be deterministic -- and
disadvantages -- main one is the difficulty to handle cycles.

Yours,
 
C

Chris M. Thomasson

On Nov 21, 12:37 am, (e-mail address removed)-berlin.de (Stefan Ram) wrote: [...]
Actually I made a mistake in that post (which was obvious if
you read the whole post, but not obvious what I really meant).
What I meant to say was why would you want to create objects
on the heap ie. *not* create them on the stack. Objects on the
stack is what you can't do in Java and it's much quicker than
objects on the heap afaik (due to overhead of new and delete).
Speed isn't really the issue; semantics are. The most obvious
reason for dynamically allocating an object is that it needs an
arbitrary lifetime, not linked to scope. Other reasons are
linked to language limitations---if you don't know the type or
size at compile time, for example, C++ doesn't allow you to
allocate on the stack, and sometimes (rarely, I think, in
practice) to runtime considerations; copying an object is often
more expensive than copying a pointer. (You have to be very,
very careful with this latter, however, since copying an object
and copying a pointer have different semantics.)

I did a scaleable memory allocator a while back that was based entirely on a
plurality of threads stack spaces. Each dynamically created object was
literally a stack based object. I used it on a platform that did not have a
heap, IIRC it was an ARM based system running an old Quadros implementation.
 
C

Chris M. Thomasson

Java is not C++ with garbage collection. Java and C++ are
different languages, with different idioms. The fact that
garbage collection is mandatory in Java, and optional in C++, is
only one difference among many.

Fair enough. Too bad Java craps out on my old 512mb platform. I am glad I
know C, and some C++. I am no C++ expert.

[...]

That's the idiomatic way of doing things in the language. It's
also the cheapest way (in terms of cost).

Using `new/delete' for every single object allocation/deallocation is way
too expensive, IMVHO that is.
 
C

Chris M. Thomasson

Thomas J. Gritzan said:
It's a bit unfair to compare one algorithm in one language and another
algo in another language. You can find a test program that shows that
whatever you want is better than whatever you want.

To make it a little bit more fair, you could run the Java GC the same
time you flush the region allocator in C++:
http://pastebin.com/m40d757b7

BTW, isn't explicitly triggering the GC kind of a "form" of manual memory
management? I consider calling `flush' on my region allocator manual
management.

[...]
 
G

George Kettleborough

Thomas J. Gritzan said:
It's a bit unfair to compare one algorithm in one language and another
algo in another language. You can find a test program that shows that
whatever you want is better than whatever you want.

To make it a little bit more fair, you could run the Java GC the same
time you flush the region allocator in C++:
http://pastebin.com/m40d757b7

BTW, isn't explicitly triggering the GC kind of a "form" of manual memory
management? I consider calling `flush' on my region allocator manual
management.

[...]

With Java you can only "hint" to GC that it might be a good time to run.
There is no guarantee at all that any memory will be freed when you
"hint" to GC. Also GC can run at the worst possible time in your
program. I think a lot of Java programmers use hacks to force GC to run
at certain times.
 
C

Chris M. Thomasson

Noah Roberts said:
To compare fairly, did you compile the Java version with the GCC native
compiler? If not then you're comparing a native byte code program to one
running in a VM.


Although I imagine it's possible to do this within Java, or any GC'd
language, I also imagine it wouldn't translate as cleanly...it would be
harder.


You can very likely write an allocation pool in Java. A fair comparison
would do that. Then you could compare the difficulty of the task, writing
a fast manager in C++ vs. in a GC environment hell bent on doing it for
you. Then weigh the benefits of each with the frequency in which such a
task is necessary.

I can most certainly create object pooling in Java. However, manually
returning object to the pool in a GC-based lang? Well, okay, but it seems to
defeat the "master" purpose. I made a totally retarded mistake of not
telling pastebin to keep my post forever. Well, I will post my stuff again,
along with Java pool allocator. It just kind of seems like using manual
frees/flushes to pool in GC is not... Well, humm... I was only interested in
manual vs automatic. However, I see your valid point...

BTW, how do I enforce "strict" high-water marks in Java without rejecting
pooled allocation requests? In manual world, I can forcefully shrink off
excess crap in the pool. In GC world, well, how can I do that?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,161
Messages
2,570,892
Members
47,431
Latest member
ElyseG3173

Latest Threads

Top