Garbage collection in C++

J

James Kanze

You seem to think that a program that uses GC uses more
memory, which is just false in the general case.

It's certainly possible for a program to not use more memory
when garbage collection is used. If a copying collector is
used, it's even possible for it to use less, because there will
be no fragmentation. But typically, in order to use so little
memory, the collector must be run very, very often, which has a
decidedly negative impact on performance. Of course, a copying
collector could improve locality, thus reducing swapping. But
there are also applications where this won't make a difference.

Applications which systematically allocate a very few, large
arrays should probably not use garbage collection, at least not
"out of the box". (The Boehm collector has provisions for a
mixed model; you can allocate the large arrays outside of
garbage collection, and let the garbage collector take care of
the rest. But of course, this negates one of the advantages of
garbage collection; you now have to think about how memory is
going to be allocated and freed, using different strategies for
different objects.)
Why should it use more?

Because it does:). That's been my experience, anyway.

I don't know enough about Juha's application, but there are
certainly applications which shouldn't use garbage collection,
and others which should use some sort of mixed model, or perhaps
require special tuning for garbage collection to be effective.
A lot of programs which calculate something and then terminate
don't really ever need to free anything. In which case, they
could use something a lot simpler than either malloc or garbage
collection. (Of course, if you never trigger the collector when
you're using garbage collection, garbage collection can be
incredibly fast.)
The amount of memory a program is using depends on the program
logic and its memory trace and not on how the memory is
managed, which is, as I said, an uninteresting implementation
detail.

That's not true when you start pushing towards the limits. In
the distant past, I've had programs which would run out of
memory using one implementation of malloc, and not with another.
 
J

James Kanze

Would you say that lisp encourages people to write programs
using a functional style? Would you say that C encourages
people to write with an imperative style?

A programming language is more than just a low level tool.
Using Basic can be harmful. Garbage collection just isn't at
that level; it just doesn't have that much power.

In other words, C++ with garbage collection will still be C++,
and will encourage and discourage whatever habits C++ encourages
or discourages today. C++ with garbage collection will have no
relationship with Java, or C#, or Lisp, or whatever, or at
least, no more relationship with them than it has today.
 
J

Juha Nieminen

Matthias said:
You seem to think that a program that uses GC uses more memory, which is
just false in the general case.

I was not talking about GC in particular, but modern "high-level
languages" in general (and thinking about Java in particular).

GC in itself doesn't make a language memory-hog if the language offers
the tools to create very memory-efficient implementations and hide them
behind nice abstractions (eg. in similar ways as C++ does).

OTOH there are languages (such as Java) where allocating each object
dynamically is basically the only choice you have, which makes creating
memory-efficient programs a lot more difficult. For instance, it makes
it very difficult to squeeze every object into 2 bytes inside a byte
array (as a completely imaginary example).
 
S

SG

Exactly. The current support becomes a lot more awkward if the
cleanup code needs to access local variables.
[...]
the references out in the generated code.) A lambda function
would be nothing more than a functional object which derived
from such a class; a lambda expression would wrap the expression
in a lambda function, then generate the object. Cleanup, above
would be derived from the class as well, except that the code
would define the destructor, rather than an operator()().

What you want seems to be a "scope guard". A Lambda-based scope guard
could be used like this:

template<typename F> requires Callable<F> && MoveConstructible<F>
class scope_guard { /* ... library magic ... */ };

template<typename F> requires Callable<F> && MoveConstructible<F>
scope_guard<F> make_scope_guard(F f);

double* example()
{
double* p = new double[123];
auto && sg = make_scope_guard( [p]{delete[] p;} );
// compute data or throw exception here
sg.cancel(); // makes dtor do nothing
return p;
}

There's no derivation, no virtual functions and no dynamically
allocated function object involved. Of course, the example's purpose
was ONLY to demonstrate the how lambdas can be used to make arbitrary
scope guards. In this particular instance managing the dynamically
allocated array should be done differently. At least something like
this:

unique_ptr<double[]> example2()
{
unique_ptr<double[]> p (new double[123]);
// compute data or throw exception here
return p;
}

It comes with the benefit of the function's declaration being self-
explanatory w.r.t. the life-time management of the allocated array. It
gets even better if you stick with std::vector which will become
movable in C++0x:

std::vector said:
I've not studied the lambda proposition in detail, so I don't
know how much of the above it might incorporate. A quick
glance, however, does show that it involves a "closure object",
which sounds very much like the instance of my anonymous class
(although described in somewhat different language). So it
shouldn't be too hard to add "cleanup" as an extension, or in a
later version of the standard, if someone can find time to write
up a proposal for it.

It seems that the library solution is as convenient and flexible as a
dedicated language feature would be. So, a "clean up" core language
feature is hardly justified.

Cheers!
SG
 
M

Matthias Buelow

Dilip said:
The quantity of memory your
application needs is your application's own business. GC simply
_manages_ how much you create.

Ah, someone's got the point I was trying to make, thank you!
 
M

Matthias Buelow

James said:
Because it does:). That's been my experience, anyway.

Juha was saying that certain applications (or algorithms, in general)
can't be run in a GC'd environment on the same problem size as they can
in a manually managed environment, which is just plain false. Of course
a, say, Unix process may have some more memory mapped to it if that's
still in the GC's memory pool internally just as it also happens with
some malloc implementations which don't free their stuff in a timely
fashion. This is, however, irrelevant to the application itself.
I don't know enough about Juha's application, but there are
certainly applications which shouldn't use garbage collection,
and others which should use some sort of mixed model, or perhaps
require special tuning for garbage collection to be effective.

Well, a GC-managed environment can always be turned into a manually
managed environment much easier than the other way round.
That's not true when you start pushing towards the limits. In
the distant past, I've had programs which would run out of
memory using one implementation of malloc, and not with another.

Application specialities and implementation details. :)
 
M

Matthias Buelow

SG said:
It seems that the library solution is as convenient and flexible as a
dedicated language feature would be. So, a "clean up" core language
feature is hardly justified.

It's beginning to resemble Perl, though.
 
J

Juha Nieminen

Matthias said:
Juha was saying that certain applications (or algorithms, in general)
can't be run in a GC'd environment on the same problem size as they can
in a manually managed environment, which is just plain false.

Except that I didn't say anything of the sorts.

What said had absolutely nothing to do with GC. What I said is that in
many "high-level" languages it's very difficult to implement, for
example, very memory-efficient programs while still keeping a fair
amount of abstraction and maintainability to the code. I like C++
because it offers tools to do that.

It is true, however, that there are *some* GC'd programming languages
which are more memory hog than others.
 
N

Noah Roberts

Hendrik said:
Oh come on. You're not trying to tell me that this simple
rabulistic assumption is all the base you have for your
argument, are you?
(For a starter: Just about everyone coming to C++ from
languages like Java will tell you that the need to take
care of your memory is the most important aspect of C++.

I remember learning C++ after learning Java. I didn't know jack shit
and thought C++ was horrible. It didn't do OO right, didn't clean up
memory, waa waa waa. Years of working with it, and actually learning
the techniques that those with experience use, I've changed my mind.

In other words, of course someone from Java is going to tend to bitch
and complain about this and that, GC being among the top complaints.
From what I can tell though this is an argument from ignorance and
inexperience. Although you might find uses for GC, and there are
collectors for C++, I've yet to see any need for it.
If GC removes that need, it's more than a tool, too, and
does indeed encourage a very different programming style.)

If by encouraging a different style you mean slacking and lacking
discipline as a developer I really do have to agree there. People that
are used to the techniques for handling memory and ownership issues
don't have as much trouble when non-memory resources need to be tracked
and dealt with.
 
C

Chris M. Thomasson

Matthias Buelow said:
Ah, someone's got the point I was trying to make, thank you!

The point I was trying to make is that the sample pseudo-code program only
needed to have less than 33 foo objects allocated at any one time. This
cannot be guaranteed with GC because of its non-deterministic nature. Your
program will be using more memory than it has to. The GC will most likely
let you allocate a shi%load of foo objects before it decides to run a scan.
Likely, the scan will only be run once the internal heap of the current
generation is exhausted. This is a side-effect of letting the GC handle
everything for you.
 
M

Matthias Buelow

Chris said:
Seriously... Why do you think they choose C++ for their custom
web-server's, indexers, file-systems, databases ect...?

First, I don't know if what you say is true. Second, I have no clue.
Probably for a similar reason why I'm programming in C++; that's not
because I like the language.
 
M

Matthias Buelow

Chris said:
The point I was trying to make is that the sample pseudo-code program
only needed to have less than 33 foo objects allocated at any one time.
This cannot be guaranteed with GC because of its non-deterministic
nature. Your program will be using more memory than it has to.

No it won't. It will be using exactly those 33 objects. What makes you
think otherwise? The "garbage" is by definition unused memory and will
be reclaimed if more memory is needed just in the same way that if you
delete or free() some stuff in C++, it might hang around on a free list
for a while but that's an implementation detail that is completely
irrelevant to the question of how much memory the program is using.

If you have an algorithm that uses 1 million memory "cells" on a machine
with 1 million "cells" available, it can run in the same way no matter
if that memory is managed by hand or if the GC does it.
 
C

Chris M. Thomasson

I created a little test application for Java and C++:


http://pastebin.com/mc3e3f4e
(Java version)


http://pastebin.com/m4bf7db0a
(My C++ version)


As you can see, my C++ version uses a custom region allocation scheme I
created and posted to this group here:

http://groups.google.com/group/comp.lang.c++/msg/3e8dc967c7ddba7c


I compile C++ version with MSVC 2005 Express. I compile Java version with
Java version 1.6.0_05. The command line I use for Java version is `java
tree'. The platform is Windows XP SP2 on my old HyperThreaded P4 3.06gz with
only 512mb ram.




C++ Output:
______________________________________________________________________
Time: 12953 ms




Java Output:
______________________________________________________________________
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at tree.create_tree(tree.java:9)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.main(tree.java:19)




The Java version crashes because it apparently runs out of memory! It's
memory usage in Task Manager spikes up and beyond 80 MB, and then BAM! it
dies... What total crap!


Anyway, the C++ version memory usage peaks out at around 29-30 MB.


So, on my platform, the GC in Java is not good for this example program. It
simply does not keep up, and the memory usage flies out of control.


However, the C++ version uses custom region allocator, and its memory usage
is non-volatile and very stable. I can actually run it on my older system.
Cool.


I conclude, that for this example, GC is no good. Clever manual memory
management is key to achieving efficient memory usage.
 
M

Matthias Buelow

Chris said:
The Java version crashes because it apparently runs out of memory! It's
memory usage in Task Manager spikes up and beyond 80 MB, and then BAM!
it dies... What total crap!

It probably exhausts the stack in the recursion.
I conclude, that for this example, GC is no good. Clever manual memory
management is key to achieving efficient memory usage.

If anything at all, your example would only "prove" that C++ is more
than 10x as verbose as Java.
 
C

Chris M. Thomasson

Matthias Buelow said:
No it won't. It will be using exactly those 33 objects. What makes you
think otherwise? The "garbage" is by definition unused memory and will
be reclaimed if more memory is needed just in the same way that if you
delete or free() some stuff in C++, it might hang around on a free list
for a while but that's an implementation detail that is completely
irrelevant to the question of how much memory the program is using.

If you have an algorithm that uses 1 million memory "cells" on a machine
with 1 million "cells" available, it can run in the same way no matter
if that memory is managed by hand or if the GC does it.

Why does the Java version run out of memory on my machine with 512mb of
memory, and the C++ version does not:


http://pastebin.com/mc3e3f4e
(Java version)


http://pastebin.com/m4bf7db0a
(My C++ version)



Here is the output I get:



C++ Output:
______________________________________________________________________
Time: 12953 ms




Java Output:
______________________________________________________________________
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at tree.create_tree(tree.java:9)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:10)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.create_tree(tree.java:11)
at tree.main(tree.java:19)



Humm... Interesting.
 
M

Matthias Buelow

Chris said:
Why does the Java version run out of memory on my machine with 512mb of
memory, and the C++ version does not:

Programming error? From a quick glance, the two programs look totally
different so they're hard to compare (the C++ version is more than 10x
the size of the Java version, so wtf.?)
Besides, you won't trick me into defending Java, anyway.
 
C

Chris M. Thomasson

Matthias Buelow said:
Programming error?

Where? I don't see any at all. None at all.



From a quick glance, the two programs look totally
different so they're hard to compare (the C++ version is more than 10x
the size of the Java version, so wtf.?)

The C++ version has my region allocator in there. Focus on the latter end of
the code under the:


/* Tree: C++ -vs- Java
_____________________________________________________________*/


section please. Thanks.
 
C

Chris M. Thomasson

Matthias Buelow said:
It probably exhausts the stack in the recursion.

Can you blow the stack in a Java program? BTW, the error explicitly
mentioned Java heap space. Also, if its the stack, why did it not blowup in
the C++ version?



If anything at all, your example would only "prove" that C++ is more
than 10x as verbose as Java.

Here is the code with my region allocator stripped:




/* Tree: C++ -vs- Java
_____________________________________________________________*/
#include <ctime>
#include <iostream>


struct tree {
tree* m_left;
tree* m_right;
};


static region_allocator<tree, 1024 * 32> g_tree_alloc;


tree* create_tree(int n) {
if (n < 1) {
return NULL;
}

tree* const t = g_tree_alloc.allocate();
t->m_left = create_tree(n - 1);
t->m_right = create_tree(n - 1);

return t;
}


int main() {
std::clock_t const start = std::clock();
for (int r = 0; r < 10; ++r) {
for (int i = 0; i < 15; ++i) {
create_tree(22);
g_tree_alloc.flush();
}
}
std::clock_t const end = std::clock();
std::cout <<"Time: " <<
double(end-start)/CLOCKS_PER_SEC * 1000 << " ms\n";

return 0;
}






Anyway, if I change the line `create_tree(22);' to `create_tree(15)', the
Java version finally works on my platform. Here is the output I get:



C++:
________________________________________
Time: 64 ms




Java:
________________________________________
Time: 450 ms





So much for Java GC allocator being faster than well written C++ allocator!


:^D
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,161
Messages
2,570,892
Members
47,431
Latest member
ElyseG3173

Latest Threads

Top