Array optimizing problem in C++?

James Kanze

"James Kanze" <[email protected]> wrote in message
Sorry. I was confused with another sub-thread. What actual
benchmarks show is that garbage collection, even in C++, can be
faster than manual memory management.

Everything has its tradeoffs. GC works fine for a lot of
projects. However, if you know what you're doing, manual memory
management can work very well for designs that must exist at a
"low-level". Especially in platforms that do not have GC. :^)

Certainly. At a low enough level, garbage collection simply
isn't appropriate. I wouldn't use it in kernel code, for
example, at least not in the lowest levels, and you obviously
can't count on it when writing the garbage collector itself.
 
Jerry Coffin

Are you flame-baiting here? The claims I've seen come from well-respected
authors who neither have "questionable" expertise nor an overt vested interest
in anything but the truth. They aren't putzes.

You are mistaken.

No, I was not flame baiting, I was simply stating the situation as I saw
it.

Your conclusion, however, strikes me as illogical. I'm perfectly willing
to believe that your experience differs from mine -- but that's hardly
an indication that either of us is mistaken, only that our experiences
differ. I specifically did NOT state nor did I intend to imply any
greater conclusion from that.

Personally, I suspect there's a bit more to the story than just that,
however. First of all, writing a book strikes me as a very difficult
undertaking. Worse, every author I've heard talk about it says that the
amount of work exceeds expectations by a huge margin, so it's almost
certainly an even larger undertaking than it appears to me.

I am reasonably certain that anybody willing to go to that amount of
work has some specific ideas s/he thinks others should know. I'd add
that I think most probably have specific opinions that they want their
target audience to believe. The reasoning behind those opinions may well
be reasonable, intelligent and logical, but to go to the trouble of
writing a book, the author almost HAS to be convinced that it's
important for others to believe or at least understand their position.
As such I'd place almost any author (practically by definition) in the
first category.

While I have a tremendous amount of respect for many authors, I also
realize that authors tend to have direct experience in inverse
proportion to the degree of respect they garner -- simply put, a highly
successful author tends to spend a great deal of time writing, teaching
and speaking. In fact, he will typically spend so much of his time on
other activities that he has little or none left over for actual coding.

Of course, there are exceptions -- the most obvious would be people who
write only a few books, so they can spend most of their time coding.
Andrei Alexandrescu and David Abrahams would be two obvious examples.

How is that different in non-GC environments?

That depends. Many non-GC environments lack support for RAII as well --
but I rarely if ever use most of them. The environment in which I
normally work (C++) has quite reasonable support for RAII, and only
minimal support for GC -- I doubt a truly conforming implementation of
C++ could include GC, but the differences between the standard and what
you need to do to allow GC are minimal (and most of what you "give up"
are things that I, for one, would generally avoid anyway).

That has never happened in any instance in Java that I've ever heard of, over
the last decade-plus of using it.

Well, I'll openly admit that my serious use of Java happened early
enough in its history that the JVMs at the time were probably quite
immature. I'll also openly admit that many of the problems I encountered
with Java were only partially related to its use of GC. Nonetheless, the
inclusion of GC often seems (at least to me) to almost encourage designs
that are likely to be problematic.

OTOH, I have encountered a couple of problems that could be traced
directly back to GC problems -- just in languages other than Java.

Java is not free of memory errors because you have GC. It sounds like you're
describing JNI scenarios, analogous to using a C++ program to deal with an
external resource, and liberating the variable prior to releasing the
resource. That's a programming mistake, not a language flaw.

You seem to be equating use of GC with Java -- at least for me, most use
of GC has been in ML, Smalltalk and Scheme (though I've probably used a
couple of other Lisp variants at least as much as Java as well).
 
Jerry Coffin

[ ... ]
Which is, of course, why it is so difficult to find concrete
evidence (for either side) for programs which are correct:).

Quite true -- and I've trimmed the reference specifically because that
statement applies to essentially _any_ argument about programming. :)

[ ... ]
That's simply false. I've used garbage collection in a few
applications, and RAII played exactly the same (important) role
it played without garbage collection. RAII is concerned with on
stack objects (with very, very few exceptions), and garbage
collection changes absolutely nothing with regards to on stack
objects.

Sorry -- I accidentally conflated a couple of different situations. If
you add GC to a language that supports RAII, you don't (necessarily)
lose RAII in the process. OTOH, your typical language that includes GC
doesn't support RAII well (if at all). _If_ I have to choose between the
two, I tend to prefer RAII because it applies to a larger range of
resources.

That sounds more like an OS problem than a garbage collection
problem, and I don't see how manual memory management would
change anything. But perhaps if you'd give a concrete
example... (I've never encountered this.)

I don't see how it could be blamed on the OS -- the OS was just doing
what you asked it to. I suppose if you want to get technical, the real
problem was in the library where it interfaced to the OS, not
(technically) inside of the GC itself. Nonetheless, if garbage
collection hadn't been in use, the problem wouldn't have existed.

Sitting back and thinking about it a bit more, I think I left out what I
see as really the biggest problem with GC -- that is, the mindset that
often accompanies its use. When they're using GC, many people see memory
management as almost an afterthought that they don't need to really
think about. By contrast, I think manual memory management tends to "put
the fear of God" into some people.

The result seems to be that many people accustomed to GC produce designs
that are excessively complex. Manual memory management encourages
simplicity of memory management, and in doing so encourages simplicity
of the overall design. Although not directly related to the memory
management itself, this simplicity in the overall design is generally
desirable in and of itself.
 
James Kanze

[ ... ]
That's simply false. I've used garbage collection in a few
applications, and RAII played exactly the same (important) role
it played without garbage collection. RAII is concerned with on
stack objects (with very, very few exceptions), and garbage
collection changes absolutely nothing with regards to on stack
objects.

Sorry -- I accidentally conflated a couple of different
situations. If you add GC to a language that supports RAII,
you don't (necessarily) lose RAII in the process. OTOH, your
typical language that includes GC doesn't support RAII well
(if at all). _If_ I have to choose between the two, I tend to
prefer RAII because it applies to a larger range of resources.

Why choose? With C++ and the Boehm collector, you can have
both.

Seriously, I don't know. My own experience is that managing
other resources (or determinate lifetime in general) isn't that
difficult, even without RAII. In some cases, in fact, I find an
explicit finally block clearer---it makes it far more visible
how the resource is being handled, since the code for releasing
it is local to the same function in which it is freed. But
having seen a lot of code, in a lot of different languages, I'm
forced to recognize that my own experience really is my own
experience; missing finally blocks seem to be a very frequent
error in Java.

The real point, of course, is that memory is not a resource like
another. It's more part of the basic support for the program,
like the CPU. And that explicitly managing memory is somewhat
along the lines of explicitly managing CPU use (aka optimizing);
you do it when it's necessary, but most of the time, provided
you're using reasonable algorithms, you don't give it too much
thought.

I don't see how it could be blamed on the OS -- the OS was
just doing what you asked it to.

Which was? Either memory management has been turned over to the
OS (in which case, the memory in question isn't even part of the
process image, for the garbage collection to free), or it hasn't
been (in which case, you still have a pointer to it, so garbage
collection won't free it).

I think my point is that if the OS is using part of your process
memory, and you don't know about it, then you have a problem.
Garbage collection or not. (How do you know not to explicitly
reuse the memory, for example?) But I know that you understand
these sort of issues, so I'm pretty sure there's some aspect
that I've not seen.

I suppose if you want to get technical, the real problem was
in the library where it interfaced to the OS, not
(technically) inside of the GC itself. Nonetheless, if garbage
collection hadn't been in use, the problem wouldn't have
existed.

Sitting back and thinking about it a bit more, I think I left
out what I see as really the biggest problem with GC -- that
is, the mindset that often accompanies its use. When they're
using GC, many people see memory management as almost an
afterthought that they don't need to really think about. By
contrast, I think manual memory management tends to "put the
fear of God" into some people.

That is a real problem. In general, any "new" technology gets
overused---people seem to believe in silver bullets. And for
various historical reasons, many programmers don't seem to be
aware of the distinction between object lifetime and memory
management; more than a few Java programmers do make claims to
the effect that you can ignore object lifetime considerations;
by tying memory management to object lifetime (and thus forcing
a determinate lifetime on all objects), we certainly draw
attention to object lifetime in general. But should we really
refuse a technology which will improve the productivity of the
competent programmers just because it can and will be abused?
And is artificially creating unnecessary work (consideration of
object lifetime when it is not necessary) really the best way to
stimulate awareness of an issue?

The result seems to be that many people accustomed to GC
produce designs that are excessively complex.

I've not really found that GC was the problem here. It's more
runtime dynamism: people who consider that everything is an
object tend to produce designs that are excessively complex.

Manual memory management encourages simplicity of memory
management, and in doing so encourages simplicity of the
overall design.
Although not directly related to the memory management itself,
this simplicity in the overall design is generally desirable
in and of itself.

Certainly. That's the main reason I argue for GC. By not
artificially forcing determinate lifetime on objects that don't
need it, you simplify the design.

Of course, your design should be as simple as possible, but no
simpler. My experience with programmers (mis)using garbage
collection is that they tend to ignore object lifetime
completely, and make the design simpler than possible (which is
another way of saying that it doesn't handle all cases
correctly). (Of course, some do take advantage of the time
gained by neglecting some essential issues in order to add
unnecessary complexity elsewhere.)
 
James Kanze

[...]
You seem to be equating use of GC with Java -- at least for
me, most use of GC has been in ML, Smalltalk and Scheme
(though I've probably used a couple of other Lisp variants at
least as much as Java as well).

Just a note, but in all of these languages, *all* objects are
dynamically allocated. I wonder if this isn't more what caused
you problems, rather than garbage collection.

In a language in which all objects are dynamically allocated,
garbage collection is almost a necessity. Can you imagine what
it would be like if you had to do a delete for each
java.lang.String you new'ed? And of course, a language in which
all objects are dynamically allocated can't really support RAII,
at least not in any meaningful way. And that's a route I don't
particularly want to go, at least not in C++. (Note that done
correctly, such a route is not necessarily wrong, and even has
certain immediate advantages---in a very real sense, dangling
pointers are impossible. But personally, I still prefer a
language where values behave like values.)
 
Jerry Coffin

[ ... ]
Seriously, I don't know. My own experience is that managing
other resources (or determinate lifetime in general) isn't that
difficult, even without RAII. In some cases, in fact, I find an
explicit finally block clearer---it makes it far more visible
how the resource is being handled, since the code for releasing
it is local to the same function in which it is freed. But
having seen a lot of code, in a lot of different languages, I'm
forced to recognize that my own experience really is my own
experience; missing finally blocks seem to be a very frequent
error in Java.

Right -- a finally block is explicit, but requires duplication of the
resource deallocation everywhere the resource is used. Given a choice
between something that's explicit but likely to be used incorrectly, and
something that's less explicit but makes incorrect use difficult, I'll
pick the less explicit version every time.

The real point, of course, is that memory is not a resource like
another. It's more part of the basic support for the program,
like the CPU. And that explicitly managing memory is somewhat
along the lines of explicitly managing CPU use (aka optimizing);
you do it when it's necessary, but most of the time, provided
you're using reasonable algorithms, you don't give it too much
thought.

If memory was really just memory, I'd more or less agree. In reality,
it's not -- memory is used to hold _objects_. In an OO program, managing
those objects tends to be most of what the program does. RAII and GC
provide different ways of managing objects. RAII automates quite a bit
of the management of many types of objects, while GC automates one part
of managing a much smaller subset of objects.

[ ... I had said: ]
Which was? Either memory management has been turned over to the
OS (in which case, the memory in question isn't even part of the
process image, for the garbage collection to free), or it hasn't
been (in which case, you still have a pointer to it, so garbage
collection won't free it).

TTBOMR, the code in question did something along the lines of allocating
an array, and then writing it to disk. The sole remaining pointer to the
array had been passed to the OS for writing, so the GC (apparently)
decided it was ready to be collected. As I said previously, the real
problem was probably in the library rather than the GC per se -- the
library should have kept a pointer to the data around until the OS call
to write the data had returned, but it apparently didn't. At the same
time, I should add that most of this is more or less surmise -- I know
what failed was a relatively simple write of data to the disk, but I'll
admit that I never knew with absolute certainty why it failed. The
vendor never admitted to the problem, but after they shipped a new
version of the compiler, the problem disappeared.

Then again, I was never entirely comfortable with that either: the old
version had been for Windows 3.1, and the new one was for Win32. I was
never sure whether the problem was fixed, or the bug remained, but we
never saw it because the greater memory reduced the likelihood of the GC
running at the crucial time.

I think my point is that if the OS is using part of your process
memory, and you don't know about it, then you have a problem.
Garbage collection or not. (How do you know not to explicitly
reuse the memory, for example?) But I know that you understand
these sort of issues, so I'm pretty sure there's some aspect
that I've not seen.

We knew the OS was using the memory -- but apparently, the library code
hadn't ensured that the garbage collector did.

[ ... ]
That is a real problem. In general, any "new" technology gets
overused---people seem to believe in silver bullets. And for
various historical reasons, many programmers don't seem to be
aware of the distinction between object lifetime and memory
management; more than a few Java programmers do make claims to
the effect that you can ignore object lifetime considerations;
by tying memory management to object lifetime (and thus forcing
a determinate lifetime on all objects), we certainly draw
attention to object lifetime in general. But should we really
refuse a technology which will improve the productivity of the
competent programmers just because it can and will be abused?
And is artificially creating unnecessary work (consideration of
object lifetime when it is not necessary) really the best way to
stimulate awareness of an issue?

I don't know for sure -- on a number of counts. First of all, I'm not
sure it really improves productivity by any noticeable margin. Second,
I'm not sure that consideration of object lifetime really causes much
extra work in the long run either. As you've pointed out, the variance
is so great that the average means little, so meaningful measurement is
difficult; when I say "I don't know", I'm not trying to voice
disagreement, but really admitting that I just don't know.

Of course, your design should be as simple as possible, but no
simpler. My experience with programmers (mis)using garbage
collection is that they tend to ignore object lifetime
completely, and make the design simpler than possible (which is
another way of saying that it doesn't handle all cases
correctly). (Of course, some do take advantage of the time
gained by neglecting some essential issues in order to add
unnecessary complexity elsewhere.)

Ah yes, the "it doesn't work, but look at my beautiful GUI" syndrome.
The worst part is that (IMO) their GUI usually reflects the confusion
and complexity of the code...
 
Jerry Coffin

[ ... ]
Just a note, but in all of these languages, *all* objects are
dynamically allocated. I wonder if this isn't more what caused
you problems, rather than garbage collection.

True -- the languages are enough different from C++ that a meaningful
comparison is difficult.

In a language in which all objects are dynamically allocated,
garbage collection is almost a necessity. Can you imagine what
it would be like if you had to do a delete for each
java.lang.String you new'ed? And of course, a language in which
all objects are dynamically allocated can't really support RAII,
at least not in any meaningful way. And that's a route I don't
particularly want to go, at least not in C++. (Note that done
correctly, such a route is not necessarily wrong, and even has
certain immediate advantages---in a very real sense, dangling
pointers are impossible. But personally, I still prefer a
language where values behave like values.)

Dangling pointers are impossible as long as everything is working
correctly, that's true. OTOH, C++ makes some other things easy to manage
that most of the other languages make more difficult. The questions are
1) which problems arise more often, and 2) which are more difficult to
fix.

In fairness, the other point I should probably raise is that quite a bit
of what I write is _fairly_ low-level kinds of code. In particular, I've
written a fair number of debugger-like things (mostly for automating
various code tracing, rather than really debugger-like things such as
manual single-stepping, but fairly similar nonetheless). In any case,
their relatively low-level nature may mean that I just don't encounter
as many of the situations where GC would do a lot of good, but do
encounter more where RAII is useful. In particular, creating or
destroying an object in my program is typically tied directly to some
specific event in the code being tested, leaving little room for the
garbage collector to really contribute a whole lot.
 
Ian Collins

Jerry said:

[ ... ]
Seriously, I don't know. My own experience is that managing
other resources (or determinate lifetime in general) isn't that
difficult, even without RAII. In some cases, in fact, I find an
explicit finally block clearer---it makes it far more visible
how the resource is being handled, since the code for releasing
it is local to the same function in which it is freed. But
having seen a lot of code, in a lot of different languages, I'm
forced to recognize that my own experience really is my own
experience; missing finally blocks seem to be a very frequent
error in Java.

Right -- a finally block is explicit, but requires duplication of the
resource deallocation everywhere the resource is used. Given a choice
between something that's explicit but likely to be used incorrectly, and
something that's less explicit but makes incorrect use difficult, I'll
pick the less explicit version every time.

I think it's also a matter of delegation; one delegates the management
of the resource to the object implementing RAII. Any cleanup code is
encapsulated in that object and only needs to be maintained in one place.
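
To make the delegation point concrete, here is a minimal sketch. Pool and SlotGuard are hypothetical names invented for the illustration, not anything from the thread. The release logic lives only in the guard's destructor, so callers get it on both the normal and the exception path.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource: a fixed pool of slots that must be handed back.
class Pool {
public:
    explicit Pool(int size) : size_(size), free_(size) {}
    int acquire() {
        if (free_ == 0) throw std::runtime_error("pool exhausted");
        return --free_;
    }
    void release() {
        assert(free_ < size_);  // releasing more than was acquired is a bug
        ++free_;
    }
    int available() const { return free_; }
private:
    int size_;
    int free_;
};

// The RAII object: acquire in the constructor, release in the destructor.
// This is the one place where the cleanup code has to be maintained.
class SlotGuard {
public:
    explicit SlotGuard(Pool& pool) : pool_(pool), slot_(pool.acquire()) {}
    ~SlotGuard() { pool_.release(); }
    SlotGuard(const SlotGuard&) = delete;
    SlotGuard& operator=(const SlotGuard&) = delete;
    int slot() const { return slot_; }
private:
    Pool& pool_;
    int slot_;
};

// A caller writes no cleanup code at all, even when the work throws.
int use_slot(Pool& pool, bool fail) {
    SlotGuard guard(pool);
    if (fail) throw std::runtime_error("work failed");
    return guard.slot();
}
```

With a finally-style approach, every function like use_slot would have to repeat the release call itself; here a second caller would reuse SlotGuard unchanged.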
 
Mark Thornton

Jerry said:
If memory was really just memory, I'd more or less agree. In reality,
it's not -- memory is used to hold _objects_. In an OO program, managing
those objects tends to be most of what the program does. RAII and GC
provide different ways of managing objects. RAII automates quite a bit
of the management of many types of objects, while GC automates one part
of managing a much smaller subset of objects.

RAII and GC have overlapping applicability, and each has capability not
possessed by the other. I think it is meaningless to talk about the
relative sizes of their areas of use. For my work RAII is applicable to
a tiny fraction of objects. My work is characterised by many graph
structures (with numerous cycles), overlapping lifetimes, and more than
one thread.
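
As a rough illustration of why cyclic graphs are awkward without a collector, the sketch below uses std::shared_ptr as a stand-in for reference counting in general. StrongNode, WeakNode and the live_nodes counter are hypothetical names made up for the demo. Two nodes that own each other are never destroyed once the outside handles go away, unless one link is made non-owning or the cycle is broken by hand.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Counts live nodes so the effect of each design can be observed.
static int live_nodes = 0;

struct StrongNode {
    std::shared_ptr<StrongNode> next;  // owning link
    StrongNode() { ++live_nodes; }
    ~StrongNode() { --live_nodes; }
};

struct WeakNode {
    std::shared_ptr<WeakNode> next;    // owning forward link
    std::weak_ptr<WeakNode> prev;      // non-owning back link
    WeakNode() { ++live_nodes; }
    ~WeakNode() { --live_nodes; }
};

// Builds a two-node cycle of owning links and reports how many nodes
// are still alive after every outside handle has been dropped.
int leaked_by_strong_cycle() {
    int before = live_nodes;
    StrongNode* raw = nullptr;
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->next = b;
        b->next = a;  // closes the cycle
        raw = a.get();
    }
    int leaked = live_nodes - before;
    // Break the cycle by hand so the demo itself does not leak.
    std::shared_ptr<StrongNode> tmp = std::move(raw->next);
    tmp.reset();
    return leaked;
}

// Same shape, but the back link does not own its target.
int leaked_by_weak_back_link() {
    int before = live_nodes;
    {
        auto a = std::make_shared<WeakNode>();
        auto b = std::make_shared<WeakNode>();
        a->next = b;
        b->prev = a;
    }
    return live_nodes - before;
}
```

Choosing which links are "back" links is easy for two nodes and much less so for a general graph with numerous cycles and overlapping lifetimes, which is where a tracing collector earns its keep.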

Mark Thornton
 
James Kanze

[ ... ]
Seriously, I don't know. My own experience is that managing
other resources (or determinate lifetime in general) isn't that
difficult, even without RAII. In some cases, in fact, I find an
explicit finally block clearer---it makes it far more visible
how the resource is being handled, since the code for releasing
it is local to the same function in which it is freed. But
having seen a lot of code, in a lot of different languages, I'm
forced to recognize that my own experience really is my own
experience; missing finally blocks seem to be a very frequent
error in Java.

Right -- a finally block is explicit, but requires duplication
of the resource deallocation everywhere the resource is used.
Given a choice between something that's explicit but likely to
be used incorrectly, and something that's less explicit but
makes incorrect use difficult, I'll pick the less explicit
version every time.

Seriously, I didn't say I preferred it. I just wanted to point
out that the argument isn't as cut and dried as it is often made
out. And it depends on the resource; one of the classical
"resources" that gets mentioned, for example, is the file
descriptor. Since "freeing" a file descriptor (close()) can
fail, and you probably have to treat the error differently in
different cases, there's a very strong argument for a finally
block in this case (with an assertion in the destructor that the
file has been correctly closed).
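
A minimal sketch of that arrangement follows. Handle and write_and_close are hypothetical names, and close_fn is an assumption that stands in for a real call such as POSIX close(), which returns 0 on success and -1 on failure. The close is explicit, so the caller sees the result, and the destructor only asserts that the close has already happened.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical wrapper around something whose release can fail.
class Handle {
public:
    explicit Handle(std::function<int()> close_fn)
        : close_fn_(std::move(close_fn)), open_(true) {}
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    // Explicit close: the caller gets the result and picks the policy.
    bool close() {
        assert(open_);
        open_ = false;
        return close_fn_() == 0;
    }

    bool is_open() const { return open_; }

    ~Handle() {
        // A destructor cannot usefully report failure, so it only checks
        // that the caller closed the handle explicitly beforehand.
        assert(!open_ && "Handle destroyed without an explicit close()");
    }

private:
    std::function<int()> close_fn_;
    bool open_;
};

// The error handling differs from case to case; here it is just reported.
bool write_and_close(Handle& h) {
    // ... writes would go here ...
    return h.close();
}
```

Note that this sketch covers only the normal path; if the writes could throw, the caller would still need to close the handle (and decide what a failure there means) before it is destroyed.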

The most frequent clean-up in C++ is memory, with locks probably
coming in second. The first isn't needed with garbage
collection, and a lot of languages have made synchronization
part of the language, to avoid the second. I disagree with this
choice, because there are cases where you want to hold a lock
independently of scope---I use boost::shared_ptr to manage locks
more often than I do to manage memory.
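
One way this might look is sketched below, with std::shared_ptr standing in for boost::shared_ptr. FlagLock, LockHandle, Request and begin_request are hypothetical names for the illustration, and FlagLock is a single-threaded stand-in for a real mutex. The deleter unlocks rather than deletes, so the lock is released when the last copy of the handle goes away, whichever scope that happens in.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a real mutex, so the state can be inspected.
struct FlagLock {
    bool held = false;
    void lock()   { assert(!held); held = true; }
    void unlock() { assert(held);  held = false; }
};

// A shared handle on a held lock. The lock object itself is not owned.
using LockHandle = std::shared_ptr<FlagLock>;

LockHandle acquire(FlagLock& l) {
    l.lock();
    // Custom deleter: unlock instead of delete when the last copy dies.
    return LockHandle(&l, [](FlagLock* p) { p->unlock(); });
}

// The lock travels with the request rather than with a lexical scope.
struct Request {
    LockHandle lock;
    int payload;
};

Request begin_request(FlagLock& l, int payload) {
    return Request{acquire(l), payload};
}
```

With a real std::mutex the same pattern applies, with the extra constraint that the unlock has to happen on the thread that took the lock.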

If memory was really just memory, I'd more or less agree. In
reality, it's not -- memory is used to hold _objects_. In an
OO program, managing those objects tends to be most of what
the program does. RAII and GC provide different ways of
managing objects.

And that's where you're wrong. Obviously, memory is used to
hold objects. And code. And a lot of other things. Memory,
per se, is part of the underlying abstraction, like the CPU.

Back in the old days, we really had to manage memory. Including
the memory in which functions were situated---do you remember
programming with overlays, before virtual memory? Today,
virtual memory makes all of that a bad memory. Not using
garbage collection today is in many ways like refusing virtual
memory was back then (and a lot of programmers did refuse it,
and insisted it was better to have to manually manage your
overlays).

Garbage collection doesn't manage objects, or object lifetime.
That's still up to you.

RAII automates quite a bit of the management of many types of
objects, while GC automates one part of managing a much
smaller subset of objects.

GC has nothing to do with managing objects. It only manages
memory.

[ ... I had said: ]
Which was? Either memory management has been turned over to the
OS (in which case, the memory in question isn't even part of the
process image, for the garbage collection to free), or it hasn't
been (in which case, you still have a pointer to it, so garbage
collection won't free it).

TTBOMR, the code in question did something along the lines of
allocating an array, and then writing it to disk. The sole
remaining pointer to the array had been passed to the OS for
writing, so the GC (apparently) decided it was ready to be
collected. As I said previously, the real problem was probably
in the library rather than the GC per se -- the library should
have kept a pointer to the data around until the OS call to
write the data had returned, but it apparently didn't.

The real problem is even more fundamental, I think. Regardless
of the system you're using to manage memory---garbage collection
or manual---how do you know when it is safe to reuse this
memory? Solve that, and you can find solutions with garbage
collection or with manual management.

The one case this could be a problem is if you're adding garbage
collection to C++, and the library isn't aware of it, and
expects you to free the pointer manually in some call-back.
Except that even then, if you're going to free the pointer,
you've got to have a copy of it somewhere.

[ ... ]
That is a real problem. In general, any "new" technology
gets overused---people seem to believe in silver bullets.
And for various historical reasons, many programmers don't
seem to be aware of the distinction between object lifetime
and memory management; more than a few Java programmers do
make claims to the effect that you can ignore object
lifetime considerations; by tying memory management to
object lifetime (and thus forcing a determinate lifetime on
all objects), we certainly draw attention to object lifetime
in general. But should we really refuse a technology which
will improve the productivity of the competent programmers
just because it can and will be abused? And is artificially
creating unnecessary work (consideration of object lifetime
when it is not necessary) really the best way to stimulate
awareness of an issue?

I don't know for sure -- on a number of counts. First of all,
I'm not sure it really improves productivity by any noticeable
margin.

It's measurably improved mine, in the applications where I've
been able to use it. How much improvement obviously will depend
on the application.

Second, I'm not sure that consideration of object lifetime
really causes much extra work in the long run either. As
you've pointed out, the variance is so great that the average
means little, so meaningful measurement is difficult; when I
say "I don't know", I'm not trying to voice disagreement, but
really admitting that I just don't know.

It depends on the objects. Maybe it's just my style, but I tend
to use polymorphic agents in some cases; typically, managing
their lifetime isn't a lot of extra work (boost::shared_ptr
works well in this case, since agents never point to other
agents, only to entity objects), but the amount isn't 0, either.
And in a few cases (I'm working on one right now), I have to
deal with more or less complicated graphs---again, because it is
a graph (with no parent links), I can get by with
boost::shared_ptr (in this case, my pre-Boost reference counted
pointer, in fact, since the code in question was originally
written some 15 years ago). But it makes things more
complicated. Not enormously so, but the little bits add up.

Ah yes, the "it doesn't work, but look at my beautiful GUI"
syndrome. The worst part is that (IMO) their GUI usually
reflects the confusion and complexity of the code...

Sometimes, it's what the customer asks for. I remember one case
where the customer was constantly on to us because some toolbar
was one pixel too high, or things like that. But never noticed
that the code systematically multiplied by 3.5 to convert
currency, i.e. DM to Euro, multiply the DM by 3.5; Euro to DM,
multiply the Euro by 3.5. The customer also paid a very large
sum to a research institute to establish ergonomic guidelines
for the GUI interface---all of which turned out to be just
common sense or good esthetics, and most of which, they then
insisted that we violate.
 
James Kanze

[ ... ]
Dangling pointers are impossible as long as everything is working
correctly, that's true. OTOH, C++ makes some other things easy to manage
that most of the other languages make more difficult. The questions are
1) which problems arise more often, and 2) which are more difficult to
fix.

I think that this will largely depend on the application and the
programming style. Given that, I'd argue that GC is applicable
for certain applications and certain styles, and that one of the
goals of C++ is to give the programmer the choice. In this
regard, using garbage collection shouldn't be mandatory (and
really can't be, if you want C++ to be usable in critical,
embedded applications, where all use of dynamic allocation is
banned), but it should be a requirement that an implementation
provide it.

As I said, it's a useful tool in some cases, and as such, should
be available.
 
Jerry Coffin

[ ... ]
I think that this will largely depend on the application and the
programming style. Given that, I'd argue that GC is applicable
for certain applications and certain styles, and that one of the
goals of C++ is to give the programmer the choice. In this
regard, using garbage collection shouldn't be mandatory (and
really can't be, if you want C++ to be usable in critical,
embedded applications, where all use of dynamic allocation is
banned), but it should be a requirement that an implementation
provide it.

I don't see how it makes sense to mandate it. Some implementations are
targeted specifically and entirely at systems for which GC just doesn't
make sense at all. If you restricted the mandate to hosted systems, I'd
consider that a lot closer to reasonable, but even there I don't think
it really makes sense for it to be mandatory. Along with it just not
making sense, truly mandating GC is almost impossible anyway. The
current paper that talks about it has pages and pages of description,
but when you get down to it (and I know I've mentioned this before) the
garbage collection itself isn't really mandatory at all -- it all comes
down to one non-normative note saying that high quality implementations
are expected to make as much memory available as possible.

In the end, it's all about return on investment. In this case we're
mandating a large investment with no certainty of a return and (by your
estimate) only about a 10% return at best. I can think of a number of
things I think would benefit C++, but I've never formally suggested most
of them, because I don't think the payoff is worth the investment -- but
in these terms, GC scores considerably _worse_ than almost anything else
I can think of.
 
James Kanze

[ ... ]
I think that this will largely depend on the application and the
programming style. Given that, I'd argue that GC is applicable
for certain applications and certain styles, and that one of the
goals of C++ is to give the programmer the choice. In this
regard, using garbage collection shouldn't be mandatory (and
really can't be, if you want C++ to be usable in critical,
embedded applications, where all use of dynamic allocation is
banned), but it should be a requirement that an implementation
provide it.

I don't see how it makes sense to mandate it. Some
implementations are targeted specifically and entirely at
systems for which GC just doesn't make sense at all. If you
restricted the mandate to hosted systems, I'd consider that a
lot closer to reasonable, but even there I don't think it
really makes sense for it to be mandatory. Along with it just
not making sense, truly mandating GC is almost impossible
anyway. The current paper that talks about it has pages and
pages of description, but when you get down to it (and I know
I've mentioned this before) the garbage collection itself
isn't really mandatory at all -- it all comes down to one
non-normative note saying that high quality implementations
are expected to make as much memory available as possible.

When you get right down to it, that's all the standard
"mandates" for delete, too. There's certainly nothing in the
standard which would guarantee that you'll eventually be able to
reuse the memory which is freed. For various reasons, such a
guarantee isn't possible, or at least, no one has found a
reasonable way to word it so that it still takes into account
all of the vagaries of actual implementations (fragmentation,
different strategies with regards to coalescence, etc.). On the
other hand, the intent is clear.

The wording in the case of garbage collection is even more
wishy-washy, for political reasons. There's no real reason not
to mandate that the implementation support it whenever dynamic
allocation is supported (which is, formally, always). In
practice, of course, implementations will do whatever they
have to in order to be usable on the system in question; I've used
implementations of C which didn't support float and double, or
at least had compile time switches to turn them off. In the case
of garbage collection, of course, if you do delete everything
you've allocated, you don't see it. An implementation could
certainly make a version of malloc/free available which would
only work in this case, just as quality implementations
generally make several versions of malloc/free available today.
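
A rough illustration of the point above, that a program which deletes everything it allocates never sees the collector. This is only a sketch: `CountingHeap`, `Snapshot` and `demo` are hypothetical names invented for the example, and the class just counts outstanding blocks on top of std::malloc/std::free. It is not a collector, and it is not how the Boehm collector or any real implementation works.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical bookkeeping wrapper around std::malloc/std::free.
// It only counts blocks that are still outstanding; it collects nothing.
class CountingHeap {
public:
    void* allocate(std::size_t size) {
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        ++live_;
        return p;
    }

    void release(void* p) noexcept {
        if (p == nullptr) return;
        assert(live_ > 0 && "release without a matching allocate");
        std::free(p);
        --live_;
    }

    // Blocks allocated but not yet released. With a collector in place,
    // only unreachable blocks among these would ever be its business.
    std::size_t live_blocks() const noexcept { return live_; }

private:
    std::size_t live_ = 0;
};

// Counts of outstanding blocks at two points in the run.
struct Snapshot {
    std::size_t live_mid;  // after releasing one of two blocks
    std::size_t live_end;  // after releasing both
};

inline Snapshot demo() {
    CountingHeap heap;
    void* a = heap.allocate(64);
    void* b = heap.allocate(128);

    heap.release(a);
    const std::size_t mid = heap.live_blocks();  // b is still outstanding

    heap.release(b);  // everything allocated has now been released
    return Snapshot{mid, heap.live_blocks()};
}
```

If the second release were left out, live_blocks() would stay at 1, and that block (once its pointer is lost) is exactly the kind of garbage a collector is there to reclaim. With both releases in place there is nothing left for one to do.
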

In the end, it's all about return on investment. In this case
we're mandating a large investment

What large investment? On the part of whom? Integrating the
Boehm collector is certainly not a lot of work on the part of
the implementors.

with no certainty of a return and (by your estimate) only
about a 10% return at best. I can think of a number of things
I think would benefit C++, but I've never formally suggested
most of them, because I don't think the payoff is worth the
investment -- but in these terms, GC scores considerably
_worse_ than almost anything else I can think of.

All I can say is that there are a number of users (myself
included) who find the investment worthwhile enough to do it
ourselves. It took me less than a day to get the Boehm
collector up and running with g++ under Linux, and I don't have
access to all of the inside knowledge of the implementors. And
10%, over the life of a project, is a lot more than one man-day,
even for a single project. 10% gain for 90% of the C++ projects
almost certainly adds up to man-years.
 
Jerry Coffin

[ ... ]
When you get right down to it, that's all the standard
"mandates" for delete, too. There's certainly nothing in the
standard which would guarantee that you'll eventually be able to
reuse the memory which is freed. For various reasons, such a
guarantee isn't possible, or at least, no one has found a
reasonable way to word it so that it still takes into account
all of the vagaries of actual implementations (fragmentation,
different strategies with regards to coalescence, etc.). On the
other hand, the intent is clear.

It's true that very little is mandated about delete. The fundamental
difference is that it stops there -- i.e. we don't have pages and pages
throughout the rest of the standard devoted to enabling it to do
nothing. N2287 is 23 pages long, and it looks like about 15 pages of
that is proposed to be added to the standard. Along with that, they
propose changing the language to make some things that are currently
defined parts of the language become undefined behavior.

[ ... ]
What large investment? On the part of whom? Integrating the
Boehm collector is certainly not a lot of work on the part of
the implementors.

A couple of points. First of all, it looks to me like some work will
need to be done on the Boehm collector to make it conform with the
proposal in N2287.

Second, you seem to be looking at it entirely in terms of use with the
half dozen (or so) implementations to which it has already been ported.
As you know perfectly well, however, that's a long ways short of every
C++ compiler around. Porting it to a compiler that's not already
supported isn't nearly so trivial.

Finally, it looks like although the collector proper is covered by quite
a liberal license, parts of integrating it use code that falls under
more restrictive licenses -- which might be a bit of a problem for some
commercial compilers and such.

[ ... ]
All I can say is that there are a number of users (myself
included) who find the investment worthwhile enough to do it
ourselves. It took me less than a day to get the Boehm
collector up and running with g++ under Linux, and I don't have
access to all of the inside knowledge of the implementors. And
10%, over the life of a project, is a lot more than one man-day,
even for a single project. 10% gain for 90% of the C++ projects
almost certainly adds up to man-years.

Adding it yourself, on a compiler for which it's already supported, for
the right kind of project may well be a win. Mandating that it be added
for every implementation of C++, regardless of the target, is a whole
different story -- you know as well as I do that for quite a few high-
reliability and/or real-time situations, almost all dynamic allocation
is verboten. When/if that's the primary (or only) target of a particular
implementation, requiring it to include a garbage collector simply makes
no sense.

Likewise, there are a fair number of smaller systems for which GC would
be possible, but rarely if ever of any real help. Just for example, a
few years ago I was dealing with some code for some security cameras.
The system had a joy-stick, a half dozen (or so) buttons, and four
motors (X, Y, Z and focus). C++ was a reasonable fit for the system, but
garbage collection wouldn't have been at all. More importantly, I can
hardly imagine a system that would be built around that processor that
WOULD benefit from GC -- so trying to get that compiler to include it
wouldn't make any sense.

I think one of the biggest real problems with C++ today is that almost
nobody follows the standard, or really even makes an attempt at doing
so. Changing it in a way that makes it even less applicable to a large
number of systems would not, IMO, be a good move at all.
 
Roedy Green

To make it clear why that is not necessarily true, consider

1. It takes months to produce a Japanese sword that decapitates in
less than a second.

2. An optimising compiler may spend an hour fine-tuning code so that the
result is close to theoretically perfect. The tool is slow but the
result is not.

It is also possible to optimise code in many ways, including judicious
use of assembly to make a smart pig outperform a dumb jackrabbit.
 
