[ ... ]
A non-normative note still creates expectations with regard to
quality of implementation.
In that case I think they should add the note, and enough rules to allow
a garbage collector to work, and forget about the rest of it...
Technically, I think we probably can live without it, although
it seems stupid to do so.
I disagree. Garbage collection enables quite a bit in the design of a
language, but I see _very_ little gain from it in a language that wasn't
designed with GC in mind from the beginning. Yes, there are a few
situations in C++ that would benefit to some minimal degree from GC, but
only a few, and only minimally at that.
Since I've also used (and, for that matter, implemented) a number of
other languages that always have GC (Scheme, Smalltalk, etc.), when I
first found a GC for C++ I was quite enthused. Using it changed my mind
though: I now put a garbage collector for it in about the same class as
a goto. Neither is exactly evil, but well designed code doesn't seem (to
me) to really benefit from either.
I think that non-technical issues
mean that without threads and garbage collection, the language
is doomed to a slow death, much like C today.
I agree about supporting threads -- but I don't see much relationship
between threads and GC.
What do you mean by a _good_ garbage collector?
I probably should have said "decent" or something on that order -- I
wasn't trying to establish what good was, only that we shouldn't start
by restricting it to such a degree that it can't really improve on
what's available right now.
A necessary (but probably not sufficient) condition would be that the GC
is allowed to compact the heap. This at least allows really fast
allocations, and eliminates the possibility of death by fragmentation.
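
To make the "really fast allocations" point concrete, here is a minimal sketch of what allocation can look like once the heap is compacted. The class name BumpHeap and its interface are made up for illustration; a real collector would trigger a collection instead of returning a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bump allocator over a compacted heap: all live data sits
// below top_, so everything above it is one contiguous free region.
class BumpHeap {
public:
    explicit BumpHeap(std::size_t bytes) : storage_(bytes), top_(0) {}

    // Allocation is a bounds check plus an addition; no free list is searched.
    void* allocate(std::size_t bytes) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t remaining = storage_.size() - top_;
        if (bytes > remaining) return nullptr;  // a real GC would collect here
        // Round the request up so the next block stays suitably aligned.
        const std::size_t rounded = (bytes + align - 1) / align * align;
        if (rounded > remaining) return nullptr;
        void* p = storage_.data() + top_;
        top_ += rounded;
        return p;
    }

    std::size_t used() const { return top_; }

private:
    std::vector<unsigned char> storage_;
    std::size_t top_;
};
```

Since the free space is never split into pieces, there is nothing to fragment; the cost is that the collector has to be allowed to move live blocks.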
Practically
speaking, the presence of undiscriminated unions means that
perfect garbage collection is probably not in the cards.
(Formally speaking, accessing anything but the last written
member of a union is undefined behavior, and the standard allows
an implementation to maintain a hidden discriminator. In
practice, any implementation which actually did this would break
so much code as to be inviable.) But there are mostly accurate
collections which are used with C++, see for example
http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-88-2.pdf.
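
A small sketch of why such a union forces conservative treatment. The names Slot, raw_word and might_point_into are hypothetical, and the code assumes std::uintptr_t exists and has the same size as a pointer, which the standard does not require.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// An undiscriminated union: nothing in the object records which member is live.
union Slot {
    int*           ptr;
    std::uintptr_t bits;
};

// Assumption for this sketch only: a pointer and uintptr_t occupy one word.
static_assert(sizeof(int*) == sizeof(std::uintptr_t), "sketch assumes equal sizes");

// Read the stored word the way a collector scanning raw memory would,
// without going through the inactive union member.
std::uintptr_t raw_word(const Slot& s) {
    std::uintptr_t word;
    std::memcpy(&word, &s, sizeof word);
    return word;
}

// Conservative test: any word whose value falls inside the heap's address
// range is treated as a possible pointer, whatever it really is.
bool might_point_into(std::uintptr_t word, const void* heap_begin, const void* heap_end) {
    const auto lo = reinterpret_cast<std::uintptr_t>(heap_begin);
    const auto hi = reinterpret_cast<std::uintptr_t>(heap_end);
    return word >= lo && word < hi;
}
```

A Slot holding a real pointer and a Slot holding an integer with the same value look identical to this test, so the collector has to keep the target alive and cannot move it.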
I'm not after perfect, or really anything close. I just think if we
restrict the language to allow for garbage collection that we at least
impose enough restrictions that it can be useful.
Looking at it in terms of the classic mark/sweep collector, you seem to
be concerned primarily with the mark phase, and the degree to which
garbage can/will be recognized. I'm more concerned with the sweep phase,
and ensuring that when it's done, it's really done some good. The fact
that there might be some unused blocks still marked as being in use is
nearly a given with GC, and increasing that percentage doesn't bother me
a lot. Having a heap that's still fragmented, and a heap manager that
still has to search through lists of free blocks to do an allocation
bothers me quite a bit more.
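
For comparison, this is roughly the kind of search I mean, reduced to a first-fit scan over a list of free blocks. FreeBlock, total_free and first_fit are illustrative names, not any particular heap manager's interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One entry in a hypothetical free list.
struct FreeBlock {
    std::size_t offset;
    std::size_t size;
};

// Total number of free bytes, regardless of how they are scattered.
std::size_t total_free(const std::vector<FreeBlock>& free_list) {
    std::size_t sum = 0;
    for (const FreeBlock& b : free_list) sum += b.size;
    return sum;
}

// First-fit: walk the list until a block is large enough.
// Returns the index of that block, or free_list.size() if none fits.
std::size_t first_fit(const std::vector<FreeBlock>& free_list, std::size_t bytes) {
    for (std::size_t i = 0; i < free_list.size(); ++i) {
        if (free_list[i].size >= bytes) return i;
    }
    return free_list.size();
}
```

With three free blocks of 16 bytes each, separated by live data, a 32-byte request fails even though 48 bytes are free, and every allocation pays for the walk.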
That's not guaranteed today, and I've used implementations where
you had to take particular precautions when hashing a pointer,
because different pointers to the same data could have different
bit patterns. (Think of 8086 compilers in model huge, which
only normalized a pointer when necessary.)
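
A sketch of that 8086 situation, modelling a far pointer as a segment:offset pair. The type FarPointer and both functions are made up for illustration; the only fact relied on is that real-mode addresses are formed as segment * 16 + offset on a 20-bit bus.

```cpp
#include <cassert>
#include <cstdint>

// Model of a real-mode far pointer (illustrative, not a compiler type).
struct FarPointer {
    std::uint16_t segment;
    std::uint16_t offset;
};

// Physical address on an 8086: segment * 16 + offset, wrapped to 20 bits.
std::uint32_t linear_address(FarPointer p) {
    return ((static_cast<std::uint32_t>(p.segment) << 4) + p.offset) & 0xFFFFFu;
}

// Hashing the raw bits would give different results for the same byte of
// memory, so hash the normalized (linear) address instead.
std::uint32_t hash_far(FarPointer p) {
    return linear_address(p);
}
```

Here 0x1234:0x0010 and 0x1235:0x0000 have different bit patterns but name the same byte, so they must hash alike.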
Right -- at best it's not clear how much you can really expect today.
I'm pretty sure code that depends on it is vanishingly rare.
It's guaranteed to work today, if there is a sufficiently large
integral type.
Right -- but there's no guarantee that there will be a sufficiently
large integral type, so code that does anything of the sort isn't
portable anyway.
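
Assuming the guarantee in question is the pointer-to-integer round trip, the code at issue looks something like this. Node, hide and reveal are hypothetical names, and std::uintptr_t is exactly the optional "sufficiently large integral type".

```cpp
#include <cassert>
#include <cstdint>

struct Node {
    int value;
};

// Store a pointer as an integer. Compiles only where std::uintptr_t exists.
std::uintptr_t hide(Node* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Convert the integer back; the result compares equal to the original pointer.
Node* reveal(std::uintptr_t bits) {
    return reinterpret_cast<Node*>(bits);
}
```

While the pointer is held only as an integer, a collector that tracks pointer types precisely has no way to know the Node is still reachable.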
It does mean that a conservative collector is
required. Or a "mostly-accurate" one---the mostly accurate
collectors identify cases where this occurs, and treat them
conservatively, while treating the rest accurately (including
copying, in some cases).
Right -- this is what I was thinking of when I contrasted a GC that's
purely in the library to a GC with compiler support. The compiler needs
to be aware of all the casts from pointer to integer, unions that
include pointers, and so on, making it fairly trivial for it to support
the GC in dealing with those. Implementing the GC entirely in the
library loses that visibility.
[ ... ]
(On another note: I think I'll write up a proposal real quick
for a standard hash function for pointers, since it's something
that a user can't write in an architecture independent manner.)
Putting it into the standard library also makes it trivial to ensure
that it keeps working when/if blocks of memory get moved around. Just
for one obvious possibility, the original address of the memory block
can be stored in a hidden location in the memory block and always used
for the hash, even when/if the block moves.
Of course, that requires that no other block of memory ever be allocated
at the same address, but (especially with a 64-bit address range) the
same trick that preserves the ordering of addresses also prevents their
reuse: when you copy the data out of a block and start to reuse the
memory, you assign it a new range of addresses. As long as you
consistently either increment or decrement the addresses, you maintain
both order and uniqueness.
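
A rough sketch of the hidden-location idea, using a counter that only ever increases in place of the "original address". MovingHeap, BlockHeader and every member name here are hypothetical; this is one way it could be done, not how any existing library does it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <new>

// Hidden header stored just before each payload.
struct alignas(std::max_align_t) BlockHeader {
    std::uint64_t original_id;   // assigned once, copied on every move
    std::size_t   payload_size;
};

class MovingHeap {
public:
    void* allocate(std::size_t bytes) {
        void* raw = ::operator new(sizeof(BlockHeader) + bytes);
        BlockHeader* h = new (raw) BlockHeader{next_id_++, bytes};
        return h + 1;  // the payload starts right after the header
    }

    // Simulates the collector relocating a block: header and payload are
    // copied to fresh storage, so the identity survives the move.
    void* relocate(void* payload) {
        BlockHeader* old_h = header_of(payload);
        void* raw = ::operator new(sizeof(BlockHeader) + old_h->payload_size);
        BlockHeader* new_h = new (raw) BlockHeader(*old_h);
        std::memcpy(new_h + 1, payload, old_h->payload_size);
        ::operator delete(old_h);
        return new_h + 1;
    }

    void release(void* payload) { ::operator delete(header_of(payload)); }

    static std::uint64_t id_of(const void* payload) {
        return header_of(payload)->original_id;
    }

    // Hash the stored identity, never the current address.
    static std::size_t stable_hash(const void* payload) {
        return std::hash<std::uint64_t>{}(id_of(payload));
    }

private:
    static BlockHeader* header_of(void* payload) {
        return static_cast<BlockHeader*>(payload) - 1;
    }
    static const BlockHeader* header_of(const void* payload) {
        return static_cast<const BlockHeader*>(payload) - 1;
    }

    std::uint64_t next_id_ = 1;  // only ever incremented, so ids are never reused
};
```

Because the counter only goes up, the ids also preserve allocation order, which is the same property the incrementing-address trick gives you.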