Gene said:
I assume you are implementing a VM in C because you are posting
here.
You are confused about LLVM. SSL is an intermediate form
representation for compilers, not generally for interpreters like
VMs. VMs normally use stacks wherever possible because allocation/
eallocation is extremely fast. But they're LIFO, which doesn't work
for everything. For this there is the heap.
actually, a VM can also be a compiler, just typically compilation is done
just before running the code (as often called, JIT).
now, the tradeoffs are, as I see it:
for interpreters, stack machines are easier to implement and work with,
whereas register machines tend to be somewhat more complicated.
however, stack machines are also a problem when it comes to flexibility,
such as the freedom to reorganize code or adapt ordering to be better suited
to a particular target architecture (more important for compilers), which
gives an advantage to more flexible forms (such as SSA, or the use of ASTs).
The functional language community has looked at various schemes to
eliminate the stack completely in favor of keeping all storage on a
GC'ed heap. They have never been seriously competitive with stack.
Even though the allocation/deallocation instruction count can be made
comparable for both approaches, the heap approach doesn't play well
with memory hierarchies.
yeah, heap allocation tends to be slower...
"sliding pointer" allocators can also be very fast, but tend to require
either memory copying (like in a copy collector or compacting collector), or
periodically destroying an entire region of the heap (the app being
responsible for not having any important data there when this happens),
which limits applicability as a general-purpose allocation strategy...
Re GC root-finding, the answer is pretty obvious: VM stacks and
registers must be stored only in C data structures that the GC can
inspect for roots. If the GC is running in a thread different from
the program, appropriate mutexes are needed to make pointer copies
look atomic to the GC. C code that's "GC safe" often looks wierd and
verbose and is tricky to get right. A favorite example is the rather
innocent-looking
*context.register = allocate(&context, 42);
/* allocate 42 bytes to the location pointed to by the register */
where context is being exposed to the GC as a source of roots
and .register (obviously) holds a pointer.
Since the C compiler is free to emit code that computes the LHS L-val
first (and many do), and the allocation (by a copying collector) may
cause the pointer target to be relocated by the GC
(therefore .register holds a new value afterward), this is a hard-to-
find bug. You must instead say
context.tmp = allocate(&context, 42);
*context.register = context.tmp;
yeah.
we call this precise GC, and IME in most cases (in C), it is too much of a
hassle to bother with...
multi-threading with a precise GC is even worse, as then one has to start
being very pedantic about everything, ...
I much prefer the use of either conservative GC, or the "slide memory and
destroy the heap" strategy (this being well suited for cases where one knows
that something will only run a short time, and none of its local data will
be needed later, so to the code in question it is like having a GC...).
also, the above strategy can also give better performance in many usage
patterns than does having an actual GC.
Many people have tried "conservative" GC that inspects arbitrary C
stack contents for possible roots (see for example the Boehm
collector, which is about as well as this can be done). You can't do
this portably. Here, "portable" extends even to different command
line options for the same compiler on the same machine. It's a risky
approach.
The lccwin compiler supports GC as an extension, avoiding the above
risk because the compiler-writer is in the loop.
I do conservative GC, and in general it works fairly well...
(although I tend to use a custom GC, rather than Boehm...).
a partial downside though is performance (during a GC), namely that a
conservative GC with a large heap generally doesn't perform nearly as well
as the other options (largely since a conservative).
another problem is one of "write barriers":
there is no good and portable way to do implicit write barriers with a GC,
which is needed to be able to safely do concurrent GC (where the GC runs at
the same time as the program is running);
this means either having to forsake a little safety (running without write
barriers, and having some added risk of crashes), or having to use pedantic
little functions or macros whenever assigning regions of memory on the heap,
which is annoying.
like, if:
gcSet(foo->ptr, value);
has to be used rather than:
foo->ptr=value;
and so on...
there are also incremental strategies (such as reference counting), which
have their own tradeoffs.
sadly, all this means that one will often end up using a number of different
memory-management strategies in an app depending on what one is doing,
leading in some cases to "hidden rules" regarding the treatment and handling
of memory objects.
or such...