Dave Vandervies wrote:
It would be rather beyond the scope of a student project, but I would
expect that a sufficiently enthusiastic (or masochistic) C programmer
would be able to implement a Java compiler and JVM entirely in C
(though not the entire Java class library, but even there the parts
that don't have equivalents in C could be faked in C or passed on to
platform-specific extensions).
Except for the aforementioned bits of the class library that aren't
found in C, it would also be possible to write a Java-to-C compiler that
outputs standard C. (I don't know why anyone would bother, and it would
have a few Rather Ugly bits in both the translator and the generated C
(f'rexample, exception handling), but it could be done[1].)
Back in university I used to be involved in a (small part of a) project
that comes very close to what you describe, and your remark about
exception handling is right on the mark; so much in fact that back then
it was decided to target C++ instead of C to be able to piggyback on
C++'s exception handling.
Yep. C is interesting here in that it's too high-level to approximate
the way exceptions would be handled in a native-code implementation[2],
and too low-level to piggyback on an abstraction built into the language
(since the language doesn't have an appropriate abstraction).
setjmp and longjmp could be bludgeoned into useability for implementing
exceptions, but getting the appropriate information about what exception
occurred up the call stack would introduce some nasty complications.
The other obvious way to do it would involve checking whether you're
returning normally or throwing an exception (and handling it if you're
throwing an exception that's caught at that level) every time a function
returns, and even though the code generator does the hard work of writing
the code, I can't exactly see great efficiency gains being made by doing
it that way.
Funny thing: the compiler had no trouble
rewriting classes (including polymorphism, etc.) to C; only exception
handling was, well, exceptional.
Yep. Polymorphism is just an indirect function call through a pointer;
the cleanest way to do inheritance-that-adds-members requires some
mucking about with structs that isn't quite required to be legal (though
I think we once discussed it here and decided that there was no way it
could go wrong unless the implementation was actively checking for it,
because of the requirements that did exist on struct layouts), but that
can be worked around without getting too ugly.
Well, this is just saying that both are Turing-complete, basically. By
the same token you can say that Prolog can be implemented in COBOL and
vice versa.
Exactly.
I guess it would be fair to say that implementing C in Java
would have to involve some sort of simulation of 'untyped' memory, for
which Java doesn't have an equivalent.
Yep. Which is where the "Java can be implemented in C, but not the
converse" originally came from, if I haven't lost track.
dave
[1] Old footnote
[2] The obvious way to do it there would be to add an extra flag to
every stack frame and have the exception-throw handler unwind based
on those until it gets to the point where the exception is caught;
I almost did this for a simple compiler I wrote for a CS assignment
once[3].
[3] The language to be implemented had a poorly specified interaction
between looping and functions; the loop construct was a "loop until
a break is executed" and functions could be defined anywhere, and
one of the test cases was something like this:
--------
loop
{
function func()
{
break;
}
func();
}
--------
I did the obvious thing and didn't copy the compiler's internal
"break to" stack when I entered a function, so my compiler gave a
"break outside of loop" error here. The behavior the markers were
looking for was to break out of the loop here; after finding that
out, I thought about it for a bit and pointed out that the options
to correctly implement that behavior were to not use a stack at
all (contradicting a clarification I'd asked for earlier - local
variables in functions had to be reentrant) or to treat breaking as
throwing an exception and dynamically unwind the stack to the last
loop entered.
Apparently the simple (broken, or (at the most charitable) non-robust)
implementation was what they were looking for, but had I known
they'd be throwing that test case at it I'd've done it properly.
'Twouldn't've been hard, either, since most of the stack management
framework it would have needed was already built to handle other
things that it was already doing.