glen herrmannsfeldt said:
(snip, someone wrote)
There are still a few cases left where compact is fast, and fast
is important. For real-time systems, it is either fast enough or
doesn't work at all.
That doesn't follow, unless you're dealing with actual source code at run
time.
And even if significant, we don't know in this example how compact or
otherwise the expanded code might be.
The inner loop of an interpreter should also be fast, sometimes
at the expense of readability. (Though there is no excuse
for not having enough comments to explain the unreadable part.)
This is one dispatch loop for a bytecode interpreter:
typedef void (*(*fnptr))(void);
do {
    (**(fnptr)(pcptr))();
} while (!stopped);
It's reasonably fast (over 100M bytecodes per second), and is still pretty
clear, which is not unexpected for three lines of code! (BTW removing the
stop condition, just while(1), makes it slower for some reason.)
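
For anyone who wants to try the function-pointer loop in isolation, here is a minimal sketch. It is not the real interpreter: the `handler` typedef, the `acc` accumulator, `do_inc`, `do_stop` and `run` are all made up for illustration, and I assume each cell of the program holds a handler address directly, with every handler advancing `pcptr` itself.

```c
/* Sketch only: 'handler', 'acc', 'do_inc', 'do_stop' and 'run' are
   hypothetical names, not taken from the interpreter discussed above. */
typedef void (*handler)(void);

static handler *pcptr;   /* points at the current instruction cell */
static int stopped;      /* set by the stop handler to end the loop */
static long acc;         /* toy accumulator so the result is observable */

static void do_nop(void)  { pcptr++; }          /* skip to the next cell */
static void do_inc(void)  { acc++; pcptr++; }   /* bump acc, then advance */
static void do_stop(void) { stopped = 1; }      /* request loop exit */

/* Run a program made of handler addresses; returns the final accumulator. */
static long run(handler *program)
{
    pcptr = program;
    stopped = 0;
    acc = 0;
    do {
        (*pcptr)();      /* call the handler stored in the current cell */
    } while (!stopped);
    return acc;
}
```

Calling `run` on an array such as `{ do_inc, do_nop, do_inc, do_stop }` executes the cells in order until `do_stop` sets the flag. The point is only that the loop itself knows nothing about individual instructions.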
Faster (approaching 200M) is this, showing just one of nearly 300 similar
cases, each of which is handled in the same way:
while (1) {
    switch (*pcptr) {
    case knop: do_nop(); break;
    ....
    }
}
That's about as far as I could go while using standard C, and still having
an actual inner loop. I've gone a couple of steps further, and managed
200-400M bytecodes per second, but the basic C bytecode handler functions
/are the same/ in all cases.
That is, for every bytecode instruction, there is a discrete, dedicated
function to handle it. You can't get more straightforward than that. And
compiler inlining will also collapse a lot of these, so there is no need for
the code to be physically compact and unreadable.
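
To make the "one function per bytecode" point concrete, here is a rough sketch of switch dispatch calling small handlers. Only `knop` and `do_nop` appear above; the other opcodes, the `acc` accumulator and `run_switch` are invented for the example, and I loop on a `stopped` flag rather than `while (1)` purely so the sketch terminates. Declaring the handlers `static` in the same file is what gives the compiler the chance to inline them into the switch.

```c
/* Sketch only: kinc, kstop, acc and run_switch are hypothetical. */
enum { knop, kinc, kstop };

static const unsigned char *pcptr;   /* current position in the bytecode */
static int stopped;
static long acc;                     /* toy accumulator for observable output */

/* One small, readable handler per bytecode. */
static void do_nop(void)  { pcptr++; }
static void do_inc(void)  { acc++; pcptr++; }
static void do_stop(void) { stopped = 1; }

/* Dispatch on the opcode at pcptr until a stop is requested. */
static long run_switch(const unsigned char *code)
{
    pcptr = code;
    stopped = 0;
    acc = 0;
    while (!stopped) {
        switch (*pcptr) {
        case knop:  do_nop();  break;
        case kinc:  do_inc();  break;
        case kstop: do_stop(); break;
        default:    stopped = 1; break;   /* unknown opcode: bail out */
        }
    }
    return acc;
}
```

The handlers are the same shape as they would be for function-pointer dispatch; only the loop around them changes.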