W
websnarf
What's wrong with nested switch statements or tables of function pointers?
Well, nothing if you don't care about performance. You can use bubble
sorts too if you like. Have a blast.
What's wrong with nested switch statements or tables of function pointers?
[...]Correct. The C language standard itself creates this problem, for
really, no good reason. The simplest way to improve the C standard
for finite state machines is to extend either the continue or goto
statements to be able to go to a specific case label. In that way a
FSM can be implemented inside of a switch statement without paying the
exorbitant cost of rerunning a literal switch operation on every state
transition. This kind of thinking is basically beyond what the C
standards committee is capable of conceiving.
Assuming the branch logic dominates over the state actions, which wouldWell, nothing if you don't care about performance. You can use bubble
sorts too if you like. Have a blast.
The equivalent of an extra branch miss *PER* state transition. On
today's processors that's about 15 clocks or so. Think about that in
terms of a parser implemented as a FSM, where you are expecting each
state to be processed in, say, 5 clocks or so plus, say, a randomly
predicted branch (so half of those 15 clocks) at about 8 clocks. So
by adding an extra mis-predicted branch, you've just halved your
performance (a direct goto has an amortized cost of about 1 clock; but
it can be less in optimal circumstances).
Look. A few years back here, someone dared to challenge me to a
direct competition on this very issue for implementing a CSV parser
right here in this very newsgroup. We both implemented it using
FSMs. After he finally figured out that fread() can be faster then
millions of fgetc()'s (I basically told him so, so that the
competition would not be a complete joke), the real competition began,
and I was still able to clean his clock. Not even James Antill was
able to post much of a threat. The two other "competitors" in this
little duel, each had their own agenda and were incentivised to beat
me (the first, as a matter of honor/pride, and the second to push his
own string library).
Why couldn't they beat me? Because I was the only one who implemented
the FSM with gotos. And that was just too "gross" for them to deal
with. But the performance difference is very measurable.
Allow me to do some of my own tests. I'm not disagreeing.
Assuming the branch logic dominates over the state actions, which would
appear to be both a premature and speculative optimisation.
I may well be wrong, I've only ever implemented FSMs (using function
pointers) on CPUs without an instruction cache.
Care to post an example?
OK, I will. I must admit I've never looked at using gotos for FSMs.In my response to Christopher Layne you have direct examples of two
users implementing "for() switch()" based state machines, and myself
who made a goto dominated state machine implementation. You can go
download those and see for yourself. They complained that my
implementation was obfuscated, because of the gotos, but of course
that was as a cover for the fact that my solution comes out faster
precisely because of this.
In my response to Christopher Layne you have direct examples of two
users implementing "for() switch()" based state machines, and myself
who made a goto dominated state machine implementation. You can go
download those and see for yourself. They complained that my
implementation was obfuscated, because of the gotos, but of course
that was as a cover for the fact that my solution comes out faster
precisely because of this.
Correct. The C language standard itself creates this problem, for
really, no good reason. The simplest way to improve the C standard
for finite state machines is to extend either the continue or goto
statements to be able to go to a specific case label. In that way a
FSM can be implemented inside of a switch statement without paying the
exorbitant cost of rerunning a literal switch operation on every state
transition. This kind of thinking is basically beyond what the C
standards committee is capable of conceiving.
[...]
Ignoring the gratuitous insults, there's no reason you can't have
labels within a switch statement.
switch (x) {
case FOO: FOO:
/* do some stuff */
break;
case BAR: BAR:
/* do some other stuff */
break;
/* etc. */
}
Wrap it in a macro if you like:
#define LABEL(name) case name: name:
I'm checking out the code right now.
BTW: This line fails parsing:
test5.csv:"1234 West "Q" St. (for "Quantum", or "Quality")", 0
Whereas this line does not:
test6.csv:"1234 West ""Q"" St. (for ""Quantum"", or ""Quality"")", 0
Should that be the case?
Yes, I did have some examples that were not parsable for severe
conditions testing purposes. I'm one of these people who doesn't
think "Undefined Behaviour at the least provocation" is an acceptable
state of affairs.
Sure, sometimes 15 clocks doesn't matter. So your per-state
operations needs to be 300-1000 clocks (or about 600 to 3000 pipelined
x86 instructions) before you decide that state transition performance
isn't that important. (Just like bubble sort for 25 elements is
probably good enough.) Of course, its easy to find examples where the
the per-state operations are way less than that (any parser). I can't
think of a single counter example where the state transition is really
high like that off the top of my head though. Perhaps you can?
In my response to Christopher Layne you have direct examples of two
users implementing "for() switch()" based state machines, and myself
who made a goto dominated state machine implementation. You can go
download those and see for yourself. They complained that my
implementation was obfuscated, because of the gotos, but of course
that was as a cover for the fact that my solution comes out faster
precisely because of this.
Alright, I ended up taking it out of the large test file I created (which
was just a concatenation).
I cannot find a copy of Michael B. Allen's version, referenced in this
post:
http://groups.google.com/group/comp.lang.c/tree/browse_frm/thread/4fb...
Getting a 404 on his provided URL. Do you have a copy of it laying around?
There's not really anything the compiler can do unless its a forever
for (as in for(;) and the compiler is willing and able to do some
really serious optimizations for switches (something I have never
seen.)
Keith Thompson said:What definition would that be?
Keith Thompson said:True.
Of course, in the *worst* case the whole program is one function.
}
What definition would that be?
Hmm ... no; though it doesn't surprise me that he chucked that code
since it was so bug ridden and slow. I found an update of his general
csv parser here:
http://www.ioplex.com/~miallen/libmba/dl/src/csv.c
I don't know if it fits in the same code hoist, however. He's still
making the same performance mistake though (its amazing how
obstinantly stupid people can be about performance.)
Serve Laurijssen said:see Joe Wright's post ^^
Mark McIntyre said:The definition contained within the statement. Thats what "by
definition" means.
Want to reply to this thread or ask your own question?
You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.