Horror! Using goto statement in finite state machines?

F

Flash Gordon

Rui Maciel wrote, On 09/02/07 19:19:
The main property on which I based my definition of "best way to implement a
finite state machine" was code simplicity to write, read, debug, modify. As
far as I can tell the end result may end up being clearer and simpler to
deal with than a whole bunch of convoluted if/switch nests peppered with
loops.

Whether they are complicated or not can depend on what you are doing.
I see what you mean. When I started this project first I delved into Ragel
but as I'm mainly interested in the learning experience and I never wrote a
FSM before, I found it best to try to write it by hand.

Yes, certainly worth doing once or twice.
That sounds very interesting. Where can I learn more about those approaches,
specially the one that uses table-and-function-pointers?

A simple one could be something like:

int state = 0;
functype funcarray[] = { func1, func2, ... };

for (state = START; state != FINISHED; state = funcarray[state]())
continue;

Obviously each function returns the new state and all functions have to
have the same parameter list. Similarly a simple approach with switch is:

for ( state = START; state != FINISH; ) {
switch (state) {
case START: ... state = whatever; break;
case FRED: ... state = whatever; break;
}
}

With as much additional complexity as you require.

I know some people do not like the introduction of a state variable, but
others do not like the use of goto. You have to select the pattern that
best matches your problem, including possibly using a mixture.
 
I

Ian Collins

Richard said:
So if you were to write a FSM, what method would you choose and why? If the
method isn't exclusively based on goto statements, why did you opted it?


My usual method is to use two tables, a successor table, and an action
table. Each table is two dimensional, one dimension being the state
number and the second being the event number. The successor table
contains the next state for each (state,event) pair; the action table
either contains an action code (if I am using a switch) or a function
pointer (if I am using functions for each action). There two special
states, an init state and a term state.

When using function pointers the execution engine is really simple,
basically

while (state != TERMINAL)
{
action[state][event]();
state = successor[state][event];
event = get_next_event();
}
Along with simplicity, this approach also has the advantage of
separating the engine from the state logic, giving a more generic solution.
 
R

Roberto Waltman

Rui said:
... Stuff about FSMs deleted.
That sounds very interesting. Where can I learn more about those approaches,
specially the one that uses table-and-function-pointers?

For some reason I have never seen FSMs as anything other than tables
(with or without function pointers,) and felt that straight code,
switch statements, etc. are the wrong/unnatural way of implementing
them. Seems to be the other way around for most people.

A simple implementation follows. It is easily extended with default
actions, default transitions for given events, timeouts, etc.

Please note that:

(a) The linear search approach will be inefficient for large FSMs
(b) I verified it compiles without errors, but I have not really used
it. There may be dragons^H^H^H^H^H^H^H bugs.


/*-----------------------------------------------
* tiny_fsm.c
* Trivial implementation of
* a table-driven FSM in C
*
* Put in the public domain
* Roberto Waltman - 9-Feb-2007
*/

#include <stdio.h>

/* Valid states */

enum state_tag
{
NO_STATE,
ST_A,
ST_B,
ST_C,
ST_D /* etc. ...*/
};
typedef enum state_tag state_e;

#define INITIAL_STATE ST_C

/* Valid events triggering
* state transitions */

enum event_tag
{
NO_EVENT,
THING_UP,
THING_DOWN,
STUFF_OPEN
/* etc. ...*/
};
typedef enum event_tag event_e;

typedef struct transition transition_s;

/* Slot in transition table:
*
* current: current state.
* next: next state to
* transition to, if
* event: this event happens.
* action: function to call when
* transitioning from
* current to next.
*/

struct transition
{
state_e current;
event_e event;
state_e next;
void (*action)(transition_s *slot);
};

/* Actions to be invoked
* when changing state */

void f1(transition_s *slot);
void f2(transition_s *slot);
void f3(transition_s *slot);
/* etc. ... */

/* Transition table. There should be an
* entry for each valid transition (arch)
* in the FSM diagram. */

transition_s transition_table[] =
{
/* If current state is ST_A and event
* STUFF_OPEN happens, call f2() and
* move to state ST_C */
{ST_A, STUFF_OPEN, ST_C, f2},

/* If current state is ST_A and event
* THING_UP happens, call f1() and
* stay in same state. */
{ST_A, THING_UP, ST_A, f1},

/* If current state is ST_D and event
* THING_DOWN happens, move to ST_B.
* No action invoked. */
{ST_D, THING_DOWN, ST_B, NULL},

/* etc ... */

{NO_STATE, NO_EVENT, NO_STATE, NULL}
};


state_e next_state(
state_e current,
event_e event)
{
transition_s *tp = transition_table;
while (tp->current != NO_STATE)
{
if ((tp->current == current) &&
(tp->event == event ))
{
if (tp->action)
{
(tp->action)(tp);
}
return tp->next;
}
tp++;
}
return NO_STATE;
}


/* dummies to pass compilation */

int there_is_a_reason;
event_e what_happened();


void some_obscure_process()
{
state_e current_state = INITIAL_STATE;
state_e new_state;
event_e event;

while (there_is_a_reason)
{
event = what_happened();
new_state = next_state(current_state, event);
if (new_state != NO_STATE)
{
/* valid transition found */
current_state = new_state;
}
else
{
/* No valid transition for
* event from current_state
* This may or not be an error
*/
}
}
}


Roberto Waltman

[ Please reply to the group,
return address is invalid ]
 
R

Richard Heathfield

Rui Maciel said:

Sarcasm and bitterness aside, I fail to see the point of your post.

That figures.
Was there one?

Yes, and it's this - that there is a huge body of experience within the
profession to show that structured programming constructs are a massive
improvement on unconditional jumps, with a real payoff in maintenance
cost reduction. A typical well-coded state machine will comprise a loop
containing a switch statement, with each case relating to a state and
selecting a new state according to circumstances.
 
C

Christopher Layne

Richard said:
Yes, and it's this - that there is a huge body of experience within the
profession to show that structured programming constructs are a massive
improvement on unconditional jumps, with a real payoff in maintenance
cost reduction. A typical well-coded state machine will comprise a loop
containing a switch statement, with each case relating to a state and
selecting a new state according to circumstances.

And yet the one with unconditional jumps will end up being faster. :D
 
R

Richard Heathfield

Christopher Layne said:
And yet the one with unconditional jumps will end up being faster. :D

If that's your bottleneck, you are a fortunate fellow indeed.
 
K

Keith Thompson

Rui Maciel said:
I've been delving into finite state machines and at this time it seems that
the best way to implement them is through an intense use of the goto
statement. Yet, everyone plus their granmother is constantly cursing at the
goto statement, accusing it of being some sort of spawn of satan. So it
seems we have a pickle here.

The thing is, at first thought it seems that there aren't any viable
alternatives which are better suited for this task. Yes, probably the end
result will be an entangled mess but as far as I know a set of nested if
and switch statements will not improve much on the subject.

So, what are your views on this subject? Is the use of goto statements in
this application really the best possible solution to this problem or is it
unjustified and there is some godsend method which I'm missing out on?

Most "structured" control-flow constructs correspond more or less
directly to some concept in the real-world thing that you're
modelling. An if, if/else, or switch statement corresponds to making
a decision and acting on it. A loop corresponds to doing something
repeatedly. A function call corresponds to doing a bunch of related
stuff viewed as a single task.

A goto statement, though, *usually* corresponds only to a construct
within your program, not something in the real world. And it's been
argued that it's the label that's the real problem, not the goto.
When you look at a label in your code, you can't tell how you got
there without, in the worst case, analyzing the whole program; your
preconditions are messed up.

A finite state machine, though, is the exception to this (and the only
one I can think of). A label corresponds to a state in the machine; a
goto corresponds to a transition from one state to another. If you
write your code carefully, the code itself perfectly reflects the
structure of the thing that you're modelling. If you end up with
spaghetti code, it's only because you're modelling a spaghetti
machine.

On the other hand, there is another obvious way to implement a finite
state machine, using a variable (likely an enum) to represent the
current state, and a switch statement within a loop. The current
state is represented as the value of a variable rather than where you
currently are in the running program, and a state transition is done
by setting the variable rather than by executing a goto. This makes
it easier to execute common code regardless of the state, for example
by adding:
printf("state = %d\n", state);
at the top of the loop. It also lets you implement more complex
situations, such as having two simultaneous state machines; just store
their states in two variables.

If you're sure your state machine is going to remain simple enough,
you can go ahead and use gotos if you like. But consider using a
switch in a loop if you might need the additional flexibility later
on.

(The other cases in C where gotos are reasonable, IMHO, are
multi-level break and continue, and bailing out of a nested control
structure when an error occurs. But these are necessary only because
C lacks explicit multi-level break and continue statements and
exception handling. In a language that does support these constructs
directly, using goto for them would be very poor style.)
 
R

Richard Harter

A goto statement, though, *usually* corresponds only to a construct
within your program, not something in the real world. And it's been
argued that it's the label that's the real problem, not the goto.
When you look at a label in your code, you can't tell how you got
there without, in the worst case, analyzing the whole program; your
preconditions are messed up.

Yes and no. A goto statement, e.g., goto X, says that the next thing to
do is X. The implication of this is that when we do X, it shouldn't
matter how we got there. In other words if there is a precondition that
is true whenever we goto X then the semantics of the goto construct are
satisfied. The problem with gotos in unstructured code is that there
are no defined preconditions for labels and consequently there is no
check at the goto as to whether preconditions are met.
 
S

Serve Laurijssen

Keith Thompson said:
When you look at a label in your code, you can't tell how you got
there without, in the worst case, analyzing the whole program; your
preconditions are messed up.

Why analyze the whole program? a label does not have complete program scope
so you know you got there from within the same function.
 
C

Christopher Layne

Roberto said:
For some reason I have never seen FSMs as anything other than tables
(with or without function pointers,) and felt that straight code,
switch statements, etc. are the wrong/unnatural way of implementing
them. Seems to be the other way around for most people.

A simple implementation follows. It is easily extended with default
actions, default transitions for given events, timeouts, etc.

Please note that:

(a) The linear search approach will be inefficient for large FSMs

Which is why you see a lot of methods using switch, gotos, and tables of
function pointers.
 
K

Keith Thompson

Serve Laurijssen said:
Why analyze the whole program? a label does not have complete program scope
so you know you got there from within the same function.

True.

Of course, in the *worst* case the whole program is one function.
:cool:}
 
J

Joe Wright

Serve said:
such general statements are, by definition, false
Bertrand Russell said (among other things)..
"All generalizations are false, including this one."
 
F

Flash Gordon

Keith Thompson wrote, On 10/02/07 23:54:
What definition would that be?

The definition of the C language which allows implementations to make
the one with unconditional jumps slower ;-)
 
W

websnarf

I've been delving into finite state machines and at this time it seems
that the best way to implement them is through an intense use of the goto
statement. Yet, everyone plus their granmother is constantly cursing at
the goto statement, accusing it of being some sort of spawn of satan. So
it seems we have a pickle here.

Correct. The C language standard itself creates this problem, for
really, no good reason. The simplest way to improve the C standard
for finite state machines is to extend either the continue or goto
statements to be able to go to a specific case label. In that way a
FSM can be implemented inside of a switch statement without paying the
exorbitant cost of rerunning a literal switch operation on every state
transition. This kind of thinking is basically beyond what the C
standards committee is capable of conceiving.
The thing is, at first thought it seems that there aren't any viable
alternatives which are better suited for this task. Yes, probably the end
result will be an entangled mess but as far as I know a set of nested if
and switch statements will not improve much on the subject.

So, what are your views on this subject? Is the use of goto statements in
this application really the best possible solution to this problem

In the C language? I'm afraid so. The cleaner *LOOKING* solution is
a switch() inside of a for(), but its not *that* much cleaner and it
is a *HELL* of a lot slower.
[...] or is it
unjustified and there is some godsend method which I'm missing out on?

No you pretty much got it. In C, FSMs are gross, because the only
acceptable way of doing them is with gotos.
 
C

Christopher Layne

In the C language? I'm afraid so. The cleaner *LOOKING* solution is
a switch() inside of a for(), but its not *that* much cleaner and it
is a *HELL* of a lot slower.

I know you know your stuff Paul, but don't you think "a hell of a lot slower"
is a bit of an overstatement?

We're talking a single compare here within the switch and then a jump.
And that's not even considering what the compiler might optimize it to in the
long run.
No you pretty much got it. In C, FSMs are gross, because the only
acceptable way of doing them is with gotos.

I don't think the overhead is *that* bad.
 
I

Ian Collins

No you pretty much got it. In C, FSMs are gross, because the only
acceptable way of doing them is with gotos.
What's wrong with nested switch statements or tables of function pointers?
 
W

websnarf

I know you know your stuff Paul, but don't you think "a hell of a lot
slower" is a bit of an overstatement?

The equivalent of an extra branch miss *PER* state transition. On
today's processors that's about 15 clocks or so. Think about that in
terms of a parser implemented as a FSM, where you are expecting each
state to be processed in, say, 5 clocks or so plus, say, a randomly
predicted branch (so half of those 15 clocks) at about 8 clocks. So
by adding an extra mis-predicted branch, you've just halved your
performance (a direct goto has an amortized cost of about 1 clock; but
it can be less in optimal circumstances).
We're talking a single compare here within the switch and then a jump.

We're talking about doing the switch in the first place. switch's
cannot be well predicted unless they follow a trivial pattern with at
most two branch targets. In general, for most FSMs, switch's cannot
be predicted at all.
And that's not even considering what the compiler might optimize it to in
the long run.

There's not really anything the compiler can do unless its a forever
for (as in for(;;)) and the compiler is willing and able to do some
really serious optimizations for switches (something I have never
seen.)
I don't think the overhead is *that* bad.

Look. A few years back here, someone dared to challenge me to a
direct competition on this very issue for implementing a CSV parser
right here in this very newsgroup. We both implemented it using
FSMs. After he finally figured out that fread() can be faster then
millions of fgetc()'s (I basically told him so, so that the
competition would not be a complete joke), the real competition began,
and I was still able to clean his clock. Not even James Antill was
able to post much of a threat. The two other "competitors" in this
little duel, each had their own agenda and were incentivised to beat
me (the first, as a matter of honor/pride, and the second to push his
own string library).

Why couldn't they beat me? Because I was the only one who implemented
the FSM with gotos. And that was just too "gross" for them to deal
with. But the performance difference is very measurable.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,226
Members
46,815
Latest member
treekmostly22

Latest Threads

Top