Hey everyone. I am currently reading through an RFC, and it mentions
that a client and server half of a transaction are embodied by finite
state machines. I am reading through the wikipedia article for finite
state machines, and sadly it's going a bit above my head. I don't
really have a background in engineering, and would really love to
understand what is being said. Does anyone have a simple way to
explain it?
Yes... "background in engineering" is what a layperson would assume
from the words "finite state machine" but this is not really so
(unless you think math is the same as engineering) because the words
were invented by mathematicians (no normal engineer would call a
machine an 'automaton' -- the original name!)
So assuming and staying within python a natural example would be what
is called lexical analysis or tokenizing.
Say you have a line of python like:
def foo(count, x ="if nothing is given") # catch a comment anyone?
the python interpreter in its lexical analysis phase breaks this up as
|def|foo|(|count|,|x||=|"if nothing is given"|)|<Comment DISCARDED>
Some of the things it does are:
1. Separating between all these tokens or lexemes
2. Distinguishing identifiers like foo, count, x (there's no
difference at this stage) from keywords like def
3. Discarding spaces but after using them to separate 'def' from 'foo'
4. Identifying special symbols like ( , = etc
5. Throwing away the comment
Now how does python know that the catch in the comment (or the if in
the string) are not the python keywords?
Well this is where the notion of state comes in:
It starts in a default or normal state but when it sees the " it goes
to (or is it gotos?) a state which we may call InsideStringState. In
this state everything is swallowed up indiscriminately until the next
" Likewise an InCommentState would swallow up everything from # to
newline.
So to tokenize python it would be natural to think with 3 states:
Normal, InString and InComment
Of course in real programming languages there would be many more
states than 3.
eg What happens when the machine is in InString state and you
encounter a backslash?
Or more hairy example: Say you have a language like C whose comments
are /*... */
Now when you see a / is it a divide symbol or a comment?
All such problems get solved by introducing suitable states like
MaybeComment
In short, dont focus your attention on 'machine' but on 'state' and
you should be fine