?
=?ISO-8859-1?Q?=C5smund_Grammeltvedt?=
Hi.
I am trying to implement a small compiler in python and, trying to use
something a bit more pythonic than lex/yacc, ended up with ply
(http://systems.cs.uchicago.edu/ply/). The only problem is that whereas
yacc accepts the grammar and appears to parse it correctly, ply does not.
Perhaps this belongs on some compiler list, but I couldn't decide if it
was a compiler or a python problem, so bear with me.
This smaller grammar illustrates the problem:
Goal ::= (+|+-)*;
The yacc grammar goes:
%start Goal
%%
Goal
: Block ';'
;
Block
: Empty
| Block T
| Block S
;
Empty
: /* empty */
;
S
: '+'
;
T
: '+' '-'
;
Translated to ply:
def p_Goal(p):
"""
Goal : Block SEMI
"""
def p_Block(p):
"""
Block : Empty
| Block T
| Block S
"""
def p_Empty(p):
"""
Empty :
"""
pass
def p_S(p):
"""
S : PLUS
"""
def p_T(p):
"""
T : PLUS MINUS
"""
Now, looking at the state machines that yacc and ply produces, the
difference show up.
Yacc:
state 0
0 $accept: . Goal $end
$default reduce using rule 5 (Empty)
Goal go to state 1
Block go to state 2
Empty go to state 3
state 2
1 Goal: Block . ';'
3 Block: Block . T
4 | Block . S
';' shift, and go to state 5
'+' shift, and go to state 6
S go to state 7
T go to state 8
state 3
2 Block: Empty .
$default reduce using rule 2 (Block)
Ply:
state 0
(0) S' -> . Goal
(1) Goal -> . Block SEMI
(2) Block -> . Empty
(3) Block -> . Block T
(4) Block -> . Block S
(5) Empty -> .
SEMI reduce using rule 5 (Empty -> .)
Empty shift and go to state 3
Goal shift and go to state 2
Block shift and go to state 1
OK, I believe those are the interesting parts. Appearently, ply doesn't
want to shift anything other than a ';' and thereby produces an error if
it encounters '+'. This only occurs in LALR mode, though. SLR handles
correctly, but my larger grammar isn't SLR, sigh.
So, any ideas? Have I completely messed up how to handle empty productions
or should I switch to another lexer/parser?
I am trying to implement a small compiler in python and, trying to use
something a bit more pythonic than lex/yacc, ended up with ply
(http://systems.cs.uchicago.edu/ply/). The only problem is that whereas
yacc accepts the grammar and appears to parse it correctly, ply does not.
Perhaps this belongs on some compiler list, but I couldn't decide if it
was a compiler or a python problem, so bear with me.
This smaller grammar illustrates the problem:
Goal ::= (+|+-)*;
The yacc grammar goes:
%start Goal
%%
Goal
: Block ';'
;
Block
: Empty
| Block T
| Block S
;
Empty
: /* empty */
;
S
: '+'
;
T
: '+' '-'
;
Translated to ply:
def p_Goal(p):
"""
Goal : Block SEMI
"""
def p_Block(p):
"""
Block : Empty
| Block T
| Block S
"""
def p_Empty(p):
"""
Empty :
"""
pass
def p_S(p):
"""
S : PLUS
"""
def p_T(p):
"""
T : PLUS MINUS
"""
Now, looking at the state machines that yacc and ply produces, the
difference show up.
Yacc:
state 0
0 $accept: . Goal $end
$default reduce using rule 5 (Empty)
Goal go to state 1
Block go to state 2
Empty go to state 3
state 2
1 Goal: Block . ';'
3 Block: Block . T
4 | Block . S
';' shift, and go to state 5
'+' shift, and go to state 6
S go to state 7
T go to state 8
state 3
2 Block: Empty .
$default reduce using rule 2 (Block)
Ply:
state 0
(0) S' -> . Goal
(1) Goal -> . Block SEMI
(2) Block -> . Empty
(3) Block -> . Block T
(4) Block -> . Block S
(5) Empty -> .
SEMI reduce using rule 5 (Empty -> .)
Empty shift and go to state 3
Goal shift and go to state 2
Block shift and go to state 1
OK, I believe those are the interesting parts. Appearently, ply doesn't
want to shift anything other than a ';' and thereby produces an error if
it encounters '+'. This only occurs in LALR mode, though. SLR handles
correctly, but my larger grammar isn't SLR, sigh.
So, any ideas? Have I completely messed up how to handle empty productions
or should I switch to another lexer/parser?