Well, all of it really.
I presume that lib/redparse.rb is the "real" parser (I also found
lib/redparse/babyparser.rb and babynodes.rb)
Yes, that's right. babyparser.rb is an example of a minimal (3 rule)
parser using this system, and can make a good place to start trying to
understand the full-blown parser.
It's a monolithic file, and it was hard even to see where the grammar
began. I believe it's here:
Unfortunately, there's a lot of experimental stuff which shouldn't be
in there, related to my attempt to write a proper parser compiler.
[
-[UNOP, Expr, lower_op]>>UnOpNode,
-[DEFOP, ParenedNode]>>UnOpNode,
-[Op(/^(?:unary|lhs|rhs)\*$/), ValueNode, lower_op]>>UnaryStarNode,
... etc
These are indeed the rules. The general form of a rule is
-[ patterns to search for on the top of parse stack ] >>
NodeTypeTheyGetReplacedWith,
These 3 rules deal respectively with (most) unary operators, the
defined? operator, and unary star operators. (What's found to the
right of the >> is generally a pretty good clue as to what a
particular rule is doing. (But not always in the case of stack
monkeys...)) UNOP, DEFOP, Expr, lower_op and the like are defined
above in the definitions section.
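To make the shape of a rule concrete, here is a minimal sketch in plain
Ruby of how a -[...]>>Node expression could build a rule object and how
such a rule could be applied to the top of a parse stack. This is NOT
the real Reg or RedParse code: SketchRule, SketchPattern, the Array#-@
definition and the toy UnOpNode struct are all made up for
illustration, and lookahead patterns such as lower_op are left out.

```ruby
# Hypothetical illustration only; not how Reg or RedParse implement rules.
SketchRule = Struct.new(:patterns, :node_class) do
  # True when the last patterns.size stack entries each match via ===.
  def match?(stack)
    return false if stack.size < patterns.size
    top = stack.last(patterns.size)
    patterns.zip(top).all? { |pattern, item| pattern === item }
  end

  # Replace the matched entries with a single node built from them.
  def reduce!(stack)
    matched = stack.pop(patterns.size)
    stack.push(node_class.new(*matched))
  end
end

# Intermediate object returned by unary minus; >> turns it into a rule.
class SketchPattern
  def initialize(patterns)
    @patterns = patterns
  end

  def >>(node_class)
    SketchRule.new(@patterns, node_class)
  end
end

# Monkeypatch for the sketch: lets -[a, b] >> Node read like the rules above.
class Array
  def -@
    SketchPattern.new(self)
  end
end

# Toy node type standing in for a real parse tree node.
UnOpNode = Struct.new(:op, :operand)

rule = -[/\A[-+!~]\z/, Integer] >> UnOpNode

stack = [1, "+", "-", 5]
rule.reduce!(stack) if rule.match?(stack)
# The top two entries ("-" and 5) are now a single UnOpNode.
```

Unary minus binds more tightly than >>, so -[...]>>Node is read as
(-[...]) >> Node, which is what lets the two operators cooperate here.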
Note, though, that the action taken on finding a pattern on the parse
stack is not always to reduce the matched portion of the stack into a
Node. For instance, here:
and an example of a larger rule is like this:
-[NumberToken&-{:negative=>true}, Op('**').la]>>
stack_monkey("fix_neg_exp",2,Op("-@",true)){|stack|
#neg_op.unary=true
num=stack[-2]
op=OperatorToken.new("-@",num.offset)
# op.startline=num.startline
stack[-2,0]=op
num.ident.sub!(/\A-/,'')
num.offset+=1
},
That's not really what I'd call a larger rule, merely (alas) a longer
one... This is an example of one of many relatively unimportant rules
with which the parser must unfortunately be littered. This particular
example fixes up the precedence of expressions like -2**10. '-2' is
usually lexed as a single numeric token, as it is in most
languages. In this one special case, however, the -@ must actually be
made lower precedence than **. The implementation of this fixup can't
be neatly shoehorned into a Node constructor, however, so special
imperative code (a 'stack monkey') had to be written to fiddle with
the parse stack directly.
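The effect of that fixup on the stack can be sketched in isolation.
The following uses a stand-in SketchToken struct (an assumption of
mine, not RedParse's NumberToken or OperatorToken) and a plain method
in place of the stack_monkey block; only the array splice and the
string surgery mirror the rule quoted above.

```ruby
# Stand-in for a lexer token; the real classes carry more state.
SketchToken = Struct.new(:ident, :offset)

# Split a negative number followed by ** into a separate -@ operator
# and a positive number, mirroring the body of the quoted stack monkey.
def sketch_fix_neg_exp(stack)
  num = stack[-2]                        # number token; stack[-1] is the ** lookahead
  op = SketchToken.new("-@", num.offset) # the new unary minus sits where the '-' was
  stack[-2, 0] = op                      # zero-length splice: insert op before num
  num.ident = num.ident.sub(/\A-/, "")   # drop the leading '-' from the number
  num.offset += 1                        # the number now starts one character later
  stack
end

stack = [SketchToken.new("-2", 0), SketchToken.new("**", 2)]
sketch_fix_neg_exp(stack)
# stack now holds three tokens: -@, 2 and **, in that order.
```

With the minus split off as its own operator, the parser is free to
bind ** first, which matches what Ruby itself does: -2**10 evaluates
to -1024, not 1024.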
I have no idea how to (a) understand, or (b) modify that. I can see
there is quite a lot of Ruby operator abuse going on, but without
defined semantics.
I would call that a 'DSL'. Most of the special operators and other
unusual syntax are defined in my pattern matching language, Reg, which
is a different project. Reg is moderately well documented, but that's
in a whole other directory.
At least with racc, it's extremely well documented, in the sense that I
have a printout of the yacc manual to refer to.
I just couldn't ever get yacc to do what I wanted it to do,
personally. Lots of other people have had more luck....
I would like some day (if I ever have time) to split out the parser
construction tool aspects of RedParse from the actual ruby parser
itself, and package and document the parser compiler/interpreter
better. For now, although I have made an effort to make the interface
to RedParse fairly clear and well described, the internals I simply
didn't even try to explain....
So no doubt RedParse is a fine ruby parser, and generates a fine object
tree as its output. But it's not so good for me as a starting point for
building languages which inherit some of ruby flavour, but are
significantly different.
I can explain more IF you're interested, but it does seem like you
know where you want to go right now.
I'd still like to hear more about the language(s) you're trying to
make, if you want to tell.