rlex and ryacc

Luke A. Kanies

Hi all,

I'm just cutting my teeth on ruby, after a long time with perl as a
sysadmin. I'm currently trying to write a language compiler. Because of
my experience with perl, I am currently planning on using some kind of
lex/yacc combo, mainly because it should be pretty fast and it should be
relatively easy.

However, rlex and ryacc don't seem to be quite functional. rlex didn't
work on Debian, although it apparently worked fine on my Solaris x86
box. ryacc, though, is using /**/ style comments and seems to have left
out the middle couple hundred lines of code, which causes a few problems.

rockit hasn't been updated in 2 years, so I'm guessing it's not compatible
with the latest stuff. racc seems to exist and be current, but it
looks like a recursive descent parser (I'm not really even sure how to
tell, it just seems to have a similar grammar), and the last one I used
(Parse::RecDescent in perl) was about 10 times slower than a roughly
equivalent Parse::Lex/Parse::Yapp combo.

So, my question is: Is there a good parsing solution in ruby right now?
I really don't want to write my own, as I don't think I'm up to it (I'm
barely up to writing the parser with a parser compiler), but I really
would like to use ruby, as all of my prototype code is in ruby and it has
worked smashingly.

I've read through the ruby-talk archives, and what consensus I could find
there seemed to point to racc, so maybe I just need someone to correct my
ideas about how racc works and whether I should use it. I'm not
particularly attached to yacc-like functionality, as I've really only used
it once, but I am definitely concerned about speed.

Any help would be greatly appreciated.

Thanks,
Luke Kanies
 
Jim Freeze

> Hi all,
>
> So, my question is: Is there a good parsing solution in ruby right now?
> I really don't want to write my own, as I don't think I'm up to it (I'm
> barely up to writing the parser with a parser compiler), but I really
> would like to use ruby, as all of my prototype code is in ruby and it has
> worked smashingly.
>
> I've read through the ruby-talk archives, and what consensus I could find
> there seemed to point to racc, so maybe I just need someone to correct my
> ideas about how racc works and whether I should use it. I'm not
> particularly attached to yacc-like functionality, as I've really only used
> it once, but I am definitely concerned about speed.

I have used racc successfully on a project and did not have a problem
with speed. However, there are ways to speed up racc when needed.
Also, you could write your parser in C (IIRC, YAML started out in racc and
went to C. Ask Why.) I am no expert, but I can probably get you going
if you choose racc.

There is also rbison, but I am not familiar with the differences between
it and racc, although I think it offers roughly the same features as
racc.

And, rockit is being re-written in C and should be released now, but I
haven't followed its progress.
 
Jim Freeze

> Hi all,
>
> with the latest stuff. racc seems to exist and be current, but it
> looks like a recursive descent parser (I'm not really even sure how to
> tell, it just seems to have a similar grammar), and the last one I used

Oops, missed that question. Racc is LALR(1).
 
Luke A. Kanies

> I have used racc successfully on a project and did not have a problem
> with speed. However, there are ways to speed up racc when needed.
> Also, you could write your parser in C (IIRC, YAML started out in racc and
> went to C. Ask Why.) I am no expert, but I can probably get you going
> if you choose racc.

Okay, I'm trying to use that, and I can't even get that far. What did you
use for a lexer? rlex seems to be giving me no end of problems. There's
the inconsistency between linux and solaris, and now I'm finding that
although I get a valid Lexer.rb on solaris, it does not define the
necessary constants, so it obviously doesn't work.

And when I rewrite my token definitions just to print and not return
anything, I get an infinite loop, because apparently the lexer isn't
actually consuming the text or something.

> There is also rbison, but I am not familiar with the differences between
> it and racc, although I think it offers roughly the same features as
> racc.

I never even found this. If racc doesn't work out, I'll look at it.

> And, rockit is being re-written in C and should be released now, but I
> haven't followed its progress.

The web site doesn't have releases more recent than 2001, so I don't think
anything's really available.

Racc seems fine, now if I could just find a lexer.

Thanks,
Luke
 
Luke A. Kanies

Okay, just to summarize...

I'm still having problems, but I have at least made progress.

Here's what I've found so far:

rlex:
Does not generate code that's valid for 1.8, but the code does run.
Does not seem to like parsing strings, only files (I get an infinite
loop)

ryacc:
Does not generate a valid Parse.rb on any platform I've tested it
on, and is therefore unusable at this point.

racc:
Does not give much info about what to do for a lexer. I've tried
rlex, but they have incompatible means of specifying tokens,
apparently, so I don't know how to use them together. Without
a lexer, I can't really tell if this will work for me.

slex:
I could definitely be wrong here, but this seems to only function on
one character at a time, which is great if you need lots of control
and have a complicated syntax, but neither case applies to me (yet).
I have no idea how to make this work for me.

So, I may have a valid parser, but I can't seem to find a good lexer.
Am I down to writing my own (which I'd prefer not to do), or is there some
way to integrate rlex and racc? How are other people solving this
problem? I would especially like to see other people's real-life examples
of their next_token/yyparse routines.
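
As a rough sketch of what a next_token routine has to hand back: as far as I understand Racc's documented contract, a parser driven by do_parse calls next_token repeatedly and expects a [type, value] pair each time, with [false, false] marking the end of input. The class name TokenQueue and the token names below are made up for illustration and are not part of Racc; the tokens are supplied by hand rather than produced by a real lexer.

```ruby
# Hypothetical token source: TokenQueue is an illustrative name, not a Racc class.
class TokenQueue
  def initialize(pairs)
    @pairs = pairs.dup   # each entry is a [type, value] pair
  end

  # Hand out one pair per call; once the queue is empty, keep returning
  # the end-of-input marker [false, false].
  def next_token
    @pairs.shift || [false, false]
  end
end

source = TokenQueue.new([[:NUMBER, 1], ['+', '+'], [:NUMBER, 2]])
seen = []
4.times { seen << source.next_token }
```

In a real parser the same next_token method would normally live in the inner section of the grammar file, so that the generated parser class can call it directly.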

If I end up using rlex, I'm fully willing to modify it to generate valid,
1.8 code. I've already emailed the author, but have gotten no response
yet (it's only been a couple of days).

Once again, any ideas would be greatly appreciated. And considering how
often this seems to come up on this list (from looking through the
archives) it might be a good idea to get this solved once and for all. :)

Luke
 
Minero Aoki

Hi,

In mail "Re: rlex and ryacc"
Luke A. Kanies said:
> I'm still having problems, but I have at least made progress.
>
> racc:
> Does not give much info about what to do for a lexer. I've tried
> rlex, but they have incompatible means of specifying tokens,
> apparently, so I don't know how to use them together. Without
> a lexer, I can't really tell if this will work for me.

I'm using StringScanner (strscan). It is NOT a lexer generator,
but it is sufficient for me (including speed).
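
To make the strscan approach concrete, here is a minimal hand-written lexer sketch. It is not taken from TMail or RDtool; the class name SketchLexer, the token names :NUMBER and :IDENT, and the toy input are assumptions chosen for illustration. It returns [type, value] pairs and [false, false] at end of input, which is the shape a Racc next_token method is expected to produce.

```ruby
require 'strscan'

# Hypothetical lexer for a toy expression language; the class and token
# names are illustrative assumptions, not part of strscan or Racc.
class SketchLexer
  def initialize(text)
    @scanner = StringScanner.new(text)
  end

  # Returns one [type, value] pair per call, [false, false] at end of input.
  def next_token
    @scanner.skip(/\s+/)                      # discard leading whitespace
    return [false, false] if @scanner.eos?

    if (text = @scanner.scan(/\d+/))
      [:NUMBER, text.to_i]
    elsif (text = @scanner.scan(/[A-Za-z_]\w*/))
      [:IDENT, text]
    else
      # Fall back to a single character so that every call consumes
      # input and the scanner cannot stall on unexpected text.
      char = @scanner.getch
      [char, char]
    end
  end
end

lexer = SketchLexer.new("x = 42 + y")
tokens = []
while (pair = lexer.next_token).first
  tokens << pair
end
p tokens
```

The single-character fallback is a design choice worth keeping in any hand-written scanner: a rule set that can match nothing without advancing is one common way to end up in an endless loop.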

For real examples of Racc usage, refer to TMail or RDtool.

TMail (uses #yylex. The lexer is written in Ruby and C)
http://raa.ruby-lang.org/list.rhtml?name=tmail

RDtool (uses #do_parse and #next_token.
The lexer is written in Ruby using strscan)
http://raa.ruby-lang.org/list.rhtml?name=rdtool


Regards,
Minero Aoki
 
Phil Tomson

Tim Gesekus


* Luke A. Kanies said:
> rlex:
> Does not generate code that's valid for 1.8, but the code does run.
> Does not seem to like parsing strings, only files (I get an infinite
> loop)

Had the same problems, but made a patch for rlex. It works for me,
but isn't tested too well. And don't forget to redefine wrap to control
wrapping behavior.

Patch for rlex attached.

HTH Tim

--
"Lately, the only thing keeping me from becoming a serial killer is my distaste
for manual labor."
-- Dilbert
NP: Saints of Eden - Slow Stay (Crushed Remix)

Attachment: rlex.patch

45a46
> @yy_buffer_seen=false
49a51
> @@yy_init = true
247c249
< yy_current_state = yy_get_previous_state ()
---
> yy_current_state = yy_get_previous_state()
317c319
< case @yyin.type.to_s
---
> case @yyin.class.to_s
321a324,327
> if (@yy_buffer_seen)
> input = ""
> end
> @yy_buffer_seen = true
471a478
> @yy_buffer_seen=false
530c537
< if @yyin.type == File && @yyfilename != nil
---
> if @yyin.class == File && @yyfilename != nil
533c540
< @@yyerr.print "`\#\{@yyin.type}' "
---
> @@yyerr.print "`\#\{@yyin.class}' "

 
Minero Aoki

In mail "Re: rlex and ryacc"
> Why not bundle a lexer with racc?

Because strscan comes with ruby 1.8. Maintaining the same package
in two locations (for Racc and for Ruby 1.8) is undesirable
(for me :).


Regards,
Minero Aoki
 
