tuncay said:
Of course, I do know that lex & yacc are the dinosaurs, but when it
comes to friendliness, I would not conclude lex & yacc are your best
friends (first of all, they are hard to debug) but they are as friendly
as C and UNIX goes.
A very apt judgment, I'd say. Each reader can make of that what they
want.
I'll go on record as saying that settling for the
"friendliness" of C and Unix is something many programmers are too eager
to do.
The point is what do you mean by *better*.
Exactly. All I can say is that for *many* purposes, there are better
tools than lex/yacc. If you don't need to generate C, there's probably a
language-specific tool for you that will be more accommodating than
lex/yacc, and there's no reason not to use that. If you're working with
a language that's not conveniently described as an LALR grammar, there
are better tools than lex/yacc. If you find merging grammar rules and
code unmaintainable, there are better tools than lex/yacc.
All that doesn't mean lex/yacc have had their day and are inferior
tools. They are surprisingly effective in a great number of situations.
Shopping around a bit for potentially better solutions might pay off,
though. There's enough software to go around.
What I wonder is in terms of power, is there any lexer/parser doing
sth that lex&yacc cannot do (easily) - ANTLR did not look promising
to me in this context.
Oh, you literally mean parsing power? The answer is that it hardly matters.
Most parser generators are naturally tuned to producing efficient
parsers for programming languages. Any sane programming language can be
described with an LALR(1) grammar -- the kind yacc and many other tools
can parse. (C++ is a notable example of an "insane" language; people who
have tried to produce a pure LALR(1) grammar generally agree that it's
not worth the bother, and you're better off doing special processing on
a restricted grammar.)
In fact, many programming languages can be parsed with less than full
LALR. ANTLR generates so-called "predicated LL(k)" parsers; plain LL(k) is
less powerful than LALR(1), but you'll be hard-pressed to come up with a
language that has a practical LALR(1) grammar but no practical LL(k)
grammar. The theoretical limits of parser generators are real, but
matter surprisingly little in practice. It's mostly about how easy it is
to write things down.
And that depends. LR and LL are different flavors of ice cream; most
people find LL far easier to understand (none of that shift-reduce
hoopla) but LR allows some constructs to be written down more naturally
than LL. And vice versa, of course; it's hard to judge these things
objectively. Plus, each parser generator will have its own set of bells
and whistles that will allow you to write down certain common things
easily. It might have exactly what you need or miss the mark.
If you really need parsers with even *more* power, they exist, but even
when optimized for performance, these tend to be too slow to use for
tasks like compiling. For example, for most languages a complete
annotated syntax tree could be built with a context-sensitive parser,
instead of getting a context-free tree out of a tool like yacc and then
computing the context-sensitive bits. But context-sensitive parsing is
not easy to do efficiently (let alone linearly), and writing
context-sensitive grammars can be tricky compared to the "compute
stuff on a context-free tree" approach, which most programmers find
quite natural to do.
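As a rough illustration of "compute stuff on a context-free tree",
here's a small Python sketch of a declared-before-use check, a classic
context-sensitive property, done as a plain walk over a tree the
parser has already built. The node shapes and the name undeclared_uses
are hypothetical, invented for this example; a real compiler's tree
would look different.

```python
def undeclared_uses(tree):
    """Return the names that are used before being declared.

    Node shapes (hypothetical, for illustration only):
      ("decl", name)            declares a variable
      ("use", name)             reads a variable
      ("block", [statements])   opens a nested scope
    """
    errors = []

    def walk(statements, scope):
        # Copy the scope so declarations inside a block don't leak out.
        scope = set(scope)
        for node in statements:
            kind = node[0]
            if kind == "decl":
                scope.add(node[1])
            elif kind == "use":
                if node[1] not in scope:
                    errors.append(node[1])
            elif kind == "block":
                walk(node[1], scope)
            else:
                raise ValueError(f"unknown node kind: {kind!r}")

    walk(tree, set())
    return errors
```

The grammar stays context-free and simple; the scoping rules live in a
dozen lines of ordinary code instead of in the grammar.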
This is not the place to go into a discussion of the various tradeoffs
of parsers; if you don't know this stuff, I recommend getting a good
book on parsing techniques. One of my favorites is unimaginatively
titled "Parsing Techniques - A Practical Guide" by Grune and Jacobs, but
unfortunately it's currently out of print (I saw a second edition is in
the works, though, which is great). It also contains *way* more
information than you'll need for your average compiler, but I felt like
plugging it.
Try Googling and asking around in comp.compilers. They should have
archives full of discussion on these matters. (Disclaimer: I've never
been there.) There's also an overview page of parser generators on
Wikipedia:
http://en.wikipedia.org/wiki/Compiler-compiler. Look around,
see what you can find.
By the way, no offense by any means, I am not defending these tools,
I am just curious
It's highly unlikely you could say anything that would offend me, and it
certainly wouldn't involve opinions on parsing tools. Curiosity is good.
Bottom line: look beyond lex & yacc for fun and profit sometimes. Even
if you're going to stick with them for the rest of your days, it's nice
to know what else is out there.
S.