Edward C. Jones said:
Not kidding. Nothing can be parsed without a grammar. I think parsing
the standard computer languages is a common need. I am sporadically
developing software to automatically generate Pyrex code for wrapping
C libraries in Python. I use ANTLR because it comes with a good C
grammar.
And then there is HTML. I wonder how Mozilla parses all the ill-formed
HTML that is on the web.
Yes, things can be parsed without a grammar, or at least without a
conventional CFG. Ad hoc parsers are so messy, of course, that we try
to avoid them in modern languages. But I've parsed textual documents
at times with context-sensitive RR(2) approaches and other oddities.
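For what it's worth, here is a tiny sketch (mine, and the little
"key = value" config format in it is made up) of what such an ad hoc,
hand-written parser looks like in Python: no grammar file, no parser
generator, just string handling and a bit of lookahead for
continuation lines.

    def parse_config(text):
        """Parse 'key = value' lines; a trailing backslash continues the value."""
        result = {}
        lines = text.splitlines()
        i = 0
        while i < len(lines):
            line = lines[i].strip()
            i += 1
            if not line or line.startswith('#'):
                continue                      # skip blanks and comments
            key, _, value = line.partition('=')
            # The context-sensitive, ad hoc bit: peek ahead for continuation lines.
            while value.rstrip().endswith('\\') and i < len(lines):
                value = value.rstrip()[:-1].rstrip() + ' ' + lines[i].strip()
                i += 1
            result[key.strip()] = value.strip()
        return result

    print(parse_config("name = parser \\\n  demo\n# a comment\nmode = adhoc"))
    # => {'name': 'parser demo', 'mode': 'adhoc'}

It works, but every rule lives in the control flow rather than in a
grammar, which is exactly why this style gets messy as the format grows.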
The point is that FORTRAN predates a clear understanding of
line-independent lexing and context-free grammars (CFGs). It uses
constructs that are not handled by the classic
scanner/lexer/parser/AST tools. I don't know how the pros handle
this, but when I run into a non-standard grammar, I preprocess to tag
it with additional tokens and then run it through a standard
lexer/parser. Basically a tree-rewriter approach.
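A rough illustration of that pre-tagging idea, using the classic
FORTRAN ambiguity: blanks are insignificant, so "DO 10 I = 1,5" opens
a loop while "DO 10 I = 1.5" assigns to a variable named DO10I. The
regex and the "@DO@" marker token below are just my invention for the
sketch; a real pass would have to cover far more of the language.

    import re

    # Tag genuine DO statements with a synthetic @DO@ token so that an
    # ordinary line-independent lexer can handle everything downstream.
    DO_LOOP = re.compile(r'^\s*DO\s*\d+\s*[A-Z][A-Z0-9]*\s*=\s*[^,]+,',
                         re.IGNORECASE)

    def pre_tag(line):
        """Prefix real DO-loop headers with a synthetic @DO@ marker."""
        if DO_LOOP.match(line):
            return '@DO@ ' + line
        return line

    for stmt in ('      DO 10 I = 1,5', '      DO 10 I = 1.5'):
        print(pre_tag(stmt))
    # The first statement comes out prefixed with @DO@; the second
    # (an assignment to DO10I) is left untouched.

After a pass like this, the lexer only needs to recognize the @DO@
marker, so it can stay simple and line-independent.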
C++ is (I think) classically lexable, but the semantics are so complex
that parsing (or understanding what to do with the parse) is a pain.
I wasn't in that business, but I understand compiler vendors who tried
to just upgrade their C compilers to C++ bombed out and had to start
fresh with a much richer type model. SWIG also ran into this.
For parsing of "bad HTML", see "tidy". Its lexer/parser is ad hoc
(not generated by parser toolkits).
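Not tidy itself, but Python's html.parser module is in the same
spirit: a hand-written, tolerant parser rather than the output of a
parser generator. A quick sketch of feeding it deliberately sloppy
markup (unclosed <li>, unquoted attribute, missing </b>):

    from html.parser import HTMLParser

    class TagLogger(HTMLParser):
        # Just print the events; a real cleaner would rebuild a tree here.
        def handle_starttag(self, tag, attrs):
            print('start', tag, attrs)

        def handle_endtag(self, tag):
            print('end  ', tag)

        def handle_data(self, data):
            if data.strip():
                print('text ', data.strip())

    TagLogger().feed('<ul><li>one<li><b>two</ul><a href=x>link</a>')

It keeps going on the mis-nested and unclosed tags instead of
rejecting them, which is roughly what browsers and tidy have to do.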