Looking for Perl Grammar

Khamis Abuelkomboz

Hi

I'm writing a Perl parser and I'm looking for a pure Perl grammar. The documentation that comes with Perl is
not a pure grammar; it's more a description of the Perl language.

thanks in advance
--
Try Code-Navigator on http://www.codenav.com
a source code navigating, analysis and developing tool.
It supports following languages:
* C/C++
* Java
* .NET (including CSharp, VB.Net and other .NET components)
* Classic Visual Basic
* PHP, HTML, XML, ASP, CSS
* Tcl/Tk,
* Perl
* Python
* SQL,
* m4 Preprocessor
* Cobol
 
Christopher Nehren

Khamis Abuelkomboz said:
I'm writing a Perl parser and I'm looking for a pure Perl grammar. The
documentation that comes with Perl is not a pure grammar; it's more a
description of the Perl language.

[Please wrap your lines at 70-72 characters. Thank you.]

First, a bit of folklore: "Only perl can parse Perl". What this means is
that only perl (the program) is capable of properly and completely
parsing Perl (the language). So why not have the existing Perl parser
(accessible at least via the command perl and possibly via a shared
library) do it for you? If you're doing this from C, you should start
with perldoc perlembed (assuming your documentation is installed, of
course; some Unix vendors think that it's bright to not ship
documentation by default).
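
As a rough sketch of letting perl do the parsing for you, the snippet
below shells out to `perl -c`, which compiles a file without running its
main program. The helper name syntax_ok is made up for illustration, and
the 2>&1 redirection assumes a shell that understands it. Keep in mind
that `perl -c` still executes BEGIN blocks and use statements, so only
feed it code you trust.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: ask the running perl binary whether a piece of
# source code compiles. Returns (1 or 0, text printed by perl -c).
sub syntax_ok {
    my ($code) = @_;

    # Write the code to a temporary file that is removed at exit.
    my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} $code;
    close $fh or die "Cannot close $path: $!";

    # $^X is the path of the perl that is running this script.
    my $perl   = $^X;
    my $report = `"$perl" -c "$path" 2>&1`;

    return ($? == 0 ? 1 : 0, $report);
}

my ($good)      = syntax_ok("print 'hello';\n");
my ($bad, $why) = syntax_ok("print (;\n");

print "good=$good bad=$bad\n";
```

This only tells you whether the code compiles. To get at the parsed
structure itself you would go further, for example through the embedding
interface described in perlembed.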

Second, failing that, you can find the file perly.y in the perl source
kit which contains the grammar for the language.

Best Regards,
Christopher Nehren
 
Anno Siegel

Christopher Nehren said:
I'm writing a Perl parser and I'm looking for a pure Perl grammar. The
documentation that comes with Perl is not a pure grammar; it's more a
description of the Perl language.

[Please wrap your lines at 70-72 characters. Thank you.]

First, a bit of folklore: "Only perl can parse Perl". What this means is
that only perl (the program) is capable of properly and completely
parsing Perl (the language).

It is more than just folklore. For example, you cannot decide if a
bareword is just that, or a function to be called, unless all use
statements up to that point have been executed. That requires a Perl
interpreter.
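
A small sketch of the bareword problem, using string eval so that the
same statement is compiled twice under different conditions. The package
names Demo1 and Demo2 and the sub name greet are invented for this
example. A use statement that imports a sub has the same effect as the
explicit sub definition shown here.

```perl
use strict;
use warnings;

# The same statement, "my $v = greet;", compiled twice via string eval.
# In Demo1 no sub named greet is known yet, so (with strict 'subs'
# switched off) the bareword is taken as the string 'greet'.
my $as_string = eval q{
    package Demo1;
    no strict 'subs';
    no warnings;
    my $v = greet;
    $v;
};

# In Demo2 a sub named greet has already been compiled, so the very
# same token is now a function call.
my $as_call = eval q{
    package Demo2;
    sub greet { 'called' }
    my $v = greet;
    $v;
};

print "$as_string / $as_call\n";
```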

Anno
 
Joe Smith

Khamis said:
I'm writing a Perl parser and I'm looking for a pure Perl grammar.

You won't find it.

My favorite example is this:

#!/usr/bin/perl -l
print time / 2 ; #/; die 'This die() is not executed';
print cos / 2 ; #/; warn 'But this warn() is!';

Deciding whether / is a division operator or the start of m// requires
knowing which functions take arguments and which do not. And if the
program has 'use Module;', determining which user-defined functions
take arguments and which do not requires actually parsing the Module.
You can't do that with a pure grammar.
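
The same effect can be sketched with user-defined subs. The names
takes_none and takes_list are made up for this illustration; the only
difference between them that matters here is the empty prototype on the
first one.

```perl
use strict;
use warnings;

sub takes_none() { 10 }          # empty prototype: never takes arguments
sub takes_list   { scalar @_ }   # no prototype: parsed as a list operator

my @log;
$_ = '';   # the regex in the second case is matched against $_

# After takes_none perl expects an operator, so "/" is division and
# everything from "#" onwards is a comment. The push never happens.
my $first = eval q{ takes_none / 2 ; #/; push @log, 'first'; };

# After takes_list perl expects an argument, so "/ 2 ; #/" is a regex
# and the push that follows it is real code.
my $second = eval q{ takes_list / 2 ; #/; push @log, 'second'; };

print "first=$first log=@log\n";
```

If takes_none and takes_list came from a module, a parser would have to
load that module to learn their prototypes before it could tokenize the
two lines correctly.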
-Joe
 
Khamis Abuelkomboz

Fabian said:
* Khamis Abuelkomboz said:
However I still find clauses like

if ( s{foo}{bar} ) ...
if ( s[foo][var] ) ...
if ( s<foo><bar> ) ...

I still haven't figured out what can be used as "brackets" for the
s|tr|m commands. So I'm looking, as an example, for a grammar describing
how (s, tr, qq) could be built, something like

S: s OP1 expr OP2 expr OP3
OP1: '/' | '{' | '[' | ':' | ...
OP2: '/' | '}{' | '][' | ...
OP3: '/' | '}' | ']' | ':' | ...

I'm still guessing :)


This is mentioned in `perldoc perlop` in the section named "Quote and
Quote-like Operators".

Non-bracketing delimiters use the same character fore and aft, but
the four sorts of brackets (round, angle, square, curly) will all
nest, which means that

q{foo{bar}baz}

is the same as

'foo{bar}baz'

[...]

There can be whitespace between the operator and the quoting
characters, except when # is being used as the quoting character.
q#foo# is parsed as the string foo, while q #foo# is the operator q
followed by a comment. Its argument will be taken from the next
line. This allows you to write:

s {foo} # Replace foo
{bar} # with bar.

So, for s/// you could start with a grammar similar to

S: s WS BRACKETED WS BRACKETED | s WS DELIM expr DELIM expr DELIM
BRACKETED: '(' expr ')' | '<' expr '>' | '[' expr ']' | '{' expr '}'
WS: WHITE-SPACES *
DELIM: '/' | '!' | ':' | '^'

I'm not deeply familiar with grammar descriptions. Is there a way to
ensure that each DELIM is the same char?

Don't forget to add the special behavior of »#« to your grammar. And of
course, do you want to allow comments between the BRACKETED parts as in
the example above? ;-)
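
On the question of forcing each DELIM to be the same character: a plain
grammar rule cannot easily say "the same character as before", but a
hand-written recognizer can, for instance with a regex backreference.
The following is only a sketch. The helper name split_subst is invented,
and it deliberately ignores backslash escapes, trailing flags, the
bracketing forms, and whitespace (and therefore the # special case).

```perl
use strict;
use warnings;

# Hypothetical helper: split "s<d>pattern<d>replacement<d>" where <d> is
# one non-bracketing delimiter character. Returns the delimiter, the
# pattern and the replacement, or an empty list if the text does not fit.
sub split_subst {
    my ($src) = @_;

    # ([^\w\s(\[{<]) captures the delimiter; each \1 then demands exactly
    # that character again.
    return unless $src =~ /\A s ([^\w\s(\[{<]) (.*?) \1 (.*?) \1 \z/xs;
    return ($1, $2, $3);
}

my @ok  = split_subst('s!foo!bar!');   # consistent delimiters
my @bad = split_subst('s/foo!bar/');   # mixed delimiters: no third "/"

print "ok: @ok\n";
print "bad has ", scalar(@bad), " parts\n";
```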

regards,
fabian

Hello!

Thank you all for your input. I still have a question, I found something like:

local $SIG{__DIE__} = sub {};
local $SIG{__WARN__} = sub {};

What does it mean? Does it define two local variables or two elements of an array? What does "sub
{}" mean here?

I thought (my|local|our) declared variables without those nasty brackets!

thanks
khamis

 
Fabian Pilkowski

* Khamis Abuelkomboz said:
Thank you all for your input. I still have a question, I found something like:

local $SIG{__DIE__} = sub {};
local $SIG{__WARN__} = sub {};

What does it mean? Does it define two local variables or two elements
of an array? What does "sub {}" mean here?

In Perl, there is a global hash called %SIG (see `perldoc perlvar` to
learn what it is for). Here, you localize its elements __DIE__ and
__WARN__ (see `perldoc -f local`), so the old values come back when the
enclosing block ends. The »sub {}« expression creates an anonymous sub
and returns a code ref to it. In this case, the body of the sub is
empty, so the handler does nothing.
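
A short sketch of what those two lines do in practice. The sub name
noisy and the variable names are invented for the example. It shows that
an empty __WARN__ handler swallows the warning, that the previous
handler is restored when the block ends, and that an empty __DIE__
handler does not stop the exception itself.

```perl
use strict;
use warnings;

my @captured;
$SIG{__WARN__} = sub { push @captured, $_[0] };   # outer handler records warnings

sub noisy { warn "careful\n"; return 'done' }

{
    local $SIG{__WARN__} = sub {};   # empty handler: the warning is discarded
    noisy();
}
noisy();   # the block has ended, so the outer handler is active again

# An empty __DIE__ handler is called, but the die still propagates.
my $survived = eval { local $SIG{__DIE__} = sub {}; die "boom\n"; 1 };
my $error    = $@;

printf "%d warning(s) captured, sub {} gives a %s ref\n",
    scalar @captured, ref(sub {});
```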

regards,
fabian
 
