When little languages grow...

  • Thread starter Hugh Sasse Staff Elec Eng
  • Start date
C

Clifford Heath

Hugh said:
According to "Crafting a Compiler" ISBN 0-8053-3201-4 it seems that
recursive descent implies LL(1)...

No. LL(1) just means that the recursive descent parser may only consider
the current unprocessed token. If the lexer allows perusal of two
unprocessed tokens, your recursive descent parser is LL(2).
 
M

Mark Probert

Hi ..

Actually, it was the toolset that contained the original ANTLR.
When they decided they needed a Java version, they didn't go
multi-lingual - strange decision for a language research group!
:) I guess that it could have been worse. I notice that someone is working
on a Python generator for ANTLR. I wonder how hard it would be to do a Ruby
version?
Another suite of language tools we used successfully a few years
back was Eli.
That looks interesting, though it does seem a little tooly.
No, I mean syntax extensions - not choosing a language with an
almost complete absence of syntax :). So you could define useful
and appropriate domain-specific-languages...
:)

One of the parts of Forth that I really like is the ability to extend the
language in arbitrary ways. So, if you want to add syntax into the language,
then you are free to do so, and it is still Forth. So, there are guys out
there that have added OOP and Functional extensions to Forth, there is a
parser-generator (Anton Ertl's 'Gray'), a version of Lisp, in-fix extensions,
and so on.

You could argue that every Forth program is an extension into a
domain-specific language. But I wouldn't be so bold as to do that in this
forum ;-)

As a bit of fun, I once embedded a Forth system (ATLAST) into Ruby. It is
kind of fun to extend Ruby that way, though a little heretical.
... Unfortunately the individuals who are
capable of implementing these processes are so adept at linguistics
that they can't explain the processes in terms that we mortals can
understand :)... though I'm sure I almost understood it once... :)
:) Now that is something that I can understand and completely agree with.

Regards,
 
T

Trans

Mark said:
One of the parts of Forth that I really like is the ability to extend the
language in arbitrary ways. So, if you want to add syntax into the language,
then you are free to do so, and it is still Forth. So, there are guys out
there that have added OOP and Functional extensions to Forth, there is a
parser-generator (Anton Ertl's 'Gray'), a version of Lisp, in-fix extensions,
and so on.

You could argue that every Forth program is an extension into a
domain-specific language. But I wouldn't be so bold as to do that in this
forum ;-)

As a bit of fun, I once embedded a Forth system (ATLAST) into Ruby. It is
kind of fun to extend Ruby that way, though a little heretical.

That was you!? Ah, I've read the write-up and I've even downloaded and
installed ATLAST, but I haven't been able to play with it much.
Nonetheless....

I have always had this great love of Forth (despite my only slight use
of it), but I've always felt that it was to cryptic. Recently I've been
toying with a number of language notions, just to see what might come
of it. And I've been starting to feel as if I might actually be able to
get something "realish" together in the not too distant future. And
Forth plays an important role in this. If you're interested I will be
musing over these things on suby-muse (*note not same as suby-ruby). If
you'd like to join-in or just watch, you can subscribe here:
https://lists.berlios.de/mailman/listinfo/suby-muse

T.
 
H

Hugh Sasse Staff Elec Eng

No, I agree. The reason is that the grammar rules are transformed
first into simple rules each containing only *two* items. If you
look into how that transformation is done, you'll know what's
happening for a given grammar. The state is much more obvious then,
because it's defined in terms of these atomic rules.

For example, a rule A ::= B C D gets turned into two rules, either
B_C ::= B C
A ::= B_C D
or
C_D ::= C D
A ::= B C_D

depending on precedence. After processing all rules, there will
normally be a number of duplicates, ambiguities and other problems,

And they'd be much easier to detect. I like this model. Thank you.
so there's about five more stages of checking and massaging until
a parser can be generated. Unfortunately the individuals who are
capable of implementing these processes are so adept at linguistics
that they can't explain the processes in terms that we mortals can

Yes :) And when one considers the complexities possible in human
languages (cases, genders, 'moods'...) the scope for being esoteric
is certainly there...
understand :)... though I'm sure I almost understood it once... :)

Clifford.
Thank you,
Hugh
 
H

Hugh Sasse Staff Elec Eng

No. LL(1) just means that the recursive descent parser may only consider
the current unprocessed token. If the lexer allows perusal of two
unprocessed tokens, your recursive descent parser is LL(2).

Isn't that just re-framing the problem so that all token pairs are
regarded as single entities? Then at the level of what the parser
processes, it can only take one entity at a time? Tokenizing
generally lumps things together anyway, so /(\d+)/ rather than
/(?:(\d)+)/, therefore having the lexer peruse two tokens feels
like a special case.Anyway, I hope I haven't misrepresented what the book says.
Hugh
 
M

Mark Probert

Hi ..

That was you!? Ah, I've read the write-up and I've even downloaded and
installed ATLAST, but I haven't been able to play with it much.
Nonetheless....
:) It was indeed. I ran it up on my FreeBSD box the other day and it core
dumped horribly. ATLAST, in its raw form. I am thinking of reviving it
using the excellent pForth system. I am using pForth for a different project
and really like it (and it is ANS Forth). Again, more for fun that any
serious reason. I like the idea of being able to extend Ruby like this.
I have always had this great love of Forth (despite my only slight use
of it), but I've always felt that it was to cryptic.

Many have described it as the archetypal write-only language. Personally, I
think that it takes a little bit of practice to get your head around the way
it works, then it is fine. There is a web effort to update Brodie's
excellent "Thinking Forth" that may help you with the cryptic. See

http://thinking-forth.sourceforge.net/

If
you'd like to join-in or just watch, you can subscribe here:
https://lists.berlios.de/mailman/listinfo/suby-muse
Thank you, I will.

Regards,
 
S

Shashank Date

Hi Mark,

Mark said:
Many have described it as the archetypal write-only language. Personally, I
think that it takes a little bit of practice to get your head around the way
it works, then it is fine. There is a web effort to update Brodie's
excellent "Thinking Forth" that may help you with the cryptic. See

http://thinking-forth.sourceforge.net/

Were you able to download the PDF books? All my attempts failed :-(

Thanks for the link.

-- shanko
 
M

Mark Probert

Hi ..

Isn't that just re-framing the problem so that all token pairs are
regarded as single entities?
I am not sure if I fully understand but here goes. In parsing, there is the
sense of terminals and non-terminals. For LL(1), the lookahead is one token,
which needs to map to a terminal. For LL(2) we look ahead two tokens and
before checking the rules. To quote from Prof Moessenboeck: "[LL(1) means
that] at any point in the grammar the parser must be able to decide on the
bases of a single lookahead symbol which of several possible alternatives
have to be selected."

For example, the following rule is not LL(1):

State = ident "=" Expr
| ident [ "(" ExprList ")" ] .

Because we can't determine from a single lookahead which of the ident rules to
trigger. With LL(2) we could as it would be determined by the "=" or the
"(".

I hope that this helps.

Regards,
 
M

Mark Probert

Hi ..

Were you able to download the PDF books? All my attempts failed :-(
Yup. Did that this morning. I can send you the file if you contact me
off-list.

Regards,
 
S

Shashank Date

Hi Mark,

Mark said:
Many have described it as the archetypal write-only language. Personally, I
think that it takes a little bit of practice to get your head around the way
it works, then it is fine. There is a web effort to update Brodie's
excellent "Thinking Forth" that may help you with the cryptic. See

http://thinking-forth.sourceforge.net/

Were you able to download the PDF books? All my attempts failed :-(

Thanks for the link.

-- shanko
 
M

Mathieu Bouchard

i'm not sure you'd define them as 'special cases' but tcl can be
quirky if you come from a language like perl:
set x 1

Perl: set(\$x,1);
and to refer to it you use '$' so:
puts $x

the slashed-s $ means "Substitute" in Tcl, whereas it means "Scalar". In
Tcl, $ is used to read variables.
but to increment it you'd do:
incr x
ie. no '$' used.

Perl: incr(\$x);

btw those examples in Perl are using refs-to-scalar, but the actual Tcl
implementation uses strings and a thing called upvar/uplevel to access
other scopes on the stack. This is because Tcl completely lacks any notion
of pointer.
which features is it missing for you?

in two words: Pointers, Objects.
tk is easily one of the quicker ways to crank out a gui for an
application in tcl...

For a long time it was IMHO the quickest way to crank a gui, period, no
but's. I remember starting to learn to code Motif in C back in 1997 (when
there was no Gtk, and Qt was unknown), and a friend showed me Tcl/Tk and
my jaw dropped. Nowadays, there are a lot of possible combination of
high-level languages and gui-toolkits that are quite good and fast to
develop in.

_____________________________________________________________________
Mathieu Bouchard -=- Montréal QC Canada -=- http://artengine.ca/matju
 
H

Hugh Sasse Staff Elec Eng

Hi ..

Isn't that just re-framing the problem so that all token pairs are
regarded as single entities?
I am not sure if I fully understand but here goes. In parsing, there is the [...]

For example, the following rule is not LL(1):

State = ident "=" Expr
| ident [ "(" ExprList ")" ] .
Oh, I was forgetting that ident may be a nonterminal, therefore
composed of n terminals. Hence my above question is rubbish!
Because we can't determine from a single lookahead which of the ident rules to
trigger. With LL(2) we could as it would be determined by the "=" or the
"(".

I hope that this helps.

Thank you.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,166
Messages
2,570,903
Members
47,444
Latest member
Michaeltoyler01

Latest Threads

Top