Pythonic indentation (or: beating a dead horse)

Eleanor McHugh

Exactly true.

I'm offended by anyone trying to make Greg out to be some sort of
willfully and routinely vulgar troll.

His contributions to Ruby have been amazing, and I'm more than
willing to cut him whatever slack he needs.

Maybe some reflection on what might provoke someone such as Greg to
respond like that is more in order.

I felt that way when Tony had his outburst. I can't recall ever seeing
this extremity of reaction in a thread on ruby-talk in the years I've
been subscribed, and it either means that there's something
fundamental about this topic that pushes our buttons or that there was
something about the way it was presented that did likewise. Clearly a
raw nerve has been touched either way.


Ellie

Eleanor McHugh
Games With Brains
http://slides.games-with-brains.net
 
Tony Arcieri

[Note: parts of this message were removed to make it a legal post.]

On Fri, May 29, 2009 at 10:47 PM, Eleanor McHugh wrote:
I felt that way when Tony had his outburst. I can't recall ever seeing this
extremity of reaction in a thread on ruby-talk in the years I've been
subscribed, and it either means that there's something fundamental about
this topic that pushes our buttons or that there was something about the way
it was presented that did likewise. Clearly a raw nerve has been touched
either way.


There's nothing about the topic that pushes buttons for me. I tried to make
a Ruby-like language with a Pythonic lexer handling indentation
sensitivity. I gave up, and in the end I think keeping the "end" token was
the right decision.

But after painstakingly detailing the problems of doing this with a Pythonic
lexer/grammar, and suggesting a Haskell-style lexer/grammar, I was instead
strawmanned with claims that it's "impossible" and a bunch of personal
attacks from J Haas. The guy was basically being a little twat, and all he
had to offer was some script he didn't write that did some very much
informal and complicated regex munging to back up his claims.

It has been interesting to see the alternative formally decidable solutions
proposed here for the lexer, and I'm interested in possibly incorporating
them into Reia. In fact I'd be interested in a better description of the
lexer behavior, although I see it's been released so I suppose I could
always check out the source myself.

That said, restoring indentation sensitivity is rather low on my list of
priorities.
 
Roger Pack

What I mean is..
if endless ruby code is allowed, it cannot be strictly optional. If a
core class or any other class/project that you utilize is coded without
ends, then you get endless ruby whether you want it or not.

I think core classes will take a long time before any would be written
in endless style, so till then it could be optional. Or did I
misunderstand?
That being said, if there were an endless gem then gems that are written
in endless could just depend on it. Thank you to the gems devs.
-=r
 
rzed

It's certainly not impossible, as the script you posted shows.
That script is a hack, however, since it doesn't lex its input
properly and it will fail in obscure cases. I've written a proper
implementation, based on my RubyLexer library.

Basically, it makes the end keyword optional. If you leave off
the end, the preprocessor inserts an end for you at the next line
indented at or below the level of indentation of the line which
started the scope.

End is optional, so you can still write things like this:
begin
  do_something
end until done?
(However, you'd better make damn sure you get the end indented to
the right level!)

If I'm reading this right, given
x.foreach ...
  if ...
    while ...
      do_something
something_else

... would pop an end at the level of the 'while' only. You really
need an end there and at each succeeding dedent level up to the level
of the next statement ('something_else'). Not saying this is
something that should be done, but if it is done, that's what you
need to do.
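
As an illustration of the point above, closing every open level on a dedent can be sketched with a stack of indentation widths. This is a toy, line-based helper (the name add_missing_ends and the OPENERS pattern are made up for this example). It is not endless.rb, it assumes every end has been omitted, and because it matches keywords with a regex instead of tokenizing, it will misfire on strings and comments.

```ruby
# Hypothetical pattern for lines that open a block: a leading keyword,
# or a trailing "do" with optional block parameters.
OPENERS = /\A(?:if|unless|while|until|def|class|module|begin|case)\b|\bdo(?:\s*\|[^|]*\|)?\s*\z/

# Toy preprocessor: emits one "end" for every block left open when the
# indentation drops back to (or below) the level of its opening line.
def add_missing_ends(source)
  open_levels = [] # indentation widths of blocks still waiting for "end"
  out = []
  source.each_line(chomp: true) do |line|
    text = line.strip
    if text.empty?
      out << line
      next
    end
    indent = line[/\A */].size
    # Close every level at or deeper than the current line, innermost first.
    while open_levels.any? && indent <= open_levels.last
      out << (" " * open_levels.pop) + "end"
    end
    out << line
    open_levels << indent if text.match?(OPENERS)
  end
  # Close whatever is still open at end of input.
  out << (" " * open_levels.pop) + "end" until open_levels.empty?
  out.join("\n") + "\n"
end

src = <<~RUBY
  [1, 2].each do |n|
    if n > 1
      while false
        puts n
  puts "done"
RUBY

puts add_missing_ends(src)
```

With this input, three ends are emitted before the final puts line, one per open level, which is the behaviour the post asks for.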
 
Roger Pack

If I'm reading this right, given
x.foreach ...
  if ...
    while ...
      do_something
something_else

... would pop an end at the level of the 'while' only. You really
need an end there and at each succeeding dedent level up to the level
of the next statement ('something_else'). Not saying this is
something that should be done, but if it is done, that's what you
need to do.

Currently that would read
x.foreach ...
  if ...
    while ...
      do_something
    end
  end
end
something_else

I believe.
Which makes me wonder
how does it differentiate between that and
x.foreach ...
  if ...
    while ...
      do_something
.something_else

becoming

x.foreach ...
  if ...
    while ...
      do_something
    end
  end
end.something_else

is the "." special case?
-=r
 
Caleb Clausen

had to offer was some script he didn't write that did some very much
informal and complicated regex munging to back up his claims.

It has been interesting to see the alternative formally decidable solutions
proposed here for the lexer, and I'm interested in possibly incorporating

It's curious to me that you use phrases like 'formally decidable'. I'm
not sure this is really the best description. To my mind, the output
of pyrb.rb was perfectly decidable, and it was equally clear that it
had bugs. Fundamental bugs, that could be chased around but not really
fixed except by complete rewrite. It's a buggy approach because it
doesn't do what you're supposed to do in that kind of automatic
munger: actually tokenize the input stream.

Examples:

begin # :
foo
bar
baz

begin p ":
some more string here"
foo
bar
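
A small sketch of why tokenizing matters for the two examples above, using Ruby's bundled Ripper library. Here naive_block_opener? is a hypothetical stand-in for a regex-style check (not the actual pyrb.rb code), and colon_token_kinds is a helper invented for this example; it only reports which kind of token each colon belongs to.

```ruby
require "ripper"

# Hypothetical regex-style check: treats any line ending in ":" as the
# start of an indentation block.
def naive_block_opener?(line)
  line.match?(/:\s*\z/)
end

# Tokenizer-based view: list the Ripper event type of every token whose
# text contains a colon.
def colon_token_kinds(source)
  Ripper.lex(source)
        .select { |token| token[2].include?(":") }
        .map { |token| token[1] }
end

comment_case = "begin # :\n  foo\nend\n"
string_case  = "begin p \":\nsome more string here\"\n  foo\nend\n"

# The regex is fooled by the first line of both examples.
p naive_block_opener?("begin # :")   # => true
p naive_block_opener?('begin p ":')  # => true

# The lexer shows the colons sit inside a comment and a string literal.
p colon_token_kinds(comment_case)    # => [:on_comment]
p colon_token_kinds(string_case)     # => [:on_tstring_content]
```

A preprocessor that looks at token types can ignore both colons, which is the difference between a regex munger and one that actually tokenizes its input.
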
them into Reia. In fact I'd be interested in a better description of the
lexer behavior, although I see it's been released so I suppose I could
always check out the source myself.

RubyLexer is large and ugly, so I don't envy anyone trying to review
it. It's organized in a straightforward imperative manner, which is
the problem; hand-coded lexers are painful. Also, ruby syntax is very
complicated. If you want to know something more specific about its
internals, please ask me.
 
Caleb Clausen

If I'm reading this right, given
x.foreach ...
  if ...
    while ...
      do_something
something_else

... would pop an end at the level of the 'while' only. You really

I guess you're reading it right, but I didn't write it right.
Endless.rb in fact does operate in the way you want; all three
constructs would be ended. But I see now that my description implies
only one end will be added.

Roger said:
Which makes me wonder
how does it differentiate between that and

x.foreach ...
  if ...
    while ...
      do_something
.something_else

becoming

x.foreach ...
  if ...
    while ...
      do_something
    end
  end
end.something_else

is the "." special case?

Currently, endless.rb gets this case wrong. The input runs through
endless.rb just fine, but the output isn't legal Ruby, because every
end it adds is followed by a semicolon.
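
The difference between the two outputs can be checked with Ripper from the standard library, whose Ripper.sexp returns nil when the source does not parse. The valid_ruby? helper and the sample strings are assumptions for illustration; they use real Ruby in place of the elided x.foreach ... pseudocode.

```ruby
require "ripper"

# Hypothetical helper: true when the source parses as Ruby.
def valid_ruby?(source)
  !Ripper.sexp(source).nil?
end

# What a chained call needs: the method call attached directly to "end".
chained = "[1, 2].each do |n|\n  n\nend.size\n"

# What you get if every inserted "end" is followed by a semicolon.
with_semicolon = "[1, 2].each do |n|\n  n\nend;.size\n"

p valid_ruby?(chained)         # => true
p valid_ruby?(with_semicolon)  # => false
```

The semicolon terminates the statement, so the following .size has no receiver and the parser rejects it.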
 
Michael Shigorin

Greetings, folks. First time poster

Last time I read ruby-talk@ via spambox was the tail of a similar
thread -- first-time poster wondering about python-like
whitespace semantics in Ruby, although *somewhat* less eager
to do his homework IIRC :)
But I digress... the purpose of this post is to talk about one
of the relatively few areas where I think Python beats Ruby,
and that's syntactically significant indentation.

Well, guess you knew [most of] the answers in advance...

I for one do avoid Python for several reasons (thoroughly
as a sorta-developer and somewhat less so as a packager
for ALT Linux distribution).

Number one of them is immaturity of upstream management
(at least relative to its popularity; think PHP).

Number two is that it's not up to people to use machines to push
their image of taste down someone else's throat.

And if I'd *have* to choose between Guido's one and Matz's style
of educating people with development tools, I'd still land here.

Fortunately we do have choice, and there are other languages
borrowing from both Python and Ruby as well.

Regarding "not DRY", but is it better to be dry? Sometimes it's
just time to stop, think a bit and write: "end". And continue. :)
 
Steven Arnold

After listening to this debate for some time, the position of allowing
optional Pythonic indentation seems increasingly persuasive.

First, no one is proposing to eliminate "end". The "end" keyword can
still be used just as it always was. If we adopt J. Haas' proposal
that indentation blocks be demarcated by a colon, as in:

if x:
  foo

...then the preprocessor can run ALL existing Ruby code unchanged,
even if the "end" keywords are misaligned. Further, the colon gives a
visual cue that indentation syntax is being employed. We could
further reduce some of the ambiguity from which Python suffers by, for
example, disallowing tabs as marks of indentation. Spaces only, please.
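
A spaces-only rule is easy to check before any other processing. The following is a minimal sketch, with check_indentation! as a hypothetical helper name; it does not come from any preprocessor mentioned in this thread.

```ruby
# Hypothetical helper: raises if any line uses a tab in its leading
# whitespace, otherwise returns true.
def check_indentation!(source)
  source.each_line.with_index(1) do |line, number|
    leading = line[/\A[ \t]*/]
    if leading.include?("\t")
      raise ArgumentError, "line #{number}: tab used for indentation"
    end
  end
  true
end

check_indentation!("if x:\n  foo\n") # passes, spaces only

begin
  check_indentation!("if x:\n\tfoo\n")
rescue ArgumentError => e
  puts e.message
end
```

Tabs elsewhere on a line (for example inside a string literal) are left alone; only the leading run of whitespace is inspected.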

Second, the idea that this change would be hard to implement has been
mostly rebutted, in my opinion. Caleb's code does the job. Ruby
could be trivially transformed to (a) include the preprocessor, and
(b) run it on Ruby files before processing them -- i.e. before lexing,
parsing, etc. That would be the dead-easy way to implement this
proposal, requiring very little C.

Third, there is no doubt about the objective fact that Ruby code that
removes optional "end" keywords will be shorter, and I think
practically everyone agrees in principle with the idea that terseness
is a virtue unless there is some specific reason for being redundant.

Fourth, I thought the point was quite persuasive that Ruby also allows
semicolons to be entirely optional except in those specific cases on
the same line where they are needed. This behavior and philosophy is
exactly analogous to the current proposal. You could put a semicolon
at the end of every line in Ruby right now, but no one does. Why
not? Because they are redundant, because they are visual noise.

What has not been examined so thoroughly is the set of odd
syntactical limitations that might exist if this proposal were
adopted. For example, I saw someone saying that "end" would not
be allowed inside any block that used indentation. If this were
really necessary, to me, it would be a point against the significant-
whitespace proposal. I don't really see why it would be necessary. I
do see that, for the code to work with Caleb's preprocessor, any "end"
keywords used inside an indented block would themselves need to be
indented properly. But this is another example of a minor syntactical
"gotcha" that could or would pop up. How many other such kinks or
gotchas do you suppose there might be, once we delve more deeply into
actual implementation?

Having said that, I think the main problem boils down to aesthetics.
Some people, myself included, would prefer the whitespace-aware option
rather than typing a string of "end" keywords assuming they are
unnecessary. Others find the "end" keywords to be visually helpful.
To argue against optional removal of "end" is to argue that others
should not be given the choice of removing "end" in their own code.
This position -- denying to others the option of doing things in a way
they prefer -- seems like a hard position to argue for. However, I
think from the viewpoint of people who do not like significant
whitespace, the problem is that they know that if the proposal were
adopted, sooner or later they will encounter code that is formatted in
this way, and they will then be "forced" to deal with code they find
unpleasant. In other words, someone will be forced to deal with code
they find difficult to work with, and those who prefer "end," quite
naturally, do not want to have to be the ones to deal with this.

I think in reality, the imposition on those who prefer "end" will be
minor. All their own code will have "end". The issue will arise in
situations where for some reason a person who prefers "end" is forced
to maintain code that uses indentation. This could be exacerbated by,
e.g., a corporate coding standard that requires indentation syntax
wherever possible. This, I think, is the future that those who like
the "end" keyword are trying to avoid. If the significance of
whitespace were guaranteed never to be an issue that confronts them,
then I think few people would object to the inclusion of optional
indent-aware syntax.

I don't have an answer to this aesthetic problem, and I don't think
one is possible, since taste is taste and one cannot be argued out of
their preferences. However, I would say that allowing people to do
things in different ways is part of the spirit and philosophy of
Ruby. It's not that Ruby seeks to be able to do things in many ways,
but Ruby does make it easy to do things in different ways if those
ways are convenient and desirable to a substantial number of people --
e.g., curly brackets versus do..end, aliases, << versus insert, etc.
Also, making redundant syntax optional, e.g. semicolon, is, IMO,
entirely within the spirit and philosophy of Ruby. It's hard to
understand why Ruby would allow semicolons to be optional but not
"end" keywords.

steven
 
Tony Arcieri

[Note: parts of this message were removed to make it a legal post.]

After listening to this debate for some time, the position of allowing
optional Pythonic indentation seems increasingly persuasive.

That said, "optional Pythonic indentation" seems like a bit of a misnomer in
this case. This approach works a lot more like Haskell where indent blocks
can be used in lieu of explicit delimiting tokens, however it is not
mandatory as is the case in Python.
 
Roger Pack

...then the preprocessor can run ALL existing Ruby code unchanged,
even if the "end" keywords are misaligned.

The preprocess could run all code that had misaligned end keywords,
however, it would break on things like
x = ":
a string!"

Caleb's wouldn't, though (and also doesn't have the added :'s)

The thought would be that code written in endless "wouldn't have
misaligned end keywords."

Note also that Ruby 1.9, if you pass it "-w", already somewhat tracks
alignment, if that's helpful at all.

Further, the colon gives a
visual cue that indentation syntax is being employed. We could
further reduce some of the ambiguity from which Python suffers by, for
example, disallowing tabs as marks of indentation. Spaces only, please.

That's a good idea.

-=r
 
Rimantas Liubertas

Third, there is no doubt about the objective fact that Ruby code that
removes optional "end" keywords will be shorter, and I think practically
everyone agrees in principle with the idea that terseness is a virtue unless
there is some specific reason for being redundant.

No, I don't agree.

Fourth, I thought the point was quite persuasive that Ruby also allows
semicolons to be entirely optional except in those specific cases on the
same line where they are needed. This behavior and philosophy is exactly
analogous to the current proposal. You could put a semicolon at the end of
every line in Ruby right now, but no one does. Why not? Because they are
redundant, because they are visual noise.

Tracking newlines is easy, tracking indentation is not.

Having said that, I think the main problem boils down to aesthetics.

No. It boils down to the mess this introduces.
There is Python for those who want Python.

However, I think from the viewpoint of people who do
not like significant whitespace, the problem is that they know that if the
proposal were adopted, sooner or later they will encounter code that is
formatted in this way, and they will then be "forced" to deal with code they
find unpleasant. In other words, someone will be forced to deal with code
they find difficult to work with, and those who prefer "end," quite
naturally, do not want to have to be the ones to deal with this.

That's correct.

I think in reality, the imposition on those who prefer "end" will be minor.

I think otherwise.

I don't have an answer to this aesthetic problem, and I don't think one is
possible, since taste is taste and one cannot be argued out of their
preferences. However, I would say that allowing people to do things in
different ways is part of the spirit and philosophy of Ruby.

Yes, please. There are whole languages for that.

It's hard to understand why Ruby would allow semicolons to be optional but
not "end" keywords.

It is very easy to understand.

Regards,
Rimantas
 
Steven Arnold

No, I don't agree.

I'm not sure which of these propositions you're disagreeing with. I
will assume it's the second, since the first seems (to me) to be
virtually unassailable.

On that point, I said "practically everyone" rather than "everyone" to
account for those people who, like you, may not agree that concision
is a virtue. Note that I also added the disclaimer "unless there is
some specific reason for being redundant." One such reason that
advocates of "end" have put forward is clarity, and I agree that
clarity is a good reason to be less concise. The difference of
opinion lies in the question of whether the "ends" make code more or
less clear. My own feeling is they can sometimes, but for the most
part they just take up space.

There is Python for those who want Python.

This is a commonly used but, in my opinion, fallacious argument that is
primarily intended to evoke the idea that those who support this
proposal are somehow traitors to Ruby, or not really Rubyists. It's
the "love-it-or-leave-it" argument. Any time any language change is
proposed, one could use the same argument: there is
<language_that_contains_feature_x> for those who want feature x. I
want the Ruby language, but I think it might well be improved by
making a change.

That's correct.

Then we agree on the main point. The nub of opposition is not
technical, but aesthetic dislike of indent-aware syntax, which causes
a fear of having to maintain such code in the future.

steven
 
James Britt

Steven said:
Then we agree on the main point. The nub of opposition is not
technical, but aesthetic dislike of indent-aware syntax, which causes a
fear of having to maintain such code in the future.

And vice-versa, the current angst in people having to maintain code with
'end', and their fear that they will have to continue doing so in the
future if they cannot get 'end' dropped.

Who knew that programming was so scary?

More significant is that there are some people who may find endless Ruby
more aesthetic, yet still harder to actually work with, and so prefer to
have the extra token.

That is, it is not merely (or even) aesthetics, but, at least for some
people, an empirical problem in easily processing significant indentation.

Plus there are concerns with editor support (code folding, start/end
block navigation, auto-formatting, ease of cut-n-paste, etc.).

The nubs are many.


--
James Britt

www.jamesbritt.com - Playing with Better Toys
www.ruby-doc.org - Ruby Help & Documentation
www.rubystuff.com - The Ruby Store for Ruby Stuff
www.neurogami.com - Smart application development
 
Michael Bruschkewitz

James Britt said:
Who knew that programming was so scary?


When you start writing life-critical software, you will become scared
sometimes.

More significant is that there are some people who may find endless Ruby
more aesthetic, yet still harder to actually work with, and so prefer to
have the extra token.

That is, it is not merely (or even) aesthetics, but, at least for some
people, an empirical problem in easily processing significant indentation.

More Ruby people will probably find "endless" less "aesthetic".
(Is this really a proper English word? It's really un-aesthetic.)

Even bad "aesthetics" will spoil efficiency and therefore waste
resources. That's the point where it wastes money or, at least, the
lifetime of developers, who would be better off playing with their
children than with whitespace, tabs or linefeeds.
 
Michael Bruschkewitz

"Michael Bruschkewitz" <[email protected]>
wrote in message:
Even bad "aesthetics" will spoil efficiency and therefore waste
resources. That's the point where it wastes money or, at least, the
lifetime of developers, who would be better off playing with their
children than with whitespace, tabs or linefeeds.

P.S. ... or creating new ones! (children)
 
Mark Kremer

Since we're in the realm of the subjective, I don't understand why you
and others are fighting so hard against having the _option_ of
syntactic indentation. Why are you so gung-ho on forcing your own
subjective interpretation of the "right balance" of useless redundancy
on everyone else?

My best guess (and personal reason to be against such an option) is that
it would add complexity to programming and maintaining Ruby code without
actually providing new functionality (it would only add a new syntax
option).

An additional argument could be that Ruby works based on the principle
of least surprise (as Matz intended, please correct me if I am mistaken)
and that having an "end" to end a block of code (or to use {}) is not
very surprising as lots of languages work similarly.

I have not taken a look at Python myself, so I do not know how easy or
hard it would be to program blocks of code based on indentation. The idea
does not seem very appealing to me. Even though I always take great care
in properly indenting my code, I know lots of other programmers don't and
I would not be very happy if I had to track down their indentation errors
all over the place.
 
