quality of error messages

Joachim Wuttke

Thank you, Brian.
This is quite a convincing example: I count no fewer than eight
places where the "end" could be missing.
Just out of curiosity: why didn't Ruby borrow indentation semantics
from Python?
Regards, Joachim
 
Brian Candler

Thank you, Brian.
This is quite a convincing example: I count not less than eight
places where the "end" could be missing.
Just for curiosity: why didn't Ruby borrow indentation semantics
from Python ?

Probably because such forced indentation is extremely annoying in practice
:)

I suppose Ruby could make finding these problems much easier if there were a
way of distinguishing a method end, and a class/module end, from a block
end:

class Foo
  def m1
    ...
  enddef # ?
  ...
endclass # ?

Being forced to write that could be annoying too. But actually, just having
a way to test at compile time whether you're outside a method call, or
outside a class definition, would be good enough for me:

class Foo
  def m1
    "foo"
  end
  method_boundary   <-- raise error if inside 'def'
  def m2
    "bar"
  end
  method_boundary
  ...
end
class_boundary      <-- raise error if inside Class or Module
class Bar
  ...
  method_boundary
end
class_boundary

Sprinkling a few "method_boundary" and "class_boundary" statements at
strategic points in a file would easily localise the problems. I wonder if
there is a clever way to implement this directly in Ruby?

But I still think a simpler and better solution would be an auto-indenter.
It has to be a fairly complete Ruby parser, but all it would output is each
original input line, with preceding white-space stripped and replaced with
n*(nesting depth) spaces/tabs. In other words, I don't want it to split or
join any existing lines, just re-indent the lines which are there.
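A rough sketch of such a re-indenter is shown here. It is only an approximation under stated assumptions: instead of the fairly complete parser described above, it guesses the nesting depth from keywords at the start of a line (plus a trailing `do`), so here-docs, statement modifiers and multi-line expressions will fool it. The names `reindent` and the `REINDENT_*` constants are hypothetical.

```ruby
# Hypothetical keyword heuristics; a real tool would need a proper Ruby parser.
REINDENT_OPENERS = /\A(class|module|def|if|unless|while|until|case|begin|for)\b/
REINDENT_DO      = /\bdo(\s*\|[^|]*\|)?\s*\z/
REINDENT_MIDDLE  = /\A(else|elsif|when|rescue|ensure)\b/
REINDENT_CLOSER  = /\Aend\b/

# Re-indent each line to width * nesting depth, never splitting or joining lines.
def reindent(source, width = 2)
  depth = 0
  source.lines.map { |line|
    text = line.strip
    next "\n" if text.empty?
    depth -= 1 if text =~ REINDENT_CLOSER && depth > 0
    # else/elsif/when/rescue/ensure sit one level out from the body they split.
    shown = text =~ REINDENT_MIDDLE ? [depth - 1, 0].max : depth
    # A line that opens a construct and also ends in 'end' is treated as closed.
    opens = (text =~ REINDENT_OPENERS || text =~ REINDENT_DO) && text !~ /\bend\z/
    depth += 1 if opens
    (" " * (width * shown)) + text + "\n"
  }.join
end
```

With a missing `end`, the output simply never returns to column 0, and the first line that looks too deep is a good place to start looking.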

Regards,

Brian.
 
Gavin Sinclair

I suppose Ruby could make finding these problems much easier if there were a
way of distinguishing a method end, and a class/module end, from a block
end:
class Foo
def m1
...
enddef # ?
...
endclass # ?

Yeah, but that's ugly :)
Being forced to write that could be annoying too. But actually, just having
a way to test at compile time whether you're outside a method call, or
outside a class definition, would be good enough for me:
class Foo
def m1
"foo"
end
method_boundary <-- raise error if inside 'def'

method_boundary := "raise unless self == Foo"

or (more general)

method_boundary := "raise unless Class === self"
Sprinkling a few "method_boundary" and "class_boundary" statements at
strategic points in a file would easily localise the problems. I wonder if
there is a clever way to implement this directly in Ruby?

To localise the problem, I use % in Vim to jump between "end" and the
corresponding "class", "if", "def", ... statement. Doesn't take long
to find the sucker, usually.

Gavin
 
Brian Candler

method_boundary := "raise unless self == Foo"

or (more general)

method_boundary := "raise unless Class === self"

That doesn't work, because if it occurs within a method definition, it
doesn't get executed; it just gets compiled as part of the method.

class Foo
  def m1
    if 1 == 2
      puts "Bad"
  end
  raise unless Class === self   # nothing happens
end
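The difference between the two contexts can be seen in a small, well-formed example. This is just an illustrative sketch; the constants `CHECKPOINT_LOG` and `SEEN_AFTER_DEFINITION` are made up for the demonstration.

```ruby
CHECKPOINT_LOG = []

class Foo
  CHECKPOINT_LOG << :class_body    # executed while the class body is evaluated
  def m1
    CHECKPOINT_LOG << :method_body # only compiled here; runs when m1 is called
  end
end

# Snapshot taken after the class is defined but before m1 is ever called.
SEEN_AFTER_DEFINITION = CHECKPOINT_LOG.dup
Foo.new.m1
```

After the class definition only `:class_body` has been recorded; `:method_body` appears only once `m1` is actually called, which is why a check that accidentally lands inside a `def` stays silent.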

Regards,

Brian.
 
Gavin Sinclair

That doesn't work, because if it occurs within a method definition, it
doesn't get executed; it just gets compiled as part of the method.
class Foo
def m1
if 1 == 2
puts "Bad"
end
raise unless Class === self # nothing happens
end

Try

puts 'foo'

then. If it's in a class, it will print it. If it's in a method, it
won't. Good enough for debugging?

Gavin
 
Brian Candler

Try

puts 'foo'

then. If it's in a class, it will print it. If it's in a method, it
won't. Good enough for debugging?

Hmm: puts "OK at #{__LINE__}" if Class === self

Well, this statement won't be executed until the whole file has been
successfully parsed first. So if you take an example like this:

---- 8< ----
#!/usr/local/bin/ruby -w

def checkpoint(x)
  puts "OK(#{x}) at #{caller[0]}"
end

checkpoint(:Foo)
class Foo
  checkpoint(:bar)
  def bar(x)
    if x == 3
      puts "hello"
  end
  checkpoint(:baz)
  def baz(y)
  end
  checkpoint("end of Foo")
end
---- 8< ----
$ ruby ert.rb
ert.rb:18: syntax error

Then you have to keep adding 'end' statements until it parses successfully.
You then get:

OK(Foo) at ert.rb:7
OK(bar) at ert.rb:9

at which point you can cross-reference to each of the checkpoint lines in
the code, until you find which checkpoint(s) are missing from the output.

I guess it's feasible... but it would be much better if it could work the
other way round, and show you where the error was, not where the good lines
were :)

Cheers,

Brian.
 
Mark Hubbart

Now there's an RCR I'd support. Heck, I'd even volunteer to code
it!

That's been hashed and rehashed here before... :) I'm guessing it's
unlikely. However, just because whitespace (except newlines) is not
syntactically relevant as far as ruby is concerned, doesn't mean it's
not syntactically relevant as far as the programmers are concerned.

What if we had a new warning level that checked for proper indentation
levels? By default, it could check any common indentation methods, so
one could use tabs, two, three, or four spaces, whatever; a strict mode
might check for standard library compatible code formatting.

So you might get this kind of warning:

------ script.rb ------

#!/usr/local/bin/ruby -W3

class Foo
  def initialize
    puts "Hello World"
end

class Bar
  def initialize
    puts "Hello Bar"
  end
end

% ./script.rb
script.rb:6: warning: inconsistent indentation level.
script.rb:13: parse error

This should make it easy to find such errors.

cheers,
Mark
 
Brian Schröder

Mark said:
That's been hashed and rehashed here before... :) I'm guessing it's
unlikely. However, just because whitespace (except newlines) is not
syntactically relevant as far as ruby is concerned, doesn't mean it's
not syntactically relevant as far as the programmers are concerned.

What if we had a new warning level that checked for proper indentation
levels? By default, it could check any common indentation methods, so
one could use tabs, two, three, or four spaces, whatever; a strict mode
might check for standard library compatible code formatting.

So you might get this kind of warning:

------ script.rb ------

#!/usr/local/bin/ruby -W3

class Foo
  def initialize
    puts "Hello World"
end

class Bar
  def initialize
    puts "Hello Bar"
  end
end

% ./script.rb
script.rb:6: warning: inconsistent indentation level.
script.rb:13: parse error

This should make it easy to find such errors.

cheers,
Mark

That seems like a nice idea to me. This would be nearly the same as the
auto-indenter, except that it does not indent but spits out warnings.

In the course I'm giving right now I saw that "missing ends" problems
are the most common problem for the beginners.
(Especially since they had a python course last week ;)

Maybe someone who is writing a ruby-parser will write this and the auto
indenter as a test-script for his parser ;)

Regards,

Brian
 
Austin Ziegler

Now there's an RCR I'd support. Heck, I'd even volunteer to code
it!

I think you'll find more people who don't want such an abomination.

Python's indentation is the number one thing that prevents me from
even considering using that language, because it forces me to work in
a stupid manner (e.g., *its* manner), rather than adapting to my
manner.

-austin
 
Charles Mills

I think you'll find more people who don't want such an abomination.

Python's indentation is the number one thing that prevents me from
even considering using that language, because it forces me to work in
a stupid manner (e.g., *its* manner), rather than adapting to my
manner.

I feel the same way as Austin here.
Although there have been a few times where I have had to break a script
up into smaller tmp files to figure out where the unclosed
block/if/.../ started. A separate utility to check the indentation
would be useful (IMHO), but I don't think it really belongs in the main
parser (or in Ruby at all).

-Charlie
 
Mark Hubbart

I feel the same way as Austin here.
Although there have been a few times where I have had to break a script
up into smaller tmp files to figure out where the unclosed
block/if/.../ started. A separate utility to check the indentation
would be useful (IMHO), but I don't think it really belongs in the main
parser (or in Ruby at all).

At first I was thinking that it should be a separate program; but then
I was thinking about how unhelpful the error messages that are
generated by the "missing end" error are. Integrating the indentation
check into the parser would allow for the automatic detection of that
type of error. Since it should only check for *consistent* indentation
levels, it should be flexible enough to allow people to use it
constantly.

cheers,
Mark
 
trans. (T. Onoma)

19, Joachim Wuttke wrote:
| > > Thank you, Brian.
| > > This is quite a convincing example: I count not less than eight
| > > places where the "end" could be missing.
| > > Just for curiosity: why didn't Ruby borrow indentation semantics
| > > from Python ?
| >
| > Now there's an RCR I'd support. Heck, I'd even volunteer to code
| > it!
|
| I think you'll find more people who don't want such an abomination.
|
| Python's indentation is the number one thing that prevents me from
| even considering using that language, because it forces me to work in
| a stupid manner (e.g., *its* manner), rather than adapting to my
| manner.

Probably it would be "bestest" if Ruby had used { ... } and/or do ... end in
all places.

class Foo {

}

def bar(baz) {

}

if x == 32 do

end

etc.

I bet parsing would be tad faster too.

T.
 
Yukihiro Matsumoto

Hi,

In message "Re: quality of error messages"

|> % ./script.rb
|> script.rb:6: warning: inconsistent indentation level.
|> script.rb:13: parse error
|>
|> This should make it easy to find such errors.

|That seems like a nice idea to me. This would be nearly the same as the
|autoindenter so, only that it does not indent but spit out warnings.

I'm afraid that it might cause tab-space indentation war like in the
Python community. The issue is much smaller though, since it is not
mandatory.

matz.
 
markus

I think you'll find more people who don't want such an abomination.

Python's indentation is the number one thing that prevents me from
even considering using that language, because it forces me to work in
a stupid manner (e.g., *its* manner), rather than adapting to my
manner.

*laugh* I suppose the reason I like the abomination is I learned to
indent back in the days of salient structure (before the matching-pairs
abomination took over the world). To me (and by extension, all right
thinking people), python's indentation seems perfectly natural. I suppose
it's all a matter of what abomination you grew up with.
That said, it really doesn't hurt as bad as you'd think it would,
since the main thing people differ on is where to put the closing token
(e.g., does the "end" or "}" go at the indentation of the enclosed block
or the enclosing block) and that issue vanishes if you don't have closing
tokens. I suspect that (if warned) most Rubyists would find:

class My_class
    attr_reader :an_attribute
    def initialize a1,a2
        @an_attribute = a1
        @counter = 0
        @limit = a2
    def feed_the_lion food
        raise "Roar" unless food.respond_to? :digestible_by
        raise "Grrr" unless food.digestible_by self
        digest food

waffle = My_class.new :fred,14

perfectly readable (and writable) without all the syntactic clutter/noise
required when you _don't_ pay attention to white space. Further, it's
well established that when there is discord between them people
(including long time lispers) will "read the indentation" and ignore the
punctuation regardless, so it reduces a significant source of bugs to
have the language look at the same features of the source code as the
person did. About the only measured downside (IIRC) is that it increases
the error rate when people are typing in code from a printed source. That
doesn't happen nearly as much as it used to.

BTW, I am _not_ trying to push this change; I'm just trying to
explain why I would not object to it, and suspect that many others would
find it more palatable than they might think.

-- Markus
 
Brian Schröder

Yukihiro said:
Hi,

In message "Re: quality of error messages"

|> % ./script.rb
|> script.rb:6: warning: inconsistent indentation level.
|> script.rb:13: parse error
|>
|> This should make it easy to find such errors.

|That seems like a nice idea to me. This would be nearly the same as the
|autoindenter so, only that it does not indent but spit out warnings.

I'm afraid that it might cause tab-space indentation war like in the
Python community. The issue is much smaller though, since it is not
mandatory.

matz.
Therefore I like the idea of using a high warning level or a separate
program for this task. By now I normally
- mark everything
- M-x indent-region
- Look where something is weird.

If a program could guide where I look that would be nice. Therefore also
an external program or an xemacs plugin would be nice to me.

And just to clarify: This is in no way urgent or very important to me, I
just liked the idea ;)

Regards,

Brian
 
Gavin Sinclair

Therefore I like the idea of using a high warning level or a separate
program for this task. By now I normally
- mark everything
- M-x indent-region
- Look where something is weird.

The unfortunate thing about this is that editor-based indenters never
seem to work perfectly. e.g. what to do with here-docs? Ruby is a
difficult language to indent. Consequently I tend to use the feature
locally, not on the whole file.
If a program could guide where I look that would be nice. Therefore also
an external program or an xemacs plugin would be nice to me.

An external program would be terrific, just because it could be used
independently, and also in any editor.

Gavin
 
Brian Schröder

Gavin said:
The unfortunate thing about this is that editor-based indenters never
seem to work perfectly. e.g. what to do with here-docs? Ruby is a
difficult language to indent. Consequently I tend to use the feature
locally, not on the whole file.

I've got to say I'm quite impressed with xemacs indentation. (Thanks
matz). The only things (but those are really hard to do) are:

indentations when escaping from a string:
<<EOT
blah #{indent do
this
end}
EOT

Indentation of lines ending with an operator:
line 1 +
line 2 +
line 3

does not work, so I always use
line 1 + \
line 2 + \
line 3

An external program would be terrific, just because it could be used
independently, and also in any editor.

And in any case in an external ruby script perfect indentation would be
easier to implement than in elisp.

regards,

Brian
 
Brian Candler

|> % ./script.rb
|> script.rb:6: warning: inconsistent indentation level.
|> script.rb:13: parse error
|>
|> This should make it easy to find such errors.

|That seems like a nice idea to me. This would be nearly the same as the
|autoindenter so, only that it does not indent but spit out warnings.

I'm afraid that it might cause tab-space indentation war like in the
Python community. The issue is much smaller though, since it is not
mandatory.

I think that you don't have to enforce any particular indentation style or
amount of space on each line - only that it is consistent between begin and
end.

If we define the 'nesting depth' as the number of module / class / def / do
/ if sections we are within (i.e. the number of matching 'end's we expect to
see), then:

- at the start of each line, count the number of spaces. Ignore lines which
consist entirely of whitespace.

R1: if the nesting depth is the same as the previous line, then raise a
warning if the number of spaces is not the same as the previous line

R2: if the nesting depth is greater than the previous line, then remember
the indentation of this line associated with this nesting depth (e.g. on a
stack)

R3: if the nesting depth is less than the previous line, then raise a
warning if the number of spaces is not the same as the last line with the
same nesting depth

# hello          []     R1: check indentation == 0
class Foo        []     R1: check indentation == 0
  def m          [2]    R2: push 2
      wibble     [2,6]  R2: push 6
      bibble     [2,6]  R1: check indentation == 6
  end            [2]    R3: check indentation == 2
  def            [2]    R1: check indentation == 2
  end            [2]    R1: check indentation == 2
end              []     R3: check indentation == 0

Some details left out, but you get the idea.
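A minimal sketch of rules R1 to R3 might look like the following. It rests on several assumptions that are not part of the description above: nesting depth is guessed from keywords at the start of a line (or a trailing `do`) rather than taken from the parser, an `end` line is judged at the depth of the construct it closes, only leading spaces are counted, and R1 is simplified to compare against the indentation remembered for the depth instead of the previous line. The names `indentation_warnings` and `INDENT_OPENER` are hypothetical.

```ruby
# Hypothetical opener heuristic: a keyword at line start, or a trailing `do`.
INDENT_OPENER = /\A(class|module|def|if|unless|while|until|case|begin|for)\b|\bdo(\s*\|[^|]*\|)?\s*\z/

def indentation_warnings(source)
  warnings   = []
  remembered = []   # remembered[d] = indentation first seen at nesting depth d
  depth      = 0    # number of 'end's currently expected
  prev_depth = nil
  source.each_line.with_index(1) do |line, lineno|
    text = line.strip
    next if text.empty?                      # ignore whitespace-only lines
    indent = line[/\A */].size               # leading spaces only
    # An 'end' is judged at the depth of the construct it closes.
    line_depth = text =~ /\Aend\b/ ? [depth - 1, 0].max : depth
    if prev_depth.nil? || line_depth > prev_depth
      remembered[line_depth] = indent        # R2: remember this depth's indentation
    else
      remembered = remembered[0..line_depth] # R3: forget deeper levels
      if indent != remembered[line_depth]    # R1/R3: must match the remembered value
        warnings << "line #{lineno}: inconsistent indentation " \
                    "(#{indent} spaces, expected #{remembered[line_depth]})"
      end
    end
    prev_depth = line_depth
    depth = line_depth + (text =~ INDENT_OPENER && text !~ /\bend\z/ ? 1 : 0)
  end
  warnings
end
```

Run over a file where the `end` of an `if` has been forgotten, the first warning lands on the `end` that was meant to close the enclosing `def`, which is usually within a few lines of the real mistake.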

When comparing the number of spaces at the start of a line, I'd count tabs
as moving to the next 8th column, as is usual in most cases, so that
<spc><spc><tab><spc><spc> on one line matches 10 spaces on another line. If
people don't like that rule, it still doesn't matter; they just need to be
consistent in their use of either tabs or spaces.

e.g. if you're using 4-space tabs then

<tab><spc><spc>
will not match
<spc><spc><spc><spc><spc><spc>

but if you stick to one or the other, then they will match.
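The tab rule can be sketched as a small helper that converts leading whitespace into a column count. This is only an illustration; the name `indent_columns` and the `tab_stop` parameter are assumptions, with 8 as the default described above.

```ruby
# Width of the leading whitespace, with each tab advancing to the next
# multiple of tab_stop columns (8 by default).
def indent_columns(line, tab_stop = 8)
  col = 0
  line.each_char do |ch|
    case ch
    when " "  then col += 1
    when "\t" then col += tab_stop - (col % tab_stop)
    else break                      # first non-whitespace character ends the indent
    end
  end
  col
end
```

Under the 8-column rule, `indent_columns("  \t  x")` and `indent_columns("          x")` both give 10, while a tab followed by two spaces gives 10 and six spaces give 6, so the 4-space-tab example does not match unless the file sticks to one style.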

It is common for me to take a bit of code and wrap it with 'module foo' /
'end' or 'if false' / 'end' without re-indenting the code inside. This
system will still work for me:

if false       []
class Foo      [0]    i.e. nesting depth 1 has indentation 0
  def m1       [0,2]
  end          [0,2]
end            [0]
end            []

One other thing, you'd need to ignore subsequent lines of multi-line
statements:
a,b,c =
  1,2,3
foo (
  ...
)

but I guess the parser must already have a mechanism which infers "the next
line is a continuation of the previous" anyway.

Regards,

Brian.
 
Markus

I think that you don't have to enforce any particular indentation style or
amount of space on each line - only that it is consistent between begin and
end.

If we define the 'nesting depth' as the number of module / class / def / do
/ if sections we are within (i.e. the number of matching 'end's we expect to
see), then:

- at the start of each line, count the number of spaces. Ignore lines which
consist entirely of whitespace.

R1: if the nesting depth is the same as the previous line, then raise a
warning if the number of spaces is not the same as the previous line

R2: if the nesting depth is greater than the previous line, then remember
the indentation of this line associated with this nesting depth (e.g. on a
stack)

R3: if the nesting depth is less than the previous line, then raise a
warning if the number of spaces is not the same as the last line with the
same nesting depth

*ROFL* Yes! What you have described here is salient structure
(well, missing, IIRC, the line continuation rule and the tab-size
rule). It's a wonderful system, and everybody thinks they use it. But
back in the early nineteen eighties or so someone who didn't understand
the system decided it would be easier to "find pairs" if the closing
token were indented as if it were a member of the enclosing block.
Thus, the algorithm as you stated it would complain
# hello          []     R1: check indentation == 0
class Foo        []     R1: check indentation == 0
  def m          [2]    R2: push 2
      wibble     [2,6]  R2: push 6
      bibble     [2,6]  R1: check indentation == 6
  end            [2]    R3: check indentation == 2
  ^

at the point marked "^" because (at the start of that line) the nesting
is the same as it was for the previous two lines. The same problem
occurs throughout the example. There are, as always, at least two ways
to fix this.

One would be to implement the rules as you stated them, adding the
half-step rule (indentations intermediate between the two are used for
syntactic continuation, e.g. else or your chained addition example). I
would love this, since that's the style I use (I was already out of
school before the sea change) but I suspect most if not all of you would
hate it.

Another way would be to do what python did and avoid the problem by
eliminating closing tokens. I suspect more people would like this, but
matz and the majority (so two majorities? yikes!?) would be against
them.

A third solution would be to do what I suggested a few hours ago,
the essence of which was to look not for the closing but the occurrence
of something "outer" (< indented, not <= indented), and only telling
about the most recent one when the problem is detected:

There may be heuristics to get a reasonable "hint" by making
some assumptions; e.g., warn if there is a line less indented
than the first line of an outstanding (open) construct,
excluding here-docs, %_{ constructs, etc., if (and only if)
there is a missing end at eof [or when the error is detected].
This could (I think) be implemented fairly easily by

* caching the location and indentation of each class,
def, etc. on a stack

* popping from the stack on end

* noting when the first token is lexed from a line if it
was less indented than the most recent outstanding
def/class, etc., and if so noting the fact in a global

* including the information in the global (if any) when
generating the missing end message

But this is only a heuristic, based on the observation that even
people who don't like salient structure tend to use it to some
extent. It would not solve the problem in general, and perhaps
not even in a typical case, for anyone but me and the python
expatriates.
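As a rough illustration of that heuristic (not how the Ruby parser would actually do it), the four steps can be sketched over plain source lines. The assumptions are loud ones: openers are recognised by keyword at the start of a line or a trailing `do`, here-docs and `%` literals are not excluded, and the "global" is just a local variable. `missing_end_hint` and `HINT_OPENER` are hypothetical names.

```ruby
# Hypothetical opener heuristic: a keyword at line start, or a trailing `do`.
HINT_OPENER = /\A(class|module|def|if|unless|while|until|case|begin|for)\b|\bdo(\s*\|[^|]*\|)?\s*\z/

def missing_end_hint(source)
  outstanding = []   # open constructs: { text:, line:, indent: }
  suspect     = nil  # most recent "outer" line seen while something was open
  source.each_line.with_index(1) do |line, lineno|
    text = line.strip
    next if text.empty?
    indent = line[/\A */].size
    top = outstanding.last
    # Strictly less indented (<, not <=) than the innermost open construct?
    suspect = { line: lineno, opener: top } if top && indent < top[:indent]
    if text =~ /\Aend\b/
      outstanding.pop
    elsif text =~ HINT_OPENER
      outstanding.push(text: text, line: lineno, indent: indent)
    end
  end
  return nil if outstanding.empty?           # every construct was closed
  message = "missing 'end' at end of file"
  if suspect
    message += "; line #{suspect[:line]} is less indented than " \
               "'#{suspect[:opener][:text]}' opened at line #{suspect[:opener][:line]}"
  end
  message
end
```

The sketch reports only the most recent hit, as described above. Changing the assignment to `suspect ||=` would keep the first hit instead, which in some files points closer to the construct that was never closed.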

This is still the best of the options I've heard or thought of.
Instead of using a stack (as we both suggested) it might be easier to
use the parser's tree, which (IIRC) has adequate support for all sorts
of tagging.
One other thing, you'd need to ignore subsequent lines of multi-line
statements:
a,b,c =
1,2,3
foo (
...
)

but I guess the parser must already have a mechanism which infers "the next
line is a continuation of the previous" anyway.

Yes, it does. I'm not sure I grok its fullness, but it is there
and would be accessible. It wouldn't be needed in the soft-checking
third option (or at least, I don't think it would be needed), but it
wouldn't be too hard to watch for if it was.

-- Markus
 
Charles Hixson

*laugh* I suppose the reason I like the abomination is I learned to
indent back in the days of salient structure (before the matching-pairs
abomination took over the world). To me (and by extension, all right
thinking people), python's indentation seems perfectly natural. I suppose
it's all a matter of what abomination you grew up with.
That said, it really doesn't hurt as bad as you'd think it would,
since the main thing people differ on is where to put the closing token
(e.g., does the "end" or "}" go at the indentation of the enclosed block
or the enclosing block) and that issue vanishes if you don't have closing
tokens. I suspect that (if warned) most Rubyists would find:

class My_class
    attr_reader :an_attribute
    def initialize a1,a2
        @an_attribute = a1
        @counter = 0
        @limit = a2
    def feed_the_lion food
        raise "Roar" unless food.respond_to? :digestible_by
        raise "Grrr" unless food.digestible_by self
        digest food

waffle = My_class.new :fred,14

perfectly readable (and writable) without all the syntactic clutter/noise
required when you _don't_ pay attention to white space. Further, it's
well established that when there is discord between them people
(including long time lispers) will "read the indentation" and ignore the
punctuation regardless, so it reduces a significant source of bugs to
have the language look at the same features of the source code as the
person did. About the only measured downside (IIRC) is that it increases
the error rate when people are typing in code from a printed source. That
doesn't happen nearly as much as it used to.

BTW, I am _not_ trying to push this change; I'm just trying to
explain why I would not object to it, and suspect that many others would
find it more palatable than they might think.

-- Markus
For that sample, you are correct, but when blocks get a bit more complex
it fails. Miserably in my opinion. Even when programming in Python I
typically add an end statement, e.g.

class My_class:
    def __init__(self, a1,a2):
        self.an_attribute = a1
        self.counter = 0
        self.limit = a2
    #end __init__

    def feed_the_lion(self, food):
        raise "Roar" unless food.respond_to? :digestible_by
        raise "Grrr" unless food.digestible_by self
        digest food
    #end feed_the_lion
#end class My_class
waffle = My_class.new :fred,14

Note this is an incomplete conversion. Also note that I don't just use
end statements, I label them. (Also note that I couldn't decide whether
or not waffle was a member of My_class just by looking at it. It
required noticing that it was an instantiation of the class, which I
couldn't remember how to do in Python.)

I guess that I got my block conventions from Ada. (Fortran was
rather....vague about them, except for not before column 6 and not after
column 72.)

My personal convention about when to use {} and when to use do..end is:
If it's within a single line, use {}. Otherwise tend to prefer do..end
This isn't a strong convention, but I try to not depend on relative
binding power (partially because I can never remember the exact
precedence order).
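The binding difference between the two block forms can be seen in a small sketch. The helper `report` is hypothetical and exists only to show which call ends up receiving the block.

```ruby
# Hypothetical helper: reports the class of its argument and whether
# it was itself handed a block.
def report(arg)
  { arg_class: arg.class, got_block: block_given? }
end

list = [1, 2, 3]

# Braces bind to the nearest call, so map gets the block and report gets an Array.
with_braces = report list.map { |x| x * 2 }

# do..end binds to the outermost call, so report gets the block
# and map, called without one, returns an Enumerator.
with_do_end = report list.map do |x|
  x * 2
end
```

Here `with_braces` records an `Array` argument and no block, while `with_do_end` records an `Enumerator` argument and a block. Writing the parentheses explicitly, as in `report(list.map { ... })`, avoids having to remember which way it goes.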
 
