Consider this python code (lines numbered for exposition):
Thanks for going through the trouble of posting an elaborate example.
1 def dump(st):
2 mode, ino, dev, nlink, uid, gid, size, atime, mtime, ctime = st
3 print "- size:", size, "bytes"
4 print "- owner:", uid, gid
5 print "- created:", time.ctime(ctime)
6 print "- last accessed:", time.ctime(atime)
7 print "- last modified:", time.ctime(mtime)
8 print "- mode:", oct(mode)
9 print "- inode/dev:", ino, dev
10
11 def index(directory):
12 # like os.listdir, but traverses directory trees
13 stack = [directory]
14 files = []
15 while stack:
16 directory = stack.pop()
17 for file in os.listdir(directory):
18 fullname = os.path.join(directory, file)
19 files.append(fullname)
20 if os.path.isdir(fullname) and not os.path.islink(fullname):
21 stack.append(fullname)
22 return files
[...]
But let us consider cutting lines 6 and 7 and putting them
between lines 21 and 22. We get this:
15 while stack:
16 directory = stack.pop()
17 for file in os.listdir(directory):
18 fullname = os.path.join(directory, file)
19 files.append(fullname)
20 if os.path.isdir(fullname) and not os.path.islink(fullname):
21 stack.append(fullname)
6 print "- last accessed:", time.ctime(atime)
7 print "- last modified:", time.ctime(mtime)
22 return files
But it is unclear whether the intent was to be outside the while,
or outside the for, or part of the if.
I don't think so. Before pasting you just move your cursor to #1 #2 or #3 to
achieve the respective results:
15 while stack:
16 directory = stack.pop()
17 for file in os.listdir(directory):
18 fullname = os.path.join(directory, file)
19 files.append(fullname)
20 if os.path.isdir(fullname) and not os.path.islink(fullname):
21 stack.append(fullname) #1 #2 #3
22 return files
What's wrong with that? (If nothing, I'll finally hack it up in elisp,
unless someone already has done it).
[snipped]
Now consider this `pseudo-equivalent' parenthesized code:
1 (def dump (st)
2 (destructuring-bind (mode ino dev nlink uid gid size atime mtime ctime) st
3 (print "- size:" size "bytes")
4 (print "- owner:" uid gid)
5 (print "- created:" (time.ctime ctime))
6 (print "- last accessed:" (time.ctime atime))
7 (print "- last modified:" (time.ctime mtime))
8 (print "- mode:" (oct mode))
9 (print "- inode/dev:" ino dev)))
10
11 (def index (directory)
12 ;; like os.listdir, but traverses directory trees
13 (let ((stack directory)
14 (files '()))
15 (while stack
16 (setq directory (stack-pop))
17 (dolist (file (os-listdir directory))
18 (let ((fullname (os-path-join directory file)))
19 (push fullname files)
20 (if (and (os-path-isdir fullname) (not (os-path-islink fullname)))
21 (push fullname stack)))))
22 files))
If we cut lines 6 and 7 with the intent of inserting them
in the vicinity of line 21, we have several options (as in python),
but rather than insert them incorrectly and then fix them, we have
the option of inserting them into the correct place to begin with.
In the line `(push fullname stack)))))', there are several close
parens that indicate the closing of the WHILE, DOLIST, LET, and IF,
assuming we wanted to include the lines in the DOLIST, but not
in the LET or IF, we'd insert here:
V
21 (push fullname stack))) ))
AFAICT this is a more difficult editing operation to get right than what I
suggested above to get the desired effect for python and you still have to
reindent.
The resulting code is ugly: [...]
But it is correct.
Correct as in "has the desired behavior" is not the only thing that counts.
(Incidentally inserting at that point is easy: you move the cursor over
the parens until the matching one at the beginning of the DOLIST begins
to blink. At this point, you know that you are at the same syntactic level
as the dolist.)
Sure, it's not very taxing, but I think still slightly more so than placing
the cursor at #1 #2 or #3 for the python example (you need to invoke extra
functionality, even if this functionality is quite convinient and you need to
devote visual attention to an extra site).
Let me expand on this point. The lines I cut are very similar to each
other, and very different from the lines where I placed them. But
suppose they were not, and I had ended up with this:
19 files.append(fullname)
20 if os.path.isdir(fullname) and not os.path.islink(fullname):
21 stack.append(fullname)
6 print "- last accessed:", time.ctime(atime)
7 print "- last modified:", time.ctime(mtime)
22 print "- copacetic"
23 return files
Now you can see that lines 6 and 7 ought to be re-indented, but line 22
should not. It would be rather easy to either accidentally group line seven
with line 22, or conversely line 22 with line 7.
True, but as I stated above AFAICT there is no need to paste-and fix, you can
just paste correctly (Digression for the emacs-interested: a really
emacs-savvy user will paste and fix in a pretty safe fashion anyway, because
the source of error above would be to inadvertently select a (wrong) region
for reindentation. If you just reindent (with C-c> or C-c<) right after
pasting (the pasted code will automatically form the region then) you don't
have this problem).
Forgetting to indent properly in a lisp program does not yield
erroneous code.
I think this highlights an implicit but mistaken assumption underlying
anti WS arguments.
No, of course forgetting to indent does not *directly* yield erroneous code.
That's not good enough, however because it is associated with 2 problems:
1) masking an error: if you perform an operation that can go wrong and
frequently does (such as issuing an editing command) then enforcing manual
verification of the desired outcome (read C-M-\) is a source of error.
2) misleading later readers (including the programmer himself, esp. while he's
editing): the bad indentation might well suggest an altnernative meaning
from the actually intended one that goes unnoticed till someone reindents
the code -- again a source of errors.
This is correct. But what is recommended here is to use a simple tool to
enhance readability and do a trivial syntactic check.
I think "enhance readability" is more than an understatement. Lisp code of
reasonable complexity is simply unreadable if not or arbitrarily indented
(even more so than XML!).
Would that this were the case. Lisp code that is poorly indented will still
run.
Python code that is poorly indented will not. I have seen people write lisp
code like this:
(defun factorial (x)
(if (> x 0)
x
(*
(factorial (- x 1))
x
)))
I still tell them to re-indent it. A beginner writing python in this manner
would be unable to make the code run.
But the fact that you can't do this in python (and BTW I've never seen a
*python* newbie TRY -- have you?) is a FEATURE, for crying out loud.
Or what exactly do you think the benefit of making it easier for beginners to
write error-prone and unreadable code to be? Why is it a paedagogical
disadvantageous for python beginners to automatically write readable code that
does what they think it does (at least as far as block structure is concerned)
compared to lisp newbies who either often write code that doesn't do what they
(and other readers) think it does (because of indentation/paren mismatch) or
that is inpenetrable (just think about how many hours these people have wasted
trying to decipher and debug their code before you gave them the fatherly
advice to try to indent it)?
Ok. For any sort of semantic error (one in which a statement is
associated with an incorrect group) one could make in python, there is
an analagous one in lisp, and vice versa. This is simply because both
have unambiguous parse trees.
However, there is a class of *syntactic* error that is possible in
python, but is not possible in lisp (or C or any language with
balanced delimiters). Moreover, this class of error is common,
frequently encountered during editing, and it cannot be detected
mechanically.
This argument basically boils down to "lisp is more redundant and therefore
less error prone". This inself is not a valid argument as it depends on the
type of redundancy and the extend to which this redundancy itself causes
errors, e.g. by rendering the code more obscure by additional verbosity (as in
xml compared to sexps); whether this redundancy helps or hinders perception
and how this redundancy interferes with the editing process.
As an example of such interference in the case of lisp consider the example of
commenting/deleteing/appending after a line with surplus trailing parens
(corresponding ot opening parens in earlier lines). Conceptually this line is
the same as all the other lines in the code in the same block, but you have to
use quite different (and more complicated and error-prone) commands to achieve
the same editing process. Not so in python.
Consider this thought experiment: pick a character (like parenthesis
for example) go to a random line in a lisp file and insert four of them.
This thougt experiment seems of limited informational value. While I can see
how you might accidentally indent 4 spaces by pressing tab (which you'd
normally notice) accidently, inserting 4 spaces or 4 parens (by pressing space
or ')') seems highly unlikely to me.
Additionally most of the time the code is actively being edited and thus *not*
in a consistent state -- this is how most editing errors occur and I don't
think lisp compares favourably to python here.
This is largely due to the fact that the relationship between suggested
meaning and actual meaning of the code is automatically maintained in python
and only semi-automatically by emacs/lisp. As an example consider introducing
additional local vars by adding a let-block. This is not quite equivalent to
the same example in python (which is trivial, editing wise, baring potential
for local name-conflicts), but close enough and (unless you've plenty at
routine at editing lisp code) it's pretty easy to screw things up here (paren
matching makes it easy to match with visually exposed outer parents (viz.
corresponding to e.g. "(LET"'), but I find it's still pretty easy to end up
with wrong parens inside, viz. the var forms/body) -- the more code you write
before reindenting everything the higher the likelyhood something went wrong,
although the total number of parens will be correct.
Anyway, back to your thought-experiment:
Is the result syntactically correct? No. Could a naive user find them?
Trivially. Could I program Emacs to find them? Sure.
Now go to a random line in a python file and insert four spaces. Is
Uhm where? A) Anywhere in the line? Since only *leading whitespace* is
significant this wouldn't alter the meaning (safe in strings and identifiers
-- as in CL). So I assume you mean B) at the beginning of the line, in which
case:
the result syntactically correct?
Likely.
Depends on what you mean by "likely". A necessary condition for the result to
be syntactically valid and semantically different is that the preceding
(non-comment) line has the same indent (which occurs in about of 10% of the
lines in a representative sample [1]).
Could a naive user find them?
Unlikely.
Uhm, actually extremely likely because 90% of the cases in B) yield invalid
syntax, which both python and your editor can easily figure out for the naive
user without even running the code.
Could you write a program to find them? No.
Sure I could, but writing one that performs better then say 95% would involve
an amount of work unwarranted by the unrealistic scenario.
> Delete four adjacent parens in a Lisp file. Will it still compile? No.
Will it even be parsable? No.
Delete four adjacent spaces in a Python file. Will it still compile?
Likely.
See above.
^^^^^^^
[thinko; should have been Gricean]
I thought that whitespace was significant to Python.
Only human readable whitespace [2].
My computer does not display whitespace.
Mine sure does.
if bar:
foo
and
if bar:
foo
look quite different on my computer. Since this (leading ws) is the only type
of whitespace that semtantically matters to python, I still fail to see your
point.
I understand that most computers do
not. There are few fonts that have glyphs at the space character.
Since having the correct amount of whitespace is *vital* to the
correct operation of a Python program, it seems that the task of
maintaining it is made that munch more difficult because it is only
conspicuous by its absence.
No idea what you're driving at -- maybe you've got some wrong assumptions
about the details of python's syntax?
Sussman is careful to separate the equations of classical mechanics
from the *implementation* of those equations in the computer, the
I really don't want to push this point too hard, but why not use sexps for
both if sexp-readability was near optimal (Iverson seems to have done the
equivlanent for his mathematical books using APL/J)?
former are written using a functional mathematical notation similar to
that used by Spivak, the latter in Scheme. The two appendixes give
the details. Sussman, however, notes ``For very complicated
expressions the prefix notation of Scheme is often better''
That statement seems to suggest to me that in the general case MN is
preferable.
And where did I claim that? You originally stated:
Well I didn't explicitly say you did, but you contested the claim requoted
below about ')))))))))' (I'm still not quite sure why) and stated that Abelson
and Sussman would disagree with my assesments of parens and whitespace, so
this interpretation seemed not entirely implausible to me.
Still, I'm sure you're familiar with the following quote (with which I most
heartily agree):
"[P]rograms must be written for people to read, and only incidentally for
machines to execute."
People can't "read" '))))))))'.
Quoting Sussman and Abelson as a prelude to stating that parenthesis are
unreadable is hardly going to be convincing to anyone.
I didn't say parens are generally unreadable, as the quote above shows I said
that people can't read *large numbers of trailing parens*, which is quite
different (I would also have had to contradict myself otherwise in stating
that properly formated lisp reads well)?
The reason why I brought up this quote is because you were trashing python's
indentation based syntax (which, remember, I didn't claim was suitable or
"better" than sexps for lisp or similar expression based languages) as an
illustration that python's syntax at least compares favourably to sexps in an
area that is of regarded as very important by prominent members of the
lisp/scheme community.
Look, I *like* lisp syntax (and I'm even happy with you to claim it's better
than python's, as long as I can dissuade you from claiming that it's terribly
error-prone, unless you've actually tried writing some reasonable amount of
python yourself).
But clearly a syntax were some vital information is not so much there for
people to read but for editors and compilers is not in full conformance with
the above goal (if you really disagree with this, again why alias '[]' in
pretty much all schemes if nested '()'s are just as readable?).
In python you don't need the editor to semi-automatically put comments in your
code (which is what pressing 'M-C-\' amounts to) to render it intelligible.
Of course it does matter. We were talking about readability by *humans*.
I know that emacs has no trouble "reading" 7 trailing parenthesis (i.e.
extracting the relevant semantic information from them, namely what they
delimit), but I doubt you (or other humans) can (without some prosthetic aid
in the form of brain implants or manually invoked emacs commands).
You are presupposing *two* errors of two different kinds here: the
accidental inclusion of an extra parenthesis after bar *and* the
accidental omission of a parenthesis after watz.
The kind of error I am talking about with Python code is a single
error of either omission or inclusion.
In my book this is a *single* error of one kind: accidental misplacement of a
parenthesis (sure happens to me as an *atomic* operation). Does this never
happen to you (maybe you use more effective editing strategies, M-( etc.)?
I could hardly care less.
Well, anyway I'll leave it at that before this one completely degrades, too. I
hope at least the discussion of editing errors and strategies will have had
some value.
'as
Footnotes
[1] [I really get to write more than my fair share of perl in this thread ;|]
perl -ne '/^ */; $lines++; $canIndent++ if (length($last)
== length($&)+4);$last = $&; END{print "LOC:$lines CANINDENT:$canIndent\n";
print "ratio:" . $canIndent/$lines . "\n"}' /usr/local/lib/python2.3/*.py
LOC:76264 CANINDENT:9307
ratio:0.122036609671667
This is an upper boundary, BTW. Not all these lines would semantically
change if reindented.
[2] modulo #\Tab which we can include from the current discussion because its
use is discouraged and the wart that python still allows use of tabs is not
really relevant to evaluating the merrits of the use of identation for
block-structure indication as such.