The rap against "while True:" loops

Paul Rubin

Standard Python idiom:

try:
    d[key] += value
except KeyError:
    d[key] = value

Maybe you need to re-think "appropriate".

But more recent style prefers:

d = collections.defaultdict(int)
...
d[key] += value
 
Aahz

Aahz said:
Standard Python idiom:

try:
    d[key] += value
except KeyError:
    d[key] = value

Maybe you need to re-think "appropriate".

But more recent style prefers:

d = collections.defaultdict(int)
...
d[key] += value

That was a trivial example; non-trivial examples not addressed by
defaultdict are left as an exercise for the reader.
--
Aahz <*> http://www.pythoncraft.com/

"To me vi is Zen. To use vi is to practice zen. Every command is a
koan. Profound to the user, unintelligible to the uninitiated. You
discover truth everytime you use it." (e-mail address removed)
 
Paul Rubin

d = collections.defaultdict(int)
...
d[key] += value

That was a trivial example; non-trivial examples not addressed by
defaultdict are left as an exercise for the reader.

Even in the "nontrivial" examples, I think avoiding the exception is
stylistically preferable:

d[key] = value + d.get(key, 0)

It might be worth handling the exception in an inner loop where you
want to avoid the cost of looking up key in the dictionary twice, but
even that requires profiling to be sure.
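
For comparison, here is a minimal sketch of the three idioms discussed so far, applied to the same made-up data. The `pairs` list and the function names are illustrative assumptions, not from the thread, and the code is written for Python 3. All three should build the same totals; which one is fastest is a separate question that only profiling can answer.

```python
import collections

# Hypothetical sample data: (key, value) pairs to be summed per key.
pairs = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]

def sum_with_try(pairs):
    """Ask forgiveness: handle the first sighting of a key via KeyError."""
    d = {}
    for key, value in pairs:
        try:
            d[key] += value
        except KeyError:
            d[key] = value
    return d

def sum_with_get(pairs):
    """Avoid the exception: dict.get supplies 0 for a missing key."""
    d = {}
    for key, value in pairs:
        d[key] = value + d.get(key, 0)
    return d

def sum_with_defaultdict(pairs):
    """Let defaultdict(int) create the missing entry as int() == 0."""
    d = collections.defaultdict(int)
    for key, value in pairs:
        d[key] += value
    return dict(d)

totals = sum_with_try(pairs)
```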
 
Terry Reedy

Paul said:
Standard Python idiom:

try:
    d[key] += value
except KeyError:
    d[key] = value

Maybe you need to re-think "appropriate".

But more recent style prefers:

d = collections.defaultdict(int)
...
d[key] += value

Yes, the motivation was to reduce 4 lines to 1 line for a common use
case, and not because of any sense of 'inappropriateness'.

tjr
 
Terry Reedy

Steven said:
I don't think it's *only* the performance thing, it's also clarity. The
understood meaning of throwing an exception is to say "something
happened that shouldn't have". If one uses it when something has
happened that *should* have, because it happens to have the right
behaviour (even if the overhead doesn't matter), then one is
misrepresenting the program logic.

No, you have a fundamental misunderstanding. They're called exceptions,
not errors, because they represent exceptional cases. Often errors are
exceptional cases, but they're not the only sort of exceptional case.

Python uses exceptions for flow control: e.g. for-loops swallow
StopIteration or IndexError to indicate the end of the loop. In the
context of a for-loop, StopIteration or IndexError doesn't represent an
error. It doesn't represent an unexpected case. It represents an
expected, but exceptional (special) case: we expect that most sequences
are finite, and it is normal to eventually reach the end of the sequence,
after which the loop must change behaviour.

Similarly, it's hardly an *error* for [1, 2, 3].index(5) to fail -- who
is to say that the list is supposed to have 5 in it? ValueError (a
slightly misleading name in this situation) is used to indicate an
exceptional, but not unexpected, occurrence.

Likewise, KeyboardInterrupt is used to allow the user to halt processing;
SystemExit is used to shut down the Python virtual machine; and warnings
are implemented using exceptions. There may be others among the built-ins
and standard library, but even if there aren't, there is plenty of
precedent for us to do the same.
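
The for-loop point can be made concrete with a rough sketch of what a for statement does under the hood. `manual_for` and `Countdown` are hypothetical names made up for this illustration (Python 3 syntax); the real loop is implemented in the interpreter, not like this.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            # Expected, not an error: the iterator is simply exhausted.
            break
        body(item)

class Countdown:
    """Old-style sequence: only __getitem__, signals the end with IndexError."""
    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return 3 - index

seen = []
manual_for([1, 2, 3], seen.append)

countdown_seen = []
# iter() wraps the __getitem__ protocol and turns IndexError into StopIteration.
manual_for(Countdown(), countdown_seen.append)
```

In both cases the exception is the normal way the loop learns that it is done.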

Nicely put. Programmers are exceptional people, but not erroneous, in
spite of nerd stereotypes ;-).

tjr
 
Paul Rubin

Terry Reedy said:
d[key] += value

Yes, the motivation was to reduce 4 lines to 1 line for a common use
case, and not because of any sense of 'inappropriateness'.

Reducing 4 confusing lines to 1 clear one is almost always appropriate.
 
Steven D'Aprano

Terry Reedy said:
d[key] += value

Yes, the motivation was to reduce 4 lines to 1 line for a common use
case, and not because of any sense of 'inappropriateness'.

Reducing 4 confusing lines to 1 clear one is almost always appropriate.

This is true, but there is nothing confusing about "Asking for
forgiveness is better than asking for permission".

For the record, the four lines Paul implies are "confusing" are:

try:
    d[key] += value
except KeyError:
    d[key] = value

Paul, if that confuses you, perhaps you should consider a change of
career. *wink*

On the other hand:

d = collections.defaultdict(int)
d[key] += value

is confusing, at least to me. I would expect the argument to defaultdict
to be the value used as a default, not a callable. In other words, I
would expect the above to try adding the type `int` to the integer
`value` and fail, and wonder why it wasn't written as:

d = collections.defaultdict(0)
d[key] += value

Having thought about it, I can see why defaultdict is done that way,
instead of this:

class MyDefaultDict(dict):
    def __init__(self, default=None):
        self._default = default
    def __getitem__(self, key):
        if key in self:
            # Call the base class directly; self[key] would recurse forever.
            return dict.__getitem__(self, key)
        else:
            return self._default

And here's why this doesn't work too well:

>>> d = MyDefaultDict([])
>>> d['a'] = [1,2]
>>> d['b'] = [3,4,5]
>>> d
{'a': [1, 2], 'b': [3, 4, 5]}
>>> d['c'] += [6,7]
>>> d
{'a': [1, 2], 'c': [6, 7], 'b': [3, 4, 5]}

So far so good. But wait:

>>> d['d'] += [8]
>>> d
{'a': [1, 2], 'c': [6, 7, 8], 'b': [3, 4, 5], 'd': [6, 7, 8]}

Oops. So even though it's initially surprising and even confusing, the
API for collections.defaultdict is functionally more useful.
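
By contrast, a short sketch (Python 3) of how the real collections.defaultdict avoids that aliasing: the factory is called afresh for each missing key, so no two keys share a default object. The variable names are illustrative only.

```python
import collections

d = collections.defaultdict(list)   # list() is called once per missing key
d['c'] += [6, 7]
d['d'] += [8]

# Each key got its own list, so updating one does not affect the other.
independent = d['c'] is not d['d']

# Passing a plain value instead of a callable is rejected up front.
try:
    collections.defaultdict(0)
    rejected = False
except TypeError:
    rejected = True
```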
 
Paul Rubin

Steven D'Aprano said:
For the record, the four lines Paul implies are "confusing" are:

try:
    d[key] += value
except KeyError:
    d[key] = value

Consider what happens if the computation of "key" or "value" itself
raises KeyError.
 
Steven D'Aprano

Steven D'Aprano said:
For the record, the four lines Paul implies are "confusing" are:

try:
    d[key] += value
except KeyError:
    d[key] = value

Consider what happens if the computation of "key" or "value" itself
raises KeyError.

How does using a defaultdict for d save you from that problem?

table = {101: 'x', 202: 'y'}
data = {'a': 1, 'b': 2}
d = collections.defaultdict(int)
d[table[303]] += data['c']


It may not be appropriate to turn table and data into defaultdicts --
there may not be a legitimate default you can use, and the key-lookup
failure may be a fatal error. So defaultdict doesn't solve your problem.

If you need to distinguish between multiple expressions that could raise
exceptions, you can't use a single try to wrap them all. If you need to
make that distinction, then the following is no good:

try:
    key = keytable[s]
    value = datatable[t]
    d[key] += value
except KeyError:
    print "An exception occurred somewhere"

But if you need to treat all three possible KeyErrors identically, then
the above is a perfectly good solution.
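
If the distinction does matter, one possible shape is to give each lookup its own try block. This is only a sketch in Python 3: `keytable`, `datatable` and `accumulate` are hypothetical stand-ins with made-up contents, and returning a message string is just one way to report which lookup failed.

```python
import collections

# Hypothetical lookup tables, invented for this illustration.
keytable = {"s1": "apples"}
datatable = {"t1": 3}
d = collections.defaultdict(int)

def accumulate(s, t):
    """Add datatable[t] to d[keytable[s]], reporting which lookup failed."""
    try:
        key = keytable[s]
    except KeyError:
        return "unknown key source: %r" % (s,)
    try:
        value = datatable[t]
    except KeyError:
        return "unknown value source: %r" % (t,)
    # No try needed here: the defaultdict cannot raise KeyError on +=.
    d[key] += value
    return "ok"

first = accumulate("s1", "t1")
bad_key = accumulate("zz", "t1")
bad_value = accumulate("s1", "zz")
```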
 
Lie Ryan

Paul said:
Steven D'Aprano said:
For the record, the four lines Paul implies are "confusing" are:

try:
    d[key] += value
except KeyError:
    d[key] = value

Consider what happens if the computation of "key" or "value" itself
raises KeyError.

Aren't key and value just simple variables/names? Why should they ever
raise KeyError? The only other errors that the try-block code could ever
possibly throw are NameError and possibly MemoryError.
 
Paul Rubin

Hendrik van Rooyen said:
Standard Python idiom:

if key in d:
    d[key] += value
else:
    d[key] = value

The issue is that it uses two lookups. If that's ok, the more usual idiom is:

d[key] = value + d.get(key, 0)
 
Steven D'Aprano

The whole reason for the mechanism, across all languages that have it,
is to deal with situations that you don't know how to deal with locally.

That confuses me. If I call:

y = mydict[x]

how does my knowledge of what to do if x is not a key relate to whether
the language raises an exception, returns an error code, dumps core, or
prints "He's not the Messiah, he's a very naughty boy" to stderr?

You seem to be making a distinction of *intent* which, as far as I can
tell, doesn't actually exist. What's the difference in intent between
these?

y = mydict[x]
if y == KeyErrorCode:
    handle_error_condition()
process(y)


and this?

try:
    y = mydict[x]
except KeyError:
    handle_error_condition()
process(y)


Neither assumes more or less knowledge of what to do in
handle_error_condition(). Neither case assumes that the failure of x to
be a key is an error:

try:
    y = mydict[x]
except KeyError:
    process() # working as expected
else:
    print "found x in dict, it shouldn't be there"
    sys.exit()

Either way, whether the language uses error codes or exceptions, the
decision of what to do in an exceptional situation is left to the user.

If you'll excuse me pointing out the bleedin' obvious, there are
differences between error codes and exceptions, but they aren't one of
intention. Error codes put the onus on the caller to check the code after
every single call which might fail (or have a buggy program), while
exceptions use a framework that does most of the heavy lifting.

That's why they have the overhead that they do.

Exceptions don't have one common overhead across all languages that use
them. They have different overhead in different languages -- they're very
heavyweight in C++ and Java, but lightweight in Python. The Perl
Exception::Base module claims to be lightweight. The overhead of
exceptions is related to the implementation of the language.

Yes, and in some cases I think that's a serious language wart. Not
enough to put me off the language, but a serious wart nevertheless.

I disagree with that. I think exceptions are a beautiful and powerful way
of dealing with flow control, much better than returning a special code,
and much better than having to check some status function or global
variable, as so often happens in C. They're more restricted, and
therefore safer, than goto. They're not a panacea but they're very useful.

Similarly, it's hardly an *error* for [1, 2, 3].index(5) to fail -- who
is to say that the list is supposed to have 5 in it? ValueError (a
slightly misleading name in this situation) is used to indicate an
exceptional, but not unexpected, occurrence.

That one is, I think, a legitimate use of an exception. The result
returned by index is defined if the index is in bounds. If not, index
doesn't know whether it was supposed to be in bounds or not, and so
can't handle the case locally. It could suggest an error or merely
(IMHO) poor programming. Because index cannot know what the proper
action is, an exception is the appropriate response.

I think you're confused about what list.index(obj) does. You seem to me
to be assuming that [1,2,3].index(5) should return the item in position 5
of the list, and since there isn't one (5 is out of bounds), raise an
exception. But that's not what it does. It searches the list and returns
the position at which 5 is found.

Of course list.index() could have returned an error code instead, like
str.find() does. But str also has an index() method, which raises an
exception -- when handling strings, you can Look Before You Leap or Ask
For Forgiveness Instead Of Permission, whichever you prefer.
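
A small sketch of the two string styles side by side, plus list.index for comparison (Python 3; the sample text is made up):

```python
text = "spam and eggs"

# Look Before You Leap: str.find() reports "not found" as -1,
# and the caller must remember to check for it.
pos = text.find("ham")
found_with_find = pos != -1

# Ask forgiveness: str.index() raises ValueError instead.
try:
    text.index("ham")
    found_with_index = True
except ValueError:
    found_with_index = False

# list.index() searches for a value and returns its position;
# it only comes in the exception-raising flavour.
position_of_2 = [1, 2, 3].index(2)
try:
    position_of_5 = [1, 2, 3].index(5)
except ValueError:
    position_of_5 = None
```

One practical difference: a forgotten check on the -1 from find() fails silently, because -1 is itself a valid index (text[-1] is the last character), whereas an unhandled ValueError is hard to miss.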

Again, I think it's fair to treat a program being killed from outside as
an exception as far as the program is concerned.

No, it's not being killed from outside the program -- it's being
*interrupted* from *inside* the program by the user. What you do in
response to that interrupt is up to you -- it doesn't necessarily mean
"kill the program".

If you kill the program from outside, using (say) kill or the TaskManager
or something, you don't necessarily get an exception. With kill -9 on
POSIX systems you won't get anything, because the OS will just yank the
carpet out from under your program's feet and then drop a safe on it to
be sure.
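
A minimal sketch of the "what you do in response is up to you" point (Python 3). `run` and `flaky` are hypothetical names, and the interrupt is simulated by raising KeyboardInterrupt directly rather than by a real Ctrl-C.

```python
def run(work_items, process):
    """Process items until done or interrupted; keep the partial results."""
    done = []
    try:
        for item in work_items:
            done.append(process(item))
    except KeyboardInterrupt:
        # The program decides what the interrupt means. Here it means
        # "stop early and report what was finished", not "die".
        return done, "interrupted"
    return done, "finished"

def flaky(item):
    # Stand-in for the user pressing Ctrl-C while item 3 is being processed.
    if item == 3:
        raise KeyboardInterrupt
    return item * 2

partial = run([1, 2, 3, 4], flaky)
complete = run([1, 2], flaky)
```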
 
Lie Ryan

Ben said:
Lie Ryan said:
Paul said:
For the record, the four lines Paul implies are "confusing" are:

try:
    d[key] += value
except KeyError:
    d[key] = value
Consider what happens if the computation of "key" or "value" itself
raises KeyError.
Aren't key and value just simple variables/names?

In that example, yes. Paul is encouraging the reader to think of more
complex cases where they are compound expressions, that can therefore
raise other errors depending on what those expressions contain.

If key and value had been anything but simple variables/names, then
you're definitely writing the try-block the wrong way. A try-block must be
kept as simple as possible.

Here is a simple, and effective way to mitigate the concern about
compound expressions:
key = complex_compound_expression_to_calculate_key
value = complex_compound_expression_to_calculate_value
try:
    d[key] += value
except KeyError:
    d[key] = value


The point is: "complex expressions and exceptions are used for different
purposes"
 
Paul Rubin

Lie Ryan said:
If key and value had been anything but simple variables/names, then
you're definitely writing the try-block the wrong way. A try-block must
be kept as simple as possible.

Avoiding the try-block completely is the simplest possibility of them all.
 
NiklasRTZ

Russ said:
Never, ever use "while True". It's an abomination. The correct form is
"while 1".

An equivalent loop appears doable with a for statement too; not a C-style
for (;;), but something like this:

from itertools import count
from time import sleep

for a in (2*b for b in count() if b**2 > 3):
    doThis()
    sleep(5)
    doThat()

best regards,
NR
 
Hendrik van Rooyen

Hendrik van Rooyen said:
Standard Python idiom:

if key in d:
    d[key] += value
else:
    d[key] = value

The issue is that it uses two lookups. If that's ok, the more usual idiom is:

d[key] = value + d.get(key, 0)

I was actually just needling Aahz a bit. The point I was trying to make
subliminally, was that there is a relative cost of double lookup for all
cases versus exceptions for some cases. - Depending on the frequency
of "some", I would expect a breakeven point.

- Hendrik
 
Steven D'Aprano

The point I was trying to make
subliminally, was that there is a relative cost of double lookup for all
cases versus exceptions for some cases. - Depending on the frequency of
"some", I would expect a breakeven point.

There is, at least according to my (long distant and only barely
remembered) tests.

Setting up a try...except is very cheap, about as cheap as a pass
statement. That is:

d = {1: None}
try:
    x = d[1]
except KeyError:
    print "This can't happen"


is approximately as costly as:

d = {1: None}
pass
x = d[1]

under Python 2.5. However, catching an exception is more expensive,
approximately ten times more so. Doing a lookup twice falls somewhere
between the two, closer to the cheap side than the expensive.

So according to my rough estimates, it is faster to use the try...except
form so long as the number of KeyErrors is less than about one in six,
give or take. If KeyError is more common than that, it's cheaper to do a
test first, say with d.has_key(). Using the `in` operator is likely to be
faster than has_key(), which will shift the break-even point.

(The above numbers are from memory and should be taken with a large pinch
of salt. Even if they are accurate for me, they will likely be different
on other machines, and will depend on the actual keys in the dict. In
other words, your mileage may vary.)
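
Anyone wanting to re-measure rather than trust remembered numbers could start from a sketch like the following, written for Python 3 with the standard timeit module. The helper names and the chosen miss rates are assumptions for illustration, and no particular result is implied; the break-even point may well differ from the one-in-six figure on a modern interpreter.

```python
import timeit

def make_keys(n, miss_every):
    """Keys 0..n-1, where every `miss_every`-th key is absent from the dict."""
    keys = list(range(n))
    present = {k: None for k in keys if k % miss_every != 0}
    return keys, present

def lookup_try(keys, d):
    """Ask forgiveness: count the misses via KeyError."""
    misses = 0
    for k in keys:
        try:
            d[k]
        except KeyError:
            misses += 1
    return misses

def lookup_in(keys, d):
    """Look before you leap: test membership, then look up again."""
    misses = 0
    for k in keys:
        if k in d:
            d[k]
        else:
            misses += 1
    return misses

def compare(miss_every, n=1000, number=20):
    """Return (seconds for try/except, seconds for the `in` test)."""
    keys, d = make_keys(n, miss_every)
    t_try = timeit.timeit(lambda: lookup_try(keys, d), number=number)
    t_in = timeit.timeit(lambda: lookup_in(keys, d), number=number)
    return t_try, t_in

if __name__ == "__main__":
    for miss_every in (2, 6, 100):
        t_try, t_in = compare(miss_every)
        print("1 miss in %3d: try/except %.4fs, in-test %.4fs"
              % (miss_every, t_try, t_in))
```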
 
Tim Rowe

2009/10/18 Steven D'Aprano said:
That confuses me. If I call:

y = mydict[x]

how does my knowledge of what to do if x is not a key relate to whether
the language raises an exception, returns an error code, dumps core, or
prints "He's not the Messiah, he's a very naughty boy" to stderr?

You seem to be making a distinction of *intent* which, as far as I can
tell, doesn't actually exist. What's the difference in intent between
these?

y = mydict[x]
if y == KeyErrorCode:
   handle_error_condition()
process(y)


and this?

try:
   y = mydict[x]
except KeyError:
   handle_error_condition()
process(y)

Nothing -- because both of those are at the level of code where you
know what to do with the error condition. But if you weren't -- if,
for example, you were in a library routine and had no idea how the
routine might be used in the future -- the latter would just become:

y = mydict[x]

and the problem is passed on until it gets to somebody who does know.
In the former case, though, you still need all the error handling code
even though you don't know how to handle the error, *and* you need to
exit with an error code of your own, *and* your client needs error
handling code even though /they/ might not know how to handle the
error, and so on.
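
A sketch of that contrast in Python 3. Everything here (`prices`, `MISSING`, the function names) is hypothetical, invented to show the shape of the two styles rather than any real library.

```python
# Hypothetical data for the illustration.
prices = {"apple": 3}

# Error-code style: every layer has to check for the failure and forward it.
MISSING = object()

def lookup_code(name):
    return prices.get(name, MISSING)

def total_code(names):
    total = 0
    for name in names:
        price = lookup_code(name)
        if price is MISSING:      # the middle layer must know about the code
            return MISSING
        total += price
    return total

# Exception style: the middle layer contains no error handling at all.
def total_exc(names):
    return sum(prices[name] for name in names)

def report(names):
    # Only the layer that knows what to do catches the exception.
    try:
        return "total: %d" % total_exc(names)
    except KeyError as exc:
        return "no price for %s" % exc.args[0]

ok_report = report(["apple", "apple"])
failed_report = report(["apple", "pear"])
```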
Neither assumes more or less knowledge of what to do in
handle_error_condition(). Neither case assumes that the failure of x to
be a key is an error:

They both assume that calling handle_error_condition() is an
appropriate response.
Exceptions don't have one common overhead across all languages that use
them. They have different overhead in different languages -- they're very
heavyweight in C++ and Java, but lightweight in Python.

As others have pointed out, Python exceptions are cheap to set up, but
decidedly less cheap to invoke. But I'm not making the efficiency
argument, I'm making the clarity argument. I don't /quite/ believe
that Premature Optimisation is the root of all evil, but I wouldn't
avoid exceptions because of the overhead. In the rare cases where the
response to an exceptional condition is time (or resource?)-critical
I'd consider that case to need special handling anyway, and so
wouldn't treat it as an exception.
I think exceptions are a beautiful and powerful way
of dealing with flow control, much better than returning a special code,
and much better than having to check some status function or global
variable, as so often happens in C. They're more restricted, and
therefore safer, than goto. They're not a panacea but they're very useful.

I agree completely -- when they're used in appropriate circumstances.
Not when they're not.
I think you're confused about what list.index(obj) does. You seem to me
to be assuming that [1,2,3].index(5) should return the item in position 5
of the list, and since there isn't one (5 is out of bounds), raise an
exception. But that's not what it does. It searches the list and returns
the position at which 5 is found.

Yes, sorry, brain fade. But my point remains that the authors of index
can't know whether the item not being in the list is an error or not,
can't know how to handle that case, and so passing it to the client as
an exception is an appropriate response.
No, it's not being killed from outside the program -- it's being
*interrupted* from *inside* the program by the user.

Who -- unless AI has advanced further than I thought -- is *outside*
the program.
 
Jaime Buelta

For me, it's more a question of clarity than anything else. I don't
much like using break, continue or more than one return per
function in C/C++, but sometimes it's much clearer to use them.
Also, in Python I use them often, as usually the code is cleaner this
way.

For example, I would write this code in C/C++:

for (i = 0; (i < MAX) && !get_out; i++) {
    ...... do lot of things...
    .....
    .....

    if (condition) {
        get_out = true;
    }
}

but in Python I would use:

for i in range(MAX):
    ..do lot of things...
    if condition:
        # Exit the loop
        break

Don't know, it seems to me that the more concise code of Python helps me
to see clearly when to exit; in C/C++ the break statements seem to
confuse me. Probably related to the amount (and density) of code.

I think an infinite loop (while True:) should be used only for, well,
infinite loops (or at least indeterminate ones that depend on
arbitrary input, like user input or network data). I wouldn't use them
for reading a file, for example...

But, anyway, I think the key concept is to use them when it's more
readable and makes more sense than a "finite loop". By the way, "more
readable" to me, of course :p
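
As an illustration of the indeterminate case, here is a sketch in Python 3 of the same loop written both ways. `make_source` is a hypothetical stand-in for user or network input, and the "quit" sentinel is an assumption of the example.

```python
from collections import deque

def make_source(messages):
    """Hypothetical input source: each call returns the next message."""
    queue = deque(messages)
    return queue.popleft

def serve_while(receive):
    """Indeterminate loop written as `while True:` with an explicit break."""
    handled = []
    while True:
        message = receive()
        if message == "quit":
            break
        handled.append(message.upper())
    return handled

def serve_for(receive):
    """Same loop as a for statement: iter(callable, sentinel) calls
    receive() repeatedly and stops when it returns the sentinel."""
    handled = []
    for message in iter(receive, "quit"):
        handled.append(message.upper())
    return handled

via_while = serve_while(make_source(["a", "b", "quit"]))
via_for = serve_for(make_source(["a", "b", "quit"]))
```

Which of the two reads better is, as above, a matter of taste.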
 
Hendrik van Rooyen

There is, at least according to my (long distant and only barely
remembered) tests.

8< -----------------------------------
under Python 2.5. However, catching an exception is more expensive,
approximately ten times more so. Doing a lookup twice falls somewhere
between the two, closer to the cheap side than the expensive.

So according to my rough estimates, it is faster to use the try...except
form so long as the number of KeyErrors is less than about one in six,
give or take. If KeyError is more common than that, it's cheaper to do a
test first, say with d.has_key(). Using the `in` operator is likely to be
faster than has_key(), which will shift the break-even point.

(The above numbers are from memory and should be taken with a large pinch
of salt. Even if they are accurate for me, they will likely be different
on other machines, and will depend on the actual keys in the dict. In
other words, your mileage may vary.)

So if you want to sum stuff where there are a lot of keys, but only a few
values per key - say between one and ten, then it would be faster to look
before you leap.

On the other hand, if there are relatively few keys and tens or hundreds of
values per key, then you ask for forgiveness.
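
The arithmetic behind this can be sketched as follows (Python 3). When summing into a dict, only the first sighting of each key is a miss, so the miss fraction is the number of distinct keys divided by the number of items. `BREAK_EVEN` reuses the one-in-six figure quoted above, which was a from-memory estimate and not a measured constant; the sample data is made up.

```python
def miss_fraction(keys):
    """Fraction of updates that hit a missing key (first sighting of a key)."""
    keys = list(keys)
    return float(len(set(keys))) / len(keys)

# Many keys, few values each: 100 keys with 2 values per key.
few_values = [k for k in range(100) for _ in range(2)]
# Few keys, many values each: 5 keys with 100 values per key.
many_values = [k for k in range(5) for _ in range(100)]

BREAK_EVEN = 1.0 / 6   # rough estimate from the thread, treat as an assumption

def suggestion(keys):
    if miss_fraction(keys) > BREAK_EVEN:
        return "look before you leap"
    return "ask for forgiveness"

advice_few = suggestion(few_values)
advice_many = suggestion(many_values)
```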

And if you don't know what the data is going to look like, then you should
either go into a catatonic state, or take to the bottle, as the zen of python
states that you should refuse the temptation to guess.

This is also known as paralysis by analysis.

:)

- Hendrik
 
