question of style

L

Lie Ryan

Simon said:
Hey I was hoping to get your opinions on a sort of minor stylistic
point.
These two snippets of code are functionally identical. Which would you
use and why?

If self is an object (and it must have been), it would be preferrable to
set self.higher and self.lower to a known value inside self.__init__ and
avoid having the comparison.

I'm guessing you have something like this in your __init__:

def __init__(self, lower=None, higher=None):
self.lower = lower
self.higher = higher


that should be changed to:

def __init__(self, lower=None, higher=None):
self.lower = lower if lower is not None else higher
self.higher = higher if lower is not None else lower

or whatever else-value you see is appropriate for the context.
 
P

Paul Rubin

Lie Ryan said:
I guess in python, None as the missing datum idiom is still quite prevalent:

Well, sometimes there is no way around it, but:
def cat_list(a=None, b=None):
# poor man's list concatenation
if a is None and b is None: return []
if a is None: return b
if b is None: return a
return a + b

def cat_list(a=[], b=[]):
return a + b
 
S

Steven D'Aprano

Right, and how many times have you had to debug

AttributeError: 'NoneType' object has no attribute 'foo'

Hardly ever, and mostly when I do something silly like:

another_list = alist.sort()

Of course, YMMV.


or the equivalent null pointer exceptions in Java, C, or whatever? They
are very common. And the basic idea is that if you avoid using null
pointers in the first place, you'll get fewer accidental null pointer
exceptions.

And some other error instead, due to the bugs in your more complicated
code *wink*
 
L

Lie Ryan

Paul said:
Lie Ryan said:
I guess in python, None as the missing datum idiom is still quite prevalent:

Well, sometimes there is no way around it, but:
def cat_list(a=None, b=None):
# poor man's list concatenation
if a is None and b is None: return []
if a is None: return b
if b is None: return a
return a + b

def cat_list(a=[], b=[]):
return a + b

Being super contrived is why I tagged it as poor man's concat.
Generally, there will be a processing:

....
if b is None: return a
# processing here
return retlist
....

and it will not be about concatenating two lists, but something more
complex. But I thought that was unnecessary since I just want to mention
about the None argument idiom.
 
S

Steven D'Aprano

That being the case, it might be a good idea either to handle the
situation and raise an exception or add:

assert self.lower <= self.higher

Only if you want strange and mysterious bugs whenever somebody runs your
code with the -O flag.

That way an exception will be raised if there is an error somewhere else
in the code rather then silently passing a possibly incorrect value.

asserts are disabled when the -O flag is given.

You should never use asserts for testing data. That's not what they're
for. They're for testing program logic (and unit-testing).

This is wrong, because it will sometimes fail when running under -O:

def parrot(colour):
assert colour.lower() in ('red', 'green', 'blue')
return "Norwegian %s" % colour.title()

That should be written as test followed by an explicit raise:

def parrot(colour):
if colour.lower() not in ('red', 'green', 'blue'):
raise ValueError
return "Norwegian %s" % colour.title()


An acceptable use for assert is to verify your program logic by testing
something which should never fail. If it does fail, then you know
something bizarre and unexpected has happened, and you have a program bug
rather than bad data. Here's a silly example:


def cheese_available(kind):
"""Return True if the kind of cheese specified is in stock
in the cheeseshop run by Michael Palin.

This function always returns False.
"""
return False


def get_cheese():
exchange_amusing_banter()
for kind in ('Cheddar', 'Gouda', 'Swiss', 'Camembert'):
ask_proprietor("Do you have any %s?" % kind)
flag = cheese_available(kind)
assert not flag
if flag:
buy_cheese(kind)
return None
else:
express_disappointment()
# We only get here if no cheese is bought.
print "Cleese: Does this cheeseshop actually have ANY cheese?"
print "Palin: No sir, I'm afraid I've been wasting your time."
print "Cleese: Well then, I'm going to have to shoot you."
 
T

Tim Harig

Only if you want strange and mysterious bugs whenever somebody runs your
code with the -O flag.

The point here is that Rubin presumed that the condition where lower >
higher should never exist. Therefore, because the program given segment of
program doesn't test it elswhere, it is possible that a bug in the code
that sets lower > higher could go unoticed while silently passing the wrong
data.

Therefore, It is better to test that assumption in a way that *will* crash
if something is wrong, for instance, if another method accidently changes
one of the values. That would tend to make errors in another method of the
class (or even higher up in this method), more likely to be found.
asserts are disabled when the -O flag is given.

Right, like defining NDEBUG in C. In _Writing Solid Code_ by Steve
Maguire, he likens it to having two pieces of code: one for testing where
any errors should cause crashes as early as possible and one for shipping
where it may be better to handle errors if possible from within the code.
You should never use asserts for testing data. That's not what they're
for. They're for testing program logic (and unit-testing).

What I am suggesting here is exactly that. If lower is always defined such
that it should always be equal to or lower then higher, then there is an
error in the code somewhere if lower is higher by the time this code is
called. Either the constructor didn't didn't reject input properly or
another method within the class has inadvertantly changed higher or lower
in an uncorrect way.

In any event, whether I am using it 100% correctly, I want to find any
errors in my code. If there is an error, I want the program to crash
during testing rather then silently passing bad data. assert will help
here.

Unfortunatly, Rubin is right. I did make the mistake that the assert will
fail with higher or lower values of None, which is allowed here. The same
basic premise is correct but will require more complex assertions to allow
that through.
 
J

Jean-Michel Pichavant

Simon said:
Hey I was hoping to get your opinions on a sort of minor stylistic
point.
These two snippets of code are functionally identical. Which would you
use and why?
The first one is easier [for me anyway] to read and understand, but
Easier for you but not for those who are familiar with the standard way
of python coding. (by standard I don't mean the "only Godly way of
coding", there's room for custom rules)
But I think most of the people will write:
if self.higher is None:
return self.lower
if self.lower is None:
return self.higher
# here neither are None
slightly less efficient, while the second is [marginally] harder to
follow but more efficient.

## First snippet

if self.higher is self.lower is None: return
if self.lower is None: return self.higher
if self.higher is None: return self.lower

## Second snippet

if self.higher is None:
if self.lower is None:
return
return self.lower
if self.lower is None:
return self.higher

What do you think?

(One minor point: in the first snippet, the "is None" in the first
line is superfluous in the context in which it will be used, the only
time "self.lower is self.higher" will be true is when they are both
None.)
 
S

Simon Forman

I say, you should keep the `is None` for readability and to safe guard
against immutable optimization and/or Singleton objects. Due to an
implementation detail, python does not guarantee that two immutable
object will return different objects; you have:


True

I probably would have kept the 'is None' for those reasons except that
this code was never meant to be used by anyone but me, and it turned
out the entire line was redundant anyway. :]
 
S

Simon Forman

 http://math.andrej.com/2009/04/11/on-programming-language-design/

discusses the issue (scroll down to "Undefined values" section).

Hmm, I'm not convinced. I read that post with interest, but his
argument against None et. al. was just this:

"...stick to the principle: NULL is wrong because it causes horrible
and tricky mistakes which appear even after the program was tested
thoroughly. You cannot introduce NULL into the language and tell the
programmer to be careful about it. The programmer is not capable of
being careful!"

To me that seems weak and too general, you could say /anything/
"...causes horrible and tricky mistakes which appear even after the
program was tested thoroughly."

He does go on to say, in a reply to a comment, "In my opinion, at the
end of the day the quality of the code depends more on the quality of
its author than on the quality of the language he uses. But this does
not mean that the quality of the programming language does not
matter."

Which I agree with wholeheartedly. I just don't see that he made a
strong case that the inclusion of NULLs directly implies poor quality
of language.

(I like Haskell's Maybe, but saying A is better than B doesn't imply
that B is therefore terrible.)


Right, and how many times have you had to debug

   AttributeError: 'NoneType' object has no attribute 'foo'

or the equivalent null pointer exceptions in Java, C, or whatever?
They are very common.  And the basic idea is that if you avoid using
null pointers in the first place, you'll get fewer accidental null
pointer exceptions.

I have encountered exceptions like that, but usually as a result of
e.g. omitting a return statement (so the caller gets None instead of
whatever should have been returned.)

I don't usually write "pointer" style code in python. In fact, other
than this BTree and the Dancing Links algorithm I mentioned earlier I
can't recall ever having done it. Indeed, python's omission of
pointers is one of the reasons I'm so fond of it.

In my experience with python exceptions like that one are indicators
of something bad wrong with my code, not of misuse of None (as NULL
pointer.)


But you have implemented a mutable tree.  If you want to write a
persistent one, then write a persistent one!  You also use a wishful

The paper I'm working from is about general techniques for making
persistent data structures from mutable forms, thus I wanted a mutable
BTree to then "overwrite" with a persistent form.

heuristic in the hope of keeping the tree balanced--if you want a
balanced tree, why not use one that's guaranteed to stay in balance?
AVL trees are pretty easy to implement; maybe you could try writing a
persistent one:

 http://en.wikipedia.org/wiki/AVL_tree

The balance (or lack thereof) of the tree is not important (in this
use.) I threw in that heuristic in the delete function because it was
easy enough and it was mentioned in the wikipedia article on BTrees.
AVL trees are easy too, but still more complicated than I need.

Well, I can't understand why that might be, but it's surely
subjective, so ok.

It's a matter of this:

if a is None:
return b
if b is None:
return a
# handle both non-None here...

vs. this:

if a is not None:
if b is not None:
# handle both non-None here...
else:
return a
return b

Everything is subjective, but I think the first one is more elegant
IMO.


Er, I'm not trying to argue with you.  You asked for people's advice
and impressions, so I gave you some.  I don't feel like rewriting that

I know, I asked for it. :] But it still stings a little to hear
someone call my code "rather ugly". No hard feelings. I'm not trying
to argue with you either. :]

whole class, but I'll do a method or two, using [] to represent an
empty node.  (Hmm, actually changing the empty node representation did
less to shorten the code than just getting rid of a bunch of
duplication.  Really, I'd tend to look for a slightly different tree
structure which tried to avoid these issues).

Hmm, I really don't see that using an empty list rather than None as a
stand-in for NULL pointer bought you anything there. In fact, this
code would work identically if you replaced [] with None.

I was hoping to see some clever code-fu involving list manipulation or
something. When would you ever store a value in the empty list
"nodes" anyway?
Untested:

class BinaryTree:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.higher = self.lower = []

    def get(self, key):
        if key == self.key:
            return self.value
        branch = self.lower if key < self.key else self.higher
        return branch.get(key) if branch else []

    def insert(self, key, value):
        def xinsert(xtree):  # xtree is either a tree or []
           if not xtree: xtree = BinaryTree(key, value)
           else: xtree.insert(key, value)
           return xtree

        if key == self.key:
            self.value = value
        elif key < self.key:
            self.lower = xinsert(self.lower)
        else:
            self.higher = xinsert(self.higher)

and so forth.
 
P

Paul Rubin

Simon Forman said:
"...stick to the principle: NULL is wrong because it causes horrible
and tricky mistakes which appear even after the program was tested
thoroughly. You cannot introduce NULL into the language and tell the
programmer to be careful about it. The programmer is not capable of
being careful!"

To me that seems weak and too general, you could say /anything/
"...causes horrible and tricky mistakes which appear even after the
program was tested thoroughly."

On the contrary, general means powerful :). For example, you can say
that malloc/free in C causes horrible and tricky mistakes etc. Python
avoids the mistakes by having automatic storage management.
Which I agree with wholeheartedly. I just don't see that he made a
strong case that the inclusion of NULLs directly implies poor quality
of language.

(I like Haskell's Maybe, but saying A is better than B doesn't imply
that B is therefore terrible.)

I wouldn't say Python's None is terrible, but the programming style in
which None is used as a marker for "absent value" is genuinely a
source of bugs, requiring care when used. Often it's easy to just
avoid it and all the bugs that come with it.

Haskell's Maybe type, ML's Option type, and various similar constructs
in other recently designed languages all are responses to the
shortcomings of None-like constructs in older languages. I'm not
going purely by that one guy's blog post, though I do think that post
was pretty good.
Hmm, I really don't see that using an empty list rather than None as a
stand-in for NULL pointer bought you anything there. In fact, this
code would work identically if you replaced [] with None.

Yes I noticed that too (and mentioned it) after writing the code. I
think the real problem is that the class has unnatural semantics and
Python's OOP system (well, OOP in general) is also sort of kludgy.
I will think about a possible rewrite.
 
J

John Yeung

I wouldn't say Python's None is terrible, but the
programming style in which None is used as a marker
for "absent value" is genuinely a source of bugs,
requiring care when used.  Often it's easy to just
avoid it and all the bugs that come with it.

While other languages may make it easier to avoid these types of bugs,
I think it's pretty natural to use Python's None as a marker for
"absent value", if you're coding in Python. Python seems to encourage
this by having functions return None when there is no explicit return
value, and by having the interactive command line print nothing at all
when the expression evaluates to None.

I won't argue with anyone who wants to call these design flaws, but it
seems to me that using None this way in one's own Python code is
reasonably Pythonic.

John
 
S

Steven D'Aprano

I wouldn't say Python's None is terrible, but the programming style in
which None is used as a marker for "absent value" is genuinely a source
of bugs, requiring care when used. Often it's easy to just avoid it and
all the bugs that come with it.

So you keep saying, but I don't believe it. I've never found this to be a
problem in my own code: a few silly mistakes where I accidentally assign
the result of a function that operates by side-effect:

blist = alist.sort()

or similar, and I haven't done that for years now.

No, wait, I tell I lie... re.search() sometimes bites me, because
sometimes it returns None and sometimes it returns a matchobject and I
don't use re often enough to have good habits with it yet. But that's a
good example of why removing NULL (None, nul, nil, whatever you want to
call it) doesn't lower the number of bugs, it just changes their nature:
if re.search didn't return None, I'd still have trouble with it.

There are three natural approaches to (say) re.search() for dealing with
failure:

(1) return a sentinel value like None;

(2) return a matchobject which tests False;

(3) raise an exception.

Here's how you would use them correctly:


# 1 or 2 are handled identically
o = re.search(s)
if o:
process_match(o)
else:
print "Not found"


# 3 is different
try:
process_match(re.search(s))
except Exception:
print "Not found"


There's essentially little or no difference in the handling. In all three
cases, if you assume search will always succeed or forget to handle the
failure case, you will write incorrect code:

process_match(re.search(s))

and get an unexpected exception. The only difference is that the specific
exception will differ between the cases. Remove None from the scenario
doesn't reduce the number of bugs, it just changes their nature.
 
A

Aahz

I wouldn't say Python's None is terrible, but the programming style in
which None is used as a marker for "absent value" is genuinely a
source of bugs, requiring care when used. Often it's easy to just
avoid it and all the bugs that come with it.

Haskell's Maybe type, ML's Option type, and various similar constructs
in other recently designed languages all are responses to the
shortcomings of None-like constructs in older languages. I'm not
going purely by that one guy's blog post, though I do think that post
was pretty good.

AFAICT, in order for Maybe and Option to work, you need to have some
kind of static typing system. Am I missing something?
 
U

upwestdon

Hey I was hoping to get your opinions on a sort of minor stylistic
point.
These two snippets of code are functionally identical. Which would you
use and why?
The first one is easier [for me anyway] to read and understand, but
slightly less efficient, while the second is [marginally] harder to
follow but more efficient.

## First snippet

if self.higher is self.lower is None: return
if self.lower is None: return self.higher
if self.higher is None: return self.lower

## Second snippet

if self.higher is None:
    if self.lower is None:
        return
    return self.lower
if self.lower is None:
    return self.higher

What do you think?

(One minor point: in the first snippet, the "is None" in the first
line is superfluous in the context in which it will be used, the only
time "self.lower is self.higher" will be true is when they are both
None.)

How about just:

if not (self.higher and self.lower):
return self.higher or self.lower
 
P

Paul Rubin

AFAICT, in order for Maybe and Option to work, you need to have some
kind of static typing system. Am I missing something?

I'm not sure. Maybe there could be some kind of syntactic support in
a dynamic language (not that I'm suggesting adding that to Python).
I've been meaning to look up how Erlang does it. For the limited
purpose of regexps that might or might not match, Perl has some
special support.

A kludgy Python counterpart to Option might be to use a pair (present, x)
where present is a bool saying whether the value is available. If
present is True, then x is the value. That could make it harder to
forget to test the flag before using the value.

In the stuff I'm doing, I often find it best to just use a list value,
so instead of:

y = f(x) if x is not None else None

I can just write

y = map(f, x)

In many cases it later turns out that list really was the natural
representation and it ends up growing additional potential elements.
I saw some article about database design recently that claimed as a
program evolves, all relationships end up becoming many-to-many. It
gave as an example, if your program deals with names and addresses, it
will eventually have to handle the case where someone has more than
one residence address.
 
D

Dave Angel

upwestdon said:
Hey I was hoping to get your opinions on a sort of minor stylistic
point.
These two snippets of code are functionally identical. Which would you
use and why?
The first one is easier [for me anyway] to read and understand, but
slightly less efficient, while the second is [marginally] harder to
follow but more efficient.

## First snippet

if self.higher is self.lower is None: return
if self.lower is None: return self.higher
if self.higher is None: return self.lower

## Second snippet

if self.higher is None:
if self.lower is None:
return
return self.lower
if self.lower is None:
return self.higher

What do you think?

(One minor point: in the first snippet, the "is None" in the first
line is superfluous in the context in which it will be used, the only
time "self.lower is self.higher" will be true is when they are both
None.)

How about just:

if not (self.higher and self.lower):
return self.higher or self.lower
But if self.higher is 0 and self.lower is None, this'll return None,
which wasn't the original intent.

Without having some documented constraints on the self.higher and lower
types and values, it's difficult to simplify further.
 
S

Simon Forman

Hey I was hoping to get your opinions on a sort of minor stylistic
point.
These two snippets of code are functionally identical. Which would you
use and why?
The first one is easier [for me anyway] to read and understand, but
slightly less efficient, while the second is [marginally] harder to
follow but more efficient.
## First snippet
if self.higher is self.lower is None: return
if self.lower is None: return self.higher
if self.higher is None: return self.lower
## Second snippet
if self.higher is None:
    if self.lower is None:
        return
    return self.lower
if self.lower is None:
    return self.higher
What do you think?
(One minor point: in the first snippet, the "is None" in the first
line is superfluous in the context in which it will be used, the only
time "self.lower is self.higher" will be true is when they are both
None.)

How about just:

if not (self.higher and self.lower):
    return self.higher or self.lower

That would work too in this case, both higher and lower are expected
to be either None or an object (that doesn't override the default all-
objects-are-true rule.)

Thanks,
~Simon
 
A

Aahz

In many cases it later turns out that list really was the natural
representation and it ends up growing additional potential elements.
I saw some article about database design recently that claimed as a
program evolves, all relationships end up becoming many-to-many. It
gave as an example, if your program deals with names and addresses, it
will eventually have to handle the case where someone has more than
one residence address.

That's definitely a critical point, and I absolutely agree that this
should get rubbed into people's faces.
 
P

Paul Rubin

Simon Forman said:
That would work too in this case, both higher and lower are expected
to be either None or an object (that doesn't override the default all-
objects-are-true rule.)

-1.

Objects can support methods like __bool__, __len__ (a tree might use
this to compute the number of nodes or something like that), and
__nonzero__. If len(object)==0 then the object is treated as false.
So that test can fail if a __len__ method gets added to the tree class
sometime. Or you might decide you want to support empty trees which
would test as false.

Anyway, Python's overloading of bool(...) is yet another misfeature
and although it's convenient, the "explicit is better than implicit"
principle indicates to avoid that sort of trick.
 
S

Simon Forman


Heh heh heh, I figured someone might call me on this, but I stopped
short of adding further elucidation to that parenthetical clause.

What I meant (of course, *wink*) was "...an instance of a user-defined
class that doesn't override the special double-underscore methods in a
way that causes it (the instance) to be considered False in a 'boolean
context', as they say in perl."

In [1]: class bar: pass
...:

In [2]: b = bar()

In [3]: bool(b)
Out[3]: True


Objects can support methods like __bool__, __len__ (a tree might use
this to compute the number of nodes or something like that), and
__nonzero__.  If len(object)==0 then the object is treated as false..
So that test can fail if a __len__ method gets added to the tree class
sometime.  Or you might decide you want to support empty trees which
would test as false.

Ya, that's exactly why I stuck to checking explicitly against None in
that tree code. :]

Anyway, Python's overloading of bool(...) is yet another misfeature
and although it's convenient, the "explicit is better than implicit"
principle indicates to avoid that sort of trick.  

I really like the "misfeature" myself. Python (to me) seems to
relieve you of gory details that get in the way (compare equivalent C+
+ and Python code) but still exposes you to gory details that, with
care in coding, can make your code easier to grok and more elegant in
general.

BTW, Paul, kind of a tangent: I reimplemented the same algorithm but
using tuples instead of instances (and empty tuples for "NULL"
values.) I was trying to mess around in the space you seemed to
indicate existed, i.e. a better implementation using other datatypes,
but I didn't have a clear idea what I was doing and, as I said, I
started by simply re-implementing with a different datatype.

Much to my surprise and delight, I discovered the tuple-based BTree
was /already/ a "persistent data type"! It was both awesome and a bit
of an anti-climax. :]
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,201
Messages
2,571,049
Members
47,654
Latest member
LannySinge

Latest Threads

Top