Lisp mentality vs. Python mentality

C

Ciprian Dorin, Craciun

The above doesn't really compare for equality, it's a generic element-by-
element comparison function, and so it is inappropriate to contrast it to
__eq__ alone. Defaulting to equality testing is misleading, and if I were
writing such a function I'd remove the default.

compare(a, b, operator.eq) gives the same result as the simpler a == b,
but compare(a, b, operator.lt) does something very different to a < b. I
can't think of an application for element-by-element comparisons off the
top of my head, but if the numpy people use it, there must be a need :)

I agree with your comment, the presented function compare does
more than this, in fact checks if the elements in the two functions
match a given binary predicate. (In Scheme this is named "andmap")

About the compare (a, b, operator.lt) it does the same as a < b,
where a and b are lists of numbers.

An usage for the compare function (as I've shown in a previous
post to this thread) could also have been checking a list of strings
if they match to a list of regular expressions (we rename the function
compare to the name "match" as it is much general):
match (re.match, regexps, strings)

About the defaults, I really, really, agree with your comment:
"Defaulting to equality testing is misleading"... Most of the times
equality means equality by value, but sometimes you need the equality
to be more relaxed (maybe ignore case, maybe ignore spaces, or maybe
for real numbers to ignore a difference smaller than a chosen delta
(those writing scientific applications know this too well))...

Ciprian.
 
S

Steven D'Aprano

I liked very much your implementation for the compare function, it
is very short and at the same time readable:


But I have only one problem, it is suboptimal, in the sense that: *
it constructs two intermediary lists (the list comprehension and
the zip call);

If you look closely, there is no list comprehension. The argument to
all() is a generator expression, which does not construct an intermediary
list.

However, you are right that zip produces a list, at least in Python 2.x.
In Python 3 it produces a generator-like object.


* it evaluates all the elements, even if one is false at the
beginning;

No, all() uses short-cut evaluation. It will return as soon as it hits a
False element.


So, could you overcome these problems?

In Python 2.4 or better, I can remove the intermediate list produced by
zip with one line:

from itertools import izip as zip

(This may even work in 2.3, but I don't have 2.3 to test it.)
 
P

Paul Rubin

namekuseijin said:
... return (len(a) == len(b)) and not any(not comp(a,b) for (a,b)
in (zip(a, b)))

If I'm reading that correctly, I think I'd write it as:

from itertools import imap, izip
return (len(a) == len(b)) and all(imap(comp, izip(a, b)))

That is more concise and avoids building an intermediate list with zip.

Maybe we need something like zipWith ...
 
P

Paul Rubin

namekuseijin said:
return (len(a) == len(b)) and not any(not comp(*t) for t in
(zip(a, b)))

plus the zip call enclosed in parentheses got turned into an iterator.

zip in python 2.x always makes a list. You want itertools.izip.
You could also use itertools.starmap.
 
A

Arnaud Delobelle

No, it only constructs one list (the zip() one) and only in Python 2.x -
in Python 3.x, zip return a special 'zip object'. There is no list
comprehension. It's a generator expression [1].

To avoid the list created by zip in python 2.x (for x large enough!),
just do:

from itertools import izip as zip

Another way to define the function that may appeal to you, as a lisper.

def compare(a, b, comp=operator.eq):
return len(a) == len(b) and all(map(comp, a, b))

The same remark applies to map() here as to zip() above.

It does not [2].
This evaluates just until finding one that is false:

return (len(a) == len(b)) and not any(not comp(*t) for t in
(zip(a, b)))

plus the zip call enclosed in parentheses got turned into an iterator.

not any(not i for i in iterable) is not an optimisation of
all(iterable). Refer to [2]. Moreover, putting a list in parenthesis
does not magically turn it into a generator.
 
A

Arnaud Delobelle

Paul Rubin said:
If I'm reading that correctly, I think I'd write it as:

from itertools import imap, izip
return (len(a) == len(b)) and all(imap(comp, izip(a, b)))

Do you mean imap(comp, a, b)?
 
M

Martin v. Löwis

From your comments I understand that the only problem with my code
proposal are the function names...

No, the problem is that you are using way too many functions, that do
too little. The problem with that is then that you have to give names
to all the functions, which then find people difficult to read because
they don't just need to the code in question itself; they also need to
dozen of helper functions that it relies on.
And about the find_index, we could rename it to
first_matching_index. About the negation optional parameter, we could
eliminate it if we allow either: to have another function
first_missmatching_index, but this leads to namespace bloat, or we
have a function named negate, that takes another function, and negates
it meaning.

Or you could avoid introducing the function altogether, to make it more
readable. This makes it more pythonic, also: readability counts (from
the Zen of Python).

Regards,
Martin
 
C

Ciprian Dorin, Craciun

No, the problem is that you are using way too many functions, that do
too little. The problem with that is then that you have to give names
to all the functions, which then find people difficult to read because
they don't just need to the code in question itself; they also need to
dozen of helper functions that it relies on.


Or you could avoid introducing the function altogether, to make it more
readable. This makes it more pythonic, also: readability counts (from
the Zen of Python).

Regards,
Martin


So if I'm reading right you are saying something in the lines:
"using too many functions is bad just because it is unreadable and
non-understandable to average (could I say mediocre?) programmers"...
Unfortunately I thought that delegating responsibilities to other
functions, and thus writing small chunks of code, is what good
software engineering is... Well my bad...

(As a side-note, maybe this is a reason why some of my students
find it hard to understand functional programming -- too many
functions that is -- I shall have to revise my teaching, and tell the
students to inline everything, maybe they'll understand it like this
:) )

Although you have a point -- that of being hard to comprehend by
average programmers -- but this doesn't mean it is a wrong (as in
ugly) solution... Also, with respects, but the "pythonic" solution
involving generators (or iterators) and "zip" or "all" function --
although I appreciate it as it comes close to FP -- is not what I
would call readable and understandable by non-guru programmers...

Ciprian Craciun.
 
T

Travis

In answering the recent question by Mark Tarver, I think I finally hit
on why Lisp programmers are the way they are (in particular, why they
are often so hostile to the "There should only be one obvious way to
do it" Zen).

Say you put this task to a Lisp and a Python programmer: Come up with
a good, generic, reusable way to compare two lists.  What are their
respective trains of thought?

Lisp programmer:

Well, there is a standard function called mismatch that does it, but I
can't recommend it.  First of all, you don't know where that
function's been.  Anyone and their mother could have worked on it, did
they have good, sound programming practice in mind when they wrote
it?  Of course not.  Let's be real here, we have to implement this by
hand.

(defun lists-are-equal (a b)
   (or (and (not a) (not b))
       (and (= (car a) (car b)) (lists-are-equal (cdr a) (cdr b))))

There, much better than the standard function, and better yet, it's in
the *absolute minimal form possible*.  There is no way to express list
comparison in a more reduced form.  It's almost erotic how awesome it
is.  I'm---whoa, ok, I'm getting a little excited now, settle down.
Well, come to think of it, that's really not that good.  First of all
it's a function.  I mean, it just sits there and does nothing till you
call it.  How boring is that?  It can't react to the current
situation.  Plus it compares all lists the same way, and that is
really inefficient.  Every list compare is a new problem.  Different
lists need different comparative strategies.  This function simply
won't do.  I need a macro that can intelligently compile the right
list compare methodology in.  For instance, if we want to compare two
lists that are known at compile time, we don't want to waste time
comparing them at runtime.  No, the macro must detect constant
arguments and special case them.  Good start.  Now, we have to
consider the conditions this comparison is being done under.  If the
user is passing the result of a sort to this macro, it's almost
certain that they are trying to see whether the lists have the same
elements.  We can do that a lot more efficiently with a countset.  So
let's have the macro check to see if the forms passed to it are all
sort calls.  Better yet, let's check for my own powerful sort macro.
Hmm.  Wait... I think my 4600-line sort macro already checks its
calling context to see if its results are being fed to a list
comparison.  I'll have to refactor that together with this macro.  Ok,
good, now I am sure other users will eventually want to customize list
comparison for their own use, after all every list comparison is
different and I can't possibly anticipate all of them.  A user needs
to be able to adapt to the situation, so it's vitally important to
create a plug-in infrastructure to give them that flexibility.  Now,
what about exceptions, there's a millions ways to deal with that...

...and so on until eyelids can no longer stay open....

Python programmer:

a == b.  Next question.

Carl Banks, who might be exaggerating

...a little.

I've noticed that every one of you is wrong about programming.
Since I can't say it effectively, here's someone who can:

That's the answer.
 
B

bearophileHUGS

Paul Rubin:
Arnaud Delobelle:

Oh yes, I forgot you can do that.  Thanks.

That works and is nice and readable:


import operator
from itertools import imap

def equal_sequences(a, b, comp=operator.eq):
"""
a and b must have __len__
>>> equal_sequences([1, 2, 3], [1, 2, 3]) True
>>> L1, L2 = [1, 2, 3], [1, -2, 3]
>>> equal_sequences(L1, L2) False
>>> equal_sequences(L1, L2, lambda x,y: abs(x) == abs(y)) True
>>> L3, L4 = ["hello", "HALLO"], ["hello", "hallo"]
>>> equal_sequences(L3, L4) False
>>> equal_sequences(L3, L4, lambda x,y: x.lower() == y.lower())
True
"""
return len(a) == len(b) and all(imap(comp, a, b))


if __name__ == "__main__":
import doctest
doctest.testmod()
print "Doctests finished.\n"



But both sequences must have a len. Otherwise you may use (if you
don't have izip_longest the folllowing code gets longer):


"""
equal_items([1, 2], [1, 2, 3]) False
equal_items([1, 2, 3], [1, 2, 3]) True
equal_items([1, 2, 3], [1, -2, 3]) False
equal_items([1, 2, 3], [1, -2, 3], abs) True
equal_items([1, 2, 3], [1, -2, 4], abs) False
L1, L2 = ["hello", "HALLO"], ["hello", "hallo"]
equal_items(L1, L2) False
equal_items(L1, L2, str.lower) True
equal_items(xrange(3), (i for i in xrange(3))) True
equal_items([0, 1, 2], (i for i in xrange(3))) True
equal_items([0, 1, 2, 3], (i for i in xrange(3))) False
equal_items([-0, -1, -2], (i for i in xrange(3)), key=abs) True
equal_items([-0, -1, -2, -3], (i for i in xrange(3)), key=abs) False
x = []
equal_items( (x for i in range(3)), (x for i in range(3)) ) True
equal_items( (x for i in range(3)), (x for i in range(4)) ) False
equal_items( (x for i in range(3)), (x for i in range(3)), key=id) True
equal_items( (x for i in range(3)), (x for i in range(4)), key=id) False
equal_items( (x for i in range(3)), (x for i in range(3)), key=lambda x:x) True
equal_items( (x for i in range(3)), (x for i in range(4)), key=lambda x:x)
False
"""

from itertools import izip_longest


def equal_items(iter1, iter2, key=None):
try:
len_iter1 = len(iter1)
len_iter2 = len(iter2)
except TypeError:
pass
else:
if len_iter1 != len_iter2:
return False

class Guard(object): pass

if key is None:
for x, y in izip_longest(iter1, iter2, fillvalue=Guard()):
if x != y:
return False
else:
try:
for x, y in izip_longest(iter1, iter2, fillvalue=Guard()):
if key(x) != key(y):
return False
except TypeError: # intercepts key(guard)
return False

return True


if __name__ == "__main__":
import doctest
doctest.testmod()
print "Doctests finished.\n"

You can write hairy code in Python too, not just in CLisp :)


Bye,
bearophile
 
T

Tim Chase

I liked very much your implementation for the compare function, it
is very short and at the same time readable:


But I have only one problem, it is suboptimal, in the sense that:
* it constructs two intermediary lists (the list comprehension and
the zip call);
* it evaluates all the elements, even if one is false at the beginning;

So, could you overcome these problems?

The solution would be to use itertools.izip() instead of the
stock zip(). The all() function short-circuits at the first
non-True value. Thus, using izip() instead will (1) not create
any new lists (it's a generator, not a list) and (2) the all()
will only look until it fails.

Just make sure that your comp() function returns equality for
this to work, or otherwise use "not comp()" (which is germane if
you use a __cmp__ function that returns 0 for equality)

-tkc
 
B

bearophileHUGS

You can also use quite less code, but this is less efficient:

def equal_items(iter1, iter2, key=lambda x: x):
class Guard(object): pass
try:
for x, y in izip_longest(iter1, iter2, fillvalue=Guard()):
if key(x) != key(y):
return False
except TypeError: # intercepts key(guard)
return False
return True

Bye,
bearophile
 
V

Vsevolod

In answering the recent question by Mark Tarver, I think I finally hit
on why Lisp programmers are the way they are (in particular, why they
are often so hostile to the "There should only be one obvious way to
do it" Zen).

Say you put this task to a Lisp and a Python programmer: Come up with
a good, generic, reusable way to compare two lists. What are their
respective trains of thought?

Lisp programmer:

Well, there is a standard function called mismatch that does it, but I
can't recommend it. First of all, you don't know where that
function's been. Anyone and their mother could have worked on it, did
they have good, sound programming practice in mind when they wrote
it? Of course not. Let's be real here, we have to implement this by
hand.

(defun lists-are-equal (a b)
(or (and (not a) (not b))
(and (= (car a) (car b)) (lists-are-equal (cdr a) (cdr b))))

There, much better than the standard function, and better yet, it's in
the *absolute minimal form possible*. There is no way to express list
comparison in a more reduced form. It's almost erotic how awesome it
is. I'm---whoa, ok, I'm getting a little excited now, settle down.
Well, come to think of it, that's really not that good. First of all
it's a function. I mean, it just sits there and does nothing till you
call it. How boring is that? It can't react to the current
situation. Plus it compares all lists the same way, and that is
really inefficient. Every list compare is a new problem. Different
lists need different comparative strategies. This function simply
won't do. I need a macro that can intelligently compile the right
list compare methodology in. For instance, if we want to compare two
lists that are known at compile time, we don't want to waste time
comparing them at runtime. No, the macro must detect constant
arguments and special case them. Good start. Now, we have to
consider the conditions this comparison is being done under. If the
user is passing the result of a sort to this macro, it's almost
certain that they are trying to see whether the lists have the same
elements. We can do that a lot more efficiently with a countset. So
let's have the macro check to see if the forms passed to it are all
sort calls. Better yet, let's check for my own powerful sort macro.
Hmm. Wait... I think my 4600-line sort macro already checks its
calling context to see if its results are being fed to a list
comparison. I'll have to refactor that together with this macro. Ok,
good, now I am sure other users will eventually want to customize list
comparison for their own use, after all every list comparison is
different and I can't possibly anticipate all of them. A user needs
to be able to adapt to the situation, so it's vitally important to
create a plug-in infrastructure to give them that flexibility. Now,
what about exceptions, there's a millions ways to deal with that...

...and so on until eyelids can no longer stay open....

Python programmer:

a == b. Next question.

Carl Banks, who might be exaggerating

...a little.

I think you're exaggerating. Go ask this question in c.l.l and the
first answer you'll get is mismatch.
But, from the other point of view your exaggeration makes sense: Lisp
unlike Python, IMO, is the language, where it's pleasant to program
not only applications, but the internals as well. So some people may
find interest in reprogramming what's already there. In lisp
programmer's mentality it's good to know, that you have that ability.

And let's look at my recent experience with Python: I wanted to
implement a daemon process and stumbled at a simplest problem with
threading: neither Thread, nor Threading module provides thread-
killing possibility. Surely, I'm not so experienced in Python as in
Lisp (in which I'd definitely be able to solve this problem by
extending the library), but I don't see an obvious solution, which
will stay inside the language: I have to either use the shell or stick
to the limited set of provided options and skew my program design to
work with them. Any other suggestions?

P.S. Btw the other issue with CL's mismatch is that it provides a
possibility to use any test and keyword extraction function.

Best regards,
Vsevolod Dyomkin
 
A

Arnaud Delobelle

You can also use quite less code, but this is less efficient:

def equal_items(iter1, iter2, key=lambda x: x):
class Guard(object): pass
try:
for x, y in izip_longest(iter1, iter2, fillvalue=Guard()):
if key(x) != key(y):
return False
except TypeError: # intercepts key(guard)
return False
return True

You don't want to silence TypeErrors that may arise from with key() when
x or y is not a Guard, as it could hide bugs in key(). So I would write
something like this:

def equal_items(iter1, iter2, key=lambda x: x, _fill = object()):
for x, y in izip_longest(iter1, iter2, fillvalue=_fill):
if x is _fill or y is _fill or key(x) != key(y):
return False
return True

(untested)

Another way would be:

def equal_items(iter1, iter2, key=lambda x: x):
iter1, iter2 = iter(iter1), iter(iter2)
for x, y in izip(iter1, iter2):
if key(x) != key(y):
return False
for x, y in izip_longest(iter1, iter2):
return False
return True

(untested)
 
A

Arnaud Delobelle

Arnaud Delobelle said:
Another way would be:

def equal_items(iter1, iter2, key=lambda x: x):
iter1, iter2 = iter(iter1), iter(iter2)
for x, y in izip(iter1, iter2):
if key(x) != key(y):
return False
for x, y in izip_longest(iter1, iter2):
return False
return True

(untested)

Or even:

def equal_items(iter1, iter2, key=lambda x: x):
iter1, iter2 = iter(iter1), iter(iter2)
for x, y in izip(iter1, iter2):
if key(x) != key(y):
return False
return not any(izip_longest(iter1, iter2))

(untested)

Or even:

def equal_items(iter1, iter2, key=lambda x: x):
iter1, iter2 = iter(iter1), iter(iter2)
if any(key(x) != key(y) for x, y in izip(iter1, iter2)):
return False
return not any(izip_longest(iter1, iter2))
 
P

Peter Otten

Arnaud said:
def equal_items(iter1, iter2, key=lambda x: x):
    iter1, iter2 = iter(iter1), iter(iter2)
    for x, y in izip(iter1, iter2):
        if key(x) != key(y):
            return False
    for x, y in izip_longest(iter1, iter2):
        return False
    return True

(untested)

This will fail when iter1 yields one more item than iter2. izip() then
consumes one extra item:
from itertools import izip
a = iter([1,2])
list(izip(a, "b")) [(1, 'b')]
a.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration

Peter
 
B

bearophileHUGS

Arnaud Delobelle:
You don't want to silence TypeErrors that may arise from with key() when
x or y is not a Guard, as it could hide bugs in key(). So I would write
something like this:

def equal_items(iter1, iter2, key=lambda x: x, _fill = object()):
    for x, y in izip_longest(iter1, iter2, fillvalue=_fill):
        if x is _fill or y is _fill or key(x) != key(y):
            return False
    return True

(untested)

You are right, thank you. Darn exceptions. You are often able to find
bugs in my code.

Here is a new version then (this is designed to be usable in practice,
that's why it's longer):


from itertools import izip_longest

def equal_iter(iter1, iter2, key=None):
"""
>>> equal_iter([1, 2], [1, 2, 3]) False
>>> equal_iter([1, 2, 3], [1, 2, 3]) True
>>> equal_iter([1, 2, 3], [1, -2, 3]) False
>>> equal_iter([1, 2, 3], [1, -2, 3], abs) True
>>> equal_iter([1, 2, 3], [1, -2, 4], abs) False
>>> L1, L2 = ["hello", "HALLO"], ["hello", "hallo"]
>>> equal_iter(L1, L2) False
>>> equal_iter(L1, L2, str.lower) True
>>> equal_iter(xrange(3), (i for i in xrange(3))) True
>>> equal_iter([0, 1, 2], (i for i in xrange(3))) True
>>> equal_iter([0, 1, 2, 3], (i for i in xrange(3))) False
>>> equal_iter([-0, -1, -2], (i for i in xrange(3)), key=abs) True
>>> equal_iter([-0, -1, -2, -3], (i for i in xrange(3)), key=abs) False
>>> x = []
>>> equal_iter( (x for i in range(3)), (x for i in range(3)) ) True
>>> equal_iter( (x for i in range(3)), (x for i in range(4)) ) False
>>> equal_iter( (x for i in range(3)), (x for i in range(3)),
key=id)
Truekey=id)
Falsekey=lambda x:x)
Truekey=lambda x:x)
Falsekey=k)
Traceback (most recent call last):
...
TypeError
"""
try:
len_iter1 = len(iter1)
len_iter2 = len(iter2)
except TypeError:
pass
else:
if len_iter1 != len_iter2:
return False

class Guard(object): pass
guard = Guard()

if key is None:
for x, y in izip_longest(iter1, iter2, fillvalue=guard):
if x != y:
return False
else:
for x, y in izip_longest(iter1, iter2, fillvalue=guard):
if x is guard or y is guard or key(x) != key(y):
return False

return True


if __name__ == "__main__":
import doctest
doctest.testmod()
print "Doctests finished.\n"

Bye,
bearophile
 
A

Arnaud Delobelle

Peter Otten said:
This will fail when iter1 yields one more item than iter2. izip() then
consumes one extra item:

Ah yes! And this is why I should have tested it.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,981
Messages
2,570,188
Members
46,731
Latest member
MarcyGipso

Latest Threads

Top