map/filter/reduce/lambda opinions and background unscientific mini-survey


Carl Banks

John said:
May I most respectfully point out that you've got it backwards.
Part of the justification for list comprehensions was that they could
be used to replace map and filter.

The jihad against the functional constructs has been going on for a
long time, and list comprehensions are only one piece of it.


Many people believe that the functional constructs in Python exist to
enhance Python's support of functional programming, but that's wrong.
They exist to enhance support of procedural programming.

In other words, the functional elements were added not because Python
embraced functional programming, but because discreet use of functional
code can make procedural programs simpler and more concise.

Listcomps et al. cannot do everything map, lambda, filter, and reduce
did. Listcomps are inferior for functional programming. But, you see,
functional is not the point. Streamlining procedural programs is the
point, and I'd say listcomps do that far better, and without all the
baroque syntax (from the procedural point of view).

Jihad? I'd say it's mostly just indifference to the functional
programming cause.
 

Scott David Daniels

egbert said:
Also, map is easily replaced.
map(f1, sequence) == [f1(element) for element in sequence]

How do you replace
map(f1,sequence1, sequence2)
especially if the sequences are of unequal length ?

I didn't see it mentioned yet as a candidate for limbo,
but the same question goes for:
zip(sequence1,sequence2)

OK, you guys are picking on what reduce "cannot" do.
The first is [f1(*args) for args in itertools.izip(iter1, iter2)]
How do _you_ use map to avoid making all the intermediate structures?

I never saw anything about making zip go away. It is easy to explain.

Now reduce maps to what I was taught to call "foldl."
How do you express "foldr?" How do you express:

_accum = initial()
for elem in iterable:
    _accum = func(elem, _accum, expr)

....
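(For reference, a minimal way to spell foldr with a plain loop; "foldr" is just an illustrative name here, not anything in the standard library, and it assumes a finite, reversible sequence:)

def foldr(func, initial, seq):
    # fold from the right: func(seq[0], func(seq[1], ... func(seq[-1], initial) ...))
    acc = initial
    for elem in reversed(seq):
        acc = func(elem, acc)
    return acc

# e.g. foldr(lambda x, acc: [x] + acc, [], [1, 2, 3]) == [1, 2, 3]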

If you want functional programming in python, you have at least three
big problems:

1) Python has side effects like mad, so order of evaluation matters.
I'd claim any useful language is like that (I/O to a printer is
kind of hard to do out-of-order), but I'd get sliced to death
by a bunch of bullies wielding Occam's razors.

2) Python's essential function call is not a single-argument
function which might be a tuple, it is a multi-argument
function which is not evaluated in the same way. The natural
duality of a function taking pairs to a function taking an arg
and returning a function taking an arg and returning a result
breaks down in the face of keyword args, and functions that
take an indeterminate number of arguments. Also, because of (1),
there is a big difference between a function taking no args and
its result.

3) Python doesn't have a full set of functional primitives.
Fold-right is one example, K-combinator is another, ....
Why single out reduce as the one to keep? There is another slippery
slope argument going up the slope adding functional primitives.

--Scott David Daniels
(e-mail address removed)
 

Christopher Subich

Carl said:
Listcomps et al. cannot do everything map, lambda, filter, and reduce
did. Listcomps are inferior for functional programming. But, you see,
functional is not the point. Streamlining procedural programs is the
point, and I'd say listcomps do that far better, and without all the
baroque syntax (from the procedural point of view).

I've heard this said a couple times now -- how can listcomps not
completely replace map and filter?

I'd think that:
mapped = [f(i) for i in seq]
filtered = [i for i in seq if f(i)]

The only map case that doesn't cleanly reduce is for multiple sequences
of different length -- map will extend to the longest one (padding the
others with None), while zip (izip) truncates sequences at the shortest.
This suggests an extension to (i)zip, possibly (i)lzip ['longest zip']
that does None padding in the same way that map does.
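(A rough sketch of what such an lzip might look like for plain sequences; the name and function are hypothetical, and note that in current Python map(None, seq1, seq2) already gives this pad-to-longest behaviour:)

def lzip(*seqs):
    # hypothetical pad-to-longest zip over sequences that support len() and indexing
    length = max(len(s) for s in seqs)
    result = []
    for i in range(length):
        result.append(tuple(s[i] if i < len(s) else None for s in seqs))
    return result

# lzip([1, 2, 3], 'ab')  ->  [(1, 'a'), (2, 'b'), (3, None)]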

Reduce can be rewritten easily (if an initial value is supplied) as a
for loop:
_accum = initial
for j in seq: _accum=f(_accum,j)
result = _accum

(two lines if the result variable can also be used as the accumulator --
this would be undesirable if assigning to it can trigger, say, a
property function call)

Lambdas, I agree, can't be replaced easily, and they're the feature I'd
probably be least happy to see go, even though I haven't used them very
much.
 

Steven D'Aprano

egbert said:
Also, map is easily replaced.
map(f1, sequence) == [f1(element) for element in sequence]

How do you replace
map(f1,sequence1, sequence2)
especially if the sequences are of unequal length ?

I didn't see it mentioned yet as a candidate for limbo,
but the same question goes for:
zip(sequence1,sequence2)

OK, you guys are picking on what reduce "cannot" do.
The first is [f1(*args) for args in itertools.izip(iter1, iter2)]

And now we get messier and messier... Compare these two idioms:

"Map function f1 to each pair of items from seq1 and seq2."

"Build a list comprehension by calling function f1 with the unpacked list
that you get from a list built by zipping seq1 and seq2 together in pairs."

Good thing that removing reduce is supposed to make code easier to
understand, right?
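(To keep the comparison concrete, here are the two spellings side by side with toy values; f1, seq1 and seq2 are stand-ins, not anything from the thread's code:)

from itertools import izip

f1 = lambda a, b: a + b                    # toy stand-in function
seq1, seq2 = [1, 2, 3], [10, 20, 30]

by_map  = map(f1, seq1, seq2)                         # "map f1 to each pair of items"
by_comp = [f1(*args) for args in izip(seq1, seq2)]    # the proposed list-comp spelling
assert by_map == by_comp == [11, 22, 33]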

How do _you_ use map to avoid making all the intermediate structures?

I don't understand the question. Presumably the sequences already exist.
That's not the point.
I never saw anything about making zip go away. It is easy to explain.

I don't find map any less clear than zip.

Except for the arbitrary choice that zip truncates unequal sequences while
map doesn't, zip is completely redundant:

def my_zip(*seqs):
    return map(lambda *t: t, *seqs)

Zip is just a special case of map. I find it disturbing that Guido is
happy to fill Python with special case built-ins like sum, zip and
(proposed) product while wanting to cut out more general purpose solutions.


[snip]
If you want functional programming in python, you have at least
three big problems:

1) Python has side effects like mad, so order of evaluation matters.

Not if you *just* use functional operations.

Not that I would ever do that. The point isn't to turn Python into a
purely functional language, but to give Python developers access to
functional tools for when it is appropriate to use them.
2) Python's essential function call is not a single-argument
function which might be a tuple, it is a multi-argument function
which is not evaluated in the same way.

And I'm sure that makes a difference to the functional programming
purists. But not to me.
3) Python doesn't have a full set of functional primitives.
Fold-right is one example, K-combinator is another, .... Why single out
reduce as the one to keep? There is another slippery slope argument
going up the slope adding functional primitives.

My car isn't amphibious, so I can't go everywhere with it. Should I throw
it away just because I can't drive under water?

No, of course not. Just because Python isn't a purely functional language
doesn't mean that we should reject what functional idioms (like list
comps, and zip, and reduce) it does have.

Personally, I'd like to learn more about fold-right and the
K-combinator, rather than dump reduce and map.
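(For what it's worth, the K-combinator at least is tiny in Python; this is only an illustration, not a proposal for a builtin:)

k = lambda x: lambda y: x      # K-combinator: K x y = x, i.e. a constant-function factory

always_five = k(5)
assert always_five('ignored') == 5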

Frankly, I find this entire discussion very surreal. Reduce etc *work*,
right now. They have worked for years. If people don't like them, nobody
is forcing them to use them. Python is being pushed into directions which
are *far* harder to understand than map and reduce (currying, decorators,
etc) and people don't complain about those. And yet something as simple
and basic as map is supposed to give them trouble? These are the same
people who clamoured for zip, which is just a special case of map?
 

Christopher Subich

Scott said:
egbert said:
How do you replace
map(f1,sequence1, sequence2)
especially if the sequences are of unequal length ?

I didn't see it mentioned yet as a candidate for limbo,
but the same question goes for:
zip(sequence1,sequence2)

OK, you guys are picking on what reduce "cannot" do.
The first is [f1(*args) for args in itertools.izip(iter1, iter2)]
How do _you_ use map to avoid making all the intermediate structures?

Not quite -- zip and izip terminate at the shortest sequence, map extends
the shortest with Nones. This is resolvable by addition of an lzip (and
ilzip) function in Python 2.5 or something.

And egbert is Chicken-Littling with the suggestion that 'zip' will be
removed.
 

Peter Hansen

Steven said:
Frankly, I find this entire discussion very surreal. Reduce etc *work*,
right now. They have worked for years. If people don't like them, nobody
is forcing them to use them. Python is being pushed into directions which
are *far* harder to understand than map and reduce (currying, decorators,
etc) and people don't complain about those.

I find it surreal too, for a different reason.

Python *works*, right now. It has worked for years. If people don't
like the direction it's going, nobody is forcing them to upgrade to the
new version (which is not imminent anyway).

In the unlikely event that the latest and greatest Python in, what, five
years or more?, is so alien that one can't handle it, one has the right
to fork Python and maintain a tried-and-true-and-still-including-reduce-
-filter-and-map version of it, or even just to stick with the most
recent version which still has those features. And that's assuming it's
not acceptable (for whatever bizarre reason I can't imagine) to use the
inevitable third-party extension that will provide them anyway.

I wonder if some of those who seem most concerned are actually more
worried about losing the free support of a team of expert developers as
those developers evolve their vision of the language, than about losing
access to something as minor as reduce().

-Peter
 

Ron Adam

Erik said:
Ron Adam wrote:

I really don't understand this reasoning. You essentially grant the
position that reduce has a purpose, but you still seem to approve
removing it. Let's grant your whole point and say that 90% of the use
cases for reduce are covered by sum and product, and the other 10% are
used by eggheads and are of almost no interest to programmers. But it
still serves a purpose, and a useful one. That it's not of immediate
use to anyone is an argument for moving it into a functional module
(something I would have no serious objection to, though I don't see its
necessity), not for removing it altogether! Why would you remove the
functionality that already exists _and is being used_ just because? What
harm does it do, vs. the benefit of leaving it in?

There are really two separate issues here.

First on removing reduce:

1. There is no reason why reduce can't be put in a functional module or
you can write the equivalent yourself. It's not that hard to do, so it
isn't that big of a deal to not have it as a built in.

2. Reduce calls a function on every item in the list, so its
performance isn't much better than the equivalent code using a for-loop.

*** (note that list.sort() has the same problem. I would support
replacing it with a sort that uses an optional 'order-list' as a sort
key. I think its performance could be increased a great deal by
removing the function call reference.) ***
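(For comparison, Python 2.4's key= argument already moves in that direction: the key function is called once per element rather than once per comparison. A small, self-contained illustration, not the 'order-list' idea itself:)

words = ['banana', 'Apple', 'cherry']

# cmp-style: the comparison function is called O(n log n) times
by_cmp = sorted(words, cmp=lambda a, b: cmp(a.lower(), b.lower()))

# key-style: str.lower is called exactly once per element, then the sort runs in C
by_key = sorted(words, key=str.lower)

assert by_cmp == by_key == ['Apple', 'banana', 'cherry']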


Second, the addition of sum & product:

1. Sum, and less so Product, are fairly common operations so they have
plenty of use case arguments for including them.

2. They don't need to call a pre-defined function between every item, so
they can be completely handled internally by C code. They will be much
much faster than equivalent code using reduce or a for-loop. This
represents a speed increase for every program that totals or subtotals a
list, or finds a product of a set.

But removing reduce is just removing
functionality for no other reason, it seems, than spite.

No, not for spite. It's more a matter of increasing the overall
performance and usefulness of Python without making it more complicated.
In order to add new stuff that is better thought out, some things
will need to be removed or else the language will continue to grow and
be another Visual Basic.

Having sum and product built in has a clear advantage in both
performance and potential frequency of use, whereas reduce doesn't have
the same performance advantage and most people don't use it anyway, so
why have it built in if sum and product are? Why not just code it as a
function and put it in your own module?

def reduce(f, seq):
    x = 0
    for y in seq:
        x = f(x, y)
    return x

But I suspect that most people would just do what I currently do and
write the for-loop to do what they want directly instead of using lambda
in reduce.

x = 1
for y in seq:
    x = x**y

If performance is needed while using reduce with very large lists or
arrays, using the numeric module would be a much better solution.

http://www-128.ibm.com/developerworks/linux/library/l-cpnum.html
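(Roughly along these lines, using numpy, the successor to the older Numeric package the article above covers; treat the exact module as an assumption about what is installed:)

import numpy   # successor to the Numeric module discussed above

values = numpy.array([1.5, 2.0, 4.0, 8.0])
total = numpy.sum(values)     # the summation loop runs in C, no per-element Python call
prod = numpy.prod(values)     # likewise for the running product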

Cheers,
Ron
 

Carl Banks

Steven said:
egbert said:
On Sat, Jul 02, 2005 at 08:26:31PM -0700, Devan L wrote:

Also, map is easily replaced.
map(f1, sequence) == [f1(element) for element in sequence]

How do you replace
map(f1,sequence1, sequence2)
especially if the sequences are of unequal length ?

I didn't see it mentioned yet as a candidate for limbo,
but the same question goes for:
zip(sequence1,sequence2)

OK, you guys are picking on what reduce "cannot" do.
The first is [f1(*args) for args in itertools.izip(iter1, iter2)]

And now we get messier and messier... Compare these two idioms:

"Map function f1 to each pair of items from seq1 and seq2."

"Build a list comprehension by calling function f1 with the unpacked list
that you get from a list built by zipping seq1 and seq2 together in pairs."

The shamelessness with which you inflated the verbosity of the latter
is hilarious.

Good thing that removing reduce is supposed to make code easier to
understand, right?

It was a bad example. I would say most people don't usually just call
a function in the list comp, because, frankly, they don't have to. A
realistic list comp would look something like this in a real program:

[ x**2 + y**2 for (x,y) in izip(xlist,ylist) ]

Now there's no longer much advantage in conciseness for the map version
(seeing that you'd have to define a function to pass to map), and this
is more readable.
 

Carl Banks

Christopher said:
I've heard this said a couple times now -- how can listcomps not
completely replace map and filter?

If you're doing heavy functional programming, listcomps are
tremendously unwieldy compared to map et al.
 

Christopher Subich

Carl said:
If you're doing heavy functional programming, listcomps are
tremendously unwieldy compared to map et al.

Interesting; could you post an example of this? Whenever I try to think
of that, I come up with unwieldy syntax for the functional case. In
purely functional code the results of map/filter/etc would probably be
directly used as arguments to other functions, which might make the
calls longer than I'd consider pretty. This is especially true with
lots of lambda-ing to declare temporary expressions.
 

Steven D'Aprano

First on removing reduce:

1. There is no reason why reduce can't be put in a functional module

Don't disagree with that.
or
you can write the equivalent yourself. It's not that hard to do, so it
isn't that big of a deal to not have it as a built in.

Same goes for sum. Same goes for product, which doesn't have that many
common usages apart from calculating the geometric mean, and let's face
it, most developers don't even know what the geometric mean _is_.
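(For instance, the geometric-mean case looks like this; a small sketch built on reduce and operator.mul:)

import operator

def geometric_mean(values):
    # nth root of the product of n values
    product = reduce(operator.mul, values)
    return product ** (1.0 / len(values))

# geometric_mean([2.0, 8.0]) == 4.0   (square root of 16)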

If you look back at past discussions about sum, you will see that there is
plenty of disagreement about how it should work when given non-numeric
arguments, eg strings, lists, etc. So it isn't so clear what sum should do.
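(The disagreement is easy to demonstrate; this is how CPython's sum actually behaves today:)

>>> sum([[1, 2], [3]], [])
[1, 2, 3]                 # allowed, but repeated list concatenation is quadratic
>>> sum(['a', 'b', 'c'], '')
TypeError: sum() can't sum strings [use ''.join(seq) instead]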
2. Reduce calls a function on every item in the list, so its
performance isn't much better than the equivalent code using a for-loop.

That is an optimization issue. Especially when used with the operator
module, reduce and map can be significantly faster than for loops.
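(The usual form of that optimisation: hand reduce or map a C-implemented callable from the operator module instead of a lambda, for example:)

import operator

total = reduce(operator.add, range(1, 101))      # 5050; the callback is C code, no Python frame per step
biggest = reduce(max, [3, 1, 4, 1, 5, 9, 2, 6])  # builtins like max work the same way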
*** (note that list.sort() has the same problem. I would support
replacing it with a sort that uses an optional 'order-list' as a sort
key. I think its performance could be increased a great deal by
removing the function call reference.) ***


Second, the addition of sum & product:

1. Sum, and less so Product, are fairly common operations so they have
plenty of use case arguments for including them.

Disagree about product, although given that sum is in the language, it
doesn't hurt to put product in as well for completeness and those few usages.
2. They don't need to call a pre-defined function between every item, so
they can be completely handled internally by C code. They will be much
much faster than equivalent code using reduce or a for-loop. This
represents a speed increase for every program that totals or subtotals a
list, or finds a product of a set.

I don't object to adding sum and product to the language. I don't object
to adding zip. I don't object to list comps. Functional, er, functions
are a good thing. We should have more of them, not less.
No, not for spite. It's more a matter of increasing the overall
performance and usefulness of Python without making it more complicated.
In order to add new stuff that is better thought out, some things
will need to be removed or else the language will continue to grow and
be another Visual Basic.

Another slippery slope argument.
Having sum and product built in has a clear advantage in both
performance and potential frequency of use, whereas reduce doesn't have
the same performance advantage and most people don't use it anyway, so
why have it built in if sum and product are?

Because it is already there.
Why not just code it as a
function and put it in your own module?

Yes, let's all re-invent the wheel in every module! Why bother having a
print statement, when it is so easy to write your own:

import sys
def myprint(obj):
    sys.stdout.write(str(obj))

Best of all, you can customize print to do anything you like, _and_ it is
a function.
def reduce(f, seq):
    x = 0
    for y in seq:
        x = f(x, y)
    return x

Because that is far less readable, and you take a performance hit.
But I suspect that most people would just do what I currently do and
write the for-loop to do what they want directly instead of using lambda
in reduce.

That's your choice. I'm not suggesting we remove for loops and force you
to use reduce. Or even list comps.
 
S

Steven Bethard

Christopher said:
One caveat that I just noticed, though -- with the for-solution, you do
need to be careful about whether you're using a generator or list if you
do not set an explicit initial value (and instead use the first value of
'sequence' as the start). The difference is:
_accum = g.next()
for i in g: _accum = stuff(_accum,i)

versus
_accum = g[0]
for i in g[1:]: _accum = stuff(_accum,i)

If you want to be general for all iterables (list, generators, etc), you
can write the code like:

itr = iter(g)
_accum = itr.next()
for i in itr:
    _accum = stuff(_accum, i)

STeVe
 

Erik Max Francis

Christopher said:
Interesting; could you post an example of this? Whenever I try to think
of that, I come up with unwieldy syntax for the functional case. In
purely functional code the results of map/filter/etc would probably be
directly used as arguments to other functions, which might make the
calls longer than I'd consider pretty. This is especially true with
lots of lambda-ing to declare temporary expressions.

I personally think that map looks clearer than a list comprehension for
a simple function call, e.g.

map(str, sequence)

vs.

[str(x) for x in sequence]
 

Mike Meyer

Steven D'Aprano said:
I don't object to adding sum and product to the language. I don't object
to adding zip. I don't object to list comps. Functional, er, functions
are a good thing. We should have more of them, not less.

Yes, but where should they go? Adding functions in the standard
library is one thing. Adding builtins is another. Builtins make every
Python process heavier. This may not matter on your desktop, but
Python gets used in embedded applications as well, and it does
there. Builtins also clutter the namespace. Nothing really wrong with
that, but it's unappealing.

I'd say that removing functions is a bad thing. On the other hand, I'd
say moving them from builtins to the standard library when Python has
functionality that covers most of the use cases for them is a good
thing.

The latter has occurred for map, filter, and reduce. Lambda I'm not so
sure of, but it gets swept up with the same broom. Moving the first
three into a library module seems like a good idea. I'm not sure about
removing lambda. Removing map, filter and reduce removes most of my
use cases for it. But not all of them.

<mike
 

Erik Max Francis

Mike said:
I'd say that removing functions is a bad thing. On the other hand, I'd
say moving them from builtins to the standard library when Python has
functionality that covers most of the use cases for them is a good
thing.

We all can pretty much guess that map, filter, and reduce will be
reimplemented in a functional module by a third party within mere
seconds of Python 3000 being released :). So it's really just a
question of whether it will be let back into the standard library as a
module (rather than builtins) or not. Even granting the reasons for
removing them as builtins, I really can't understand the motivation for
removing them entirely, not even as a standard library module.
 

Ron Adam

Steven said:
Don't disagree with that.




Same goes for sum. Same goes for product, ...

Each item needs to stand on its own. It's a much stronger argument for
removing something because something else fulfills its need and is
easier or faster to use than just saying we need x because we have y.

In this case sum and product fulfill 90% (estimate of course) of reduce's
use cases. It may actually be as high as 99% for all I know. Or it may
be less. Anyone care to try and put a real measurement on it?


which doesn't have that many
common usages apart from calculating the geometric mean, and let's face
it, most developers don't even know what the geometric mean _is_.

I'm neutral on adding product myself.

If you look back at past discussions about sum, you will see that there is
plenty of disagreement about how it should work when given non-numeric
arguments, eg strings, lists, etc. So it isn't so clear what sum should do.

Testing shows sum() to be over twice as fast as either using reduce or a
for-loop. I think the disagreements will be sorted out.
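(One rough way to check that kind of claim is the timeit module; the absolute numbers will of course vary by machine:)

import timeit

setup = "import operator; data = range(10000)"
print timeit.Timer("sum(data)", setup).timeit(1000)
print timeit.Timer("reduce(operator.add, data)", setup).timeit(1000)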

That is an optimization issue. Especially when used with the operator
module, reduce and map can be significantly faster than for loops.

I tried it... it made about a 1% improvement in the builtin reduce and
an equal improvement in the function that used the for loop.

The inline for loop also performed about the same.

See below..

Disagree about product, although given that sum is in the language, it
doesn't hurt to put product in as well for completeness and those few usages.

I'm not convinced about product either, but if I were to review my
statistics textbooks, I could probably find more uses for it. I suspect
that there may be a few common uses for it that are frequent enough to
make it worth adding. But it might be better in a module.

I don't object to adding sum and product to the language. I don't object
to adding zip. I don't object to list comps. Functional, er, functions
are a good thing. We should have more of them, not less.

Yes, we should have lots of functions to use, in the library, but not
necessarily in builtins.
Another slippery slope argument.

Do you disagree or agree? Or are you undecided?

Because it is already there.

Hmm.. I know a few folks, good people, but they keep everything to the
point of not being able to find anything because they have so much.
They can always think of reasons to keep things: "It's worth something",
"it means something to me", "I'm going to fix it", "I'm going to sell
it", "I might need it", etc..

"Because it is already there" sounds like one of those types of reasons.

Yes, let's all re-invent the wheel in every module! Why bother having a
print statement, when it is so easy to write your own:

import sys
def myprint(obj):
    sys.stdout.write(str(obj))

Yes, Guido wants to make print a function in Python 3000. The good
thing about this is you can call your function just 'p' and save some
typing.

p("hello world")

Actually, I think I/O functions should be grouped in an interface
module. That way you choose the interface that best fits your need. It
may have a print if it's a console, or it may have a widget if it's a GUI.

Best of all, you can customize print to do anything you like, _and_ it is
a function.




Because that is far less readable, and you take a performance hit.

They come out pretty close as far as I can tell.


def reduce_f(f, seq):
    x = seq[0]
    for y in seq[1:]:
        x = f(x, y)
    return x

import time

t = time.time()
r2 = reduce(lambda x,y: x*y, range(1,10000))
t2 = time.time()-t
print 'reduce builtin:', t2

t = time.time()
r1 = reduce_f(lambda x,y: x*y, range(1,10000))
t2 = time.time()-t
print 'reduce_f: ', t2

if r1!=r2: print "results not equal"
reduce builtin: 0.156000137329
reduce_f:       0.155999898911
reduce builtin: 0.15700006485
reduce_f:       0.155999898911
reduce builtin: 0.141000032425
reduce_f:       0.155999898911


That's your choice. I'm not suggesting we remove for loops and force you
to use reduce. Or even list comps.

Just don't force me to use decorators! ;-)

Nah, they're ok too, but it did take me a little while to understand
their finer points.

Cheers,
Ron
 

Erik Max Francis

Ron said:
Each item needs to stand on its own. It's a much stronger argument for
removing something because something else fulfills its need and is
easier or faster to use than just saying we need x because we have y.

In this case sum and product fulfill 90% (estimate of course) of reduce's
use cases. It may actually be as high as 99% for all I know. Or it may
be less. Anyone care to try and put a real measurement on it?

Well, reduce covers 100% of them, and it's one function, and it's
already there.
 

Ron Adam

Erik said:
Well, reduce covers 100% of them, and it's one function, and it's
already there.

So you are saying that anything that has a 1% use case should be
included as a builtin function?

I think I can find a few hundred other functions in the library that are
used more than ten times as often as reduce. Should those be builtins too?

This is a practicality-over-purity issue, so what are the practical reasons
for keeping it? "It's already there" isn't a practical reason. And "it
covers 100% of its own potential use cases" is circular logic without a
real underlying basis.

Cheers,
Ron
 

Steven D'Aprano

Carl said:
The shamelessness with which you inflated the verbosity of the latter
is hilarious.
[snip]

[ x**2 + y**2 for (x,y) in izip(xlist,ylist) ]

Now there's no longer much advantage in conciseness for the map version
(seeing that you'd have to define a function to pass to map), and this
is more readable.
> If you're doing heavy functional programming,
> listcomps are tremendously unwieldy compared to
> map et al.

Having a dollar each way I see :)
 

Erik Max Francis

Ron said:
So you are saying that anything that has a 1% use case should be
included as a builtin function?

I think I can find a few hundred other functions in the library that are
used more than ten times as often as reduce. Should those be builtins too?

This is a practicality-over-purity issue, so what are the practical reasons
for keeping it? "It's already there" isn't a practical reason. And "it
covers 100% of its own potential use cases" is circular logic without a
real underlying basis.

But the Python 3000 plan, at least what we've heard of it so far, isn't
to move it to a standard library module. It's to remove it altogether,
replacing it with sum and product. Since sum and product don't cover
all the use cases for reduce, this is a case of taking one function
that handles all the required use cases and replacing it with _two_
functions that don't. Since it's doubling the footprint of the reduce
functionality, arguments about avoiding pollution are red herrings.
 
