R
Raymond Hettinger
Proposal
--------
I am gathering data to evaluate a request for an alternate version of
itertools.izip() with a None fill-in feature like that for the built-in
map() function:
[('a', '1'), ('b', '2'), ('c', '3'), (None, '4'), (None, '5')]
The motivation is to provide a means for looping over all data elements
when the input lengths are unequal. The question of the day is whether
that is both a common need and a good approach to real-world problems.
The answer can likely be found in results from other programming
languages and from surveying real-world Python code.
Other languages
---------------
I scanned the docs for Haskell, SML, and Perl6's yen operator and found
that the norm for map() and zip() is to truncate to the shortest input
or raise an exception for unequal input lengths. Ruby takes the
opposite approach and fills-in nil values -- the reasoning behind the
design choice is somewhat inscrutable:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-dev/18651
Real-world code
---------------
I scanned the standard library, my own code, and a few third-party
tools. I
found no instances where map's fill-in feature was used.
History of zip()
----------------
PEP 201 (lock-step iteration) documents that a fill-in feature was
contemplated and rejected for the zip() built-in introduced in Py2.0.
In the years before and after, SourceForge logs show no requests for a
fill-in feature.
Request for more information
----------------------------
My request for readers of comp.lang.python is to search your own code
to see if map's None fill-in feature was ever used in real-world code
(not toy examples). I'm curious about the context, how it was used,
and what alternatives were rejected (i.e. did the fill-in feature
improve the code). Likewise, I'm curious as to whether anyone has seen
a zip-style fill-in feature employed to good effect in some other
programming language.
Parallel to SQL?
----------------
If an iterator element's ordinal position were considered as a record
key, then the proposal equates to a database-style full outer join
operation (one which includes unmatched keys in the result) where record
order is significant. Does an outer-join have anything to do with
lock-step iteration? Is this a fundamental looping construct or just a
theoretical wish-list item? Does Python need itertools.izip_longest()
or would it just become a distracting piece of cruft?
Raymond Hettinger
FWIW, the OP's use case involved printing files in multiple
columns:
for f, g in itertools.izip_longest(file1, file2, fillin_value=''):
print '%-20s\t|\t%-20s' % (f.rstrip(), g.rstrip())
The alternative was straightforward but less terse:
while 1:
f = file1.readline()
g = file2.readline()
if not f and not g:
break
print '%-20s\t|\t%-20s' % (f.rstrip(), g.rstrip())
--------
I am gathering data to evaluate a request for an alternate version of
itertools.izip() with a None fill-in feature like that for the built-in
map() function:
[('a', '1'), ('b', '2'), ('c', '3'), (None, '4'), (None, '5')]
The motivation is to provide a means for looping over all data elements
when the input lengths are unequal. The question of the day is whether
that is both a common need and a good approach to real-world problems.
The answer can likely be found in results from other programming
languages and from surveying real-world Python code.
Other languages
---------------
I scanned the docs for Haskell, SML, and Perl6's yen operator and found
that the norm for map() and zip() is to truncate to the shortest input
or raise an exception for unequal input lengths. Ruby takes the
opposite approach and fills-in nil values -- the reasoning behind the
design choice is somewhat inscrutable:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-dev/18651
Real-world code
---------------
I scanned the standard library, my own code, and a few third-party
tools. I
found no instances where map's fill-in feature was used.
History of zip()
----------------
PEP 201 (lock-step iteration) documents that a fill-in feature was
contemplated and rejected for the zip() built-in introduced in Py2.0.
In the years before and after, SourceForge logs show no requests for a
fill-in feature.
Request for more information
----------------------------
My request for readers of comp.lang.python is to search your own code
to see if map's None fill-in feature was ever used in real-world code
(not toy examples). I'm curious about the context, how it was used,
and what alternatives were rejected (i.e. did the fill-in feature
improve the code). Likewise, I'm curious as to whether anyone has seen
a zip-style fill-in feature employed to good effect in some other
programming language.
Parallel to SQL?
----------------
If an iterator element's ordinal position were considered as a record
key, then the proposal equates to a database-style full outer join
operation (one which includes unmatched keys in the result) where record
order is significant. Does an outer-join have anything to do with
lock-step iteration? Is this a fundamental looping construct or just a
theoretical wish-list item? Does Python need itertools.izip_longest()
or would it just become a distracting piece of cruft?
Raymond Hettinger
FWIW, the OP's use case involved printing files in multiple
columns:
for f, g in itertools.izip_longest(file1, file2, fillin_value=''):
print '%-20s\t|\t%-20s' % (f.rstrip(), g.rstrip())
The alternative was straightforward but less terse:
while 1:
f = file1.readline()
g = file2.readline()
if not f and not g:
break
print '%-20s\t|\t%-20s' % (f.rstrip(), g.rstrip())