Fun with fancy slicing

David Mertz

|I'm not sure what this 'more' is about, actually. Greg's case is
|currently best solved [IMHO] with
| args = cmdstring.split()
| command = args.pop(0)

Part of his description was "I might or might not have more in the
list." My 'more' stuff was just illustrating handling that extra stuff.
Admittedly, that's not directly part of what 'car,*cdr=lst' solves.

|Actually, I don't -- both args.reverse() and args.pop(0) are
|O(len(args))

Hmmm... I am pretty sure I have found Alex in an outright mistake. A
shocking thing, if so :).

|[alex@lancelot python2.3]$ python timeit.py -s'ags=range(1000)'
|'x=ags[:].pop(0)'
|10000 loops, best of 3: 24.5 usec per loop
|[alex@lancelot python2.3]$ python timeit.py -s'ags=range(1000)'
|'ags.reverse(); x=ags[:].pop()'
|10000 loops, best of 3: 23.2 usec per loop

The second operation does a '.reverse()' for every '.pop()', which is
both the wrong behavior, and not the relevant thing to time.
...     start = clock()
...     l.reverse()
...     while 1:
...         try: l.pop()
...         except: break
...     print "%.2f seconds" % (clock()-start)
...
...     start = clock()
...     while 1:
...         try: l.pop(0)
...         except: break
...     print "%.2f seconds" % (clock()-start)
...
...abandoned after a few hours :)...
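
A minimal Python 3 sketch of the comparison being made here, assuming the goal is to consume every item from the front of a list. The names drain_from_left, drain_reversed and timed are hypothetical, and the list size is arbitrary; timings will vary by machine and interpreter.

```python
import time

def drain_from_left(items):
    """Consume items front to back with pop(0); each pop shifts the rest."""
    out = []
    while items:
        out.append(items.pop(0))
    return out

def drain_reversed(items):
    """Reverse once, then consume with pop() from the end of the list."""
    items.reverse()
    out = []
    while items:
        out.append(items.pop())
    return out

def timed(func, n):
    """Return the seconds func takes to drain a fresh list of n items."""
    data = list(range(n))
    start = time.perf_counter()
    func(data)
    return time.perf_counter() - start

if __name__ == "__main__":
    n = 20000  # arbitrary size chosen for illustration
    print("pop(0) loop:        %.4f seconds" % timed(drain_from_left, n))
    print("reverse + pop loop: %.4f seconds" % timed(drain_reversed, n))
```

Both functions hand back the items in their original order, so only the cost of getting there differs.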

Yours, David...
 
Alex Martelli

David said:
Alex Martelli said:
David said:
In the mean time, it isn't too hard to write a function which does
this:

def first_rest(x):
    return x[0], x[1:]

Shouldn't that be called cons?

Hmmm, feels more like the 'reverse' of a cons to me -- takes a list
and gives me the car and cdr...

Well, but it returns an object composed of a car and a cdr, which is
exactly what a cons is...

Yeah, the psychological issue is no doubt with calling 'list' that
argument -- cons takes TWO separate arguments (and the SECOND one
is normally a list), while this is being discussed in the context
of UN-packing, AND takes a single argument to boot. While of course
it's true that "return this, that" returns a tuple, thinking of it
in terms of "returning multiple objects" is most natural when you're
planning to unpack that tuple before it's had time to look around...;-).
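
To make the naming question concrete, here is a small sketch in current Python. first_rest is the function quoted above; cons is a hypothetical helper added only for contrast, loosely modelled on the Lisp operation but working on Python lists.

```python
def first_rest(x):
    """Split a sequence into its first item and the remaining items."""
    return x[0], x[1:]

def cons(head, tail):
    """Hypothetical inverse: build a new list from a head and a tail."""
    return [head] + list(tail)

lst = [1, 2, 3]
car, cdr = first_rest(lst)   # one argument in, two objects out
rebuilt = cons(car, cdr)     # two arguments in, one list out
print(car, cdr, rebuilt)     # 1 [2, 3] [1, 2, 3]
```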


Alex
 
Fernando Perez

Alex said:
How sweet it would be to be able to unpack by coding:
head, *tail = alist

+1000 :)

I'd LOVE to have this syntax around. I'd even want:

head, *body, last = alist

and

*head_and_body, last = alist

to both work in their respectively obvious ways ;) (left-to-right unpacking,
empty lists/tuples returned for elements which can't be unpacked --like if
alist contains just one element, and the lhs returning tuple/list based on the
type of the rhs).

We can dream :)
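
For readers on Python 3: starred targets in assignment were later added to the language (extended iterable unpacking). The sketch below shows how they behave there. It differs from the wish above in two ways: the starred name always receives a list, whatever the type of the right-hand side, and too few items for the plain names raise ValueError instead of producing empty values.

```python
alist = [1, 2, 3, 4]

head, *tail = alist
head2, *body, last = alist
*head_and_body, last2 = alist

print(head, tail)            # 1 [2, 3, 4]
print(head2, body, last)     # 1 [2, 3] 4
print(head_and_body, last2)  # [1, 2, 3] 4

# The starred target is a list even when the right-hand side is a tuple.
first, *rest = (10, 20, 30)
print(type(rest).__name__)   # list

# The starred target may end up empty, but every plain target must be filled.
only, *nothing = [99]
print(only, nothing)         # 99 []
try:
    h, *b, t = [99]
except ValueError as exc:
    print("ValueError:", exc)
```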

Cheers,

f
 
Alex Martelli

David said:
|I'm not sure what this 'more' is about, actually. Greg's case is
|currently best solved [IMHO] with
| args = cmdstring.split()
| command = args.pop(0)

Part of his description was "I might or might not have more in the
list." My 'more' stuff was just illustrating handling that extra stuff.
Admittedly, that's not directly part of what 'car,*cdr=lst' solves.

Right, and that's what we were discussing.

|Actually, I don't -- both args.reverse() and args.pop(0) are
|O(len(args))

Hmmm... I am pretty sure I have found Alex in an outright mistake. A
shocking thing, if so :).

Nope. We're discussing situations where ONE pop from the start is
needed; and you claimed I knew that in that case reversing the list
and popping from the end was faster.

|[alex@lancelot python2.3]$ python timeit.py -s'ags=range(1000)'
|'x=ags[:].pop(0)'
|10000 loops, best of 3: 24.5 usec per loop
|[alex@lancelot python2.3]$ python timeit.py -s'ags=range(1000)'
|'ags.reverse(); x=ags[:].pop()'
|10000 loops, best of 3: 23.2 usec per loop

The second operation does a '.reverse()' for every '.pop()', which is
both the wrong behavior, and not the relevant thing to time.

It's exactly the right behavior and exactly the relevant thing. Note
the [:] which I discussed in my post -- I'm not interested in popping
all items one after the other (which would have absolutely nothing to do
with 'car, *cdr = thelist'), I'm interested in popping ONE item -- the
first one. So, the comparison is between one pop(0) and one
reverse() then one pop() [of course, not with a [:] -- except for
the need to cooperate with timeit.py's measurements in what would
otherwise be a 'destructive' operation and thus unmeasurable by
it]. (My error, as Michael helped me see, was in not using -c or
making my machine perfectly quiescent -- I _thought_ it was but a
little experimentation soon showed it was anything but; anyway,
even _with_ better measurement, reverse-then-pop does turn out
to be faster -- by a minuscule margin, on this release and on this box --
than pop(0) -- I'm not quite sure I "know" it, yet, though...).
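
The same single-pop comparison can be reproduced from Python code with the timeit module. This is a sketch for Python 3, where range(1000) has to be wrapped in list(); the repeat and number values are arbitrary choices, and, as noted above, the results depend heavily on how quiet the machine is.

```python
import timeit

setup = "ags = list(range(1000))"
loops = 10000

# Each statement copies the list with [:] so the original is never consumed.
pop_left = min(timeit.repeat("x = ags[:].pop(0)", setup=setup,
                             repeat=3, number=loops))
rev_pop = min(timeit.repeat("ags.reverse(); x = ags[:].pop()", setup=setup,
                            repeat=3, number=loops))

print("copy + pop(0):          %.2f usec per loop" % (pop_left / loops * 1e6))
print("reverse + copy + pop(): %.2f usec per loop" % (rev_pop / loops * 1e6))
```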


Alex
 
Alex Martelli

Fernando said:
+1000 :)

I'd LOVE to have this syntax around. I'd even want:

head, *body, last = alist

and

*head_and_body, last = alist

to both work in their respectively obvious ways ;) (left-to-right
unpacking, empty lists/tuples returned for elements which can't be

So far so good...

unpacked --like if alist contains just one element, and the lhs returning
tuple/list based on the type of the rhs).

_blink_ uh, what...?

....     yield '1'; yield 22; yield 33
....

...so WHAT type would you want for the lhs *andalltherest term in each
of these cases...??? I'd be _particularly_ curious about the last one...

In other words, given the rhs in unpacking can be ANY iterable, including,
for example, a generator, it's (IMHO) absurd to try to have the *blah arg
on the lhs 'take on that type'. Doing it for some special kinds of sequence
and dumping all the others into [e.g.] list or tuple, sort of like filter
does, is a serious mistake again IMHO -- try teaching *that* a few times.

I would want the type of the rhs to be irrelevant, just as long as it's any
iterable whatsoever, and bundle the 'excess items' into ONE specified
kind of sequence. I also think it would be best for that one kind to be
tuple, by analogy to argument passing: you can use any iterable as the
actual argument 'expanded' by the *foo form, but whatever type foo
is, if the formal argument is *bar then in the function bar is a TUPLE --
period, simplest, no ifs, no buts. Maybe using a list instead might have
been better, but tuple was chosen, and I think we should stick to that
for unpacking, because the analogy with argument passing is a good
part of the reason the *foo syntax appeals to me (maybe for the same
reason we should at least for now forego the *foo appearing in any
spot but the last... though that _would_ be a real real pity...).
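
The argument-passing analogy can be checked directly. In this sketch (show_rest and gen are hypothetical names; the yielded values are the ones from the generator example above), the starred formal argument is a tuple no matter what kind of iterable was expanded at the call site.

```python
def show_rest(*bar):
    """Return the type name of the starred formal argument."""
    return type(bar).__name__

def gen():
    yield '1'
    yield 22
    yield 33

print(show_rest(*[1, 2, 3]))  # tuple
print(show_rest(*"abc"))      # tuple
print(show_rest(*gen()))      # tuple
```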

Oh, and formats for module struct -- THERE, too, allowing a star
would sometimes be very VERY useful and handy, perhaps even
more than in unpacking-assignment...


Alex
 
David Mertz

|Nope. We're discussing situations where ONE pop from the start is
|needed; and you claimed I knew that in that case reversing the list
|and popping from the end was faster.

Nah. I believe the context of my first note in the thread made the
imagined situation obvious. The situation Greg described was where you
want to look (from the left) at one item at a time, then decide
whether/when to grab the next based on that one.

That one-at-a-time-but-many-times seems to me the case where
'car,*cdr=thelist' is most useful. And for that, one-reverse-
plus-many-pops is the best current approach.

I quite concur with Alex that a '.reverse()' is not particularly
worthwhile to do just one '.pop()' (I guess it turned out to save a
couple milliseconds, but nothing important). The hermeneutics of the
thread aren't worth getting hung up on.

Of course, in Greg's case of parsing a dozen command-line switches, the
speed issue is not interesting. The whole '.reverse()' thing only
matters when you need to handle thousands of items (and *that*, I think,
Alex will admit he knows :)).

Yours, David...
 
Bengt Richter

+1000 :)

I'd LOVE to have this syntax around. I'd even want:

head, *body, last = alist

and

*head_and_body, last = alist

to both work in their respectively obvious ways ;) (left-to-right unpacking,
empty lists/tuples returned for elements which can't be unpacked --like if
alist contains just one element, and the lhs returning tuple/list based on the
type of the rhs).

We can dream :)

Maybe spelling it a little differently wouldn't be too bad?
(tested only as you see ;-):
>>> def headtail(seq): it=iter(seq); yield it.next(); yield list(it)
...
>>> h,t = headtail(range(5))
>>> h
0
>>> t
[1, 2, 3, 4]
>>> h,t
(0, [1, 2, 3, 4])

Or from a generator
...     for i in range(0,100,10): yield i
...
(0, [10, 20, 30, 40, 50, 60, 70, 80, 90])

Not so useful:
('a', ['b', 'c', 'd'])

Changes to list:
['123', 1, 2, 3]

So, to preserve common seq types, you could do a type-preserving-headtail
>>> def tpheadtail(seq):
...     if isinstance(seq, str): yield seq[0]; yield seq[1:]; return
...     it = iter(seq); yield it.next()
...     if isinstance(seq, tuple): yield tuple(it)
...     else: yield list(it)
...
>>> h,t = tpheadtail(range(5))
>>> h,t
(0, [1, 2, 3, 4])
>>> h,t = tpheadtail(tuple(range(5)))
>>> h,t
(0, (1, 2, 3, 4))
>>> h,t = tpheadtail('abcdef')
>>> h,t
('a', 'bcdef')
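
A Python 3 port of the two helpers, as a sketch. The only real change is next(it) in place of it.next(). Empty input is not handled here (an assumption of this sketch: callers always pass at least one item).

```python
def headtail(seq):
    """Yield the first item, then a list of everything else."""
    it = iter(seq)
    yield next(it)      # Python 3 spelling of it.next()
    yield list(it)

def tpheadtail(seq):
    """Like headtail, but keep str and tuple tails in their own type."""
    if isinstance(seq, str):
        yield seq[0]
        yield seq[1:]
        return
    it = iter(seq)
    yield next(it)
    if isinstance(seq, tuple):
        yield tuple(it)
    else:
        yield list(it)

h, t = headtail(range(5))
print(h, t)                          # 0 [1, 2, 3, 4]
print(tuple(tpheadtail((0, 1, 2))))  # (0, (1, 2))
print(tuple(tpheadtail('abcdef')))   # ('a', 'bcdef')
```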

You could control repacking ad libitum by adding a packing-format-control string parameter,
say '.' for one element, a decimal>1 for n elements in a tuple or list, one '*' for as many
elements as exist in the context, no prefix to preserve seq type, prefix T to make tuple,
L to make list. Hence the default headtail format would be '.*' -- some example spellings:

h,t = repack(seq) # head, *tail
h,t = repack(seq, '*.') # *head, tail
h,m,t = repack(seq, '.*.') # head, *middle, last
e1,e2,e3,tup,em2,em1 = repack(seq, '...T*..') # force seq[3:-2] to tuple for tup
t5,myList123,restAsList = repack(seq, 'T5L123,L*')

This is a tangent from a prime number tangent from ... er, so the implementation of
the above will be left as an exercise ;-)

def repack(seq, fmt='.*'):
    # ... no ;-)
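
For the curious, a deliberately partial sketch of the exercise in Python 3. It is an assumption-laden subset, not the full format described above: only '.' and a single '*' are understood, the starred part always comes back as a list, and the T/L prefixes and decimal counts are left out.

```python
def repack(seq, fmt='.*'):
    """Unpack seq according to fmt: '.' is one item, '*' is all the rest."""
    if set(fmt) - {'.', '*'} or fmt.count('*') > 1:
        raise ValueError("this sketch only supports '.' and one '*'")
    items = list(seq)
    if '*' not in fmt:
        if len(items) != len(fmt):
            raise ValueError("wrong number of items for format")
        return tuple(items)
    before = fmt.index('*')            # single items ahead of the star
    after = len(fmt) - before - 1      # single items behind the star
    if len(items) < before + after:
        raise ValueError("not enough items for format")
    end = len(items) - after
    return tuple(items[:before]) + (items[before:end],) + tuple(items[end:])

h, t = repack(range(5))            # head, *tail
print(h, t)                        # 0 [1, 2, 3, 4]
hb, last = repack(range(5), '*.')  # *head_and_body, last
print(hb, last)                    # [0, 1, 2, 3] 4
h, m, last = repack(range(5), '.*.')
print(h, m, last)                  # 0 [1, 2, 3] 4
```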

Regards,
Bengt Richter
 
Alex Martelli

David said:
|Nope. We're discussing situations where ONE pop from the start is
|needed; and you claimed I knew that in that case reversing the list
|and popping from the end was faster.

Nah. I believe the context of my first note in the thread made the
imagined situation obvious. The situation Greg described was where you
want to look (from the left) at one item at a time, then decide
whether/when to grab the next based on that one.

He was describing a commandverb / args separation. _One_ commandverb,
zero or more args to go with it.

That one-at-a-time-but-many-times seems to me the case where
'car,*cdr=thelist' is most useful. And for that, one-reverse-
plus-many-pops is the best current approach.

If you have a mapping from command-verb to number of req arguments
(with all further optional args to be left as the sequence), I
think slicing is in fact clearer:

command_verb = args[0]
num_req_args = numargs.get(command_verb, 0)
requiredargs = args[1:1+num_req_args]
other_args = args[1+num_req_args:]

I don't think any unpacking or popping, with or without reversing
thrown in, can compete with this clarity.
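
The slicing approach wrapped into a runnable sketch. The numargs table and the command names in it are hypothetical, as is the parse_command wrapper; the four assignments in the middle are the ones from the post.

```python
# Hypothetical table: command verb -> number of required arguments.
numargs = {'copy': 2, 'delete': 1}

def parse_command(cmdstring):
    """Split a command string into verb, required args and optional args."""
    args = cmdstring.split()
    command_verb = args[0]
    num_req_args = numargs.get(command_verb, 0)
    requiredargs = args[1:1 + num_req_args]
    other_args = args[1 + num_req_args:]
    return command_verb, requiredargs, other_args

print(parse_command("copy a.txt b.txt --force"))
# ('copy', ['a.txt', 'b.txt'], ['--force'])
print(parse_command("help"))
# ('help', [], [])
```

Because slices never raise IndexError, a command given with too few arguments simply yields a shorter requiredargs list, which the caller would still need to check.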

In most other use cases where it might initially seem that the
best idea is "consuming" the sequence (and thus that reversing
might be a clever idea), it turns out that just *iterating* on
the sequence (or an iter(...) thereof, to keep iteration state)
is in fact quite preferable.
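
A sketch of what iterating on an iter(...) looks like in practice, using Python 3. The switch names and the parse_switches function are hypothetical. The point is that the explicit iterator keeps the position, so the loop body can pull an extra item without popping or reversing anything.

```python
def parse_switches(args):
    """Collect switches and positional arguments in a single pass."""
    takes_value = {'-o', '-n'}   # hypothetical switches that take a value
    options, positional = {}, []
    it = iter(args)
    for arg in it:
        if arg in takes_value:
            # Pull the switch's value from the same iterator.
            options[arg] = next(it, None)
        elif arg.startswith('-'):
            options[arg] = True
        else:
            positional.append(arg)
    return options, positional

print(parse_switches(['-v', '-o', 'out.txt', 'in.txt']))
# ({'-v': True, '-o': 'out.txt'}, ['in.txt'])
```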

I quite concur with Alex that a '.reverse()' is not particularly
worthwhile to do just one '.pop()' (I guess it turned out to save a
couple milliseconds, but nothing important). The hermeneutics of the

_micro_seconds (for lists of several thousands of items).

thread aren't worth getting hung up on.

If you have no grounds to claim I made a mistake, then please don't
(and apologize if you think you have done so in error). If you do
think you can prove it, and I disagree, then it seems to me that
this disagreement IS "worth getting hung up on".

Of course, in Greg's case of parsing a dozen command-line switches, the
speed issue is not interesting. The whole '.reverse()' thing only
matters when you need to handle thousands of items (and *that*, I think,
Alex will admit he knows :)).

I do know (having read Knuth) that micro-optimizations should be ignored
perhaps 97% of the time, and can well guess that this case doesn't look
anywhere close to the remaining "perhaps 3%". I.e., that (like most
debates on performance) this one has little substance or interest.

Much more interesting might be to give the list.reverse method optional
start and end parameters to reverse in-place a contiguous slice of a
list -- the current alternative:
aux = somelist[start:end]
aux.reverse()
somelist[start:end] = aux
is just too sucky, substantially diminishing the interest of the
reverse method. But that, of course, is a totally unrelated subject.
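
For comparison, a sketch of the alternatives available without such parameters. reverse_slice is a hypothetical helper, not part of the list API; it swaps items pairwise so that no temporary copy of the slice is made.

```python
somelist = list(range(10))
start, end = 2, 6

# The three-step form from the post:
aux = somelist[start:end]
aux.reverse()
somelist[start:end] = aux
print(somelist)   # [0, 1, 5, 4, 3, 2, 6, 7, 8, 9]

# A one-line spelling; it still builds temporary copies of the slice.
somelist[start:end] = somelist[start:end][::-1]
print(somelist)   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

def reverse_slice(lst, start, end):
    """Hypothetical helper: reverse lst[start:end] in place by swapping."""
    i, j = start, end - 1
    while i < j:
        lst[i], lst[j] = lst[j], lst[i]
        i += 1
        j -= 1

reverse_slice(somelist, start, end)
print(somelist)   # [0, 1, 5, 4, 3, 2, 6, 7, 8, 9]
```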


Alex
 
David Mertz

|> The hermeneutics of the thread aren't worth getting hung up on.

|If you have no grounds to claim I made a mistake, then please don't
|(and apologize if you think you have done so in error).

Well, OK... if you want to get hung up on hermeneutics, the sequence
was:

(0) Greg used the example of command-line processing to argue for the
addition of 'car,*cdr=thelist' to Python (which Alex and I both
like too).

(1) I made the claim (well-known to Alex, but perhaps not to newbies
who might be reading) that reverse-with-many-pops is (much) faster
than many-pops-from-left.

(2) Alex wrote: "No, you're wrong", and posted a comparison of one
reverse with one left pop.

(3) I observed that such was the wrong issue. I don't think I
insinuated it was a -morally- wrong issue, just not the one I
was interested in.

(4) Alex seemed WAY oversensitive about said observation.

(5) I suggested, in a conciliatory mood, that we not get hung up on
he-said/she-said (hermeneutics).

(6) Alex demands an apology.

(7) I post this chronology.

Hopefully, the next part is (8) Alex chills out.

All the best, David...
 
Alex Martelli

David said:
|> The hermeneutics of the thread aren't worth getting hung up on.

|If you have no grounds to claim I made a mistake, then please don't
|(and apologize if you think you have done so in error).

Well, OK... if you want to get hung up on hermeneutics, the sequence
was:

(0) Greg used the example of command-line processing to argue for the
addition of 'car,*cdr=thelist' to Python (which Alex and I both
like too).

(1) I made the claim (well-known to Alex, but perhaps not to newbies
who might be reading) that reverse-with-many-pops is (much) faster
than many-pops-from-left.

You did not clearly express at that time that you had changed the
subject (without changing the Subject: header) from the subject
of star-unpacking to the one of "*MANY* pops". Thus, I think I
was entirely justified in continuing to address the subject in
the header (and that your expression of your intentions left a
lot to be desired). As you indicate you do not want to continue
discussing this, I will not; but it seems this is the crucial
point of disagreement, which I believe made your assertion about
having "found Alex in an outright mistake" (which you chose to
post as a comment to my assertion that both args.reverse() and
args.pop(0) are O(len(args)) -- an assertion that is anything
but a mistake) apparently justified in your opinion, and utterly
unjustified in mine. I _do_ make "outright mistakes" (why, just
the other day I hastily posted a list comprehension as
[x in seq] instead of [x for x in seq]!!!), and I apologize when
it happens. Having carefully re-read and examined this thread,
I am entirely convinced that this is not one such case, and not
happy at all that, despite having IN THAT VERY SAME POST admitted
that you had murkily been "changing the subject and not the Subject:"
("Admittedly, that's not directly part of what" we were discussing
is what you said!), you STILL appear convinced that no apology is
warranted about that "outright mistake" observation.

Hopefully, the next part is (8) Alex chills out.

Not to worry: I'm more than chilly, I'm _ICY_ with people who claim
I am wrong and fail to back off when I am convinced that my evidence
to the contrary is entirely satisfactory. At least, I think the
reason for my iciness has been made perfectly clear -- although I'm
still in the dark about why you still think your "outright mistake"
assertion was correct, I won't lose any sleep over that.


Alex
 
Ian McMeans

I've got to drop in my two cents :)
I wanted to see how clean I could make the algorithms look.

import random

def quicksort(lst):
    if len(lst) <= 1: return lst

    left = [x for x in lst if x < lst[0]]
    middle = [x for x in lst if x == lst[0]]
    right = [x for x in lst if x > lst[0]]

    return quicksort(left) + middle + quicksort(right)

>>> quicksort([random.randint(0, 20) for dummy in range(20)])
[0, 1, 2, 4, 4, 4, 4, 4, 6, 6, 8, 9, 12, 12, 13, 14, 14, 14, 16, 16]

Hopefully this is still nlogn :)

and then:

import itertools

def primes():
    p = {}
    for cand in itertools.count(2):
        if not [True for factor in p if not cand % factor]:
            p[cand] = None
            yield cand

>>> list(itertools.islice(primes(), 0, 20))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]
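
One possible variation on the generator, sketched for Python 3. any() stops at the first factor it finds instead of building a full list, and a plain list (named found here, a hypothetical name) replaces the dict whose values are never used. It is still trial division against every earlier prime, so this changes readability more than complexity.

```python
import itertools

def primes():
    """Yield primes by trial division against the primes found so far."""
    found = []
    for cand in itertools.count(2):
        if not any(cand % factor == 0 for factor in found):
            found.append(cand)
            yield cand

print(list(itertools.islice(primes(), 20)))
```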
 
David Eppstein

def quicksort(lst):
    if len(lst) <= 1: return lst

    left = [x for x in lst if x < lst[0]]
    middle = [x for x in lst if x == lst[0]]
    right = [x for x in lst if x > lst[0]]

    return quicksort(left) + middle + quicksort(right)

>>> quicksort([random.randint(0, 20) for dummy in range(20)])
[0, 1, 2, 4, 4, 4, 4, 4, 6, 6, 8, 9, 12, 12, 13, 14, 14, 14, 16, 16]

Hopefully this is still nlogn :)

Well, for random inputs it is. If you want it to be O(n log n) even for
sorted inputs you could change it a little:

def quicksort(lst):
    if len(lst) <= 1: return lst
    pivot = lst[random.randrange(len(lst))]

    left = [x for x in lst if x < pivot]
    middle = [x for x in lst if x == pivot]
    right = [x for x in lst if x > pivot]

    return quicksort(left) + middle + quicksort(right)
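
A quick way to try the random-pivot version in Python 3, checking it against the built-in sorted() on already-sorted and on random input. The sizes are arbitrary, and the function is repeated so the block runs on its own. With lst[0] as the pivot, an already-sorted list of this size would recurse once per element and run past Python's default recursion limit; with a random pivot the expected depth stays logarithmic.

```python
import random

def quicksort(lst):
    """Quicksort with a randomly chosen pivot (expected O(n log n))."""
    if len(lst) <= 1:
        return lst
    pivot = lst[random.randrange(len(lst))]
    left = [x for x in lst if x < pivot]
    middle = [x for x in lst if x == pivot]
    right = [x for x in lst if x > pivot]
    return quicksort(left) + middle + quicksort(right)

already_sorted = list(range(5000))
shuffled = [random.randint(0, 20) for dummy in range(20)]

print(quicksort(already_sorted) == already_sorted)
print(quicksort(shuffled) == sorted(shuffled))
```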
 
Fernando Perez

Alex said:
I would want the type of the rhs to be irrelevant, just as long as it's any
iterable whatsoever, and bundle the 'excess items' into ONE specified
kind of sequence. I also think it would be best for that one kind to be
tuple, by analogy to argument passing: you can use any iterable as the
actual argument 'expanded' by the *foo form, but whatever type foo
is, if the formal argument is *bar then in the function bar is a TUPLE --
period, simplest, no ifs, no buts. Maybe using a list instead might have
been better, but tuple was chosen, and I think we should stick to that
for unpacking, because the analogy with argument passing is a good
part of the reason the *foo syntax appeals to me (maybe for the same
reason we should at least for now forego the *foo appearing in any
spot but the last... though that _would_ be a real real pity...).

Agreed: a single convention (and following tuples is a good one, if nothing
else b/c it's the existing one) is probably the sanest solution. I hadn't
thought of generators and arbitrary iterables, partly b/c I still use pretty
much 'naked' python 2.1 for everything.

Cheers,

f.
 
