list.clear() missing?!?

Steven D'Aprano

Something this small doesn't need a PEP. I'll just send a note to
Guido asking for a pronouncement.

Raymond, if you're genuinely trying to help get this sorted in the
fairest, simplest way possible, I hope I speak for everyone when
I say thank you, your efforts are appreciated.

But there is this:
* the request is inane, the underlying problem is trivial, and the
relevant idiom is fundamental (api expansions should be saved for rich
new functionality and not become cluttered with infrequently used
redundant entries)

Is this sort of editorialising fair, or just a way of not-so-subtly
encouraging Guido to reject the whole idea, now and forever?

Convenience and obviousness are important for APIs -- that's why lists
have pop, extend and remove methods. The only difference I can see between
a hypothetical clear and these is that clear can be replaced with a
one-liner, while the others need at least two, e.g. for extend:

for item in seq:
    L.append(item)

Here is another Pro for your list:

A list.clear method will make deleting items from a list more OO,
consistent with almost everything else you do to lists, and less
procedural. This is especially true if clear() takes an optional index (or
two), allowing sections of the list to be cleared, not just the entire
list.
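A minimal sketch of what such a partial clear might look like, written as a plain function rather than a real list method. The name clear and the start/stop parameters are hypothetical, and treating a single index as "from here to the end" is just one possible reading of the proposal:

```python
def clear(items, start=None, stop=None):
    """Hypothetical helper: remove items[start:stop] in place."""
    # Omitted bounds default to None, which slices the whole list.
    del items[start:stop]


numbers = list(range(10))
alias = numbers                    # second name bound to the same list

clear(numbers, 2, 5)               # drop indices 2, 3 and 4
print(numbers)                     # [0, 1, 5, 6, 7, 8, 9]

clear(numbers)                     # no arguments: empty the whole list
print(numbers, alias is numbers)   # [] True
```

Because the helper deletes a slice instead of building a new list, every name bound to the list sees the change.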
 
Steven D'Aprano

Serious question: Should it work more like "s=[]" or more like
"s[:]=[]". I'm assuming the latter, but the fact that there is
a difference is an argument for not hiding this operation behind
some syntactic sugar.

Er, I don't see how it can possibly work like s = []. That merely
reassigns a new empty list to the name s, it doesn't touch the existing
list (which may or may not be garbage collected soon/immediately
afterwards).

As far as I know, it is impossible -- or at least very hard -- for an
object to know which namespaces it is in, so it can reassign one name but
not the others. Even if such a thing was possible, I think it is an
absolutely bad idea.
 
Alan Morgan

Serious question: Should it work more like "s=[]" or more like
"s[:]=[]". I'm assuming the latter, but the fact that there is
a difference is an argument for not hiding this operation behind
some syntactic sugar.

Er, I don't see how it can possibly work like s = []. That merely
reassigns a new empty list to the name s, it doesn't touch the existing
list (which may or may not be garbage collected soon/immediately
afterwards).

Right. I was wondering what would happen in this case:

s=[1,2,3]
t=s
s.clear()
t # [] or [1,2,3]??

If you know your basic python it is "obvious" what would happen
if you do s=[] or s[:]=[] instead of s.clear() and I guess it is
equally "obvious" which one s.clear() must mimic. I'm still not
used to dealing with mutable lists.

Alan
 
Peter Hansen

Alan said:
Right. I was wondering what would happen in this case:

s=[1,2,3]
t=s
s.clear()
t # [] or [1,2,3]??

If you know your basic python it is "obvious" what would happen
if you do s=[] or s[:]=[] instead of s.clear() and I guess it is
equally "obvious" which one s.clear() must mimic. I'm still not
used to dealing with mutable lists.

If you know your basic python :), you know that s[:] = [] is doing the
only thing that s.clear() could possibly do, which is changing the
contents of the list which has the name "s" bound to it (and which might
have other names bound to it, just like any object in Python). It
*cannot* be doing the same as "s=[]" which does not operate on the list
but creates an entirely new one and rebinds the name "s" to it.

The only possible answer for your question above is "t is s" and "t ==
[]" because you haven't rebound the names.
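A short sketch of the two spellings side by side may make the difference concrete. It uses only slice assignment and plain rebinding, so it does not depend on any clear() method existing:

```python
s = [1, 2, 3]
t = s               # t and s are two names for one list object

s[:] = []           # mutate that object in place
print(t, t is s)    # [] True

s = [1, 2, 3]
t = s
s = []              # rebind s to a brand-new list; the old one is untouched
print(t, t is s)    # [1, 2, 3] False
```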

-Peter
 
Peter Hansen

Steven said:
Convenience and obviousness are important for APIs -- that's why lists
have pop, extend and remove methods. The only difference I can see between
a hypothetical clear and these is that clear can be replaced with a
one-liner, while the others need at least two, e.g. for extend:

for item in seq:
    L.append(item)

It's not even clear that extend needs two lines:
>>> s = range(5)
>>> more = list('abc')
>>> s[:] = s + more
>>> s
[0, 1, 2, 3, 4, 'a', 'b', 'c']

Okay, it's not obvious, but I don't think s[:]=[] is really any more
obvious as a way to clear the list.

Clearly .extend() needs to be removed from the language as it is an
unnecessary extension to the API using slicing, which everyone should
already know about...

-Peter
 
Alan Morgan

Alan said:
Right. I was wondering what would happen in this case:

s=[1,2,3]
t=s
s.clear()
t # [] or [1,2,3]??

If you know your basic python it is "obvious" what would happen
if you do s=[] or s[:]=[] instead of s.clear() and I guess it is
equally "obvious" which one s.clear() must mimic. I'm still not
used to dealing with mutable lists.

If you know your basic python :), you know that s[:] = [] is doing the
only thing that s.clear() could possibly do,

Ah, but if you know your basic python then you wouldn't be looking for
s.clear() in the first place; you'd just use s[:]=[] (or s=[], whichever
is appropriate). IOW, the people most in need of s.clear() are those
least likely to be able to work out what it is actually doing.
Personally, it seems more *reasonable* to me, a novice python user,
for s.clear() to behave like s=[] (or, perhaps, for s=[] and s[:]=[] to
mean the same thing). The fact that it can't might be an argument for
not having it in the first place.

Alan
 
Terry Reedy

Peter Hansen said:
It's not even clear that extend needs two lines:
>>> s = range(5)
>>> more = list('abc')
>>> s[:] = s + more
>>> s
[0, 1, 2, 3, 4, 'a', 'b', 'c']

This is not the same as list.extend because it makes a separate
intermediate list instead of doing the extension completely in place.
However, the following does mimic .extend.
>>> s = range(5)
>>> s[len(s):] = list('abc')
>>> s
[0, 1, 2, 3, 4, 'a', 'b', 'c']

So, at the cost of writing and calling len(s), you are correct that .extend
is not necessary.
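The comparison can be sketched as follows. This version assumes a current Python 3, where range() is lazy and has to be wrapped in list(); the variable names are illustrative only:

```python
more = list('abc')

s = list(range(5))
combined = s + more          # builds a separate intermediate list
print(combined is s)         # False

s[len(s):] = more            # assigns to the empty slice at the end, in place

t = list(range(5))
t.extend(more)               # the method the slice assignment mimics

print(s)                     # [0, 1, 2, 3, 4, 'a', 'b', 'c']
print(s == t)                # True
```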

Terry Jan Reedy
 
Dan Christensen

Raymond Hettinger said:
Felipe said:
I love benchmarks, so as I was testing the options, I saw something very
strange:

$ python2.4 -mtimeit 'x = range(100000); '
100 loops, best of 3: 6.7 msec per loop
$ python2.4 -mtimeit 'x = range(100000); del x[:]'
100 loops, best of 3: 6.35 msec per loop

Why is the first benchmark the slowest? I don't get it... could someone
test this, too?

It is an effect of the memory allocator and fragmentation. The first
builds up a list with increasingly larger sizes.

I don't see what you mean by this. There are many lists all of
the same size. Do you mean some list internal to the memory
allocator?

It periodically
cannot grow in-place because something is in the way (some other
object) so it needs to move its current entries to another, larger
block and grow from there. In contrast, the other entries are reusing
the previously cleared out large block.

Just for grins, replace the first with:
'x=None; x=range(100000)'
The assignment to None frees the reference to the previous list and
allows it to be cleared so that its space is immediately available to
the new list being formed by range().

It's true that this runs at the same speed as the del variants on my
machine. That's not too surprising to me, but I still don't
understand why the del variants are more than 5% faster than the first
version.

Once this is understood, is it something that could be optimized?
It's pretty common to rebind a variable to a new value, and if
this could be improved 5%, that would be cool. But maybe it
wouldn't affect anything other than such micro benchmarks.
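For anyone who wants to repeat the measurement from a script instead of the command line, here is a rough sketch using the timeit module. It assumes Python 3.5 or later (for the globals argument) and wraps range() in list(); the labels and run counts are arbitrary choices, and the numbers will vary by machine and interpreter:

```python
import timeit

N = 100000   # list size used in the thread
RUNS = 20    # kept small so the sketch finishes quickly

variants = {
    "rebind only": "x = list(range(N))",
    "del x[:] afterwards": "x = list(range(N)); del x[:]",
    "x = None first": "x = None; x = list(range(N))",
}

results = {}
for label, stmt in variants.items():
    # Best of three repeats, each executing the statement RUNS times.
    timings = timeit.repeat(stmt, globals={"N": N}, number=RUNS, repeat=3)
    results[label] = min(timings)

for label, seconds in results.items():
    print(f"{label:>20}: {seconds / RUNS * 1000:.3f} msec per loop")
```

Differences of a few percent are well within the noise of a quick run like this, so it is a way to poke at the effect rather than to settle it.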

Dan
 
Fredrik Lundh

Peter said:
And yet it doesn't appear to be in the tutorial.

oh, please.

slices are explained in the section on strings, and in the section on lists,
and used to define the behaviour of the list methods in the second section
on lists, ...

I could have missed it, but I've looked in a number of the obvious places

http://docs.python.org/tut/node5.html#SECTION005140000000000000000

section 3.1.2 contains an example that shows how to remove stuff from a list,
in place.

if you want a clearer example, please consider donating some of your time
to the pytut wiki:

http://pytut.infogami.com/

</F>
 
Fredrik Lundh

Peter said:
It's not even clear that extend needs two lines:
>>> s = range(5)
>>> more = list('abc')
>>> s[:] = s + more
>>> s
[0, 1, 2, 3, 4, 'a', 'b', 'c']

Okay, it's not obvious, but I don't think s[:]=[] is really any more
obvious as a way to clear the list.

Clearly .extend() needs to be removed from the language as it is an
unnecessary extension to the API using slicing

you just flunked the "what Python has to do to carry out a certain operation"
part of the "how Python works, intermediate level" certification.

</F>
 
Duncan Booth

Peter said:
* learning slices is basic to the language (this lesson shouldn't be
skipped)

And yet it doesn't appear to be in the tutorial. I could have missed
it, but I've looked in a number of the obvious places, without
actually going through it (again) from start to finish. Also,
googling for "slice site:docs.python.org", you have to go to the
*sixth* entry before you can find the first mention of "del x[:]" and
what it does. I think given the current docs it's possible to learn
all kinds of things about slicing and still not make the non-intuitive
leap that "del x[slice]" is actually how you spell "delete contents of
list in-place".

Looking in the 'obvious' place in the Tutorial, section 5.1 'More on
Lists' I found in the immediately following section 5.2 'The del
statement':
There is a way to remove an item from a list given its index instead
of its value: the del statement. Unlike the pop() method which
returns a value, the del keyword is a statement and can also be used
to remove slices from a list (which we did earlier by assignment of an
empty list to the slice).

The 'earlier showing assignment of an empty list to a slice' is a reference
to section 3.1.4 'Lists':
Assignment to slices is also possible, and this can even change the
size of the list:

>>> # Replace some items:
... a[0:2] = [1, 12]
>>> a
[1, 12, 123, 1234]
>>> # Remove some:
... a[0:2] = []
>>> a
[123, 1234]

Both of these talk about ways to remove slices from a list. Perhaps the
wording could be clearer to make it obvious that they can also be used to
clear a list entirely (using the word 'clear' would certainly help people
Googling for the answer). So maybe 'this can even change the size of the
list or clear it completely' would be a good change for 3.1.4.
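To illustrate the suggested wording, a small sketch that carries the tutorial's own example one step further, from removing part of a list to clearing it entirely, with both spellings:

```python
a = [1, 12, 123, 1234]
a[0:2] = []      # remove a slice by assigning an empty list to it
print(a)         # [123, 1234]
a[:] = []        # the slice now covers everything, so the list is cleared
print(a)         # []

b = [1, 12, 123, 1234]
del b[0:2]       # the same removal with the del statement
print(b)         # [123, 1234]
del b[:]         # and the same complete clear
print(b)         # []
```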
 
Duncan Booth

Raymond said:
also, a clear method would simply clear the entire list. You still need to
learn the assigning to/deleting slices technique any time you want to clear
out part of a list.

Every so often I still get an "oh, I didn't know Python could do *that*"
moment; I just had one now:
>>> s = range(10)
>>> s[::2] = reversed(s[::2])
>>> s
[8, 1, 6, 3, 4, 5, 2, 7, 0, 9]

I've no idea when I might need it, but it just never occurred to me before
that you can also assign/del non-contiguous slices.

The symmetry does break down a bit here as assigning to an extended slice
only lets you assign a sequence of the same length as the slice, so you
can't delete an extended slice by assignment, only by using del.
>>> s = range(10)
>>> del s[::2]
>>> s
[1, 3, 5, 7, 9]
>>> s = range(10)
>>> s[::2] = []

Traceback (most recent call last):
  File "<pyshell#19>", line 1, in -toplevel-
    s[::2] = []
ValueError: attempt to assign sequence of size 0 to extended slice of size 5
 
Georg Brandl

Both of these talk about ways to remove slices from a list. Perhaps the
wording could be clearer to make it obvious that they can also be used to
clear a list entirely (using the word 'clear' would certainly help people
Googling for the answer). So maybe 'this can even change the size of the
list or clear it completely' would be a good change for 3.1.4.

I added two examples of clearing a list to the section about slice assignment
and del.

Georg
 
Steven D'Aprano

also, a clear method would simply clear the entire list. You still need to
learn the assigning to/deleting slices technique any time you want to clear
out part of a list.

A clear method does not *necessarily* clear only the entire list. That's
a choice that can be made. I for one would vote +1 for clear taking an
optional index or slice, or two indices, to clear only a part of the list.
 
Steven D'Aprano

Peter said:
It's not even clear that extend needs two lines:
>>> s = range(5)
>>> more = list('abc')
>>> s[:] = s + more
>>> s
[0, 1, 2, 3, 4, 'a', 'b', 'c']

Okay, it's not obvious, but I don't think s[:]=[] is really any more
obvious as a way to clear the list.

Clearly .extend() needs to be removed from the language as it is an
unnecessary extension to the API using slicing

you just flunked the "what Python has to do to carry out a certain operation"
part of the "how Python works, intermediate level" certification.

So implementation details are part of the language now?

Out of curiosity, are those things that Python has to do the same for
CPython, Jython, IronPython and PyPy?

Personally, I think it is a crying shame that we're expected to be experts
on the specific internals of the Python interpreter before we're allowed
to point out that "only one obvious way to do it" just is not true, no
matter what the Zen says.
>>> import sys
>>> L = []
>>> L.append(0)
>>> L[:] = L + [1]
>>> L[:] += [2]
>>> L[len(L):] = [3]
>>> L.__setslice__(len(L), -1, [4])
>>> L.__setslice__(sys.maxint, sys.maxint, [5])
>>> L += [6]
>>> L
[0, 1, 2, 3, 4, 5, 6]

That's at least seven ways to do an append, and it is a double-standard to
declare that slice manipulation is the One and Only True obvious way
to clear a list, but slice manipulation is not obvious enough for
appending to a list.
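The session above relies on __setslice__ and sys.maxint, both of which were removed in Python 3. As a rough sketch, the same point can be made on a current Python 3 with the following variants; the __setitem__ call and the insert() call are substitutes chosen here for the two __setslice__ lines, not part of the original list:

```python
L = []
L.append(0)
L[:] = L + [1]
L[:] += [2]                               # copy, extend the copy, assign it back
L[len(L):] = [3]
L.__setitem__(slice(len(L), None), [4])   # stands in for __setslice__
L.insert(len(L), 5)                       # inserting at the end appends
L += [6]
print(L)                                  # [0, 1, 2, 3, 4, 5, 6]
```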

No doubt under the hood, these seven ways are implemented differently.
They certainly differ in their obviousness, and I'm willing to bet that
practically nobody thinks that the slicing methods are more obvious than
append. Perhaps we're not Dutch. I daresay one method is better, faster,
or more efficient than the others, but remember the warning against
premature optimisation.

Whenever I see "only one obvious way to do it" being used as a knee-jerk
excuse for rejecting any change, my eyes roll. Nobody wants Python to
turn into Perl plus the kitchen sink, but it isn't as if Python is some
absolutely minimalist language with two objects and three functions. There
is no shortage of "more than one way to do it" convenience methods,
functions and even entire modules. And that's why Python is such a fun,
easy to use language: because most things in Python are just convenient.
When you want to append to a list, or insert into a list, you don't have
to muck about with slices, you call the obvious list method.

And so it should be for clearing all or part of a list.
 
Peter Hansen

Alan said:
Ah, but if you know your basic python then you wouldn't be looking for
s.clear() in the first place; you'd just use s[:]=[] (or s=[], whichever
is appropriate).

One of the very first things newcomers learn (I believe, though I don't know
how soon the tutorial teaches it) is to use "dir()" on objects. The
clear() method would show up there and quickly attract attention.
Neither of the other techniques is likely to be discovered as quickly.
(Well, okay, s=[] would be, I'm sure, but to many newcomers that might
"feel wrong" as the way to empty out a list, but then we're back to
wondering how often there's really a usecase for doing so.)
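The discoverability argument is easy to try out. The sketch below lists the public names that dir() reports for list; it assumes Python 3.3 or later, where a list.clear() method does exist, so the last line would differ on the 2.x interpreters discussed in this thread:

```python
# Public (non-underscore) attributes a newcomer would see via dir().
public = [name for name in dir(list) if not name.startswith('_')]

print(public)
print('clear' in public)   # True on Python 3.3 and later
```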
Personally, it seems more *reasonable* to me, a novice python user,
for s.clear() to behave like s=[] (or, perhaps, for s=[] and s[:]=[] to
mean the same thing). The fact that it can't might be an argument for
not having it in the first place.

Then it's a good reason we had this thread, so you could learn something
*crucial* to understanding Python and writing non-buggy code: name
binding versus variables which occupy fixed memory locations like in
some other languages. This has to be by far the most frequent area that
newcomers trip up. But that's another story...
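One place this trips people up is function arguments. A minimal sketch, with two hypothetical helper functions, of why rebinding a parameter does nothing for the caller while slice assignment does:

```python
def reset_by_rebinding(items):
    items = []        # rebinds the local name only; caller's list untouched


def reset_in_place(items):
    items[:] = []     # mutates the list object the caller also refers to


data = [1, 2, 3]

reset_by_rebinding(data)
after_rebinding = list(data)   # snapshot for comparison
print(after_rebinding)         # [1, 2, 3]

reset_in_place(data)
print(data)                    # []
```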

-Peter
 
Peter Hansen

Duncan said:
Peter Hansen wrote:
Looking in the 'obvious' place in the Tutorial, section 5.1 'More on
Lists' I found in the immediately following section 5.2 'The del
statement':

I saw that section too, but was scanning for any example of wiping out
the whole list. As you point out, it's not mentioned. I don't think
there's even an example of slicing with no arguments [:] for copying a
list (i.e. on the right side of the =), and it's reasonable to assume (I
originally did, as I recall) that this would be some kind of error...

Both of these talk about ways to remove slices from a list. Perhaps the
wording could be clearer to make it obvious that they can also be used to
clear a list entirely (using the word 'clear' would certainly help people
Googling for the answer). So maybe 'this can even change the size of the
list or clear it completely' would be a good change for 3.1.4.

This is quite true. After all, who imagines when offered a "slice of
cake" that a slice might be the entire thing! The concept of "slice" in
English strongly implies a *subset*, not the whole, so if we're not
going to get a .clear() method, I do believe that the various uses of
[:] should be much more explicitly pointed out in the docs. At least
we'd have a ready place to point to in the tutorial, instead of this
discussion cropping up every month.

-Peter
 
Fredrik Lundh

Peter said:
One of very first things newcomers learn (I believe, though I don't know
how soon the tutorial teaches it)

let's see. lists are introduced on page 19, a more extensive discussion of lists is
found on page 33, the del statement appears on page 34, and the dir() function
is introduced on page 46.

(page numbers from the current pytut PDF copy)

</F>
 
Mel Wilson

Alan said:
* s.clear() is more obvious in intent

Serious question: Should it work more like "s=[]" or more like
"s[:]=[]". I'm assuming the latter, but the fact that there is
a difference is an argument for not hiding this operation behind
some syntactic sugar.


It has to work more like s[:]=[]

It's not easy for an object method to bind a whole different
object to the first object's name. Generally only
statements can affect namespaces (hence the uses of del that
everyone remembers.)
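A small sketch of why that is, using a hypothetical list subclass named Demo. Assigning to self inside a method only rebinds a local name, so the one thing a method can usefully do is change the object in place:

```python
class Demo(list):
    def rebind_self(self):
        self = []        # only rebinds the local name 'self'

    def empty_in_place(self):
        del self[:]      # mutates the object that every name refers to


s = Demo([1, 2, 3])
t = s

s.rebind_self()
after_rebind = list(t)   # snapshot for comparison
print(after_rebind)      # [1, 2, 3]

s.empty_in_place()
print(t, t is s)         # [] True
```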

Mel.
 
Mel Wilson

Steven said:
Convenience and obviousness are important for APIs -- that's why lists
have pop, extend and remove methods. The only difference I can see between
a hypothetical clear and these is that clear can be replaced with a
one-liner, while the others need at least two, e.g. for extend:

for item in seq:
    L.append(item)

Both extend and append have one-line slice equivalents,
except that the equivalents have to keep referring to
the length of the list (and so have to keep finding the
len function).

Mel.
 
