Why Python does SLICING the way it does??

Antoon Pardon · Apr 21, 2005

Op 2005-04-20 said:
That's true of course. It's more likely to show up in manipulating
lists or strings. And Python provides a much richer environment for
processing strings, so one has to deal with explicit indexing much
less.

But I still think that I make fewer errors per instance of dealing with
intervals. It's rare that I even have to think about it much when
writing such a thing. Negative indexing also helps a lot.

I'm ambivalent about negative indexes. They help a lot, but can
be annoying a lot too. IMO they detract from the "it's easier to
be forgiven than to get permission" style of programming.

It happens rather regularly that I need to do some calculations
and if the start conditions were good, I get a valid index for
a list and otherwise I get an invalid index. From this specification
the following seems a natural way to program

try:
    index = calculate(...)
    lst[index] = ...
    ...
except IndexError:
    ...

But of course this doesn't work because a negative index in this
case is an invalid index but python allows it.
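
The pitfall can be reproduced in a few lines. This is only a minimal sketch: `calculate` and `store` are hypothetical names standing in for the elided computation, and the explicit range check is one possible workaround, not something proposed in the thread.

```python
def calculate(start):
    """Hypothetical computation: yields a negative number when the
    start condition is bad."""
    return start - 3


def store(lst, index, value):
    """Assign lst[index] = value, treating negative indices as invalid."""
    if not 0 <= index < len(lst):
        raise IndexError("index out of range: %d" % index)
    lst[index] = value


lst = [10, 20, 30, 40]

# Plain indexing: the "invalid" index -1 is silently accepted and
# overwrites the last element instead of raising IndexError.
silent = list(lst)
try:
    silent[calculate(2)] = 99
    raised_plain = False
except IndexError:
    raised_plain = True

# With the explicit check the same index raises, so the
# ask-forgiveness pattern behaves as intended.
checked = list(lst)
try:
    store(checked, calculate(2), 99)
    raised_checked = False
except IndexError:
    raised_checked = True
```

The cost of the workaround is that the check has to be repeated (or wrapped) wherever a computed index is used.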

I sometimes think python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]

This would make accessing list elements counted from the rear
almost just as easy as it is now but wouldn't interfere with
the ask forgiveness programming style.

Ron · Apr 21, 2005

Many people I know ask why Python does slicing the way it does.....

Can anyone /please/ give me a good defense/justification???

I'm referring to why mystring[:4] gives me
elements 0, 1, 2 and 3 but *NOT* mystring[4] (5th element).

Many people don't like the idea that the 5th element is not invited.

(BTW, yes I'm aware of the explanation where slicing
is shown to involve slices _between_ elements. This
doesn't explain why this is the *best* way to do it.)

Chris

Hi Chris,

What I've found is that forward slicing with positive stepping is very
convenient for a lot of things.

But when you start trying to use reverse steps, it can get tricky.

There are actually 4 different ways to slice and dice, so we have a
pretty good choice. The trick is to match the slice method to what
you need, and also to use the correct indexes for that method.

Where s = 'abcd'
With s[i,j]

Forward slice index (+steps)
a, b, c, d
i= 0, 1, 2, 3
j= 1, 2, 3, 4

s[0,4] = 'abcd'
s[1,3] = 'bc'

Forward slice index (-steps)
a, b, c, d
i= 0, 1, 2, 3
j= -5, -4, -3, -2

s[3,-5] = 'dcba'
s[2,-4] = 'cb'

Reverse slice index (+steps)
a, b, c, d
i= -4, -3, -2, -1
j= 1, 2, 3, 4

s[-4,4] = 'abcd'
s[-3,3] = 'bc'

Reverse slice index (-steps)
a, b, c, d
i= -4, -3, -2, -1
j= -5, -4, -3, -2

s[-1,-5] = 'dcba'
s[-2,-4] = 'cb'

(Maybe this could be made a little more symmetrical for Python 3000?)

Cheers,
Ron_Adam

Paul Rubin · Apr 21, 2005

Antoon Pardon said:
I sometimes think python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]

I like this. I don't know how many times I've gotten screwed by
wanting the n'th element from the last for variable n, and saying
"lst[-n]" then realizing I have a bug when n=0. lst[$-n] works perfectly.

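The n=0 trap is easy to reproduce. The sketch below assumes the expression in question is `lst[-n]`; `nth_from_last` is a hypothetical helper that spells out what `lst[$ - n]` would mean using `len()`. Note that the two forms count differently: with negative indices the last element is n=1, with the `$` form it is n=0.

```python
lst = ['a', 'b', 'c', 'd']


def nth_from_last(seq, n):
    """Return the element n places before the last one (n = 0 is the last).

    Equivalent to the proposed lst[$ - n], written with len().
    """
    return seq[len(seq) - 1 - n]


# Counting with negative indices works for n >= 1 ...
assert lst[-1] == 'd'
# ... but -0 == 0, so n = 0 silently selects the first element.
assert lst[-0] == 'a'

# The explicit form has no special case at n = 0.
assert nth_from_last(lst, 0) == 'd'
assert nth_from_last(lst, 1) == 'c'

# The same trap exists for slices: "the last n items" with n = 0.
assert lst[-0:] == lst              # whole list, not an empty one
assert lst[len(lst) - 0:] == []     # explicit form gives the empty list
```
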
Antoon Pardon · Apr 21, 2005

Op 2005-04-21 said:
[Antoon Pardon]

I don't see why the start index can't be accessible through
a method or function just like the length of a list is now.

My favourite would be a range method so we would have
the following idiom:

for i in lst.range():
    do something with lst[i]

After going to all that trouble, you might as well also get the value at that
position:

for i, x in enumerate(lst):
    do something with lst[i] also known as x

No you wouldn't, enumerate always starts with 0.

So if you write a class with list-like behaviour except that the
start index can be different from 0, enumerate is useless
because lst[i] won't be x in that case.

Raymond Hettinger · Apr 21, 2005

[Antoon Pardon]

I don't see why the start index can't be accessible through
a method or function just like the length of a list is now.

My favourite would be a range method so we would have
the following idiom:

for i in lst.range():
    do something with lst[i]

After going to all that trouble, you might as well also get the value at that
position:

for i, x in enumerate(lst):
    do something with lst[i] also known as x

No you wouldn't, enumerate always starts with 0.

You don't get it. Your proposed list-like class indicates its start index.
enumerate() can be made to detect that start value so that the above code always
works for both 0-based and 1-based arrays.
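
A rough sketch of that idea, under stated assumptions: in current Python the built-in `enumerate()` accepts a `start` argument, and the names `BasedList`, `first`, and `based_enumerate` below are hypothetical, invented only for illustration.

```python
class BasedList(list):
    """Hypothetical list-like class that advertises its start index."""

    def __init__(self, first, items=()):
        list.__init__(self, items)
        self.first = first

    def __getitem__(self, i):
        # Integer indices only; translate to the underlying 0-based list.
        return list.__getitem__(self, i - self.first)


def based_enumerate(seq):
    """Enumerate seq starting from seq.first when it has one, else 0."""
    return enumerate(seq, start=getattr(seq, 'first', 0))


one_based = BasedList(1, ['x', 'y', 'z'])
pairs = list(based_enumerate(one_based))   # indices run 1, 2, 3
plain = list(based_enumerate(['x', 'y']))  # ordinary list: 0, 1
```

With this wrapper, `one_based[i]` and `x` agree for every `(i, x)` pair, which is the property the loop relies on.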

Raymond Hettinger

Antoon Pardon · Apr 21, 2005

Op 2005-04-21 said:
[Antoon Pardon]
I don't see why the start index can't be accessible through
a method or function just like the length of a list is now.

My favourite would be a range method so we would have
the following idiom:

for i in lst.range():
    do something with lst[i]

After going to all that trouble, you might as well also get the value at that
position:

for i, x in enumerate(lst):
    do something with lst[i] also known as x

No you wouldn't, enumerate always starts with 0.

You don't get it. Your proposed list-like class indicates its start index.
enumerate() can be made to detect that start value so that the above code always
works for both 0-based and 1-based arrays.

Oh, you mean if it were made a built-in class.

Personally I would still prefer my range solution. I often find
enumerate gives me too much. Often enough I want to assign new values
to the elements in the list, and I have no need for the old value
that enumerate also provides.

Reinhold Birkenfeld · Apr 21, 2005

Antoon said:
I sometimes think python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]

This would make accessing list elements counted from the rear
almost just as easy as it is now but wouldn't interfere with
the ask forgiveness programming style.

How would you pass this argument to __getitem__?
What would be allowed, only '$-x' or also '$+x' or what else?
What type would '$' be?

Reinhold

Steve Holden · Apr 21, 2005

Terry Hancock wrote:

So I like Python's slicing because it "bites *less*" than intervals
in C or Fortran.

I disagree. Programming languages should not needlessly surprise
people, and a newbie to Python probably expects that x[1:3] =
[x[1],x[2],x[3]] . Array-oriented languages, such as Fortran 90/95,
Matlab/Octave/Scilab, and S-Plus/R do not follow the Python convention,
and I don't know of Fortran or R programmers who complain (don't follow
Matlab enough to say). There are Python programmers, such as the OP and
me, who don't like the Python convention. What languages besides Python
use the Python slicing convention?

The principle of least surprise is all very well, but "needless surprise
of newbies" is a dangerous criterion to adopt for programming language
design and following it consistently would lead to a mess like Visual
Basic, which grew by accretion until Microsoft realized it was no longer
tenable and broke backward compatibility.

Along the same lines, I think the REQUIREMENT that x[0] rather than
x[1] be the first element of list x is a mistake. At least the
programmer should have a choice, as in Fortran or VBA. In C starting at
0 may be justified because of the connection between array subscripting
and pointer arithmetic, but Python is a higher-level language where
such considerations are less relevant.

But Pythonicity required that there should be one obvious way to do
something. How obvious is having two ways? Then when you read code you
would continually be asking yourself "is this a one-based or a
zero-based structure?", which is not a profitable use of time.

regards
Steve

Antoon Pardon · Apr 21, 2005

Op 2005-04-21 said:
Antoon said:

I sometimes think python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]

This would make accessing list elements counted from the rear
almost just as easy as it is now but wouldn't interfere with
the ask forgiveness programming style.

How would you pass this argument to __getitem__?

Well, assuming lst.last was the last index of lst, __getitem__
would get lst.last and lst.last - 1 passed.

What would be allowed, only '$-x' or also '$+x' or what else?

Any expression where an int is allowed.

What type would '$' be?

It would be an int.

Antoon Pardon · Apr 21, 2005

Op 2005-04-21 said:
Terry Hancock wrote:

So I like Python's slicing because it "bites *less*" than intervals
in C or Fortran.

I disagree. Programming languages should not needlessly surprise
people, and a newbie to Python probably expects that x[1:3] =
[x[1],x[2],x[3]] . Array-oriented languages, such as Fortran 90/95,
Matlab/Octave/Scilab, and S-Plus/R do not follow the Python convention,
and I don't know of Fortran or R programmers who complain (don't follow
Matlab enough to say). There are Python programmers, such as the OP and
me, who don't like the Python convention. What languages besides Python
use the Python slicing convention?

The principle of least surprise is all very well, but "needless surprise
of newbies" is a dangerous criterion to adopt for programming language
design and following it consistently would lead to a mess like Visual
Basic, which grew by accretion until Microsoft realized it was no longer
tenable and broke backward compatibility.

Along the same lines, I think the REQUIREMENT that x[0] rather than
x[1] be the first element of list x is a mistake. At least the
programmer should have a choice, as in Fortran or VBA. In C starting at
0 may be justified because of the connection between array subscripting
and pointer arithmetic, but Python is a higher-level language where
such considerations are less relevant.

But Pythonicity required that there should be one obvious way to do
something. How obvious is having two ways?

How obvious is it that lists can be any length? Do you consider it
an unbounded number of ways, that lists can be any length?

Like users have a choice in how long they make a list, they
should have a choice where the indexes start. (And that
shouldn't be limited to 0 and 1).

Then when you read code you
would continually be asking yourself "is this a one-based or a
zero-based structure?", which is not a profitable use of time.

No you wouldn't. If you have the choice you just take the start
index that is more natural. Sometimes that is 0 sometimes that
is 1 and other times it is some whole other number. The times
I had the opportunity to use such structures, the question of
whether it was zero-based or one-based, rarely popped up.
Either it was irrelevant or it was clear from what you were
processing.

That you are forced to use zero-based structures, while the
problem space you are working on uses one-based structures
is a far bigger stumbling block where you continually have
to be aware that the indexes in your program are one off
from the indexes the problem is expressed in.

The one obvious way is to use the same index scheme as the
one that is used in the specification or problem setting.
Not to use always zero no matter what.

Reinhold Birkenfeld · Apr 21, 2005

Antoon said:
Op 2005-04-21 said:

Antoon said:

I sometimes think python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]

This would make accessing list elements counted from the rear
almost just as easy as it is now but wouldn't interfere with
the ask forgiveness programming style.

How would you pass this argument to __getitem__?

Well assuming lst.last, was the last index of lst, __getitem__
would get lst.last and lst.last - 1 passed.

Then it would be an alias for len(lst)-1 ?

Any expression where an int is allowed.
Okay.

It would be an int.

Where would it be allowed? Only in subscriptions?

Reinhold

Antoon Pardon · Apr 21, 2005

Op 2005-04-21 said:
Antoon said:

Op 2005-04-21 said:

Antoon Pardon wrote:

I sometimes think python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]

This would make accessing list elements counted from the rear
almost just as easy as it is now but wouldn't interfere with
the ask forgiveness programming style.

How would you pass this argument to __getitem__?

Well assuming lst.last, was the last index of lst, __getitem__
would get lst.last and lst.last - 1 passed.

Then it would be an alias for len(lst)-1 ?

In the context of current python lists, yes. But if you would
go further and allow lists to start from another index than
0, then not.

Where would it be allowed? Only in subscriptions?

Yes, the idea would be that the brackets indicate a
scope where $ would be the last index and ^ would
be the first index. So if you wanted the middle
element you could do: lst[(^ + $) // 2]

But outside the brackets, the scope where this
has any meaning wouldn't be present.
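
Something close to this can be approximated in today's Python without new syntax. The following is only a sketch under assumptions: `FIRST`, `LAST`, `_Marker`, and `StrictList` are hypothetical names, and the markers are sentinel objects rather than the plain ints described above, because ordinary Python has no bracket-local scope in which `$` could be resolved.

```python
class _Marker(object):
    """Symbolic index: 'first' or 'last' plus an integer offset."""

    def __init__(self, anchor, offset=0):
        self.anchor = anchor
        self.offset = offset

    def __add__(self, k):
        return _Marker(self.anchor, self.offset + k)

    def __sub__(self, k):
        return _Marker(self.anchor, self.offset - k)


FIRST = _Marker('first')   # stands in for '^'
LAST = _Marker('last')     # stands in for '$'


class StrictList(list):
    """Hypothetical list that rejects negative integer indices.

    Only integers and markers are handled; slices are not supported.
    """

    def _resolve(self, index):
        if isinstance(index, _Marker):
            base = 0 if index.anchor == 'first' else len(self) - 1
            index = base + index.offset
        if not 0 <= index < len(self):
            raise IndexError('index out of range')
        return index

    def __getitem__(self, index):
        return list.__getitem__(self, self._resolve(index))

    def __setitem__(self, index, value):
        list.__setitem__(self, self._resolve(index), value)


letters = StrictList('abcd')
last = letters[LAST]               # like lst[$]
next_to_last = letters[LAST - 1]   # like lst[$ - 1]
```

Because the markers are not ints, an expression such as `(FIRST + LAST) // 2` is not supported by this sketch; that is one place where it falls short of the proposal.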

Not that I think this idea has any chance. Already
I can hear people shout that this is too perlish.

But here is another idea.

Sometime ago I read about the possibility of python
acquiring a with statement. So that instead of having
to write:

obj.inst1 ...
obj.inst2 ...
obj.inst1 ...

You could write:

with obj:
    .inst1 ...
    .inst2 ...
    .inst1 ...

If this were implemented, we could think of a left bracket
as implicitly executing a with statement. If we then had a list-like
class where the start-index could be different from zero and
which had properties first and last indicating the first and last
index, we could then write something like:

lst[.last] for the last element or
lst[.last - 1] for the next to last element.
lst[.first] for the first element
lst[(.first + .last) // 2] for the middle element

Maybe this makes the proposal again pythonic enough to get
a little consideration.

Rocco Moretti · Apr 21, 2005

Steve said:
The principle of least surprise is all very well, but "needless surprise
of newbies" is a dangerous criterion to adopt for programming language
design and following it consistently would lead to a mess like Visual
Basic, which grew by accretion until Microsoft realized it was no longer
tenable and broke backward compatibility.

Well, *needless* surprise of newbies is never a good thing. If it were,
it wouldn't be needless, now would it?

Surprising newbies just to surprise newbies is just cruel, but there is
room in this world for "it may surprise you now, but you'll thank us
later" and for situations where there is a "newbie way" and an "other
way", and the "other" way is chosen because it's the easiest thing for
the most people in the long run.

But I agree, having "the easiest thing for newbies" as your sole
criterion for language design is a road to madness, for no other reason
than that newbies don't stay newbies forever.

Dan Bishop · Apr 21, 2005

Antoon said:
Op 2005-04-21 said:

(e-mail address removed) wrote: ....

Along the same lines, I think the REQUIREMENT that x[0] rather than
x[1] be the first element of list x is a mistake. At least the
programmer should have a choice, as in Fortran or VBA. In C starting at
0 may be justified because of the connection between array subscripting
and pointer arithmetic, but Python is a higher-level language where
such considerations are less relevant.

But Pythonicity required that there should be one obvious way to do
something. How obvious is having two ways?

How obvious is that lists can be any length? Do you consider it
an unbounded number of ways, that lists can be any length?

Like users have a choice in how long they make a list, they
should have a choice where the indexes start. (And that
shouldn't be limited to 0 and 1).

Suppose you could. Then what should

([3, 1, 4] indexbase 0) + ([1, 5, 9] indexbase 4)

equal?

That you are forced to use zero-based structures, while the
problem space you are working on uses one-based structures
is a far bigger stumbling block where you continually have
to be aware that the indexes in your program are one off
from the indexes the problem is expressed in.

Name a problem space that inherently requires arrays to be 1-based
rather than 0-based.

Paul Rubin · Apr 21, 2005

Dan Bishop said:
Name a problem space that inherently requires arrays to be 1-based
rather than 0-based.

"inherently" is too strong a word, since after all, we could do all
our computing with Turing machines.

Some algorithms are specified in terms of 1-based arrays. And most
Fortran programs are written in terms of 1-based arrays. So if you
want to implement a 1-based specification in Python, or write Python
code that interoperates with Fortran code, you either need 1-based
arrays in Python or else you need messy conversions all over your
Python code.

The book "Numerical Recipes in C" contains a lot of numerical
subroutines written in C, loosely based on Fortran counterparts from
the original Numerical Recipes book. The C routines are full of messy
conversions from 0-based to 1-based. Ugh.

Again, this (along with nested scopes and various other things) was
all figured out by the Algol-60 designers almost 50 years ago. In
Algol-60 you could just say "integer x(3..20)" and get a 3-based array
(I may have the syntax slightly wrong by now). It was useful and took
care of this problem.

Robert Kern · Apr 21, 2005

Paul said:
"inherently" is too strong a word, since after all, we could do all
our computing with Turing machines.

Some algorithms are specified in terms of 1-based arrays. And most
Fortran programs are written in terms of 1-based arrays. So if you
want to implement a 1-based specification in Python, or write Python
code that interoperates with Fortran code, you either need 1-based
arrays in Python or else you need messy conversions all over your
Python code.

I write Python code that interoperates with Fortran code all the time
(and write Fortran code that interoperates with Python code, too). Very,
very rarely do I have to explicitly do any conversions. They only show
up when a Fortran subroutine requires an index in its argument list.

In Fortran, I do Fortran. In Python, I do Python.

Yes, there is some effort required when translating some code or
pseudo-code that uses 1-based indexing. Having done this a number of
times, I haven't found it to be much of a burden.
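
As an illustration of the effort involved, here is a minimal sketch of translating a 1-based description into Python. The specification ("for i = 2 to n: a[i] = a[i] + a[i-1]", a running sum) is a made-up example, not taken from the thread, and both function names are hypothetical.

```python
def prefix_sums_one_based_spec(values):
    """Literal transcription: pad slot 0 so indices match the 1-based text."""
    a = [None] + list(values)      # a[1..n] hold the data, a[0] is unused
    n = len(values)
    for i in range(2, n + 1):      # "for i = 2 to n"
        a[i] = a[i] + a[i - 1]
    return a[1:]


def prefix_sums_zero_based(values):
    """Same algorithm rewritten with native 0-based indices."""
    a = list(values)
    for i in range(1, len(a)):
        a[i] += a[i - 1]
    return a
```

Both versions compute the same result; the second simply shifts every index by one and drops the padding slot.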

The book "Numerical Recipes in C" contains a lot of numerical
subroutines written in C, loosely based on Fortran counterparts from
the original Numerical Recipes book. The C routines are full of messy
conversions from 0-based to 1-based. Ugh.

I contend that if they had decided to just write the C versions as C
instead of C-wishing-it-were-Fortran, they would have made a much better
library. Still sucky, but that's another story.

Again, this (along with nested scopes and various other things) was
all figured out by the Algol-60 designers almost 50 years ago. In
Algol-60 you could just say "integer x(3..20)" and get a 3-based array
(I may have the syntax slightly wrong by now). It was useful and took
care of this problem.

There's nothing that stops you from writing a class that does this. I
believe someone posted such a one to this thread.

I have yet to see a concrete proposal on how to make lists operate like
this.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Ron · Apr 21, 2005

Ron said:
Many people I know ask why Python does slicing the way it does.....

Can anyone /please/ give me a good defense/justification???

I'm referring to why mystring[:4] gives me
elements 0, 1, 2 and 3 but *NOT* mystring[4] (5th element).

> There are actually 4 different ways to slice ....

Where s = 'abcd'
With s[i,j]

Forward slices index, forward steps
a, b, c, d
i= 0, 1, 2, 3
j= 1, 2, 3, 4

s[0,4] = 'abcd'
s[1,3] = 'bc'
.......

Minor correction to this. It's what I get for not realizing how late it
was.

Where s = 'abcd'
With s[i:j:step]

Positive slice index, (+1 step)
       a,  b,  c,  d
  i=   0,  1,  2,  3
  j=   1,  2,  3,  4

s[0:4] = 'abcd'
s[1:3] = 'bc'

Positive slice index, (-1 step)
       a,  b,  c,  d
  i=   0,  1,  2,  3
  j=  -5, -4, -3, -2

s[3:-5:-1] = 'dcba'
s[2:-4:-1] = 'cb'

Negative slice index, (+1 step)
       a,  b,  c,  d
  i=  -4, -3, -2, -1
  j=   1,  2,  3,  4

s[-4:4] = 'abcd'
s[-3:3] = 'bc'

Negative slice index, (-1 step)
       a,  b,  c,  d
  i=  -4, -3, -2, -1
  j=  -5, -4, -3, -2

s[-1:-5:-1] = 'dcba'
s[-2:-4:-1] = 'cb'
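
The corrected table can be checked directly in the interpreter. The last three assertions are additions for illustration, not part of the table: they show why the stop column uses -5 rather than -1 for reverse slices, and the shorter spelling with the stop omitted.

```python
s = 'abcd'

# Positive start and stop, step +1
assert s[0:4] == 'abcd'
assert s[1:3] == 'bc'

# Positive start, negative stop, step -1
assert s[3:-5:-1] == 'dcba'
assert s[2:-4:-1] == 'cb'

# Negative start, positive stop, step +1
assert s[-4:4] == 'abcd'
assert s[-3:3] == 'bc'

# Negative start and stop, step -1
assert s[-1:-5:-1] == 'dcba'
assert s[-2:-4:-1] == 'cb'

# -1 always means "the last element", so it cannot serve as
# "one before the first"; this slice is empty.
assert s[3:-1:-1] == ''

# Omitting the stop is the usual way to run back to the start.
assert s[3::-1] == 'dcba'
assert s[::-1] == 'dcba'
```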

Cheers,
Ron_Adam

Greg Ewing · Apr 22, 2005

Antoon said:
This is nonsense. table[i] = j just associates value j with key i.
That is the same, independent of whether the keys can start from
0 or some other value.

Also, everyone, please keep in mind that you always have
the option of using a *dictionary*, in which case your
indices can start wherever you want.

You can't slice them, true, but you can't have
everything.
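
A minimal sketch of the dictionary approach, reusing the index range 3..20 from the Algol-60 example earlier in the thread; the variable names are arbitrary.

```python
# A "3-based array" with valid indices 3..20, stored as a dictionary.
x = dict((i, 0) for i in range(3, 21))

x[3] = 7     # first slot
x[20] = 9    # last slot

# A key outside 3..20, including a negative one, raises KeyError
# rather than wrapping around to the other end.
try:
    x[-1]
    wrapped = True
except KeyError:
    wrapped = False

# Dictionaries have no slice support, so a sub-range has to be
# spelled out key by key.
window = [x[i] for i in range(3, 6)]
```

One side effect is that the ask-forgiveness style works here, with `KeyError` in place of `IndexError`.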

Greg Ewing · Apr 22, 2005

Roy said:
What would actually be cool is if Python were to support the normal math
notation for open or closed intervals.

foo = bar (1, 2)
foo = bar (1, 2]
foo = bar [1, 2)
foo = bar [1, 2]

That would certainly solve this particular problem, but the cost to the
rest of the language syntax would be rather high

Not to mention the sanity of everyone's editors when
they try to do bracket matching!

Greg Ewing · Apr 22, 2005

I disagree. Programming languages should not needlessly surprise
people, and a newbie to Python probably expects that x[1:3] =
[x[1],x[2],x[3]].

But said newbie's expectations will differ considerably
depending on which other language he's coming from. So
he's almost always going to be surprised one way or another.

Python sensibly adopts a convention that long experience
has shown to be practical, rather than trying to imitate
any particular precedent.
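
The post does not spell out which practical properties are meant. The sketch below lists a few that are commonly cited for half-open intervals; treating these as the intended ones is an assumption.

```python
s = 'abcdefgh'
i, j, n = 2, 6, 3

# The length of a slice is simply stop - start.
assert len(s[i:j]) == j - i

# Splitting at any point and rejoining gives back the original.
assert s[:n] + s[n:] == s

# Adjacent slices share an endpoint with no overlap and no gap.
assert s[0:3] + s[3:6] + s[6:8] == s

# An empty range needs no special notation.
assert s[n:n] == ''

# range() uses the same convention, so indices and slices line up.
assert [s[k] for k in range(i, j)] == list(s[i:j])
```

With a closed-interval convention, each of these identities would need a `+ 1` or `- 1` somewhere.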

Along the same lines, I think the REQUIREMENT that x[0] rather than
x[1] be the first element of list x is a mistake. At least the
programmer should have a choice

Who says the Python programmer doesn't have a choice?

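class NewbieWarmFuzzyList(list):

    def __init__(self, base, *args):
        # list.__init__ must not see `base`, so override __init__
        # rather than __new__.
        list.__init__(self, *args)
        self.base = base

    def __getitem__(self, i):
        # Integer indices only; shift into the underlying 0-based list.
        return list.__getitem__(self, i - self.base)

    # etc...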
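
One way the "etc." might be filled in, as a sketch only: `OffsetList` and `_translate` are hypothetical names, and the bounds check (which also rejects indices below the base instead of letting them wrap around) is an added design choice, not part of the class above.

```python
class OffsetList(list):
    """Hypothetical list whose valid indices are base .. base + len - 1."""

    def __init__(self, base, items=()):
        list.__init__(self, items)
        self.base = base

    def _translate(self, i):
        # Map a user-facing index to the underlying 0-based position,
        # rejecting anything outside the declared range.
        pos = i - self.base
        if not 0 <= pos < len(self):
            raise IndexError('index %d out of range' % i)
        return pos

    def __getitem__(self, i):
        return list.__getitem__(self, self._translate(i))

    def __setitem__(self, i, value):
        list.__setitem__(self, self._translate(i), value)


months = OffsetList(1, ['Jan', 'Feb', 'Mar'])
first = months[1]        # first element lives at index 1
months[3] = 'March'      # last valid index is base + len - 1
```

Slices, `index()`, `insert()` and friends would each need the same translation, which gives a sense of how much surface a configurable base touches.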
