Why Python does *SLICING* the way it does??


Antoon Pardon

Op 2005-04-20 said:
That's true of course. It's more likely to show up in manipulating
lists or strings. And Python provides a much richer environment for
processing strings, so one has to deal with explicit indexing much
less.

But I still think that I make fewer errors per instance of dealing with
intervals. It's rare that I even have to think about it much when
writing such a thing. Negative indexing also helps a lot.

I'm ambivalent about negative indexes. They help a lot, but can
be annoying a lot too. IMO they detract from the "it's easier to
be forgiven than to get permission" style of programming.

It happens rather regularly that I need to do some calculations
and if the start conditions were good, I get a valid index for
a list and otherwise I get an invalid index. From this specification
the following seems a natural way to program

try:
    index = calculate(...)
    lst[index] = ...
    ...
except IndexError:
    ...

But of course this doesn't work because a negative index in this
case is an invalid index but python allows it.
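A minimal sketch of the problem described above. The `calculate` function is a hypothetical stand-in that returns -1 when the start conditions are bad, and `set_strict` is only one possible workaround, not something proposed in this thread.

```python
def calculate(start_ok):
    # Hypothetical stand-in: yields a valid index when the start
    # conditions are good, and -1 (meant to be invalid) otherwise.
    return 2 if start_ok else -1

lst = ['a', 'b', 'c', 'd']

try:
    index = calculate(False)
    lst[index] = 'X'      # no IndexError: -1 silently means "last element"
except IndexError:
    print("bad start conditions")

print(lst)                # the last element was overwritten

def set_strict(seq, index, value):
    # One possible workaround: reject negative indices explicitly so the
    # try/except style works again.
    if index < 0:
        raise IndexError("negative index %d not allowed" % index)
    seq[index] = value
```

With `set_strict`, the same try/except block would reach the `except IndexError` branch for a negative result.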

I sometimes think Python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]


This would make accessing list elements counted from the rear
almost just as easy as it is now but wouldn't interfere with
the ask forgiveness programming style.
 

Ron

Many people I know ask why Python does slicing the way it does.....

Can anyone /please/ give me a good defense/justification???

I'm referring to why mystring[:4] gives me
elements 0, 1, 2 and 3 but *NOT* mystring[4] (5th element).

Many people don't like the idea that the 5th element is not invited.

(BTW, yes I'm aware of the explanation where slicing
is shown to involve slices _between_ elements. This
doesn't explain why this is the *best* way to do it.)

Chris
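For reference, the behaviour being asked about can be reproduced with a short snippet. The string value is an arbitrary example, not taken from the original post.

```python
mystring = "abcdefgh"

# Indices 0, 1, 2 and 3 are included; index 4 is the first one left out.
print(mystring[:4])         # abcd
print(mystring[4])          # e

# The stop value doubles as the number of elements taken from the front.
print(len(mystring[:4]))    # 4
```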

Hi Chris,

What I've found is forward slicing with positive stepping is very
convenient for a lot of things. :)

But when you start trying to use reverse steps, it can get tricky.

There are actually 4 different ways to slice and dice. So we have a
pretty good choice. So the trick is to match the slice method to what
you need, and also use the correct indices for that method.


Where s = 'abcd'
With s[i,j]

Forward slice index, forward steps
a, b, c, d
i= 0, 1, 2, 3
j= 1, 2, 3, 4

s[0,4] = 'abcd'
s[1,3] = 'bc'

Forward slice index (-steps)
a, b, c, d
i= 0, 1, 2, 3
j= -5, -4, -3, -2

s[3,-5] = 'dcba'
s[2,-4] = 'cb'

Reverse slice index (+steps)
a, b, c, d
i= -4, -3, -2, -1
j= 1, 2, 3, 4

s[-4,4] = 'abcd'
s[-3,3] = 'bc'

Reverse slice index (-steps)
a, b, c, d
i= -4, -3, -2, -1
j= -5, -4, -3, -2

s[-1,-5] = 'dcba'
s[-2,-4] = 'cb'


(Maybe this could be made a little more symmetrical for Python 3000?)

Cheers,
Ron_Adam
 

Paul Rubin

Antoon Pardon said:
I sometimes think Python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]

I like this. I don't know how many times I've gotten screwed by
wanting the n'th element from the last for variable n, and saying
"lst[-n]" then realizing I have a bug when n=0. lst[$-n] works perfectly.
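A small sketch of the n=0 trap with negative indexing. The helper names are hypothetical, and computing the index from `len()` is just one way to get the effect that `$-n` would have had.

```python
lst = [10, 20, 30, 40]

def nth_from_last_naive(seq, n):
    # n = 1 is the last element, n = 2 the one before it, ...
    return seq[-n]

def nth_from_last(seq, n):
    # Counting n = 0 as the last element, computed from the length so
    # that no negative index is ever used.
    return seq[len(seq) - 1 - n]

print(nth_from_last_naive(lst, 1))   # 40
print(nth_from_last_naive(lst, 0))   # 10  (seq[-0] is seq[0], the first element)
print(nth_from_last(lst, 0))         # 40

# The same trap with slices: "the last n elements" for n = 0
n = 0
print(lst[-n:])                      # [10, 20, 30, 40], not []
print(lst[len(lst) - n:])            # []
```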
 

Antoon Pardon

Op 2005-04-21 said:
[Antoon Pardon]
I don't see why the start index can't be accessible through
a method or function just like the length of a list is now.

My favourite would be a range method so we would have
the following idiom:

for i in lst.range():
    do something with lst[i]


After going to all that trouble, you might as well also get the value at that
position:

for i, x in enumerate(lst):
    do something with lst[i] also known as x


No you wouldn't, enumerate always starts with 0.

So if you write a class with list-like behaviour except that the
start index can be different from 0, enumerate is useless
because lst[i] won't be x in that case.
 

Raymond Hettinger

[Antoon Pardon]
I don't see why the start index can't be accessible through
a method or function just like the length of a list is now.

My favourite would be a range method so we would have
the following idiom:

for i in lst.range():
    do something with lst[i]


After going to all that trouble, you might as well also get the value at that
position:

for i, x in enumerate(lst):
    do something with lst[i] also known as x


No you wouldn't, enumerate always starts with 0.


You don't get it. Your proposed list-like class indicates its start index.
enumerate() can be made to detect that start value so that the above code always
works for both 0-based and 1-based arrays.
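As a rough illustration: in later Python versions `enumerate()` does not detect a start index by itself, but it accepts one through its `start` argument. The `OneBasedList` class and its `start` attribute below are hypothetical, invented only for this sketch.

```python
class OneBasedList:
    # Hypothetical list-like class that advertises its start index.
    start = 1

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, i):
        # Only indices start .. start + len - 1 are valid.
        if not self.start <= i < self.start + len(self._items):
            raise IndexError(i)
        return self._items[i - self.start]

    def __iter__(self):
        return iter(self._items)

lst = OneBasedList(['a', 'b', 'c'])

# Passing the start index explicitly keeps i and x in step with lst[i].
for i, x in enumerate(lst, start=lst.start):
    assert lst[i] == x
```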


Raymond Hettinger
 

Antoon Pardon

Op 2005-04-21 said:
[Antoon Pardon]
I don't see why the start index can't be accessible through
a method or function just like the length of a list is now.

My favourite would be a range method so we would have
the following idiom:

for i in lst.range():
    do something with lst[i]

After going to all that trouble, you might as well also get the value at that
position:

for i, x in enumerate(lst):
    do something with lst[i] also known as x


No you wouldn't, enumerate always starts with 0.


You don't get it. Your proposed list-like class indicates its start index.
enumerate() can be made to detect that start value so that the above code always
works for both 0-based and 1-based arrays.


Oh, you mean if it were made a built-in class.

Personally I would still prefer my range solution. I often find
enumerate gives me too much. Often enough I want to assign new values
to the elements in the list. I have no need for the old value, that
is also provided by enumerate.
 

Reinhold Birkenfeld

Antoon said:
I sometimes think Python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]


This would make accessing list elements counted from the rear
almost just as easy as it is now but wouldn't interfere with
the ask forgiveness programming style.

How would you pass this argument to __getitem__?
What would be allowed, only '$-x' or also '$+x' or what else?
What type would '$' be?

Reinhold
 

Steve Holden

Terry Hancock wrote:

So I like Python's slicing because it "bites *less*" than intervals
in C or Fortran.

I disagree. Programming languages should not needlessly surprise
people, and a newbie to Python probably expects that x[1:3] =
[x[1],x[2],x[3]] . Array-oriented languages, such as Fortran 90/95,
Matlab/Octave/Scilab, and S-Plus/R do not follow the Python convention,
and I don't know of Fortran or R programmers who complain (don't follow
Matlab enough to say). There are Python programmers, such as the OP and
me, who don't like the Python convention. What languages besides Python
use the Python slicing convention?

The principle of least surprise is all very well, but "needless surprise
of newbies" is a dangerous criterion to adopt for programming language
design and following it consistently would lead to a mess like Visual
Basic, which grew by accretion until Microsoft realized it was no longer
tenable and broke backward compatibility.

Along the same lines, I think the REQUIREMENT that x[0] rather than
x[1] be the first element of list x is a mistake. At least the
programmer should have a choice, as in Fortran or VBA. In C starting at
0 may be justified because of the connection between array subscripting
and pointer arithmetic, but Python is a higher-level language where
such considerations are less relevant.

But Pythonicity required that there should be one obvious way to do
something. How obvious is having two ways? Then when you read code you
would continually be asking yourself "is this a one-based or a
zero-based structure?", which is not a profitable use of time.

regards
Steve
 

Antoon Pardon

Op 2005-04-21 said:
Antoon said:
I sometimes think Python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]


This would make accessing list elements counted from the rear
almost just as easy as it is now but wouldn't interfere with
the ask forgiveness programming style.

How would you pass this argument to __getitem__?

Well, assuming lst.last was the last index of lst, __getitem__
would get lst.last and lst.last - 1 passed.

What would be allowed, only '$-x' or also '$+x' or what else?

Any expression where an int is allowed.

What type would '$' be?

It would be an int.
 

Antoon Pardon

Op 2005-04-21 said:
Terry Hancock wrote:

So I like Python's slicing because it "bites *less*" than intervals
in C or Fortran.

I disagree. Programming languages should not needlessly surprise
people, and a newbie to Python probably expects that x[1:3] =
[x[1],x[2],x[3]] . Array-oriented languages, such as Fortran 90/95,
Matlab/Octave/Scilab, and S-Plus/R do not follow the Python convention,
and I don't know of Fortran or R programmers who complain (don't follow
Matlab enough to say). There are Python programmers, such as the OP and
me, who don't like the Python convention. What languages besides Python
use the Python slicing convention?

The principle of least surprise is all very well, but "needless surprise
of newbies" is a dangerous criterion to adopt for programming language
design and following it consistently would lead to a mess like Visual
Basic, which grew by accretion until Microsoft realized it was no longer
tenable and broke backward compatibility.

Along the same lines, I think the REQUIREMENT that x[0] rather than
x[1] be the first element of list x is a mistake. At least the
programmer should have a choice, as in Fortran or VBA. In C starting at
0 may be justified because of the connection between array subscripting
and pointer arithmetic, but Python is a higher-level language where
such considerations are less relevant.

But Pythonicity required that there should be one obvious way to do
something. How obvious is having two ways?

How obvious is that lists can be any length? Do you consider it
an unbounded number of ways, that lists can be any length?

Like users have a choice in how long they make a list, they
should have a choice where the indexes start. (And that
shouldn't be limited to 0 and 1).

Then when you read code you
would continually be asking yourself "is this a one-based or a
zero-based structure?", which is not a profitable use of time.

No you wouldn't. If you have the choice you just take the start
index that is more natural. Sometimes that is 0 sometimes that
is 1 and other times it is some whole other number. The times
I had the opportunity to use such structures, the question of
whether it was zero-based or one-based, rarely popped up.
Either it was irrelevant or it was clear from what you were
processing.

That you are forced to use zero-based structures, while the
problem space you are working on uses one-based structures
is a far bigger stumbling block where you continually have
to be aware that the indexes in your program are one off
from the indexes the problem is expressed in.

The one obvious way is to use the same index scheme as the
one that is used in the specification or problem setting.
Not to use always zero no matter what.
 

Reinhold Birkenfeld

Antoon said:
Op 2005-04-21 said:
Antoon said:
I sometimes think Python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]


This would make accessing list elements counted from the rear
almost just as easy as it is now but wouldn't interfere with
the ask forgiveness programming style.

How would you pass this argument to __getitem__?

Well, assuming lst.last was the last index of lst, __getitem__
would get lst.last and lst.last - 1 passed.

Then it would be an alias for len(lst)-1?

Any expression where an int is allowed.

Okay.


It would be an int.

Where would it be allowed? Only in subscriptions?

Reinhold
 

Antoon Pardon

Op 2005-04-21 said:
Antoon said:
Op 2005-04-21 said:
Antoon Pardon wrote:

I sometimes think Python should have been more explicit here,
using a marker for the start-index and end-index, maybe '^' and
'$'. So if you wanted the last element you had to write:

lst[$]

And for the next to last element:

lst[$ - 1]


This would make accessing list elements counted from the rear
almost just as easy as it is now but wouldn't interfere with
the ask forgiveness programming style.

How would you pass this argument to __getitem__?

Well, assuming lst.last was the last index of lst, __getitem__
would get lst.last and lst.last - 1 passed.

Then it would be an alias for len(lst)-1 ?

In the context of current Python lists, yes. But if you would
go further and allow lists to start from another index than
0, then not.

Where would it be allowed? Only in subscriptions?

Yes, the idea would be that the brackets indicate a
scope where $ would be the last index and ^ would
be the first index. So if you wanted the middle
element you could do: lst[(^ + $) // 2]

But outside the brackets the scope where this
has any meaning wouldn't be present.

Not that I think this idea has any chance. Already
I can hear people shout that this is too perlish.


But here is another idea.

Some time ago I read about the possibility of Python
acquiring a with statement. So that instead of having
to write:

obj.inst1 ...
obj.inst2 ...
obj.inst1 ...

You could write:

with obj:
    .inst1 ...
    .inst2 ...
    .inst1 ...


If this were implemented, we could think of a left bracket
as implicitly executing a with statement. If we then had a list-like
class where the start index could be different from zero and
which had properties first and last indicating the first and last
index, we could then write something like:

lst[.last] for the last element or
lst[.last - 1] for the next to last element.
lst[.first] for the first element
lst[(.first + .last) // 2] for the middle element

Maybe this makes the proposal again pythonic enough to get
a little consideration.
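The leading-dot syntax does not exist in Python, but the idea can be approximated by spelling out the object name. The sketch below assumes a hypothetical `OffsetList` class with `first` and `last` properties, as described in the post, and treats any index outside that range as an error.

```python
class OffsetList:
    # Hypothetical list-like class whose first index need not be 0.
    def __init__(self, first, items):
        self.first = first
        self._items = list(items)

    @property
    def last(self):
        # Index of the last element (first - 1 when the list is empty).
        return self.first + len(self._items) - 1

    def __getitem__(self, i):
        # No wrap-around: anything outside first..last is an error.
        if not self.first <= i <= self.last:
            raise IndexError("index %d out of range" % i)
        return self._items[i - self.first]

lst = OffsetList(3, ['a', 'b', 'c', 'd', 'e'])
print(lst[lst.last])                       # e
print(lst[lst.last - 1])                   # d
print(lst[lst.first])                      # a
print(lst[(lst.first + lst.last) // 2])    # c
```

Because negative indices are never reinterpreted here, a miscalculated index raises `IndexError` instead of silently wrapping around.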
 

Rocco Moretti

Steve said:
The principle of least surprise is all very well, but "needless surprise
of newbies" is a dangerous criterion to adopt for programming language
design and following it consistently would lead to a mess like Visual
Basic, which grew by accretion until Microsoft realized it was no longer
tenable and broke backward compatibility.

Well, *needless* surprise of newbies is never a good thing. If it were,
it wouldn't be needless, now would it? :) Surprising newbies just to
surprise newbies is just cruel, but there is room in this world for "it
may surprise you now, but you'll thank us later" and situations where
there is a "newbie way" and an "other way", and the "other" way is
chosen because it's the easiest thing for the most people in the long run.

But I agree, having "the easiest thing for newbies" as your sole
criterion for language design is a road to madness, for no other reason
than that newbies don't stay newbies forever.
 

Dan Bishop

Antoon said:
Op 2005-04-21 said:
(e-mail address removed) wrote: ....
Along the same lines, I think the REQUIREMENT that x[0] rather than
x[1] be the first element of list x is a mistake. At least the
programmer should have a choice, as in Fortran or VBA. In C starting at
0 may be justified because of the connection between array subscripting
and pointer arithmetic, but Python is a higher-level language where
such considerations are less relevant.
But Pythonicity required that there should be one obvious way to do
something. How obvious is having two ways?

How obvious is that lists can be any length? Do you consider it
an unbounded number of ways, that lists can be any length?

Like users have a choice in how long they make a list, they
should have a choice where the indexes start. (And that
shouldn't be limited to 0 and 1).

Suppose you could. Then what should

([3, 1, 4] indexbase 0) + ([1, 5, 9] indexbase 4)

equal?

That you are forced to use zero-based structures, while the
problem space you are working on uses one-based structures
is a far bigger stumbling block where you continually have
to be aware that the indexes in your program are one off
from the indexes the problem is expressed in.

Name a problem space that inherently requires arrays to be 1-based
rather than 0-based.
 

Paul Rubin

Dan Bishop said:
Name a problem space that inherently requires arrays to be 1-based
rather than 0-based.

"inherently" is too strong a word, since after all, we could do all
our computing with Turing machines.

Some algorithms are specified in terms of 1-based arrays. And most
Fortran programs are written in terms of 1-based arrays. So if you
want to implement a 1-based specification in Python, or write Python
code that interoperates with Fortran code, you either need 1-based
arrays in Python or else you need messy conversions all over your
Python code.

The book "Numerical Recipes in C" contains a lot of numerical
subroutines written in C, loosely based on Fortran counterparts from
the original Numerical Recipes book. The C routines are full of messy
conversions from 0-based to 1-based. Ugh.

Again, this (along with nested scopes and various other things) was
all figured out by the Algol-60 designers almost 50 years ago. In
Algol-60 you could just say "integer x(3..20)" and get a 3-based array
(I may have the syntax slightly wrong by now). It was useful and took
care of this problem.
 

Robert Kern

Paul said:
"inherently" is too strong a word, since after all, we could do all
our computing with Turing machines.

Some algorithms are specified in terms of 1-based arrays. And most
Fortran programs are written in terms of 1-based arrays. So if you
want to implement a 1-based specification in Python, or write Python
code that interoperates with Fortran code, you either need 1-based
arrays in Python or else you need messy conversions all over your
Python code.

I write Python code that interoperates with Fortran code all the time
(and write Fortran code that interoperates with Python code, too). Very,
very rarely do I have to explicitly do any conversions. They only show
up when a Fortran subroutine requires an index in its argument list.

In Fortran, I do Fortran. In Python, I do Python.

Yes, there is some effort required when translating some code or
pseudo-code that uses 1-based indexing. Having done this a number of
times, I haven't found it to be much of a burden.
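To give a feel for the kind of translation involved, here is a small sketch using the parent relation of a binary heap, which textbooks usually state for positions 1..n. The example is an illustration chosen for this purpose, not code from anyone in the thread.

```python
def parent_1based(i):
    # Textbook form for a heap stored in positions 1..n.
    return i // 2

def parent_0based(i):
    # The same relation for a heap stored in positions 0..n-1.
    return (i - 1) // 2

heap = [1, 3, 2, 7, 4]          # a valid min-heap, 0-based storage

for k in range(2, len(heap) + 1):        # k is the 1-based position
    i = k - 1                            # the matching 0-based position
    # Both formulas name the same parent element.
    assert parent_1based(k) - 1 == parent_0based(i)
    assert heap[parent_0based(i)] <= heap[i]
```

The conversion is confined to the index formulas; the loop body itself does not change.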
The book "Numerical Recipes in C" contains a lot of numerical
subroutines written in C, loosely based on Fortran counterparts from
the original Numerical Recipes book. The C routines are full of messy
conversions from 0-based to 1-based. Ugh.

I contend that if they had decided to just write the C versions as C
instead of C-wishing-it-were-Fortran, they would have made a much better
library. Still sucky, but that's another story.

Again, this (along with nested scopes and various other things) was
all figured out by the Algol-60 designers almost 50 years ago. In
Algol-60 you could just say "integer x(3..20)" and get a 3-based array
(I may have the syntax slightly wrong by now). It was useful and took
care of this problem.

There's nothing that stops you from writing a class that does this. I
believe someone posted such a one to this thread.

I have yet to see a concrete proposal on how to make lists operate like
this.

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 

Ron

Ron said:
Many people I know ask why Python does slicing the way it does.....

Can anyone /please/ give me a good defense/justification???

I'm referring to why mystring[:4] gives me
elements 0, 1, 2 and 3 but *NOT* mystring[4] (5th element).

> There are actually 4 different ways to slice ....
Where s = 'abcd'
With s[i,j]

Forward slice index, forward steps
a, b, c, d
i= 0, 1, 2, 3
j= 1, 2, 3, 4

s[0,4] = 'abcd'
s[1,3] = 'bc'
.......


Minor correction to this. It's what I get for not realizing how late it
was.

Where s = 'abcd'
With s[i:j:step]

Positive slice index, (+1 step)
      a,  b,  c,  d
i=    0,  1,  2,  3
j=    1,  2,  3,  4

s[0:4] = 'abcd'
s[1:3] = 'bc'

Positive slice index, (-1 step)
      a,  b,  c,  d
i=    0,  1,  2,  3
j=   -5, -4, -3, -2

s[3:-5:-1] = 'dcba'
s[2:-4:-1] = 'cb'

Negative slice index, (+1 step)
      a,  b,  c,  d
i=   -4, -3, -2, -1
j=    1,  2,  3,  4

s[-4:4] = 'abcd'
s[-3:3] = 'bc'

Negative slice index, (-1 step)
      a,  b,  c,  d
i=   -4, -3, -2, -1
j=   -5, -4, -3, -2

s[-1:-5:-1] = 'dcba'
s[-2:-4:-1] = 'cb'
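The four cases in the corrected tables can be checked directly. The last line is an addition showing that omitting both bounds gives the reversed string without needing the out-of-range stop value -5.

```python
s = 'abcd'

# Positive indices, step +1
assert s[0:4] == 'abcd'
assert s[1:3] == 'bc'

# Positive start, negative stop, step -1
assert s[3:-5:-1] == 'dcba'
assert s[2:-4:-1] == 'cb'

# Negative start, positive stop, step +1
assert s[-4:4] == 'abcd'
assert s[-3:3] == 'bc'

# Negative indices, step -1
assert s[-1:-5:-1] == 'dcba'
assert s[-2:-4:-1] == 'cb'

# Omitting the bounds avoids the out-of-range stop value -5.
assert s[::-1] == 'dcba'
```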



Cheers,
Ron_Adam
 

Greg Ewing

Antoon said:
This is nonsense. table[i] = j just associates value j with key i.
That is the same regardless of whether the keys can start from
0 or some other value.


Also, everyone, please keep in mind that you always have
the option of using a *dictionary*, in which case your
indices can start wherever you want.

You can't slice them, true, but you can't have
everything. :)
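A quick sketch of the dictionary approach, with indices chosen to start at 3 purely as an example. The exact exception raised by slice notation on a dict depends on the Python version, so the sketch catches both plausible ones.

```python
# Indices chosen to start at 3, as an arbitrary example.
table = {i: value for i, value in enumerate(['a', 'b', 'c'], start=3)}

print(table[3])     # a
print(table[5])     # c

# Negative keys are not reinterpreted: a missing key is simply an error.
try:
    table[-1]
except KeyError:
    print("no key -1")

# Slice notation is not supported for lookups in a dict.
try:
    table[3:5]
except (TypeError, KeyError):
    print("dicts cannot be sliced")
```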
 

Greg Ewing

Roy said:
What would actually be cool is if Python were to support the normal math
notation for open or closed intervals.

foo = bar (1, 2)
foo = bar (1, 2]
foo = bar [1, 2)
foo = bar [1, 2]

That would certainly solve this particular problem, but the cost to the
rest of the language syntax would be rather high :)

Not to mention the sanity of everyone's editors when
they try to do bracket matching!
 

Greg Ewing

I disagree. Programming languages should not needlessly surprise
people, and a newbie to Python probably expects that x[1:3] =
[x[1],x[2],x[3]].

But said newbie's expectations will differ considerably
depending on which other language he's coming from. So
he's almost always going to be surprised one way or another.

Python sensibly adopts a convention that long experience
has shown to be practical, rather than trying to imitate
any particular precedent.
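A few of the practical properties usually credited to the half-open convention can be written down as assertions. This is a sketch of commonly cited arguments, not a list taken from the post itself.

```python
x = list(range(10))

# The length of a slice is simply stop - start.
assert len(x[2:7]) == 7 - 2

# Adjacent slices share a boundary without overlap or gap.
assert x[:4] + x[4:] == x
assert x[2:5] + x[5:8] == x[2:8]

# The first n elements are x[:n]; with 0-based indexing these are
# exactly the elements whose index is less than n.
n = 3
assert x[:n] == [x[i] for i in range(n)]

# An empty slice needs no special case.
assert x[4:4] == []
```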
Along the same lines, I think the REQUIREMENT that x[0] rather than
x[1] be the first element of list x is a mistake. At least the
programmer should have a choice

Who says the Python programmer doesn't have a choice?

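class NewbieWarmFuzzyList(list):

    def __new__(cls, base, *args):
        obj = list.__new__(cls, *args)
        obj.base = base
        return obj

    def __init__(self, base, *args):
        # list.__init__ must only see the initial items, not the base
        list.__init__(self, *args)

    def __getitem__(self, i):
        # translate a base-relative index into list's own 0-based index
        return list.__getitem__(self, i - self.base)

    # etc...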
 
