Why Python does *SLICING* the way it does??

Roy Smith

Greg Ewing said:
Roy said:
What would actually be cool is if Python were to support the normal math
notation for open or closed intervals.

foo = bar (1, 2)
foo = bar (1, 2]
foo = bar [1, 2)
foo = bar [1, 2]

That would certainly solve this particular problem, but the cost to the
rest of the language syntax would be rather high :)

Not to mention the sanity of everyone's editors when
they try to do bracket matching!

I have no doubt that somebody could teach emacs python-mode to correctly
match half-open interval brackets.
 
beliavsky

Dan said:
Antoon Pardon wrote:
Like users have a choice in how long they make a list, they
should have a choice where the indexes start. (And that
shouldn't be limited to 0 and 1).

Suppose you could. Then what should

([3, 1, 4] indexbase 0) + ([1, 5, 9] indexbase 4)

equal?

Assuming the + sign means concatenate (as it does for Python lists)
rather than add (as it does for Numeric or Numarray arrays), it would
be

([3,1,4,1,5,9] indexbase 0)

since 0 would still be the default indexbase. If the user wanted a
1-based list as a result, he would use an expression such as

(([3, 1, 4] indexbase 0) + ([1, 5, 9] indexbase 4) indexbase 1)

If + means add, the result would be

([4,6,13] indexbase 0) .

Adding arrays makes sense if they have the same number of elements --
they do not need to have the same indices.

I rarely see problems caused by flexible lower array bounds mentioned
in comp.lang.fortran .
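
To make the two readings of + concrete, here is a minimal Python 3 sketch. The class name BasedList and its concat/add methods are hypothetical, invented only for illustration; nothing like "indexbase" exists in Python itself.

```python
class BasedList:
    """Hypothetical list wrapper whose first element lives at index `base`."""

    def __init__(self, items, base=0):
        self.items = list(items)
        self.base = base

    def __getitem__(self, i):
        # Translate the user-visible index into a 0-based offset.
        offset = i - self.base
        if not 0 <= offset < len(self.items):
            raise IndexError("index %d out of range" % i)
        return self.items[offset]

    def concat(self, other):
        # Concatenation ignores both bases; the result uses the default base 0.
        return BasedList(self.items + other.items, base=0)

    def add(self, other):
        # Position-wise addition: only the lengths have to match.
        if len(self.items) != len(other.items):
            raise ValueError("length mismatch")
        summed = [x + y for x, y in zip(self.items, other.items)]
        return BasedList(summed, base=0)


a = BasedList([3, 1, 4], base=0)
b = BasedList([1, 5, 9], base=4)
print(a.concat(b).items)  # [3, 1, 4, 1, 5, 9]
print(a.add(b).items)     # [4, 6, 13]
print(b[4])               # 1
```

Both operations return a 0-based result here, matching the "default indexbase" assumption described above.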
 
Paul Rubin

Suppose you could. Then what should
([3, 1, 4] indexbase 0) + ([1, 5, 9] indexbase 4)
equal?
If + means add, the result would be ([4,6,13] indexbase 0) .

That's counterintuitive. I'd expect c = a + b to result in c[i] =
a[i]+b[i] for all elements. So, for example,
([0,1,2,3,4] indexbase 0) + ([2,3] indexbase 2)
should result in ([0,1,4,6,4] indexbase 0)
and ([3, 1, 4] indexbase 0) + ([1, 5, 9] indexbase 4)
should either result in ([3,1,4,0,1,5,9] indexbase 0) or
else raise an exception.
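
A rough Python 3 sketch of the index-aligned reading, in which positions that one operand does not cover count as 0. The function name add_aligned and its fill parameter are assumptions made for this illustration only.

```python
def add_aligned(a, a_base, b, b_base, fill=0):
    """Add two lists by index; positions missing from one side use `fill`.

    Returns (values, base), where base is the smallest index covered.
    """
    lo = min(a_base, b_base)
    hi = max(a_base + len(a), b_base + len(b))
    out = []
    for i in range(lo, hi):
        x = a[i - a_base] if a_base <= i < a_base + len(a) else fill
        y = b[i - b_base] if b_base <= i < b_base + len(b) else fill
        out.append(x + y)
    return out, lo


print(add_aligned([0, 1, 2, 3, 4], 0, [2, 3], 2))  # ([0, 1, 4, 6, 4], 0)
print(add_aligned([3, 1, 4], 0, [1, 5, 9], 4))     # ([3, 1, 4, 0, 1, 5, 9], 0)
```

The stricter alternative mentioned above would raise an exception whenever the two index ranges do not overlap, instead of filling the gap.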
 
Roy Smith

Greg Ewing said:
Also, everyone, please keep in mind that you always have
the option of using a *dictionary*, in which case your
indices can start wherever you want.

You can't slice them, true, but you can't have everything. :)

Of course you can slice them, you just have to subclass dict! The
following was about 15 minutes work:

---------------
import types

class slicableDict (dict):
    def __getitem__ (self, index):
        if type (index) == types.SliceType:
            d2 = slicableDict()
            for key in self.keys():
                if key >= index.start and key < index.stop:
                    d2[key] = self[key]
            return d2
        else:
            return dict.__getitem__ (self, index)

d = slicableDict()
d['hen'] = 1
d['ducks'] = 2
d['geese'] = 3
d['oysters'] = 4
d['porpoises'] = 5

print d
print d['a':'m']
---------------

Roy-Smiths-Computer:play$ ./slice.py
{'oysters': 4, 'hen': 1, 'porpoises': 5, 'geese': 3, 'ducks': 2}
{'hen': 1, 'geese': 3, 'ducks': 2}

I defined d[x:y] as returning a new dictionary which contains those items
from the original whose keys are in the range x <= key < y. I'm not sure
this is terribly useful but it's a neat demonstration of just how simple
Python makes it to do stuff like this. I can't imagine how much work it
would be to add a similar functionality to something like C++ multimap.

I'm sure the code above could be improved, and I know I've ignored all
sorts of things like steps, and error checking. Frankly, I'm amazed this
worked at all; I expected to get a syntax error when I tried to create a
slice with non-numeric values.

PS: Extra credit if you can identify the set of keys I used without
resorting to google :)
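
For readers on Python 3, where types.SliceType and the print statement no longer exist, a hedged sketch of the same idea follows. The class name SlicableDict, the handling of open-ended slices, and the decision to reject steps are choices made for this example, not part of the original post.

```python
class SlicableDict(dict):
    """dict whose d[x:y] returns the items with x <= key < y."""

    def __getitem__(self, index):
        if not isinstance(index, slice):
            return dict.__getitem__(self, index)
        if index.step is not None:
            raise ValueError("steps are not supported in this sketch")
        result = SlicableDict()
        for key, value in self.items():
            # A missing bound (d[x:] or d[:y]) leaves that side open.
            above = index.start is None or key >= index.start
            below = index.stop is None or key < index.stop
            if above and below:
                result[key] = value
        return result


d = SlicableDict(hen=1, ducks=2, geese=3, oysters=4, porpoises=5)
print(d["a":"m"])  # {'hen': 1, 'ducks': 2, 'geese': 3}
print(d["h":])     # {'hen': 1, 'oysters': 4, 'porpoises': 5}
```

The output order differs from the Python 2 run above because modern dicts preserve insertion order.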
 
Mike Meyer

Roy Smith said:
Greg Ewing said:
Also, everyone, please keep in mind that you always have
the option of using a *dictionary*, in which case your
indices can start wherever you want.

You can't slice them, true, but you can't have everything. :)

Of course you can slice them, you just have to subclass dict! The
following was about 15 minutes work:

---------------
import types

class slicableDict (dict):
    def __getitem__ (self, index):
        if type (index) == types.SliceType:
            d2 = slicableDict()
            for key in self.keys():
                if key >= index.start and key < index.stop:
                    d2[key] = self[key]
            return d2
        else:
            return dict.__getitem__ (self, index)

d = slicableDict()
d['hen'] = 1
d['ducks'] = 2
d['geese'] = 3
d['oysters'] = 4
d['porpoises'] = 5

print d
print d['a':'m']
---------------

I couldn't resist:

py> d = slicableDict()
py> d[3j] = 1
py> d[4j] = 2
py> d[5j] = 3
py> d[6j] = 4
py> d[4j:5j]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 6, in __getitem__
TypeError: cannot compare complex numbers using <, <=, >, >=

Somehow, that seems like a wart.
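
One way a slicing dict could report this more clearly is to wrap the range test, as in the Python 3 sketch below. The helper name in_half_open_range and the wording of the message are made up for this example.

```python
def in_half_open_range(key, start, stop):
    """Return True if start <= key < stop.

    Raises TypeError with a clearer message when the keys have no ordering.
    """
    try:
        return start <= key < stop
    except TypeError:
        raise TypeError(
            "keys of type %s cannot be ordered, so they cannot be sliced"
            % type(key).__name__
        ) from None


print(in_half_open_range("hen", "a", "m"))  # True
try:
    in_half_open_range(4j, 4j, 5j)
except TypeError as exc:
    print(exc)  # keys of type complex cannot be ordered, so they cannot be sliced
```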

<mike
 
Antoon Pardon

Op 2005-04-21 said:
Antoon said:
Op 2005-04-21 said:
(e-mail address removed) wrote: ...
Along the same lines, I think the REQUIREMENT that x[0] rather than
x[1] be the first element of list x is a mistake. At least the
programmer should have a choice, as in Fortran or VBA. In C starting at
0 may be justified because of the connection between array subscripting
and pointer arithmetic, but Python is a higher-level language where
such considerations are less relevant.

But Pythonicity required that there should be one obvious way to do
something. How obvious is having two ways?

How obvious is that lists can be any length? Do you consider it
an unbounded number of ways, that lists can be any length?

Like users have a choice in how long they make a list, they
should have a choice where the indexes start. (And that
shouldn't be limited to 0 and 1).

Suppose you could. Then what should

([3, 1, 4] indexbase 0) + ([1, 5, 9] indexbase 4)

equal?

There are multiple possibilities that can make sense.
I'm sure that no consensus will be reached about what
it should be. This will stop this idea from ever being
implemented. So you shouldn't worry too much about it :).

Name a problem space that inherently requires arrays to be 1-based
rather than 0-based.

None, but that is beside the question. If I look into my mathematics
books, plenty of problems are described in terms of one-based indexes.
Sure, most of them could just as easily be described in terms of
zero-based indexes, but the fact of the matter is they are not.

The same goes for problems other people would like me to solve.
Often enough they describe their problems in terms of 1-based
indexes rather than 0-based. Sure I can translate it to 0-based,
but that will just make communication between me and the client
more difficult.
 
Reinhold Birkenfeld

Roy said:
import types

class slicableDict (dict):
    def __getitem__ (self, index):
        if type (index) == types.SliceType:
            d2 = slicableDict()
            for key in self.keys():
                if key >= index.start and key < index.stop:
                    d2[key] = self[key]
            return d2
        else:
            return dict.__getitem__ (self, index) [...]

Roy-Smiths-Computer:play$ ./slice.py
{'oysters': 4, 'hen': 1, 'porpoises': 5, 'geese': 3, 'ducks': 2}
{'hen': 1, 'geese': 3, 'ducks': 2}

I defined d[x:y] as returning a new dictionary which contains those items
from the original whose keys are in the range x <= key < y. I'm not sure
this is terribly useful but it's a neat demonstration of just how simple
Python makes it to do stuff like this. I can't imagine how much work it
would be to add a similar functionality to something like C++ multimap.

Other possibility, probably faster when almost all keys in the range are in
the dictionary:

class sdict(dict):
    def __getitem__(self, index):
        if isinstance(index, slice):
            d = {}
            # use the slice instance (index), not the slice type;
            # xrange() rejects a step of None, so default it to 1
            for key in xrange(index.start, index.stop, index.step or 1):
                if key in self:
                    d[key] = self[key]
            return d
        else:
            return dict.__getitem__(self, index)

Reinhold
 
Reinhold Birkenfeld

Reinhold said:
Other possibility, probably faster when almost all keys in the range are in
the dictionary:

class sdict(dict):
    def __getitem__(self, index):
        if isinstance(index, slice):
            d = {}
            for key in xrange(index.start, index.stop, index.step or 1):

Hm. I wonder whether (x)range could be made to accept a slice as a single argument...
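
Until something like that exists, slice.indices() gets close. The following Python 3 sketch uses a hypothetical helper, range_from_slice, that needs a length to resolve missing bounds; in Python 2, xrange would take the place of range.

```python
def range_from_slice(s, length):
    """Build a range from a slice, resolving None bounds against `length`."""
    # slice.indices(length) returns a concrete (start, stop, step) tuple.
    return range(*s.indices(length))


print(list(range_from_slice(slice(2, 8, 2), 10)))  # [2, 4, 6]
print(list(range_from_slice(slice(None, 4), 10)))  # [0, 1, 2, 3]
```

The length argument is what a bare slice lacks: without it, an open-ended slice such as [2:] has no defined stop.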

Reinhold
 
Peter Hansen

Rocco said:
...
But I agree, having "the easiest thing for newbies" as your sole
criterion for language design is a road to madness, for no other reason
than that newbies don't stay newbies forever.

If Visual BASIC is the example of "easiest thing for newbies",
then it disproves your theory already. I think VB newbies
*do* stay newbies forever (or as long as they are using just VB).

-Peter
 
Roy Smith

Roy said:
import types

class slicableDict (dict):
    def __getitem__ (self, index):
        if type (index) == types.SliceType:
            d2 = slicableDict()
            for key in self.keys():
                if key >= index.start and key < index.stop:
                    d2[key] = self[key]
            return d2
        else:
            return dict.__getitem__ (self, index) [...]

Roy-Smiths-Computer:play$ ./slice.py
{'oysters': 4, 'hen': 1, 'porpoises': 5, 'geese': 3, 'ducks': 2}
{'hen': 1, 'geese': 3, 'ducks': 2}

I defined d[x:y] as returning a new dictionary which contains those items
from the original whose keys are in the range x <= key < y. I'm not sure
this is terribly useful but it's a neat demonstration of just how simple
Python makes it to do stuff like this. I can't imagine how much work it
would be to add a similar functionality to something like C++ multimap.

Other possibility, probably faster when almost all keys in the range are in
the dictionary:

class sdict(dict):
    def __getitem__(self, index):
        if isinstance(index, slice):
            d = {}
            for key in xrange(index.start, index.stop, index.step or 1):
                if key in self:
                    d[key] = self[key]
            return d
        else:
            return dict.__getitem__(self, index)

The problem with that is it requires the keys to be integers.
 
beliavsky

Peter Hansen wrote:

If Visual BASIC is the example of "easiest thing for newbies",
then it disproves your theory already. I think VB newbies
*do* stay newbies forever (or as long as they are using just VB).

Much snobbery is directed at Visual Basic and other dialects of Basic
(they even have "basic" in their name), but I think VBA is better
designed than the prestigious C in some important ways.

Suppose you want to allocate a 2-D array at run-time and pass it to a
procedure. The VBA code is just

Option Explicit
Option Base 1

Sub make_matrix()
    Dim x() As Double
    Dim n1 As Integer, n2 As Integer
    n1 = 2
    n2 = 3
    ReDim x(n1, n2)
    Call print_matrix(x)
End Sub

Sub print_matrix(xmat() As Double)
    Debug.Print UBound(xmat, 1), UBound(xmat, 2)
    'do stuff with xmat
End Sub

It is trivial to allocate and pass multidimensional arrays in VBA, but
C requires expertise with pointers. The subroutine print_matrix can
query the dimensions of xmat, so they don't need to be passed as
separate arguments, as in C. The fact that it is tricky to do simple
things is a sign of the poor design of C and similar languages, at
least for non-systems programming.

People bash VB as a language that corrupts a programmer and prevents him
from ever becoming a "real" programmer. Maybe VB programmers quickly
get so productive with it that they don't need to fuss with trickier
languages.
 
Reinhold Birkenfeld

Roy said:
Other possibility, probably faster when almost all keys in the range are in
the dictionary:

class sdict(dict):
    def __getitem__(self, index):
        if isinstance(index, slice):
            d = {}
            for key in xrange(index.start, index.stop, index.step or 1):
                if key in self:
                    d[key] = self[key]
            return d
        else:
            return dict.__getitem__(self, index)

The problem with that is it requires the keys to be integers.

Yes, but wasn't it meant as a replacement for a list?

Reinhold
 
Roy Smith

Roy said:
Other possibility, probably faster when almost all keys in the range are in
the dictionary:

class sdict(dict):
    def __getitem__(self, index):
        if isinstance(index, slice):
            d = {}
            for key in xrange(index.start, index.stop, index.step or 1):
                if key in self:
                    d[key] = self[key]
            return d
        else:
            return dict.__getitem__(self, index)

The problem with that is it requires the keys to be integers.

Yes, but wasn't it meant as a replacement for a list?

Originally, yes. I went off on a tangent with string-keyed slices.
 
Mike Meyer

Much snobbery is directed at Visual Basic and other dialects of Basic
(they even have "basic" in their name), but I think VBA is better
designed than the prestigious C in some important ways.

C and VBA have totally different application domains. Given any two
such languages, it's almost a given that either one will be better
than the other "in some important ways".

It is trivial to allocate and pass multidimensional arrays in VBA, but
C requires expertise with pointers. The subroutine print_matrix can
query the dimensions of xmat, so they don't need to be passed as
separate arguments, as in C. The fact that it is tricky to do simple
things is a sign of the poor design of C and similar languages, at
least for non-systems programming.

Since C was designed as a systems programming language, evaluating
its design for non-systems programming is a pointless exercise.

Personally, I think of C as a "portable assembler". You write
time-critical code, important applications, and the kernel in it. You
let your HLL compilers generate it. Other than that, you ignore it.

People bash VB as a language that corrupts a programmer and prevents him
from ever becoming a "real" programmer. Maybe VB programmers quickly
get so productive with it that they don't need to fuss with trickier
languages.

I've not really heard a lot of nasty comments about VB, though I have
about BASIC. Nuts, I've said a lot of nasty things about BASIC. But VB
is so far removed from the BASICs I worked with that the point is
moot.

I have heard that VB is much more productive than "conventional
languages" (usually meaning C/C++/COBOL/etc.). Then again, this is
true of modern "scripting" languages. That's where VB started life,
even though it's now compiled. So this should come as no surprise.

<mike
 
Fredrik Lundh

Sub print_matrix(xmat() As Double)
    Debug.Print UBound(xmat, 1), UBound(xmat, 2)
    'do stuff with xmat
End Sub

It is trivial to allocate and pass multidimensional arrays in VBA, but
C requires expertise with pointers. The subroutine print_matrix can
query the dimensions of xmat, so they don't need to be passed as
separate arguments, as in C. The fact that it is tricky to do simple
things is a sign of the poor design of C

Sounds more like poor C skills on your part. Here's a snippet from the Python
Imaging Library which takes a 2D array (im) and creates another one (imOut).

Imaging
ImagingRankFilter(Imaging im, int size, int rank)
{
    Imaging imOut = NULL;
    int x, y;
    int i, margin, size2;

    /* check image characteristics */
    if (!im || im->bands != 1 || im->type == IMAGING_TYPE_SPECIAL)
        return (Imaging) ImagingError_ModeError();

    /* check size of rank filter */
    if (!(size & 1))
        return (Imaging) ImagingError_ValueError("bad filter size");

    size2 = size * size;
    margin = (size-1) / 2;

    if (rank < 0 || rank >= size2)
        return (Imaging) ImagingError_ValueError("bad rank value");

    /* create output image */
    imOut = ImagingNew(im->mode, im->xsize - 2*margin, im->ysize - 2*margin);
    if (!imOut)
        return NULL;

    ... actual algorithm goes here ...

    return imOut;
}

The "im" input object carries multidimensional data, as well as all other properties
needed to describe the contents. There are no separate arguments for the image
dimensions, nor any tricky pointer manipulations. A Python version of this wouldn't
be much different.
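
As a rough illustration of that last point, here is a Python 3 sketch of the same pattern: an object that carries its own dimensions, and a function that checks its arguments and allocates a smaller output. The Grid class and the new_filtered_output function are hypothetical stand-ins, not PIL's API, and the filtering step itself is left out just as in the C snippet.

```python
class Grid:
    """Hypothetical 2-D container that knows its own dimensions."""

    def __init__(self, xsize, ysize, fill=0):
        self.xsize = xsize
        self.ysize = ysize
        self.rows = [[fill] * xsize for _ in range(ysize)]


def new_filtered_output(im, size, rank):
    """Validate the filter parameters and allocate the output grid."""
    if size % 2 == 0:
        raise ValueError("bad filter size")
    margin = (size - 1) // 2
    if not 0 <= rank < size * size:
        raise ValueError("bad rank value")
    # The output loses `margin` cells on every side of the input.
    return Grid(im.xsize - 2 * margin, im.ysize - 2 * margin)


out = new_filtered_output(Grid(10, 8), size=3, rank=4)
print(out.xsize, out.ysize)  # 8 6
```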

</F>
 
Guest

Antoon Pardon said:
The problem is that the fields in lst are associated
with a number that is off by one from how they are normally
counted. If I go and ask my colleague which field
contains some specific data and he answers:
the 5th, I have to remind myself I want lst[4]

This is often a cause for errors.

It sounds like you should wrap that list in an object more reminiscent of
the source data, then.
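
A minimal Python 3 sketch of such a wrapper, assuming the fields are simply numbered from 1 the way a colleague would count them. The class name Record, its field method, and the sample field names are invented for this example.

```python
class Record:
    """Hypothetical wrapper exposing list fields by their 1-based position."""

    def __init__(self, fields):
        self._fields = list(fields)

    def field(self, n):
        # "The 5th field" maps to the underlying index 4.
        if not 1 <= n <= len(self._fields):
            raise IndexError("field number %d out of range" % n)
        return self._fields[n - 1]


lst = ["id", "name", "street", "city", "zip"]
rec = Record(lst)
print(rec.field(5))  # zip
print(rec.field(1))  # id
```

Keeping the translation in one method means the off-by-one adjustment is written once instead of at every call site.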
 
