More pythonic shell sort?

A

akameswaran

Disclaimer - I recognize this is not a practical exercise. There are
many implementations around that would do the job better, more
efficiently (Meaning in C) or whatever.

I caught some thread about sorting and started thinking about shell
sort.... and as part of my trying to pythonise my mind and break my
java mindset

So I decided to tackle this old school problem with the python mindset.


I played around with some list comprehensions, trying to use slicing
inteligently etc. Anyway this is what I came up with. If anyone has
suggestions on a more pythonic way to go (all attempts at using list
comprehensions turned into unruly rubbish quite quickly) I'd
appreciate the input. An aside - can anyone tell me where the use +=
and -= is documented? it works but I can't find it in my docs.
(don't ask about shellSorters 1 thru 3)

class shellSorter4(object):

def __init__(self,gapSeq):
self.gapSeq = gapSeq # gap sequences are
notoriously hard to tune
self.gapSeq.sort(reverse=True)

def sort(self,myList):
for gap in self.gapSeq:
for i in range(1,gap+1):
self.gapInsertionSort(myList,i,gap)

def gapInsertionSort(self,theList,start,gap):
for i in range(start,len(theList),gap):
j = i
curElem = theList[j]
while (j > 0) and (theList[j-gap] > curElem):
j -=gap # undocumented
feature??
if j!=i:
theList.insert(j,theList.pop(i))
theList.insert(j+gap,theList.pop(j+1))
 
J

John Machin

Disclaimer - I recognize this is not a practical exercise. There are
many implementations around that would do the job better, more
efficiently (Meaning in C) or whatever.

I caught some thread about sorting and started thinking about shell
sort.... and as part of my trying to pythonise my mind and break my
java mindset

So I decided to tackle this old school problem with the python mindset.


I played around with some list comprehensions, trying to use slicing
inteligently etc.

Slicing? I don't see any, and besides you don't want to be copying
chunks of your list anyway -- see below.
Anyway this is what I came up with. If anyone has
suggestions on a more pythonic way to go (all attempts at using list
comprehensions turned into unruly rubbish quite quickly) I'd
appreciate the input. An aside - can anyone tell me where the use +=
and -= is documented? it works but I can't find it in my docs.
(don't ask about shellSorters 1 thru 3)

class shellSorter4(object):

def __init__(self,gapSeq):
self.gapSeq = gapSeq # gap sequences are
notoriously hard to tune
self.gapSeq.sort(reverse=True)

Not exactly stand-alone, if it needs another sort for its initialisation :)
def sort(self,myList):
for gap in self.gapSeq:
for i in range(1,gap+1):

Use xrange instead of range, to save memory.
self.gapInsertionSort(myList,i,gap)

def gapInsertionSort(self,theList,start,gap):

Having a method call here will slow it down somewhat.

for i in range(start,len(theList),gap):
j = i
curElem = theList[j]
while (j > 0) and (theList[j-gap] > curElem):

I think that you mean "while j >= gap" ... otherwise theList[j-gap] will
be using Python's wrap-around negative subscripting to access something
at the other end of the list! And you'll never know, because the last
pass with gap == 1 will (laboriously) clean up any mess. You should
instrument your code with counts of comparisons and rearrangements, and
compare those with examples from the textbooks and the literature *AND*
your pencil-and-paper experiments with a gap-list of (say) [5, 1] on a
small number of elements.
j -=gap # undocumented
feature??
if j!=i:
theList.insert(j,theList.pop(i))
theList.insert(j+gap,theList.pop(j+1))

Quadruple yuck. Each pop() and insert() could involve moving many list
elements unnecessarily. It's not "pythonic" to use advanced language
features when they are inefficient for the job at hand. All you need is
len(alist), alist = value, and value = alist. A simple translation
of a shellsort written in C would do the job admirably.

Perhaps to Pythonise your mind you should try an object-oriented example
-- one that truly needs a class or two; your shellsort doesn't really
need a class (that's your Java background shining through!).

HTH,
John
 
A

akameswaran

Thanks for the critique.

John said:
Slicing? I don't see any, and besides you don't want to be copying
chunks of your list anyway -- see below.

That was two of the other sort implementations I tried - far uglier
than this. there were sorts 1-3. And while I liked the idea of using
the list manipulations built into python - they absolutely were
horrible for this. I was almost tempted to post that as well - but it
really was an abomination.
Not exactly stand-alone, if it needs another sort for its initialisation :)


Use xrange instead of range, to save memory.

for large sorts, yeah xrange is good. with the runs I was doing, it
was irrelevant
Having a method call here will slow it down somewhat.

true enough, performance wasn't a consideration on this one

for i in range(start,len(theList),gap):
j = i
curElem = theList[j]
while (j > 0) and (theList[j-gap] > curElem):

I think that you mean "while j >= gap" ... otherwise theList[j-gap] will
be using Python's wrap-around negative subscripting to access something
at the other end of the list! And you'll never know, because the last
pass with gap == 1 will (laboriously) clean up any mess. You should
instrument your code with counts of comparisons and rearrangements, and
compare those with examples from the textbooks and the literature *AND*
your pencil-and-paper experiments with a gap-list of (say) [5, 1] on a
small number of elements.

good catch on that one. Didn't think about it. and with the low
probability of it happening it WOULD have been missed - doh. The
versions I was running were instrumented - but took that out for
clarities sake. However j>= gap would not work - say on the first
run where gap should be N/2 (just the worst case) but obviously the
first elements would never get sorted. . An additional test for j-gap
0 would suffice.
j -=gap # undocumented
feature??
if j!=i:
theList.insert(j,theList.pop(i))
theList.insert(j+gap,theList.pop(j+1))

Quadruple yuck. Each pop() and insert() could involve moving many list
elements unnecessarily. It's not "pythonic" to use advanced language
features when they are inefficient for the job at hand. All you need is
len(alist), alist = value, and value = alist. A simple translation
of a shellsort written in C would do the job admirably.

I didn't really think of pop and insert as advanced features. But it
was part of my goals to use features that exist in python - remembering
that they are python lists, not funky arrays.
As far as C goes, that was shellSorter1. Since performance wasn't my
goal - I rather liked the expresiveness of the pop and insert. makes
it almost read like a book.
Perhaps to Pythonise your mind you should try an object-oriented example
-- one that truly needs a class or two; your shellsort doesn't really
need a class (that's your Java background shining through!).

True enough - java is hard to break sometimes. I was tempted to add
some try catches just for fun too. I've written quite a bit that is
more complex - then I get caught in just making it work. Hence the
point of writing simple well understood problems.
 
J

John Machin

Thanks for the critique.

John said:
On 10/06/2006 7:00 AM, (e-mail address removed) wrote: [snip]
def sort(self,myList):
for gap in self.gapSeq:
for i in range(1,gap+1):
self.gapInsertionSort(myList,i,gap)

def gapInsertionSort(self,theList,start,gap):
for i in range(start,len(theList),gap):
j = i
curElem = theList[j]
while (j > 0) and (theList[j-gap] > curElem):

[some other comments and responses snipped above, to show the relevant
code in one piece]
I think that you mean "while j >= gap" ... otherwise theList[j-gap] will
be using Python's wrap-around negative subscripting to access something
at the other end of the list! And you'll never know, because the last
pass with gap == 1 will (laboriously) clean up any mess. You should
instrument your code with counts of comparisons and rearrangements, and
compare those with examples from the textbooks and the literature *AND*
your pencil-and-paper experiments with a gap-list of (say) [5, 1] on a
small number of elements.

good catch on that one. Didn't think about it. and with the low
probability of it happening it WOULD have been missed - doh. The
versions I was running were instrumented - but took that out for
clarities sake. However j>= gap would not work - say on the first
run where gap should be N/2 (just the worst case) but obviously the
first elements would never get sorted. . An additional test for j-gap
0 would suffice.

I don't understand your response. Let len(theList) == N and assume that
the first gap is N/2. The first time that gapInsertionSort is called,
start will be 1 and gap will be N/2.
Loop index i becomes start i.e. 1.
j becomes 1.
while (j > 0) [True, must test the other part of the "and"] and
.... (theList[j-gap] > curElem) *BUT* the subscript is 1-(N/2) which is
negative and thus will wrap around and cause an utterly meaningless
comparison.

In fact it will happen the first time in gapInsertionSort for any gap >
1. In one trial I did sorting list('qwertyasdfgzxcvb') (i.e. N == 15) it
made 20 meaningless comparisons.

Question 1: What do you mean by "the low probability of it happening"?

Question 2: I don't understand "obviously the first elements would never
get sorted", as the last pass with gap == 1 cleans up any errors of
commission or omission made by the earlier passes. Could you please
explain that? Or alternatively can you show your code with the "an
additional test for j-gap > 0 would suffice" fix applied to it?

Question 3: All implementations and descriptions of shellshort that I
have found consist of THREE nested loops. This includes Knuth's TAOCP
and K&R 2nd edition. Yours consists of FOUR nested loops (FIVE if you
count the looping inside insert() and pop()). Where did you get your
implementation from?
j -=gap # undocumented
feature??
if j!=i:
theList.insert(j,theList.pop(i))
theList.insert(j+gap,theList.pop(j+1))
Quadruple yuck. Each pop() and insert() could involve moving many list
elements unnecessarily. It's not "pythonic" to use advanced language
features when they are inefficient for the job at hand. All you need is
len(alist), alist = value, and value = alist. A simple translation
of a shellsort written in C would do the job admirably.

I didn't really think of pop and insert as advanced features.


They are relatively advanced compared with subscripting, which is *all*
that is needed to implement shellsort.
But it
was part of my goals to use features that exist in python - remembering
that they are python lists, not funky arrays.

What kind of arrays are "funky" arrays?
As far as C goes, that was shellSorter1. Since performance wasn't my
goal - I rather liked the expresiveness of the pop and insert. makes
it almost read like a book.

IMHO, like a book written by James Joyce. :)
True enough - java is hard to break sometimes. I was tempted to add
some try catches just for fun too.

Try adding some assertions, like:
assert 0 <= j-gap < n
I've written quite a bit that is
more complex - then I get caught in just making it work.

So it seems. Simple design, testing, desk-checking, testing, *permanent*
instrumentation, testing, *permanent* assertions, testing, ... all these
can help :)

Cheers,
John
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,968
Messages
2,570,152
Members
46,697
Latest member
AugustNabo

Latest Threads

Top