qsort

E

Eric Sosman

Lawrence said:
A simple example is an insertion sort. A basic implementation
compares adjacent elements. If they are out of order it swaps them and
moves on. Essentially one element moves along the array until it is in its
correct place. An optimisation of this is to copy that element to a
temporary variable and compare that against array elements, move them when
out of order and finally write your temporary back to the free slot in the
array when you find the right position. Esssentially you can optimise a
lot of swap operations into moves. With the requirements of C99 the
element must stay in the array for the purposes of comparison so this
optimisation is no longer possible. You can "make do" but what is the
benefit of potentially cripping the efficiency of the qsort()
implementation?

Leave the "moving" item in place in the array until you
discover its destination, then swap it with its neigbors in
one operation. One of Jon Bentley's "Programming Pearls"
columns shows a neat way of doing this, or you could use
memcpy-memmove-memcpy via a temporary area (note that the
restriction applies only when the comparison function is
called; the use of non-array memory is permitted at other
times during the sort).

There's another consideration: How would qsort() acquire
sufficient temporary memory to hold a copy of an item of
arbitrary size? malloc() is unattractive because it can
fail but qsort() cannot. qsort() could, of course, be a
wrapper around two implementations, one for use when malloc()
succeeds and a fall-back to cope with failure -- but that
seems unattractive, since the dynamic memory doesn't seem to
add much to the overall efficiency.
[...]
Yes, that's the problem, C99 appears to have invented a pointless and
counterproductive restriction.

The restriction doesn't seem counter-productive, but I
admit I don't see much point in it. A comparison function
might, perhaps, modify the pointed-to elements (in a way
that doesn't affect the computed result, of course, and after
casting away the `const'-ness), and such modifications would
be problematic if two versions of "the same" element could
co-exist simultaneously. But why would a comparison function
want to do such a thing? I remain baffled.
 
M

Michael Mair

pete said:
Michael said:
Lawrence said:
On Fri, 19 Nov 2004 14:22:07 +0100, Michael Mair wrote:

pete wrote:


Lawrence Kirby wrote:


On Thu, 18 Nov 2004 13:29:38 -0500, Eric Sosman wrote:


Lawrence Kirby wrote:

I don't see anything in the description of qsort()
that guarantees this.

The C89 wording isn't clear, but C99 makes it explicit:

7.20.5 Searching and sorting utilities
/2/ The implementation shall ensure that [...] both
arguments (when called from qsort), are pointers to
elements of the array. [...]

OK, it is required in C99.
Very strange though, it potentially reduces the
efficiency of the implementation for no obvious benefit.

Umh, I cannot think of a situation where you cannot make do
with pointers to array elements -- the information is hidden
behind void *, so I fail to see the restriction.

A simple example is an insertion sort. A basic implementation
compares adjacent elements. If they are out of order it swaps
them and moves on. Essentially one element moves along
the array until it is in its correct place.
An optimisation of this is to copy that element to a
temporary variable and compare that against array elements,
move them when out of order and finally write your temporary
back to the free slot in the
array when you find the right position.
Esssentially you can optimise a lot of swap operations into moves.
With the requirements of C99 the
element must stay in the array for the purposes of comparison
so this optimisation is no longer possible.
You can "make do" but what is the
benefit of potentially cripping the efficiency of the qsort()
implementation?

Okay, so we essentially have to switch from a few memcpy()s to
one memmove(). For small partitions left by a quicksort or a
"nearly" sorted array, we probably will lose something.
I guess Shell sort would be even more pathological as you cannot
use one memmove but would have to first find out where everything
goes and then loop again to do the memcpy()s.


Shell sort with a temp variable
doesn't need any relooping that I'm aware of.

We are still in the setting that you *must* *not*
use the temporary variable for comparing -- this is
the whole point. If we may use it, of course no relooping
is necessary. But the requirement that both arguments
passed to compar are pointers to objects residing _inside_
the array (which is what we are talking about) effectively
makes it impossible to avoid swapping the "equidistant"
non-adjacent elements if not by some type of relooping.

Example: We look at the N-distant subsets and are looking
for the right position of some element X within its
remainder class. We compare X with the elements with
index index(X)-i*N. If we use a temporary variable
and have to copy an element to a position with higher
index, we have to copy the temporary object into the
"free slot" in order to be able to compare again.
With insertion sort, we have N==1. So, we can just
find out where exactly X goes, copy it to temp, memmove()
the elements in between one element size "up" and
insert X which should be cheaper than doing 2*(number of
elements to move) memcpy()s.
Alternatively, with the information available, we could
just replace the memmove() by (number of elements to move)+1
memcpy()s. In the non-contiguous setting of Shell sort (N>1),
This is exactly what I would do. The relooping probably is
cheaper than the additional memcpy() calls.

e_type is a user defined nonarray type.
GT is a user defined "greater than" macro.
If e_type were to be an array, then all of the e_type
assignments would have to be replaced with memcpy calls.

void s3sort(e_type *array, size_t nmemb)
{
e_type temp, *i, *j, *k, *after;

after = array + nmemb;
if (nmemb > (size_t)-1 / 3) {
nmemb = (size_t)-1 / 3;
}
do {
for (i = array + nmemb; i != after; ++i) {
j = i - nmemb;
if (GT(j, i)) {
k = i;
temp = *k;
do {
*k = *j;
k = j;
if (nmemb + array > j) {
break;
}
j -= nmemb;
} while (GT(j, &temp));
^^^^^^^^^^^^
Here goes your approach.
*k = temp;
}
}
nmemb = nmemb != 2 ? 3 * nmemb / 7 : 1;
} while (nmemb != 0);
}

I hope I made the problem clearer now.

The references to memcpy and memmove
suggest that you're talking about writing qsort in C.
qsort can't depend on malloc (and friends) to provide the temp object,
because malloc may fail with valid arguments, while qsort may not.
That means that if qsort is going to use a temp variable,
it must have an alternative way if malloc fails.
Automatic variables are unsuitable for temp types larger than char,
because their alignment requirements may be less than their size,
while array elements of the same size may need to be aligned
at their full size.

This is a problem I silently excluded and would rather see
as separate.

Any array sorting algorithm that compares and swaps
nonadjacent elements, like quicksort does,
is going to have a very hard time being made stable,
if stable sorting is what you're talking about.

Yep, that is what I was talking about -- I was unaware of
the term "stable sorting", so I only now found out that this
is what I wanted to say...


Cheers
Michael
 
M

Michael Mair

Eric said:
Leave the "moving" item in place in the array until you
discover its destination, then swap it with its neigbors in
one operation. One of Jon Bentley's "Programming Pearls"
columns shows a neat way of doing this, or you could use
memcpy-memmove-memcpy via a temporary area (note that the
restriction applies only when the comparison function is
called; the use of non-array memory is permitted at other
times during the sort).

Do you remember which column? I did not want to go through all
the material of the 2nd Edition as it is on the web and could
not find the specific one googling.

There's another consideration: How would qsort() acquire
sufficient temporary memory to hold a copy of an item of
arbitrary size? malloc() is unattractive because it can
fail but qsort() cannot. qsort() could, of course, be a
wrapper around two implementations, one for use when malloc()
succeeds and a fall-back to cope with failure -- but that
seems unattractive, since the dynamic memory doesn't seem to
add much to the overall efficiency.

How would you do that in a portable way? My first thought
was bytewise XORing for swapping two elements but this
usually is the worst way to do it...
However, without having any kind of dynamic memory allocation
the only other thing is having a (medium-sized) fixed temp-array and
memcpy()ing and memmove()ing junks of at most the size of that
array. Sounds unattractive, too.

[...]
Yes, that's the problem, C99 appears to have invented a pointless and
counterproductive restriction.

The restriction doesn't seem counter-productive, but I
admit I don't see much point in it. A comparison function
might, perhaps, modify the pointed-to elements (in a way
that doesn't affect the computed result, of course, and after
casting away the `const'-ness), and such modifications would
be problematic if two versions of "the same" element could
co-exist simultaneously. But why would a comparison function
want to do such a thing? I remain baffled.

Perhaps a pet algorithm of someone. As I wondered aloud
elsewhere: Does anyone know a reason for this?

I think the example of an underlying Shell sort shows that you
would have to change qsort() implementation to one with more
cost -- which is really unnecessary.


Cheers
Michael
 
E

Eric Sosman

Michael said:
Do you remember which column? I did not want to go through all
the material of the 2nd Edition as it is on the web and could
not find the specific one googling.

I don't recall the column, but I do recall the method.
The challenge is to exchange two adjacent regions of an
array, when the two regions may not be the same size:

abcRSTUVWXYZ -> RSTUVWXYZabc

In the insertion-sort context, "abc" has been determined
to belong just after its neighbors "RSTUVWXYZ," and the
challenge is to get it there. With a temporary area big
enough for "abc" this is simple: memcpy() "abc" to the
temporary, slide the rest leftward with memmove(), and
them memcpy() the temporary to the right-hand end. You
could even make do with a smaller temporary area by using
multiple "rotations:"

abcRSTUVWXYZ -> bcRSTUVWXYZa
-> cRSTUVWXYZab
-> RSTUVWXYZabc

This is unattractive, though, because it copies all the
data multiple times.

Bentley tells the sad tale of many programmers who
tried to make the exchange efficiently but using only a
fixed amount of temporary space. The efforts, he said,
were discouraging: horribly complex code with a tendency
to fall over and die when confronted by edge cases.

Look what happens, though, if you reverse the array
in place, byte-for-byte (or element-for-element):

abcRSTUVWXYZ -> ZYXWVUTSRcba

The two segments are now in the proper positions, but
each segment has been reversed. Correct this by
reversing each segment individually:

ZYXWVUTSRcba -> RSTUVWXYZcba
^^^^^^^^^ ^^^^^^^^^

RSTUVWXYZcba -> RSTUVWXYZabc
^^^ ^^^

.... and the job is complete.

"But what about efficiency? Surely there's a better
way than moving every byte twice!" Bentley thinks not,
based on observation of the complicated but "efficient"
codes. Compilers can easily unroll and otherwise optimize
the simple loop of an in-place reversal, but the more
complicated codes didn't lend themselves to optimization:
too many decisions, too many branches, and so on. Not
only was the "inefficient" method easy to write and easy
to verify, but it actually turned out to be faster than
the (often bug-ridden) "efficient" methods.

If sufficient temporary space is available, I think
the memcpy/memmove/memcpy method is probably best. But
if you've only got a fixed amount of extra space and it
isn't quite big enough, the three-rotation method has
much to recommend it.
 
M

Michael Mair

Eric said:
I don't recall the column, but I do recall the method.
The challenge is to exchange two adjacent regions of an
array, when the two regions may not be the same size:

abcRSTUVWXYZ -> RSTUVWXYZabc

In the insertion-sort context, "abc" has been determined
to belong just after its neighbors "RSTUVWXYZ," and the
challenge is to get it there. With a temporary area big
enough for "abc" this is simple: memcpy() "abc" to the
temporary, slide the rest leftward with memmove(), and
them memcpy() the temporary to the right-hand end. You
could even make do with a smaller temporary area by using
multiple "rotations:"

abcRSTUVWXYZ -> bcRSTUVWXYZa
-> cRSTUVWXYZab
-> RSTUVWXYZabc

This is unattractive, though, because it copies all the
data multiple times.

Bentley tells the sad tale of many programmers who
tried to make the exchange efficiently but using only a
fixed amount of temporary space. The efforts, he said,
were discouraging: horribly complex code with a tendency
to fall over and die when confronted by edge cases.

Look what happens, though, if you reverse the array
in place, byte-for-byte (or element-for-element):

abcRSTUVWXYZ -> ZYXWVUTSRcba

The two segments are now in the proper positions, but
each segment has been reversed. Correct this by
reversing each segment individually:

ZYXWVUTSRcba -> RSTUVWXYZcba
^^^^^^^^^ ^^^^^^^^^

RSTUVWXYZcba -> RSTUVWXYZabc
^^^ ^^^

... and the job is complete.

"But what about efficiency? Surely there's a better
way than moving every byte twice!" Bentley thinks not,
based on observation of the complicated but "efficient"
codes. Compilers can easily unroll and otherwise optimize
the simple loop of an in-place reversal, but the more
complicated codes didn't lend themselves to optimization:
too many decisions, too many branches, and so on. Not
only was the "inefficient" method easy to write and easy
to verify, but it actually turned out to be faster than
the (often bug-ridden) "efficient" methods.

If sufficient temporary space is available, I think
the memcpy/memmove/memcpy method is probably best. But
if you've only got a fixed amount of extra space and it
isn't quite big enough, the three-rotation method has
much to recommend it.

Thank you :)))

-Michael
 
C

Charlie Gordon

[...]
The restriction doesn't seem counter-productive, but I
admit I don't see much point in it. A comparison function
might, perhaps, modify the pointed-to elements (in a way
that doesn't affect the computed result, of course, and after
casting away the `const'-ness), and such modifications would
be problematic if two versions of "the same" element could
co-exist simultaneously. But why would a comparison function
want to do such a thing? I remain baffled.

Such a function could do this for statistical purposes.
Another possibility is that the comparison could require acquiring external
resources to which a handle may need to be stored in the object.
Some other cacheing mechanism might entail modifying the compared objects...
Modifying the objects during the comparison violates the prototype for the
comparison function where the parameters are defined as const void *, so this
was probably not the intent of the standard, thus not the reason for the
restriction.
I am more inclined to assume that specific alignment constraints unknown to
qsort(), and not necessarily matched by malloc(), make it a good reason to
enforce that the objects always be array elements at comparison time.

Chqrlie.
 
L

Lawrence Kirby

On Fri, 19 Nov 2004 16:13:08 +0000, pete wrote:

....
The references to memcpy and memmove
suggest that you're talking about writing qsort in C.
qsort can't depend on malloc (and friends) to provide the temp object,

Certainly it can.
because malloc may fail with valid arguments, while qsort may not.
That means that if qsort is going to use a temp variable,
it must have an alternative way if malloc fails.

That's one solution to the problem.

But consider that failing due to lack of resources could in principle
happen to ANY standard library function, or indeed any user-defined
function. If the C standard forbade such failure the language would be to
all intents and purposes unimplementable.

qsort() doesn't have a way to flag a failure to complete the sort to the
caller, so in such circumstances it can't return. It CAN cause the program
execution to terminate abnormally, e.g. by calling abort(). This isn't
really any different to a program "crashing" at any point due to lack of
stack space (on relevant stack based implementations).
Automatic variables are unsuitable for temp types larger than char,
because their alignment requirements may be less than their size,
while array elements of the same size may need to be aligned
at their full size.

However qsort() is part of the implementation so it can be written with
knowledge of implementation alignment characteristics, in the same way
that malloc() can. Of course allocation failure is a possibility here
too.

Lawrence
 
P

pete

Lawrence said:
On Fri, 19 Nov 2004 16:13:08 +0000, pete wrote:

...


Certainly it can.


That's one solution to the problem.

But consider that failing due to lack of resources could in principle
happen to ANY standard library function, or indeed any user-defined
function.
If the C standard forbade such failure the language would be to
all intents and purposes unimplementable.

If an implementation cannot translate and execute a program
like the one described in N869 5.2.4.1, [#1],
then it is not a conforming implementation.
qsort() doesn't have a way to flag a failure to complete
the sort to the
caller, so in such circumstances it can't return.
It CAN cause the program
execution to terminate abnormally, e.g. by calling abort().

You just made that up. There's nothing in the standard which suggests
that qsort can call abort when qsort is called with valid arguments.
 
L

Lawrence Kirby

Lawrence said:
On Fri, 19 Nov 2004 16:13:08 +0000, pete wrote:

...


Certainly it can.


That's one solution to the problem.

But consider that failing due to lack of resources could in principle
happen to ANY standard library function, or indeed any user-defined
function.
If the C standard forbade such failure the language would be to
all intents and purposes unimplementable.

If an implementation cannot translate and execute a program
like the one described in N869 5.2.4.1, [#1],
then it is not a conforming implementation.

But it is up to the implementation to choose the program. It will clearly
choose a program that it can translate and execute. This does not require
it to be able to translate and execute successfully any other program. The
compiler could be written to recognise this program specifically and just
generate code to produce the correct output. Indeed the program could be
designed to produce no output.
You just made that up. There's nothing in the standard which suggests
that qsort can call abort when qsort is called with valid arguments.

I'll try to be clearer. Any program (except the one designated containing
examples of the translation limits) can fail in execution at any point
without creating a conformance issue for the implementation. If qsort() is
part of the implementation and if it decides it can't complete the
operation, it is at perfect liberty to cause the execution of the program
to fail. I suggested calling abort() as a means to achieve this. If you
don't like that it isn't a problem, it will just have to use some
"internal" mechanism, e.g. it could do the equivalent of causing a stack
overflow (or whatever makes sense on the implementation).

Lawrence
 
P

pete

Lawrence said:
Lawrence said:
On Fri, 19 Nov 2004 16:13:08 +0000, pete wrote:

...

The references to memcpy and memmove
suggest that you're talking about writing qsort in C.
qsort can't depend on malloc (and friends)
to provide the temp object,

Certainly it can.

because malloc may fail with valid arguments,
while qsort may not.
That means that if qsort is going to use a temp variable,
it must have an alternative way if malloc fails.

That's one solution to the problem.

But consider that failing due to lack of resources
could in principle
happen to ANY standard library function, or indeed any user-defined
function.
If the C standard forbade such failure the language would be to
all intents and purposes unimplementable.

If an implementation cannot translate and execute a program
like the one described in N869 5.2.4.1, [#1],
then it is not a conforming implementation.

But it is up to the implementation to choose the program.
It will clearly choose a program that it can translate and execute.
This does not require it to be able to translate and execute
successfully any other program. The compiler could be written to
recognise this program specifically and just
generate code to produce the correct output.
Indeed the program could be designed to produce no output.

It has nothing to do with choosing.
Anything that can't translate and execute a C program,
isn't a C implementation. It's as simple as that.
 
K

Keith Thompson

pete said:
Lawrence Kirby wrote: [...]
But it is up to the implementation to choose the program.
It will clearly choose a program that it can translate and execute.
This does not require it to be able to translate and execute
successfully any other program. The compiler could be written to
recognise this program specifically and just
generate code to produce the correct output.
Indeed the program could be designed to produce no output.

It has nothing to do with choosing.
Anything that can't translate and execute a C program,
isn't a C implementation. It's as simple as that.

No, it's not as simple as that.

The following is a C program. On a system with infinite available
memory, it will never terminate. No real system can execute it
(unless it optimizes away the recursive calls).

static void recurse(void)
{
char data[1024];
recurse();
recurse();
}

int main(void)
{
recurse();
}

This is an extreme case, but all implementations have capacity
limitations.
 
G

Guillaume

The following is a C program. On a system with infinite available
memory, it will never terminate. No real system can execute it
(unless it optimizes away the recursive calls).

static void recurse(void)
{
char data[1024];
recurse();
recurse();
}

int main(void)
{
recurse();
}

Well, a clever C compiler might just compile this as a single infinite
loop, but I wouldn't count on it. This piece of code will confuse a lot
of them as to optimizing.
This is an extreme case, but all implementations have capacity
limitations.

This is an infinite recursion.

Therefore, it goes beyond the scope of any programming language except
maybe a formal Turing machine. In the real world, infinite recursion
makes no sense. Can we really speak of capacity limitations? Yes and no.
What is infinite capacity, to begin with? What kind of problem would an
infinite recursion be able to solve? And lastly, does infinity really
exist? Is our universe finite or infinite?

Are we going too far? ;-)
 
K

Keith Thompson

Guillaume said:
The following is a C program. On a system with infinite available
memory, it will never terminate. No real system can execute it
(unless it optimizes away the recursive calls).
static void recurse(void)
{
char data[1024];
recurse();
recurse();
}
int main(void)
{
recurse();
}

Well, a clever C compiler might just compile this as a single infinite
loop, but I wouldn't count on it. This piece of code will confuse a lot
of them as to optimizing.

Which is what I meant by "unless it optimizes away the recursive
calls". (I should have referred to transforming them into an infinite
loop; my wording implied eliminating them altogether, which is of
course invalid.)
This is an infinite recursion.

Therefore, it goes beyond the scope of any programming language except
maybe a formal Turing machine. In the real world, infinite recursion
makes no sense. Can we really speak of capacity limitations? Yes and no.
[snip]

I agree with your last statement, except for the "and no" part.

All implementations have capacity limitations, some more stringent
than others. If a program has, for example, allocated *almost* all
the memory available to it, a call to qsort() (or to any function) may
fail due to a lack of memory. Since qsort() in particular is likely
to be recursive, it's entirely possible for the initial call to
qsort() to succeed, but for the program to run out of memory while
qsort() is running. There's no mechanism for qsort() to report this
(or any other failure) to the caller; aborting the program is probably
the most reasonable response.

The standard provides a mechanism to report running out of heap
(malloc() returning a null pointer), but not to report running out of
stack. (The standard doesn't use the terms "heap" and "stack", but
you get the idea.)

The standard also allows an implementation to be very sloppy about
documenting its capacity limitations. Since the amount of memory
available to a program can vary randomly depending on the current
state of the system, that's probably unavoidable.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,156
Messages
2,570,878
Members
47,405
Latest member
DavidCex

Latest Threads

Top