calling qsort

B

Ben Bacarisse

Joachim Schmitz said:
Still not perfect, overflow might occur. Better:
int mycmp2(const void *a, const void *b)
{
const char *p1=a, *p2=b;

return (*p1 > *p2) - (*p1 < *p2);
}

Help! How is one less prone to overflow than the other?
 
J

Joachim Schmitz

Ben said:
Help! How is one less prone to overflow than the other?

Sorry for the nonsens I wrote ... I meant that using

return *p1 - *p2;

as Richard at one point recommended, might overflow.

However, I'd still prefer

int mycmp2(const void *a, const void *b)
{
const char *p1=a, *p2=b;

return (*p1 > *p2) - (*p1 < *p2);
}

over

int mycmp2(const void *a, const void *b)
{
const char *p1=a, *p2=b;
int compare = 0;

if (*p1 < *p2) compare = -1;
if (*p1 > *p2) compare = 1;
return compare;
}


Bye, Jojo
 
R

Ralf Damaschke

6.3.1.3 refers to (int)LONG_MAX for example. (where LONG_MAX >
INT_MAX)
Computing (*(int *)p2)-(*(int *)p1) can result in integer overflow
which invokes undefined behavior.
Another (more clear) example is INT_MAX + 1. Just evaluating such
expression invokes undefined behavior.

Hmm, seems my search was too cursory. After doing a longer search
now I only find 6.2.5p9 which somewhat astonishes me as the
fundamental difference between signed and unsigned is introduced
in a mere subordinate clause (in a linguistic sense).

-- Ralf
 
R

Richard

Joachim Schmitz said:
Sorry for the nonsens I wrote ... I meant that using

return *p1 - *p2;

as Richard at one point recommended, might overflow.

It might. But since its a "home rolled" function tend to know the
numbers range. So perfect, and fast, when the ints do not reach ranges
where overflow might occur ....

I wonder what your code looks like where other subtractions with ints
are done. Do you compare them all the time before hand?
 
V

vippstar

Hmm, seems my search was too cursory. After doing a longer search
now I only find 6.2.5p9 which somewhat astonishes me as the
fundamental difference between signed and unsigned is introduced
in a mere subordinate clause (in a linguistic sense).

I'm curious, why "<no name given>"?
 
R

Ron Ford

Still not perfect, overflow might occur. Better:
int mycmp2(const void *a, const void *b)
{
const char *p1=a, *p2=b;

return (*p1 > *p2) - (*p1 < *p2);
}

Or with the casts:

int mycmp2(const void *a, const void *b)
{
return (*(const char *)a > *(const char *)b) - (*(const char *)a < *(const
char *)b);
}

but I like the former better.

Bye, Jojo

Nonsense, the possibility of overflow occurs when you return a number that
is the difference of the inputs, and it's completely unnecessary. See
Heathfield's analysis in this thread.

When you return a number that is necessarily 1, -1 or 0, how would it
overflow on a system with more than one bit bytes?

The critical thing you need here is the compare variable !=a or b, and its
inclusion was something I picked up from Glen Hermannsfeldt in c.l.f.:

integer function compare(i1,i2)
compare=0
if(i1>i2) compare=1
if(i1<i2) compare=-1
return
end

The necessity of the compare variable here is like the necessity of temp in
a reasonable swap.
 
R

Ron Ford

Ron Ford said:
Richard Heathfield posted:
pete said:
There are about (2 * INT_MAX) situations,
where the result is undefined. [...]
I agree with your reasoning but not with your mathematics.
I agree with your mathematics but not your calculation.

Touche'. :)

I think what you've shown is that you hit UB with just about an eighth of
the cases. There's 2 *INT_MAX + 1 integers, if INT_MIN = -INT_MAX. a
vector populated by two ints would then have (2*IM+1)**2 possiblities,
which is close to 4*IM**2. UB happens S(n+1) times=m(m+1)/2=(n+1)(n+2)/2
where n = IM. That's close to IM**2 /2. Their ratio is eight to one.

I can't think of a theoretical justifcation for why one eighth is the magic
ratio here.
During. I take it you've never heard of cafe church?

No. I hope for your sake that it's fundamentally different than the
fundies we have around here. I actually made it to services, and it was
all about forgiveness today, but I cannot find forgiveness for these
neocons. Are you supposed to forgive the robbers and usurpers who have
tagged you for seven years, and will continue to do so in the future? I'll
forgive Cheney when he's in Leavenworth or the Haag. (or hell)
 
K

Keith Thompson

Ron Ford said:
The critical thing you need here is the compare variable !=a or b, and its
inclusion was something I picked up from Glen Hermannsfeldt in c.l.f.:

integer function compare(i1,i2)
compare=0
if(i1>i2) compare=1
if(i1<i2) compare=-1
return
end

The necessity of the compare variable here is like the necessity of temp in
a reasonable swap.

No, the compare variable is necessary because of the way functions
work in Fortran. C's return statement lets you return a value; you
don't have to store that value in a variable (though of course you
can).

In C, I'd consider any of the following to be reasonable:

if (i1 < i2) {
return -1;
}
elsif (i1 > i2) {
return 1;
}
else {
return 0;
}

or:

int result = 0;
if (i1 < i2) {
result = -1;
}
if (i1 > i2) {
result = 1;
}
return result;

or:

return (i1 > i2) - (i1 < i2);

The latter is admittedly tricky. If you think it's *too* tricky, I
won't argue with you.
 
K

Keith Thompson

Ron Ford said:
No. I hope for your sake that it's fundamentally different than the
fundies we have around here. I actually made it to services, and it was
all about forgiveness today, but I cannot find forgiveness for these
neocons. Are you supposed to forgive the robbers and usurpers who have
tagged you for seven years, and will continue to do so in the future? I'll
forgive Cheney when he's in Leavenworth or the Haag. (or hell)

Can we *please* avoid discussing politics here?
 
R

Ron Ford

No, the compare variable is necessary because of the way functions
work in Fortran. C's return statement lets you return a value; you
don't have to store that value in a variable (though of course you
can).

§1.8 K&R speaks to this, specifically naming fortran. I haven't quite
sorted this out, and I find Cspeak and Fortranspeak difficult to combine in
this context.

F03 promises C interoperability. One of the things you can do is
explicitly "call by value." Unfortunately, no f03 compiler exists.:-(

In C, I'd consider any of the following to be reasonable:

if (i1 < i2) {
return -1;
}
elsif (i1 > i2) {
return 1;
}
else {
return 0;
}

or:

int result = 0;
if (i1 < i2) {
result = -1;
}
if (i1 > i2) {
result = 1;
}
return result;

or:

return (i1 > i2) - (i1 < i2);

The latter is admittedly tricky. If you think it's *too* tricky, I
won't argue with you.

I think this goes to legibility. Good code documents itself. There's no
significant performance loss/gain here, but in the latter scenario, it's
harder to see the zero outcome in particular. Add a bunch of pointers and
casts, and then you open the door for UB.

Of course, you weren't arguing.
 
K

Kenny McCormack

Can we *please* avoid discussing politics here?

Don't see how. We discuss religion all the time.
And, in modern day America, politics and religion are intimately bound
together. Deal with it.
 
B

Ben Bacarisse

Richard Heathfield said:
Back to topic: the observation that the ratio of UB additions of ints to
the total number of additions of ints is (approximately)

((INT_MAX * INT_MAX) / 2) / ((INT_MAX * 2) * (INT_MAX * 2)) =
(1/2) / (2 * 2) =
1/8

is interesting for two reasons: (1) it's so high, and (2) it's a constant
ratio (to a first approximation, anyway), regardless of the value of
INT_MAX.

Unless I am mistaken about your method it is high, in part, because you
count every addition twice: a+b and b+a. You stared with subtractions
so naturally you want to consider the order, but switching to
additions I'd just call it 1/4 which just seems to be what I'd expect.
 
C

Chris Torek

I think there's two issues here. An implementation could modify the call
to qsort. Chris Torek says this about that:

%- That said, I do not know of any C implementations that modify calls
%- to qsort().  Of course, this does not guarantee that none exist.
%- (Among other things, to cross comp.lang.c threads a bit, I have
%- never seen or used a [full] C99 implementation.  Perhaps one of
%- those -- several are rumored to exist -- might substitute some
%- qsort() calls, for instance.

I don't understand what "modify calls to qsort()" means.

This was in a side discussion about whether compilers are allowed
to recognize (and modify) calls to "standard library" functions.
We have examples of compilers that *do* this. Under gcc, with
certain invocations, this:

#include <stdio.h>

int main(void) {
printf("hello world\n");
return 0;
}

compiles to code that links with the puts() routine, not the printf()
routine. Clearly there is a call to printf(), but in fact there
is a call to puts(). The compiler has modified the source code,
so that there is no call to printf() in the object code.

Similarly, compilers may "inline" calls to memcpy() (many do),
sqrt() (some do), and so on. So if you, as a C programmer simply
using some implementation(s), attempt to do tricky things "behind
the back" of your implementation -- such as provide your own private
printf() or sqrt() or memcpy() -- you may get ambushed by the
compiler.

The main take-away lesson for the C programmer is, or at least
should be, "when you write a sort routine, do not call it qsort()".
Use some other name, such as "Hoare_QuickSort()" for instance.
Then, even if some fancy compiler recognizes and modifies calls to
qsort() -- just as gcc modifies calls to printf() and memcpy() --
it will not affect *your* code, because you did not use the name
"qsort". (In this respect, K&R, even K&R2, are doing a bit of a
disservice. To be fair, the original book was written during the
"wild west days of C" as it were, when C was whatever Dennis's
compiler did, and even the second edition predated the C standard.
The new standard gave compiler-writers more options than Dennis
had perhaps intended originally.)
 
K

Kenny McCormack

Not quite right. It is not a kosher comment in C90, which Ron is apparently
using (gcc -o sort -Wall -pedantic -ansi c6.c).

It is perfectly legal in C99 and a common extension in many compilers that
are not C99

Bye, Jojo

What JoJo is trying to say is this: Get rid of those crippling options
in the command line! Don't use options whose only function is to
cripple your compiler (meaning: -ansi and -pedantic).
 
R

Ralf Damaschke

wrote:
I'm curious, why "<no name given>"?

Above you see what my newsreader produced for the attribution
from the name in your "From:" header .
I felt that this looks too obscure and added a placeholder to
where the name should have been.

-- Ralf
 
J

James Kuyper

Ralf said:
wrote:


Above you see what my newsreader produced for the attribution
from the name in your "From:" header .
I felt that this looks too obscure and added a placeholder to
where the name should have been.

In my own newsreader, and when I use Google groups, his From: line is
properly filled in.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,820
Latest member
GilbertoA5

Latest Threads

Top