Promoting unsigned long int to long int

P

pereges

Hello, I'm trying to sort an array of pointers based on the values
they point to. I'm using the quick sort method. The array of pointers
is parent_vpa and left, right represent the indices. The axis can take
value 0,1,2 and hence i decided to use unsigned char in order to save
some space. I noticed that the program fails and terminates abnormally
when I use left and right as unsigned long but works well when its
data type is declared as long. I would have preferred unsigned long
because the array indices should not be negative and in the calling
function unsigned integers (left = 0, right = n - 1) are passed. I
think the error occurs because of when there is exactly one element in
the array , the j-- can cause j to assume garbage values. My question
is am I breaking any rule or commiting some illegal action by passing
an unsigned long int as a parameter in calling function but using a
long int to receive the variable ? My compiler doesn't complain.

#define ulong unsigned long int
#define uchar unsigned char

void quicksort(vector **parent_vpa, ulong left, ulong right, uchar
axis)
{
ulong i, j;
vector *pivotpoint;
vector *tempstore;

i = left;
j = right;
pivotpoint = parent_vpa[(left+right)/2];

while (i <= j)
{
while (parent_vpa->coord[axis] < pivotpoint->coord[axis])
{
i++;
}
while (parent_vpa[j]->coord[axis] > pivotpoint->coord[axis])
{
j--;
}

if (i <= j)
{
tempstore = parent_vpa;
parent_vpa = parent_vpa[j];
parent_vpa[j] = tempstore;
i++;
j--;
}
}

if (left < j)
{
quicksort(parent_vpa, left, j, axis);
}

if (i < right)
{
quicksort(parent_vpa, i, right, axis);
}

return;
}
 
J

Jens Thoms Toerring

pereges said:
Hello, I'm trying to sort an array of pointers based on the values
they point to. I'm using the quick sort method. The array of pointers
is parent_vpa and left, right represent the indices. The axis can take
value 0,1,2 and hence i decided to use unsigned char in order to save
some space. I noticed that the program fails and terminates abnormally
when I use left and right as unsigned long but works well when its
data type is declared as long. I would have preferred unsigned long
because the array indices should not be negative and in the calling
function unsigned integers (left = 0, right = n - 1) are passed. I
think the error occurs because of when there is exactly one element in
the array , the j-- can cause j to assume garbage values. My question
is am I breaking any rule or commiting some illegal action by passing
an unsigned long int as a parameter in calling function but using a
long int to receive the variable ? My compiler doesn't complain.

Using unsigned long should be completely ok.
#define ulong unsigned long int
#define uchar unsigned char
void quicksort(vector **parent_vpa, ulong left, ulong right, uchar
axis)
{
ulong i, j;
vector *pivotpoint;
vector *tempstore;
i = left;
j = right;

Lets look at the case you say seems to be the one where things
fail, left = right = 0. In that case i = j = 0
pivotpoint = parent_vpa[(left+right)/2];

Since i and j are equal you get into this loop.
while (i <= j)
{

The next two inner loops shouldn't do anything.
while (parent_vpa->coord[axis] < pivotpoint->coord[axis])
{
i++;
}
while (parent_vpa[j]->coord[axis] > pivotpoint->coord[axis])
{
j--;
}

if (i <= j)
{
tempstore = parent_vpa;
parent_vpa = parent_vpa[j];
parent_vpa[j] = tempstore;


This is useless if i == j.

Now i gets set to 1.

And j gets set to ULONG_MAX (unsigned integers "wrap around").
That will keep you in this loop for a loooong time and you will
try to access elements of your array that don't exist, rather
likely resulting in a crash.
if (left < j)
{
quicksort(parent_vpa, left, j, axis);
}
if (i < right)
{
quicksort(parent_vpa, i, right, axis);
}
return;
}

The simplest solution is to just bail out of the function if
left is equal to right - there's nothing to sort in that case.

Regards, Jens
 
P

pereges

Using unsigned long should be completely ok.
Ok.

And j gets set to ULONG_MAX (unsigned integers "wrap around").
That will keep you in this loop for a loooong time and you will
try to access elements of your array that don't exist, rather
likely resulting in a crash.

Yeah, I notice that too.
The simplest solution is to just bail out of the function if
left is equal to right - there's nothing to sort in that case.

You mean :

void quicksort (void quicksort(vector **parent_vpa, ulong left, ulong
right, uchar
axis)
{
ulong i, j;

if (left == right)
{
i = left;
j = right;

while (i <= j)
{
...
}
if (left < j)
{
quicksort(parent_vpa, left, j, axis);
}

if (i < right)
{
quicksort(parent_vpa, i, right, axis);
}

}
else
return;
 
J

Jens Thoms Toerring

pereges said:
On Jun 30, 5:07 pm, (e-mail address removed) (Jens Thoms Toerring) wrote:
Yeah, I notice that too.
You mean :
void quicksort (void quicksort(vector **parent_vpa, ulong left, ulong
right, uchar
axis)
{
ulong i, j;
if (left == right)

Not exactly;-) I they're the same there's nothing to sort since
your array only has a single element, so I meant

if ( left < right )
{
i = left;
j = right;
while (i <= j)
{
...
}
if (left < j)
{
quicksort(parent_vpa, left, j, axis);
}
if (i < right)
{
quicksort(parent_vpa, i, right, axis);
}
}
else
return;
Regards, Jens
 
K

Keith Thompson

pereges said:
#define ulong unsigned long int
#define uchar unsigned char
[...]

These types already have perfectly good names already. Why give them
new ones?

If you must rename them for some reason, use typedefs, not macros.
 
P

pereges

These types already have perfectly good names already. Why give them
new ones?

No purpose really. Just that in some of my functions, there are too
many parameters already and the whole thing doesn't fit in single
line.
If you must rename them for some reason, use typedefs, not macros.

ok
 
K

Keith Thompson

pereges said:
No purpose really. Just that in some of my functions, there are too
many parameters already and the whole thing doesn't fit in single
line.

So write it on multiple lines.

Given:
#define ulong unsigned long int
#define uchar unsigned char
or, preferably:
typedef unsigned long int ulong;
typedef unsigned char uchar;

If I'm reading your code and see a reference to "ulong", I can't
understand what it means until I've confirmed that "ulong" means
"unsigned long int" (which I'd probably write as "unsigned long").
And I have to wonder whether you might some day change the definition
so "ulong" means something else.

If you drop the definition of "ulong" and just write "unsigned long"
directly, I don't have to wonder; your code will be clearer.

Now if you want a typedef whose name says something about how you're
using the type, rather than how it's define, that's a different
matter.

You use "ulong" for array indices; it would make more sense to use
size_t.

You use "uchar" for a parameter that can only have the value 0, 1, or
2. I'd use int. Using unsigned char might save some space, but it's
just as likely to cost you in code size, since the compiler has to
generate code to expand the 1-byte value into a word before it can
operate on it, and to shrink it back down to 1 byte before storing it.
You also make the reader wonder whether there's some fundamental
reason for this value to fit into a byte (there isn't). If you had an
array of these things, it would make sense to use a smaller type.
Since it's just a single parameter, using int is fine.
 
J

Jens Thoms Toerring

No purpose really. Just that in some of my functions, there are too
many parameters already and the whole thing doesn't fit in single
line.

You don't have to put all arguments etc. on a single line, e.g.

void
quicksort( vector ** parent_vpa,
unsigned long left,
unsigned long right,
unsigned char axis )

will do nicely and may even increases readabilty (and there's
still space for adding a short comment;-)

Regards, Jens
 
P

pereges

So write it on multiple lines.

Given:
#define ulong unsigned long int
#define uchar unsigned char
or, preferably:
typedef unsigned long int ulong;
typedef unsigned char uchar;

If I'm reading your code and see a reference to "ulong", I can't
understand what it means until I've confirmed that "ulong" means
"unsigned long int" (which I'd probably write as "unsigned long").
And I have to wonder whether you might some day change the definition
so "ulong" means something else.

If you drop the definition of "ulong" and just write "unsigned long"
directly, I don't have to wonder; your code will be clearer.

Now if you want a typedef whose name says something about how you're
using the type, rather than how it's define, that's a different
matter.

You use "ulong" for array indices; it would make more sense to use
size_t.

You use "uchar" for a parameter that can only have the value 0, 1, or
2. I'd use int. Using unsigned char might save some space, but it's
just as likely to cost you in code size, since the compiler has to
generate code to expand the 1-byte value into a word before it can
operate on it, and to shrink it back down to 1 byte before storing it.
You also make the reader wonder whether there's some fundamental
reason for this value to fit into a byte (there isn't). If you had an
array of these things, it would make sense to use a smaller type.
Since it's just a single parameter, using int is fine.

If I had to store 80-90 of structures and each had a member say
'axis'. How much of a difference would using a unsigned char over int
would make (for that member) ? For my program, space as well as
performance are important.
 
P

pereges

You don't have to put all arguments etc. on a single line, e.g.

void
quicksort( vector ** parent_vpa,
unsigned long left,
unsigned long right,
unsigned char axis )

will do nicely and may even increases readabilty (and there's
still space for adding a short comment;-)

Just a style question (I'm cleaning up my code)

Which of the two in your opinion is better ?

void quicksort( vector ** parent_vpa,
unsigned long left,
unsigned long right,
unsigned char axis ) {

....
....
}

OR

void quicksort( vector ** parent_vpa,
unsigned long left,
unsigned long right,
unsigned char axis )
{

....
....
}
 
K

Keith Thompson

pereges said:
If I had to store 80-90 of structures and each had a member say
'axis'. How much of a difference would using a unsigned char over int
would make (for that member) ? For my program, space as well as
performance are important.

It depends.

If "axis" is the last member of your structure, it's likely that the
compiler will insert padding after it if it's an unsigned char anyway,
so declaring it as int would make no difference.

In the worst case (well, the worst plausible case), the cost of using
int would be 90 * (sizeof(int) - 1), which is trivial on anything
other than a small embedded system. But using int could easily save
code space, depending on the underlying architecture. And if you're
concerned about speed, it's likely (though by no means guaranteed)
that operations on int will be faster than operations on unsigned
char.

I think you're engaging in premature micro-optimization.
 
P

pereges

btw, i see a lot of people saying that size_t must be used over
unsigned long. The question is are there any fix guidelines as to when
i should and when i shouldn't use size_t. It seems to me that size_t
is most commonly used in three situations :

1. Size of array or any object.
2. Array indices.
3. Count of something.

Although I don't see why using unsigned long for any of the above
could be wrong. It may not be a necessity to use size_t
 
S

santosh

pereges said:
If I had to store 80-90 of structures and each had a member say
'axis'. How much of a difference would using a unsigned char over int
would make (for that member) ? For my program, space as well as
performance are important.

Well assuming that an int is four bytes and there is no padding involved
you could save 240 to 270 bytes. Not much. Under many systems memory is
allocated by the OS on a page basis and a common page size is 4 Kb. The
OS might allocate a complete page even for a one byte object. Also
objects passed to functions via the stack are typically padded to
maintain stack alignment.

So while saving storage is indeed important, I wouldn't worry about
replacing small amounts of int with char, unless you are targetting
severely restricted systems. OTOH I would definitely accept a chance to
save something on the order of a kilobyte or more.
 
J

Jens Thoms Toerring

pereges said:
On Jun 30, 9:44 pm, (e-mail address removed) (Jens Thoms Toerring) wrote:
Just a style question (I'm cleaning up my code)
Which of the two in your opinion is better ?
void quicksort( vector ** parent_vpa,
unsigned long left,
unsigned long right,
unsigned char axis ) {


void quicksort( vector ** parent_vpa,
unsigned long left,
unsigned long right,
unsigned char axis )
{
....
....
}

I usually go for the way that has more white-space, probably
because my eyes aren't as good anymore as they used to be,
so I usually put the opening brace on a new line (at least
when I am writing C, in Perl I do it the other way round;-).
So I personally don't mind much. Just pick something that
looks good to you and is logically consistent (or, if you
are working somewhere, use the in-house style) and use that
consistently.
Regards, Jens
 
K

Keith Thompson

pereges said:
btw, i see a lot of people saying that size_t must be used over
unsigned long. The question is are there any fix guidelines as to when
i should and when i shouldn't use size_t. It seems to me that size_t
is most commonly used in three situations :

1. Size of array or any object.
2. Array indices.
3. Count of something.

Although I don't see why using unsigned long for any of the above
could be wrong. It may not be a necessity to use size_t

size_t can represent the size in bytes of any object (since it's the
type of the result of the sizeof operator and of the argument to
malloc(). It can therefore represent the size in elements of any
array object, making it suitable for array indices.

The third is a bit more iffy; it depends on what you're counting.

But there are no such guarantees for unsigned long. Consider a
hypothetical system where unsigned long is 32 bits and size_t is 64
bits. You might have objects whose size cannnot be represented as an
unsigned long value.

The question is, why would you want to use unsigned long rather than
size_t?
 
S

santosh

pereges said:
btw, i see a lot of people saying that size_t must be used over
unsigned long. The question is are there any fix guidelines as to when
i should and when i shouldn't use size_t. It seems to me that size_t
is most commonly used in three situations :

1. Size of array or any object.

This is the purpose of size_t.
2. Array indices.

IMHO, any unsigned type of suitable range will do. Sometimes signed
types are convenient too, for certain forms of loops, and rarely, to
index with a negative offset.
3. Count of something.

Again a suitable unsigned type, not necessarily size_t should be okay.
Although I don't see why using unsigned long for any of the above
could be wrong. It may not be a necessity to use size_t

You are correct. In C, size_t has the only purpose of being large enough
to hold the size in bytes of the largest possible single object. Also
the sizeof operator yields a size_t value for related reasons. It is
not necessarily a suitable type for counting things are holding
offsets, indexes and such.

Generally, a careful examination of the origin, purpose and types of use
a value may be subject to will give you a clue as to the most suitable
type of object to represent it.
 
P

pereges

This is the purpose of size_t.


IMHO, any unsigned type of suitable range will do. Sometimes signed
types are convenient too, for certain forms of loops, and rarely, to
index with a negative offset.


Again a suitable unsigned type, not necessarily size_t should be okay.


You are correct. In C, size_t has the only purpose of being large enough
to hold the size in bytes of the largest possible single object. Also
the sizeof operator yields a size_t value for related reasons. It is
not necessarily a suitable type for counting things are holding
offsets, indexes and such.

Generally, a careful examination of the origin, purpose and types of use
a value may be subject to will give you a clue as to the most suitable
type of object to represent it.

Thanks for the clarification. After reading this, I guess I will
continue using unsigned long for all the 3 situations.
 
P

pereges

Not exactly;-) I they're the same there's nothing to sort since
your array only has a single element, so I meant

if ( left < right )

btw its still not working. I'm not sure why but I had to use long
instead of unsigned long to make it work.
 
P

pereges

size_t can represent the size in bytes of any object (since it's the
type of the result of the sizeof operator and of the argument to
malloc(). It can therefore represent the size in elements of any
array object, making it suitable for array indices.

The third is a bit more iffy; it depends on what you're counting.

But there are no such guarantees for unsigned long. Consider a
hypothetical system where unsigned long is 32 bits and size_t is 64
bits. You might have objects whose size cannnot be represented as an
unsigned long value.

The question is, why would you want to use unsigned long rather than
size_t?

Suppose I have a structure:

struct mesh
{
unsigned long nvert; /* number of vertices */
unsigned long ntri; /* number of triangles */
vector *vert; /* pointer to array of vertices */
triangle *tri; /* pointer to array of triangles */
};

Now, according to what you are saying nvert and ntri should have data
type size_t instead. In my program its possible for nvert and ntri to
be in millions. I have been adviced that size_t in many cases causes
loss of data, hence I'm skecptical about its use. Also, I would want
to keep things consistent.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,186
Members
46,740
Latest member
JudsonFrie

Latest Threads

Top