size_t or int for malloc-type functions?

jacob navia

Recently I posted code in this group, to help a user
that asked to know how he could find out the size of
a block allocated with malloc.

As always when I post something, the same group
of people started to try to find possible errors,
a harmless pastime they seem to enjoy.

One of their remarks was that I used "int" instead of
"size_t" for the input of my allocator function.

As I explained, I prefer a signed type to the
unsigned size_t because small negative numbers will be
confused with huge integers when interpreted as unsigned.

I researched a bit the subject, and I found a type

ssize_t

The name is defined for instance in
http://www.delorie.com/gnu/docs/glibc/libc_239.html

Data Type: ssize_t
This data type is used to represent the sizes of blocks that can be
read or written in a single operation. It is similar to size_t, but must
be a signed type.
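For context, here is the sort of use ssize_t was designed for: POSIX I/O calls return it so that -1 can signal an error while non-negative values carry a byte count. A sketch assuming a POSIX environment; read_some is just an illustrative helper name:

```c
#include <fcntl.h>
#include <unistd.h>

/* Read up to cap bytes from path into buf. The ssize_t result can
   carry -1 (error) as well as a byte count -- exactly the property
   a signed size type buys you. */
ssize_t read_some(const char *path, char *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, cap);
    close(fd);
    return n;
}
```

Reading /dev/null yields 0 (immediate end-of-file); a nonexistent path yields -1.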

Another reference to this type appears in:
http://bochs.sourceforge.net/cgi-bin/lxr/ident?i=ssize_t
with
#define ssize_t long

This concern with the use of an unsigned type that can
easily lead to errors (of course only for people who do
make errors, like me...) is also expressed in the document
ISO/IEC JTC1 SC22 WG14 N1135 :
"Specification for Safer, More Secure C Library Functions"
where they propose:

Extremely large object sizes are frequently a sign that an object’s size
was calculated incorrectly. For example, negative numbers appear as very
large positive numbers when converted to an unsigned type like size_t.

Also, some implementations do not support objects as large as the
maximum value that can be represented by type size_t.

For those reasons, it is sometimes beneficial to restrict the range of
object sizes to detect programming errors.

They propose an unsigned rsize_t, together with a macro RSIZE_MAX
that limits the range of the object size.

I post this to document why having an "int" as an argument to a
malloc-type function is not such a bad idea.

Your opinion may differ.

jacob
 
P.J. Plauger

Recently I posted code in this group, to help a user
that asked to know how he could find out the size of
a block allocated with malloc.

As always when I post something, the same group
of people started to try to find possible errors,
a harmless pastime they seem to enjoy.

One of their remarks was that I used "int" instead of
"size_t" for the input of my allocator function.

As I explained, I prefer a signed type to the
unsigned size_t because small negative numbers will be
confused with huge integers when interpreted as unsigned.

I researched a bit the subject, and I found a type

ssize_t

The name is defined for instance in
http://www.delorie.com/gnu/docs/glibc/libc_239.html

Data Type: ssize_t
This data type is used to represent the sizes of blocks that can be
read or written in a single operation. It is similar to size_t, but must
be a signed type.

Another reference to this type appears in:
http://bochs.sourceforge.net/cgi-bin/lxr/ident?i=ssize_t
with
#define ssize_t long

This concern with the use of an unsigned type that can
easily lead to errors (of course only for people who do
make errors, like me...) is also expressed in the document
ISO/IEC JTC1 SC22 WG14 N1135 :
"Specification for Safer, More Secure C Library Functions"
where they propose:

Extremely large object sizes are frequently a sign that an object’s size
was calculated incorrectly. For example, negative numbers appear as very
large positive numbers when converted to an unsigned type like size_t.

Also, some implementations do not support objects as large as the
maximum value that can be represented by type size_t.

For those reasons, it is sometimes beneficial to restrict the range of
object sizes to detect programming errors.

They propose an unsigned rsize_t, together with a macro RSIZE_MAX
that limits the range of the object size.

I post this to document why having an "int" as an argument to a
malloc-type function is not such a bad idea.

Your opinion may differ.

The problem with int is that it throws away half the address space.
This is a *big* issue with 16-bit address spaces, but of course
such machines are largely relegated to embedded systems, and small
ones at that. Nevertheless, I regularly need to deal with objects
bigger than half the address space even today, perhaps because I
write so much systems code. So I find the choice of slicing the
address space in half arbitrary and potentially dangerous.

That's why I pushed for the notion of an RSIZE_MAX to accompany
the unsigned rsize_t that Microsoft put forth in TR 24731. You
can set it to:

-- (size_t)-1 >> 2 if you want the same protection as a signed
byte count

-- some other value if you know how big objects can really be,
and get maximum protection against silly sizes

-- (size_t)-1 if you want to turn the damned checking off
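Those three policies can be written out directly. In this sketch, RSIZE_MAX and r_malloc are hypothetical names modeled on the TR 24731 proposal, not anything a shipping library is claimed to provide:

```c
#include <stdlib.h>

/* Choose one of the three policies described above: */
#define RSIZE_MAX ((size_t)-1 >> 2)       /* protection comparable to a signed count */
/* #define RSIZE_MAX ((size_t)1 << 20) */ /* "how big objects can really be": say 1 MiB */
/* #define RSIZE_MAX ((size_t)-1)      */ /* checking effectively turned off */

/* r_malloc: refuse any request larger than RSIZE_MAX. */
void *r_malloc(size_t n)
{
    if (n > RSIZE_MAX)
        return NULL;   /* with either of the first two settings,
                          a converted negative lands here */
    return malloc(n);
}
```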

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
jacob navia

P.J. Plauger wrote:
The problem with int is that it throws away half the address space.
This is a *big* issue with 16-bit address spaces, but of course
such machines are largely relegated to embedded systems, and small
ones at that.

Yes. I see this too as a problem, but then, in such small
systems the amount of RAM tends to be very small too, and
the situation remains the same.

In a DSP I used last year the total amount of RAM was
4K, so with a 16-bit int I had plenty of range
anyway :)

Nevertheless, I regularly need to deal with objects
bigger than half the address space even today, perhaps because I
write so much systems code.

In that case it might be useful to have two sets of malloc
functions: one for the small allocations, and another for the
"big" ones.
So I find the choice of slicing the
address space in half arbitrary and potentially dangerous.

That's why I pushed for the notion of an RSIZE_MAX to accompany
the unsigned rsize_t that Microsoft put forth in TR 24731. You
can set it to:

-- (size_t)-1 >> 2 if you want the same protection as a signed
byte count

-- some other value if you know how big objects can really be,
and get maximum protection against silly sizes

-- (size_t)-1 if you want to turn the damned checking off

Checking can be a nuisance but it can be an advantage sometimes. It
depends on the situation.

Thanks for your input.

jacob
 
Richard Heathfield

jacob navia said:

As I explained, I prefer a signed type to the
unsigned size_t because small negative numbers will be
confused with huge integers when interpreted as unsigned.

Why would you need a small negative number as an argument to malloc? Are you
trying to allocate a negative amount of memory?
I researched a bit the subject, and I found a type

ssize_t

The name is defined for instance in
http://www.delorie.com/gnu/docs/glibc/libc_239.html

It's a POSIX type, not a C type. Try comp.unix.programmer.

<snip>
 
Eric Sosman

jacob said:
P.J. Plauger wrote:

Yes. I see this too as a problem, but then, in such small
systems, the amount of ram tends to be very small too, and
the situation remains the same.

It's the same, and yet not the same. The size of a data
structure does not usually scale linearly with the width of
its constituent words. I have not worked with 16-bit machines
for a number of years now, but when I did it was not unusual to
want to allocate >32K to a single object. By contrast, I have
never needed to allocate >2G as a single object on a 32-bit
system.

YMMV, but it seems to me that the need to "use all the
bits" is more pressing on small systems than on large ones.
In that case it might be useful to have two sets of malloc
functions: one for the small allocations, and another for the
"big" ones.

Walk that road with caution: I've been down it, and it is
twisty and dangerous, beset with bandits and worse. You need
to exercise a *lot* of discipline to segregate the memory blocks
you get from multiple allocators: If you obtain a block from a
hypothetical lmalloc() and release it with plain free(), chaos
is likely.

But then, you may be considering a different path, like an
smalloc(int) as a sanity-checker wrapped around malloc(size_t):

#include <stdlib.h>
void *smalloc(int bytes) {
    return (bytes < 0) ? NULL : malloc(bytes);
}

If so, the hobgoblin of multiple allocators disappears. But
implementing smalloc() is no trick at all; you can do it for
yourself with the tools already provided, if you like. You might
want to consider leaving the argument type as size_t, though, to
give yourself more freedom in setting the failure threshold:

#include <stdlib.h>
#include <limits.h>
#define THRESHOLD ((INT_MAX / 2u + 1) * 3) /* for example */
void *smalloc(size_t bytes) {
    return (bytes < THRESHOLD) ? malloc(bytes) : NULL;
}

Returning for a moment to your original motivation for using
a signed argument:
> As I explained, I prefer a signed type to the
> unsigned size_t because small negative numbers will be
> confused with huge integers when interpreted as unsigned.

I don't see this as a big problem. If the program blunders and
asks for -42 bytes, conversion to size_t turns this into a request
for a very large amount of memory (by the machine's standards,
and assuming a size_t of similar "width" to the addresses). The
request almost certainly fails and returns NULL, with no harm
done. So instead of filtering the argument going into malloc(),
it might be more useful to monitor the result:

#include <stdlib.h>
void *zmalloc(size_t bytes) {
    void *new = malloc(bytes);
    if (new == NULL)
        print_debugging_info();
    return new;
}

This would catch absurd arguments (they'll provoke malloc()
failure) and also help track down the slobs who call malloc()
and fail to check the result for NULL. If desired, one could
also monitor the incoming argument for "reasonableness," to
help find code that's making excessively greedy requests.
 
CBFalconer

jacob said:
Recently I posted code in this group, to help a user
that asked to know how he could find out the size of
a block allocated with malloc.

As always when I post something, the same group
of people started to try to find possible errors,
a harmless pastime they seem to enjoy.

One of their remarks was that I used "int" instead of
"size_t" for the input of my allocator function.
.... snip ...

I post this to document why having an "int" as an argument to a
malloc-type function is not such a bad idea.

Your opinion may differ.

It does. Are you or are you not the implementor of lcc-win32? If
so, you can easily limit the range accepted within the malloc code,
without fouling the specifications of the standard. If not you
shouldn't be fooling with routines that are defined in the
standard.

--
Some informative links:
<http://members.fortunecity.com/nnqweb/> (newusers)
<http://www.catb.org/~esr/faqs/smart-questions.html>
<http://www.caliburn.nl/topposting.html>
<http://www.netmeister.org/news/learn2quote.html>
<http://cfaj.freeshell.org/google/> (taming google)
 
jacob navia

Eric Sosman wrote:
It's the same, and yet not the same. The size of a data
structure does not usually scale linearly with the width of
its constituent words. I have not worked with 16-bit machines
for a number of years now, but when I did it was not unusual to
want to allocate >32K to a single object. By contrast, I have
never needed to allocate >2G as a single object on a 32-bit
system.

YMMV, but it seems to me that the need to "use all the
bits" is more pressing on small systems than on large ones.

With a bit more reflection I think you (and Mr. Plauger)
have a point here, Eric.

Yes, in 16 bit systems all bits may be needed, and this
may be more important than catching a wrong allocation.

jacob
 
Kenny McCormack

Richard Heathfield said:
Why would you need a small negative number as an argument to malloc? Are you
trying to allocate a negative amount of memory?

Explanation for those (like RH) with limited ability to read between the
lines: What's going on here is the combination of buggy programming
(incompetent programmers) who calculate things and then pass the results
to library functions and buggy implementations (like Linux) that allow
all mallocs to succeed.

You put the two together, and you get chaos.
 
jacob navia

Richard Heathfield wrote:
Why would you need a small negative number as an argument to malloc? Are you
trying to allocate a negative amount of memory?

Can't you read?

This thread is not for you

I said in my original message:

"... the usage of an unsigned type that can
easily lead to errors (of course only for people that do
make errors like me...)"

You never make any errors, Heathfield, since you are a "competent
C programmer". Please go away. This thread is about errors
and their prevention. You do not need it.
 
P.J. Plauger

P.J. Plauger wrote:

Yes. I see this too as a problem, but then, in such small
systems the amount of RAM tends to be very small too, and
the situation remains the same.

In a DSP I used last year the total amount of RAM was
4K, so with a 16-bit int I had plenty of range
anyway :)

That may be your experience, but mine has been that 16-bit systems
often have >32KB of memory. Anything that interferes with handling
all of memory in one go is sure to cause trouble, sooner or later.
In that case it might be useful to have two sets of malloc
functions: one for the small allocations, and another for the
"big" ones.

A nice complement to the malloc(0) discussion elsegroup. My idea
of elegance is to have one function that accepts any size from 0 to
the largest representable object, and do something uniformly sane
with it. But YMMV.
Checking can be a nuisance but it can be an advantage sometimes. It
depends on the situation.

But you *always* have to check if you want to keep your program sane.
The only advantage of a signed byte count is that it makes it slightly
easier to check for a clearly bogus size. And the program had still
better check for a null pointer return from malloc. I find it hard to
conceive of a situation in this day and age where you *wouldn't* want
both malloc and the caller of malloc to check. I can't imagine what
situation would make such checking an avoidable "nuisance".
Thanks for your input.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
CBFalconer

jacob said:
Richard Heathfield wrote:
.... snip ...

Can't you read?

This thread is not for you

This is a *public* newsgroup. If you don't want comments on your
posts, don't post.
 
Richard Heathfield

jacob navia said:
Richard Heathfield wrote:

Can't you read?

I can read just fine. Why would you need a small negative number as an
argument to malloc?
This thread is not for you

This is Usenet. If you want a private discussion, use email.
I said in my original message:

"... the usage of an unsigned type that can
easily lead to errors (of course only for people that do
make errors like me...)"

On the contrary, using an unsigned type as malloc's argument *eliminates*
the possibility of requesting a negative size.

<nonsense snipped>
 
Stephen Sprunk

jacob navia said:
One of their remarks was that I used "int" instead of
"size_t" for the input of my allocator function.

As long as your allocator isn't named malloc(), calloc(), or realloc(),
that's up to you.

Of course, one must wonder why you'd ever want to allow folks to request
negative amounts of memory and exactly what it means if they do. Making
the argument unsigned (whether size_t or something else) makes it much
more obvious what you intended the proper use to be, and it doubles the
number of legitimate argument values.
As I explained, I prefer a signed type to the
unsigned size_t because small negative numbers will be
confused with huge integers when interpreted as unsigned. ....
This concern with the use of an unsigned type that can
easily lead to errors (of course only for people who do
make errors, like me...)

The typical C programmer answer to this is "don't do that". C's
philosophy, at its core, is to give programmers the tools to shoot
themselves in the foot if they so desire. If you frequently shoot
yourself in the foot, then consider using another language, like BASIC,
that doesn't give you that option, or spend more time learning how to
code defensively or how to use your debugger.
is also expressed in the document ISO/IEC JTC1 SC22 WG14 N1135 :
"Specification for Safer, More Secure C Library Functions"
where they propose:

Extremely large object sizes are frequently a sign that an object’s size
was calculated incorrectly. For example, negative numbers appear as very
large positive numbers when converted to an unsigned type like size_t.

Also, some implementations do not support objects as large as the
maximum value that can be represented by type size_t.

For those reasons, it is sometimes beneficial to restrict the range of
object sizes to detect programming errors.

They propose an unsigned rsize_t, together with a macro RSIZE_MAX
that limits the range of the object size.

There is nothing preventing the implementor from returning NULL if the
request to malloc() et al appears to be erroneous, such as being "a
small negative number" converted to unsigned. We do not need a change
to the standard for this, since the standard already allows the
implementation to return NULL for any reason it wishes -- just like we
don't need a change to the standard to add bounds-checking pointers.
Implementors are free to do all sorts of extra work behind the scenes to
try to prevent problems, if they wish.
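A sketch of that point: since NULL is always a conforming answer, an implementation's allocator can pre-screen suspicious sizes. The name guarded_malloc and the half-range threshold are illustrative choices here, not the behavior of any particular implementation:

```c
#include <stdlib.h>

/* guarded_malloc: an implementation-style front end that treats any
   request in the top half of size_t's range as a probable converted
   negative and refuses it outright. */
void *guarded_malloc(size_t n)
{
    if (n > (size_t)-1 / 2)
        return NULL;   /* conforming: malloc may return NULL for any request */
    return malloc(n);
}
```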

Most implementors, however, take the position that it's more important
to give people the freedom to do unexpected things than to treat them
like idiots who need adult supervision. Insulting your customers is not
a sustainable business practice.

S
 
Richard Tobin

"... the usage of an unsigned type that can
easily lead to errors (of course only for people that do
make errors like me...)"
On the contrary, using an unsigned type as malloc's argument *eliminates*
the possibility of requesting a negative size.

You seem to be deliberately misunderstanding - surely reading the rest
of the thread makes it clear. The mistake is that you inadvertently
pass an incorrect value to malloc(). Sometimes such incorrect values
will be negative, and if the argument to malloc() were signed, it
could notice the error, rather than treating it as a very large
positive value.

On most current general-purpose computers, such a large positive value
will fail anyway, so it wouldn't be very helpful on those systems.

-- Richard
 
Kenny McCormack

This is a *public* newsgroup. If you don't want comments on your
posts, don't post.

Are you really that stupid? Are you really so stupid that you don't
follow what Jacob is really saying when he says "This thread is not for
you"?

You
Macho programmers need not apply.
folks
Macho programmers need not apply.
need
Macho programmers need not apply.
to
Macho programmers need not apply.
learn
Macho programmers need not apply.
to
Macho programmers need not apply.
read
Macho programmers need not apply.
between
Macho programmers need not apply.
the
Macho programmers need not apply.
lines.
 
Kenny McCormack

Stephen Sprunk said:
Most implementors, however, take the position that it's more important
to give people the freedom to do unexpected things than to treat them
like idiots who need adult supervision. Insulting your customers is not
a sustainable business practice.

Seems to work just fine for MS. But then again, they're not "most
implementors", I suppose. They're insignificant.
 
Mark McIntyre

This thread is not for you

I'm sorry, where does it say in the Rules of Usenet that some threads
are forbidden to some posters?

(snip collection of sarcastic and gratuitous insults).
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
Mark McIntyre

The mistake is that you inadvertently
pass an incorrect value to malloc(). Sometimes such incorrect values
will be negative,

That's impossible, if the argument is unsigned.
and if the argument to malloc() was signed, it
could notice this error, rather than treating it as a very large
positive value.

Huh? So you sacrifice half the possible address space, to cater for a
stupid programming error. Actually, this sounds about par for the
course, given the recent string thread.
On most current general-purpose computers, such a large positive value
will fail anyway, so it wouldn't be very helpful on those systems.

It works fine on all my general purpose computers. That is to say, it
fails if the attempt is for too much memory, and it works otherwise.
What else would we reasonably expect it to do?
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
jacob navia

Mark McIntyre wrote:
I'm sorry, where does it say in the Rules of Usenet that some threads
are forbidden to some posters?

That was of course just advice.

I said in my original post

As always when I post something, the same group
of people started to try to find possible errors,
a harmless pastime they seem to enjoy.

You belong to that group.
 
jacob navia

P.J. Plauger wrote:
That may be your experience, but mine has been that 16-bit systems
often have >32KB of memory. Anything that interferes with handling
all of memory in one go is sure to cause trouble, sooner or later.

Yes. You are right on this point. For 16-bit systems the
loss of 32K of address space is quite a hit, especially
if you do have the full 64K.

jacob
 
