size_t or int for malloc-type functions?


Richard Heathfield

Mark McIntyre said:
The compiler doesn't read "-1" though, does it?

No, you're right - the preprocessor reads -1 (in the code under
consideration), and converts it into a pp-token. But I don't think it's
unreasonable to speak of the compiler, or at least the implementation,
reading -1 as being the argument to malloc.
It's an expression of
type int, whose value is represented by some bits which, when regarded
as a signed int, equal -1, and when regarded as an unsigned long, make
up (say) 0xFFFF.

Let's say 0xFFFFFFFF instead, shall we?
Imagine if you'd passed in 'a'.

Then it would be a positive number with a value <= CHAR_MAX, and certainly
representable as a size_t, and it would be of no relevance whatsoever to
this discussion.
No, it receives a set of bits in some memory address or register, and
interprets them as a size_t.

Nothing in the C spec requires malloc to receive its parameter via a memory
address or register, or to do any interpreting of that value as a size_t.
It could receive the parameter by parcel post or carrier pigeon, with any
necessary interpretation already done for it, for all the C Standard cares.
But the Standard *does* require malloc to receive a parameter of type
size_t. So I maintain that my statement was correct, and note that yours
was rather less so.
 

Guest

CBFalconer said:
I think that works everywhere. I would rework it slightly to:

void *calloc(size_t n, size_t s) {
void *result;
size_t sz;

result = NULL;
if (SIZE_MAX / n < s) {

What if n == 0 ?
 

av

And that's where it breaks down. You see, where are you going to _get_
these small negative numbers? Are you going to get a negative
multiplicand from sizeof? No, because that's defined as giving a
positive number under all circumstances. Is your programmer going to
specify a negative number of objects? Hardly likely. That would be a
blunder of the first order.
So whence the negative number? Probably, one supposes, from multiplying
two largeish positive numbers and getting a signed integer overflow. Ah,
but! But signed integer overflow causes undefined behaviour. So the
error is not trying to allocate a negative number of bytes, the error is
computing the negative in the first place, and it's an error that is
allowed to be fatal and cannot reliably be caught.

Of course, there _is_ an easy way to stop the undefined behaviour. That
way is not to use signed integers for sizes in the first place.
Multiplying an unsigned integer by an (unsigned) size_t gives you
another unsigned integer. The multiplication cannot overflow, and cannot
cause UB. It _can_ wrap around, but that error is fairly easy to detect;
the way to do this is left as an exercise to the reader, but should not
evade any first-year student of C.

So, by suggesting that instead of the unsigned size_t, we should use
signed int or ssize_t, you are effectively advocating replacing a safe
method of handling malloc() in which overly large sizes are easily
spotted, by an unsafe method in which overly large numbers cause
untrappable errors which can only be caught after the damage has already
been done, and in which the program may crash before you even get to
check whether the result is negative at all. Is that wise? Seems to me
that it's not.

Richard

Google "integer overflow" and Bugtraq.
Some types cannot "overflow" (they wrap, using modular arithmetic),
and when they do wrap, it has to be easy to find out that they have.
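
For readers who want the exercise spelled out, the wrap-around check mentioned above can be sketched as follows. This is only an illustration: the helper name size_mul_ok and its interface are assumptions, not anything posted in the thread. It relies on unsigned multiplication wrapping rather than invoking undefined behaviour, and it tests by division before multiplying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): stores n * s in *out and
   returns true if the product fits in a size_t; returns false, leaving
   *out untouched, if the multiplication would wrap around. */
static bool size_mul_ok(size_t n, size_t s, size_t *out)
{
    assert(out != NULL);
    if (n != 0 && s > SIZE_MAX / n)
        return false;           /* n * s would exceed SIZE_MAX */
    *out = n * s;
    return true;
}
```

A caller would test the return value before handing the product to malloc, and treat a false result as an allocation failure.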
 

Jun Woong

Harald said:
CBFalconer wrote: [...]
void *calloc(size_t n, size_t s) {
void *result;
size_t sz;

result = NULL;
if (SIZE_MAX / n < s) {

What if n == 0 ?

That should be

if (n == 0 || s == 0 || SIZE_MAX / n > s) {

or something like that.


--
Jun, Woong (woong at icu.ac.kr)
Samsung Electronics Co., Ltd.

``All opinions expressed are mine, and do not represent
the official opinions of any organization.''
 

christian.bau

In order to validly argue that a definition of a term is wrong, you
must reference a differing more authoritative definition. In this case,
the authoritative definition of size_t is the one provided by the C
standard - there is no higher authority you can refer to, to justify
calling the definition wrong. The standard's definition might be poorly
written, unreadable, useless, meaningless, internally inconsistent,
inconsistent with other parts of the standard, inconsistent with some
other standard, unimplementable, or it might possess any of a wide
variety of other negative characteristics. But since the C standard is the
relevant authority, its definition of size_t is inherently incapable
of being wrong.

The word "wrong" is used with different meanings; it can mean
inappropriate, ill-advised, unsound and so on. I could reasonably say
"The definition of the strncpy function in the C Standard is wrong".
Some people say "trigraphs are wrong" or harsher things. On the other
hand, I don't think the definition of size_t is wrong in that sense.
 

christian.bau

Stephen said:
Not portably and in a single object. size_t is _defined_ to be able to
hold the size of the largest possible object. That, of course, doesn't
exclude systems where you can allocate SIZE_MAX bytes multiple times
(e.g. MS DOS) or where there is some other (i.e. non-portable)
allocator.

I did a search for "size_t" in the C99 final draft, and I couldn't
actually find anything that says size_t must be able to store the size
of any array returned by calloc. You cannot define a type that is
bigger than can be represented using size_t (at least a compiler cannot
let you use sizeof for such a type), and a call to malloc/realloc
cannot request such an object, but a call to calloc can.

Being able to allocate such a large object would have some other
consequences, like indexing and pointer subtraction might be harder to
implement, but I didn't find anything that doesn't allow calloc to
return large objects.
 

Peter Nilsson

Richard said:
Mark McIntyre said:

No, you're right - the preprocessor reads -1 (in the code under
consideration), and converts it into a pp-token.

No, -1 is actually two pp-tokens, a punctuator and a pp-number. It's
surprising what sorts of things constitute a pp-number (e.g. 8teen), but
a leading sign is not part of it. Even when preprocessing tokens are
converted to tokens, -1 remains as two tokens.

All that said, -1 is still the expression that constitutes the argument
to malloc. It evaluates to -1 (ta da!) and has type int. Prior to
calling, malloc's parameter, which has type size_t, is _assigned_ the
value -1. The semantic rules of assignment require a conversion take
place from the int value to size_t. When the call is actually made,
malloc's parameter already has the converted value. [At least, that is
the way the standard defines the call.]

The major problem that can occur is when malloc is not prototyped and
the mentioned conversion does not take place (more precisely the
behaviour is undefined). Mark's description is more literally true of
what takes place on most implementations when undefined behaviour is
invoked.

Of course, Mark's description is also literally true of the way that
many implementations (upon which size_t has the same size as int,
neither has padding bits, and the conversion is a no-op) perform the
call, but it is unwise to think that all implementations are similar in
that regard.

Certainly, on most (all?) stack based implementations where size_t
is wider than int, it is easy to look at the disassembly and see that
what gets pushed onto the stack is a size_t and not an int. This is
an indication that the implementation writer has read the standard
and knows that a conversion needs to be performed before the
function call itself.
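
The conversion described above can be observed directly. In this sketch, received is a hypothetical stand-in for any prototyped function with a size_t parameter (malloc, for instance); nothing here depends on how a particular implementation passes its arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a prototyped function taking a size_t,
   such as malloc. It simply hands back the value it was given. */
static size_t received(size_t n)
{
    return n;
}

static void demo_conversion(void)
{
    /* The argument expression -1 has type int. Because a prototype is
       in scope, it is converted to size_t before the call, and the
       conversion is defined modulo SIZE_MAX + 1. */
    size_t n = received(-1);

    assert(n == SIZE_MAX);
    assert(n == (size_t)-1);
}
```

So a prototyped malloc(-1) is a request for SIZE_MAX bytes, whatever the bit pattern of the int -1 happens to be.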
 

CBFalconer

Jun said:
Harald said:
CBFalconer wrote: [...]
void *calloc(size_t n, size_t s) {
void *result;
size_t sz;

result = NULL;
if (SIZE_MAX / n < s) {

What if n == 0 ?

That should be

if (n == 0 || s == 0 || SIZE_MAX / n > s) {

or something like that.

The version I have just put in nmalloc.c (not yet published) is:

/* calloc included here to ensure that it handles the
same range of sizes (s * n) as does malloc. The
multiplication n*s can wrap, yielding a too small
value, so we must ensure calloc rejects this.
*/
void *ncalloc(size_t n, size_t s)
{
    void *result;
    size_t sz;

    result = NULL;
    if (!n || ((size_t)-1) / n > s) {
        sz = n * s;
        if ((result = nmalloc(sz))) memset(result, 0, sz);
    }
    return result;
} (* ncalloc *)

which makes the output of ncalloc be that of nmalloc(0) whenever
either n or s is 0. I think there is still a possible glitch when
((size_t)-1) / n == s. The only thing that needs protection
against n==0 is the division. s==0 will simply force sz==0.
 

Zara

Richard Heathfield wrote:

C'MON HEATHFIELD
can you STOP IT???????

BUGS NEVER MAKE SENSE!!!

THAT'S WHY THEY ARE BUGS!!!!

<..>

Please, jacob, do stop shouting.

All RH is saying is that the bug is neither in the standard nor in
malloc, it lies in the program that requests a chunk of memory greater
than the maximum value of size_t. And so it is.

Trying to put the bug in the standard library routines or in the
standard wording is trying to hide the bug made by the programmer.

Best regards (and my best wishes for peace)


Zara
 

pete

Zara said:
<..>

Please, jacob, do stop shouting.

All RH is saying is that the bug is neither in the standard nor in
malloc,

The function under discussion is calloc, not malloc.
it lies in the program that requests a chunk of memory greater
than the maximum value of size_t. And so it is.

What RH said is wrong.
sizeof operates on types.
sizeof does not need to be able to report how many bytes
of memory are allocated by calloc.
 

Peter Nilsson

Zara said:
Please, jacob, do stop shouting.

All RH is saying is that the bug is neither in the standard nor in
malloc,

Yet the committee itself has considered the issue as a weakness
of the standard.
it lies in the program that requests a chunk of memory greater
than the maximum value of size_t. And so it is.

Trying to put the bug in the standard library routines or in the
standard wording is trying to hide the bug made by the programmer.

True. However... the committee's job is to codify C in terms of actual
practice. [The most perverse example on record is gets().]

I don't think it's an exaggeration to say there are hundreds of
thousands of C programs that allocate memory dynamically. A significant
portion (probably even most!) fail to check for size_t overflow (er,
wrap-around).
With size_t becoming wider and wider, now is as good a time as any
to consider allocation functions that take size_t and what they should
do with extremely large (likely bogus) requests.

Richard's stance is basically that new programs should get it right
from the start and existing programs which get it wrong should
be modified and corrected. Sound idea, but a tad idealistic. ;-)

The committee's options include considering more pragmatic
viewpoints. Sure, the options considered don't prevent the kinds
of bug being discussed, but greater detection and mitigation of
effects is not necessarily a Bad Thing (tm).
 

pete

pete said:
The function under discussion is calloc, not malloc.


What RH said is wrong.
sizeof operates on types.
sizeof does not need to be able to report how many bytes
of memory are allocated by calloc.

But that's not really the point.
If calloc can't return a pointer to
"space for an array of nmemb objects,
each of whose size is size"
then it should return a null pointer,
rather than a pointer to
((size_t)nmemb * (size_t)size) bytes of memory.
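
To see why handing back ((size_t)nmemb * (size_t)size) bytes is dangerous, here is a small sketch of what the wrapped product looks like. The function name naive_total is hypothetical; it simply performs the multiplication a careless calloc might do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the byte count a calloc without any range check would
   compute. The product is reduced modulo SIZE_MAX + 1. */
static size_t naive_total(size_t nmemb, size_t size)
{
    return nmemb * size;
}

static void naive_total_demo(void)
{
    /* 2 * (SIZE_MAX / 2 + 1) is exactly SIZE_MAX + 1, which wraps to 0. */
    assert(naive_total(SIZE_MAX / 2 + 1, 2) == 0);

    /* A request for a huge array of 4-byte objects wraps to just 4
       bytes, which an allocator would happily satisfy. */
    assert(naive_total(SIZE_MAX / 4 + 2, 4) == 4);
}
```

In the second case the caller believes it owns an enormous array while only four bytes were reserved, which is why a null pointer is the safer answer.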
 

pete

CBFalconer said:
Jun said:
Harald said:
CBFalconer wrote: [...]

void *calloc(size_t n, size_t s) {
void *result;
size_t sz;

result = NULL;
if (SIZE_MAX / n < s) {

What if n == 0 ?

That should be

if (n == 0 || s == 0 || SIZE_MAX / n > s) {

or something like that.

The version I have just put in nmalloc.c (not yet published) is:

/* calloc included here to ensure that it handles the
same range of sizes (s * n) as does malloc. The
multiplication n*s can wrap, yielding a too small
value, so we must ensure calloc rejects this.
*/
void *ncalloc(size_t n, size_t s)
{
    void *result;
    size_t sz;

    result = NULL;
    if (!n || ((size_t)-1) / n > s) {
        sz = n * s;
        if ((result = nmalloc(sz))) memset(result, 0, sz);
    }
    return result;
} (* ncalloc *)

which makes the output of ncalloc be that of nmalloc(0) whenever
either n or s is 0. I think there is still a possible glitch when
((size_t)-1) / n == s.

Then why don't you make it

if (!n || ((size_t)-1) / n >= s)

instead?
 

Richard Heathfield

Peter Nilsson said:
No, -1 is actually two pp-tokens, a punctuator and a pp-number.
It's surprising what sorts of things constitute a pp-number
(e.g. 8teen), but a leading sign is not part of it. Even when
preprocessing tokens are converted to tokens, -1 remains as
two tokens.

I appear to have been out-pedanted again. :)

Thanks for the correction.
 

Peter Nilsson

CBFalconer said:
... The version I have just put in nmalloc.c (not yet published) is:

/* calloc included here to ensure that it handles the
same range of sizes (s * n) as does malloc. The
multiplication n*s can wrap, yielding a too small
value, so we must ensure calloc rejects this.
*/
void *ncalloc(size_t n, size_t s)
{
    void *result;
    size_t sz;

    result = NULL;
    if (!n || ((size_t)-1) / n > s) {
        sz = n * s;
        if ((result = nmalloc(sz))) memset(result, 0, sz);
    }
    return result;
} (* ncalloc *)

Make sure you fix the Pascal comments. ;-)
which makes the output of ncalloc be that of nmalloc(0) whenever
either n or s is 0. I think there is still a possible glitch when
((size_t)-1) / n == s.

Equality implies n * s plus some non-negative remainder equals
(size_t)-1. In other words, you still have the mathematical relation
n*s <= (size_t)-1.

Of course, you may wish to reserve (size_t)-1 for use as an in-band
error signal (for other functions in your suite), but that's your
choice.
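
Putting the points above together, a wrapper using the >= comparison might look like this. It is a sketch only: checked_calloc is a made-up name, and the standard malloc stands in for nmalloc, whose source is not shown in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper, not the nmalloc.c code: returns zeroed storage
   for n objects of s bytes each, or NULL if n * s cannot be represented
   in a size_t or the underlying malloc fails. */
static void *checked_calloc(size_t n, size_t s)
{
    void *result = NULL;

    /* n == 0 is tested first so the division is always safe. The >=
       admits every pair with n * s <= SIZE_MAX, equality included. */
    if (n == 0 || SIZE_MAX / n >= s) {
        size_t sz = n * s;

        result = malloc(sz);
        if (result != NULL)
            memset(result, 0, sz);
    }
    return result;
}

static void checked_calloc_demo(void)
{
    unsigned char *p = checked_calloc(4, 4);

    if (p != NULL) {            /* the allocation may legitimately fail */
        assert(p[0] == 0 && p[15] == 0);
        free(p);
    }
    /* SIZE_MAX * 2 is not representable, so the request is refused. */
    assert(checked_calloc(SIZE_MAX, 2) == NULL);
}
```

When either argument is 0 the result is whatever malloc(0) returns on the implementation, which matches the behaviour described for ncalloc.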
 

Richard Heathfield

pete said:

What RH said is wrong.
sizeof operates on types.

...and expressions.
sizeof does not need to be able to report how many bytes
of memory are allocated by calloc.

Yes, on reflection I'll take the hit on that. Apologies if I misled anyone.
I think I had malloc firmly on the brain, but calloc on my fingertips, so
to speak.
 

Richard Heathfield

pete said:

If calloc can't return a pointer to
"space for an array of nmemb objects,
each of whose size is size"
then it should return a null pointer,
rather than a pointer to
((size_t)nmemb * (size_t)size) bytes of memory.

Certainly true.
 

Keith Thompson

Richard Heathfield said:
pete said:



...and expressions.


Yes, on reflection I'll take the hit on that. Apologies if I misled anyone.
I think I had malloc firmly on the brain, but calloc on my fingertips, so
to speak.

I think the fact that calloc() can *theoretically* allocate objects
bigger than SIZE_MAX bytes is just accidental. Any declared object
presumably can't be bigger than that, because sizeof needs to work.
malloc() can't allocate anything bigger than SIZE_MAX bytes because
its argument is of type size_t. calloc() can because it takes two
arguments, but (I'm fairly sure) it doesn't take two arguments to
enable it to allocate huge objects.

It would have been perfectly reasonable for the standard to say that
no single object can be bigger than SIZE_MAX bytes; I'd be surprised
if any implementation of calloc() actually *can* allocate more than
SIZE_MAX bytes. (The ones I've seen work by calling malloc().)
 

Dietmar Schindler

Randy said:
No doubt, since I've been searching for several decades for such a
person, and have yet to find one.

You can find many of them six feet under the ground.
 

CBFalconer

Richard said:
Peter Nilsson said:

I appear to have been out-pedanted again. :)

Thus I maintain that the action of strtoul (and scanf) is faulty.
A few years too late to change them. They should refuse the
initial sign. Which is what my readxwd does (read and return error
status).
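
A short sketch of the behaviour being complained about, plus one way a stricter reader could refuse the sign. parse_unsigned is a hypothetical helper written for this illustration; it is not readxwd, whose source is not shown here.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical strict parser: refuses any input that does not begin
   with a decimal digit, so leading signs and whitespace are rejected.
   Otherwise it defers to strtoul and reports range errors. Trailing
   non-digit characters are ignored in this sketch. */
static bool parse_unsigned(const char *text, unsigned long *out)
{
    if (!isdigit((unsigned char)text[0]))
        return false;
    errno = 0;
    *out = strtoul(text, NULL, 10);
    return errno != ERANGE;
}

static void parse_unsigned_demo(void)
{
    unsigned long v = 0;

    /* strtoul accepts the minus sign and negates in the return type,
       so "-1" quietly becomes ULONG_MAX. */
    assert(strtoul("-1", NULL, 10) == ULONG_MAX);

    assert(!parse_unsigned("-1", &v));
    assert(parse_unsigned("42", &v) && v == 42);
}
```

Whether refusing the sign is the right policy is a design choice; the sketch only shows that it takes one extra test in front of strtoul.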
 
