size_t or int for malloc-type functions?


Richard Heathfield

jacob navia said:

Who cares about rings?

Anyone who cares about how unsigned integer arithmetic works in C.
We are speaking about overflow in a well defined
context.

Overflow doesn't occur with unsigned integer types in C. We covered that.
Yes, the C standard defines the behavior
for overflow, and overflow is then defined
for unsigned integers.

No, overflow *doesn't happen* for unsigned integers.
This doesn't mean that it
doesn't happen or that this "ring" stuff is meaningful.

Yes, it does.
Or are you implying that

65521 x 65552 is 65296 and NOT 4295032592

That depends on the implementation. The result of multiplying two unsigned
values together depends on their values and on the width of the type, which
is implementation-defined.
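For concreteness, assuming an implementation where unsigned int is 32 bits
wide, the arithmetic in dispute works out like this:

#include <stdio.h>

int main(void)
{
    /* Assumes unsigned int is 32 bits wide. */
    unsigned int a = 65521u;
    unsigned int b = 65552u;

    /* Unsigned multiplication is reduced modulo UINT_MAX + 1 (here 2^32),
       so the stored result is 65296 rather than the mathematical product. */
    unsigned int reduced = a * b;                          /* 65296 */
    unsigned long long exact = (unsigned long long)a * b;  /* 4295032592 */

    printf("reduced = %u, exact = %llu\n", reduced, exact);
    return 0;
}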

The nonsense of heathfield becomes worse given the context, where
he is saying that we should go on using this construct:

malloc (n * sizeof *p)

to allocate memory instead of using calloc that should test
for overflow!

What you call nonsense is in fact good sense. Using calloc is almost always
the wrong thing, since calloc writes 0 to every byte in the allocated
block, which is hardly ever the right thing to do. (If all-bits-zero meant
NULL for pointers and 0.0 for floating point values, that would be a
different matter, but they don't so it isn't.) Furthermore, if n is an
unsigned type (as it should be, in my opinion), n * sizeof *p can't
overflow so there is nothing for calloc to test.
The bug I am referring to is when you multiply
65521 * 65552 --> 65296

If you can meaningfully allocate 65521*65552 bytes in a single aggregate
object, then sizeof has to be able to report the size of such an object,
which means size_t has to be at least 33 bits, which means the "bug" you
refer to doesn't occur. If size_t is no more than 32 bits, it doesn't make
sense to try to allocate an aggregate object > 2^32-1 bytes in size.
Since malloc doesn't see anything wrong it will succeed,
giving you a piece of RAM that is NOT as long as you
would think it is, but several orders of magnitude SMALLER.

As a matter of fact, it will give you at least the number of bytes you
requested (if it gives you any bytes at all). It is not malloc's fault if
you have misinterpreted unsigned integer arithmetic.
Even when heathfield says something patently insane he is
"the guru" and people here will justify his ramblings.

If what I say is insane, it should be easy to disprove. But you've never
managed it yet.
I am well aware of what the standard says for overflow.

Then you will understand that to use a signed type instead of an unsigned
type as an argument to malloc is to introduce the potential for undefined
behaviour without solving the problem you were setting out to solve, and is
therefore a bad idea.
 

jacob navia

Richard Heathfield wrote:
If you can meaningfully allocate 65521*65552 bytes in a single aggregate
object, then sizeof has to be able to report the size of such an object,
which means size_t has to be at least 33 bits, which means the "bug" you
refer to doesn't occur. If size_t is no more than 32 bits, it doesn't make
sense to try to allocate an aggregate object > 2^32-1 bytes in size.

C'MON HEATHFIELD
can you STOP IT???????

BUGS NEVER MAKE SENSE!!!

THAT'S WHY THEY ARE BUGS!!!!

Your "reasoning" is

Patient: Doctor doctor, each time I move my leg I have a horrible
pain!!!

Doctor Heathfield: Well, do not move it then!!!

The whole point is precisely that an overflow bug can occur
in your "idiom" and to avoid it there are two solutions:

1) Call calloc(n, size);
2) If you do not want to use calloc because of the wasted
microseconds spent zeroing the memory, you write your own.

For example:
With 32-bit ints and 64-bit long longs (a common configuration)
you write:

void *mycalloc(size_t n, size_t s)
{
    /* Assumes size_t is 32 bits and long long is 64 bits, so the
       exact product always fits in a long long. */
    long long m = (long long)s * (long long)n;

    if (m >> 32)
        return 0;
    return malloc((size_t)m);
}

The test (m>>32) just tests the upper 32 bits. Obviously
more sophisticated tests are possible.
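As a usage sketch (assuming the mycalloc above; the names are just
placeholders), the idiom then becomes:

#include <stdlib.h>

void *mycalloc(size_t n, size_t s);   /* as defined above */

void example(size_t n)
{
    int *p = mycalloc(n, sizeof *p);

    if (p == NULL) {
        /* the request was too large, or the allocation failed */
        return;
    }
    /* ... use p[0] through p[n-1] ... */
    free(p);
}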
 

John Bode

jacob navia wrote:

[snip]
His majesty is always right, no matter how much nonsense
he says.


Of course not. It is an unsigned number. The whole point is that
when you have

void fn(unsigned arg);

and you write:

fn(-3);

an implicit conversion from signed to unsigned is done,
which in all implementations I know of generates no
machine code, just a change in the way the bits
of the argument are interpreted.

Obviously this is a simplification of the real situation
where the values are not explicitly given as in the
example above.



This is nonsense. Why do I obtain

65521 * 65552 --> 65296 ????

You (and heathfield) are just playing with words. That
overflow is defined for unsigned numbers within C doesn't
mean that the result is mathematically VALID, or that it is
the expected result.

Just because the result isn't what's expected doesn't mean the
operation is mathematically invalid.
The whole point of my argument is that the "idiom"

result = malloc(n * sizeof *p);

is a dangerous construct.

This is like arguing that wearing a seatbelt is always dangerous
because you could potentially become trapped in a burning car after a
wreck. Weakness in one corner case *does not* translate to "dangerous"
in general. The benefit of using the idiom far outweighs the potential
risk.

There are two ways to deal with this problem. One method would be to
perform a basic sanity check against an unsigned wraparound *before*
calling malloc(); it shouldn't be hard to create a wrapper function
that takes an element size and count and returns NULL (and potentially
sets errno) if the requested block is too large:

int *x = sane_malloc(sizeof *x, count);
if (!x)
{
    if (errno == EMEMRQST) // or code of your choice
    {
        // memory request exceeds size_t
    }
}
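A minimal sketch of such a wrapper (sane_malloc and EMEMRQST are the
hypothetical names used above; ERANGE is just one possible stand-in for the
error code) might look like:

#include <errno.h>
#include <stdlib.h>

#define EMEMRQST ERANGE   /* hypothetical code; use whatever suits you */

void *sane_malloc(size_t size, size_t count)
{
    /* Refuse the request if count * size would wrap around size_t. */
    if (size != 0 && count > (size_t)-1 / size)
    {
        errno = EMEMRQST;
        return NULL;
    }
    return malloc(count * size);
}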

Another method is to hack malloc() to take a signed argument and pray
that every system it's implemented on treats signed integer overflow in
the same useful manner.

I know which method I'd prefer.
Use

result = calloc(n, sizeof(*p))

and ensure your implementation of calloc doesn't have the bug!



It would catch the overflow in some cases above, as I explained in
my post.

So, basically, you're willing to trade the illusion of safety in one
corner case for cutting your usable address space in half.
Your opinion may be different, but it would be helpful if you
tried to advance an argument, just saying
"makes no sense"

makes no sense to anyone but you.

It makes no sense to me, either. You think you're protecting a
programmer from shooting himself in the foot, but in reality you're
just giving him extra bullets. Now he can call malloc() with an
invalid size request *by design*.
 

Ben Bacarisse

CBFalconer said:
What this has brought home to me is that calloc should be included
in the nmalloc package, so that the same maximum size criterion
will be applied. I.E:

void *ncalloc(size_t nmemb, size_t size) {
    size_t sz;
    void *p;

    sz = nmemb * size;
    if ((sz < nmemb) || (sz < size)) return NULL;
    if (p = nmalloc(sz)) memset(p, 0, sz);
    return p;
}
I am also having qualms about the overflow test.

I think it is wrong about 1/3 of the time. I.e. there are many cases
where the multiplication wraps round but where the reduced result (sz)
is not less than one or other of the multipliers.

P J Plauger recently posted a division-based test. I don't know of a
simple test for a multiplication that will overflow that does not use
division. The fact that no one has posted one so far suggests that
there isn't one but I look forward to being wrong about that :)
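For reference, the usual division-based form of the test looks something
like this (a sketch, not necessarily the exact code Plauger posted):

#include <stddef.h>

/* Returns nonzero if nmemb * size would wrap around size_t. */
static int mult_would_wrap(size_t nmemb, size_t size)
{
    return size != 0 && nmemb > (size_t)-1 / size;
}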
 

Richard Heathfield

jacob navia said:
Richard Heathfield wrote:

C'MON HEATHFIELD
can you STOP IT???????

Stop what? Being right?
The whole point is precisely that an overflow bug can occur
in your "idiom"

No, it can't, as I have explained several times. Until you have done the
necessary research that enables you to understand this, there is little
point in my attempting to address your other misconceptions.
 

Guest

CBFalconer said:
I think the test detects a zero field already.

Not consistently. If both nmemb and size are zero, then sz < nmemb is
false, and sz < size is also false, so ncalloc returns nmalloc(0). If
only one of nmemb or size is zero, then ncalloc returns NULL.
 

av

Nevertheless this behavior of unsigned C arithmetic is
surely NEVER CORRECT and I have never seen a useful
program that relies on this behavior for something useful.

the "behavior of unsigned C arithmetic" mod 2^n is ok.
but for normal calculation could be well have some flag in the data
position for see if there are errors in overflow for that data
 

kuyper

jacob said:
(e-mail address removed) wrote: ....

Who cares about rings?

Rings are the mathematical construct that correspond to the way in
which the C standard defines arithmetic for unsigned types. If you
don't care about C unsigned arithmetic, then you don't need to worry
about rings. You don't have to actually know the term; so long as you
understand what modulo means you've got enough of the concept for
practical purposes. But when you imply that modular arithmetic involves
adjusting the mathematical result, rather than actually BEING the
mathematical result, it implies that there's a deficiency in your
mathematical knowledge that you might want to correct (or not - your
choice). Granted, the mathematical terminology isn't of great practical
importance outside of mathematics. However, when you see someone using
it, I'd recommend delaying any criticism of that terminology until
you've mastered it yourself.
We are speaking about overflow in a well defined
context. Yes, the C standard defines the behavior
for overflow, and overflow is then defined

No, it defines the behavior in such a way that overflow never happens,
and says so explicitly.
for unsigned integers. This doesn't mean that it
doesn't happen or that this "ring" stuff is meaningful.

Or are you implying that

65521 x 65552 is 65296 and NOT 4295032592

Correct: that is precisely the case when, as in this case,
multiplication is defined modulo 2^32.
With that 'x' I mean multiplication as is
understood by all people in this planet except
C buffs that know more about C standards than about
software engineering and mathematics???

Don't forget mathematicians, who also have a broader understanding of
the concept of multiplication than you do.
 

Randy Howard

If this is a concern, you can write a wrapper around malloc() that
takes two arguments (as calloc() does) and checks whether the result
wraps around.

Yeah, I said that yesterday. It didn't take hold though.

Common sense doesn't count when you are coding to protect those without
common sense. That's the best way I can think of to describe what
Navia has been up to lately.
 

christian.bau

P.J. Plauger said:
The problem with int is that it throws away half the address space.
This is a *big* issue with 16-bit address spaces, but of course
such machines are largely relegated to embedded systems, and small
ones at that. Nevertheless, I regularly need to deal with objects
bigger than half the address space even today, perhaps because I
write so much systems code. So I find the choice of slicing the
address space in half arbitrary and potentially dangerous.

There are machines where int is 32 bits and size_t is 64 bits. I am
sure I have used machines with 16-bit ints that could allocate objects
of a few hundred KB successfully in old DOS times.

(Question: What is the largest amount of data anyone here has
successfully allocated using malloc () and used? Just curious).
 

christian.bau

Kenny said:
Seems to work just fine for MS. But then again, they're not "most
implementors", I suppose. They're insignificant.

In practice, many, perhaps most implementations cannot successfully
allocate anywhere near (size_t) -1 bytes.

First, if for example size_t = 32 bits, then it is likely that the OS
cannot allocate objects greater than 2^32 bytes, and the implementation
of malloc couldn't handle OS objects greater than that. malloc usually
has a few bytes of overhead, so malloc ((size_t) -1) won't succeed
anyway.

If size_t is 64 bits, there is no way malloc ((size_t) -1) could
succeed. And most 32-bit systems have a limit at 3GB or 3.5GB. Something
like malloc ((size_t) -100) could only succeed on a system where size_t
is much smaller than the actual address space, for example 32 bits with
64 bit pointers.
 

Randy Howard

There are machines where int is 32 bits and size_t is 64 bits. I am
sure I have used machines with 16-bit ints that could allocate objects
of a few hundred KB successfully in old DOS times.

(Question: What is the largest amount of data anyone here has
successfully allocated using malloc () and used? Just curious).

I don't recall keeping score, or trying to set a record, but I've
written code for memory testing of systems with 64GB of RAM (AMD64 and
EM64t systems) and have malloc'd (successfully) over 8GB in a single
chunk, I'm fairly sure. I don't have one in front of me with that much
memory right now, or I'd do some experiments.
 

Randy Howard

In practice, many, perhaps most implementations cannot successfully
allocate anywhere near (size_t) -1 bytes.

That may very well be true. However, it doesn't really matter, as
malloc() will return the appropriate response for those cases. :)
And most 32 bit systems have limit at 3GB or 3.5GB.

Actually it's usually 2GB, due to splitting of address space between
the kernel and user space. Some go as high as 3GB with special boot
options. However, there are hacks (outside of malloc) that allow for
what Intel calls "Physical Address Extension" (PAE) to allow systems
with Intel 32-bit processors to see memory above 4GB, sort of like the
old extended/expanded memory hacks in the DOS days. Again,
proprietary, and different APIs to use the memory from what standard C
supports.
 

christian.bau

CBFalconer said:
What this has brought home to me is that calloc should be included
in the nmalloc package, so that the same maximum size criterion
will be applied. I.E:

void *ncalloc(size_t nmemb, size_t size) {
    size_t sz;
    void *p;

    sz = nmemb * size;
    if ((sz < nmemb) || (sz < size)) return NULL;
    if (p = nmalloc(sz)) memset(p, 0, sz);
    return p;
}

Just wanted to say: This will not catch many problems. For example on a
32 bit system, nmemb = 0x10001, size = 0x10001, sz = 0x20001.
 

jacob navia

christian.bau wrote:
Just wanted to say: This will not catch many problems. For example on a
32 bit system, nmemb = 0x10001, size = 0x10001, sz = 0x20001.

lcc-win32 uses this:

void *calloc(size_t n, size_t s)
{
    /* Relies on long long being wide enough to hold the exact
       product of two size_t values (here size_t is 32 bits and
       long long is 64 bits). */
    long long siz = (long long)n * (long long)s;
    void *result;

    if (siz >> 32)
        return 0;
    result = malloc((size_t)siz);
    if (result)
        memset(result, 0, (size_t)siz);
    return result;
}

sizeof(long long) = 8 (8*8 = 64 bits)
sizeof(size_t)    = 4 (4*8 = 32 bits)
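For comparison, a width-independent sketch of the same check (checked_calloc
is just an illustrative name), using a division instead of a wider type:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

void *checked_calloc(size_t n, size_t s)
{
    void *result;

    /* Rejects any product that would not fit in size_t, without
       assuming particular widths for size_t or long long. */
    if (s != 0 && n > SIZE_MAX / s)
        return NULL;
    result = malloc(n * s);
    if (result)
        memset(result, 0, n * s);
    return result;
}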
 

Old Wolf

Mark said:
If the argument is unsigned, you can't pass a -ve value to it.

"argument" means the value passed in. In this code:

#include <stdlib.h>
int foo() { malloc(-1); }

the argument to malloc is the value -1 (of type int).

The type "size_t" w.r.t. malloc is known as the "formal parameter type"
(often the word 'formal' is dropped for convenience's sake).
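Since the parameter type is size_t, that int argument of -1 is converted to
size_t before malloc ever sees it; a small illustration of the conversion:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    size_t n = -1;   /* -1 converted to size_t yields SIZE_MAX */

    printf("%d\n", n == SIZE_MAX);   /* prints 1 */
    return 0;
}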
 

Old Wolf

Richard said:
No, you don't get overflow with unsigned types.

Actually you do. However, the behaviour on overflow is well-defined.

I suppose you could quibble about the exact meaning of the word
'overflow', but it is clear what Navia's meaning is and it serves no
purpose to pretend he is saying something other than what he is.
 

Old Wolf

Richard said:
On any given implementation, either size_t is big enough to store 65521 *
65552 or it isn't. If it is, there is no issue. And if it is not, your
request is meaningless, since you're asking for an object bigger than the
system can provide.

Firstly, systems might exist where you can allocate more memory
than SIZE_MAX bytes.

That aside, wouldn't it be more sensible behaviour for malloc to
return NULL or take some other action when you try to request an
object bigger than the system can provide, rather than returning
a smaller object than requested? I think this is Navia's point.
 

Ben Pfaff

Old Wolf said:
Actually you do. However, the behaviour on overflow is well-defined.

C99 6.2.5p9 makes it pretty clear that the Standard takes Richard's
point of view:

A computation involving unsigned operands can never
overflow, [...]
 
