size_t or int for malloc-type functions?

jacob navia · Jan 3, 2007

(e-mail address removed) wrote:

Plenty of CRC and checksum generating code finds this behavior useful.
Can't believe you've never seen them since they're all over the place.

When doing multiplications?

This is a crc code for instance:
/* ========================================================================= */
#define DO1(buf) crc = crc_table[((int)crc ^ (*buf++)) & 0xff] ^ (crc >> 8);
#define DO2(buf) DO1(buf); DO1(buf);
#define DO4(buf) DO2(buf); DO2(buf);
#define DO8(buf) DO4(buf); DO4(buf);

/* ========================================================================= */
uLong ZEXPORT crc32(crc, buf, len)
    uLong crc;
    const Bytef *buf;
    uInt len;
{
    if (buf == Z_NULL) return 0L;
#ifdef DYNAMIC_CRC_TABLE
    if (crc_table_empty)
        make_crc_table();
#endif
    crc = crc ^ 0xffffffffL;
    while (len >= 8)
    {
        DO8(buf);
        len -= 8;
    }
    if (len) do {
        DO1(buf);
    } while (--len);
    return crc ^ 0xffffffffL;
}

No multiplications as far as I see.

Keith Thompson · Jan 3, 2007

Ben Pfaff said:
Keith Thompson said:

jacob navia said:

void *calloc(size_t n,size_t s)
{
    long long siz = (long long)n * (long long)s;
    void *result;
    if (siz>>32)
        return 0;
    result = malloc((size_t)siz);
    if (result)
        memset(result,0,(size_t)siz);
    return result;
}


[...]

I wonder if "siz&~0xFFFFFFFF" might be marginally more efficient than
"siz>>32". [...]


I'd recommend "siz > SIZE_MAX" as being both clear and portable.

Excellent! I wish I'd thought of that myself. It's much clearer to
the reader, and likely to be clearer to the compiler (i.e., there's a
greater potential for optimization if there happens to be an efficient
code sequence that does the same thing).

Keith Thompson · Jan 3, 2007

jacob navia said:
Yes, I know that, and I agree that the semantics of unsigned is
wrap around. What I am saying is that when "the result cannot be
represented" and this wrap around semantics reduces the result,
this reduced result is mathematically WRONG in the sense of the USUAL
multiplication operation.

PHEW!!!!

No, it is not "WRONG", it is *different*. Multiplication of unsigned
values does not follow the mathematical semantics of multiplication
over the infinite set of integers. It's not supposed to. If you're
going to use unsigned types in C, you just have to be aware of this.

Multiplication and other operations over a finite set (e.g., modulo N)
are well-defined and commonly used in mathematics, though probably
less common than operations on the infinite set of integers. This
finite arithmetic corresponds closely to the behavior of unsigned
integers in C.
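
A small sketch of what that finite arithmetic looks like in practice. The function names here are made up for illustration; the only assumption is a conforming C implementation, where unsigned int results are reduced modulo UINT_MAX + 1.

```c
#include <limits.h>

/* Each result is reduced modulo UINT_MAX + 1, so all three functions
   are well defined for every possible argument. */

unsigned int successor(unsigned int x)
{
    return x + 1u;      /* UINT_MAX + 1 wraps to 0 */
}

unsigned int negate(unsigned int x)
{
    return 0u - x;      /* 0 - 1 wraps to UINT_MAX */
}

unsigned int square(unsigned int x)
{
    return x * x;       /* (2^N - 1)^2 is congruent to 1 modulo 2^N */
}
```

So square(UINT_MAX) yields 1: a perfectly valid result in arithmetic modulo 2^N, and a useless one if the intent was to compute a byte count.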

Specifically when I use the malloc (p * sizeof *p) "idiom"
even if the semantics are well defined this is NOT what I
intended with that multiplication!!!!

As I've said elsewhere in this thread, I actually agree with this
(except that you probably didn't want to use "p" twice). You have to
be careful to avoid operands that will result in wraparound. The real
bug is that you're asking for way too much memory; the problem is
aggravated by the fact that, due to the wraparound semantics of
unsigned types, the bug can be difficult to detect.

I emphatically do not agree that this means that the
p = malloc(COUNT * sizeof *p);
idiom is too dangerous to use. But providing a wrapper (similar to
calloc() but without the initialization to zero) isn't a bad idea.
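
One possible shape for such a wrapper, as a sketch only. The name malloc_array is hypothetical, and SIZE_MAX from <stdint.h> assumes C99 or later.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate room for count objects of size bytes
   each, returning NULL instead of silently requesting the wrapped-around
   product when count * size cannot be represented in size_t. */
void *malloc_array(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;            /* the product would wrap around */
    return malloc(count * size);
}
```

A call would then read p = malloc_array(COUNT, sizeof *p); and the usual NULL check covers both "not enough memory" and "request too large to express".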

There is no point in throwing me standards texts because I am not
questioning them. I am just saying that this "results that cannot be
represented" lead to a wrong result being passed to malloc!!!

I just can't understand why it is impossible to agree in such
an evident stuff !!!

It's largely because you insist on using the word "overflow" in direct
contradiction to the way the standard uses that same word. If you'll
use terminology consistent with the standard, it just might turn out
that we're largely in agreement (about the problem if not about the
solution).

Richard Heathfield · Jan 3, 2007

Old Wolf said:

That aside, wouldn't it be more sensible behaviour for malloc to
return NULL or take some other action when you try to request an
object bigger than the system can provide, rather than returning
a smaller object than requested?

But malloc does *not* return (a pointer to space for) a smaller object than
requested.

Richard Heathfield · Jan 3, 2007

jacob navia said:

Specifically when I use the malloc (p * sizeof *p) "idiom"
even if the semantics are well defined this is NOT what I
intended with that multiplication!!!!

Then write what you did intend.

There is no point in throwing me standards texts because I am not
questioning them. I am just saying that this "results that cannot be
represented" lead to a wrong result being passed to malloc!!!

If your program passes the wrong result to malloc, that is a bug in your
program, not a bug in malloc or in unsigned arithmetic.

I just can't understand why it is impossible to agree in such
an evident stuff !!!

What is obvious to one person is far from obvious to another, which is why
it is necessary to have independent standards - and these can often seem
arbitrary in the choices they make, precisely because one man's obvious is
another man's obscure.

The C language is defined not by random opinions on clc but by an
international standard, which frequently makes arbitrary choices (and
frequently does *not*, much to some people's frustration). If you want to
discuss the language rationally with other people, you'll need to get used
to that.

Richard Heathfield · Jan 3, 2007

Old Wolf said:

Actually you do.

No, actually you don't.

Peter Nilsson · Jan 3, 2007

Richard said:
... The width of size_t is implementation-defined. ...

SIZE_MAX is implementation-defined in C99, but AFAIK, C90 doesn't
require any facet of size_t to be implementation-defined.

Richard Heathfield · Jan 3, 2007

Mark McIntyre said:

No, that's just how you typed it.

No, the argument to malloc is the expression -1, which is clearly of type
int, and is equally clearly negative!

As far as malloc is concerned, you passed UINT_MAX.

No, malloc has no clue what you passed, and UINT_MAX has nothing to do with
it except, perhaps, coincidentally. As far as malloc is concerned, it
*receives* a parameter with the value (size_t)-1, which is a (very)
positive value.
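
To make the conversion visible, here is a sketch with a stand-in function (received_size is a made-up name, not part of any library) that has the same parameter type as malloc and simply hands back what it was given. It assumes C99 for SIZE_MAX.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for malloc's parameter list: with this prototype
   in scope, an int argument is converted to size_t before the call, and
   the function only ever sees the converted value. */
size_t received_size(size_t n)
{
    return n;
}
```

With that in scope, received_size(-1) compares equal to SIZE_MAX on any conforming implementation. It matches UINT_MAX only where size_t happens to have the same range as unsigned int.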

Richard Heathfield · Jan 3, 2007

Peter Nilsson said:

SIZE_MAX is implementation-defined in C99, but AFAIK, C90 doesn't
require any facet of size_t to be implementation-defined.

Sorry, I should have said "implementation-dependent".

That is, clearly it has to be some width or other, and yet the Standard does
not say which width (except that we know it must be at least 15 bits wide
in C90, 16 in C99). So it depends on the implementation. Indeed, it is
defined by the implementation! Nevertheless, strictly speaking it is not
"implementation-defined" within the meaning of the Act.

Richard Heathfield · Jan 3, 2007

CBFalconer said:

IIRC a ring defines a set of objects that are members of the ring,
and a set of operations on those objects, such that <m1 operation
m2> yields a member of the ring. unsigned objects and the
operations +, -, and * meet this definition. The operation / does
not. Exponentiation does.

Well, a ring is a set with two defined binary operations, + and *, usually
taken to mean addition and multiplication, that meet a set of primitive
axiomatic conditions which I won't go into here. The existence of
additional operations (such as division) doesn't stop it being a ring even
if they don't meet the condition that m1 op m2 yields a member of the ring.

CBFalconer · Jan 3, 2007

Old said:
"argument" means the value passed in. In this code:

#include <stdlib.h>
int foo() { malloc(-1); }

the argument to malloc is the value -1 (of type int).

The type "size_t" w.r.t. malloc is known as the "formal parameter
type" (often the word 'formal' is dropped for convenience's sake).

No it isn't. The compiler knows that malloc wants a size_t, and so
it automatically converts that value, resulting in SIZE_T_MAX (or
whatever it is called).

Keith Thompson · Jan 3, 2007

Richard Heathfield said:
Old Wolf said:

But malloc does *not* return (a pointer to space for) a smaller object than
requested.

No, but it can return a pointer to a smaller object than you *tried*
to request. (But that's hardly malloc's fault.)

CBFalconer · Jan 3, 2007

Ben said:
Old Wolf said:

Actually you do. However, the behaviour on overflow is well-defined.


C99 6.2.5p9 makes it pretty clear that the Standard takes Richard's
point of view:

A computation involving unsigned operands can never
overflow, [...]

I have code that inputs from a text stream, forbids an initial '-'
(or '+') sign (which must be handled by the calling code, if
present) and follows the C conventions for unsigned values, yet
returns an error code signifying 'overflow' or EOF or invalid
stream (no digits). This can reject such values at input. The
decision is up to the caller.
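
The code described above is not shown in the thread, so the following is only a guess at its general shape. Every name in it (read_unsigned and the status codes) is hypothetical, and it assumes no leading whitespace is skipped.

```c
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical status codes for the sketch below. */
enum read_status { READ_OK, READ_OVERFLOW, READ_NO_DIGITS, READ_EOF };

/* Read a run of decimal digits from in. A leading sign is not accepted;
   it is left in the stream and reported as READ_NO_DIGITS. *out is
   written only when the value fits in unsigned int. */
enum read_status read_unsigned(FILE *in, unsigned int *out)
{
    unsigned int value = 0;
    enum read_status status = READ_OK;
    int ndigits = 0;
    int ch;

    while ((ch = getc(in)) != EOF && isdigit(ch)) {
        unsigned int digit = (unsigned int)(ch - '0');

        /* value * 10 + digit fits iff value <= (UINT_MAX - digit) / 10 */
        if (value > (UINT_MAX - digit) / 10u)
            status = READ_OVERFLOW;     /* keep consuming the digits */
        else
            value = value * 10u + digit;
        ndigits++;
    }
    if (ch != EOF)
        ungetc(ch, in);                 /* leave the terminator for the caller */
    if (ndigits == 0)
        return ch == EOF ? READ_EOF : READ_NO_DIGITS;
    if (status == READ_OK)
        *out = value;
    return status;
}
```

The point is that the check happens before the multiplication, so the wrapped value is never computed and the caller decides what an out-of-range input means.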

CBFalconer · Jan 3, 2007

jacob said:
.... snip ...

I just can't understand why it is impossible to agree in such
an evident stuff !!!

In part it is your fault for not clearly defining meanings,
ignoring the standard, and heaving invective at those who correct
you. In some respects I think this has produced a new competitive
game called 'how many Navia errors can I spot'.

CBFalconer · Jan 3, 2007

christian.bau said:
Just wanted to say: This will not catch many problems. For example
on a 32 bit system, nmemb = 0x10001, size = 0x10001, sz = 0x20001.

Yes. Corrected else thread.
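
The arithmetic behind that example, sketched with fixed-width types so it behaves the same on any host. The helper names are made up; uint32_t stands in for a 32-bit size_t.

```c
#include <stdint.h>

/* The product as a 32-bit size_t would see it: the exact result
   reduced modulo 2^32. */
uint32_t product32(uint32_t nmemb, uint32_t size)
{
    return (uint32_t)((uint64_t)nmemb * size);
}

/* Nonzero when the exact product is representable in 32 bits. */
int fits32(uint32_t nmemb, uint32_t size)
{
    return (uint64_t)nmemb * size <= UINT32_MAX;
}
```

Here 0x10001 * 0x10001 is exactly 0x100020001, which needs 33 bits; the low 32 bits are 0x20001. That wrapped value is larger than either operand, so a test that only compares the product against nmemb or size would not notice anything wrong.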

CBFalconer · Jan 3, 2007

jacob said:
christian.bau wrote:
.... snip ...

lcc-win32 uses this:

void *calloc(size_t n,size_t s)
{
    long long siz = (long long)n * (long long)s;
    void *result;
    if (siz>>32)
        return 0;
    result = malloc((size_t)siz);
    if (result)
        memset(result,0,(size_t)siz);
    return result;
}

Bad. long longs can overflow, leading to undefined behaviour. No
guarantee you ever get to testing the product. Casts are always
suspicious.

CBFalconer · Jan 3, 2007

Ben said:
Keith Thompson said:

jacob navia said:

void *calloc(size_t n,size_t s)
{
    long long siz = (long long)n * (long long)s;
    void *result;
    if (siz>>32)
        return 0;
    result = malloc((size_t)siz);
    if (result)
        memset(result,0,(size_t)siz);
    return result;
}


[...]

I wonder if "siz&~0xFFFFFFFF" might be marginally more efficient than
"siz>>32". [...]


I'd recommend "siz > SIZE_MAX" as being both clear and portable.

I don't think so. Ignoring the casts, maybe

if (!(SIZE_MAX - siz)) thingsarebad();
else carryon();

I seriously doubt that most machines can generate a value larger
than SIZE_MAX.

CBFalconer · Jan 3, 2007

Mark said:
Which limits it to a fairly small subset of systems, by the way...

Which is alright, since this is a system routine and not expected
to be portable.

CBFalconer · Jan 3, 2007

Nelu said:
.... snip ...

Why not:

void *calloc(size_t n, size_t s) {
    void *result;
    size_t sz;
    if(SIZE_MAX/n<s) {
        sz=n*s;
        result=malloc(n*s);
        if(result) {
            memset(result,0,sz);
            return result;
        }
    }
    return NULL;
}

I think that works everywhere. I would rework it slightly to:

void *calloc(size_t n, size_t s) {
    void *result;
    size_t sz;

    result = NULL;
    if (SIZE_MAX / n < s) {
        sz = n * s;
        result = malloc(sz);
        if (result) memset(result, 0, sz);
    }
    return result;
}

largely to install some blanks for readability.
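
A note on the test itself: as posted, SIZE_MAX / n < s is true exactly when n * s does not fit, so the condition reads as inverted, and it divides by zero when n is 0. The following sketch assumes the intended check is the opposite one. It uses a different name (checked_calloc, made up here) so as not to redefine the library's calloc.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical calloc-like function: returns zeroed storage for n objects
   of s bytes each, or NULL when n * s is not representable in size_t or
   when malloc fails. */
void *checked_calloc(size_t n, size_t s)
{
    void *result = NULL;

    /* n == 0 is tested first so the division is never by zero;
       s <= SIZE_MAX / n holds exactly when n * s fits in size_t. */
    if (n == 0 || s <= SIZE_MAX / n) {
        size_t sz = n * s;

        result = malloc(sz);
        if (result) memset(result, 0, sz);
    }
    return result;
}
```

For n == 0 the request becomes malloc(0), whose result (NULL or a unique pointer) is left to the implementation, as with calloc itself.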

CBFalconer · Jan 3, 2007

Randy said:
.... snip ...

Actually it's usually 2GB, due to splitting of address space between
the kernel and user space. Some go as high as 3GB with special boot
options. However, there are hacks (outside of malloc) that allow for
what Intel calls "Physical Address Extension" (PAE) to allow systems
with Intel 32-bit processors to see memory above 4GB, sort of like the
old extended/expanded memory hacks in the DOS days. Again,
proprietary, and different APIs to use the memory from what standard C
supports.

Less than that. For a von Neumann machine the code itself has to go
somewhere, not to mention minor details such as the file system,
i/o buffers, etc.
