gcc knows about malloc()

Jordan Abel

SuperKoko said:
No. Multiplying by zero is correct in math (so my analogy is not
perfect).
Here, in math, there is a single "undefined thing": division by zero.
But the idea is that you can't compensate for UB in any way...
Trying to compensate with well-defined behavior (the multiplication by
zero) doesn't help, and trying to compensate with undefined behavior is
no better.



Jordan Abel:
No, the standard doesn't say that.
It says that you can convert (with an explicit C cast) a pointer to
function to a pointer to function of an incompatible type, and cast it
back to a compatible type... Otherwise, taking a pointer converted (with
an explicit C cast, not by some buggy reinterpretation of the bytes of
the pointer's representation) from an original int(*)() pointer to a
void*(*)() pointer and calling a function through this new pointer
is UB.
Strictly speaking, your program contains a single pointer conversion:
from int(*)() to void*(*)().

The original pointer is int(*)()... Nowhere in your code does an
explicit conversion from void*(*)() to int(*)() appear... You just have a
buggy means of getting an int(*)() pointer from nowhere sensible (from a
symbol which should refer to an int() function but which you interpret
as a void*() function).

But that's not what your cite says. The standard may say so elsewhere,
but it doesn't say it in what you quoted.
 
Keith Thompson

Eric Sosman said:
It depends on what you think "it" is. If "it" is the
original code fragment


... then there's no "implicit int" in play at the point of
the call. The call uses a function pointer whose type almost
matches that of malloc() -- the only discrepancy being the
omission of the prototype -- so all is well unless, as I
mentioned, size_t is promotable. (I overlooked something:
there's also trouble if, for example, TWELVE_MILLION is an
int or double or some such, because there's no prototype to
force its conversion to size_t.)

In the following, assume that there's no visible prototype for
malloc().

Suppose different pointer-to-function types are represented
differently. The standard guarantees that any pointer-to-function
type can be converted to any other pointer-to-function type, but the
compiler has to know the type of the original pointer in order to
generate correct code for the conversion.

In the above code fragment, the compiler doesn't necessarily know that
the malloc() function returns void*. (It's allowed to know this,
since it's in the standard library, but it's not required to.) The
compiler sees the identifier "malloc", and it knows that it's the name
of a function, but it doesn't know anything else about it; in
particular, it doesn't know the function's return type. One
reasonable thing for the compiler to do is to assume that the function
returns int (I'm not sure whether this is required, but it's certainly
allowed). So it generates code to convert something of type
"int (*)()" to type "void *(*)()", and applies this conversion to the
value of malloc. But malloc is really of type "void *(*)()" already
(we've just hidden this fact from the compiler), so the conversion
yields garbage, and the attempt to do a function call using this
garbage pointer-to-function value invokes undefined behavior.

This is the same problem we see with the more common
int *ptr = (int*)malloc(sizeof(int));
except that we're lying to the compiler about the type of the malloc
function rather than lying to the compiler about the type of the
result returned by the malloc function. In either case, we're
covering up the lie by doing a conversion, but we're not telling the
compiler what to convert from.

Ignoring the problem with the TWELVE_MILLION argument, this is all
likely to "work" either if all pointer-to-function types have the same
representation (which is probably true for all actual implementations)
*or* if the compiler uses its knowledge of the actual type of
malloc(). That just means that the undefined behavior will rarely, if
ever, cause any visible problems.
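
For contrast, here is a minimal sketch of the well-defined counterpart, assuming <stdlib.h> is included so the compiler sees malloc's real type. TWELVE_MILLION is given an assumed definition purely for illustration (the thread never shows the real one), and alloc_via_cast is a hypothetical helper name.

```c
#include <stdlib.h>   /* declares void *malloc(size_t) */

/* Assumed definition; the original fragment does not show one. */
#define TWELVE_MILLION 12000000

/* Hypothetical helper: call malloc through an explicitly cast pointer. */
void *alloc_via_cast(void)
{
    /* malloc already has type void *(*)(size_t) here, so the cast
       converts a pointer to its own type and changes nothing. */
    void *(*f)(size_t) = (void *(*)(size_t))malloc;

    /* The prototype in f's type converts the integer argument to size_t. */
    return f(TWELVE_MILLION);
}
```

With the header in place the compiler is not being lied to at any point, so neither the cast nor the call depends on how pointer-to-function types happen to be represented.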

You can't meaningfully use malloc unless the compiler knows its type,
and the only way to tell the compiler what malloc's type is is to
Let's try a variation, shall we? In one file we'll have

#include <stdlib.h>
typedef void * (*fptr)(size_t); /* for readability */
fptr get_malloc_ptr(void) {
return malloc;
}

This function just returns a pointer to malloc(), which in turn
is properly declared by the header. Now in a separately-compiled
file we'll have

#include <stddef.h> /* for size_t */
void *my_malloc(size_t bytes) {
void * (*f)(size_t) = get_malloc_ptr();
return f(bytes);
}

This function acquires a pointer to malloc(), calls via the pointer,
and returns the result. Note carefully that no malloc() prototype
is visible at the point of the call -- there isn't a declaration
of any kind visible, with or without a prototype. Yet there is no
error here, no undefined behavior, no contravention of anything
except good sense.

If the second code fragment has a visible prototype for
get_malloc_ptr(), this should work correctly without invoking
undefined behavior. The compiler knows the type of the result of
get_malloc_ptr() because you've declared it properly. You haven't
lied to the compiler.

(If there is no visible prototype for get_malloc_ptr(), the compiler
will assume it returns int; the attempt to initialize f, which is of
type void *(*)(size_t), with an expression of type int is a constraint
violation.)
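
A single-file sketch of the same arrangement with the declaration in place may help. In a real project the typedef and the get_malloc_ptr prototype would presumably sit in a shared header included by both files; that header is an assumption here, not something the posted code shows.

```c
#include <stdlib.h>

typedef void *(*fptr)(size_t);   /* for readability */

/* Would live in the first file, where <stdlib.h> declares malloc. */
fptr get_malloc_ptr(void)
{
    return malloc;
}

/* Would live in the second file. It needs only the typedef and the
   prototype of get_malloc_ptr, not a declaration of malloc itself. */
void *my_malloc(size_t bytes)
{
    fptr f = get_malloc_ptr();
    return f(bytes);
}
```

The pointer that reaches my_malloc was produced where malloc's type was known, so no conversion from a guessed type ever takes place.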
The original code does pretty much the same thing, except
that it uses a different way of acquiring the pointer to malloc().
Both examples make their actual call to malloc() via a function
pointer expression of the proper type.

No, the original code does a conversion of a pointer-to-function value
without knowing the actual type of the value being converted.

[...]
 
Nelu

Ian Collins said:
Well it must be close, AMD have over 20% of the market and Intel are
shipping 64bit CPUs as well.

What about the SPARC processors, aren't they 64 bit? SUN servers
were popular at some point (now they're switching to AMD-64).
Are PPC 64 bits?
 
Ian Collins

Nelu said:
What about the SPARC processors, aren't they 64 bit? SUN servers
were popular at some point (now they're switching to AMD-64).
Are PPC 64 bits?
Yes and yes.
 
Eric Sosman

Keith said:
In the following, assume that there's no visible prototype for
malloc().

Suppose different pointer-to-function types are represented
differently. The standard guarantees that any pointer-to-function
type can be converted to any other pointer-to-function type, but the
compiler has to know the type of the original pointer in order to
generate correct code for the conversion. [...]

That's something I hadn't considered, and I think
you may be right.
 
Mark McIntyre

AFAICT that's an error only if `size_t' is a type that
is subject to the "integer promotions."

It's an error if malloc returns anything other than int, which it does.
Irrespective of how you fool with the return type to cast it to a
pointer type, the compiler has been lied to and told to fetch an int
from some memory location which may (or may not) be a bucket of fetid
dingo's kidneys.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
Old Wolf

Mark said:
It's an error if malloc returns anything other than int, which it does.
Irrespective of how you fool with the return type to cast it to a
pointer type, the compiler has been lied to and told to fetch an int
from some memory location which may (or may not) be a bucket of fetid
dingo's kidneys.

The above code doesn't tell the compiler to fetch an int from
anywhere. The cast of the function pointer expression 'malloc'
results in a value of type 'pointer to function returning void* and
taking one argument of type size_t' and there is nothing
undefined about this conversion.

The actual definition of malloc() matches this type so there is
no undefined behaviour when malloc is called through this
function pointer.

It is quite different to casting the return value of
malloc. The above code is equivalent to:

void * (*fptr)(size_t) = malloc;
fptr(TWELVE_MILLION);

which is quite legal.
 
Keith Thompson

Old Wolf said:
The above code doesn't tell the compiler to fetch an int from
anywhere.
True.

The cast of the function pointer expression 'malloc'
results in a value of type 'pointer to function returning void* and
taking one argument of type size_t'

True (more or less).
and there is nothing
undefined about this conversion.

False, I believe.

Consider the expression
malloc
If a correct prototype is visible, this expression (after the implicit
conversion that's done to function names in most contexts) is
void *(*)(size_t)
Casting this expression to the same type, of course, performs no
actual conversion and simply yields the same pointer-to-function
value.

But in the absence of a prototype, the compiler doesn't know the type
of the expression
malloc
so it cannot, in the general case, generate correct code to convert
this value to type void *(*)(size_t). It's likely to generate code
that would convert a value of type int (*)() to type
void *(*)(size_t); applying this code to a value that's *actually* of
type void *(*)(size_t) unleashes the nasal demons.

(The nasal demons are likely to be invisible and harmless if all
pointer-to-function types have the same representation, which is very
likely since it's the most obvious and straightforward way to satisfy
the standard's guarantee that any pointer-to-function type can be
meaningfully converted to any other pointer-to-function type and back
again.)
The actual definition of malloc() matches this type so there is
no undefined behaviour when malloc is called through this
function pointer.

Yes, but the actual definition of malloc() is not visible to the
compiler.
It is quite different to casting the return value of
malloc. The above code is equivalent to:

void * (*fptr)(size_t) = malloc;
fptr(TWELVE_MILLION);
Yes.

which is quite legal.

but invokes undefined behavior.
 
Dave Vandervies

That, I'd doubt.

Datum: I've been going to local beige-box stores looking for pricing
on new low-end systems to replace my aging desktop. Every quote I've
gotten so far has included a 64-bit processor.


dave
 
Old Wolf

Keith said:
Consider the expression
malloc
If a correct prototype is visible, this expression (after the implicit
conversion that's done to function names in most contexts) is
void *(*)(size_t)
Casting this expression to the same type, of course, performs no
actual conversion and simply yields the same pointer-to-function
value.

But in the absence of a prototype, the compiler doesn't know the type
of the expression
malloc
so it cannot, in the general case, generate correct code to convert
this value to type void *(*)(size_t). It's likely to generate code
that would convert a value of type int (*)() to type
void *(*)(size_t); applying this code to a value that's *actually* of
type void *(*)(size_t) unleashes the nasal demons.

Seems to make sense. Is there any documentation in the
standard to back this up?

Personally I'm not convinced that the code should even
compile. In C99 it certainly doesn't: C99 does not have
implicit function declaration.

I don't have a copy of C90 to check, but I am wondering,
what is the scope of the identifier 'malloc' that is
"declared" by:

if (0) malloc();

? My copy of gcc does not compile if the line is changed to:
if (0) { malloc(); }

saying that in the next line, 'malloc' is an undeclared symbol.
This would be the only case I've ever heard of where
adding braces around STMT in

if (x) STMT;

makes a difference to anything, and also the first case where
identifiers declared in the block of a control statement can
"leak" to outside the control statement.

My copy of GCC also permits:

sizeof( foo() );

as "declaring" foo; it now seems difficult to argue that

sizeof( 1/0 );

should be permitted, as we were discussing in another thread.
 
Guest

Old said:
Seems to make sense. Is there any documentation in the
standard to back this up?

I don't know about C89, but from C99:

7.1.3:
All identifiers with external linkage in any of the following
subclauses (including the future library directions) are always
reserved for use as identifiers with external linkage.
[...]
No other identifiers are reserved. If the program *declares or*
defines an identifier in a context in which it is reserved (other than
as allowed by 7.1.4), or defines a reserved identifier as a macro name,
the behavior is undefined. (emphasis mine)

Does C89 have similar wording? If you want, you could argue that 7.1.4
is poorly worded, though.
Personally I'm not convinced that the code should even
compile. In C99 it certainly doesn't: C99 does not have
implicit function declaration.

I don't have a copy of C90 to check, but I am wondering,
what is the scope of the identifier 'malloc' that is
"declared" by:

if (0) malloc();

? My copy of gcc does not compile if the line is changed to:
if (0) { malloc(); }

saying that in the next line, 'malloc' is an undeclared symbol.
This would be the only case I've ever heard of where
adding braces around STMT in

if (x) STMT;

makes a difference to anything, and also the first case where
identifiers declared in the block of a control statement can
"leak" to outside the control statement.

Try
int main(void) {
if(0) (enum { zero }) 0;
return zero;
}

There is an implicit block for if-statements in C99, so adding braces
won't make a difference. My compiler rejects this code in C99 mode, but
accepts it in C89 mode, so I'm assuming that this is a C99 change.
My copy of GCC also permits:

sizeof( foo() );

as "declaring" foo; it now seems difficult to argue that

sizeof( 1/0 );

should be permitted, as we were discussing in another thread.

Why is this the same thing? If 1/0 is not evaluated, there is no
undefined behaviour. However, a declaration doesn't need evaluation to
have effect.
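
A small sketch of the "not evaluated" half of that argument follows. It uses a function call with a visible side effect instead of 1/0, and the names bump and calls_made_by_sizeof are made up for the illustration. It covers only non-VLA operands; sizeof applied to a variable length array type is a separate case.

```c
#include <stddef.h>

static int calls = 0;

/* Hypothetical function with a visible side effect. */
static int bump(void)
{
    return ++calls;
}

/* Returns how many times bump() ran while taking sizeof of a call. */
int calls_made_by_sizeof(void)
{
    size_t n = sizeof(bump());  /* operand is not evaluated */
    (void)n;
    return calls;
}
```

The counter stays at zero because sizeof only needs the type of its operand, while the declaration of bump still has to be visible for the expression to have a type at all.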
 
jaysome

There are at least as many 64-bit desktop systems shipping these days as
there are 32-bit, not including the server space, which has an even
greater bias to 64-bit.

But a large percentage of those 64-bit desktop systems are running
32-bit code in which int and void* have the same size. I'm running
Windows XP on an AMD 64-bit dual-core processor. All of my
applications are compiled with 32-bit int and 32-bit void*.

A lot depends on what OS and compiler one is using. There are a great
many OS/compiler combinations in which int and void* are both 32 bits
on a 64-bit processor.

That said, I couldn't put up with someone who insisted that int and
void* have the same size. Such a person would be out the door on my
account.
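
For code that really must carry a pointer in an integer, one sketch that avoids assuming anything about the width of int is below. It relies on uintptr_t, which is optional in C99 but provided by mainstream hosted implementations; round_trip_ok is a hypothetical name.

```c
#include <stdint.h>

/* Round-trips a pointer through an integer type wide enough to hold it. */
int round_trip_ok(void)
{
    int object = 0;
    void *p = &object;
    uintptr_t bits = (uintptr_t)p;   /* not (int)p: int may be narrower */
    void *q = (void *)bits;
    return q == p;
}
```

The same function written with int in place of uintptr_t might appear to work on a 32-bit build and lose the upper half of the address on a 64-bit one.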
 
Richard Bos

Datum: I've been going to local beige-box stores looking for pricing
on new low-end systems to replace my aging desktop. Every quote I've
gotten so far has included a 64-bit processor.

And every single one of them will have a 32-bit time_t. Right?

Sometimes I despair of us humans, I really do.

Richard
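
The limit being alluded to can be worked out with plain integer arithmetic. This sketch assumes time_t is a signed 32-bit count of seconds since 1970-01-01 00:00:00 UTC; struct offset and last_32bit_second are names invented for the example.

```c
#include <stdint.h>

/* Offset from the epoch of the last second such a counter can hold. */
struct offset {
    long days;
    int hours, minutes, seconds;
};

struct offset last_32bit_second(void)
{
    int32_t max = INT32_MAX;          /* 2147483647 seconds */
    struct offset o;
    o.days    = max / 86400;          /* whole days since the epoch */
    o.hours   = (int)(max % 86400 / 3600);
    o.minutes = (int)(max % 3600 / 60);
    o.seconds = (int)(max % 60);
    return o;
}
```

That comes to 24855 days plus 3 hours, 14 minutes and 7 seconds, which lands on 19 January 2038 at 03:14:07 UTC.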
 
SuperKoko

Jordan said:
But that's not what your cite says. The standard may say so elsewhere,
but it doesn't say it in what you quoted.
3.3.4 (Cast operators)
"
A pointer to a function of one type may be converted to a pointer to a
function of another type and back again; the result shall compare equal
to the original pointer. If a converted pointer is used to call a
function that has a type that is not compatible with the type of the
called function, the behavior is undefined.
"
It seems pretty clear to me. Here, the type of the function is int().
Note that, in the context of this paragraph, convert means : Use a cast
operator (it is in the chapter on type casts).
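
A sketch of the part of that paragraph that is guaranteed to work is below: convert with a cast, convert back with a cast, and call only through the pointer whose type matches the definition. The function twice and the typedef names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical function used only for this illustration. */
static size_t twice(size_t n)
{
    return 2 * n;
}

typedef size_t (*real_type)(size_t);
typedef void (*other_type)(void);   /* deliberately incompatible */

size_t call_after_round_trip(size_t n)
{
    other_type detour = (other_type)twice;  /* conversion is allowed */
    real_type back = (real_type)detour;     /* compares equal to twice */

    /* Calling through `detour` would be undefined; calling through
       `back` is fine because its type matches the definition. */
    return back == twice ? back(n) : 0;
}
```

Both conversions start from a pointer whose type the compiler knows, which is exactly what the disputed malloc fragment lacks.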

Things are quite simple :
Either your first UB (declaring malloc with an incorrect prototype)
happens to do bad things such as a crash, in that case your program is
bad or crashes.
Or, the first UB happens to be well-defined by the compiler and results
in a valid int() function. In that case a pointer int(*)() pointing to
that function, is valid, and you can use it, and it points to a int()
function... But converting it to void*(*)() and using this new pointer
has UB unless, once again, your compiler documents this UB too.
But, the compiler will have either to document the two UB separately,
or to document the combination of the two UB : i.e., it might say that
it is allowed to declare incorrectly a function and that taking the
address of the function yields a pointer which points to the correct
function (i.e., as if the declaration was correct).

But you can't assume that "knowing that the first UB doesn't behave
incorrectly" is sufficient to avoid the second UB. You must read the
compiler's documentation to know how and why, the first UB doesn't
behave incorrectly.
 
James Dow Allen

Note that the optimization in question has nothing to do with any
details of either processor architecture or malloc implementation.
GCC knows about and has special handling for a large number of standard
C functions (as well as a smaller number of functions that are not part
of Standard C).

I didn't review gcc source, but did do a "strings" on the compiler
object code.
Several standard function names appeared together, but malloc appeared
a second time later, almost by itself.

Which functions it knows about is documented in the GCC manual (search
for 'built-in functions'). (GCC manuals can be found at
http://gcc.gnu.org/onlinedocs/ )

I tried this, and searching for "built-in functions" got a list of ...
(surprise!) built-in functions. malloc is not built-in. GNU has its own
malloc() implementation, and mention of it shows up in other searches,
but that isn't what I'm looking for either.

I've concluded that readers in c.l.c, if any, who know of the
optimization, or even understood what it is, were too distracted by my
peculiar casting of malloc to attend to the interesting issue. (I
thought context made it clear that the peculiar casting was simply for
humor, but it now seems doubtful anyone here shares my sense of humor.)

James
 
Eric Sosman

Old said:
Seems to make sense. Is there any documentation in the
standard to back this up?

Personally I'm not convinced that the code should even
compile. In C99 it certainly doesn't: C99 does not have
implicit function declaration.

Keith's argument can also be illustrated with

#include <stddef.h>
long malloc(const char*, double, struct foo*);
void * (*pseudo_malloc)(size_t) = malloc;

... which does not rely on implicit int.

After some pondering, I think Keith is right. His
argument is really the same as that for other kinds of
type mismatches, like

/* file1.c */
double trouble;

/* file2.c */
extern int trouble;
...
double toil = (double)trouble;

Even though the compiler of file2.c is told to convert the
value of `trouble' to the type of `trouble's definition, it's
been misinformed about the "starting point" for the conversion
and (probably) generates incorrect code.
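
The real mismatch across translation units is undefined, so it cannot be shown in a well-defined program. As a rough stand-in, this sketch uses memcpy to read the leading bytes of a double as an int and then converts that int to double, which is approximately what a typical compiler would emit for file2.c. It assumes sizeof(int) <= sizeof(double); misread_as_int is a hypothetical name.

```c
#include <string.h>

/* Reads the leading bytes of a double as if they were an int, then
   converts that int to double. */
double misread_as_int(double actual)
{
    int wrong_view = 0;
    /* Assumes sizeof(int) <= sizeof(double), true on common platforms. */
    memcpy(&wrong_view, &actual, sizeof wrong_view);
    return (double)wrong_view;
}
```

On common IEEE 754 platforms misread_as_int(3.0) does not give back 3.0: the conversion itself is carried out correctly, but it starts from the wrong interpretation of the stored bytes.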
 
Guest

Eric said:
Keith's argument can also be illustrated with

#include <stddef.h>
long malloc(const char*, double, struct foo*);
void * (*pseudo_malloc)(size_t) = malloc;

Adding the required cast, and making one implicit conversion explicit:

long malloc(const char*, double, struct foo*);
void * (*pseudo_malloc)(size_t) = (void *(*)(size_t)) &malloc;
... which does not rely on implicit int.

After some pondering, I think Keith is right. His
argument is really the same as that for other kinds of
type mismatches, like

/* file1.c */
double trouble;

/* file2.c */
extern int trouble;
...
double toil = (double)trouble;

It's closer to double *toil = (double *) &trouble;.
Even though the compiler of file2.c is told to convert the
value of `trouble' to the type of `trouble's definition, it's
been misinformed about the "starting point" for the conversion
and (probably) generates incorrect code.

I don't think it's valid, but I don't think it's as likely to blow up
as you seem to either.
 
Andrew Poelstra

And every single one of them will have a 32-bit time_t. Right?

Sometimes I despair of us humans, I really do.
Well, when the terminator happens, we'll just have to wait until
2038, and then they'll reset to 1970 and start only killing hippies.

So it's not all bad. ;-)
 
