why is casting malloc a bad thing?

T

Tak-Shing Chan

Very unusual. Generally one knows what kind of data one wants when one
declares it; only while using them do you sometimes get confused.

Have you written any program generators? Are all of them
zero-defect implementations?
You know, I'd expect to get a warning;

Which part of the standard mandates such warnings?
and I'd expect malloc()-casters
to cast away that warning by writing

printf("%d\n", (int)*x);

or even

printf("%d\n", *(int *)x);

Your implication that malloc()-casters are doing so just to
shut up the warnings is at best a faulty generalization. When I
do cast malloc, I am trying to get /more/ warnings, not less.
Note that I abhor useless casts, too, so I would never cast this:

int *x = malloc(sizeof *x);

because the type of x is crystal clear in this context. Now,
perhaps, having a mix of casted and uncasted mallocs in the same
translation unit would have the best of both worlds--detecting
missing <stdlib.h> and at the same time detecting type errors:

int *x = malloc(128 * sizeof *x);
if (x) {
foo(128, TYPE_IS_INTEGER, x);
free(x);
}
:
:
:
x = (int *) malloc(2048 * sizeof *x);
if (x) {
bar(2048, TYPE_IS_INTEGER, x);
free(x);
}
Actually, I do dismiss this possibility, because I don't believe it
happens that way.

Yet, it just happened in the above program.
Moreover, it is not a reason to cast malloc() in particular. Would you
also expect to see the cast on

int *y;

y = (int *)x;

If not, why make an exception for malloc()?

No, because the purpose of the diagnostic is to prompt the
programmer to fix the wrong declaration of x. The diagnostic
produced by y = x is a clue to the observant programmer that a
problem has occurred with the type of x or y. On the other hand,
there is not enough information for the implementation to
diagnose the problem in x = malloc(sizeof *x).

Tak-Shing
 
T

Tak-Shing Chan

On Wed, 28 Jan 2004 18:49:52 +0000,


When you start casting x here, why not cast it everywhere you use it? This
doesn't have to do with malloc().

It has to do with the fact that malloc returns a void *,
which can be assigned to any pointer including the wrong ones.

Tak-Shing
 
D

Dan Pop

In said:
Not at all. For example, I do not have to rely on the semantics of
trigraphs, since I avoid them like the plague.

You may still inadvertently fall upon them, if you're not extremely
careful. E.g.

puts("Huh??!");

This is more of an issue when automatically encoding binary data in
string literals (e.g. using a base64-style algorithm) that are supposed
to be used by C programs. You *must* check that you're not inadvertently
inserting a trigraph sequence in that string literal.

See, their semantics are haunting even the trigraph haters...

Dan
 
S

Sidney Cadot

Dan said:
You may still inadvertently fall upon them, if you're not extremely
careful. E.g.

puts("Huh??!");

Like for accidental failure to include <stdlib.h>, my compiler help me
by warning about such mishaps. And yes, this /has/ prevented mayhem once
or twice :)

Best regards,


Sidney
 
S

Sidney Cadot

Dan said:
[...]
If you don't use the appropriate command line options to enable such
warnings, that's more of a problem than casting malloc results can ever be.

In short, I think this argument is stale, and is getting staler all the
time.
Try a reality check: how many newbies have you seen enabling such
warnings when they show the way they compiled their code (quite often,
people posting here show their compilation command line)?

All too many, for sure. I wouldn't necessarily recommend malloc casting
to newbies. I wouldn't recommend C to newbies, for that matter.
And keep in mind that this discussion is focused on newbies, experienced
programmers have already defined their preference pro or con casting
void pointers and it is not this thread that is going to make them change
their minds (you will keep casting when this discussion is over, I will
keep avoiding such casts as the plague).

I agree on this for most part, except that this shows a level of
subtlety on this issue that does not rhyme well with the kind of
reactions that code samples with malloc casts invariably evokes
around here.

Best regards,

Sidney
 
A

Amarendra GODBOLE

[..snip...]
to the comp.lang.c newsgroup. I find his argument compelling
because it is sound and *not* because of his "reputation".

[...snip...]

Hmmm...why do we _always_ find you on the opposite bank of the river ? ;)

Cheers,
Amarendra
 
M

Michael Wojcik

You are right. Stroustrup's opinion matters on C++, and your opinion
matters on C2, but neither of your (neither's of you?) opinion matters
on C. C++ is Stroustrup's language and C2 is yours, fine, you can keep
them. Neither of you has claim over C.

Yes, that was rather my point. (The "C2" language is purely an imaginary
invention, created for purposes of withering sarcasm - that may not have
been clear.) And actually, while I respect Stroustrop's opinions on C++,
I wouldn't follow them blindly either.
 
A

Amarendra GODBOLE

[...snip...]
This also avoids a potential problem wherein the roll resists
turning, thereby leading to excessive fragmentation ;-)

To avoid all such issues (mentioned here, as where as elsewhere in
this thread), we use water instead. ;)

Cheers,
Amarendra
 
M

Michael Wojcik

Stroustrop uses English carefully, and what he stated (in "The C++
Programming Language Third Edition") is subtly different, and compared
to Mark Bruno's statement above, defensible and correct.

Defensible, certainly, but I disagree with "correct".
He actually
states "However, *good* C programs tend to be C++ programs. For
example, every program in Kernighan and Ritchie, The C Programming
Language, 2nd Edition, is a C++ program."(emphasis in the original).

Yes, I've read that (in the Special Edition, but I believe it's the
same text there), and some of his other writings on the subject.

The second sentence may be correct (I haven't checked every program
in K&R2 to verify this claim). The former, however, is both subjective
("good") and vague ("tend"), so it has no claim to correctness.

Some of the features which make C programs not C++ programs are in fact
attributes of good C programs, as I define them; thus for me good C
programs tend not to be C++ programs.
In an appendix he gives an exhaustive treatment of when, and why, C
programs are not C++ programs, and when the two are semantically
different.

Indeed. I do not find this a compelling argument for his position
either. I also know the differences between the two languages - which
is good, since I write both C and C++ programs and I like to get them
as right as I can. Yet I have arrived at a different conclusion.

Stroustrop has his opinion on this subject. I don't think it's an
arbitrary, foolish, or malicious one (though I do suspect it's partly
motivated by his relationship to C++ - which is perfectly natural,
and of course he might well hold the same opinion in other circum-
stances). I just don't believe it's a matter of fact, as some people
seem to do.

--
Michael Wojcik (e-mail address removed)

Auden often writes like Disney. Like Disney, he knows the shape of beasts --
(& incidently he, too, might have a company of artists producing his lines) --
unlike Lawrence, he does not know what shapes or motivates these beasts.
-- Dylan Thomas
 
C

Chris Torek

... I would like to know [why]
double* p = malloc(8*sizeof(double));==>OK for some here because
implicit coversion is well enough
but why
double* p = (double*) malloc(8*sizeof(double));==>NOK for some.

It is not so much a "not OK" as "causes more problems than it solves".

Consider, for instance, a (rather weak) analogy. Suppose that for
some reason you have an urgent need to travel 200 miles (or 300 km
in Europe :) ) in about an hour. You can:

- drive your car at 200 mph (300 kph), or

- take a 500-mph plane, giving you about 30 minutes to get on
and off the plane at the two ends of the travel.

There are tradeoffs here, and neither solution is absolutely perfect
-- but one is statistically quite a bit more risky than the other.
Perhaps driving at 200 mph kills 17% of the people who do it, while
flying at 500 mph kills 0.0034% of people who do it.

In *my* experience -- which dates back to 1981, when I first
started using C -- it is "casting malloc" that is like "driving 200
mph". *Many* more programs that have a cast have been discovered
to be broken, yet produced no compile-time diagnostics at all.

Specifically, I have seen at least dozens, if not hundreds (I never
kept a count), of cases of code that failed at runtime due to casts
hiding a missing "#include <stdlib.h>" or other declaration of
malloc(). In the compilers we used in the 1980s (including VAX
PCC, Plauger's own Whitesmiths C, VAX/VMS C, and a number of C
compilers for IBM PCs), malloc() had type:

char *malloc();

and casts were required for all situations except "assigning the
result to a variable of type `char *'". Without the cast, you got
a diagnostic, contrary to what Dennis Ritchie's earliest C compilers
did. (Apparently, in the Bad Old Days of:

int i 3; /* initialize i to 3 */

and

printf(2, "error\n"); /* write to file descriptor 2 */

no diagnostics occurred for:

int malloc(); char *p; ... p = malloc(N);

I never used these compilers myself -- by 1981, it was apparently
clear to C compiler writers that this was a bad idea.)

Accordingly, in "K&R-1 C" -- i.e., C as described by the original
White Book and implemented by compilers like PCC and VAX/VMS C --
one usually had to cast malloc(). As a result, if one failed to
declare malloc(), it acquired the "implicit int" declaration. C
being C, writing:

int malloc(); /* implicit */
double *p = (double *)malloc(N * sizeof *p);

normally compiled warning-free. On many machines, it even worked --
but on some machines it failed. If you declared malloc correctly,
either by hand or by including some header file (stdlib.h did not
exist yet and the name and even existence of this header varied),
the code would work on those other machines too.

In December 1989, the ANSI X3J11 committee finalized the original
ANSI C standard; and in this C, the type of malloc() had changed
from:

char *malloc();

to:

void *malloc(size_t);

Moreover, "void *" (which had been a useless type that even caused
some compilers to core dump!) now existed, and had a special
assignment-conversion property. Because of this property, it was
now possible to remove the casts from old code, and create new code
without casts:

#include <stdlib.h>
...
double *p = malloc(N * sizeof *p);

Such code immediately drew a "diagnostic" -- always a "warning" in
every compiler I used in the 1990s -- if and only if you had
forgotten the "#include" (or a programmer-supplied declaration,
but if you had a Standard C compiler, including the standard header
was the obvious best choice).

In particular, this meant that programs could now be written such
that a VERY COMMON MISTAKE was found at compile time. And -- as
I said earlier -- I subsequently observed what I believe to be
thousands of potential runtime failures, and least dozens of actual
runtime failures, prevented in code in which the (existing) casts
were *removed* and the warning diagnostic fixed by adding the
missing "#include" line.

Had those casts been left in, or had new code been written using
those casts, I believe much more code would have eventually failed
during porting or testing or (most expensively) after distribution.
This leads to what *I* believe is an inescapable conclusion:

IF ONE WRITES IN ANSI C89 (or ISO C90), casting malloc
is not necessary, and adds HUGE amounts of risk.

There are some who argue (see other postings in this thread,
particularly those by Tak-Shing Chan) that, even in strict C89,
there are possible situation in which casting malloc also removes
some risk. But in the the last decade, while I have personally
observed huge numbers of of "failure to include <stdlib.h>" errors,
including dozens of cases in which real, in-use code DID fail
because of this combined with an unnecessary cast, I have never --
NOT ONCE -- seen a situation in which real, in-use code failed in
a way that would have been diagnosed at compile time had a cast
been used. In other words, we might restate the above as:

In C89, casting the result of malloc() has proven over time to
add significant risk of error, and has a theoretical chance of
reducing error. Statistically, code that does cast malloc()
contains many more errors than code that does not.

The only sensible conclusion to the above is that one should not
cast malloc()'s return value in this case -- only some sort of
external constraint (such as "I am not always using C89") could
change this.

Now, in C99, that huge risk -- the failure to declare malloc()
using "#include <stdlib.h>" -- has changed from a frequently-undiagnosed
error to a required-diagnostic error. In other languages that
vaguely resemble C, it has always required a diagnostic. Thus, in
these non-C89 languages, the biggest risk in casting malloc() has
been removed. So we can add:

IF ONE WRITES ONLY IN C99 (or some other non-C language),
casting the result of malloc() adds little if any risk of
error.

Of course, no one yet has a decade of experience in "time-tested"
or "proven" C99 risk-assessment -- the new standard has been in
existence for less than four years, and few compilers implement it
even today. (Whether you believe my own "time-tested, proven" C89
risk-assessment above is up to you, of course.)

In any case, each programmer (or person directing programmers and
thus setting standards for them) must make his own risk/reward
analysis. The risks of casting malloc() in C89 are, I think,
well-established and significant. The rewards are, I think,
nonexistent in practice -- the theoretical situations in which it
helps never occur in real code. The risks in C99 are far smaller;
and if you have additional constraints, or believe you will never
use a C89 compiler that lacks a "warn on implicit int" diagnostic,
your own risk may be smaller than that in "generic C89". In this
case, whether to cast becomes purely a style issue:

While there are many *wrong* styles, there is no single
*right* style.

My own style will continue to be "do not cast", because the casts
are still unnecessary (in C -- I am not writing C++ here and when
I write C++ I almost never use malloc()), and I find that removing
all unnecessary source code produces programs that are smaller,
faster, more comprehensible, and/or more maintainable. But just
as there are English-language styles in which one does not "omit
needless words" (see Strunk&White), there are such code styles too.
 
S

Sidney Cadot

Chris said:
[snipped a well-balanced article concerning malloc() casting]

Amen to all that. I think you have accurately conveyed the majority view
here, while at the same time giving some attention to opposing views
without dismissing them out of hand. If something along these lines
could end up in the FAQ, that would be great.

Best regards, Sidney
 
C

Christian Bau

magesh said:
Well I am not sure to measure upto u guys, my question may be redundnant.
My question is this, if people pretend with their own arguments
that casting malloc return may cause undefined behaviour, I would like
to know

double* p = malloc(8*sizeof(double));==>OK for some here because
implicit coversion is well enough
but why
double* p = (double*) malloc(8*sizeof(double));==>NOK for some.

I assume that casting the malloc's return is just an indication for my
codevelopers who may see an assignement of p from a malloc a 500 or more
lines after my pointer declaration p.

If we prentend that implicit conversion is safe and the fact of casting
may cause UB due to probable alignement problems, its a bit difficult to
understant why implicit conversion may be safer than an explicit more
than a matter of style.

You missed the point completely. Correct casts are not a problem at all.
Incorrect casts are a problem. The compiler doesn't know whether a cast
is correct or incorrect. That leads to the principle that you should
_never_ use a cast unless you cannot avoid it.

int* p = malloc (2 * sizeof (int));
int* p = (int *) malloc (2 * sizeof (int));

do exactly the same thing. The cast has been used correctly. It didn't
do any damage, but it didn't help either.

extern int nalloc (size_t size);

int* p = nalloc (2 * sizeof (int));
int* p = (int *) nalloc (2 * sizeof (int));

In this case, the compiler can catch the error in the first statement,
but not in the second one. The cast did hide the error.

Apply the general principle (never cast unless it cannot be avoided),
and it is obvious that you shouldn't cast the result of malloc.
 
C

Christian Bau

Sidney Cadot said:
No, it does not actually. This argument may have been relevant 10 years
ago, but it is no longer. Sure, it's not a required diagnostic in C89
mode, but any halfway decent compiler is able to warn about an
undeclared malloc() nowadays.

If you don't use the appropriate command line options to enable such
warnings, that's more of a problem than casting malloc results can ever be.

In short, I think this argument is stale, and is getting staler all the
time.

It helps you in case you spelt the function name incorrectly.
 
C

Christian Bau

[email protected] (Dan Pop) said:
The only *good* reason for renaming an identifier is because the original
name was improperly chosen. Hence my "bizarre" statement above.

The original name may have been chosen properly according to the
requirements that you knew at that time. When requirements change,
meanings can change and a name may not be chosen properly anymore.
 
E

E. Robert Tisdale

Christian said:
extern int nalloc(size_t size);

int* p = nalloc(2*sizeof(int));
int* p = (int*)nalloc(2*sizeof(int));

In this case, the compiler can catch the error in the first statement,
but not in the second one. The cast did hide the error.

In this case, your example begs the question,
"What was the error in this example?"
Was nalloc supposed to return a pointer to an int?
 
C

Christian Bau

"E. Robert Tisdale said:
In this case, your example begs the question,
"What was the error in this example?"
Was nalloc supposed to return a pointer to an int?

I'll have to repeat myself:

"Tisdale, you idiot, go away!"
 
J

J. J. Farrell

magesh said:
I completely agree with the fact that proper header files should be
included, I would be amazed a proper C file that never calls a function
from stdlib, so normal programmers include them, if they give attention
to warnings from the compiler or else this thread is not meant for them.

But that's the whole point! Assume the programmer makes a mistake
and forgets to include the header. If he casts the return value from
malloc() he does not necessarily get a diagnostic; some compilers
give one if invoked in certain ways, many do not give one by default,
some do not give one no matter how they are invoked. On the other
hand, if he does not cast the return from malloc(), all C compilers
must give a diagnostic. Without the cast you will always get a
message; with the cast you might or might not get a message.

I remember one poster here presenting some sample code that included
a cast on malloc(). He was advised to remove the cast, and patiently
explained that he'd had to add the cast because without it he got
some strange message about 'invalid assignment'. If the mantra
'never cast malloc()' had been burned into his brain, he would have
had to think a little more or ask around to find what his bug was.
declaration of p 1000 lines before.
1001: p= malloc(sizeof(*p)*10);

could you tell me what type is p, while u are going through just this
part of ur code.

1) It is atrocious style to declare a pointer 1000 lines earlier,
especially if it is not used regularly enough for the reader to
already know what it is. Such code is unmaintainable.

2) Why do you care what type it is at the point where you malloc()
it? You might need to know soon afterwards, but I don't regard a
cast on a call to malloc() as an apropriate way of commenting code!
Reagrding code maintenance, the other solution p= malloc(sizeof(*p)*10);
may be easy to change the type of p without changing the malloc line in
ur code, after that?? the fact that the type of p has changed may have
more impact on other lines where u use them, than those on one malloc
statement!!!

Indeed, and the fewer such changes there are to be made, the more
likely that they'll be made correctly. The fewer changes there are,
the more quickly the overall update can be made. The fewer changes
there are, the lower the risk of introducing bugs. Maintainable
reliable code is written to adapt itself automatically to changes
as much as possible.
 
E

E. Robert Tisdale

But that's the whole point! Assume [that]
the programmer makes a mistake and forgets to include the header.
If he casts the return value from malloc(),
he does not necessarily get a diagnostic;
some compilers give one if invoked in certain ways,
many do not give one by default,
some do not give one no matter how they are invoked.

Why should C programmers cobble their code
just to accommodate inferior C compilers?
On the other hand, if he does not cast the return from malloc(),
all C compilers must give a diagnostic.

A cryptic an misleading diagnostic if it doesn't tell the programmer
that the definition of malloc is missing.
 
C

Chris Torek

Someone notes that, if one follows the "comp.lang.c accepted
practice" of not casting malloc() return values, and uses most
ordinary C89/C90 compilers in their default non-warning modes
so that one gets only the required diagnostics, forgetting to
#include said:
A cryptic an misleading diagnostic if it doesn't tell the programmer
that the definition of malloc is missing.

For example, if one were to write:

struct zorg *new_zorg(double degree_of_evilness) {
struct zorg *new = malloc(sizeof *new);

new->degree = degree_of_evilness;
return new;
}

and feed it to the Generic C Compiler, one might get:

% cc -c zorg.c
zorg.c: In function `new_zorg':
zorg.c:4: warning: initialization makes pointer from integer without a cast

It is certainly true that this warning leaves something to be
desired. The problem is not making a "pointer from integer",
but rather with "malloc never explicitly declared, thus implicitly
declared instead". Nonetheless, this is hardly a *new* problem:

% cc -c oops.c
oops.c:6: syntax error before `for'
% cat oops.c
/* kind of like strlen() but returns an int *./
int like_strlen(const char *p) {
int i;

/* we just need to find the first p that is '\0' */
for (i = 0; p != 0; i++)
continue;
return i;
}

There is nothing wrong on line 6, nor line 5, nor 4 (blank), nor
3, nor even line 2 -- the problem is all the way back on line 1,
where a typographic error inserted a period between the * and /
meant to close the comment. There are all kinds of constructs like
this, where a compiler issues the wrong message. Programmers *need*
to learn not to trust where the machine points, in the same way
that automobile mechanics need to learn not to assume that, if the
On Board Diagnostics in an engine computer says "oxygen sensor
failure", it is not necessarily the sensor itself that failed.
(The problem can be in the wire connecting the sensor to the
computer. The sensor is a "replacement item" -- the ones in the
exhaust system in particular may have only a few years' lifetime
-- so the diagnostic is *usually* right, but not always.)

It is always a good idea to compare the machine's complaint against
what you (believe you) know to be the case. If the machine says
"you are making a pointer from an integer", but you know malloc()
does not return an integer, then you should ask yourself: where is
this integer? Why does the compiler believe there is an integer
here? Then, if all else fails, you could even post to comp.lang.c:

"I have this C code here, and the compiler is giving me a
message involving an `integer' when there is no integer.
Why is that?"

Ideally, you should include a complete, compileable chunk of code,
and the exact text of the error or warning message, too. But the
goal here -- besides fixing the problem -- is learning why a C
compiler might think there is an integer here, when you are certain
there is none. Then you will hear all about "implicit int function
declarations", why they are bad, that your compiler needs a tuneup
(e.g., "-Wmissing-declarations") if not a complete replacement,
and of course the inevitable cascade of "don't cast malloc in C /
compile your C code with an Ada-I-mean-C++-compiler" flamage. :)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,139
Messages
2,570,806
Members
47,352
Latest member
Maricruz09

Latest Threads

Top