about NULL as a parameter to a libc function


BartC

memcpy(), in a very real sense, is a basic language feature. Yes, it's
in the library rather than in the core language, but its performance is
just as important as the performance of an assignment.

And having memcpy() check for null pointers and do nothing makes certain
errors *more difficult to detect*. If I write

memcpy(target, source, size);

and either target or source happens to be a null pointer, it's almost
certainly because I made a mistake.

If size is zero, that's quite likely to be a mistake too. Yet the standard
defines the behaviour in that case. (I've been looking at the Python
implementation code; checking for size being zero is the most common check
before calling memcpy(), even though it is not actually needed. Checks for
destination or source being null are rarer.)
A memcpy() that hides that mistake
from me isn't doing me any favors; it's by no means certain that doing
nothing is the correct behavior that will "fix" the bug.

Letting some code deliberately crash seems a rather drastic way of telling
you something is wrong! And not crashing doesn't mean nothing else is wrong.
Are these scripting languages of your own design? Perhaps using null
for empty strings wasn't the best choice. Is there a way to represent
an empty string using a non-null pointer?

(I use counted strings, which also have a tag saying if it's a string or
not. Having a length=0 and ptr=0 was the simplest way of
doing things. It's not a big deal; I just need to do extra fixing-up of string
arguments. (And that reminds me I'm planning to do away with
zero-terminators too, a bigger headache when calling C and Windows
functions.))
As for untrusted callers, when you check for an error condition, you
have to know how to handle it. How do you know that the way you handle
it (say, doing nothing for a null pointer) is the correct behavior?

Sometimes I start writing a function before fully documenting it or knowing
exactly how it will be used, or whether a null parameter is even likely.
It's simplest to just put checks for null right from the start. Usually it
doesn't hurt. If the value *shouldn't* have been null, then I will find out
when the program doesn't do what I want!
 

Keith Thompson

BartC said:
If size is zero, that's quite likely to be a mistake too. Yet the standard
defines the behaviour in that case. (I've been looking at the Python
implementation code; checking for size being zero is the most common check
before calling memcpy(), even though it is not actually needed. Checks for
destination or source being null are rarer.)

A size of zero isn't nearly as likely to be a mistake as a null
pointer. For example, suppose you have an array of (whatever) and
a count of how many elements of the array are currently meaningful.
It makes sense to use memcpy() to copy only the meaningful elements.
The fact that the behavior for size==0 is well defined means you
don't need to code that as a special case.
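
A minimal sketch of that pattern (the structure and names here are purely illustrative):

#include <string.h>

#define CAPACITY 64

struct buffer {
    int items[CAPACITY];
    size_t used;            /* how many elements are currently meaningful */
};

/* Copy only the meaningful elements. No special case is needed when
   src->used == 0, because memcpy() with a zero size is well defined
   and simply copies nothing. */
void buffer_copy(struct buffer *dst, const struct buffer *src)
{
    memcpy(dst->items, src->items, src->used * sizeof src->items[0]);
    dst->used = src->used;
}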

If you insist on treating a null pointer as if it were a pointer to
an empty (whatever), you can certainly do that -- but don't expect
memcpy() to have special-case code to check for null pointers just
so you don't have to.

I don't think anybody has claimed that the C standard library is a model
of consistency. It's evolved over the years from earlier libraries,
some of them independently developed, many of them optimized for much
smaller and slower systems. But I personally find the way memcpy() is
defined to be reasonable, and I think most other C programmers do as
well. (If you're saying that you don't like the way C is defined,
you're certainly not alone.)
Letting some code deliberately crash seems a rather drastic way of telling
you something is wrong! And not crashing doesn't mean nothing else is wrong.

Letting code deliberately crash is intended to tell *the developer*
that something is wrong. It's not just a random occurrence;
there's a lot of carefully crafted code in your operating system
that's designed to cause, handle, and report crashes.

[...]
Sometimes I start writing a function before fully documenting it or knowing
exactly how it will be used, or whether a null parameter is even likely.
It's simplest to just put checks for null right from the start. Usually it
doesn't hurt. If the value *shouldn't* have been null, then I will find out
when the program doesn't do what I want!

Will you *always* find out? Not all bugs are detected when they should
be. If my code is buggy, I'd much rather have it crash early than
quietly behave in some way other than what I intended.
 

James Kuyper

ISO/IEC 9899:1999 says, in §7.1.4 "Use of library functions": ....
So, the C standard explicitly says that you invoke undefined behaviour
if you pass null pointers to a function that does not expect them. And
that in turn means that any behaviour is correct.

The OP's question was, essentially, why those particular functions were
not defined as expecting null pointers. memcmp(), for instance, could
have been defined as either doing nothing, or returning some kind of
error indication, whenever either of its pointer arguments is null. It
was not so defined, and gaoqiang wants to know why.

I think we have answered that question; but you've answered a different one.
 

Nick Keighley

Nick Keighley said:
On 10/28/2011 09:49 AM, gaoqiang wrote:
a good function should deal with unwanted [invalid?] parameters.
yes but if the caller *knows* the parameter is valid it can be
inefficient to check over and over again
/* how many times do we have to check s1 isn't NULL? */
strcpy (s1, s2);
strcpy (s1, s3);
...
strcpy (s1, sn);

That's not really a likely piece of code. A better example is:

 strcpy(s2,s1);
 strcpy(s3,s1);
 ...

yes, my example was poor

In this case whatever or whoever created p1 and p2 would be responsible.

exactly. Why is this different from the strcpy() example?
It would be unreasonable to expect the callee to do these checks.

"what world needs is a simple way to factorise prime numbers"
Bill Gates
/* cmp() must terminate */
/* this could be impossible to verify */
void sort (T a[], size_t n, int (*cmp) (T, T));

You're jumping a bit from checking for an uninitialised pointer, to proving
the correctness of potentially thousands of lines of code!

I'm pointing out several examples where it becomes increasingly
onerous for a function to check the correctness of its arguments.
So if the latter
is difficult to do, we do nothing about the simpler cases either?

the simple cases tend to end up in inner loops. I'd like these
functions to be very efficient. It's the C way. If you want more
protection and slightly (or much) slower code, use a higher-level
language or decorate your C code with checks. Write your own "safe"
versions of the string.h functions if you like.
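
For example, a rough sketch of such a wrapper (the name safe_strcpy is made up; it's not a standard function):

#include <stddef.h>
#include <string.h>

/* Null-tolerant strcpy variant: refuses to copy if either pointer is
   null and reports that by returning NULL. Purely illustrative. */
char *safe_strcpy(char *dest, const char *src)
{
    if (dest == NULL || src == NULL)
        return NULL;            /* the caller can test for this */
    return strcpy(dest, src);
}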
 

Nick Keighley

On 29/10/11 16:39, Nick Keighley wrote:



What a moron!

How can you check an error code if the program crashes?

Go figure, maybe he runs his applications always inside a debugger.

There is no point in discussing with people like this any further.

quite
 

Seebs

It does appear that IEEE 754 specifies this lunacy of a NaN comparing
not equal with itself, however.

It's not lunacy. These people... they do a lot of things that look
really weird, but it turns out that there are sound technical reasons for
it.

It's like the goat you sacrifice to make a SCSI chain work.
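
For what it's worth, one of those sound technical reasons shows up as a practical idiom: because a NaN compares unequal even to itself, x != x is a NaN test. A small sketch, assuming C99's nan() is available:

#include <math.h>
#include <stdio.h>

int main(void)
{
    double x = nan("");          /* produce a quiet NaN (C99) */

    /* A NaN compares unequal to everything, including itself, so
       x != x is true exactly when x is a NaN (the same property
       that isnan() relies on). */
    if (x != x)
        printf("x is NaN\n");
    return 0;
}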

-s
 

Patrick Scheible

Seebs said:
It's not lunacy. These people... they do a lot of things that look
really weird, but it turns out that there are sound technical reasons for
it.

It's like the goat you sacrifice to make a SCSI chain work.

Somehow SCSI got a worse reputation than it deserves. I used NeXTs with
several SCSI devices hanging off of them and never had any trouble with
them. Contemporary IDE drives were a PITA by comparison.

-- Patrick
 

Phil Carmody

Robert Wessel said:
Somehow SCSI got a worse reputation than it deserves. I used NeXTs with
several SCSI devices hanging off of them and never had any trouble with
them. Contemporary IDE drives were a PITA by comparison.


It was the combination of multitudes of connectors and different
busses that had partial interoperability.[...]
And [...]
Then [...]
Or [...]
and [...]
Or [...]
Nor [...]
And [...]

It improved a lot over the years (although parallel SCSI never became
consistent enough to be considered "easy"), but early SCSI fell *far*
short of its promise to be a general purpose I/O channel.

I'll stop ranting now.

What amuses me is that on the other side of the interface, the OS
talking SCSI commands to adapters, the interface was so generically
useful that it got used and abused by all kinds of non-SCSI devices.
(At least in Linux.) Wanna burn an IDE CDR? Use sg. Plug in a USB
stick? Find it under /dev/sd*. ...

Phil
 

Seebs

Somehow SCSI got a worse reputation than it deserves. I used NeXTs with
several SCSI devices hanging off of them and never had any trouble with
them. Contemporary IDE drives were a PITA by comparison.

It mostly, I think, had to do with the Mac "innovation" of a 25-pin port
using a common ground. I had endless trouble with SCSI devices using that
connector. Probably my only real complaint about the Amiga 3000.

-s
 

jgharston

Robert said:
Interestingly(?) many C compilers for Win16 allowed you to store into
location zero, at least in small or medium model where a 16 bit

C doesn't allow or disallow storage to location zero; the environment the
program is executing on allows or disallows storage to location zero,
usually depending on the absence or presence of memory mapping/protection
hardware.

JGH
 

ImpalerCore

What would you expect strcmp("", NULL) to return?

Ideally, I would love to see the behavior depend on a preprocessor
flag.

If DEBUG is defined,

Constraint violation: s2 != NULL, file example.c, line 9, function
strcmp
[crash]

If DEBUG is not defined: [crash]

Whether adding explicit argument constraint checking to the standard
library is a good idea, I don't know. Do I think it would be useful?
Absolutely. My opinion is that unless NULL is explicitly allowed as a
valid representation of a string in the standard, the result should be
a crash, even if NULL in the statement 'strcmp( NULL, NULL );' can in
some scenarios be thought of as equivalent to comparing two empty strings.

If someone requires a function with defensive programming behavior
that allows NULL pointer values as arguments, it should not be named
'strcmp'.
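
A minimal sketch of what such a differently-named, DEBUG-checked wrapper might look like (xstrcmp_chk and XSTRCMP are invented names, not from any library):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical checked comparison: with DEBUG defined it reports the
   violated constraint and crashes deliberately; without DEBUG, a null
   argument simply crashes inside strcmp() as usual. */
static int xstrcmp_chk(const char *s1, const char *s2,
                       const char *file, int line)
{
#ifdef DEBUG
    if (s1 == NULL || s2 == NULL) {
        fprintf(stderr, "Constraint violation: %s != NULL, file %s, "
                        "line %d, function strcmp\n",
                s1 == NULL ? "s1" : "s2", file, line);
        abort();
    }
#endif
    return strcmp(s1, s2);
}

#define XSTRCMP(a, b) xstrcmp_chk((a), (b), __FILE__, __LINE__)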

Best regards,
John D.
 

Kaz Kylheku

Now, the Standard guarantees that free(NULL) is a valid no-op.

Since 1989. Some kids who were born then have their university degrees now
and are working as software devs. Can we move on?
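
For the record, the reason the guarantee is handy: cleanup code doesn't have to track which pointers were actually allocated. A small sketch:

#include <stdlib.h>

struct thing {
    char *name;
    char *data;
};

/* Members that were never allocated are simply NULL; free(NULL) is a
   guaranteed no-op, so no per-member checks are needed. */
void thing_destroy(struct thing *t)
{
    if (t == NULL)
        return;
    free(t->name);
    free(t->data);
    free(t);
}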
 

James Kuyper

On 11/03/2011 11:00 AM, Kenneth Brody wrote:
....
Note, too, that it's perfectly acceptable (as far as the C language itself
is concerned) to have NULL point to a "valid" address, as long as no valid
object can be at that address. ...

That is correct.
... For example, it would be legal (AFAIK) for
an implementation to do something like this:

extern char __NULL__;
#define NULL ((void *)(&__NULL__))

No, NULL is required to expand into a null pointer constant. A null
pointer constant is either "An integer constant expression with the
value 0, or such an expression cast to type void *," (6.3.2.3p3).

ICEs must have integer type; &__NULL__ has a pointer type. An ICE "shall
only have operands that are integer constants,
enumeration constants, character constants, sizeof expressions whose
results are integer constants, and floating constants that are the
immediate operands of casts." (6.6p6). __NULL__ qualifies as none of
those things. While "an implementation may accept other forms of
constant expressions" (6.6p10), those other forms will, by definition of
"other", not have the right form to qualify as any of the specific types
of constants specified in 6.6p6.

What is permitted is for NULL to expand into a particular form of null
pointer constant that will be converted, when it occurs in a pointer
context, into a pointer which happens to point at __NULL__. It would
require a fair amount of compiler magic, however, to make that happen
without also having 0 == &__NULL__.
 

James Kuyper

On 11/03/2011 11:26 AM, Kenneth Brody wrote:
....
memcpy() returns the first parameter. In ye olde days, this was probably to
allow the buffer to be "efficiently" passed to some other function to be
handled. (Besides, the only other "reasonable" return would be nothing at all.)

It would have been marginally more useful for memcpy(dest, source,
count) to have returned either (void*)((char*)dest+count) or
(void*)((char*)source+count).
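
As a rough illustration of why a dest+count return is convenient (glibc offers something along these lines as the non-standard mempcpy()), consecutive appends chain naturally:

#include <string.h>

/* Sketch of a memcpy variant that returns one past the last byte written. */
static void *copy_end(void *dest, const void *source, size_t count)
{
    memcpy(dest, source, count);
    return (char *)dest + count;
}

/* Usage: appending pieces end to end without recomputing offsets.
   buf is assumed to be large enough (14 bytes here). */
void build_greeting(char *buf)
{
    void *p = buf;
    p = copy_end(p, "Hello, ", 7);
    p = copy_end(p, "world", 5);
    p = copy_end(p, "!", 2);     /* 2 bytes: '!' plus the terminating '\0' */
}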
How would returning "the number of bytes actually copied" be useful? It's
guaranteed (assuming no UB) to be the number of bytes passed to it.

He's suggesting that it no longer be UB if one or both of the pointer
arguments is null, in which case returning 0 as the number of bytes
copied would have some use. Not a lot, but some.
 

Kaz Kylheku

On 11/03/2011 11:00 AM, Kenneth Brody wrote:
...

That is correct.


No, NULL is required to expand into a null pointer constant. A null
pointer constant is either "An integer constant expression with the
value 0, or such an expression cast to type void *," (6.3.2.3p3).

Since __NULL__ is an external name in the implementation space, and this is
supposed to be some implementation's definition of NULL anyway, those rules are
moot. The implementation could arrange for &__NULL__ to be a constant
expression to make this all work.
 

James Kuyper

Since __NULL__ is an external name in the implementation space, and this is
supposed to be some implementation's definition of NULL anyway, those rules are
moot. The implementation could arrange for &__NULL__ to be a constant
expression to make this all work.

I don't think your argument is valid. Where is it stated in the standard
that those conditions render those rules moot? If your argument were
valid, it would render the requirement that NULL expand into a null
pointer constant (7.17p3) meaningless. I doubt that it was intended to
be meaningless.
 

Kaz Kylheku

I don't think your argument is valid. Where is it stated in the standard
that those conditions render those rules moot? If your argument were
valid, it would render the requirement that NULL expand into a null
pointer constant (7.17p3) meaningless. I doubt that it was intended to
be meaningless.

6.6p10: "An implementation may accept other forms of constant expressions."

This sort of thing only makes a difference to a highly contrived program
which does something like this:

#define STR(X) #X
#define STRE(X) STR(X)

if (!is_null_pointer_constant(STRE(NULL))) {
printf("unconventional null pointer constant");
}

Where is_null_pointer_constant is a function which lexically analyzes
and parses the stringified expansion of NULL, and validates what kind
of expression it is.

Since an implementation can accept other forms of constant expressions,
this program cannot conclude that the implementation is nonconforming
(and is itself not maximally portable since its output depends on
the implementation.)
 

James Kuyper

6.6p10: "An implementation may accept other forms of constant expressions."

Yes, I've seen that argument before, and addressed it in one of the
paragraphs you've snipped from one of my earlier messages. Those forms
are "other" than "integer constants, enumeration constants, character
constants, ... and floating constants", which are the only kinds allowed
in ICE. The "other forms of constant expressions" that "an
implementation may accept" don't meet the requirements of any of the
standard-named categories of constants, so they only fall into the
"other" category. They can be used anywhere that the standard requires a
constant expression, without specifying a particular type of constant
expression; they cannot be used when a specific type of constant
expression is required, as is indeed the case for null pointer
constants, which require an integer constant expression.

Note that an implementation doesn't need to use 6.6p10 to make &__NULL__
qualify as a constant expression; it already qualifies under the
category "address constant expression" (6.6p9).
 

Kaz Kylheku

Note that an implementation doesn't need to use 6.6p10 to make &__NULL__
qualify as a constant expression; it already qualifies under the
category "address constant expression" (6.6p9).

&__NULL__ can be a contant integral expression evaluating to zero, if the
implementors so want.
 

James Kuyper

&__NULL__ can be a contant integral expression evaluating to zero, if the
implementors so want.

Perhaps, though I doubt that the fact that __NULL__ is a reserved identifier
gives an implementation permission to ignore the standard semantics of
'&', which can never appear as the first token of an expression with
integer type. If the __NULL__ occurred in user code, it would be a
different matter - but if the user code contains NULL, the
implementation's not allowed to invoke "undefined behavior" as a result
of a feature of that expansion, as an excuse to violate the
standard-defined semantics of '&'.

However, null pointer constants are defined as involving an "integer
constant expression", a phrase whose meaning as a whole is defined in
6.6p6. That definition is not merely the combination of the definitions
of integer, constant, and expression - many constant expressions with
integral type fail to qualify as integer constant expressions.
 
