about NULL as a parameter to a libc function

James Kuyper · Oct 28, 2011

I assume there is a good reason why it returns what it does. I'm not
familiar enough with such things to guess at the reason.

Unlike some of the string functions, memcpy() has no permission to copy
anything less than the exact count of bytes that it was given (except
when the behavior is undefined). It could have been defined to return 0
if either pointer argument were null, but there seems little point in that.

A more useful return value for memcpy(dest, src, count) would be
(void*)((char*)dest + count). A naive C-based implementation of memcpy()
calculates that value automatically as a side effect of doing its job; a
clever compiler might arrange, at no extra cost, for it to already be
residing in the same register used for returning small scalar values
from a function. The same might also be true even of more efficient
implementations using assembly language.
That value could be useful when doing a long series of memcpy() calls
into consecutive memory locations. A similar argument applies to several
of the str*() functions. However, it's too late now for changes like
that to the C standard library.
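To make the "long series of memcpy() calls" case concrete, here is a minimal sketch. The names copy_advance() and join_key_value() are hypothetical, invented for illustration; they are not part of any standard library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the standard library: copies count
   bytes like memcpy(), but returns a pointer just past the last byte
   written instead of returning dest. */
static void *copy_advance(void *dest, const void *src, size_t count)
{
    memcpy(dest, src, count);
    return (char *)dest + count;
}

/* Builds "key=value" in buf with a chain of consecutive copies.
   The caller must supply a buffer with room for both strings, the '='
   and the terminating null character. Returns the resulting length. */
static size_t join_key_value(char *buf, const char *key, const char *value)
{
    char *p = buf;
    p = copy_advance(p, key, strlen(key));
    p = copy_advance(p, "=", 1);
    p = copy_advance(p, value, strlen(value) + 1); /* + 1 copies the '\0' */
    return (size_t)(p - buf) - 1;                  /* exclude the '\0' */
}
```

With plain memcpy() each step would need a separate running offset; here the returned pointer is the next destination.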

James Kuyper · Oct 28, 2011

What would you expect strcmp("", NULL) to return?

I wouldn't recommend such a change, but if it were made, it's often not
that difficult to come up with reasonable definitions of the behavior in
the cases that are currently undefined. In this case, the following
would be one plausible approach:

int strcmp(const char *s1, const char *s2)
{
    if(s1 && s2)
        return old_strcmp(s1, s2);
    return !!s1 - !!s2;
}

Keith Thompson · Oct 28, 2011

Mark Bluemel said:
That's one point of view. Another point of view is that code should
abide by the contract of the API and pass the appropriate arguments to
functions.

For example, my man page for strcmp states "The strcmp() function
compares the two strings s1 and s2" - a NULL pointer isn't a string, so
why should the function deal with it?

Mine says the same thing -- and it's wrong. It compares the strings
*pointed to by* s1 and s2. (Yes, the meaning is clear enough,
but it's potentially misleading to inexperienced C programmers who
might think that strings are really pointers.)

Keith Thompson · Oct 28, 2011

gaoqiang said:
I just got the idea that a NULL pointer is no different from any other
invalid pointer. If a function checks for a NULL pointer, then why not
for other invalid pointers, say void *p = 0x01?

Because it is in general not possible to determine whether a given
pointer is valid or not.

It's the caller's responsibility to pass a valid pointer.

Kaz Kylheku · Oct 28, 2011

What would you expect strcmp("", NULL) to return?

If this combination of inputs were to be defined, I would expect the null
pointer not to compare equal to any string, since it isn't one.

The remaining question, then, is where the null pointer lands in a
lexicographic comparison against strings. I would place it ahead of
strings, so that if you have a null mixed up among a set of strings, it
makes itself obvious by taking the top position in a sorted list.

Furthermore, strcmp(NULL, NULL) ought to return zero, since it is astonishing
if a thing is not deemed equal to itself (reflexive property of the equivalence
relation).
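As a rough sketch of that ordering, a null-tolerant comparison can be plugged into qsort() so that any null pointer ends up at the top of the sorted array. The names strcmp_nullfirst(), compare_entries() and sort_names() are hypothetical; nothing like this is defined by the C standard, where passing a null pointer to strcmp() remains undefined behavior.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical null-tolerant comparison (the name is illustrative):
   a null pointer sorts ahead of every string and compares equal only
   to another null pointer. */
static int strcmp_nullfirst(const char *s1, const char *s2)
{
    if (s1 && s2)
        return strcmp(s1, s2);
    return (s1 != NULL) - (s2 != NULL);
}

/* qsort() comparator for an array of const char * elements. */
static int compare_entries(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;
    return strcmp_nullfirst(*pa, *pb);
}

/* Sorts an array of n string pointers, some of which may be null. */
static void sort_names(const char **names, size_t n)
{
    qsort(names, n, sizeof names[0], compare_entries);
}
```

Sorting { "pear", NULL, "", "apple" } this way puts the null pointer first, followed by the empty string, "apple" and "pear".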

Keith Thompson · Oct 28, 2011

Ben Bacarisse said:
That does not match my recollection. What incarnation of C had this
property, and when was it changed?

<snip>

I know that *some* C implementations had that property. I've heard that
porting from VAX to SPARC exposed a lot of bugs in code that assumed it
could dereference a null pointer and get zero. But I haven't heard that
the *first* C implementation did this.

Oh, and a null pointer and the empty string have never been the same
thing, because *strings are not pointers*.

Ben Pfaff · Oct 28, 2011

James Kuyper said:
A more useful return value for memcpy(dest, src, count) would be
(void*)((char*)dest + count). A naive C-based implementation of memcpy()
calculates that value automatically as a side effect of doing its job; a
clever compiler might arrange, at no extra cost, for it to already be
residing in the same register used for returning small scalar values
from a function. The same might also be true even of more efficient
implementations using assembly language.
That value could be useful when doing a long series of memcpy() calls
into consecutive memory locations. A similar argument applies to several
of the str*() functions. However, it's too late now for changes like
that to the C standard library.

There's a GNU extension named mempcpy() that has the return value
semantics that you suggest might be useful for memcpy().

POSIX 2008 standardized a common C library extension called
stpcpy(), that acts like strcpy() except that it returns a
pointer to the null terminator byte written to the destination
string.
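A small sketch of how stpcpy() chains, assuming a POSIX.1-2008 system that declares it in <string.h>. The helper name join_strings() is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* request the POSIX.1-2008 declarations */
#include <string.h>

/* Joins two strings into buf. The caller must supply a buffer large
   enough for both strings plus the terminating null character.
   Returns the length of the result. */
static size_t join_strings(char *buf, const char *first, const char *second)
{
    char *end = stpcpy(buf, first);  /* end points at the '\0' just written */
    end = stpcpy(end, second);       /* so the next copy overwrites it */
    return (size_t)(end - buf);
}
```

Because each call returns the position of the terminator, no strlen() or strcat() rescan of the destination is needed between copies.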

jacob navia · Oct 28, 2011

On 28/10/11 22:40, Nobody wrote:

What would you expect strcmp("", NULL) to return?

I would return a previously defined error value
(implementation defined) like INT_MIN+20 for instance.

Furthermore errno should be set to EINVAL

Ben Bacarisse · Oct 28, 2011

Kaz Kylheku said:
Furthermore, strcmp(NULL, NULL) ought to return zero, since it is
astonishing if a thing is not deemed equal to itself (reflexive
property of the equivalence relation).

There is a fine tradition that says otherwise. Even in C, x == x is
not always 1 (e.g. when x is NaN).
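A minimal illustration of the NaN case, assuming an implementation with IEEE 754 (IEC 60559) floating point that defines the NAN macro in <math.h> and is not compiled with value-changing options such as -ffast-math. The function names are invented for the example.

```c
#include <math.h>

/* Returns 1 if x compares equal to itself, 0 otherwise. */
static int equals_itself(double x)
{
    return x == x;
}

/* Assumes the implementation defines NAN, i.e. supports quiet NaNs. */
static int nan_equals_itself(void)
{
    double x = NAN;
    return equals_itself(x);
}
```

Under those assumptions equals_itself(1.0) yields 1 while nan_equals_itself() yields 0.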

Kaz Kylheku · Oct 28, 2011

I just got the idea that a NULL pointer is no different from any other
invalid pointer.

That is false. A null pointer is a valid value. It is not valid for
the purposes of dereferencing.

A portable program can initialize a pointer to null: it is not a bad value.

Other kinds of invalid pointers are not only invalid pointers, but invalid
values. Portable programs cannot use such pointers in any way; even just
creating one and assigning it to a variable is undefined behavior.

Keith Thompson · Oct 28, 2011

Though it's not required to do so. The behavior is undefined; a seg
fault is not required.

When you use a function such as memcpy, you often have to check that each
operand is in fact not null:

if (A && B && N) memcpy(A, B, N);

when A, B, N are themselves arguments to your code, for example. Wouldn't it
be a lot easier to just write:

memcpy(A, B, N);

and know that no copying is done when any argument is null or zero? Having a
null source or destination is not necessarily an error (and if it is, you
check it conventionally).

One could make exactly the same argument for a simple assignment
(without the N, of course). Perhaps it would be simpler to be able to
write:

*A = *B;

and have it quietly do nothing if either A or B is a null pointer. But
as others have said in parallel followups, when you're writing a
statement that copies data, you should generally have enough awareness
of the context to *know* that there's actually something to be copied.

Do you actually find yourself making such checks in your own code?

There are a nearly unlimited number of ways you can call memcpy()
incorrectly. You can try to copy too few or too many bytes. You can
pass invalid non-null pointers. You can reverse the first and second
arguments. You can pass perfectly valid pointers that just don't happen
to point to the objects you really wanted to copy.

The question is, would it really be worthwhile for memcpy() to detect
just a few kinds of errors (when other equally serious errors are
undetectable)?

How often would having memcpy() check for null pointers make it easier
to write correct code *in practice*?
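For reference, the check under discussion could be written once as a wrapper instead of being built into memcpy(). This is only a sketch: memcpy_if_nonnull() is a hypothetical name, not an existing library function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper (not a standard function): copies n bytes only
   when both pointers are non-null and n is nonzero. Returns dest
   unchanged, like memcpy(). It cannot detect any other misuse, such as
   a wrong count, swapped arguments, overlapping regions or an invalid
   non-null pointer. */
static void *memcpy_if_nonnull(void *dest, const void *src, size_t n)
{
    if (dest && src && n)
        memcpy(dest, src, n);
    return dest;
}
```

Code that really does expect null arguments can call the wrapper, while every other caller keeps the unchecked memcpy().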

[...]

Keith Thompson · Oct 28, 2011

Kaz Kylheku said:
If this combination of inputs were to be defined, I would expect the null
pointer not to compare equal to any string, since it isn't one.

The remaining question, then, is where the null pointer lands in a
lexicographic comparison against strings. I would place it ahead of
strings, so that if you have a null mixed up among a set of strings, it
makes itself obvious by taking the top position in a sorted list.

Furthermore, strcmp(NULL, NULL) ought to return zero, since it is
astonishing if a thing is not deemed equal to itself (reflexive
property of the equivalence relation).

One could make a case that a null pointer is Not A String, in a similar
sense to the way NaN is Not A Number. NaN compares unequal to itself.

Defining *some* consistent set of behavior wouldn't be difficult.
Getting everyone to agree that it's the right definition would be
impossible. (And leaving things the way they are is, IMHO, the best
approach.)

Kaz Kylheku · Oct 28, 2011

On 28/10/11 22:40, Nobody wrote:

I would return a previously defined error value
(implementation defined) like INT_MIN+20 for instance.

This is neither a good debugging aid for those who regard the above to be a
bug, nor a very meaningful extension of behavior for those who don't.

Richard Damon · Oct 28, 2011

Trapping null to a segfault requires extra circuitry which complicates the
processor and introduces performance penalties (which are difficult to analyze,
since they have to do with the access pattern to pages, and the replacement
strategy of entries in the translation cache.)

The trapping of the null pointer comes out basically for free from the
segment/virtual memory processing that allows those computers to run
multiple programs with virtual memory. At most it costs a single if in
the OS to see whether the fault on an unmapped page occurred for a page
that should be added to the program's address space or not.

Keith Thompson · Oct 29, 2011

Keith Thompson said:
I know that *some* C implementations had that property. I've heard that
porting from VAX to SPARC exposed a lot of bugs in code that assumed it
could dereference a null pointer and get zero. But I haven't heard that
the *first* C implementation did this.

[...]

It was probably VAX to Motorola 68k, not SPARC.

Kaz Kylheku · Oct 29, 2011

One could make a case that a null pointer is Not A String, in a similar
sense to the way NaN is Not A Number. NaN compares unequal to itself.

In ISO 9899:1999 there is an absence of a requirement that a NaN compares equal
to itself, which isn't the same thing at all.

Keith Thompson · Oct 29, 2011

Kaz Kylheku said:
In ISO 9899:1999 there is an absence of a requirement that a NaN
compares equal to itself, which isn't the same thing at all.

6.5.9 just says that the == operator "yields 1 if the specified relation
is true and 0 if it is false"; it doesn't say just what that means.
It's not entirely clear from that that 1.0 == 1.0 must yield 1.

Annex F (which is optional) says that "The relational and equality
operators provide IEC 60559 comparisons." I'm not entirely sure what
that means, but I guess it requires a NaN to compare unequal to itself.

In any case, my point was that a NaN *conventionally* compares unequal
to itself, and this *could* be used as a precedent to argue that
strcmp(NULL, NULL) might reasonably return a result denoting inequality.

And again, I'm not advocating any such thing.

Ben Bacarisse · Oct 29, 2011

Keith Thompson said:
6.5.9 just says that the == operator "yields 1 if the specified relation
is true and 0 if it is false"; it doesn't say just what that means.
It's not entirely clear from that that 1.0 == 1.0 must yield 1.

Annex F (which is optional) says that "The relational and equality
operators provide IEC 60559 comparisons." I'm not entirely sure what
that means, but I guess it requires a NaN to compare unequal to
itself.

Also, 6.2.6.1 p4 says (in part):

"Two values (other than NaNs) with the same object representation
compare equal..."

I think that's what Kaz is referring to by the "absence of a
requirement that a NaN compares equal to itself".

Kaz Kylheku · Oct 29, 2011

Also, 6.2.6.1 p4 says (in part):

"Two values (other than NaNs) with the same object representation
compare equal..."

I think that's what Kaz is referring to by the "absence of a
requirement that a NaN compares equal to itself".

Well, that would be one place where the requirement isn't made.
The other places would be ... all other sentences of the document.

It does appear that IEEE 754 specifies this lunacy of a NaN comparing
not equal with itself, however.

Jorgen Grahn · Oct 29, 2011

....

It's really a bit different, because it's the only invalid pointer
which you typically *do* want in your program.

What makes you think that location 0x01 is universally invalid?

It's not, but for practical reasons it's /almost/ universally invalid.
If you want to crash in order to detect bugs like

p=NULL; *p = 42;

then you probably want the same protection against

q=NULL; q->foo = 42;

I bet most systems try to avoid having data in the low address space
for this reason.

/Jorgen
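To illustrate why a single unmapped address at 0 would not be enough, the sketch below computes the offset that q->foo adds to q. It assumes a typical flat address space where a null pointer is represented as address 0. struct record and foo_offset() are hypothetical, and the code deliberately never dereferences a null pointer.

```c
#include <stddef.h>

/* Hypothetical structure, used only to illustrate member offsets. */
struct record {
    char name[64];
    int foo;
};

/* Byte offset, from the start of a struct record, of the member that
   q->foo would access. For a null q on a typical flat-address machine,
   the faulting address is this small number rather than 0. */
static size_t foo_offset(void)
{
    return offsetof(struct record, foo);
}
```

Here the offset is at least 64, so catching q->foo = 42 needs the whole low range of addresses to be unmapped, not just location 0.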
