strings and NULL argument passing

sanjay

Hi,

I have a question about passing values to a function that accepts a string.

======================================
#include <iostream>
#include <string>
using namespace std;

int main()
{
    void print(const string&);
    print("hi");
    print(NULL);
    return 0;
}
void print(const string& s)
{
    cout<<"s is "<<s<<endl;
}
======================================

The above program compiles successfully but fails at run time because
NULL is passed as argument in the second call to print.

Why doesn't the compiler give an error on passing of NULL value?

How could we check for such arguments in our program when we are using
strings?

Regards
Sanjay Raghani
 
Fred Zwarts

sanjay said:
Hi,

I have a question about passing values to a function that accepts a string.

======================================
#include <iostream>
#include <string>
using namespace std;

int main()
{
    void print(const string&);
    print("hi");
    print(NULL);
    return 0;
}
void print(const string& s)
{
    cout<<"s is "<<s<<endl;
}
======================================

The above program compiles successfully but fails at run time because
NULL is passed as argument in the second call to print.

I don't think so. Did you check it?
You can check it by printing the address of s in the function print.
I think that the error occurs before print is called.
The compiler tries to create a temporary string and passes the address of that string to print.
print ("hi"); is compiled as print (string ("hi"));.
Depending on your definition of NULL,
print (NULL); may be compiled as print (string (NULL));.
This may fail during the creation of the temporary, not in the function print.
Why doesn't the compiler give an error on passing of NULL value?

How could we check for such arguments in our program when we are using
strings?

You can't. The error occurs before your function starts.
You should check the argument earlier.
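The ordering described above (temporary first, function body second) can be made visible with a small sketch. Text and events are hypothetical names introduced only for this illustration; Text stands in for std::string so that its converting constructor can log when it runs.

```cpp
#include <string>
#include <vector>

// Hypothetical log of what ran, in order; not part of the original program.
std::vector<std::string> events;

// Hypothetical stand-in for std::string with a converting constructor.
struct Text {
    Text(const char* s) : value(s) { events.push_back("Text constructed"); }
    std::string value;
};

void print(const Text& t) {
    events.push_back("print entered");
    (void)t;  // the value itself is not needed for this illustration
}

// Calls print with a literal; the compiler inserts the conversion to Text.
void demo() {
    print("hi");  // treated as print(Text("hi"))
}
```

After demo() runs, the log holds the constructor entry before the print entry, which is why a check inside print comes too late for a bad argument.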
 
Rolf Magnus

sanjay said:
======================================
#include <iostream>
#include <string>
using namespace std;

int main()
{
    void print(const string&);
    print("hi");
    print(NULL);
    return 0;
}
void print(const string& s)
{
    cout<<"s is "<<s<<endl;
}
======================================

The above program compiles successfully but fails at run time because
NULL is passed as argument in the second call to print.

Why doesn't the compiler give an error on passing of NULL value?

Well, semantically, NULL can be converted to std::string, just as "hi" can.
How could the compiler know that NULL isn't a valid value?
How could we check for such arguments in our program when we are using
strings?

I don't see a way to do that, except for providing an overload of the
function for const char*. The problem is that the error already manifests
before print is even entered, in the constructor of std::string.
BTW: The implementation I use throws an exception of type std::logic_error
in this case, but I don't think that this is required by the standard.
 
sanjay

Well, semantically, NULL can be converted to std::string, just as "hi" can.
How could the compiler know that NULL isn't a valid value?


I don't see a way to do that, except for providing an overload of the
function for const char*. The problem is that the error already manifests
before print is even entered, in the constructor of std::string.
BTW: The implementation I use throws an exception of type std::logic_error
in this case, but I don't think that this is required by the standard.

Hi Rolf,

Thanks for the reply..

Can you elaborate a bit more on where exactly the exception is thrown
in such a situation?

Regards
Sanjay Raghani
 
Bo Persson

sanjay said:
Hi Rolf,

Thanks for the reply..

Can you elaborate a bit more on where exactly the exception is thrown
in such a situation?

The constructor for std::string might do that.

For performance reasons, the std::string constructor taking a const
char* is NOT required to verify that the pointer is not null. If it
doesn't, bad things will happen when it tries to determine the length
of the (non-existent) char sequence pointed to.



Bo Persson
 
Juha Nieminen

sanjay said:
Why doesn't the compiler give an error on passing of NULL value?

Short answer: Because std::string has a constructor which takes a
const char* as parameter, and a null pointer is a perfectly valid
pointer for it. Technically speaking the compiler cannot know if that
constructor handles a null pointer correctly or not, so it has no reason
to issue any error.
How could we check for such arguments in our program when we are using
strings?

Make a version of print() which takes a const char* as parameter and
performs the check. If a non-null pointer is passed, it simply calls the
print() taking a std::string as parameter, else it performs whatever
error termination you want (e.g. assert()).
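A minimal sketch of the overload suggested above, assuming C++11 or later for nullptr. Throwing std::invalid_argument is only one possible reaction, chosen here for illustration; an assert() would fit the same spot.

```cpp
#include <iostream>
#include <stdexcept>
#include <string>

// The original function: prints a standard string.
void print(const std::string& s) {
    std::cout << "s is " << s << '\n';
}

// Overload that intercepts C strings (and NULL) before any std::string is built.
void print(const char* s) {
    if (s == nullptr) {
        throw std::invalid_argument("print: null pointer passed");
    }
    print(std::string(s));  // explicit conversion, so this calls the overload above
}
```

With both overloads visible, print("hi") and print(NULL) select the const char* version, because a pointer conversion is preferred over the user-defined conversion to std::string.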
 
James Kanze

Well, semantically, NULL can be converted to std::string, just
as "hi" can. How could the compiler know that NULL isn't a
valid value?

Maybe we understand "semantically" differently, but I would say
that his problem is precisely that NULL can't be semantically
converted to an std::string. His code only compiles because it
is syntactically correct; he provides a value which can be
converted to type char const*, and that's all the compiler
checks. Semantically, of course, NULL isn't a string, in any
sense of the word, and it doesn't make sense to try to convert
it to a string. (In C, one might use it to indicate the absence
of a string; std::string doesn't support that concept, however.)
I don't see a way to do that, except for providing an overload
of the function for const char*.

An implementation of std::string could easily cause it to
trigger a compiler error; just provide a private constructor
which takes some other type of pointer (which would make
std::string(NULL) ambiguous).
The problem is that the error already manifests before print
is even entered, in the constructor of std::string. BTW: The
implementation I use throws an exception of type
std::logic_error in this case, but I don't think that this is
required by the standard.

It's undefined behavior. With a good implementation of
std::string, it won't compile. (Regretfully, I don't know of
any implementations which are good by that definition:).
 
James Kanze

[...]
Overloading for char const* is a fine option.  Using a string
type that performs the run-time check is also OK.  The problem
with either of those approaches is that it imposes the
run-time check, even for C-style string literals (e.g. "hi")
whose type cannot be null, but which decay to the pointer
type.

Compared to the rest of what the constructor has to do, I rather
suspect that the run-time cost of checking isn't measurable.
A function template can be defined to avoid the overhead of
the check for character array literals.  Another, "catch-all"
function template can generate a compile-time error for any
other argument type that could otherwise be inadvertently
converted to a string.
#include <iostream>
#include <string>
/* Print a standard string. */
void print(std::string const& s) {
     std::cout << "s is " << s << '\n';
}
/* Print a C-style string literal. */
template<std::size_t Size>
void print(char const (&c_str)[Size]) {
     print(std::string( c_str ));

Or better yet:
print( std::string( c_str, Size - 1 ) ) ;

No need to count the characters if you already know how many
there are.

Of course, this fails if the string literal was "a\0b", or
something of the sort. It also doesn't work (but nor does your
suggestion) when interfacing with C (where all you've got is a
char const*).
/* Generate a compile time error for unacceptable types. */
template<typename String>
void print(String const& s) {
     s.is_not_of_an_acceptable_string_type();
}

As pointed out earlier, this trick (with some adaptation) could be
used directly in std::string.

It's not a panacea, however. You really do have to support
constructing strings from char const*, which can be a null
pointer, even if it isn't a literal. (Of course, it can also be
an invalid pointer, and there's no way you can check for that.)
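The caveat about "a\0b" can be checked with a short sketch. from_literal and from_pointer are hypothetical helper names introduced here; they return the string instead of printing it so that the two results can be compared.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: build a string from a literal using the array bound.
template <std::size_t Size>
std::string from_literal(char const (&c_str)[Size]) {
    return std::string(c_str, Size - 1);  // Size counts the trailing '\0'
}

// Hypothetical helper: build a string the usual way, scanning for '\0'.
std::string from_pointer(char const* c_str) {
    return std::string(c_str);
}
```

For an ordinary literal the two agree. For "a\0b" the array-bound version keeps all three characters, including the embedded '\0', while the pointer version stops after the first one, so the optimization changes the result for such literals.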
 
James Kanze

"assert" means "I know this is true."  It's enforced
documentation, not a general-purpose way to terminate a
program.
Anyway, there's nothing here that implies termination.  An
exception can be thrown, or the null pointer can just be
accepted as an empty string, or a C-style error code can be
returned.  In my experience, the exception is usually the
right way to go.

A null pointer is not a string (in the general sense), nor is it
something which can be converted into a string. If his
interface requires a string, then passing it a null pointer
should cause an assertion failure. If his interface supports
the idea of a nullable string (e.g. something you might get when
reading a VARCHAR field from a database), then it has to support
something more than std::string.
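One way to express "something more than std::string" is sketched below, assuming C++17 for std::optional. The names to_nullable and describe are hypothetical; the thread itself does not prescribe any particular type.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: convert a possibly-null C string (e.g. a database
// field) into a value that can represent "no string at all".
std::optional<std::string> to_nullable(const char* s) {
    if (s == nullptr) {
        return std::nullopt;
    }
    return std::string(s);
}

// Hypothetical consumer that distinguishes NULL from the empty string.
std::string describe(const std::optional<std::string>& field) {
    return field ? "value: '" + *field + "'" : std::string("no value");
}
```

The point of the design is that an empty string and an absent string stay distinguishable, which a plain std::string cannot offer.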
 
Rolf Magnus

Jeff said:
[snipped code that (oops) initialized a std::string from NULL]
Compared to the rest of what the constructor has to do, I rather
suspect that the run-time cost of checking isn't measurable.

Which constructor? std::string?

Yes. The memory allocation alone would outweigh that simple check by far.
If the OP's std::string isn't detecting null initializers, the decision
apparently was made by the implementor that the check was expensive
enough to avoid.

The problem is that the C++ standard doesn't require it, so I guess some
library implementor could think that there is no point in doing such a
check, since the user can't rely on it anyway.
 
Rolf Magnus

James Kanze wrote:

Maybe we understand "semantically" differently, but I would say
that his problem is precisely that NULL can't be semantically
converted to an std::string. His code only compiles because it
is syntactically correct;

Yes, I guess you're right.
An implementation of std::string could easily cause it to
trigger a compiler error; just provide a private constructor
which takes some other type of pointer (which would make
std::string(NULL) ambiguous).

What I meant was that I see no way without altering standard headers.
It's undefined behavior. With a good implementation of
std::string, it won't compile.

Would it actually be allowed by the standard to have an additional
constructor in std::string?
 
James Kanze

[snipped code that (oops) initialized a std::string from NULL]
Compared to the rest of what the constructor has to do, I
rather suspect that the run-time cost of checking isn't
measurable.
Which constructor? std::string? If the OP's std::string
isn't detecting null initializers, the decision apparently was
made by the implementor that the check was expensive enough to
avoid.

Or that it wasn't necessary, because the underlying system would
take care of it in the call to strlen (which dereferences the
pointer). If the system you're running on guarantees a core
dump or its equivalent in the case of a dereferenced null
pointer, you've got the check, even if you didn't want it:).
If the library is only designed for use on Windows and Unix
based systems, there's no point in doing more.
It may also be possible (and is, in the OP's case) to avoid
the std::string altogether, by working directly with the
c-style string.

That's a different issue; if profiling shows that he's spending
too much time constructing the string, such an alternative
should surely be considered.
/* Print a C-style string literal. */
template<std::size_t Size>
void print(char const (&c_str)[Size]) {
print(std::string( c_str ));
Or better yet:
print( std::string( c_str, Size - 1 ) ) ;
No need to count the characters if you already know how many
there are.
Nice catch.

Except as I go on to point out, it doesn't work:-(.
The template cannot even be declared in a C header, so it is
clearly not meant to be a C-language interface function. I'm
not sure why you bring that up; I don't see it as relevant
here. If the function is to be callable from C, it also
cannot be overloaded for char const* and std::string, since it
must be declared extern "C". This is possible only in C++.

I didn't mean that the function itself would be called from C.
I was wondering about the more general issue---how you handle a
string in the form of a char const* which you got from a C
interface.

Of course, the simplest and the safest is just to convert it to
an std::string immediately. So you're right that my comments
really aren't that relevant. Except that such interfaces could
easily be a source of null pointers.
It is only (intended to be) an optimization of the run-time
code.

I thought that the original problem was to catch NULL pointers,
so that they wouldn't be used to create an std::string.
I don't have to support any such thing. Of course, if the
client writes something like print(std::string(0)), there's
not much I can do. There is a definite trade-off between
convenience and safety.

I was considering std::string, not this particular function.
I don't know of a fool-proof way, but you can sometimes detect
nonsense if you control the memory allocation. You can check
that the pointer value is within (or outside) some range. You
can also catch a bunch of bugs by stomping on all deallocated
memory with a magic byte pattern (I like 0xDeadBeef) and
checking for that byte pattern in the addressed memory.

Certainly. One's debugging operator new and operator delete
already take care of much of that. And potentially could do
more; under Solaris or Linux, for example, there are a number of
additional checks you could make on a pointer; something like:

bool
isValid( void const* userPtr )
{
    extern int end ;
    int stack ;
    return (char*)userPtr < (char*)&end         // static
        || (char*)userPtr > (char*)(&stack)     // stack
        || MemoryChecker::isAllocated( userPtr ) ;
}

The MemoryChecker::isAllocated can be as simple or as complex as
you want. (For the simplest solution, just drop this term, and
replace "(char*)&end" with "(char*)sbrk()" in the first term.
But you should be able to do better than that with a specialized
function, even at reasonable cost.)

I believe that similar system dependent solutions are possible
on most systems. (And this solution doesn't work in a
multithreaded environment, where you have more than one stack.)

You could (and probably should?) make the isValid function a
template, and verify pointer alignment as well.

In the end, of course, it's probably simpler and more effective
to just use Purify, valgrind or something similar. (In the
past, I developed a lot of such solutions, because Purify was
the only existing product, and it's expensive. Today, I doubt
that I'd bother, but I continue to use my older tools because my
test harnesses are built around them.)
 
James Kanze

James Kanze wrote:
Would it actually be allowed by the standard to have an
additional constructor in std::string?

That's a good question. If it doesn't affect the overload
resolution of legal calls to the constructor, I think so, if
only under the as if rule. Thus, if as a library implementor, I
do something like:

namespace std {
    template< ... >
    class basic_string
    {
        // ...
    private:
        struct _Hidden {} ;
        basic_string( int _Hidden::*,
                      Allocator const& = Allocator() ) ;
    } ;
}

can a legal program detect the presence of the additional
constructor?

(One issue might be whether the following program is legal:

#include <string>

void
f( bool t )
{
    if ( t ) {
        std::string s(0) ;
    }
}

int
main()
{
    f( false ) ;
}

)

In practice, on thinking about it, I'm not sure that it's worth
the effort. It only catches the case where you initialize with a
null pointer constant (0, NULL or the like), which is, one would
hope, pretty rare. You still need a run-time check (generally
provided directly by the hardware) in case of a variable which
happens to contain a null pointer.
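The hidden-constructor idea can be tried without touching the standard headers by applying it to a user-defined class. CheckedString below is a hypothetical wrapper written for this sketch (C++11 assumed for static_assert and nullptr); it is not how any shipping std::string is implemented.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical string wrapper; std::string itself is left untouched.
class CheckedString {
public:
    CheckedString(const char* s) : value_(s) {}
    const std::string& str() const { return value_; }

private:
    struct Hidden {};
    // Never defined: exists only so a null pointer constant matches two
    // constructors equally well, which makes the call ambiguous.
    CheckedString(int Hidden::*);

    std::string value_;
};

// A real pointer still converts; a null pointer constant does not.
static_assert(std::is_constructible<CheckedString, const char*>::value,
              "construction from a C string must keep working");
static_assert(!std::is_constructible<CheckedString, std::nullptr_t>::value,
              "construction from nullptr should be rejected");
```

As noted above, this only rejects null pointer constants at compile time. A const char* variable that happens to be null at run time still reaches the public constructor.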
 
James Kanze

That does not follow. I consider it an abuse of assertions to
use them as detectors of contract violation. Assertions are
often appropriate for post-conditions, but rarely for
pre-conditions.

Assertions are useful for detecting programming errors.
Violation of a pre-condition is a programming error.
Exceptions should, in my opinion, not be part of the interface
definition of functions; exceptions are best reserved for
error-reporting, and that specifically includes run-time contract
violations.

I agree with the middle clause: exceptions are best reserved for
error reporting. Which means that I disagree with the other two
parts: error reporting is a vital part of the interface
definition of a function, and run-time contract violations are
programming errors: "impossible" conditions (in a correct
program) not covered by the interface, and not reported as
"errors".
In the case at hand, std::invalid_argument (or a derivative)
seems obviously to be the best choice.

If the contract says so. The contract can specify many things:

-- The caller is not allowed to pass a null pointer. Doing so
violates the contract, which results in "undefined
behavior"---an assertion failure, unless performance
considerations deem otherwise.

-- The caller is allowed to pass a null pointer, and is
guaranteed a specific type of exception. I'd consider this
case fairly rare, but there are probably cases where it is
reasonable.

-- The caller is allowed to pass a null pointer, which the
function maps into a specific string, e.g. "" or
"<<NULL>>", or whatever.

In general (and there are exceptions), a programming error
should result in the fastest and most abrupt termination of the
program possible.
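The three contracts listed above might look like this in code. This is a sketch only: the function names are hypothetical, C++11 is assumed, and which variant is right depends on what the interface documents.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Contract 1: null is forbidden; a violation is a programming error.
std::string require_string(const char* s) {
    assert(s != nullptr && "caller must not pass a null pointer");
    return std::string(s);
}

// Contract 2: null is allowed and reported with a documented exception.
std::string string_or_throw(const char* s) {
    if (s == nullptr) {
        throw std::invalid_argument("null pointer passed as string");
    }
    return std::string(s);
}

// Contract 3: null is allowed and mapped to a fixed placeholder.
std::string string_or_placeholder(const char* s) {
    return s != nullptr ? std::string(s) : std::string("<<NULL>>");
}
```

Note that the assert in the first variant disappears when NDEBUG is defined, which corresponds to the "unless performance considerations deem otherwise" clause of that contract.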
 
