To assert or not to assert...

ImpalerCore

I stumbled across a couple assert threads from a while ago. I seem to
have a hard time figuring out how to use assert properly, and I go
back and forth over how best to represent errors from a library's
perspective. From what I've gathered, assert is intended to catch
programming errors, but I don't really know what that means in the
context when writing a library.

There are three main errors that I try to handle.

1. Memory allocation faults.

These faults occur when malloc, realloc, or calloc fails. There
are several ways this can be communicated to the user. For
functions that allocate structures, the return value is either
the allocated object on success or NULL if allocation
failed. The caller of the function is responsible for handling the
result.

\code snippet
c_array_t* numbers = NULL;

numbers = c_array_new( int, 20 );
if ( !numbers )
{
/* User specified method to handle error. */
fprintf( stderr, "c_array_new: failed to allocate array 'numbers'!\n" );
return EXIT_FAILURE;
}

....
\endcode

In this scenario, I know that I definitely don't want to assert on a
memory fault at the library level, because in general the user may be
able to recover from the situation, and I want to give him that
chance.

However, there are other functions that may invoke a buffer expansion
as part of the operation, requiring a call to malloc or realloc that
may fail. This needs to be communicated back to the user somehow.
Here is an example.

void gc_array_push_back( c_array_t* array, void* object, size_t type_sz );
#define c_array_push_back( array, object, type ) \
(gc_array_push_back( (array), (object), sizeof( type ) ))

\question
Should the internals of 'gc_array_push_back' use assert to check for
null array pointers?

Consider two possible methods to verify the array parameter.

void gc_array_push_back( c_array_t* array, void* object, size_t type_sz )
{
/* variable decls */

assert( array != NULL );

/* push_back code */
}

vs

void gc_array_push_back( c_array_t* array, void* object, size_t type_sz )
{
/* variable decls */

if ( array )
{
/* push_back code */
}
}
\endquestion

Most of the time, an array NULL pointer is likely an error, so assert
would help catch programming errors. However, I can't guarantee that
there isn't a user that would have a valid use case for having a null
array pointer, perhaps if it's considered as an empty array in another
structure. So in this case, the 'if ( array )' construct
feels right.

Another question within the same context of the 'c_array_push_back'
function is how to communicate an allocation failure if the array
needs more memory, but the allocation fails. At the minimum, I want
to make sure that I don't lose any of the contents of the array
buffer, so that the array contents remain the same. There seem to be
several ways to communicate an allocation error in this case.

A. Modify 'gc_array_push_back' to respond with an error code via the
return value (or as an additional argument to the function, but I
really prefer not to use that style of interface).

This is a common scheme that is used and recommended often, and for
good reason. The main drawback is that sometimes there is competition
for the return value slot (not in the gc_array_push_back function's
case since it returns void, but in other functions). If I want to
carry the error code and something else, I need to write a custom
struct and it complicates the interface somewhat.

struct c_array_return
{
<data_type> data;
int error;
};
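A minimal sketch of option A, assuming a simplified int-only array. The names `demo_array`, `demo_push_back`, and the `DEMO_*` codes are hypothetical and not part of the library discussed here. The growth step goes through a temporary pointer, so a failed realloc leaves the existing contents untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical error codes for this sketch. */
enum { DEMO_OK = 0, DEMO_EINVAL = -1, DEMO_ENOMEM = -2 };

/* Hypothetical, simplified stand-in for c_array_t (ints only). */
typedef struct
{
    int*   buffer;
    size_t size;      /* elements in use */
    size_t capacity;  /* elements allocated */
} demo_array;

/* Append 'value'. On any failure the array is left exactly as it was. */
int demo_push_back( demo_array* array, int value )
{
    if ( !array )
        return DEMO_EINVAL;

    if ( array->size == array->capacity )
    {
        const size_t max_elems = (size_t)-1 / sizeof *array->buffer;
        size_t new_cap;
        int* grown;

        /* Refuse to grow if doubling would overflow the byte count. */
        if ( array->capacity > max_elems / 2 )
            return DEMO_ENOMEM;
        new_cap = array->capacity ? array->capacity * 2 : 4;

        /* Keep the old pointer until realloc is known to have succeeded. */
        grown = realloc( array->buffer, new_cap * sizeof *grown );
        if ( !grown )
            return DEMO_ENOMEM;

        array->buffer = grown;
        array->capacity = new_cap;
    }

    array->buffer[array->size++] = value;
    return DEMO_OK;
}
```

The caller checks the return value after each call and decides whether to retry, save, or give up; the array is still valid and unchanged after a DEMO_ENOMEM result.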

The other issue is that there can be conflicts between interpretations
of the return value. For example, consider making a copy of an array.

\code snippet
c_array_t* c_array_copy( c_array_t* array )
{
c_array_t* copy = NULL;

if ( array )
{
/* copy array */
}

return copy;
}
\endcode

If I allow arrays that are NULL pointers, then checking the return
value of c_array_copy for NULL is troublesome because I can't
distinguish whether the parameter 'array' is NULL or if the allocation
failed, which may be an important distinction.
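One way to separate the two cases, sketched here with hypothetical names (`demo_buf`, `demo_buf_copy`, `demo_copy_status`), is to report the reason through an optional status argument. This is the extra-argument style mentioned under option A, so it is shown only to illustrate the distinction, not as a recommendation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes: why the copy returned NULL, if it did. */
typedef enum
{
    DEMO_COPY_OK,
    DEMO_COPY_NULL_INPUT,
    DEMO_COPY_NOMEM
} demo_copy_status;

/* Hypothetical, simplified stand-in for c_array_t (ints only). */
typedef struct
{
    int*   buffer;
    size_t size;
} demo_buf;

/* Copy 'src'. 'status' may be NULL if the caller does not care why. */
demo_buf* demo_buf_copy( const demo_buf* src, demo_copy_status* status )
{
    demo_buf* copy = NULL;
    demo_copy_status st = DEMO_COPY_OK;

    if ( !src )
    {
        st = DEMO_COPY_NULL_INPUT;
    }
    else
    {
        copy = malloc( sizeof *copy );
        if ( copy )
        {
            copy->size = src->size;
            copy->buffer = NULL;
            if ( src->size )
            {
                copy->buffer = malloc( src->size * sizeof *copy->buffer );
                if ( copy->buffer )
                {
                    memcpy( copy->buffer, src->buffer,
                            src->size * sizeof *copy->buffer );
                }
                else
                {
                    /* Do not hand back a half-built copy. */
                    free( copy );
                    copy = NULL;
                }
            }
        }
        if ( !copy )
            st = DEMO_COPY_NOMEM;
    }

    if ( status )
        *status = st;
    return copy;
}
```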

B. Use some global variable that users should set before the
operation and check after the operation, akin to setting errno to
ENOMEM.

One of the issues brought up before is that ENOMEM isn't portable, so
it's not a reliable mechanism to use in and of itself. However, the
concept is usable provided that you maintain the global state
yourself.

In my case, I have my library use a wrapper that maintains a global
state that is accessed using a macro 'c_enomem'. It functions
similarly to errno, but I have control over what it does provided that
my library calls my wrapper.

\code snippet
/* A flag that signals an out-of-memory error has occurred. */
static c_bool gc_private_allocator_enomem = FALSE;

c_bool* gc_error_not_enough_memory( void )
{
return &gc_private_allocator_enomem;
}

#define c_enomem (*gc_error_not_enough_memory())

void* c_malloc( size_t size )
{
void* mem = NULL;

if ( size )
{
mem = gc_private_allocator_ftable.malloc( size );
if ( !mem ) {
gc_private_allocator_enomem = TRUE;
}
}

return mem;
}
\endcode

With this kind of mechanism in place, I can do something like the
following:

\code snippet
c_array_t* numbers = NULL;
int n;

numbers = c_array_new( int, 20 );
/* Fill up numbers array */

c_enomem = FALSE;
c_array_push_back( numbers, &n, int );
if ( c_enomem ) {
fprintf( stderr, "c_array_push_back: allocation failure!\n" );
/* Try to recover or save gracefully if desired */
}
\endcode

This scenario also has its drawbacks, but it's the poison that I've
chosen, particularly for functions that would otherwise have to return
something in addition to an error code.
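To see the flag actually trip, the allocator has to be replaceable. The following is a self-contained sketch under that assumption; `demo_set_allocator`, `demo_failing_malloc`, and the other `demo_*` names are hypothetical and only mirror the shape of the wrapper above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook so that a failure can be simulated. */
typedef void* (*demo_malloc_fn)( size_t );

static demo_malloc_fn demo_allocator = malloc;
static int demo_enomem_flag = 0;

/* Accessor in the style of errno: the macro expands to an lvalue. */
int* demo_enomem_location( void )
{
    return &demo_enomem_flag;
}

#define demo_enomem (*demo_enomem_location())

/* Install a different allocator; NULL restores plain malloc. */
void demo_set_allocator( demo_malloc_fn fn )
{
    demo_allocator = fn ? fn : malloc;
}

/* Like the wrapper above: sets the flag on failure, never clears it. */
void* demo_malloc( size_t size )
{
    void* mem = NULL;

    if ( size )
    {
        mem = demo_allocator( size );
        if ( !mem )
            demo_enomem = 1;
    }

    return mem;
}

/* An allocator that always fails, for exercising the error path. */
void* demo_failing_malloc( size_t size )
{
    (void)size;
    return NULL;
}
```

As with errno, the flag is sticky: the library only ever sets it, so the caller has to clear it before the call it wants to check. A single static flag like this is also shared by every caller, so the sketch is only suitable for single-threaded use.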

These are the two main error handling schemes that I'm familiar with
that have the granularity to handle function level out-of-memory
conditions. Signals and callback handlers are also useful tools, but
I've not been able to figure out a framework that seems to work at
this level. I've only used them as general error handlers if any
allocation fails, rather than at specific locations.

2. Invalid arguments

This is where I really struggle with what is assertable and what is
left for the user to blow up on.

Take a function to erase an element out of an array.

void gc_array_erase( c_array_t* array, size_t pos )
{
if ( array )
{
/* what to assert if anything */
if ( pos < array->size )
{
/* erase the element at index 'pos' */
}
}
}

Does the condition 'pos < size' constitute a good assert candidate?
There is no possible scenario where a pos >= array->size would ever do
anything, but does the crime fit the punishment of abort in a library
like this? If not, is the silent treatment ok, or should an out-of-
range error be communicated back somehow? I can't see the pros and
cons enough to make a decision and stick to it.

I could see maybe having another layer of assert that takes the middle
ground. If the user wants to assert for conditions when pos is larger
than the array size and other invalid arguments, the library could
have something like the following.

void gc_array_erase( c_array_t* array, size_t pos )
{
if ( array )
{
safe_assert( pos < array->size );
if ( pos < array->size )
{
/* erase the element at index 'pos' */
}
}
}

This would give the users of the library some semblance of choice on
how strict to apply the assert checking mechanism.
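A possible shape for such a 'safe_assert', as a sketch only: the switch name `DEMO_STRICT_CHECKS` and the fixed-size `demo_array` are assumptions made for illustration. Without the switch the check compiles to nothing and the function silently ignores a bad index; with it, the same call aborts through the normal assert.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical opt-in switch: build with -DDEMO_STRICT_CHECKS to turn
   argument checks into real asserts; otherwise they compile to nothing. */
#ifdef DEMO_STRICT_CHECKS
#define safe_assert( expr ) assert( expr )
#else
#define safe_assert( expr ) ((void)0)
#endif

/* Hypothetical fixed-capacity array, just enough to show the idea. */
typedef struct
{
    int    buffer[8];
    size_t size;
} demo_array;

void demo_erase( demo_array* array, size_t pos )
{
    if ( array )
    {
        safe_assert( pos < array->size );
        if ( pos < array->size )
        {
            /* Shift the tail down over the erased element. */
            memmove( &array->buffer[pos], &array->buffer[pos + 1],
                     (array->size - pos - 1) * sizeof array->buffer[0] );
            --array->size;
        }
    }
}
```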

3. Programming errors

At the library level, do I assert things within my library interface
that the user may have messed up? Take for example inserting an
object into an array in a sorted fashion.

\code snippet
void gc_array_insert_sorted( c_array_t* array, void* object,
c_compare_function_t cmp_fn, size_t type_sz )
{
if ( array )
{
assert( c_is_array_sorted( array, cmp_fn ) );
...
}
}
\endcode

In this scenario, I have a function that verifies that all the
elements are in sorted order. Is this an assertable offense? It's
certainly possible that the user may want to insert something sorted
in the array even though the array itself is not sorted. I don't feel
that I have the right for my library to demand that the array is
sorted via assert, even if the constraint of having a sorted array is
violated. This could be another 'safe_assert' or another level of
assert to verify this property.

There are cases where assert seems to be a good fit. In particular,
if I have a valid array pointer, its buffer had better be valid too.

\code snippet
void gc_array_insert_sorted( c_array_t* array, void* object,
c_compare_function_t cmp_fn, size_t type_sz )
{
if ( array )
{
assert( array->buffer != NULL );
}
}
\endcode

This seems like a great candidate to be used in an assert, since I
specifically designed the interface to always have at least one
element allocated (the 'array->size' can still be zero though if no
elements are inserted into the array).

The last thing is whether it's recommended or not to append a string
to the assert macro to help provide flavor to the condition.
Something like

\code
assert( pos < array->size && "pos outside range: >= array->size" );
\endcode
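The idiom relies on a string literal decaying to a non-null pointer, so '&& "text"' never changes the truth value of the condition; the text simply becomes part of the expression that assert prints on failure. A small sketch (the function name is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when 'pos' is a valid index and 0 otherwise. The string
   literal is always non-null, so it does not affect the result. */
int demo_in_range( size_t pos, size_t size )
{
    return pos < size && "pos outside range: >= size";
}
```

Some compilers warn about a string literal used in a boolean context, so this may need a warning suppressed or a dedicated macro that takes the message as a separate argument.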

Some things I've been pondering as of late.

Best regards,
John D.
 
Uno

ImpalerCore said:
I stumbled across a couple assert threads from a while ago. I seem to
have a hard time figuring out how to use assert properly, and I go
back and forth over how best to represent errors from a library's
perspective. From what I've gathered, assert is intended to catch
programming errors, but I don't really know what that means in the
context when writing a library.

There are three main errors that I try to handle.

1. Memory allocation faults.

These faults are caused when malloc, realloc, or calloc fails. There
are several methods that this can be communicated to the user. For
functions that allocate structures, the return value represents either
the allocated object if successful or returns NULL if allocation
failed. The caller of the function is responsible for handling the
result.

\code snippet
c_array_t* numbers = NULL;

numbers = c_array_new( int, 20 );
if ( !numbers )
{
/* User specified method to handle error. */
fprintf( stderr, "c_array_new: failed to allocate array 'numbers'!
\n" );
return EXIT_FAILURE;
}

...
\endcode

In this scenario, I know that I definitely don't want to assert on a
memory fault at the library level, because in general the user may be
able to recover from the situation, and I want to give him that
chance.

However, there are other functions that may invoke a buffer expansion
as part of the operation, requiring a call to malloc or realloc that
may fail. This needs to be communicated back to the user somehow.
Here is an example.

void gc_array_push_back( c_array_t* array, void* object, size_t
type_sz );
#define c_array_push_back( array, object, type ) \
(gc_array_push_back( (array), (object), sizeof( type ) ))

\question
Should the internals of 'gc_array_push_back' use assert to check for
null array pointers.?

Consider two possible methods to verify the array parameter.

void gc_array_push_back( c_array_t* array, void* object, size_t
type_sz )
{
/* variable decls */

assert( array != NULL );

/* push_back code */
}

vs

void gc_array_push_back( c_array_t* array, void* object, size_t
type_sz )
{
/* variable decls */

if ( array )
{
/* push_back code */
}
}
\endquestion

Most of the time, an array NULL pointer is likely an error, so assert
would help catch programming errors. However, I can't guarantee that
there isn't a user that would have a valid use case for having a null
array pointer, perhaps if it's considered as an empty array in another
structure. So in this case, I feel that the 'if ( array )' construct
feels right.

Another question within the same context of the 'c_array_push_back'
function is how to communicate an allocation failure if the array
needs more memory, but the allocation fails. At the minimum, I want
to make sure that I don't lose any of the contents of the array
buffer, so that the array contents remain the same. There seem to be
several methods to communicate that an allocation error in this case.

A. Modify 'gc_array_push_back' to respond with an error code via the
return value (or as an additional argument to the function, but I
really prefer not to use that style of interface).

This is a common scheme that is used and recommended often, and for
good reason. The main drawback is that sometimes there is competition
for the return value slot (not in the gc_array_push_back function's
case since it returns void, but in other functions). If I want to
carry the error code and something else, I need to write a custom
struct and it complicates the interface somewhat.

struct c_array_return
{
<data_type> data;
int error;
}

The other issue is that there can be conflicts between interpretation
of the return value. For example, if I make a copy of an array.

\code snippet
c_array_t* c_array_copy( c_array_t* array )
{
c_array_t* copy = NULL;

if ( array )
{
/* copy array */
}

return copy;
}
\endcode

If I allow arrays that are NULL pointers, then checking the return
value of c_array_copy for NULL is troublesome because I can't
distinguish whether the parameter 'array' is NULL or if the allocation
failed, which may be an important distinction.

B. Use some global variable that users should set before the
operation and check after the operation, akin to setting errno to
ENOMEM.

One of the issues brought up before is that ENOMEM isn't portable, so
it's not a reliable mechanism to use in of itself. However, the
concept is usable provided that you maintain the global state
yourself.

In my case, I have my library use a wrapper that maintains a global
state that is accessed using a macro 'c_enomem'. It functions
similarly to errno, but I have control over what it does provided that
my library calls my wrapper.

\code snippet
/* A flag that signals an out-of-memory error has occurred. */
static c_bool gc_private_allocator_enomem = FALSE;

c_bool* gc_error_not_enough_memory( void )
{
return &gc_private_allocator_enomem;
}

#define c_enomem (*gc_error_not_enough_memory())

void* c_malloc( size_t size )
{
void* mem = NULL;

if ( size )
{
mem = gc_private_allocator_ftable.malloc( size );
if ( !mem ) {
gc_private_allocator_enomem = TRUE;
}
}

return mem;
}
\endcode

With this kind of mechanism in place, I can do something like the
following:

\code snippet
c_array_t* numbers = NULL;
int n;

numbers = c_array_new( int, 20 );
/* Fill up numbers array */

c_enomem = FALSE;
c_array_push_back( numbers, &n, int );
if ( c_enomem ) {
fprintf( stderr, "c_array_push_back: allocation failure!\n" );
/* Try to recover or save gracefully if desired */
}
\endcode

This scenario also has its drawbacks, but it's the poison that I've
chosen particular for functions that would instead have to return
something in addition to an error code.

These are the main two error handling schemes that I'm familiar with
that has the granularity to handle function level out-of-memory
conditions. Signals and callback handlers are also useful tools, but
I've not been able to figure out a framework that seems to work at
this level. I've only used them as general error handlers if any
allocation fails, rather than at specific locations.

2. Invalid arguments

This is where I really struggle with what is assertable and what is
left to the user is left to blow up.

Take a function to erase an element out of an array.

void gc_array_erase( c_array_t* array, size_t pos )
{
if ( array )
{
/* what to assert if anything */
if ( pos < array->size )
{
/* erase the element at index 'pos' */
}
}
}

Does the condition 'pos < size' constitute a good assert candidate?
There is no possible scenario where a pos >= array->size would ever do
anything, but does the crime fit the punishment of abort in a library
like this? If not, is the silent treatment ok, or should an out-of-
range error be communicated back somehow? I can't see the pros and
cons enough to make a decision and stick to it.

I could see maybe having another layer of assert that takes the middle
ground. If the user wants to assert for conditions when pos is larger
than the array size and other invalid arguments, the library could
have something like the following.

void gc_array_erase( c_array_t* array, size_t pos )
{
if ( array )
{
safe_assert( pos < array->size );
if ( pos < array->size )
{
/* erase the element at index 'pos' */
}
}
}

This would give the users of the library some semblance of choice on
how strict to apply the assert checking mechanism.

3. Programming errors

At the library level, do I assert things within my library interface
that the user may have messed up? Take for example inserting an
object into an array in a sorted fashion.

\code snippet
void gc_array_insert_sorted( c_array_t* array, void* object,
c_compare_function_t cmp_fn, size_t type_sz )
{
if ( array )
{
assert( c_is_array_sorted( array, cmp_fn ) );
...
}
}
\endcode

In this scenario, I have a function that verifies that all the
elements are in sorted order. Is this an assertable offense? It's
certainly possible that the user may want to insert something sorted
in the array even though the array itself is not sorted. I don't feel
that I have the right for my library to demand that the array is
sorted via assert, even if the constraint of having a sorted array is
violated. This could be another 'safe_assert' or another level of
assert to verify this property.

There are cases that assert seems to be a good use for. Particularly,
if I have a valid array pointer, it's buffer better be valid too.

\code snippet
void gc_array_insert_sorted( c_array_t* array, void* object,
c_compare_function_t cmp_fn, size_t type_sz )
{
if ( array )
{
assert( array->buffer != NULL );
}
}
\endcode

This seems like a great candidate to be used in an assert, since I
specifically designed the interface to always have at least one
element allocated (the 'array->size' can still be zero though if no
elements are inserted into the array).

The last thing is whether it's recommended or not to append a string
to the assert macro to help provide flavor to the condition.
Something like

\code
assert( pos < array->size && "pos outside range: >= array->size" );
\endcode

Some things I've been pondering as of late.

I've been thinking about the purview of assert since I got Plauger's
source for the standard c library.

I recommend the book.
 
Ian Collins

I stumbled across a couple assert threads from a while ago. I seem to
have a hard time figuring out how to use assert properly, and I go
back and forth over how best to represent errors from a library's
perspective. From what I've gathered, assert is intended to catch
programming errors, but I don't really know what that means in the
context when writing a library.

I disagree with that use. Tests and testing should catch programming
errors; asserts catch conditions that should never happen and can't be
handled by the application. There will always be unforeseen conditions
and there may be conditions that are deemed terminal. Failure of memory
allocation is often one of these terminal events.

There are three main errors that I try to handle.

1. Memory allocation faults.

These faults are caused when malloc, realloc, or calloc fails. There
are several methods that this can be communicated to the user. For
functions that allocate structures, the return value represents either
the allocated object if successful or returns NULL if allocation
failed. The caller of the function is responsible for handling the
result.

That's a sensible policy for a library writer. malloc is just another
library function after all!

However, there are other functions that may invoke a buffer expansion
as part of the operation, requiring a call to malloc or realloc that
may fail. This needs to be communicated back to the user somehow.

B. Use some global variable that users should set before the
operation and check after the operation, akin to setting errno to
ENOMEM.

One of the issues brought up before is that ENOMEM isn't portable, so
it's not a reliable mechanism to use in of itself. However, the
concept is usable provided that you maintain the global state
yourself.

This is a common approach.

2. Invalid arguments

This is where I really struggle with what is assertable and what is
left to the user is left to blow up.

Take a function to erase an element out of an array.

void gc_array_erase( c_array_t* array, size_t pos )
{
if ( array )
{
/* what to assert if anything */
if ( pos< array->size )
{
/* erase the element at index 'pos' */
}
}
}

Does the condition 'pos< size' constitute a good assert candidate?

If the function doesn't return an error condition, it's a call you
should make and document. Consider whether there is any harm in simply
doing nothing. It's better for a library function to use return codes.
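As a sketch of what the return-code version might look like (the `DEMO_*` codes and the fixed-size `demo_array` are hypothetical, chosen only to keep the example short):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch. */
enum { DEMO_OK = 0, DEMO_ENULL = -1, DEMO_ERANGE = -2 };

/* Hypothetical fixed-capacity array, just enough to show the idea. */
typedef struct
{
    int    buffer[8];
    size_t size;
} demo_array;

/* Erase the element at 'pos'; report bad arguments instead of aborting
   or silently ignoring them. */
int demo_erase( demo_array* array, size_t pos )
{
    if ( !array )
        return DEMO_ENULL;
    if ( pos >= array->size )
        return DEMO_ERANGE;

    /* Shift the tail down over the erased element. */
    memmove( &array->buffer[pos], &array->buffer[pos + 1],
             (array->size - pos - 1) * sizeof array->buffer[0] );
    --array->size;
    return DEMO_OK;
}
```

A caller that considers an out-of-range index a bug can still write assert( demo_erase( &a, i ) == DEMO_OK ) on its own side, which keeps the decision to abort in the application rather than in the library.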

3. Programming errors

At the library level, do I assert things within my library interface
that the user may have messed up? Take for example inserting an
object into an array in a sorted fashion.

\code snippet
void gc_array_insert_sorted( c_array_t* array, void* object,
c_compare_function_t cmp_fn, size_t type_sz )
{
if ( array )
{
assert( c_is_array_sorted( array, cmp_fn ) );
...
}
}
\endcode

In this scenario, I have a function that verifies that all the
elements are in sorted order. Is this an assertable offense?

More like a return-an-error one.

In general, I'd use asserts as a last resort in a library. Let the user
make the call!
 
Ian Collins

I've been thinking about the purview of assert since I got Plauger's
source for the standard c library.

I recommend the book.

Did you have to quote the whole post to add that??
 
Uno

Ian said:
Did you have to quote the whole post to add that??

Well, yeah. I thought he makes a pretty good case for ambivalence in
all 3 cases, so I didn't feel like I was one to make editorial snips.

Has there been a bandwidth disturbance as a result of my quoting
something I thought was quotable?
 
Ian Collins

Well, yeah. I thought he makes a pretty good case for ambivalence in all
3 cases, so I didn't feel like I was one to make editorial snips.

Has there been a bandwidth disturbance as a result of my quoting
something I thought was quotable?

No, but several scroll wheels will probably crap out a few seconds
earlier than they should...
 
Rui Maciel

Uno said:
Well, yeah. I thought he makes a pretty good case for ambivalence in
all 3 cases, so I didn't feel like I was one to make editorial snips.

Has there been a bandwidth disturbance as a result of my quoting
something I thought was quotable?

By not snipping your quote you forced everyone who read your post to needlessly scroll down through
an absurdly long quote just in order to read an advertisement to some book. You could've done a
whole lot better than that, unless you don't care if you write readable text.


Rui Maciel
 
Uno

Rui said:
By not snipping your quote you forced everyone who read your post to needlessly scroll down through
an absurdly long quote just in order to read an advertisement to some book. You could've done a
whole lot better than that, unless you don't care if you write readable text.


Rui, you are stupid to imagine the way you do. Nobody forced you to
read anything. I just want you to know that.
 
Ersek, Laszlo

I stumbled across a couple assert threads from a while ago. I seem to
have a hard time figuring out how to use assert properly, and I go back
and forth over how best to represent errors from a library's
perspective. From what I've gathered, assert is intended to catch
programming errors, but I don't really know what that means in the
context when writing a library.

Your library manipulates resources and data structures. Put these
resources into set R.

Each function of your library, no matter if it's meant for internal usage
only or (also) for the client programmer, has an "applicability domain",
or precondition. That is, for function f_i, you have a zeroth or first
order logical predicate P_i over the elements of R. If the predicate
evaluates to true, the function can work, otherwise, it cannot.

If the client programmer (or the language implementation) may falsify a
predicate in a well-defined client program, you signal an error.

If neither the client programmer nor the system can falsify a predicate,
you write an assert instead of a selection statement. As a rule of thumb,
whenever you're tempted to write an "if", write an "if". Whenever you're
tempted to write a comment like "if foo equals bar and baz is greater than
quux, then the logic of my library *ensures* that xizzy is non-null, so
I'm just gonna access *xizzy without checking it", then write an assert.


struct whatever
{
int foo, bar, baz, quux;
char *xizzy;
};


/*
Call this with an initialized whatever object. "req" must be 0 or 1.
Return value: 0 for success, -1 for an error.
*/
int
frobnicate(const struct whatever *w, int req)
{
switch (req) {
case 0:
if (w->foo == w->bar && w->baz > w->quux) {

/* do something here, like */
if (0 > fprintf(stdout, "%d", w->foo == w->quux)) {
return -1;
}

/*
if foo equals bar and baz is greater than quux, then the logic of
my library *ensures* that xizzy points to a NUL terminated array
of characters, that is, xizzy is at least non-NULL. If it is NULL, then
it is not an error to handle here, but some other part of the
library has messed up xizzy and that part must be fixed.
*/
assert(0 != w->xizzy);
if (0 > fprintf(stdout, "%s\n", w->xizzy)) {
return -1;
}
}

return 0;

case 1:
/* ... */
return 0;
}

return -1;
}

assert() has the nice property on well-configured sensible platforms that
you'll end up with a core dump, and you can do post-mortem debugging, that
is, try to find out where xizzy became (or never changed from) NULL.

In short, assert(P(...)) is a synonym for "!P(...) is impossible here --
if it still occurs, it's my own damn fault".

Cheers,
lacos
 
ImpalerCore

I disagree with that use.  Tests and testing should catch programming
errors, asserts catch condition that should never happen and can't be
handled by the application.

From a library perspective, I consider that there are two kinds of
programming errors. Internal ones within the library, where a user
provides valid input and environment, but due to a logic error I do
something that I detect later because I made an assumption when it
actually was not correct. Then there are the external errors that are
driven by user behavior, whether by giving bad input, or going behind
the API's back and monkeying around, or problems with system
resources.

I don't think I have enough experience to know where to draw that line
yet. Are 'conditions that should never happen' considered programming
errors if I detect that they happen?

There will always be unforeseen conditions
and there may be conditions that are deemed terminal. Failure of memory
allocation is often one of these terminal events.

That's a sensible policy for a library writer.  malloc is just another
library function after all!

Glad I'm not completely off base.

This is a common approach.

<snip>

If the function doesn't return an error condition, it's a call you
should make a document.  Consider if there any harm in simply doing
nothing.  It's better for a library function to use return codes.

That's the issue, if I pass an invalid 'pos' argument, from the
library perspective, it doesn't harm anything, as I can silently
ignore it and go about my merry business. The question is if passing
an invalid 'pos' and silently ignoring it harms the user experience.
My gut is telling me that it does, because the user is likely making a
logical error (overrunning the bounds of the array). If I assert
'(pos < size)' from the library, it's a message to the library user
that you're violating the contract to gc_array_erase, you *better* do
something about it. I can accomplish the same thing using return
codes or a 'errno = EINVAL' mechanism as well, which is telling the
user that *maybe* you should do something about it, and then place the
burden on the user; I just can't quite make the case convincing enough
that using an assert in this scenario is 'The Wrong Thing To Do'.

<snip>

More like a return an error one.

In general, I'd use asserts as a last resort in a library.  Let the user
make the call!

I agree completely. I'm just trying to grasp what the last resorts
are. Thanks for your comments.

Best regards,
John D.
 
Rui Maciel

Uno said:
Rui, you are stupid to imagine the way you do. Nobody forced you to
read anything. I just want you to know that.

I notice that reading comprehension isn't exactly your strong suit. After all, if you happen to
read what I wrote you will notice that I said that

"you forced everyone who read your post to needlessly scroll down through
an absurdly long quote"

What did you do? You forced everyone to needlessly scroll down through an absurdly long quote. Whom
did you force? Everyone who read your post.

See? That's not hard, is it?

But keep up with the childish name calling. Good form, Uno.


Rui Maciel
 
ImpalerCore

Your library manipulates resources and data structures. Put these
resources into set R.

Each function of your library, no matter if it's meant for internal usage
only or (also) for the client programmer, has an "applicability domain",
or precondition. That is, for function f_i, you have a zeroth or first
order logical predicate P_i over the elements of R. If the predicate
evaluates to true, the function can work, otherwise, it cannot.

If the client programmer (or the language implementation) may falsify a
predicate in a well-defined client program, you signal an error.

If neither the client programmer nor the system can falsify a predicate,
you write an assert instead of a selection statement. As a rule of thumb,
whenever you're tempted to write an "if", write an "if". Whenever you're
tempted to write a comment like "if foo equals bar and baz is greater than
quux, then the logic of my library *ensures* that xizzy is non-null, so
I'm just gonna access *xizzy without checking it", then write an assert.

struct whatever
{
   int foo, bar, baz, quux;
   char *xizzy;

};

/*
   Call this with an initialized whatever object. "req" must be 0 or 1.
   Return value: 0 for success, -1 for an error.
*/
int
frobnicate(const struct whatever *w, int req)
{
   switch (req) {
     case 0:
       if (w->foo == w->bar && w->baz > w->quux) {

         /* do something here, like */
         if (0 > fprintf(stdout, "%d", w->foo == w->quux)) {
           return -1;
         }

         /*
           if foo equals bar and baz is greater than quux, then the logic of
           my library *ensures* that xizzy points to a NUL terminated array
           of characters, that is, xizzy is at least non-NULL.. If it is, then
           it is not an error to handle here, but some other part of the
           library has messed up xizzy and that part must be fixed.
         */
         assert(0 != w->xizzy);
         if (0 > fprintf(stdout, "%s\n", w->xizzy)) {
           return -1;
         }
       }

       return 0;

     case 1:
       /* ... */
       return 0;
   }

   return -1;

}

assert() has the nice property on well-configured sensible platforms that
you'll end up with a core dump, and you can do post-mortem debugging, that
is, try to find out where xizzy went (or didn't change from) NULL.

In short, assert(P(...)) is a synonym for "!P(...) is impossible here --
if it still occurs, it's my own damn fault".

Thanks for your input. I pretty much agree with your comments.

Let me give you a somewhat contrived assert use-case. The resizable
generic array stores the element size in the c_array struct.

struct c_array
{
void* buffer;
size_t size;
size_t alloc_size;
size_t element_size;
...
};

If the library is used correctly, 'c_array_new( int, 20 )' will assign
'element_size' to 'sizeof( int )'. Everywhere I can think of, there
shouldn't be any case where 'element_size' is 0. I provide the struct
as a public interface rather than an opaque structure because if the
user wants to extend the interface by defining custom functions, the
details are available. If I do find a scenario where I detect
'element_size' is 0, it's likely that either the user or I seriously
messed up by inadvertently setting element_size to 0, or some memory
operation overran its bounds and corrupted the c_array struct.

To me, this feels like an assert candidate. I can go about asserting
this condition in a couple of ways. I can place it in every single
array function to try to catch the issue as early as possible, or I
can use it only in array functions that specifically access
array->element_size. However, there is still a nagging voice in the back of
my mind saying that this could still cause problems because it doesn't
allow the application to try to handle it gracefully and just blows
up. You hope that unit testing would catch these things, but there's
still the possibility of something happening.

As a newish C library writer, I don't know the scope of what problems
an assert like this may or may not cause.

Best regards,
John D.
 
Ersek, Laszlo

AIUI, c_array_new() is a macro. It probably expands to a function call of
some sort, or a comma operator expression, or a do { ... } while (0)
statement. Anyhow, you probably pass the macro's first argument to the
sizeof operator in parentheses. I'm too lazy to check now, but I think if
the code compiles (that is, the compiler doesn't emit a diagnostic), you
can rest assured that sizeof will evaluate to an integer constant
expression with positive value. So there seems to be no need to add any
kind of error checking (for this purpose) to c_array_new(); the compiler
does it for you statically. (I'm ignoring VLA's.)

> To me, this feels like an assert candidate. I can go about asserting
> this condition in a couple of ways. I can place it in every single
> array function to try to catch the issue as early as possible, or I can
> use it only in array functions that specifically access
> array->element_size.

I would place an assert() wherever I rely on array->element_size being
positive.
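
As a minimal sketch of that advice: the struct fields below are the ones shown earlier in the thread (the elided "..." members are left out), while the accessor sketch_array_at and the reading of 'size' as an element count are assumptions made only for illustration. The assert sits exactly where the code starts to depend on element_size.

```c
#include <assert.h>
#include <stddef.h>

/* Fields as shown in the thread; elided members are omitted here. */
struct c_array
{
    void*  buffer;
    size_t size;          /* assumed: number of stored elements */
    size_t alloc_size;
    size_t element_size;
};

/* Hypothetical accessor, not part of the original library. It scales the
   index by element_size, so it asserts the invariant where it relies on it. */
void* sketch_array_at( const struct c_array* array, size_t index )
{
    assert( array->element_size > 0 );   /* library invariant */

    if ( index >= array->size )          /* caller error: report it */
        return NULL;

    return (char*)array->buffer + index * array->element_size;
}
```

Functions that never touch element_size would carry no such assert under this approach.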

> However, there is still a nagging voice in the back of my mind saying
> that this could still cause problems because it doesn't allow the
> application to try to handle it gracefully and just blows up.

You could draw a sharp line between a client's interface and an
implementor's interface. As long as the client uses the dedicated
interfaces, you check the client's input and later on rely on assert()s.

Once the client becomes an implementor and accesses the internals, he will
assume your role as the programmer responsible for maintaining assert()s
and checking user input. That is, the (re-)implementor will need access to
complete documentation, and perhaps the source too. You do have to provide
that, but once he messes with the internals, you're no longer responsible.

A programmer who builds another data structure and delegates some tasks to
your library is just another client, utilizing the public interface, not
penetrating the responsibility boundaries of your library.
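
One way to picture the split, as a rough sketch only: every name here (sketch_stack, sketch_push, sketch_push_unchecked, SKETCH_CAPACITY) is made up for the example and has nothing to do with the library under discussion. The client-facing function checks its input and reports failure; the internal function assumes the checks were done and asserts.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CAPACITY 4

struct sketch_stack
{
    int    items[SKETCH_CAPACITY];
    size_t count;
};

/* Implementor's interface: callers inside the library have already
   validated everything, so a violated precondition is a library bug. */
static void sketch_push_unchecked( struct sketch_stack* s, int value )
{
    assert( s != NULL );
    assert( s->count < SKETCH_CAPACITY );
    s->items[s->count++] = value;
}

/* Client's interface: input is checked and failure is reported. */
int sketch_push( struct sketch_stack* s, int value )
{
    if ( s == NULL || s->count >= SKETCH_CAPACITY )
        return -1;

    sketch_push_unchecked( s, value );
    return 0;
}
```

A client who calls sketch_push_unchecked directly has crossed over to the implementor's side and inherits the duty to keep its asserts true.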

See also the public / protected / private member visibilities in C++. That
is, what kinds of member functions may contain assert()s referring to what
kinds of member objects?

Aggregation/composition should be used for task delegation, inheritance
should be used for interface uniformity (or some such). I think a
container library should encourage the first use case foremost. See
<http://stackoverflow.com/questions/269496/inheritance-vs-aggregation> and
the like on the web.

Cheers,
lacos
 
ImpalerCore

> AIUI, c_array_new() is a macro. It probably expands to a function call of
> some sort, or a comma operator expression, or a do { ... } while (0)
> statement. Anyhow, you probably pass the macro's first argument to the
> sizeof operator in parentheses. I'm too lazy to check now, but I think if
> the code compiles (that is, the compiler doesn't emit a diagnostic), you
> can rest assured that sizeof will evaluate to an integer constant
> expression with positive value. So there seems to be no need to add any
> kind of error checking (for this purpose) to c_array_new(); the compiler
> does it for you statically. (I'm ignoring VLA's.)

In this case, I have a single alloc function and 2 macros that invoke
it.

void* gc_array_alloc( size_t sz, size_t n, c_bool zeroize );

#define c_array_new( type, n ) (gc_array_alloc( sizeof(type), (n), FALSE ))
#define c_array_new0( type, n ) (gc_array_alloc( sizeof(type), (n), TRUE ))

And I have a 'zero_initialize' member that memsets the new object
memory to 0 on creation or if the array expands the buffer.

> I would place an assert() wherever I rely on array->element_size being
> positive.

This sounds reasonable to me.

> You could draw a sharp line between a client's interface and an
> implementor's interface. As long as the client uses the dedicated
> interfaces, you check the client's input and later on rely on assert()s.
>
> Once the client becomes an implementor and accesses the internals, he will
> assume your role as the programmer responsible for maintaining assert()s
> and checking user input. That is, the (re-)implementor will need access to
> complete documentation, and perhaps the source too. You do have to provide
> that, but once he messes with the internals, you're no longer responsible.
>
> A programmer who builds another data structure and delegates some tasks to
> your library is just another client, utilizing the public interface, not
> penetrating the responsibility boundaries of your library.
>
> See also the public / protected / private member visibilities in C++. That
> is, what kinds of member functions may contain assert()s referring to what
> kinds of member objects?

I didn't consider this. I have been trying to look up C library code
for assert usage. Looking up how assert is used in C++ classes is a
very good idea.

> Aggregation/composition should be used for task delegation, inheritance
> should be used for interface uniformity (or some such). I think a
> container library should encourage the first use case foremost. See
> <http://stackoverflow.com/questions/269496/inheritance-vs-aggregation> and
> the like on the web.

I can agree with that.

Thanks again,
John D.
 
Ian Collins

> From a library perspective, I consider that there are two kinds of
> programming errors. Internal ones within the library, where a user
> provides valid input and environment, but due to a logic error I do
> something that I detect later because I made an assumption when it
> actually was not correct. Then there are the external errors that are
> driven by user behavior, whether by giving bad input, or going behind
> the API's back and monkeying around, or problems with system
> resources.

Simple rule - never assert on user input!
 
Phil Carmody

Ian Collins said:
> Simple rule - never assert on user input!

Never assert on anything that's possible (unless you have absolutely
no sensible way of handling it).
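
To make the distinction concrete, here is a small sketch under stated assumptions: sketch_make_ints is a hypothetical helper, not something from the thread's library. Size overflow and allocation failure can really happen at run time, so they are reported; the only assert states something the preceding check has already made impossible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper. Returns NULL when the request cannot be met. */
int* sketch_make_ints( size_t n )
{
    if ( n == 0 || n > SIZE_MAX / sizeof( int ) )
        return NULL;                     /* possible: report to the caller */

    /* Impossible after the check above, so asserting it is fair game. */
    assert( ( n * sizeof( int ) ) / sizeof( int ) == n );

    return malloc( n * sizeof( int ) );  /* may be NULL: caller must check */
}
```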

Phil
 
ImpalerCore

> Never assert on anything that's possible (unless you have absolutely
> no sensible way of handling it).

Ok, let me phrase an assert question on this simple implementation of
strlen.

\code
size_t strlen( const char* s )
{
    size_t len = 0;

    while (s[len] != 0)
        len++;

    return len;
}
\endcode

I can write the implementation to do one of three things.

1. Let it bomb as is.

2. Put in 'assert( s != NULL );'. When the user passes a NULL into it, a
diagnostic message is printed and the program aborts.

3. Check s for NULL, and avoid doing anything that would cause the
function to bomb.

\code
size_t strlen( const char* s )
{
    size_t len = 0;

    if ( s )
    {
        while (s[len] != 0)
            len++;
    }

    return len;
}
\endcode

According to Ian's rule, if I understand it correctly, the first and
only the first implementation should be used, since its arguments
are driven specifically by user input.

Which of these is the correct implementation for what you expect
strlen to do? From my experience, the first style is what is done for
the functions of the C standard library. I don't quite understand why
it's 'The Right Thing' to do over solutions 2 and 3.

A follow-up question: if it's OK for strlen, shouldn't it be okay for
every other function you write? I actually have instances of all
three styles in my code, but I can't seem to come up with succinct
reasons for when to use each.

Maybe I'm being too dense about assert and shouldn't worry about it as
much.
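
For reference, option 2 from the list above is described but not shown. A minimal sketch might look like this; the function is renamed sketch_strlen_asserting (a made-up name) so that it does not clash with the standard strlen.

```c
#include <assert.h>
#include <stddef.h>

/* Option 2: same loop as the plain version, plus a NULL assertion. */
size_t sketch_strlen_asserting( const char* s )
{
    size_t len = 0;

    assert( s != NULL );   /* diagnostic + abort, unless NDEBUG is defined */

    while ( s[len] != 0 )
        len++;

    return len;
}
```

When the translation unit is compiled with NDEBUG defined, the assert expands to nothing and this version behaves exactly like option 1.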
 
Ian Collins

ImpalerCore wrote:



> The point about assertions is that they make a claim about the code.
>
> assert(x != y);
>
> is a claim that *under no circumstances* can x ever be equal to y, *if*
> the program is correct.
>
> When an assertion fires, it's a demonstration that your code has a bug.
> The bug may be that the assertion shouldn't be there, of course; but, if
> the assertion has been carefully chosen, its firing means that it's time
> to debug the program.

It isn't necessarily your code that has the bug; it could be a
misbehaving library or hardware.
 
Keith Thompson

ImpalerCore said:
> According to Ian's rule, if I understand it correctly, the first and
> only the first implementation should be used, since its arguments
> are driven specifically by user input.

Who says the argument to strlen() is driven by user input? I see
no user input in the code you posted.

Users don't call strlen(); code does. It's your responsibility
as a programmer to avoid calling strlen() with anything that will
invoke undefined behavior. If you make such a call in response to
user input, it's because you weren't sufficiently careful in your
handling and checking of the user input.

The strlen() function, as the standard defines it, creates an
implicit contract: it will return the length of the string pointed to
by s as long as s actually points to a string. It's the caller's
responsibility to meet that requirement. The user (the person
running your program) isn't a party to the contract.

(Sorry if the above sounds a bit harsh; it wasn't meant that way.)
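
The division of labour described above can be sketched roughly as follows. sketch_env_length is a hypothetical caller invented for the example; getenv() and strlen() are the standard functions. The value that depends on the person running the program is checked, and strlen() only ever sees a real string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical caller. Whether the variable exists depends on the user's
   environment, so a NULL result is handled, not asserted. */
size_t sketch_env_length( const char* name, int* found )
{
    const char* value;

    /* These come from the programmer, not the user: contract checks. */
    assert( name != NULL );
    assert( found != NULL );

    value = getenv( name );
    *found = ( value != NULL );
    if ( value == NULL )
        return 0;            /* handled here; strlen never sees NULL */

    return strlen( value );  /* the caller's side of the contract is met */
}
```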
 
Seebs

> That's a reasonable objection, but it's one that can easily be fixed up:
>
> assert(is_misbehaving(library) == 0);
> assert(is_misbehaving(hardware) == 0);

Exactly.

For a great example of how NOT to do this, look at the RPM package manager,
which calls a function which requires root privileges, then asserts that
the call succeeded. Dumb. That's a situation which requires a diagnostic
explaining what went wrong. (I think it first checks that you have the
privileges, but there are other reasons the call can fail...)
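
A sketch of the alternative, using fopen() as a stand-in for any call that can fail for reasons outside the program's control. The helper sketch_open_for_reading is hypothetical and is not taken from RPM or any other real program.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper. A missing file or insufficient privileges is a
   run-time condition, so it gets a diagnostic and an error return. */
FILE* sketch_open_for_reading( const char* path )
{
    FILE* fp;

    assert( path != NULL );   /* a NULL path would be a programmer error */

    fp = fopen( path, "r" );
    if ( fp == NULL )
        fprintf( stderr, "cannot open '%s' for reading\n", path );

    return fp;                /* NULL tells the caller that it failed */
}
```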

-s
 
