ImpalerCore
I stumbled across a couple assert threads from a while ago. I seem to
have a hard time figuring out how to use assert properly, and I go
back and forth over how best to represent errors from a library's
perspective. From what I've gathered, assert is intended to catch
programming errors, but I don't really know what that means in the
context of writing a library.
There are three main kinds of errors that I try to handle.
1. Memory allocation faults.
These faults occur when malloc, realloc, or calloc fails. There
are several ways this can be communicated to the user. For
functions that allocate structures, the return value is either
the allocated object if successful or NULL if allocation
failed. The caller of the function is responsible for handling the
result.
\code snippet
c_array_t* numbers = NULL;
numbers = c_array_new( int, 20 );
if ( !numbers )
{
/* User specified method to handle error. */
fprintf( stderr, "c_array_new: failed to allocate array 'numbers'!\n" );
return EXIT_FAILURE;
}
....
\endcode
In this scenario, I know that I definitely don't want to assert on a
memory fault at the library level, because in general the user may be
able to recover from the situation, and I want to give him that
chance.
However, there are other functions that may invoke a buffer expansion
as part of the operation, requiring a call to malloc or realloc that
may fail. This needs to be communicated back to the user somehow.
Here is an example.
void gc_array_push_back( c_array_t* array, void* object, size_t
type_sz );
#define c_array_push_back( array, object, type ) \
(gc_array_push_back( (array), (object), sizeof( type ) ))
\question
Should the internals of 'gc_array_push_back' use assert to check for
null array pointers?
Consider two possible methods to verify the array parameter.
void gc_array_push_back( c_array_t* array, void* object, size_t
type_sz )
{
/* variable decls */
assert( array != NULL );
/* push_back code */
}
vs
void gc_array_push_back( c_array_t* array, void* object, size_t
type_sz )
{
/* variable decls */
if ( array )
{
/* push_back code */
}
}
\endquestion
Most of the time, an array NULL pointer is likely an error, so assert
would help catch programming errors. However, I can't guarantee that
there isn't a user that would have a valid use case for having a null
array pointer, perhaps if it's considered as an empty array in another
structure. So in this case, the 'if ( array )' construct
feels right.
Another question within the same context of the 'c_array_push_back'
function is how to communicate an allocation failure if the array
needs more memory, but the allocation fails. At the minimum, I want
to make sure that I don't lose any of the contents of the array
buffer, so that the array contents remain the same. There seem to be
several methods to communicate an allocation error in this case.
A. Modify 'gc_array_push_back' to respond with an error code via the
return value (or as an additional argument to the function, but I
really prefer not to use that style of interface).
This is a common scheme that is used and recommended often, and for
good reason. The main drawback is that sometimes there is competition
for the return value slot (not in the gc_array_push_back function's
case since it returns void, but in other functions). If I want to
carry the error code and something else, I need to write a custom
struct and it complicates the interface somewhat.
struct c_array_return
{
<data_type> data;
int error;
};
The other issue is that the return value can be open to conflicting
interpretations, for example when I make a copy of an array.
\code snippet
c_array_t* c_array_copy( c_array_t* array )
{
c_array_t* copy = NULL;
if ( array )
{
/* copy array */
}
return copy;
}
\endcode
If I allow arrays that are NULL pointers, then checking the return
value of c_array_copy for NULL is troublesome because I can't
distinguish whether the parameter 'array' is NULL or if the allocation
failed, which may be an important distinction.
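To make option A concrete for the push_back case, here is a minimal sketch, not the library's actual code. The demo_array_t layout, the demo_* function names, and the DEMO_* codes are all assumptions made up for illustration. It shows the two points above: the error travels back through the return value, and a failed realloc leaves the existing contents untouched because the old buffer pointer is only overwritten after realloc succeeds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for c_array_t; the real layout is not shown here. */
typedef struct {
    unsigned char *buffer;   /* element storage */
    size_t size;             /* elements in use */
    size_t capacity;         /* elements allocated */
    size_t type_sz;          /* bytes per element */
} demo_array_t;

/* Hypothetical status codes. */
enum { DEMO_OK = 0, DEMO_ENULL = 1, DEMO_ENOMEM = 2 };

/* Create an empty array with room for at least one element, or NULL. */
demo_array_t *demo_array_new(size_t type_sz, size_t capacity)
{
    demo_array_t *a;
    if (type_sz == 0) return NULL;
    if (capacity == 0) capacity = 1;
    if (capacity > (size_t)-1 / type_sz) return NULL;  /* overflow guard */
    a = malloc(sizeof *a);
    if (!a) return NULL;
    a->buffer = malloc(capacity * type_sz);
    if (!a->buffer) { free(a); return NULL; }
    a->size = 0;
    a->capacity = capacity;
    a->type_sz = type_sz;
    return a;
}

void demo_array_free(demo_array_t *a)
{
    if (a) { free(a->buffer); free(a); }
}

/* Append one element. On any failure the array is left exactly as it was. */
int demo_array_push_back(demo_array_t *a, const void *object)
{
    if (!a || !object) return DEMO_ENULL;
    assert(a->buffer != NULL);   /* library invariant, not user input */
    if (a->size == a->capacity) {
        size_t new_cap;
        unsigned char *tmp;
        /* Refuse to grow if doubling would overflow the byte count. */
        if (a->capacity > (size_t)-1 / 2 / a->type_sz) return DEMO_ENOMEM;
        new_cap = a->capacity * 2;
        /* realloc leaves the old block intact when it fails, so keep the
           result in a temporary and only commit it on success. */
        tmp = realloc(a->buffer, new_cap * a->type_sz);
        if (!tmp) return DEMO_ENOMEM;
        a->buffer = tmp;
        a->capacity = new_cap;
    }
    memcpy(a->buffer + a->size * a->type_sz, object, a->type_sz);
    a->size++;
    return DEMO_OK;
}
```

A caller would then write something like 'if ( demo_array_push_back( numbers, &n ) != DEMO_OK )' and decide locally how to recover. Using separate codes for a NULL argument and an allocation failure is one way around the ambiguity described for c_array_copy, at the cost of giving up the return slot.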
B. Use some global variable that users should set before the
operation and check after the operation, akin to setting errno to
ENOMEM.
One of the issues brought up before is that ENOMEM isn't portable (it
comes from POSIX, not ISO C), so it's not a reliable mechanism to use
in and of itself. However, the concept is usable provided that you
maintain the global state yourself.
In my case, I have my library use a wrapper that maintains a global
state that is accessed using a macro 'c_enomem'. It functions
similarly to errno, but I have control over what it does provided that
my library calls my wrapper.
\code snippet
/* A flag that signals an out-of-memory error has occurred. */
static c_bool gc_private_allocator_enomem = FALSE;
c_bool* gc_error_not_enough_memory( void )
{
return &gc_private_allocator_enomem;
}
#define c_enomem (*gc_error_not_enough_memory())
void* c_malloc( size_t size )
{
void* mem = NULL;
if ( size )
{
mem = gc_private_allocator_ftable.malloc( size );
if ( !mem ) {
gc_private_allocator_enomem = TRUE;
}
}
return mem;
}
\endcode
With this kind of mechanism in place, I can do something like the
following:
\code snippet
c_array_t* numbers = NULL;
int n;
numbers = c_array_new( int, 20 );
/* Fill up numbers array */
c_enomem = FALSE;
c_array_push_back( numbers, &n, int );
if ( c_enomem ) {
fprintf( stderr, "c_array_push_back: allocation failure!\n" );
/* Try to recover or save gracefully if desired */
}
\endcode
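One thing a function table buys is the ability to force the failure path on demand. The following is only a sketch under assumptions: the table layout, demo_set_allocator, and demo_failing_malloc are hypothetical names, since the definition of gc_private_allocator_ftable is not shown above. It mirrors the wrapper and flag from the snippets and swaps in an allocator that always fails.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator table (the real one is not shown in this post). */
typedef struct {
    void *(*alloc_fn)(size_t);
    void  (*free_fn)(void *);
} demo_allocator_ftable_t;

/* Defaults to the standard allocator. */
demo_allocator_ftable_t demo_ftable = { malloc, free };

/* Flag that signals an out-of-memory error has occurred. */
int demo_enomem_flag = 0;

int *demo_error_not_enough_memory(void)
{
    return &demo_enomem_flag;
}
#define demo_enomem (*demo_error_not_enough_memory())

/* Wrapper: sets the flag when the underlying allocator returns NULL. */
void *demo_malloc(size_t size)
{
    void *mem = NULL;
    if (size) {
        mem = demo_ftable.alloc_fn(size);
        if (!mem) {
            demo_enomem_flag = 1;
        }
    }
    return mem;
}

/* Replace the allocator pair; passing NULL here is a programming error. */
void demo_set_allocator(void *(*alloc_fn)(size_t), void (*free_fn)(void *))
{
    assert(alloc_fn != NULL && free_fn != NULL);
    demo_ftable.alloc_fn = alloc_fn;
    demo_ftable.free_fn = free_fn;
}

/* Test double: an allocator that always reports failure. */
void *demo_failing_malloc(size_t size)
{
    (void)size;
    return NULL;
}
```

Calling 'demo_set_allocator( demo_failing_malloc, free )' before an operation makes every allocation inside it fail, so the recovery code guarded by the flag can actually be exercised. Note that a single global flag like this is shared by all callers, which is one of the drawbacks of the scheme.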
This scenario also has its drawbacks, but it's the poison that I've
chosen, particularly for functions that would otherwise have to return
something in addition to an error code.
These are the two main error handling schemes that I'm familiar with
that have the granularity to handle function-level out-of-memory
conditions. Signals and callback handlers are also useful tools, but
I've not been able to figure out a framework that seems to work at
this level. I've only used them as general error handlers if any
allocation fails, rather than at specific locations.
2. Invalid arguments
This is where I really struggle with what is assertable and what is
left to blow up in the user's hands.
Take a function to erase an element out of an array.
void gc_array_erase( c_array_t* array, size_t pos )
{
if ( array )
{
/* what to assert if anything */
if ( pos < array->size )
{
/* erase the element at index 'pos' */
}
}
}
Does the condition 'pos < array->size' constitute a good assert candidate?
There is no possible scenario where a pos >= array->size would ever do
anything, but does the punishment of abort fit the crime in a library
like this? If not, is the silent treatment ok, or should an
out-of-range error be communicated back somehow? I can't see the pros
and cons clearly enough to make a decision and stick to it.
I could see maybe having another layer of assert that takes the middle
ground. If the user wants to assert for conditions when pos is larger
than the array size and other invalid arguments, the library could
have something like the following.
void gc_array_erase( c_array_t* array, size_t pos )
{
if ( array )
{
safe_assert( pos < array->size );
if ( pos < array->size )
{
/* erase the element at index 'pos' */
}
}
}
This would give the users of the library some semblance of choice on
how strictly to apply the assert checking mechanism.
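One possible shape for such a middle layer, sketched under assumptions: the handler type, demo_set_violation_handler, and the fixed-size demo_ints_t are hypothetical and only there to make the example compile. The idea is that the check reports to a user-installed handler instead of calling abort directly, and does nothing when no handler is installed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook: receives the failed expression text and its location. */
typedef void (*demo_violation_fn)(const char *expr, const char *file, int line);

/* NULL means "stay silent", which matches the current library behavior. */
demo_violation_fn demo_violation_handler = NULL;

void demo_set_violation_handler(demo_violation_fn fn)
{
    demo_violation_handler = fn;
}

void demo_report_violation(const char *expr, const char *file, int line)
{
    if (demo_violation_handler) {
        demo_violation_handler(expr, file, line);
    }
}

/* Unlike assert, this never aborts by itself; the handler decides. */
#define demo_safe_assert(cond) \
    ((cond) ? (void)0 : demo_report_violation(#cond, __FILE__, __LINE__))

/* Tiny fixed-capacity array, just enough to demonstrate erase. */
typedef struct { int data[8]; size_t size; } demo_ints_t;

void demo_erase(demo_ints_t *a, size_t pos)
{
    size_t i;
    if (!a) return;
    assert(a->size <= 8);               /* internal invariant */
    demo_safe_assert(pos < a->size);    /* user-argument check */
    if (pos < a->size) {
        for (i = pos + 1; i < a->size; i++) {
            a->data[i - 1] = a->data[i];
        }
        a->size--;
    }
}

/* Example handler that only counts; a strict one could call abort(). */
int demo_violation_count = 0;

void demo_count_violation(const char *expr, const char *file, int line)
{
    (void)expr; (void)file; (void)line;
    demo_violation_count++;
}
```

A user who wants assert-like strictness installs a handler that prints and aborts, while everyone else keeps the silent behavior. The 'if ( pos < a->size )' guard stays in place either way, so a handler that returns never leads to an out-of-range access.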
3. Programming errors
At the library level, do I assert things within my library interface
that the user may have messed up? Take for example inserting an
object into an array in a sorted fashion.
\code snippet
void gc_array_insert_sorted( c_array_t* array, void* object,
c_compare_function_t cmp_fn, size_t type_sz )
{
if ( array )
{
assert( c_is_array_sorted( array, cmp_fn ) );
...
}
}
\endcode
In this scenario, I have a function that verifies that all the
elements are in sorted order. Is this an assertable offense? It's
certainly possible that the user may want to insert something sorted
in the array even though the array itself is not sorted. I don't feel
that I have the right for my library to demand that the array is
sorted via assert, even if the constraint of having a sorted array is
violated. This could be another 'safe_assert' or another level of
assert to verify this property.
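For reference, a sortedness check of this kind might look like the sketch below. It is not the library's c_is_array_sorted; the signature, which takes a raw buffer and element count instead of a c_array_t, and the demo_* names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Comparison callback in the style of qsort: <0, 0, >0. */
typedef int (*demo_compare_fn)(const void *lhs, const void *rhs);

/* Return 1 if the n elements of type_sz bytes each, starting at base,
   are in non-decreasing order according to cmp, else 0. */
int demo_is_sorted(const void *base, size_t n, size_t type_sz,
                   demo_compare_fn cmp)
{
    const unsigned char *p = base;
    size_t i;
    assert(cmp != NULL);
    assert(base != NULL || n == 0);
    for (i = 1; i < n; i++) {
        /* An adjacent pair out of order means the array is not sorted. */
        if (cmp(p + (i - 1) * type_sz, p + i * type_sz) > 0) {
            return 0;
        }
    }
    return 1;
}

/* Example comparator for int elements. */
int demo_compare_int(const void *lhs, const void *rhs)
{
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b);
}
```

The check is a full linear pass over the array on every call, which is another reason I'd hesitate to make it unconditional. Under a plain assert it disappears when NDEBUG is defined, and under an opt-in layer the user decides whether to pay for it.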
There are cases where assert seems to be a good fit. In particular,
if I have a valid array pointer, its buffer had better be valid too.
\code snippet
void gc_array_insert_sorted( c_array_t* array, void* object,
c_compare_function_t cmp_fn, size_t type_sz )
{
if ( array )
{
assert( array->buffer != NULL );
}
}
\endcode
This seems like a great candidate to be used in an assert, since I
specifically designed the interface to always have at least one
element allocated (the 'array->size' can still be zero though if no
elements are inserted into the array).
The last thing is whether it's recommended or not to append a string
to the assert macro to help provide flavor to the condition.
Something like
\code
assert( pos < array->size && "pos outside range: >= array->size" );
\endcode
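As far as I understand it, the idiom works because a string literal converts to a non-null pointer, so the right-hand side of && is always true and the result depends only on the real condition. The diagnostic that assert writes includes the text of its argument, so the string gets printed along with the expression. A small sketch, with demo_at as a hypothetical accessor:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accessor, only here to show the idiom in context. */
int demo_at(const int *data, size_t size, size_t pos)
{
    assert(data != NULL && "data must not be NULL");
    /* The literal never changes the outcome; it only annotates the
       message printed when 'pos < size' is false. */
    assert(pos < size && "pos outside range: >= size");
    return data[pos];
}
```

Like any assert, both checks compile away when NDEBUG is defined, so the string costs nothing in a release build.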
Some things I've been pondering as of late.
Best regards,
John D.