Here are three examples that I would consider to be counter-examples:
1. A 'struct any *' pointer representation that is a simple index.
This could provide a level of indirection into a table. The table
element could have type and bounds information, along with some other
form of address for the pointee. When the representation (a simple
index) is loaded into a 'struct bar *' instead of into a 'struct foo *',
a trap could be generated.
2. A 'struct any *' pointer representation that encodes bounds
information. While the original post "has this covered" because the
bounds of the original pointee encompass the bounds of the members
and sub-members, it's not safe in the general case. When the
representation is loaded into a 'struct bigger *' instead of a 'struct
smaller *', the bounds mismatch could generate a trap.
3. A 'struct any *' pointer representation that encodes type
information, perhaps for the sole purpose of generating a trap when the
representation is loaded into an object of an incompatible pointer type.
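To make counter-examples 1 and 3 concrete, here is a minimal sketch that simulates such an implementation in ordinary C. Everything in it (the 'sim_' names, the tag enum, the fixed-size table) is hypothetical and invented for illustration; a NULL return stands in for the trap that the imagined implementation would raise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags; a real implementation would derive these itself. */
enum type_tag { TAG_FOO, TAG_BAR };

/* One table element: type and bounds information plus the real address. */
struct ptr_entry {
    enum type_tag tag;
    size_t size;      /* bounds information (stored, not checked in this sketch) */
    void *address;    /* "some other form of address for the pointee" */
};

/* The simulated 'struct any *' representation is a simple index. */
typedef unsigned int sim_ptr;

#define TABLE_MAX 16u
static struct ptr_entry table[TABLE_MAX];
static unsigned int table_used;

/* Record a pointee in the table and hand back its index. */
static sim_ptr sim_make(enum type_tag tag, void *address, size_t size)
{
    assert(table_used < TABLE_MAX);
    table[table_used].tag = tag;
    table[table_used].size = size;
    table[table_used].address = address;
    return table_used++;
}

/*
 * Simulate loading the representation into a pointer object whose
 * pointee type is 'want'.  Returns the address, or NULL where the
 * imagined implementation would trap.
 */
static void *sim_load(sim_ptr p, enum type_tag want)
{
    if (p >= table_used)
        return NULL;    /* not a valid index at all */
    if (table[p].tag != want)
        return NULL;    /* type mismatch: 'struct bar *' vs 'struct foo *' */
    return table[p].address;
}
```

With this sketch, an index made with TAG_FOO loads successfully as TAG_FOO and "traps" as TAG_BAR, even though the index itself has the same size and format in both cases.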
It seems clear to me that the size, alignment, argument promotion (none) and
format of 'struct foo *' and 'struct bar *' must be the same, but I
don't yet understand how that ties into compatible types or into
defined behaviour, since C99 6.2.6.1p5 says:
"Certain object representations need not represent a value of the
object type. If the stored value of an object has such a representation
and is read by an lvalue expression that does not have character type,
the behavior is undefined. If such a representation is produced by a
side effect that modifies all or any part of the object by an lvalue
expression that does not have character type, the behavior is
undefined.[footnote 41] Such a representation is called a trap representation."
Why can't a representation that represents a valid 'struct foo *' value
also be a trap representation for 'struct bar *'? For example,
it might be useful to trap when a 'const struct baz *' representation is
read into a 'struct baz *' object. A single bit in the representation would
be sufficient for that. The representation would still be the same, wouldn't it?
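As a rough sketch of that single-bit idea, the following simulates a pointer representation as a 32-bit value whose top bit marks a const pointee. The 'sim_' names, the 32-bit width and the choice of bit are all assumptions made for illustration, not a description of any real implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated pointer representation: an address field plus one flag bit. */
typedef uint32_t sim_rep;

/* Hypothetical "pointee is const" flag in the top bit. */
#define SIM_CONST_BIT ((sim_rep) 0x80000000u)

/* Build a representation from a simulated address below 2^31. */
static sim_rep sim_encode(uint32_t address, int pointee_is_const)
{
    assert((address & SIM_CONST_BIT) == 0);
    return pointee_is_const ? (address | SIM_CONST_BIT) : address;
}

/* Would reading 'rep' through a 'struct baz *' lvalue trap? */
static int sim_traps_as_nonconst(sim_rep rep)
{
    return (rep & SIM_CONST_BIT) != 0;
}

/* Reading through a 'const struct baz *' lvalue accepts either setting. */
static int sim_traps_as_const(sim_rep rep)
{
    (void) rep;
    return 0;
}
```

Both pointer types share one format here (same width, same bit positions, same meaning for each bit), yet the same bit pattern is a valid value for one type and a trap for the other.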
Example #1: "...interchangeability as arguments to functions..."
/* libbaz.h */
typedef void f_baz_callback(structptr_t);
extern void BazFunc(f_baz_callback * Callback, structptr_t StructPtr);
/* libbaz.c */
typedef struct any * structptr_t;
#include "libbaz.h"
void BazFunc(f_baz_callback * callback, structptr_t sptr) {
  /*
   * 'struct any' is an incomplete object type.
   * Trap representations are more limited than if it were
   * a complete object type.
   *
   * A trap representation for _any_ pointer type could
   * still be present.  A trap representation for _any_
   * 'struct XXX *' could still be present.
   *
   * A trap representation based on bounds could still be present
   * if 'sptr' is non-null, but somehow indicates 0 bytes
   * of storage, or some other invalid value.
   *
   * A trap representation based on lifetime could still be
   * present.  Same with 'const'-ness.
   *
   * etc.
   *
   * foo.c and bar.c have a different type for 'sptr', but
   * since the representation is the same, there's no problem.
   */
  callback(sptr);
}
/* foo.c */
typedef struct s_foo * structptr_t;
#include "libbaz.h"
struct s_foo {
  int i;
};
f_baz_callback foo_callback;
void foo_callback(structptr_t sptr) {
  sptr->i = 42;
}
void foo_func(void) {
  struct s_foo foo;
  BazFunc(foo_callback, &foo);
}
/* bar.c */
typedef struct s_bar * structptr_t;
#include "libbaz.h"
struct s_bar {
  double d;
};
f_baz_callback bar_callback;
void bar_callback(structptr_t sptr) {
  sptr->d = 3.14159;
}
void bar_func(void) {
  struct s_bar bar;
  BazFunc(bar_callback, &bar);
}
Example #2: "...and members of unions."
/* libnextgen.h version 1.0 */
struct apple;
struct orange;
union u_dyn_obj {
  struct apple * apple;
  struct orange * orange;
};
extern void NextGenFunc(union u_dyn_obj DynamicObject);
/* libnextgen.h version 2.0 */
struct apple;
struct orange;
struct dog;
struct cat;
union u_dyn_obj {
  struct apple * apple;
  struct orange * orange;
  struct dog * dog;
  struct cat * cat;
};
extern void NextGenFunc(union u_dyn_obj DynamicObject);
/* user.c */
#include "libnextgen.h"
void UserFunc(void) {
  /* 'struct apple' is incomplete here, so the object can only be declared */
  extern struct apple apple;
  union u_dyn_obj dyn_obj;
  /*
   * It doesn't matter which version of the header we
   * were built with, _nor_ which version of the library
   * is installed, because the representation (and thus
   * size) and alignment are always going to be the same.
   *
   * We only work with apples and oranges, but 2.0's
   * support for dogs and cats doesn't affect us.
   */
  dyn_obj.apple = &apple;
  NextGenFunc(dyn_obj);
}
Example #3: "... Such a representation is called a trap representation."
/* hmmm1.c */
#include <stdlib.h>
#include <stdio.h>
struct s_smaller {
  char arr[4];
};
struct s_bigger {
  char arr[sizeof (struct s_smaller)];
  double d;
};
int main(void) {
  void * storage;
  struct s_smaller * smaller;
  /* Allocate enough storage for an s_smaller */
  storage = calloc(1, sizeof (struct s_smaller));
  if (!storage)
    return 0;
  smaller = storage;
  /*
   * Problem #3.1: Although the representation is
   * the same for both types, the value cannot
   * point to an s_bigger due to insufficient storage.
   * There's enough storage for arr, but that's
   * irrelevant.
   */
  (*((struct s_bigger **) &smaller))->arr[0] = 'C';
  printf("Result: %s\n", (char *) storage);
  return 0;
}
Example #4: "... Such a representation is called a trap representation."
/* hmmm2.c */
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
struct s_smaller {
  char arr[4];
};
struct s_bigger {
  char arr[sizeof (struct s_smaller)];
  double d;
};
union u_of_ptrs {
  struct s_smaller * smaller;
  struct s_bigger * bigger;
};
void discard_provenance(
  union u_of_ptrs * left,
  union u_of_ptrs * right,
  union u_of_ptrs * combined
);
int main(void) {
  void * storage;
  union u_of_ptrs first, first_backup, second, third;
  /* Allocate enough storage for an s_bigger */
  storage = calloc(1, sizeof (struct s_bigger));
  if (!storage)
    return 0;
  /* Plenty of storage for an s_smaller */
  first.smaller = storage;
  /* Backup */
  memcpy(&first_backup, &first, sizeof first_backup);
  /* Free some storage */
  storage = realloc(storage, sizeof (struct s_smaller));
  if (!storage)
    return 0;
  /* Right amount of storage */
  second.smaller = storage;
  /* Compare the representations */
  if (memcmp(&first_backup, &second, sizeof first_backup))
    return 0;
  /* Discard any "provenance" for a later test */
  discard_provenance(&first_backup, &second, &third);
  /*
   * Problem #4.1: second.bigger cannot point to an
   * s_bigger, as there's insufficient storage.
   * There's storage enough for arr, but that's
   * irrelevant.
   */
  second.bigger->arr[0] = '1';
  /*
   * Problem #4.2: Same problem with first.bigger, even
   * though its "provenance" was from the earlier allocation.
   */
  first.bigger->arr[1] = '2';
  /*
   * Problem #4.3: Same problem with third.bigger, even
   * though its "provenance" has been discarded.
   */
  third.bigger->arr[2] = '3';
  printf("Result: %s\n", (char *) storage);
  return 0;
}
void discard_provenance(
  union u_of_ptrs * left,
  union u_of_ptrs * right,
  union u_of_ptrs * combined
) {
  unsigned char * lp = (void *) left;
  unsigned char * rp = (void *) right;
  unsigned char * cp = (void *) combined;
  unsigned char * end = (void *) (combined + 1);
  while (cp < end)
    *cp++ = *lp++ & *rp++;
}
I would certainly appreciate a C99- or C11-conforming implementation that
is able to catch the problems of examples #3 & #4. One way would be to
deem a representation a trap representation for one object type but not
for another, where the types are not compatible.
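A rough sketch of how bounds carried in the representation could catch example #3 follows. It is a simulation in portable C, not a real implementation: 'struct fat_ptr', 'fat_alloc' and 'fat_load' are hypothetical names, and a NULL return stands in for the trap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct s_smaller {
    char arr[4];
};
struct s_bigger {
    char arr[sizeof (struct s_smaller)];
    double d;
};

/* Simulated pointer representation: an address plus the extent of its storage. */
struct fat_ptr {
    void *address;
    size_t extent;
};

/* Allocate zeroed storage and remember how much there is. */
static struct fat_ptr fat_alloc(size_t size)
{
    struct fat_ptr p;
    p.address = calloc(1, size);
    p.extent = p.address ? size : 0;
    return p;
}

/*
 * Simulate reading the representation as a pointer to a type of
 * 'pointee_size' bytes.  Returns the address if the pointee fits in
 * the recorded storage, or NULL where the imagined implementation
 * would trap.
 */
static void *fat_load(struct fat_ptr p, size_t pointee_size)
{
    if (p.address == NULL || p.extent < pointee_size)
        return NULL;
    return p.address;
}
```

Storage allocated with sizeof (struct s_smaller) loads fine as an s_smaller pointee but is rejected as an s_bigger pointee, which is the check that Problem #3.1 would need. The caller remains responsible for calling free on the address.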
My interpretation of "same representation and alignment requirements"
for struct pointer types is along the lines of:
- If there are padding bits in one, there are padding bits at the
same positions in the other
- If there are parity bits in one, there are parity bits at the same
positions in the other
- If a segment is encoded in one, then it is encoded in the same way
in the other
- If type information is encoded in one, then it is encoded in the
same way in the other
- If bounds information is encoded in one, then it is encoded in the
same way in the other
- If lifetime/duration information is encoded in one, then it is
encoded in the same way in the other
- etc.
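The size and alignment parts of this interpretation can at least be checked mechanically. The following is a small sketch assuming a C11 compiler (for _Static_assert and _Alignof); 'struct foo', 'struct bar' and the two helper functions are placeholders for illustration. It says nothing about the other items in the list, which are not observable from portable code.

```c
#include <assert.h>
#include <stddef.h>

struct foo { int i; };
struct bar { double d; };

/* Compile-time checks: both pointer types have the same size and alignment. */
_Static_assert(sizeof (struct foo *) == sizeof (struct bar *),
               "struct pointer sizes differ");
_Static_assert(_Alignof (struct foo *) == _Alignof (struct bar *),
               "struct pointer alignments differ");

/* The same checks as run-time predicates, returning 1 when they hold. */
static int ptr_size_matches(void)
{
    return sizeof (struct foo *) == sizeof (struct bar *);
}

static int ptr_align_matches(void)
{
    return _Alignof (struct foo *) == _Alignof (struct bar *);
}
```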
Since this interpretation supports the fair examples #1 & #2 as well as
the more contrived examples #3 & #4, I fail to understand the benefit of
adopting a more restrictive interpretation, which seemingly prevents an
implementation from catching the problems of #3 and #4, perhaps by means
of trap representations. But perhaps I've misunderstood.
- Shao Miller