struct named 0

  • Thread starter Mohd Hanafiah Abdullah

Mohd Hanafiah Abdullah

Is the following code conformant to ANSI C?

typedef struct {
int a;
int b;
} doomdata;

int main(void)
{
int x;

x = (int)&((doomdata*)0)->b;
printf("x=%d\n", x);
return x;
}

The part,
(int)&((doomdata*)0)->b;

is it conformant to ANSI C? What is it supposed to do?

Thanks for any tips.

Napi
 

Michael Mair

Mohd said:
Is the following code conformant to ANSI C?

typedef struct {
int a;
int b;
} doomdata;

int main(void)
{
int x;

x = (int)&((doomdata*)0)->b;
printf("x=%d\n", x);
return x;
}

The part,
(int)&((doomdata*)0)->b;

is it conformant to ANSI C? What is it supposed to do?

According to prior discussions, it is at least shady but
probably not conformant, depending on whether the address
0 is dereferenced or not.
This is a version of the offsetof macro from <stddef.h>:
offsetof(doomdata,b)
gives you the byte offset of b in doomdata. That means that
if we have
doomdata d;
unsigned char *p = (unsigned char *)&d;
then the address of d.b is p+offsetof(doomdata,b).
In contrast to offsetof which expands to an expression of
type size_t, the above will give you an int.

Use offsetof.


Cheers
Michael
 

S.Tobias

Mohd Hanafiah Abdullah said:
typedef struct {
int a;
int b;
} doomdata;

is it conformant to ANSI C?

IMO no, because it doesn't point to any valid object, so
invokes UB.

Question to others:
Would this be correct?
(int)&((doomdata*)0)->a;
 

Joona I Palaste

According to prior discussions, it is at least shady but
probably not conformant, depending on whether the address
0 is dereferenced or not.
This is a version of the offsetof macro from <stddef.h>:
offsetof(doomdata,b)
gives you the byte offset of b in doomdata. That means that
if we have
doomdata d;
unsigned char *p = (unsigned char *)&d;
then the address of d.b is p+offsetof(doomdata,b).
In contrast to offsetof which expands to an expression of
type size_t, the above will give you an int.
Use offsetof.

To be more specific, offsetof may be defined as the version mentioned by
the OP, but it doesn't have to be. However, whatever offsetof is defined
as, it is guaranteed to always be legal and valid code for that
particular implementation. The version mentioned by the OP may or may
not be valid and legal code, depending on the implementation.
 

Malcolm

Mohd Hanafiah Abdullah said:
The part,
(int)&((doomdata*)0)->b;

is it conformant to ANSI C? What is it supposed to do?
This is the offsetof() macro, which calculates the offset of a structure
member relative to the struct origin.
On most platforms it works, but it suffers a few problems on unusual
systems. For instance, if NULL is not all bits zero, or if there are trap
pointer representations, then the method won't work naturally. So I believe
that in C89 it is not required to work.
 

Jack Klein

According to prior discussions, it is at least shady but
probably not conformant, depending on whether the address
0 is dereferenced or not.

No, it has undefined behavior, and it has nothing to do with the "address
0", but with a null pointer. C is defined in terms of an abstract
machine, and in the abstract machine this expression dereferences a
null pointer.
This is a version of the offsetof macro from <stddef.h>:
offsetof(doomdata,b)
gives you the byte offset of b in doomdata. That means that
if we have
doomdata d;
unsigned char *p = (unsigned char *)&d;
then the address of d.b is p+offsetof(doomdata,b).
In contrast to offsetof which expands to an expression of
type size_t, the above will give you an int.

Use offsetof.

Yes, indeed. The implementation's offsetof() macro might indeed
contain similar code, excepting the cast to int. But the
implementation is not constrained to follow the rules of the abstract
machine, only user programs are.
 

Jack Klein

IMO no, because it doesn't point to any valid object, so
invokes UB.

Question to others:
Would this be correct?
(int)&((doomdata*)0)->a;

Technically it is still undefined behavior, as the semantics of the
expression dereference a null pointer.
 

Jack Klein

This is the offsetof() macro, which calculates the offset of a structure
member relative to the struct origin.

No, it can't be the offsetof() macro. This expression specifically
yields an int, whereas the offsetof() macro yields a size_t. This
expression yields undefined behavior.
On most platforms it works, but it suffers a few problems on unusual
systems. For instance, if NULL is not all bits zero, or if there are trap
pointer representations, then the method won't work naturally. So I believe
that in C89 it is not required to work.

When will people get it through their heads that it makes not a whit
of difference whether all bits zero happens to be a representation of
a null pointer on a particular platform?

The correspondence between an integer constant expression evaluating
to 0 and a null pointer is exactly the same as quite a few other
constant expressions, that is something that is evaluated and
substituted at compile-time and has nothing to do with run-time
values.

The conversion of a quoted string to an array of '\0' terminated
values is a compile-time conversion.

The conversion of escape sequences like '\n' and '\t' to their single
character equivalents in string or character constants is a
compile-time conversion.

The conversion of an integer constant expression evaluating to 0, or
such an expression cast to the type pointer-to-void, to a null pointer
constant is a compile-time conversion, not a run-time one.

If the one and only representation for a null pointer in a particular
implementation is 0xDEADBEEF, and such a value is returned by a call
to fopen(), for example, a comparison of that value against NULL or 0
will still be true.
 

Jack Klein

Is the following code conformant to ANSI C?

typedef struct {
int a;
int b;
} doomdata;

int main(void)
{
int x;

x = (int)&((doomdata*)0)->b;

No, the line above invokes undefined behavior, because it dereferences
a null pointer.
printf("x=%d\n", x);
return x;
}

The part,
(int)&((doomdata*)0)->b;

is it conformant to ANSI C? What is it supposed to do?

No, it is most certainly not conforming. It might have been written
as an example, or it might have been written by someone who doesn't
know that the offsetof(struct_type,member_name) macro defined in
 

Malcolm

Jack Klein said:
No, it can't be the offsetof() macro. This expression specifically
yields an int, whereas the offsetof() macro yields a size_t. This
expression yields undefined behavior.
This is true. int is not guaranteed to be big enough to hold the offset of a
structure element (!).
When will people get it through their heads that it makes not a whit
of difference whether all bits zero happens to be a representation of
a null pointer on a particular platform?
Unfortunately it does make a difference.
For instance consider

char **list = calloc(N, sizeof(char *));

for(i=0;i<N;i++)
if( somecondition() )
list[i] = malloc(10);

for(i=0;i<N;i++)
free(list[i]);

Also consider if a null pointer is cast to an integral type, or if a pointer
derived by adding an offset to the null pointer is cast to an integer. This
cast is a simple bitwise conversion, so results will differ on a platform on
which NULL is not all bits zero.

However char *ptr = 0; will always set ptr to NULL, regardless of whether
the null pointer is all bits zero. Here you are correct.
 

Martin Ambuhl

Mohd said:
The part,
(int)&((doomdata*)0)->b;

is it conformant to ANSI C? What is it supposed to do?
#include <stdio.h>
#include <stddef.h>

typedef struct
{
int a;
int b;
} doomdata;

int main(void)
{
int x;
#if 0
/* mha: the following is an attempt to mimic offsetof on an
implementation not having offsetof in <stddef.h> */
x = (int) &((doomdata *) 0)->b;
#endif
x = offsetof(doomdata, b);
printf("x=%d\n", x);

return 0; /* mha: returning x, when x is not one
of 0, EXIT_SUCCESS, or EXIT_FAILURE
is at best implementation-defined */
}
 

S.Tobias

Technically it is still undefined behavior, as the semantics of the
expression dereference a null pointer.

(This deserves a separate thread, but since I asked the above
question here, I'll continue here too.)

As I understand it, the expression constitutes an access to the structure
member.

1. Does a member access constitute an access to the *whole* structure?
eg.:
struct A { int i; int _i; };
struct B { int i; float f; };
struct A a = {0};
struct B *pb = (struct B*)&a;
pb->i; //UB?
(*pb).i; //UB?
Do I access the first int sub-object in `a' only, or do I access
the whole object `a'?

2. I see certain similarity between structs and arrays (in fact,
both are called "aggregates").
Why is it that for array:
&a[5];
doesn't constitute object access (6.5.3.2#3), whereas for struct:
&s.m;
&ps->m;
the expressions do constitute access?
Why is the language designed like this?
 

xarax

Jack Klein said:
No, it has undefined behavior, and it has nothing to do with the "address
0", but with a null pointer. C is defined in terms of an abstract
machine, and in the abstract machine this expression dereferences a
null pointer.
/snip/

It does not dereference any pointer. The & cancels out
the ->.

btw: If the dereference actually happened, how could the
compiler figure out the address of an int that was produced
by the dereference?
 

Christian Bau

Malcolm said:
Also consider if a null pointer is cast to an integral type, or if a pointer
derived by adding an offset to the null pointer is cast to an integer. This
cast is a simple bitwise conversion, so results will differ on a platform on
which NULL is not all bits zero.

Not quite; a conversion from a pointer type to an integer does whatever
the implementation thinks is a good idea; usually the bits of the
representation are copied unchanged, but that is not necessarily so.

In C99, there seems to be an actual requirement that casting an integer
zero to a pointer type will produce a null pointer. If null pointers
don't have all bits zero, and an integer 0 has all bits zero, and
converting an integer 0 to a pointer produces a null pointer, then logic
says that this conversion cannot leave the bits unchanged.
 

Malcolm

Christian Bau said:
In C99, there seems to be an actual requirement that casting an integer
zero to a pointer type will produce a null pointer. If null pointers
don't have all bits zero, and an integer 0 has all bits zero, and
converting an integer 0 to a pointer produces a null pointer, then logic
says that this conversion cannot leave the bits unchanged.
But in this case we are not casting an integer 0 to a pointer, but a null
pointer to an integer.
 

Malcolm

xarax said:
It does not dereference any pointer. The & cancels out
the ->.
However, it is illegal even to load a non-valid pointer, except NULL of
course, into a pointer variable.
One reason is that some platforms have hardware traps that trigger an error
when illegal values are loaded into address registers. Another reason is
that sometimes with segmented architectures you can get strange results. For
instance if char *ptr points to the beginning of a segment, then ptr - 1
might compare to greater than ptr. The address of NULL + a few bytes is an
illegal value, and I don't believe that the cast to an int makes the
expression a legal one, though I am open to correction on this.
 

Jack Klein

(This deserves a separate thread, but since I asked the above
question here, I'll continue here too.)

As I understand it, the expression constitutes an access to the structure
member.

1. Does a member access constitute an access to the *whole* structure?
eg.:

The term 'access' is really only used in the C standard in conjunction
with the volatile qualifier, where the wording is unfortunately vague
enough that it can be construed several different ways.
struct A { int i; int _i; };
struct B { int i; float f; };
struct A a = {0};
struct B *pb = (struct B*)&a;
pb->i; //UB?
(*pb).i; //UB?

The two statements above actually have defined behavior, but not for
the reason you might think. The language guarantees that a pointer to
structure, suitably cast, is also a pointer to its first member. So
given that 'pb' holds any of the following:

- a pointer to any structure type whose first member is an int

- a pointer to an array of ints

- a pointer to a single int

...and the int pointed to has a valid value, the expressions, though
not recommended, will work as designed. The compiler must generate
code equivalent to *(int *)pb, and if there is actually an int there
all is well.
Do I access the first int sub-object in `a' only, or do I access
the whole object `a'?

Now that depends what you mean by access. Let's assume a Pentium
(1/2/3/4) platform with a typical compiler, which means the size of
either of your structures is 8 8-bit bytes. Now let's also assume
that this implementation allocates all structures on an address evenly
divisible by 8, a not uncommon performance feature of such
implementations.

With the assumptions above, if the processor needs to access the
structure from memory it will perform a 64-bit access physically, so
even though your code does not direct the abstract machine to touch
the value of other members in any way, the entire memory space holding
the structure will be physically read.
2. I see certain similarity between structs and arrays (in fact,
both are called "aggregates").
Why is it that for array:
&a[5];
doesn't constitute object access (6.5.3.2#3), whereas for struct:
&s.m;
&ps->m;
the expressions do constitute access?
Why is the language designed like this?

There are actually more differences than similarities between structs
and arrays, despite the fact that both are aggregates. Structs are
first class objects, meaning they can be assigned, passed to and
returned from functions by value, and their names are never implicitly
converted to pointers. Arrays are not first class objects and do not
share any of the characteristics above.

As for other differences in this particular case, this is spelled out
by paragraph 3 of 6.5.3.2 of C99:

[begin quotation]
The unary & operator returns the address of its operand. If the
operand has type "type", the result has type "pointer to type". If
the operand is the result of a unary * operator, neither that operator
nor the & operator is evaluated and the result is as if both were
omitted, except that the constraints on the operators still apply and
the result is not an lvalue. Similarly, if the operand is the result
of a [] operator, neither the & operator nor the unary * that is
implied by the [] is evaluated and the result is as if the & operator
were removed and the [] operator were changed to a + operator.
Otherwise, the result is a pointer to the object or function
designated by its operand.
[end quotation]

Note the differences between applying '&' to the result of a '*'
operator and to the result of a '[]' operator. In the former case,
neither '&' nor '*' are evaluated as such, but note "the constraints
on the operators still apply".

Now let's back up to paragraph 1 of 6.5.3.2, which lists the
constraints for the unary '&' operator:

[begin quotation]
The operand of the unary & operator shall be either a function
designator, the result of a [] or unary * operator, or an lvalue that
designates an object that is not a bit-field and is not declared with
the register storage-class specifier.
[end quotation]

Notice that the expression under discussion,

(int)&((doomdata*)0)->a;

...is none of these things. Specifically, the operand of the '&'
operator, '((doomdata*)0)->a' is:

- not a function designator

- not the result of a [] operator

- not the result of a unary * operator

- and, because of the null pointer, not an lvalue

Finally consider one last thing, namely that regardless of whether
there is an actual access to an object, the expression explicitly
performs pointer arithmetic on a null pointer, and such use of a null
pointer is undefined in and of itself.
 

Jack Klein

Not quite; a conversion from a pointer type to an integer does whatever
the implementation thinks is a good idea; usually the bits of the
representation are copied unchanged, but that is not necessarily so.

In C99, there seems to be an actual requirement that casting an integer
zero to a pointer type will produce a null pointer. If null pointers
don't have all bits zero, and an integer 0 has all bits zero, and
converting an integer 0 to a pointer produces a null pointer, then logic
says that this conversion cannot leave the bits unchanged.

I am unsure of your meaning here. Do you mean an integral constant at
the source level, as in:

double *dp = (double *)0;

...or do you actually mean the value of an integer object, as in:

int x = 0;
double *dp = (double *)x;

If you mean the latter, I have never noticed anything in C99 requiring
the result be a null pointer.

Could you please clarify and include chapter & verse?

Thanks.
 

Jack Klein

This is true. int is not guaranteed to be big enough to hold the offset of a
structure element (!).
Unfortunately it does make a difference.
For instance consider

char **list = calloc(N, sizeof(char *));

What is the above supposed to prove? It most specifically does not
guarantee to set the N pointers in list to NULL. Setting all bits to
0 with calloc() or memset() is a run-time operation and has nothing at
all to do with the compiler-time conversion of constant expressions to
null pointer constants.
for(i=0;i<N;i++)
if( somecondition() )
list[i] = malloc(10);

for(i=0;i<N;i++)
free(list[i]);


This code is badly broken, what is the point?
Also consider if a null pointer is cast to an integral type, or if a pointer
derived by adding an offset to the null pointer is cast to an integer. This
cast is a simple bitwise conversion, so results will differ on a platform on
which NULL is not all bits zero.

Again, what is your point? Casting a pointer, null or not, to an
integral type, is defined if and only if there is an integral type
large enough, but let's assume and bypass that. The operation of the
cast is completely implementation-defined, thus automatically neither
portable nor strictly conforming. If you do such a thing, it is up to
you to take such implementation-defined behavior into account as a
part of anything and everything you do with those integers.

As for adding an offset to a null pointer, that is undefined, period.
However char *ptr = 0; will always set ptr to NULL, regardless of whether
the null pointer is all bits zero. Here you are correct.

Yes, because this is a compile-time conversion, which happens during
source code translation. This is no more or less remarkable than the
fact that:

char a1 [3] = "\n";
char a2 [3];
a2[0] = '\\';
a2[1] = 'n';
a2[2] = '\0';

...will produce two strings that will not test equal under any string
or memory comparison function.
 

xarax

Malcolm said:
However, it is illegal even to load a non-valid pointer, except NULL of
course, into a pointer variable.
One reason is that some platforms have hardware traps that trigger an error
when illegal values are loaded into address registers. Another reason is
that sometimes with segmented architectures you can get strange results. For
instance if char *ptr points to the beginning of a segment, then ptr - 1
might compare to greater than ptr. The address of NULL + a few bytes is an
illegal value, and I don't believe that the cast to an int makes the
expression a legal one, though I am open to correction on this.

Notwithstanding what you wrote, the compiler will not
generate a load into an address register, because the
& cancels out the ->. The compiler has all the information
it needs to resolve the expression to a constant offset
value.
 
