array of size 1

Roman Mashak · Aug 14, 2006

Hello, All!

I've come across code containing this kind of structure:

typedef struct cmd
{
    unsigned int Cmd;
    unsigned int Code;
    unsigned int Data[1];
} CMD;

Further on in the code, the programmer does this:

....
#define MAX_MSGLEN 1600
CMD *tmsg;
int stmsg[MAX_MSGLEN/4+1];

tmsg = (CMD_ACK *)stmsg;
memset(tmsg, 0x0, MAX_MSGLEN);
....

So as far as I understand, we may then address tmsg->Data as having
several elements (many more than 1)? What is the sense and point of
using such a technique? The FAQ says it's more or less standard conformant,
but I couldn't find out how.

Thanks for help.

With best regards, Roman Mashak. E-mail: (e-mail address removed)

jaysome · Aug 14, 2006

Roman said:
tmsg = (CMD_ACK *)stmsg;

My compiler says:

error C2065: 'CMD_ACK' : undeclared identifier

I think that is part of your problem.

Roman Mashak · Aug 14, 2006

Hello, jaysome!
You wrote on Mon, 14 Aug 2006 01:11:44 -0700:

>> typedef struct cmd
>> {
>>     unsigned int Cmd;
>>     unsigned int Code;
>>     unsigned int Data[1];
>> } CMD;
>>
>> Further in the code what programmer does is:
>>
>> ...
>> #define MAX_MSGLEN 1600
>> CMD *tmsg;
>> int stmsg[MAX_MSGLEN/4+1];
>>
>> tmsg = (CMD_ACK *)stmsg;

Sorry, I made a typo. The correct line is:

tmsg = (CMD *)stmsg;

>> memset(tmsg, 0x0, MAX_MSGLEN);

With best regards, Roman Mashak. E-mail: (e-mail address removed)

Sweta · Aug 14, 2006

Such a structure makes the code speak for itself: Cmd is followed by Code,
followed by the Data.
An alternative is to define Data as a pointer to unsigned int:

typedef struct cmd
{
    unsigned int Cmd;
    unsigned int Code;
    unsigned int *Data;
} CMD;

Will that change the way the compiler looks at it?

Sweta
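
A minimal sketch of what changes with the pointer version, assuming the two layouts are kept side by side for comparison. The names CMD_ARR, CMD_PTR, cmd_ptr_new and cmd_ptr_free are hypothetical and do not come from the code under discussion. With a pointer member the struct holds only an address, so the data needs a second allocation and a second free:

```c
#include <assert.h>
#include <stdlib.h>

/* Array member: the data sits in the same block as Cmd and Code. */
typedef struct cmd_arr {
    unsigned int Cmd;
    unsigned int Code;
    unsigned int Data[1];
} CMD_ARR;

/* Pointer member: the struct stores an address; the data lives elsewhere. */
typedef struct cmd_ptr {
    unsigned int Cmd;
    unsigned int Code;
    unsigned int *Data;
} CMD_PTR;

/* Hypothetical helper: allocate the header and n zeroed data elements
 * as two separate blocks. Returns NULL on failure. */
CMD_PTR *cmd_ptr_new(size_t n)
{
    CMD_PTR *c;

    assert(n > 0);
    c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->Cmd = 0;
    c->Code = 0;
    c->Data = calloc(n, sizeof *c->Data);
    if (c->Data == NULL) {
        free(c);
        return NULL;
    }
    return c;
}

/* Hypothetical helper: release both blocks. */
void cmd_ptr_free(CMD_PTR *c)
{
    if (c != NULL) {
        free(c->Data);
        free(c);
    }
}
```

A received message buffer also cannot simply be overlaid with CMD_PTR: the bytes following Code would be interpreted as an address rather than as the data itself.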

Hallvard B Furuseth · Aug 14, 2006

Roman said:
I've met the code containing this kind of structure:

typedef struct cmd
{
    unsigned int Cmd;
    unsigned int Code;
    unsigned int Data[1];
} CMD;

(..code using Data[larger than 0]..)

So as far as I understood, after all, we may address tmsg.Data as
having several elements (much more than 1)? What's actually the sense
and point of using such technique?

Using Data[] rather than *Data can make it simpler to manage the objects
- you only need one malloc() and one free() per object. Also that uses
a bit less memory and fewer function calls.

FAQ says it's more or less standard conformant, but I didn't find how.

This is conformant:

#include <stddef.h>
#include <stdlib.h>
...
CMD *foo = malloc(sizeof(CMD) + sizeof(unsigned int) * whatever);
...
unsigned int *data_ptr =
    (unsigned int *)((char *)foo + offsetof(CMD, Data[0]));
/* use data_ptr[i] with i > 0 */

and this is not:

unsigned int *data_ptr = foo->Data;
/* use data_ptr[i] with i > 0 */

The reason is that the implementation is allowed to "know" that the
expression foo->Data is just a 1-element array, so accessing elements
beyond that via the 'Data' member of CMD can fail. The offsetof hack
avoids that, since it's just an address computation based on foo.
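
The technique described above can be sketched as a pair of helpers. This is only an illustration of the offsetof approach, not a ruling on its conformance; cmd_alloc and cmd_data are hypothetical names, and offsetof(CMD, Data) is used here in place of offsetof(CMD, Data[0]), which yields the same offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct cmd {
    unsigned int Cmd;
    unsigned int Code;
    unsigned int Data[1];
} CMD;

/* Hypothetical helper: allocate a CMD with room for n data elements. */
CMD *cmd_alloc(size_t n)
{
    assert(n >= 1);
    /* sizeof(CMD) already covers Data[0], so add n - 1 further elements. */
    return malloc(sizeof(CMD) + (n - 1) * sizeof(unsigned int));
}

/* Hypothetical helper: compute the data address from the start of the
 * block instead of going through the one-element member foo->Data. */
unsigned int *cmd_data(CMD *foo)
{
    return (unsigned int *)((char *)foo + offsetof(CMD, Data));
}
```

A caller would then write through the returned pointer, for example cmd_data(foo)[5] = 42 after cmd_alloc(8), and release the whole object with a single free(foo).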

In C99, you can replace this hack with a flexible array member:

typedef struct cmd
{
    unsigned int Cmd;
    unsigned int Code;
    unsigned int Data[];
} CMD;

You allocate this similarly to the above, but can access it sensibly,
without the offsetof() hack. Remember to add one to the number of
elements when allocating, since sizeof(CMD) no longer includes one
element of Data.
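
A minimal sketch of the C99 flexible array member version, assuming a C99 or later compiler. CMD_FAM, cmd_fam_alloc and cmd_fam_sum are hypothetical names chosen for the example; the struct mirrors the CMD layout from the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct cmd_fam {
    unsigned int Cmd;
    unsigned int Code;
    unsigned int Data[];    /* flexible array member (C99) */
} CMD_FAM;

/* Hypothetical helper: one allocation for the header plus n elements.
 * sizeof *c does not include any Data element, so all n are added. */
CMD_FAM *cmd_fam_alloc(size_t n)
{
    CMD_FAM *c;

    /* Refuse sizes whose byte count would overflow size_t. */
    if (n > (SIZE_MAX - sizeof *c) / sizeof c->Data[0])
        return NULL;
    c = malloc(sizeof *c + n * sizeof c->Data[0]);
    if (c != NULL) {
        c->Cmd = 0;
        c->Code = 0;
    }
    return c;
}

/* Hypothetical helper: sum the first n elements. The struct does not
 * record its own length, so the caller must pass the allocated count. */
unsigned int cmd_fam_sum(const CMD_FAM *c, size_t n)
{
    unsigned int sum = 0;
    size_t i;

    assert(c != NULL);
    for (i = 0; i < n; i++)
        sum += c->Data[i];
    return sum;
}
```

Elements are accessed directly as c->Data[i], and the whole object is released with one free(c).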

Frederick Gotham · Aug 14, 2006

Roman Mashak posted:

typedef struct cmd
{
    unsigned int Cmd;
    unsigned int Code;
    unsigned int Data[1];
} CMD;

There could be 5,432 bytes of padding between the first two members, and
23,343 bytes of padding between the latter two. Take this into consideration.
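
Whatever padding a given compiler actually inserts can be measured rather than guessed. The following is a small sketch using offsetof; the helper names cmd_pad_before_code, cmd_pad_before_data and cmd_print_layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct cmd {
    unsigned int Cmd;
    unsigned int Code;
    unsigned int Data[1];
} CMD;

/* Padding bytes between the end of Cmd and the start of Code. */
size_t cmd_pad_before_code(void)
{
    return offsetof(CMD, Code) - (offsetof(CMD, Cmd) + sizeof(unsigned int));
}

/* Padding bytes between the end of Code and the start of Data. */
size_t cmd_pad_before_data(void)
{
    return offsetof(CMD, Data) - (offsetof(CMD, Code) + sizeof(unsigned int));
}

/* Print the layout chosen by the compiler in use. */
void cmd_print_layout(void)
{
    /* The first member of a struct is always at offset 0. */
    assert(offsetof(CMD, Cmd) == 0);
    printf("Code at %lu, Data at %lu, sizeof(CMD) = %lu\n",
           (unsigned long)offsetof(CMD, Code),
           (unsigned long)offsetof(CMD, Data),
           (unsigned long)sizeof(CMD));
}
```

On common platforms both padding helpers typically return 0 for three members of the same type, but code that depends on the layout should compute offsets this way instead of assuming them.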

Simon Biber · Aug 15, 2006

Hallvard said:
Roman said:

I've met the code containing this kind of structure:

typedef struct cmd
{
    unsigned int Cmd;
    unsigned int Code;
    unsigned int Data[1];
} CMD;

(..code using Data[larger than 0]..)

So as far as I understood, after all, we may address tmsg.Data as
having several elements (much more than 1)? What's actually the sense
and point of using such technique?

Using Data[] rather than *Data can make it simpler to manage the objects
- you only need one malloc() and one free() per object. Also that uses
a bit less memory and fewer function calls.

FAQ says it's more or less standard conformant, but I didn't find how.

This is conformant:

#include <stddef.h>
#include <stdlib.h>
...
CMD *foo = malloc(sizeof(CMD) + sizeof(unsigned int) * whatever);
...
unsigned int *data_ptr =
(unsigned int *)((char *)foo + offsetof(CMD, Data[0]));
use data_ptr[larger than 0];

and this is not:

unsigned int *data_ptr = foo->Data;
use data_ptr[larger than 0];

The reason is that the implementation is allowed to "know" that the
expression foo->Data is just a 1-element array, so accessing elements
beyond that via the 'Data' member of CMD can fail. The offsetof hack
avoids that, since it's just an address computation based on foo.

Nice try, but I don't think you're right.

The compiler knows that foo is a pointer to CMD. It knows the size of CMD.

foo is a pointer to CMD.

(char*)foo is a pointer to CMD, converted to (char*). This doesn't
necessarily lose the information about the object that it pointed to.

(char*)foo + offsetof(CMD, Data[0]) is a char*, but the compiler may be
able to tell that it points to the Data member of a CMD structure.

data_ptr is an unsigned int*, which a sufficiently advanced compiler may
be able to tell points to the Data member of a CMD structure.

As such, it is within its rights to disallow access to elements through
data_ptr, beyond the first one.

It would be a quite advanced bounds-checking system that could detect
and diagnose this type of error!

There is an approach that is perhaps even more foolproof, and that is to
never associate the void pointer from malloc with the CMD type.

void *foo_void = malloc(sizeof(CMD) + sizeof(unsigned int) * whatever);
CMD *foo_CMD = foo_void;
....
unsigned int *data_ptr =
    (unsigned int *)((char *)foo_void + offsetof(CMD, Data[0]));

Now, I use foo_void to calculate a suitable spot to start storing some
unsigned ints in. This pointer was never associated with the CMD type,
and so the compiler has no business complaining that data_ptr is really
pointing to the Data member of a CMD structure.

And I don't believe there could be any real alignment problems with
this, given that the compiler has guaranteed we can store at least one
unsigned int at the offset of Data.

On the other hand, perhaps there are alignment issues. What if, when
calculating the struct packing, the compiler packed the members of CMD
closer than is allowed for typical unsigned ints, in the knowledge that
it could keep track of when that particular data member was being used
and insert the correct code to grab it byte by byte and reassemble it?

If I circumvent that system by applying the odd byte offset to a void
pointer from malloc, the compiler may not know that it needs to generate
code for an unaligned access, and thus it could fail at runtime.

Are there any compilers for systems that trap unaligned memory access,
that do support "packed" structs through generating special code to pack
and unpack them? If so, would my code above prevent the compiler from
generating the necessary packing and unpacking code for usage of data_ptr?

Hallvard B Furuseth · Aug 17, 2006

Simon said:
Hallvard said:

Roman said:

typedef struct cmd
{
    unsigned int Cmd;
    unsigned int Code;
    unsigned int Data[1];
} CMD;

(...)
This is conformant:
#include <stddef.h>
#include <stdlib.h>
...
CMD *foo = malloc(sizeof(CMD) + sizeof(unsigned int) * whatever);
...
unsigned int *data_ptr =
(unsigned int *)((char *)foo + offsetof(CMD, Data[0]));
use data_ptr[larger than 0];

and this is not:
unsigned int *data_ptr = foo->Data;
use data_ptr[larger than 0];
The reason is that the implementation is allowed to "know" that the
expression foo->Data is just a 1-element array, so accessing elements
beyond that via the 'Data' member of CMD can fail. The offsetof hack
avoids that, since it's just an address computation based on foo.

Nice try, but I don't think you're right.

The compiler knows that foo is a pointer to CMD. It knows the size of
CMD.

Sorry, you are starting off wrong right there. Converting a pointer to
CMD* does not tell the compiler that the pointed-to object is a CMD. If
it did, we could hardly cast pointers back and forth at all.

Basically, C objects must be _accessed_ via an expression which has the
effective type of the object. Just pointing to the object with a
pointer to a different type is OK. There are exceptions, including:
Malloced memory has no unambiguous declared type, so in this case the
effective type of the object is the type of whatever was last stored
there. Access via character types cancels the rules, more or less.
(This is C99 section 6.5 - but the above hack is most relevant in C90,
since that doesn't support 'int Data[];'. Oh well.)

Anyway, the upshot is that ((char*)data_ptr + anything) above is just a
pointer to some memory; the compiler does _not_ know that the contents
are inside a CMD. An array bounds checking implementation can know the
size of the malloced block at runtime - it can use pointers which
consist of (pointer to start of object, offset, object length). And it
can construct such a pointer for Data[] when accessed as the Data
member. But (char*)foo + offsetof(...) avoids that.
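
To make the idea of such a pointer concrete, here is a purely illustrative sketch of a (start of object, offset, object length) triple. No real bounds-checking compiler is claimed to work exactly this way; checked_ptr and its helper functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical checked pointer: start of the object, current offset,
 * and object length, both in bytes. */
typedef struct checked_ptr {
    unsigned char *base;
    size_t offset;
    size_t length;
} checked_ptr;

/* Build a checked pointer covering a whole block, e.g. one from malloc. */
checked_ptr checked_from_block(void *block, size_t length)
{
    checked_ptr p;

    assert(block != NULL);
    p.base = block;
    p.offset = 0;
    p.length = length;
    return p;
}

/* Pointer arithmetic only moves the offset; the bounds travel along. */
checked_ptr checked_add(checked_ptr p, size_t delta)
{
    p.offset += delta;
    return p;
}

/* Nonzero if nbytes may be accessed at the current position. */
int checked_can_access(checked_ptr p, size_t nbytes)
{
    return p.offset <= p.length && nbytes <= p.length - p.offset;
}
```

In this model a pointer derived from the start of the malloced block keeps the length of the whole block, whereas a pointer built from the Data member could be given the length of a single element.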

OTOH, you may end up with the right conclusion after all. If there is
padding behind Data[1] in the struct, and you assign a CMD to *foo, then
maybe the implementation is allowed to notice that data_ptr[1] is not a
valid object - and vice versa, if you assign to data_ptr[1] then *foo is
not valid. So maybe the "struct hack" strictly speaking is only valid
when Data[] has character type, since that cancels the rules. I don't
remember what the standard says about padding bytes, and I'm not up to
digging in standardese now.

There is an approach that is perhaps even more foolproof, and that is to
never associate the void pointer from malloc with the CMD type.

void *foo_void = malloc(sizeof(CMD) + sizeof(unsigned int) * whatever);
CMD *foo_CMD = foo_void;

Right idea, wrong problem.

If my "conforming code" actually isn't, then you can use a CMD* if you
like, but must not make the _malloced object_ start with a CMD. That is,
use offsetof() to access all of CMD's members, never use it directly.
Also you couldn't assign it to a CMD.
Anyway, it gets ugly.

And I don't believe there could be any real alignment problems with
this, given that the compiler has guaranteed we can store at least one
unsigned int at the offset of Data.

Right.

On the other hand, perhaps there are alignment issues.

Nope. If you store the data where the compiler would have stored it,
and initially aligned well (like at the beginning of a malloced block),
then any resulting alignment problem must be because you are emulating a
compiler bug.

What if, when calculating the struct packing, the compiler packed the
members of CMD closer than is allowed for typical unsigned ints,

...then the struct would break when used as an ordinary struct whose
members are accessed as ordinary struct members. So the compiler won't
do that.

If I circumvent that system by applying the odd byte offset to a void
pointer from malloc, the compiler may not know that it needs to generate
code for an unaligned access, and thus it could fail at runtime.

If you deliberately misalign your data, your code is buggy. The compiler
is not required to fix your bugs (as in, "generate code for an unaligned
access"). This is related to the hack with Data[] above.

Hallvard B Furuseth · Aug 17, 2006

Oops... slight typo. I said:
If you deliberately misalign your data, your code is buggy. The compiler
is not required to fix your bugs (as in, "generate code for an unaligned
access"). This is *not* related to the hack with Data[] above.

Chris Torek · Aug 20, 2006

In an aside relating to the "struct hack", and compilers "remembering"
the sizes of struct elements ...

Are there any compilers for systems that trap unaligned memory access,
that do support "packed" structs through generating special code to pack
and unpack them?

Yes, at least two such compilers exist: GCC and Diab.

If so, would my code above prevent the compiler from generating the
necessary packing and unpacking code for usage of data_ptr?

I am not sure which the "code above" meant (I deleted all of them,
but there were at least two major possibilities).

Since GCC is pretty widely available, you can experiment to determine
when its "packed" attribute is maintained and when it is lost. I
(or anyone else who has it) could do the same with Diab -- but this
seems like an exercise in guessing, at best: is there anything that
would require a compiler to be consistent between its treatment of
packing particular structure members, and its treatment of retained
array sizes used for code optimization?

(Diab does something fairly clever: it warns you whenever it
compiles an expression that refers to a "packed" or "volatile"
element in such a way that the compiler notices that it has
"forgotten" the attribute. Of course, depending on how this is
implemented internally -- I have no idea how it works inside --
it seems likely that someone might forget to warn about the
forgetting at some point.)

Simon Biber · Aug 21, 2006

Chris said:
In an aside relating to the "struct hack", and compilers "remembering"
the sizes of struct elements ...

Yes, at least two such compilers exist: GCC and Diab.

I am not sure which the "code above" meant (I deleted all of them,
but there were at least two major possibilities).

Since GCC is pretty widely available, you can experiment to determine
when its "packed" attribute is maintained and when it is lost. I
(or anyone else who has it) could do the same with Diab -- but this
seems like an exercise in guessing, at best: is there anything that
would require a compiler to be consistent between its treatment of
packing particular structure members, and its treatment of retained
array sizes used for code optimization?

From experiment, it's pretty easy to lose the "packed" attribute.

[sbiber@charlie ~/prog/c]$ cat pack.c
#include <stdio.h>
#include <stddef.h>

struct test_t {
    int a;
    char b;
    int c[1];
};

struct test_t test = {10, 20, {30}};

int main(void)
{
    int *p = (int*)((char*)&test + offsetof(struct test_t, c));
    printf("%d\n", p[0]);
    return 0;
}

[sbiber@charlie ~/prog/c]$ gcc pack.c && ./a.out
30

[sbiber@charlie ~/prog/c]$ gcc -fpack-struct pack.c && ./a.out
Bus Error (core dumped)

This raises the question: Does this code have undefined behaviour? If not,
then gcc with -fpack-struct is not a conforming compiler.
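
Independently of the conformance question, one way to read a value from an address that may be misaligned is to copy its bytes out instead of dereferencing an int pointer. This is a minimal sketch; load_int_unaligned is a hypothetical helper, not something from the experiment above.

```c
#include <assert.h>
#include <string.h>

/* Read an int stored at an arbitrary, possibly unaligned, byte address.
 * memcpy copies byte by byte, so no aligned int access is required. */
int load_int_unaligned(const void *src)
{
    int value;

    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}
```

Applied to the test program, the read would become load_int_unaligned((char*)&test + offsetof(struct test_t, c)) rather than p[0].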
