problem with sizeof

  • Thread starter Walter Roberson
  • Start date
W

Walter Roberson

typedef struct USB_iobuf_common_header_s {
unsigned char function;
unsigned char size_lo;
unsigned char size_hi;
} USB_iobuf_common_header;
typedef struct USB_iobuf_i2c_header_s {
unsigned char addr;
unsigned char subaddr_lo;
unsigned char subaddr_hi;
unsigned char datasize_lo;
unsigned char datasize_hi;
} USB_iobuf_i2c_header;
reg = data[sizeof(USB_iobuf_common_header) +
sizeof(USB_iobuf_i2c_header)];
At this point I expect to receive data in 'reg', but it's a junk I'm
getting. What's the problem with that?

C compilers are permitted to (and in practice many do) put trailing
padding in structures so that arrays of the structures would have
the elements align properly. The sizes you would naively expect
from the above structures is 3 characters and 5 characters respectively,
but it would not be at all unusual for padding to be tossed in causing
the first structure to be 4 characters and the second to be 8 characters.
You probably intend that 'addr' follows immediately after 'size_hi'
in the overall message array, but an indeterminate amount of padding
could occur between them. sizeof() a structure is the size including
any extra padding, rather than the sums of the sizes of the visible
elements.

Probably what you will have to end up doing is using a union
after size_hi .
 
M

Mike Wahler

Roman Mashak said:
Hello,

[OT]

[/OT]

In the program I'm working on I receive a data stream, having structure as
follows:

| common_header | I2C_header | data ... |

My purpose is to parse the 'common_header', and read the data following
it. This is my code snippet:

#define MSG_SIZE 1000

enum ctrl_cmds {
CMD_ECHO = 0x0,
CMD_I2C_WRITE,
CMD_I2C_READ,
CMD_SUBREG_READ,
CMD_SUBREG_WRITE,
CMD_IC_SWRESET /* software reset */
};

/* transport interface header */
typedef struct USB_iobuf_common_header_s {
unsigned char function;
unsigned char size_lo;
unsigned char size_hi;
} USB_iobuf_common_header;

typedef struct USB_iobuf_i2c_header_s {
unsigned char addr;
unsigned char subaddr_lo;
unsigned char subaddr_hi;
unsigned char datasize_lo;
unsigned char datasize_hi;
} USB_iobuf_i2c_header;

char data[MSG_SIZE];
char reg;
...

/* get data from USB into buffer. Data points to beginning of stream */
CDC_Read(..., data, MSG_SIZE);

/* parse header */
USB_iobuf_common_header *hdr = (USB_iobuf_common_header *)data;
switch (hdr->function) {
case CMD_I2C_READ:
...
break;
case CMD_I2C_WRITE:
reg = data[sizeof(USB_iobuf_common_header) +
sizeof(USB_iobuf_i2c_header)];
break;

At this point I expect to receive data in 'reg', but it's a junk I'm
getting. What's the problem with that?

A guess:

Often a compiler will insert 'padding' bytes in a structure,
making 'sizeof()' report a size larger than the sum of its
members. This is usually done in the interest of performance,
and sometimes because of alignment requirements for the target
platform. Since you're using the result of sizeof() to compute
an offset, this could skew your anticipated results.

If this turns out to be the case, check your compiler documentation,
it may have a switch or #pragma to allow you to configure this padding.

-Mike
 
D

Dinesh P

Hello Mr.Roman,
I can visualize two issues in this code.
1. Padding of Structures(as explained by others )
2. In C the array is indexed from '0'.
The byte should be accessed like
data[sizeof(USB_iobuf_common_header)+sizeof(USB_iobuf_i2c_header) -1]

Regards,
Dinesh
 
R

Ravishankar S

Roman Mashak said:
Hello, Roman!
You wrote to All on Wed, 2 Jan 2008 09:04:22 -0800:

[skip]
RM> /* get data from USB into buffer. Data points to beginning of stream */
RM> CDC_Read(..., data, MSG_SIZE);

RM> /* parse header */
RM> USB_iobuf_common_header *hdr = (USB_iobuf_common_header *)data;

By the way, my compiler (arm-gcc 4.2.1) spits out a warning on the above
line: "cast increases required alignment of target type". Is this because
'data' and 'USB_iobuf_common_header' are of different types ?

I know in most cases casting is a wrong way and should be avoided, but I
can't see another method to extract specific field from header. Any
suggestions?

Thanks a lot.

With best regards, Roman Mashak. E-mail: (e-mail address removed)

I guess the problem is with alignment and padding of structs. The
USB_iobuf_common_header is made the size 4 by compiler and its
alignment is set at 4 (for efficiency probably ?). For USB_iobuf_i2c_header
its made size 8 bytes and 8-byte alignment.

The "data" array has 1 byte aligment, so naturally the compiler gives the
warning of mismatched alignment (necessary because of ARM architecture..)

Becasue of the change in expected sizes this line:
reg = data[sizeof(USB_iobuf_common_header) +

does not work as expected.

Solution: Use the GNU specific (non Standard) __attribute__((packed)) for
the structs. I dont know if this may increase the code size..
 
R

Richard Tobin

MW> Often a compiler will insert 'padding' bytes in a structure,
MW> making 'sizeof()' report a size larger than the sum of its
MW> members. This is usually done in the interest of performance,
MW> and sometimes because of alignment requirements for the target
MW> platform. Since you're using the result of sizeof() to compute
MW> an offset, this could skew your anticipated results.
[OT]
I'm actually receving the data stream via USB interface from application
compiled by MS VisualStudio compiler. The recepient of data stream is Atmel
At91 ARM-based controller running firmware compiled by arm-gcc port.
[/OT]
Does this mean, I need to configure padding on both sides of connection?

No, it means you shouldn't expect to be able to read and write data
directly as structs in a portable program. Instead of reading a
struct, read the individual parts of it. You may also need to address
byte-order problems unless the two ends are known to be compatible.
MW> If this turns out to be the case, check your compiler documentation,
MW> it may have a switch or #pragma to allow you to configure this padding.

If portability isn't a requirement, then using a pragma to remove
padding may be the simplest solution.

-- Richard
 
R

Richard Tobin

You may also need to address
byte-order problems unless the two ends are known to be compatible.

Looking back at your original post i see it only has single-byte
values, so this is not a issue.

-- Richard
 
B

Ben Bacarisse

Roman Mashak said:
In the program I'm working on I receive a data stream, having structure as
follows:
/* transport interface header */
typedef struct USB_iobuf_common_header_s {
unsigned char function;
unsigned char size_lo;
unsigned char size_hi;
} USB_iobuf_common_header;

typedef struct USB_iobuf_i2c_header_s {
unsigned char addr;
unsigned char subaddr_lo;
unsigned char subaddr_hi;
unsigned char datasize_lo;
unsigned char datasize_hi;
} USB_iobuf_i2c_header;

char data[MSG_SIZE];
char reg;

You have to decide if you need a portable solution or not. If not,
then packing the structs using some compiler-specific magic will do
the trick. To be portable, I would treat the message as an unsigned
char array and access it with functions (or macros) which use an index
to get at the data.

You can make the offsets less mysterious by calculating each one from
the previous offset and its size:

enum usb_offset {
usb_idx_function = 0,
usb_idx_size = usb_idx_function + sizeof(uint8_t),
usb_idx_i2c_addr = usb_idx_size + sizeof(uint16_t),
usb_idx_i2c_subaddr = usb_idx_i2c_addr + sizeof(uint8_t),
usb_idx_i2c_datasize = usb_idx_i2c_subaddr + sizeof(uint16_t),
usb_idx_i2c_data = usb_idx_i2c_datasize + sizeof(uint16_t)
};

uint8_t usb_get_byte(unsigned char *msg, usb_offset idx)
{
return msg[idx];
}

uint16_t usb_get_u16(unsigned char *msg, usb_offset idx)
{
return msg[idx] + (msg[idx + 1] << 8);
}

You can add duplicate enum entries to document other offsets. For
example if the i2c header is not always there you might add:

usb_idx_data = usb_i2c_addr /* data offset for short-header messages */
 
R

Roman Mashak

Hello,

[OT]

[/OT]

In the program I'm working on I receive a data stream, having structure as
follows:

| common_header | I2C_header | data ... |

My purpose is to parse the 'common_header', and read the data following it.
This is my code snippet:

#define MSG_SIZE 1000

enum ctrl_cmds {
CMD_ECHO = 0x0,
CMD_I2C_WRITE,
CMD_I2C_READ,
CMD_SUBREG_READ,
CMD_SUBREG_WRITE,
CMD_IC_SWRESET /* software reset */
};

/* transport interface header */
typedef struct USB_iobuf_common_header_s {
unsigned char function;
unsigned char size_lo;
unsigned char size_hi;
} USB_iobuf_common_header;

typedef struct USB_iobuf_i2c_header_s {
unsigned char addr;
unsigned char subaddr_lo;
unsigned char subaddr_hi;
unsigned char datasize_lo;
unsigned char datasize_hi;
} USB_iobuf_i2c_header;

char data[MSG_SIZE];
char reg;
....

/* get data from USB into buffer. Data points to beginning of stream */
CDC_Read(..., data, MSG_SIZE);

/* parse header */
USB_iobuf_common_header *hdr = (USB_iobuf_common_header *)data;
switch (hdr->function) {
case CMD_I2C_READ:
...
break;
case CMD_I2C_WRITE:
reg = data[sizeof(USB_iobuf_common_header) +
sizeof(USB_iobuf_i2c_header)];
break;

At this point I expect to receive data in 'reg', but it's a junk I'm
getting. What's the problem with that?

Thanks.

Best regards, Roman.
 
R

Roman Mashak

Hello, Roman!
You wrote to All on Wed, 2 Jan 2008 09:04:22 -0800:

RM> /* transport interface header */
RM> typedef struct USB_iobuf_common_header_s {
RM> unsigned char function;
RM> unsigned char size_lo;
RM> unsigned char size_hi;
RM> } USB_iobuf_common_header;

RM> typedef struct USB_iobuf_i2c_header_s {
RM> unsigned char addr;
RM> unsigned char subaddr_lo;
RM> unsigned char subaddr_hi;
RM> unsigned char datasize_lo;
RM> unsigned char datasize_hi;
RM> } USB_iobuf_i2c_header;

I forgot to mention that 'sizeof(USB_iobuf_common_header)' reports 4 and
'sizeof(USB_iobuf_i2c_header)' reports 8.

Best regards, Roman.
 
R

Roman Mashak

Hello, Walter!
You wrote on Wed, 2 Jan 2008 00:48:16 +0000 (UTC):

??>> reg = data[sizeof(USB_iobuf_common_header) +
??>> sizeof(USB_iobuf_i2c_header)];

??>> At this point I expect to receive data in 'reg', but it's a junk I'm
??>> getting. What's the problem with that?

WR> C compilers are permitted to (and in practice many do) put trailing
WR> padding in structures so that arrays of the structures would have
WR> the elements align properly. The sizes you would naively expect
WR> from the above structures is 3 characters and 5 characters
WR> respectively, but it would not be at all unusual for padding to be
WR> tossed in causing the first structure to be 4 characters and the second
WR> to be 8 characters. You probably intend that 'addr' follows immediately
WR> after 'size_hi' in the overall message array, but an indeterminate
WR> amount of padding could occur between them. sizeof() a structure is the
WR> size including any extra padding, rather than the sums of the sizes of
WR> the visible elements.

WR> Probably what you will have to end up doing is using a union
WR> after size_hi .

Do you mean something like this:

struct USB_iobuf_common_header_s {
unsigned char function;
unsigned char size_lo;
unsigned char size_hi;
union {
unsigned char addr;
unsigned char subaddr_lo;
unsigned char subaddr_hi;
unsigned char datasize_lo;
unsigned char datasize_hi;
} USB_iobuf_i2c_header_s;
};


With best regards, Roman Mashak. E-mail: (e-mail address removed)
 
R

Roman Mashak

Hello, Mike!
You wrote on Tue, 1 Jan 2008 16:56:18 -0800:

[skip]
MW> A guess:

MW> Often a compiler will insert 'padding' bytes in a structure,
MW> making 'sizeof()' report a size larger than the sum of its
MW> members. This is usually done in the interest of performance,
MW> and sometimes because of alignment requirements for the target
MW> platform. Since you're using the result of sizeof() to compute
MW> an offset, this could skew your anticipated results.

MW> If this turns out to be the case, check your compiler documentation,
MW> it may have a switch or #pragma to allow you to configure this padding.

[OT]
I'm actually receving the data stream via USB interface from application
compiled by MS VisualStudio compiler. The recepient of data stream is Atmel
At91 ARM-based controller running firmware compiled by arm-gcc port.
[/OT]

Does this mean, I need to configure padding on both sides of connection?

Thanks.

With best regards, Roman Mashak. E-mail: (e-mail address removed)
 
R

Roman Mashak

Hello, Roman!
You wrote to All on Wed, 2 Jan 2008 09:04:22 -0800:

[skip]
RM> /* get data from USB into buffer. Data points to beginning of stream */
RM> CDC_Read(..., data, MSG_SIZE);

RM> /* parse header */
RM> USB_iobuf_common_header *hdr = (USB_iobuf_common_header *)data;

By the way, my compiler (arm-gcc 4.2.1) spits out a warning on the above
line: "cast increases required alignment of target type". Is this because
'data' and 'USB_iobuf_common_header' are of different types ?

I know in most cases casting is a wrong way and should be avoided, but I
can't see another method to extract specific field from header. Any
suggestions?

Thanks a lot.

With best regards, Roman Mashak. E-mail: (e-mail address removed)
 
B

Barry Schwarz

Hello, Walter!
You wrote on Wed, 2 Jan 2008 00:48:16 +0000 (UTC):

??>> reg = data[sizeof(USB_iobuf_common_header) +
??>> sizeof(USB_iobuf_i2c_header)];

??>> At this point I expect to receive data in 'reg', but it's a junk I'm
??>> getting. What's the problem with that?

WR> C compilers are permitted to (and in practice many do) put trailing
WR> padding in structures so that arrays of the structures would have
WR> the elements align properly. The sizes you would naively expect
WR> from the above structures is 3 characters and 5 characters
WR> respectively, but it would not be at all unusual for padding to be
WR> tossed in causing the first structure to be 4 characters and the second
WR> to be 8 characters. You probably intend that 'addr' follows immediately
WR> after 'size_hi' in the overall message array, but an indeterminate
WR> amount of padding could occur between them. sizeof() a structure is the
WR> size including any extra padding, rather than the sums of the sizes of
WR> the visible elements.

WR> Probably what you will have to end up doing is using a union
WR> after size_hi .

Do you mean something like this:

struct USB_iobuf_common_header_s {
unsigned char function;
unsigned char size_lo;
unsigned char size_hi;
union {
unsigned char addr;
unsigned char subaddr_lo;
unsigned char subaddr_hi;
unsigned char datasize_lo;
unsigned char datasize_hi;
} USB_iobuf_i2c_header_s;

Unless you really intend for the five members of the union to all
occupy the same byte, which is almost completely opposite your
original post, you don't want this.


Remove del for email
 
B

Barry Schwarz

Hello, Roman!
You wrote to All on Wed, 2 Jan 2008 09:04:22 -0800:

[skip]
RM> /* get data from USB into buffer. Data points to beginning of stream */
RM> CDC_Read(..., data, MSG_SIZE);

RM> /* parse header */
RM> USB_iobuf_common_header *hdr = (USB_iobuf_common_header *)data;

By the way, my compiler (arm-gcc 4.2.1) spits out a warning on the above
line: "cast increases required alignment of target type". Is this because
'data' and 'USB_iobuf_common_header' are of different types ?

Actually it is because they have different alignment requirements.
data is an array of char. In C, char is synonymous with byte. On
most implementations, char has no alignment (or some say it must be
aligned on a multiple of 1). As you noted in a previous message, your
structures have sizes of 4 and 8 (due to padding as others have
discussed). If the compiler is going to pad the structure out to a
multiple of 4, chances are that it will also require the structures to
be aligned on a multiple of 4. Since data is not so constrained, your
cast raises the possibility of undefined behavior whenever data is not
aligned on a multiple of 4 (which in general will happen 75% of the
time).

Unfortunately, this is not the only problem. You have no idea where
the padding is located. In your USB_iobuf_common_header, the padding
could be before size_lo or before or after size_hi. In your
USB_iobuf_i2c_header, the three padding bytes could be anywhere before
or after the last four members in any combination.

If you want to treat three elements of data as if they were the three
members of USB_iobuf_common_header, two solutions come to mind:

One is to copy the three bytes member by member (not with
memcpy) into an instance of the structure.

The other is to create defines of the form
#define common_function(x) data(x)
#define common_size_lo(x) data(x+1)
and use these in place of hdr->function, hdr->size_lo, etc.

It is also possible that your compiler has an option to not pad the
structures but that solution would not be portable.
I know in most cases casting is a wrong way and should be avoided, but I
can't see another method to extract specific field from header. Any
suggestions?


Remove del for email
 
C

christian.bau

You shouldn't care what application sends the data or what compiler it
is compiled with.

You should have somewhere a written specification that defines the
data stream that is going between two units. If you don't, you are in
deep shit. If someone tells you that source code for the application
writing the data is a specification, then that person is an idiot.

Once you hold that specification in your hands, it is your task to
read the data according to the specification. It seems that your
compiler cannot match data streams consisting of three bytes to
structs, because your struct with three char members had a size of
four bytes. Hint: Arrays with three char elements have a size of three
bytes.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,185
Members
46,738
Latest member
JinaMacvit

Latest Threads

Top