Contrived casting situation

  • Thread starter Christopher Benson-Manica

Christopher Benson-Manica

The thread where casting is being discussed motivated me to try this. Let's
say you wanted to populate a BankRecord structure (as I defined below) with
17-character record numbers, but with the record numbers separated into an SSN
and an account number (and with a terminating '\0')...

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char SSN[9];
    char AccountNum[8];
    char end;
} BankRecord;

int main( int argc, char * argv[] )
{
    int i;
    BankRecord *a;

    if( argc < 2 ) {
        printf( "No information provided\n" );
        return EXIT_FAILURE;
    }
    if( (a=malloc((argc-1)*(sizeof(BankRecord)))) == NULL ) {
        printf( "Malloc() failed\n" );
        return EXIT_FAILURE;
    }
    for( i=1; i < argc; i++ ) {
        snprintf( (char*)&a[i-1], sizeof(BankRecord), "%s", argv[i] );
    }
    for( i=0; i < argc-1; i++ ) {
        printf( "%s\n", (char*)&a[i] ); /* just to prove that it works */
    }
    return EXIT_SUCCESS;
}

1) Is this code legal C? (it compiled with no warnings for me and worked
correctly)
2) Is the cast of a structure to a char* the best way to solve this contrived
problem?
3) How can I declare BankRecord in such a way so that end is a const char
equal to '\0'? Would that be desirable?
4) Any other comments?
 

Eric Sosman

Christopher said:
[...]

1) Is this code legal C? (it compiled with no warnings for me and worked
correctly)


The code is "legal" in the sense that it exhibits no
undefined behavior (unless I've missed something). However,
it is not strictly conforming because it has implementation-
defined behavior: sizeof(BankRecord) is at least 18, but
might be larger because of padding within the struct. Thus,
the program's output when given a 20-character command-line
argument, say, depends on the details of the implementation.

In short: It is not guaranteed that the ninth input
character lands in the AccountNum[0] spot; it might instead
land in the limbo between SSN and AccountNum. The code
works, but might not be doing what you want.

2) Is the cast of a structure to a char* the best way to solve this contrived
problem?

No, because it invokes the implementation-defined behavior
mentioned above and is thus not portable.

3) How can I declare BankRecord in such a way so that end is a const char
equal to '\0'? Would that be desirable?

There's no way to achieve this with a declaration. Whether
that's a desirable state of affairs is a matter of taste -- if
you want C++ and constructors, you know where to find them.

4) Any other comments?

Yes: Write what you mean, not something else that you sort
of think might produce the same effect. If you've got two
distinct values -- SSN and AccountNum -- treat them as such.
If you want to view them as sub-fields of a single larger
string, declare a single 18-element array to contain that
single larger string.

A guess: Did you cut your teeth on FORTRAN, and perhaps
become overly fond of the EQUIVALENCE declaration? If so,
my recommendation is to lose that habit when writing C.
 

Matt Gregory

Christopher said:
[...]

1) Is this code legal C? (it compiled with no warnings for me and worked
correctly)

Yep.

2) Is the cast of a structure to a char* the best way to solve this contrived
problem?


You could use a union, but I don't think that would be better.

3) How can I declare BankRecord in such a way so that end is a const char
equal to '\0'? Would that be desirable?

You can't. It would be desirable in a way, but I think it would create
a lot of complexities in the language that would quickly defeat its
desirability.

4) Any other comments?

I think if the argument passed is too long, the a[i-1] in the snprintf
call will end up null-less. I think maybe sscanf would be better because
you can specify the length in the format specifier and it will append a
null even if the entire length is used. Other than that I think it's a
valid piece of code.

Matt Gregory
 

Eric Sosman

Eric said:
[...]
typedef struct {
char SSN[9];
char AccountNum[8];
char end;
} BankRecord;
[...]

In short: It is not guaranteed that the ninth input
character lands in the AccountNum[0] spot; it might instead
land in the limbo between SSN and AccountNum.

Um, er, "Oops." The ninth input character necessarily
lands in SSN[8]; it's the *tenth* character whose location
is uncertain. Sorry about that.
 

Al Bowers

Christopher said:
[...]
for( i=1; i < argc; i++ ) {
snprintf( (char*)&a[i-1], sizeof(BankRecord), "%s", argv[i] );
}
for( i=0; i < argc-1; i++ ) {
printf( "%s\n", (char*)&a[i] ); /* just to prove that it works */


You should try another test:
printf("AccountNum = %s\n",a[i].AccountNum);
and see if it still works as expected.
}
return EXIT_SUCCESS;
}

1) Is this code legal C? (it compiled with no warnings for me and worked
correctly)
2) Is the cast of a structure to a char* the best way to solve this contrived
problem?

No. You have issues of struct member padding, which are addressed in
FAQ questions 2.13 and 2.12:

http://www.eskimo.com/~scs/C-faq/q2.13.html
http://www.eskimo.com/~scs/C-faq/q2.12.html

 

Christopher Benson-Manica

Matt Gregory said:
You can't. It would be desirable in a way, but I think it would create
a lot of complexities in the language that would quickly defeat its
desirability.

Just making sure, thanks :)

I think if the argument passed is too long, the a[i-1] in the snprintf
call will end up null-less. I think maybe sscanf would be better because
you can specify the length in the format specifier and it will append a
null even if the entire length is used. Other than that I think it's a
valid piece of code.

According to my documentation for snprintf(), it writes size-1 characters and
the size-th character is guaranteed to be '\0', so if sizeof(BankRecord) is
18, I'm fine. Of course, as the previous reply noted, that isn't guaranteed
to be the case...
 

Christopher Benson-Manica

Eric Sosman said:
Um, er, "Oops." The ninth input character necessarily
lands in SSN[8]; it's the *tenth* character whose location
is uncertain. Sorry about that.

So structure fields are not guaranteed to be contiguous? How unfortunate :(
 

Joona I Palaste

Christopher Benson-Manica said:
Eric Sosman said:
Um, er, "Oops." The ninth input character necessarily
lands in SSN[8]; it's the *tenth* character whose location
is uncertain. Sorry about that.
So structure fields are not guaranteed to be contiguous? How unfortunate :(

No, they aren't. The implementation is free to add any padding it likes
between any two members, or after the last member. The first member
MUST begin at offset 0, though.

 

Matt Gregory

Eric said:
The code is "legal" in the sense that it exhibits no
undefined behavior (unless I've missed something). However,
it is not strictly conforming because it has implementation-
defined behavior: sizeof(BankRecord) is at least 18, but
might be larger because of padding within the struct. Thus,
the program's output when given a 20-character command-line
argument, say, depends on the details of the implementation.

Oh, I forgot about that. Oops. This is the very thing that
drove me nuts the first time I tried to translate a Pascal
program to C which wrote records to disk by calling Write().
Damn. Never did get that debugged. Had to switch back to
Pascal.

Matt Gregory
 

Joe Wright

Joona said:
Christopher Benson-Manica said:
Eric Sosman said:
Um, er, "Oops." The ninth input character necessarily
lands in SSN[8]; it's the *tenth* character whose location
is uncertain. Sorry about that.
So structure fields are not guaranteed to be contiguous? How unfortunate :(

No, they aren't. The implementation is free to add any padding it likes
between any two members, or after the last member. The first member
MUST begin at offset 0, though.

The case for padding in structures is for alignment of multibyte
objects, int, long, double, etc. The char is a single-byte object. It
can appear anywhere and has no known alignment requirements, even as an
array. There need be no padding between char objects. I betcha... sizeof
(BankRecord) == 18 on every implementation we have.
 

Joona I Palaste

Joe Wright said:
Joona said:
Christopher Benson-Manica said:
Eric Sosman said:
Um, er, "Oops." The ninth input character necessarily
lands in SSN[8]; it's the *tenth* character whose location
is uncertain. Sorry about that.
So structure fields are not guaranteed to be contiguous? How unfortunate :(

No, they aren't. The implementation is free to add any padding it likes
between any two members, or after the last member. The first member
MUST begin at offset 0, though.
The case for padding in structures is for alignment of multibyte
objects, int, long, double, etc. The char is a single-byte object. It
can appear anywhere and has no known alignment requirements, even as an
array. There need be no padding between char objects. I betcha... sizeof
(BankRecord) == 18 on every implementation we have.

As long as "anywhere" means "anywhere after the first member", that is
true. There need be no padding between char objects, but there CAN be.
Arrays, OTOH, are required to be contiguous and padding-free.

 

Kevin Bracey

Joe Wright said:
typedef struct {
char SSN[9];
char AccountNum[8];
char end;
} BankRecord;

The case for padding in structures is for alignment of multibyte
objects, int, long, double, etc. The char is a single-byte object. It
can appear anywhere and has no known alignment requirements, even as an
array. There need be no padding between char objects. I betcha... sizeof
(BankRecord) == 18 on every implementation we have.

My most commonly used platform, which is ARM-based, has
sizeof(BankRecord) == 20, as all aggregate types are a whole number of 32-bit
words, and 32-bit aligned. This makes life easier when doing structure
assignments and initialisation, as it can use "load/store multiple word"
instructions, which only access word-aligned and sized data.

This may be atypical, but it suits the target architecture, and the C
standard is thankfully designed to allow implementations to make choices like
this.

The only pitfall is the usual one of programmers assuming, like you, details
of the structure padding. One example is the BSD <arpa/tftp.h> header file
which contains:

struct tftphdr {
short tur_opcode; /* TFTP opcode */
/******** OOPS - 2 bytes of unexpected padding here *******/
union {
struct {
char tur_stuff[1]; /* RRQ/WRQ/OACK params */
} tu_rq;
struct {
short tud_block; /* block# or error code */
char tud_data[1]; /* data or error string */
} tu_data;
} th_u;
}; /* sizeof(struct tftphdr) == 8 */
 

Joe Wright

Kevin said:
[...]

You are, of course, correct. My unthoughtful assumption that the world
is like Intel or SPARC coerces me to commit these errors. I am truly
sorry.
 
