Copying from one struct to another, simple assignment?

CBFalconer

Jack said:
.... snip ...


Correct conclusion, oversimplified and incorrect explanation, Chuck.

True. However my statement is the limitation I impose on myself.
 
Keith Thompson

CBFalconer said:
True. However my statement is the limitation I impose on myself.

As do I. But your statement that "Names with a leading '_' char are
reserved for the implementation" was false and misleading -- and I had
already provided a more accurate explanation 7 hours before you
posted.

*Please* read the thread before you post. It's not that hard.
 
qarnos

No problem, except that you are using a name not available to you.
Names with a leading '_' char are reserved for the implementation.

....if they are followed by a capital letter or another underscore. Names
beginning with a *single* underscore followed by a lowercase letter are
fine at block scope, IIRC.

The thing you CAN'T do is compare structs for equality.  i.e.:

    if (foo == bar) ...

is illegal.  The == operator doesn't know where there may be
padding.

I would assume it's more to do with the issue of comparing pointers
(shallow vs. deep).
 
qarnos

Unlikely, because structs aren't pointers! :)  There's no intrinsic
reason why == couldn't be made to work for structs, deep comparison
and all.

It's possible, however, that I've misunderstood your point, and that
there is a similarity between structure comparison and pointer
comparison that I've overlooked. If so, care to explain?

Actually, I was thinking more of structs containing pointers. For
example, a struct containing a pointer to a char string. Should the
structs compare equal if the strings contain the same symbols, but
with different storage addresses?

Another issue with automagic deep-comparison would be structs which
contain pointers to themselves. Talk about a can of worms!
 
Flash Gordon

qarnos said:
Actually, I was thinking more of structs containing pointers. For
example, a struct containing a pointer to a char string. Should the
structs compare equal if the strings contain the same symbols, but
with different storage addresses?

The obvious (to me) answer is that the comparison would be a shallow
comparison, so the structs would only compare equal if the pointer
values were the same. A bigger problem is a struct that contains a
union, but the obvious answer to that would be to make comparison of
structs containing unions a constraint violation.

qarnos said:
Another issue with automagic deep-comparison would be structs which
contain pointers to themselves. Talk about a can of worms!

Easily solved by making it a shallow comparison ;-)
 
CBFalconer

Richard said:
qarnos said:

Unlikely, because structs aren't pointers! :) There's no
intrinsic reason why == couldn't be made to work for structs,
deep comparison and all.

However the C standard is written to allow use of simple things
such as memcmp to check equality of larger objects. This prevents
== use on structs, because there is no control on use of the
padding areas. Just as you have to use strcmp on strings, you have
to write a structcmp to operate on your structure.
 
Flash Gordon

Golden said:
Need not be a violation, but the compiler would have to zero the bits of the
union not being used for the type stored into it. As this would be a
performance hit, a compile time switch could be set.

That assumes that the compiler knows what type was stored in it last. In
some instances the only way this could be done would be to track this at
run time. For example, the program could prompt the user for what type
of data to store in the union within the struct, do lots of other stuff,
then do the comparison in code in a different translation unit.
Currently unions are not implemented like this.

Golden said:
Yes. What should == mean in the case of pointers inside a structure?

int *a;
int *b;

if (a == b)
if (*a == *b)

I thought it was obvious that I was suggesting it would test the pointer
values rather than what was pointed at. I.e. a==b. After all, the rest
of the time in C when you compare a pointer it compares the pointer
rather than what is pointed at!
 
Ben Bacarisse

Golden California Girls said:
Need not be a violation, but the compiler would have to zero the bits of the
union not being used for the type stored into it. As this would be a
performance hit, a compile time switch could be set.

The compiler can't know which member of the union is "active" so it
would not know how to do the comparison (and if it did know, by some
magic, it would not need to zero anything).

Golden California Girls said:
Yes. What should == mean in the case of pointers inside a structure?

int *a;
int *b;

if (a == b)
if (*a == *b)

The first would be the only reasonable choice. If you took the
second, you would not be doing a shallow comparison (consider the case
of a linked list where the pointer is to another instance of the
struct).

Also, adding a * would make testing for == undefined if any of the
members was a null pointer (or any other pointer that can not be
dereferenced).
 
Keith Thompson

Golden California Girls said:
Need not be a violation, but the compiler would have to zero the bits of the
union not being used for the type stored into it. As this would be a
performance hit, a compile time switch could be set.

It wouldn't just be a performance hit, it would be nearly impossible
to implement. Consider:

union u {
    char c;
    int i;
};

void set_char(char *ptr, char value)
{
    *ptr = value;
}

int main(void)
{
    union u x;
    union u y;

    x.i = 12345;
    y.i = 54321;
    x.c = '*';
    set_char(&y.c, '*');

    if (x == y) {
        /* ? */
    }
    return 0;
}

The compiler might zero the excess bytes of x when assigning '*' to
x.c, but when it sees ``*ptr = value;'', it has no way of knowing that
it's assigning a value to a union member. The compiler would not only
have to keep track of which member of a union is current, it would
have to propagate this information to code that doesn't refer to any
unions.

Golden California Girls said:
Yes. What should == mean in the case of pointers inside a structure?

int *a;
int *b;

if (a == b)
if (*a == *b)

If struct comparison were allowed, I think shallow comparison would be
the only way to go. But since logical equality very often either is
meaningless or corresponds to something other than shallow
member-by-member equality, it can be argued that it's not worth adding
to the language; if you want to compare structs, you can write the
code yourself.
 
Guest

It wouldn't just be a performance hit, it would be nearly impossible
to implement.  Consider:
union u {
    char c;
    int i;
};
void set_char(char *ptr, char value)
{
    *ptr = value;
}
int main(void)
{
    union u x;
    union u y;
    x.i = 12345;
    y.i = 54321;
    x.c = '*';
    set_char(&y.c, '*');
    if (x == y) {
        /* ? */
    }
    return 0;
}
The compiler might zero the excess bytes of x when assigning '*' to
x.c, but when it sees ``*ptr = value;'', it has no way of knowing that
it's assigning a value to a union member.  The compiler would not only
have to keep track of which member of a union is current, it would
have to propagate this information to code that doesn't refer to any
unions.

When it does ``*ptr = value;'' in a separate translation unit [going farther
than your example] it sure doesn't.  When it sees ``set_char(&y.c, '*');'' it
does know and should assume that the called function knows nothing and zero the
excess bytes of the pointers object.

I'm not saying this would be easy or even desirable.  There may be some
construction that does make it impossible; right now I can't think of one.

If struct comparison were allowed, I think shallow comparison would be
the only way to go.  But since logical equality very often either is
meaningless or corresponds to something other than shallow
member-by-member equality, it can be argued that it's not worth adding
to the language; if you want to compare structs, you can write the
code yourself.

One of those camel nose in the tent things about allowing structure assignment ...

I think we are actually exploring the weakness of the union in C.  I believe it
was a necessary evil to break the strict type checking

I'm not convinced union *was* necessary. I never use it. If I want to
subvert the type system I find other ways. union does get used as a
hack to find the alignment needed by malloc and friends (a union
of all types will be correctly aligned for, er, all types)
 
Ben Bacarisse

Golden California Girls said:
Keith said:
It wouldn't just be a performance hit, it would be nearly impossible
to implement. Consider:

union u {
    char c;
    int i;
};

void set_char(char *ptr, char value)
{
    *ptr = value;
}

int main(void)
{
    union u x;
    union u y;

    x.i = 12345;
    y.i = 54321;
    x.c = '*';
    set_char(&y.c, '*');

    if (x == y) {
        /* ? */
    }
    return 0;
}

The compiler might zero the excess bytes of x when assigning '*' to
x.c, but when it sees ``*ptr = value;'', it has no way of knowing that
it's assigning a value to a union member. The compiler would not only
have to keep track of which member of a union is current, it would
have to propagate this information to code that doesn't refer to any
unions.

When it does ``*ptr = value;'' in a separate translation unit [going
farther than your example] it sure doesn't. When it sees
``set_char(&y.c, '*');'' it does know and should assume that the
called function knows nothing and zero the excess bytes of the
pointers object.

You have a typo, so I am guessing a bit. Do you mean that the compiler
zeros the other bytes in the union at the point it generates
the set_char call? If so, that would be wrong since it does not know
that set_char will set the char (despite the name).

Golden California Girls said:
I'm not saying this would be easy or even desirable. There may be some
construction that does make it impossible, right now I can't think
of one.

I think above is an example, but if you want a more obvious dilemma:

some_function(&y.c, &y.i);
if (x == y) /* ? */

<snip>
 
Keith Thompson

On 11 Mar, 02:26, Golden California Girls wrote: [...]
I think we are actually exploring the weakness of the union in
C.  I believe it was a necessary evil to break the strict type
checking

I'm not convinced union *was* necessary. I never use it. If I want to
subvert the type system I find other ways. union does get used as a
hack to find the alignment needed by malloc and friends (a union
of all types will be correctly aligned for, er, all types)

Unions aren't just used to subvert type checking. For example, they
can be used to implement what other languages call "variant records":

struct variant {
    enum foo which_type;
    union {
        some_type a;
        some_other_type b;
        yet_another_type c;
    } variant_part;
};

The intent is that the value of which_type indicates which union
member is current. You *can* subvert type checking with this, but in
proper use you don't.

You can do the same thing with a struct rather than a union, but using
a union saves space and can be used to match an externally imposed
layout.
 
Kenny McCormack

Keith Thompson said:
You can do the same thing with a struct rather than a union, but using
a union saves space and can be used to match an externally imposed
layout.

But even you can see that you can't make either of these assertions
*and* stay within the dogma (100% standard C and damn everything else
and every other consideration) of this newsgroup.

1) "space" is something we are not supposed to know anything about, nor
care anything about.
2) "externally imposed layout"s are, of course, completely OT here.
 
Ben Bacarisse

Golden California Girls said:
Ben said:
Golden California Girls said:
Keith Thompson wrote:
union u {
    char c;
    int i;
};

void set_char(char *ptr, char value)
{
    *ptr = value;
}

int main(void)
{
    union u x;
    union u y;

    x.i = 12345;
    y.i = 54321;
    x.c = '*';
    set_char(&y.c, '*');

    if (x == y) {
        /* ? */
    }
    return 0;
}

The compiler might zero the excess bytes of x when assigning '*' to
x.c, but when it sees ``*ptr = value;'', it has no way of knowing that
it's assigning a value to a union member. The compiler would not only
have to keep track of which member of a union is current, it would
have to propagate this information to code that doesn't refer to any
unions.
When it does ``*ptr = value;'' in a separate translation unit [going
farther than your example] it sure doesn't. When it sees
``set_char(&y.c, '*');'' it does know and should assume that the
called function knows nothing and zero the excess bytes of the
pointers object.

You have a typo, so I am guessing a bit. Do you mean that the compiler
zeros the other bytes in the union at the point it generates
the set_char call? If so, that would be wrong since it does not know
that set_char will set the char (despite the name).

Yes. It doesn't know and except in the case of breaking UB it
doesn't matter either.

I'm sorry, I can't see why it does not matter. If set_char is badly
named because its purpose is to print the representation of the union
(and a char pointer to any member will do as the pointer to use for
that) then the compiler can't zero anything. If it sets that char
member, then zeroing the other bytes is fine, but how can the compiler
tell?

It seems as if you are saying that setting a union via a pointer to a
member is already UB so the compiler can do what it likes, but I
dislike guessing (I am almost always wrong) so you'll have to spell it
out for me.

Golden California Girls said:
Is there a sequence point in there or is the function call itself
UB?

I must be missing something. Where does the UB come from? As far as
I can tell it is undefined to use both values, but not take the
address of both. Have I got that wrong?

Golden California Girls said:
Yes I know what you intend, but without seeing the internals of
some_function and absent a const it could store values in both y.i
and y.c for return. That of course is UB as things stand now.

I think that it is possible to write some_function in such a way that
it is not UB (e.g. "return;") but that the compiler can't tell what it
does so it can't take any action at the point of call.
 
Ben Bacarisse

Golden California Girls said:
Ben said:
Golden California Girls said:
Ben Bacarisse wrote:

Keith Thompson wrote:
union u {
    char c;
    int i;
};

void set_char(char *ptr, char value)
{
    *ptr = value;
}

int main(void)
{
    union u x;
    union u y;

    x.i = 12345;
    y.i = 54321;
    x.c = '*';
    set_char(&y.c, '*');

    if (x == y) {
        /* ? */
    }
    return 0;
}

The compiler might zero the excess bytes of x when assigning '*' to
x.c, but when it sees ``*ptr = value;'', it has no way of knowing that
it's assigning a value to a union member. The compiler would not only
have to keep track of which member of a union is current, it would
have to propagate this information to code that doesn't refer to any
unions.
When it does ``*ptr = value;'' in a separate translation unit [going
farther than your example] it sure doesn't. When it sees
``set_char(&y.c, '*');'' it does know and should assume that the
called function knows nothing and zero the excess bytes of the
pointers object.
You have a typo, so I am guessing a bit. Do you mean that the compiler
zeros the other bytes in the union at the point it generates
the set_char call? If so, that would be wrong since it does not know
that set_char will set the char (despite the name).

Yes. It doesn't know and except in the case of breaking UB it
doesn't matter either.

I'm sorry, I can't see why it does not matter. If set_char is badly
named because its purpose is to print the representation of the union
(and a char pointer to any member will do as the pointer to use for
that) then the compiler can't zero anything. If it sets that char
member, then zeroing the other bytes is fine, but how can the compiler
tell?

It can tell because you told it via the .something. Until you did
that the compiler knew the size of the object was the size of the
entire union. Once you change to .something the compiler knows the
size of .something and is free to do anything at all with the rest
of the union.

We have hit the heart of the disagreement. Where does this come from?
I can't find any support for it in the standard (but I am probably
biased). To cut to the chase, you are saying that:

union u y;
y.i = 42;
any_function(&y.c);
/* y.i now indeterminate? */

yes? I can't get that from the standard. What is your reasoning?

You can take the address of both. The UB is more basic than that.
For a second drop the union idea and simply pass the address of the
same int to this black box function.
void blackbox(int *i, int *j);
.
int k;
blackbox(&k, &k);
If blackbox only reads i and j there is no UB. In any other case
there is the possibility of UB. Blackbox might save a value to i
and then read j. It might save to both but which order?

This analogy has me confused. There is nothing wrong with the code
you wrote. Of course 'blackbox' can invoke UB but so can almost any
function. It can also do lots of things with its pointer arguments
that are perfectly legal as you acknowledge above.

Golden California Girls said:
The compiler is free to assume the worst case as I believe the
standard does too and issue a diagnostic.

A compiler can issue any diagnostics it likes. I don't see that as
the issue. If the compiler can "assume the worst" it can treat your
example above as undefined and do anything at all. But I think the
reverse is true: the compiler must assume the best. I.e. if there is
any possible implementation of 'blackbox' that avoids UB it must
compile the call as if all is well.
 
CBFalconer

.... snip ...

I'm not convinced union *was* necessary. I never use it. If I want
to subvert the type system I find other ways. union does get used
as a hack to find the alignment needed by malloc and friends (a
union of all types will be correctly aligned for, er, all types)

You never wrote a compiler, with symbol tables, huh?
 
Ben Bacarisse

Golden California Girls said:
Ben said:
We have hit the heart of the disagreement. Where does this come from?
I can't find any support for it in the standard (but I am probably
biased). To cut to the chase, you are saying that:

union u y;
y.i = 42;
any_function(&y.c);
/* y.i now indeterminate? */

yes? I can't get that from the standard. What is your reasoning?

[assuming .i is int and .c is char]

I'm not a standards maven, but K&R2 6.8 p147-8, "it is
implementation-dependent if something is stored as one type and
extracted as another."

I don't consider taking the address of a member as extracting it and I
can't find any text in the standard that supports that view.

Golden California Girls said:
any_function hasn't been prototyped, but unless there is const in
it, it has explicitly been given permission to store data to the .c
part. What if anything happens to the rest is
"implementation-dependent".

I know you believe that if any_function does not store into the
union that you expect y.i to be as it was before the function call,
but the standard doesn't say it has to be.

I can't see why. Take another example:

int i; /* indeterminate: just accessing it is UB */
f(&i); /* Fine provided f does not access i before setting it */

Golden California Girls said:
What are you passing to any_function, zero or forty two? Consider
big or little endian before you answer.

Neither -- I am passing an address. If I de-reference the pointer the
int is re-interpreted and I will get an implementation defined byte of
that int. If c is unsigned char, the result can not be a trap
representation (there is some debate about the possibility of signed
char having such things -- I don't think it can -- but that is another
matter altogether).

Golden California Girls said:
Also consider does the
standard care if you swapped .i and .c in your example? Would there
be a discussion in that case?

I don't think so.
 
Keith Thompson

Golden California Girls said:
Ben said:
We have hit the heart of the disagreement. Where does this come from?
I can't find any support for it in the standard (but I am probably
biased). To cut to the chase, you are saying that:

union u y;
y.i = 42;
any_function(&y.c);
/* y.i now indeterminate? */

yes? I can't get that from the standard. What is your reasoning?

[assuming .i is int and .c is char]

I'm not a standards maven, but K&R2 6.8 p147-8, "it is
implementation-dependent if something is stored as one type and
extracted as another."

any_function hasn't been prototyped, but unless there is const in
it, it has explicitly been given permission to store data to the .c
part. What if anything happens to the rest is
"implementation-dependent".

It's implementation-dependent (or whatever the standard now says) if
something *is* stored, not if something *might be* stored.

Golden California Girls said:
I know you believe that if any_function does not store into the
union that you expect y.i to be as it was before the function call,
but the standard doesn't say it has to be.

Yes, it does. C99 6.2.4p2:

An object exists, has a constant address, and retains its
last-stored value throughout its lifetime.

Consider this program:

#include <stdio.h>

union u {
    char c;
    int i;
};

void func(char *ptr)
{
    puts("func does nothing");
}

int main(void)
{
    union u obj;
    obj.i = 12345;
    func(&obj.c);
    printf("obj.i = %d\n", obj.i);
    return 0;
}

obj.i retains its last-stored value throughout its lifetime. After
the value 12345 is stored in obj.i, no other value is stored in obj.i,
and no value is stored, directly or indirectly, in obj.c.

The output of the above program must be:

func does nothing
obj.i = 12345

An implementation that sees the call to func(), decides that obj.c
*might* be updated, and zeros the portion of obj following obj.c, is
non-conforming because it clobbers obj.i. The value of obj.i doesn't
become implementation-dependent (or indeterminate, or whatever the
standard says) until and unless a value is actually stored in obj.c.

(I'm sure the standard says something about reading a union member
after storing a different one, but I haven't been able to find it.)
 
Ben Bacarisse

(I'm sure the standard says something about reading a union member
after storing a different one, but I haven't been able to find it.)

Are you thinking of footnote 82 in 6.5.2.3 p3 (which explains what the
value of union.member is)?

If the member used to access the contents of a union object is not
the same as the member last used to store a value in the object,
the appropriate part of the object representation of the value is
reinterpreted as an object representation in the new type as
described in 6.2.6 (a process sometimes called "type
punning"). This might be a trap representation.

There is a change bar in n1256 on the line that references the
footnote, but since I can't justify buying the standard I have no idea
what exactly has changed from the first C99.
 
Keith Thompson

Ben Bacarisse said:
Are you thinking of footnote 82 in 6.5.2.3 p3 (which explains what the
value of union.member is)?

If the member used to access the contents of a union object is not
the same as the member last used to store a value in the object,
the appropriate part of the object representation of the value is
reinterpreted as an object representation in the new type as
described in 6.2.6 (a process sometimes called "type
punning"). This might be a trap representation.

Yup, that's it.

Ben Bacarisse said:
There is a change bar in n1256 on the line that references the
footnote, but since I can't justify buying the standard I have no idea
what exactly has changed from the first C99.

I think I paid $18 for my copy of the standard; it might be more now.
TC1, TC2, and TC3 are available at no charge.

Anyway, the paragraph in question is:

A postfix expression followed by the . operator and an identifier
designates a member of a structure or union object. The value is
that of the named member,82) and is an lvalue if the first
expression is an lvalue. If the first expression has qualified
type, the result has the so-qualified version of the type of the
designated member.

where the "82)" is a superscript reference to the quoted footnote.
The only difference is the reference to the footnote, which was added
by TC3 in response to DR #283,
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm>, which in
turn isolated one of the points from DR #257,
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_257.htm>.
 
