Address within structure problem

K

Kelvin Moss

Hi group,

I have an array of structure.
Say -
typedef struct Person {
int i;
int j;
int k;
}person;

person p[10]={{25, 2, 7},
{7, 3, 89},
{45, 56, 78}};


I am interested in processing all the first members of the struct in
the array together, all the second members of the struct together and
so on... i.e. i need to get {25, 7, 45}, {2, 3, 56} together ans so
on...

I came at the following approach but it doesn't work. Could someone
please tell me the right way to do it.

Thanks.


#include <stdio.h>
#include <stdlib.h>
typedef struct Person {
int i;
int j;
int k;
}person;

person p[10]={{25, 2, 7},
{7, 3, 89},
{45, 56, 78}};


size_t foo(void)
{
return offsetof(person, j);
}

int main()
{
int *struct_addr = NULL;
int addr;
int i;
int count = 0;
size_t offset;

offset = foo();
addr = &(p[0]) + offset; /* This seems wrong */
/* Incorrect pointer arithmetic */
for (i = 0; i < 3; i++) {
count += *((int *) addr);
addr += sizeof(person); /* This too ?? */
}
printf("%d", count);
}
 
A

Arthur J. O'Dwyer

On Thu, 2 Mar 2005, Kelvin Moss wrote:
[indentation restored]
typedef struct Person {
int i;
int j;
int k;
} person;

person p[10]={{25, 2, 7},
{7, 3, 89},
{45, 56, 78}};


I am interested in processing all the first members of the struct in
the array together, all the second members of the struct together and
so on... i.e. i need to get {25, 7, 45}, {2, 3, 56} together [and] so
on...

I came at the following approach but it doesn't work. Could someone
please tell me the right way to do it.

[indentation restored]
size_t foo(void)
{
return offsetof(person, j);
}

int main()
{
int *struct_addr = NULL;
int addr;
int i;
int count = 0;
size_t offset;

offset = foo();
addr = &(p[0]) + offset; /* This seems wrong */

It is. Pointer arithmetic on structs works just the same as it does on
any type; that is, '&(p[0])' is equal to 'p', and thus '&(p[0])+offset'
is equal to '&p[offset]', which isn't valid unless the array 'p' has at
least 'offset' members.
And of course you can't assign that pointer value to an 'int'. You
should really try compiling your code before posting it, if you expect
people to give you any constructive feedback, instead of mostly parroting
the same stuff the compiler would have told you if you'd asked it.

Now, one thing which is conforming, as far as I know, would be the
following:

int main(void)
{
char *it = (char *)p + foo(); /* Address of p[0].j */
size_t size = sizeof *p; /* Size of each element */
int count = 0;
int i;
for (i=0; i < 10; ++i) {
count += *(int *)it;
if (i < 9)
it += size;
}
printf("%d\n", count);
return 0;
}

Notice the check on '(i < 2)' before we actually increment the pointer
'it'. That's very important, since if we didn't check that, we'd end
up incrementing 'it' past the end of the array 'p' and into Undefined
Behavior Land. Still, it's very clumsy, because we're checking 'i'
twice. We could equivalently write

count += *(int *)it;
for (i=1; i < 10; ++i) {
it += size;
count += *(int *)it;
}

But that's straying into "too clever for its own good" territory,
as far as I'm concerned.

Speaking of which... Why do you really want to do this? What's
wrong with the idiomatic solution,

int main(void)
{
int count = 0;
int i;
for (i=0; i < 10; ++i) {
count += p->j;
}
printf("%d\n", count);
return 0;
}

It's shorter, much clearer, and almost certainly faster on any modern
optimizing compiler --- precisely /because/ it's shorter and clearer
and more idiomatic. Compiler writers have more incentive to optimize
this kind of loop than to optimize the icky, pointerful kind of loop
you were trying to demonstrate.

So I recommend the idiomatic solution, unless you're willing to
share a reason why you don't want to use it.

-Arthur
 
W

Walter Roberson

:I have an array of structure.

:I am interested in processing all the first members of the struct in
:the array together, all the second members of the struct together and
:so on... i.e. i need to get {25, 7, 45}, {2, 3, 56} together ans so
:eek:n...

If your structure is a fixed and small number of elements long, then
skip the type juggling and just do the manipulation by hand:

switch (offset) {
case 0:
for (i = 0; i < N_EL; i++ ) count += p.i;
break;
case 1:
for (i = 0; i < N_EL; i++ ) count += p.j;
break;
case 2:
for (i = 0; i < N_EL; i++ ) count += p.k;
break;
}

etc.. Notice this is barely longer than your original code
but without the dangers of getting the offsets wrong and
without having to do a bunch of address arithmetic.

You should reserve the type punning for instances where the
number of elements in the structure type is non-trivial or not
known ahead of time.

I would need to check the standard to see what it says
about alignment of values of the same kind. One could
imagine an architecture with a 6 byte int which was faster
to access when aligned on a 4 byte boundary; in such a case,
struct {int i, j, k;} could potentially have i start at offset 0,
j at offset 8, k at offset 16, whereas int [3] might be
offsets of 0, 6, 12. I do not recall what the standard says
about such matters.
 
K

Kelvin Moss

Arthur said:
We could equivalently write

count += *(int *)it;
for (i=1; i < 10; ++i) {
it += size;
count += *(int *)it;
}

Thanks a lot. It works the way you have shown.

But that's straying into "too clever for its own good" territory,
as far as I'm concerned.

Speaking of which... Why do you really want to do this? What's
wrong with the idiomatic solution,

int main(void)
{
int count = 0;
int i;
for (i=0; i < 10; ++i) {
count += p->j;
}
printf("%d\n", count);
return 0;
}

It's shorter, much clearer, and almost certainly faster on any modern
optimizing compiler --- precisely /because/ it's shorter and clearer
and more idiomatic. Compiler writers have more incentive to optimize
this kind of loop than to optimize the icky, pointerful kind of loop
you were trying to demonstrate.

So I recommend the idiomatic solution, unless you're willing to
share a reason why you don't want to use it.


True, the reason I wanted to do this juggling with pointer arithmetic
was that I had many members in my structure. For all of them I had to
do the same processing again and again.

switch (member_enum) {
case a:
for (i = 0; i < n; i++)
cnt += array.a
/* Some more procesing with arrayi].a
break;
case b:
/* same as case a but now with array.b */
break;
case c:
....

}

So I thought that may be I could write a generic routine to which I
could pass a pointer to member and deal all cases in a generic way.

Does this sound like a valid decision or do you think I should go with
the idiomatic solution ?

Thanks again.
 
A

Arthur J. O'Dwyer

Shouldn't this run only 9 times ?

Yes. Nine iterations within the loop sum the elements from 'p[1].j'
to 'p[9].j'. The value of 'p[0].j' is added to 'count' by the line
immediately above the 'for' loop. So all ten items get counted, but
as "1+9" instead of "10." ;-)
(My original post pointed out why running the 'for' loop for ten
iterations was a Bad Idea: it would have invoked undefined behavior.)

-Arthur
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,160
Messages
2,570,889
Members
47,420
Latest member
ZitaVos505

Latest Threads

Top