Q: Local variables initialization shortcut.

  • Thread starter: Jean-Christophe

Jens Thoms Toerring

Jean-Christophe said:
On June 2, 13:55, (e-mail address removed) (Jens Thoms Toerring) wrote:
You are right because you're talking about *pointers* :
int i;
double d1, d2, *a, *b;
a = &d1;
b = &d2;
i = (int)( b - a ); // assuming &d2 > &d1
then i == 1 // number of variables
because it's a difference of pointers
of a given variable type.
This is a different matter than :
int i;
double d1, d2;
i = (int)( &d2 - &d1 ); // assuming &d2 > &d1
then i == sizeof(double) // number of bytes
because it's a difference of addresses
whatever the variable type is.

I don't agree - you are subtracting addresses and, as
long as you don't cross the border of undefined
behaviour, it doesn't matter if you directly subtract the
addresses of two variables or first store those addres-
ses in pointers and then subtract their values. All that
matters (at least when used as defined by the standard,
i.e. when subtracting the addresses of elements of an
array) is the type of the elements. Thus

double a[ 10 ];
double * ap2 = &a[ 2 ],
* ap7 = &a[ 7 ];
printf( "%d %d\n", ( int ) ( &a[ 7 ] - &a[ 2 ] ),
( int ) ( ap7 - ap2 ) );

will always output "5 5", the number of elements in
between (and you're guaranteed that later elements in
the array have higher addresses). If you want the
number of bytes then you have to either multiply by
the size of a double or cast the addresses of the
array elements to char before subtracting them.

When instead of doing that on elements of an array
on addresses of unrelated objects then the results
could actually be different, but there's nothing which
would tell you which one is correct or to be expected -
the compiler writer can pick whatever behaviour (s)he
considers appropriate. It would also be "correct" when
the result would always be 42.
I understand I made the error of expanding my knowledge
of 'micro-controllers' C compilers to the 'PC' C compilers.
Some things won't work at all, and if they do
it's even worse because it's just luck.

You may indeed have gotten used to a "dialect" of C (or
certain ways those compilers handle things not defined
by the standard) and you may have to unlearn a few things
you are taking for granted. Been there, done that, and
swore quite a bit along the way ;-) And it's sometimes
quite hard to grasp why certain constructs aren't well
defined by the standard, especially when one comes from
an assembler background and for quite a number of things
there is an "obvious" way in which one would assume they
should be dealt with. There's a long list of things that
are either unspecified, undefined or implementation-defined
at the end of the C99 standard (Annex J). It's not an easy
read but can help a bit in getting an idea of what one
should avoid when trying to write portable programs.

Regards, Jens
 

Jean-Christophe

I don't agree (...)

Yes, sorry about that.
What I had in mind was this :

#include <stdio.h>
double a,b;
int main(void)
{
unsigned int ia = (unsigned int)&a;
unsigned int ib = (unsigned int)&b;
printf( "ib - ia = %u\r\n", ib - ia );
printf( "ia - ib = %u\r\n", ia - ib );
return 0;
}

Output :
ib - ia = 8 // = sizeof(double)
ia - ib = 4294967288 // 0xFFFFFFF8

When instead of doing that on elements of an array
on addresses of unrelated objects then the results
could actually be different, but there's nothing which
would tell you which one is correct or to be expected -
the compiler writer can pick whatever behaviour (s)he
considers appropriate. It would also be "correct" when
the result would always be 42.

I got that !
(Douglas Noel Adams, uh ?)

You may indeed have gotten used to a "dialect" of C (or
certain ways those compilers handle things not defined
by the standard) and you may have to unlearn a few things
you are taking for granted. Been there, done that, and
swore quite a bit along the way ;-) And it's sometimes
quite hard to grasp why certain constructs aren't well
defined by the standard, especially when one comes from
an assembler background

That's it: I'm an electronics engineer.

and for quite a number of things
there is an "obvious" way in which one would assume they
should be dealt with. There's a long list of things that
are either unspecified, undefined or implementation-defined
at the end of the C99 standard (Annex J). It's not an easy
read but can help a bit in getting an idea of what one
should avoid when trying to write portable programs.

Thanks again, Jens.
 

Eric Sosman

I'm re-writing a messy 7500 lines source code into C,
I kept all the same variables names to ease debugging
and there are a LOT of functions, each one with its
own LOT of local variables - all having different names.
At least I want to initialise all of them to zero to
avoid uninitialized variables ****-up (if I may say so)

Usually it's better to leave the variable uninitialized than
to initialize it with an essentially meaningless value. As part
of their optimization efforts, most compilers perform data flow
analysis to answer questions like "Must we fetch the value of `x'
from memory, or is the value in CPU register 7 still valid?"
Such analysis will discover execution paths that might read a
variable before it's written, and the compiler can usually be told
to produce diagnostics when such paths are detected. The popular
gcc compiler, for example, has the "-Wuninitialized" flag (which
is implied by some others like "-Wall") to enable such warnings.

So if you have

double x;
for (int i = 0; i < N; ++i) {
    if (array[i] < 0) {
        x = array[i];
        break;
    }
}
printf ("%g\n", x);

... gcc can warn you that `x' might not have a value when used,
because the `if' might never trigger.

BUT if you change the first line to `double x = 0;' gcc will
NOT issue any such warning: It sees that `x' necessarily has a
value, regardless of what happens with the `if' statement. So
gcc will be silent and the output will be zero -- which is fine
if that was in fact the intent, but not so fine if you really
truly expected `x' to be the first negative value in `array',
and there isn't one.

In short, the bug (assuming `array' always has at least one
negative value) is still there, and all you've done is make the
bug's symptom predictable: Your supposedly negative value shows
up as zero. So, which would you rather have: A nice session with
the debugger, trying to discover why `x' isn't negative (or why a
long computation deriving something from `x' behaves strangely),
or a compile-time warning drawing your attention to a potential
problem?
So it won't save me 'some' typing: I'll save a 'lot' of typing.

Seems to me the `lot' of typing does more harm than good.
Usually. YMMV. And so on.
 

Noob

Jean-Christophe said:
The function f() has some local (double)
which should all be initialized to zero :

a = b = c = ... = y = z = 0.0;

Can I use a shortcut like this :

memset( &a, 0, number_of_variables * sizeof(double) );

If yes, can I do this :

memset( &a, 0, (&z - &a + sizeof(double)) );

Like this :

void f(void)
{
double a,b,c,d,...,x,y,z;

memset( &a, 0, (unsigned int)( &z - &a + sizeof(double) ) );

...
}

You can use an anonymous struct.
No need for memset, and portable; what more do you want? :)

void foo(void)
{
struct { double a,b,c,d,e,f,g,h; } s = { 0 };
s.a = 42.0;
}
 

BartC

Jean-Christophe said:
Yes, sorry about that.
What I had in mind was this :

#include <stdio.h>
double a,b;
int main(void)
{
unsigned int ia = (unsigned int)&a;
unsigned int ib = (unsigned int)&b;
printf( "ib - ia = %u\r\n", ib - ia );
printf( "ia - ib = %u\r\n", ia - ib );
return 0;
}


(char* can be used in place of unsigned int here.)
Output :
ib - ia = 8 // = sizeof(double)
ia - ib = 4294967288 // 0xFFFFFFF8

The trouble is code like this:

double a;
char x;
double b;

On a machine which doesn't need to align doubles, it's quite possible that
(char*)&a - (char*)&b might be 9 bytes. (I don't know what C would do with
&a-&b in that case; possibly divide the 9 bytes by 8 to get 1, the answer
you expect.)

You might try this:

#include <stdio.h>
double a;
char x,y,z;
double b;

int main(void)
{
printf( "&a = %u\n", (unsigned int)&a);
printf( "&b = %u\n", (unsigned int)&b);
printf( "&x = %u\n", (unsigned int)&x);
printf( "&y = %u\n", (unsigned int)&y);
printf( "&z = %u\n", (unsigned int)&z);
return 0;
}

You'll probably find the addresses are all mixed up.

Even when *all* variables are doubles, you won't know which have the lowest
and highest addresses.
 

Jean-Christophe

#include <stdio.h>
double a;
char x,y,z;
double b;
int main(void)
{ printf( "&a = %u\n", (unsigned int)&a);
printf( "&b = %u\n", (unsigned int)&b);
printf( "&x = %u\n", (unsigned int)&x);
printf( "&y = %u\n", (unsigned int)&y);
printf( "&z = %u\n", (unsigned int)&z);
return 0;
}

Output :

&a = 4221624
&b = 4221632
&x = 4221640
&y = 4221641
&z = 4221642
You'll probably find the addresses are all mixed up.
Even when *all* variables are doubles, you won't know
which have the lowest and highest addresses.

:o)
 

James Kuyper

On June 2, 12:06, jacob navia wrote:

| Can I use a shortcut like this (...)


Okay, thanks.

Since these variables are located in the stack,
is there a way to FORCE the compiler to align them,
using a compiler directive or something ?

Yes - but it's not a compiler directive, it's "something" called an
array. Instead of a, b, c, ..., z, use arr[0], arr[1], arr[2], ...,
arr[25]. Then you could use memset() on the entire array.

Keep in mind that memset() sets the specified number of bytes,
interpreted as unsigned char, to the specified value. If that value is
0, on many systems that will cause the doubles to have a value of 0.0.
In C99, if __STDC_IEC_559__ is pre-defined by the implementation, that's
guaranteed to be true. However, if using C90, or if __STDC_IEC_559__ is
not pre-defined, an implementation is allowed to use a floating point
representation where all-bits-0 represents some other number entirely, or
even a NaN. It could even be a trap representation, in which case any
attempt to retrieve the value of the double objects after calling
memset() has undefined behavior.

The easiest way to portably guarantee that your floating point values
are zero-initialized is to do so explicitly:

double a=0, b=0, c=0, ..., z=0;

If you follow my suggestion and use an array, you can save a lot of typing:

double arr[26] = {0};

You should understand what that does: it explicitly sets arr[0] to 0,
and implicitly zero-initializes the other 25 elements. There's a
common misunderstanding of code like this; the easiest way to correct
that misunderstanding is to consider the following alternative:

double arr[26] = {1};

That does NOT set every element of arr to 1. It explicitly sets arr[0] to
1, but the remaining 25 elements would still be implicitly zero-initialized.
That's not what I call 'easy'.

The :) was your clue that it was a joke. He was basically hinting at
the 'a=0, b=0, c=0' approach I described above.
 

BartC

Jean-Christophe said:
On Jun 2, 4:08 pm, "BartC" :
Output :

&a = 4221624
&b = 4221632
&x = 4221640
&y = 4221641
&z = 4221642

I tried it on four x86-32 compilers, and get the following (I've added an
extra term to highlight the differences):

&a = 4202528 0
&b = 4202544 16
&x = 4202552 -4
&y = 4202540 -12
&z = 4202536 -4

&a = 4214968 0
&b = 4214984 16
&x = 4214992 -7
&y = 4214977 -15
&z = 4214976 -1

&a = 4244000 0
&b = 4244008 8
&x = 4245044 1037
&y = 4245045 1
&z = 4245046 1

&a = 4231444 0
&b = 4231452 8
&x = 4231464 8
&y = 4231460 -4
&z = 4231468 8

The following was from a non-C compiler (although it's possible that some C
compiler could return this too):

&a = 1638040 0
&b = 1638028 -12
&x = 1638039 11
&y = 1638038 -1
&z = 1638037 -1

While in the C examples, 'a' had the lower address, I don't see how this can
be guaranteed (and you might expect it to have a higher address, if allocated
first and the stack frame growing downwards). And in general you can't know
what order variables might be allocated in (perhaps in declaration order,
perhaps in alphabetical order, but it can be anything). Or some might not
exist in memory at all.
 

Ike Naar

Sorry about the misunderstanding,
this is what I meant :

#include <stdio.h>
double a,b;
int main(void)
{
unsigned int ia = (unsigned int)&a;
unsigned int ib = (unsigned int)&b;
printf( "ib - ia = %u\r\n", ib - ia );
printf( "ia - ib = %u\r\n", ia - ib );
return 0;
}

ib - ia = 8 // = sizeof(double)
ia - ib = 4294967288 // crap

It's probably not crap, but -8 + (UINT_MAX+1) .
 

BartC

Jean-Christophe said:
Yes, Jens pointed it to me and I agreed.

But, if the purpose now is to reduce typing, then sticking "s." in front of
every instance of a variable sounds like more work than simply putting "=0"
once after its declaration.
 

Jens Thoms Toerring

Jean-Christophe said:
What I had in mind was this :
#include <stdio.h>
double a,b;
int main(void)
{
unsigned int ia = (unsigned int)&a;
unsigned int ib = (unsigned int)&b;
printf( "ib - ia = %u\r\n", ib - ia );
printf( "ia - ib = %u\r\n", ia - ib );
return 0;
}
Output :
ib - ia = 8 // = sizeof(double)
ia - ib = 4294967288 // 0xFFFFFFF8

Please note that converting a pointer to an (unsigned)
int is also not well-defined - e.g. an int could be too
small for holding the value of a pointer. You might try
uintptr_t (an integer type capable of holding object
pointers) instead (but it is an optional type and
requires C99). You could also simply cast to char * (a
cast that is guaranteed to work) and you should get the
same (or at least a comparable) result

printf( "ib - ia = %lu\n",
( unsigned long ) ( ( char * ) &b - ( char * ) &a ) );

without having to convert the pointers to some integer type
(since the size of a char is 1 per definition). The only
possible catch here is that unsigned long could still be
too small to hold a difference between two pointers (and
using "%td" to have the difference between the two pointers
treated as a ptrdiff_t type suffers from the same problem).

BTW, I just noticed something funny: the output of this program

#include <stdio.h>
double a, b;
int main( void )
{
double c, d;
printf( "global diff: %td\n", ( char * ) &a - ( char * ) &b );
printf( "local diff: %td\n", ( char * ) &c - ( char * ) &d );
return 0;
}

is

global diff: 8
local diff: -8

So the order of the positions of the variables in memory
(at least on my machine and using gcc 4.6.1) is different for
local and global variables. While not terribly interesting it
could serve as an example that making assumptions about the
memory layout of variables the compiler will use is at least
tricky ;-)
I got that !
(Douglas Noel Adams, uh ?)

Got it all in one go;-)
Regards, Jens
 

Jens Thoms Toerring

Yes, Jens pointed it to me and I agreed.

Sorry, but I didn't come up with the anonymous structure
trick. I don't think anybody before "Noob" mentioned it.

Regards, Jens
 

BartC

I tried it on four x86-32 compilers, and get the following (I've added an
extra term to highlight the differences):
While in the C examples, 'a' had the lower address, I don't see
how this can be guaranteed (and you might expect it to have a
higher address, if allocated first and the stack frame growing
downwards).

Jens' reply reminds me that, for some reason, your test was on static
variables. Rerunning the test on local variables, then generally a had the
higher address.

That's with just two doubles. Declaring doubles a,b,c,d,e (with other
variables interspersed), and displaying the difference between &d and &b, I
was getting results of +8, -16 and +16 over four compilers.
 

Eric Sosman

[...]
So the order of the positions of the variables in memory
(at least on my machine and using gcc 4.6.1) is different for
local and global variables. While not terribly interesting it
could serve as an example that making assumptions about the
memory layout of variables the compiler will use is at least
tricky ;-)

One implementation *alphabetized* global variables, apparently
during the link phase. I discovered this while trying to port
some code that assumed `double trouble, bubble;' would imply
`&trouble < &bubble', which turned out not to be the case ...

Specifically, the code was doing things like (paraphrased):

int checksum_start;
... data to be checksum-protected ...
int checksum_end;

for (int *p = &checksum_start; ++p < &checksum_end; ) {
    ... accumulate checksum ...
}

When alphabetization put `checksum_end' *before* `checksum_start',
the fan was hit by the fertilizer.
 

Jean-Christophe

(...) BTW, I jut noticed something funny:
the output of this program
#include <stdio.h>
double a, b;
int main( void )
{ double c, d;
printf( "global diff: %td\n", ( char * ) &a - ( char * ) &b );
printf( "local diff: %td\n", ( char * ) &c - ( char * ) &d );
return 0;
}
is
global diff: 8
local diff: -8

I understand that : the address allocation for global
variables increases from the starting address,
while for local variables allocated on the stack
the stack pointer is decreased for each new variable.
the result would always be 42.
Got it all in one go ;-)

Well, on MY computer the correct
value is 41.999999999999999999
 

Jean-Christophe

One implementation *alphabetized* global variables, apparently
during the link phase. I discovered this while trying to port
some code that assumed `double trouble, bubble;' would imply
`&trouble < &bubble', which turned out not to be the case ...
Specifically, the code was doing things like (paraphrased):
int checksum_start;
... data to be checksum-protected ...
int checksum_end;
for (int *p = &checksum_start; ++p < &checksum_end; ) {
    ... accumulate checksum ...
}

When alphabetization put `checksum_end' *before* `checksum_start',
the fan was hit by the fertilizer.

That's funny because some years ago
I implemented something like this
... in asm, so it worked pretty well.
 

Eric Sosman

That's funny because some years ago
I implemented something like this
... in asm, so it worked pretty well.

Okay, but what has that to do with C? If a construct has some
particular effect in an assembler, do you expect to get the same
outcome in C? If it has some particular effect in Java, do you
expect the same outcome in C? If it has some particular effect in
German, do you expect the same outcome in C?

Don't confuse the tool you're using with the tool you're not.
 

BartC

Jean-Christophe said:
I understand that : the address allocation for global
variables increases from the starting address,
while for local variables allocated on the stack
the stack pointer is decreased for each new variable.

I still don't think you appreciate that these things can be unpredictable
(because addresses aren't allocated at the instant the identifier is
encountered in the source). Try this much simpler test:

#include <stdio.h>
double three;
double four;
int main(void)
{

printf( "&three = %u\n", (unsigned int)&three);
printf( "&four = %u\n", (unsigned int)&four);
printf( "&four-&three = %d\n", (int)&four-(int)&three);
return 0;
}

Which do you think has the lower address, 'three' or 'four'? Of my four
compilers, two gave a difference of -8, and two of +8.
 
