Initialized and Uninitialized Global Variables and Executable File Size

C.S.M.G.Sarma

Hi all,
Here is a code snippet that has been bugging me for a while:

#define size (20 * 1024)
unsigned char data_base[size];

/*my application here*/
..........
..........
..........
..........

The global variable "data_base", as you can see, is uninitialized. The
executable size was 434KB. When this variable was initialized with "0"
like this:

#define size (20 * 1024)
unsigned char data_base[size] = "";

the executable size increased by nearly 20KB.

My compiler is diab [a PPC cross compiler]. I tried the same on TC,
perhaps after editing the file, and observed that the code size with
initialized global variables is larger than that with uninitialized
global variables.

Can anyone tell me why this happens?

cheers,
Sarma
 
Ersek, Laszlo

The two definitions (one with the initialization and one without it) must
have the same effect when looked at from the portable-C programmer's POV.
(All members will be initialized to 0.)

In the second case, the compiler probably treats your code as in

unsigned char data_base[size] = "some specific string of 20K-1 chars";

and generates explicit code (data) for those 20K chars.

In the first case, it relies on the C runtime (OS included) to
zero-initialize the array. (That doesn't necessarily mean a loop or an
explicit memset(), but something that still happens at runtime -- e.g.,
mmap().)

http://www.linuxjournal.com/article/1059

----v----
A further distinction is made between data variables the user has
initialized and data variables the user has not initialized. If the user
has not specified the initial value of a variable, there is no sense
wasting space in the executable file to store the value. Thus, initialized
variables are grouped into the .data section, and uninitialized variables
are grouped into the .bss section, which is special because it doesn't
take up space in the file--it only tells how much space is needed for
uninitialized variables.
----^----

Your compiler could notice that your second definition involves a "special
case" of initialization.

lacos
 
jacob navia

C.S.M.G.Sarma wrote:
> Hi all,
> Here is a code snippet that has been bugging me for a while:
>
> #define size (20 * 1024)
> unsigned char data_base[size];
>
> /*my application here*/
> .........
> .........
> .........
> .........
>
> The global variable "data_base", as you can see, is uninitialized. The
> executable size was 434KB. When this variable was initialized with "0"
> like this:
>
> #define size (20 * 1024)
> unsigned char data_base[size] = "";
>
> the executable size increased by nearly 20KB.

On many systems, uninitialized variables like your "data_base" are placed
in the "BSS" section of the executable. This "section" is just an instruction
to the program loader to allocate a zero-filled space after the program
is loaded, and it takes no space in your executable.

If you initialize your variable (even if you initialize it to zeroes),
it goes OUT of the BSS into the DATA section, and the compiler generates
20K of zeroes that are later loaded into memory when the program starts.

> My compiler is diab [a PPC cross compiler]. I tried the same on TC,
> perhaps after editing the file, and observed that the code size with
> initialized global variables is larger than that with uninitialized
> global variables.

See above.
 
Vincenzo Mercuri

The same happens with my compiler (gcc 4.4.3 on Linux).

If you initialize or assign *each* array element with the same
value, then your object file won't increase in size, because
each element will be treated as if it had that (same) value;
the compiler just doesn't waste space for the entire array.

For example, either this code:

-------------------------------
#define size (20 * 1024)

unsigned char data_base[size];

int main(void)
{
    return 0;
}
-------------------------------

or this:

-------------------------------
#define size (20 * 1024)

unsigned char data_base[size];

int main(void)
{
    /* assign the same value to every element at run time */
    for (int i = 0; i < size; i++)
        data_base[i] = 'a';
    return 0;
}
-------------------------------

won't give me a 20K executable file.

The same happens if I write such an initialization:

unsigned char data_base[size] = {0, 0};
(all the other elements will equal 0)

Something different happens when I
initialize the array this way:

unsigned char data_base[size] = {0, 1};

or

unsigned char data_base[size] = {1, 1};

or

unsigned char data_base[size] = "";

because the compiler 'thinks' this is exactly what I
want: to 'create' an entire array object of dimension /size/
whose first elements have the values I've
just given them. So now I will get a 20K executable file.
[ The initialization /= ""/ is treated as if it were given for
each element, no matter whether the values are all the same;
a string literal is 'treated as a whole'. ]

However, this doesn't happen when I initialize
the array this way:

unsigned char data_base[size] = {'\0'};

because this falls within the first case:
the compiler just doesn't create a 20K
object.

The compiler is kinda aware of the use you are
going to make of that array, so of course this explanation
is likely to fail when your code uses that array
in a different manner: I just tried to explain the specific
case of a /main/ function that actually does nothing.
It is hardly a technical explanation though.
Sorry if I made you even more confused. :)
 
Vincenzo Mercuri

Vincenzo Mercuri wrote:
> If you initialize or assign *each* array element with the same
> value

With zero values, I mean, or by assigning the same values
explicitly.
 
Mark Bluemel

Jacob's reply seems fairly sound - note that it has little or nothing
to do with the C programming language, and quite a lot to do with the
way that a particular executable file format is organised...
 
Ben Bacarisse

Kenneth Brody said:
> Ah, but consider this difference between implicitly-initialized and
> explicitly-initialized variables:
>
> foo1.c:
>
> int i;
>
> foo2.c:
>
> int i;
>
> It would be perfectly valid for a program to include <ot>object
> files</ot> from both source modules.

This is not true. 'int i;' is a "tentative definition" and the effect
(if there is no external definition of i in the translation unit) is
"exactly as if the translation unit contains a file scope declaration of
that identifier [...] with an initializer equal to 0" (6.9.2p2).
I.e. it must behave exactly like:

> However, what about:
>
> foo1.c:
>
> int i = 0;
>
> foo2.c:
>
> int i = 0;
>
> Here, it would not be valid to include both <ot>object files</ot>.

The upshot is that neither is valid.

<snip>
 
