extern declaration initialization

Tapeesh

I would like to know what the expected behaviour of C compilers is when
an extern declaration is initialized.

When the following code is compiled using gcc

//File extern.c
int arr[3] ;
int a ;


//File extern.h
extern int arr[3] = {1,2};
extern int a = 4;

//File main.c
#include "extern.h"
#include <stdio.h>
int main()
{
printf("a = %d\n",a);
return 0;
}

An extern declaration means that a variable is declared and no memory
is allocated for it. So, why does such an initialization work and the
compiler only generates a warning for it?

gcc warning
extern.h:1: warning: `arr' initialized and declared `extern'
extern.h:2: warning: `a' initialized and declared `extern'



Another case is
//File extern.c
int arr[3] = {1,2,3} ;
int a = 5;


//File extern.h
extern int arr[3] = {1,2};
extern int a = 4;

//File main.c
#include "extern.h"
#include <stdio.h>
int main()
{
printf("a = %d\n",a);
return 0;
}

In this case the compiler generates an error of redefinition of "arr"
and "a".

gcc warning and error
extern.h:1: warning: `arr' initialized and declared `extern'
extern.h:2: warning: `a' initialized and declared `extern'
/tmp/ccobzB9o_O(.data+0x0): multiple definition of `arr'
/tmp/ccypnIjR.o(.data+0x0): first defined here
/tmp/ccobzB9o_O(.data+0xc): multiple definition of `a'
/tmp/ccypnIjR.o(.data+0xc): first defined here
collect2: ld returned 1 exit status

Now, this means that memory is allocated for this extern declaration.
Is that what is expected?

Regards
Tapeesh
 
slebetman

Tapeesh said:
An extern declaration means that a variable is declared and no memory
is allocated for it.

I've always thought extern means: the variable is declared 'somewhere
else' but I want to use it in this file. Not: declared but no memory is
allocated. If you try to extern an 'unallocated' variable the compiler
will not complain, but the linker will complain with something like
'object not found'.

I've always used extern like this:

file1.c:
int myvariable = 100;
....


file2.c:
extern int myvariable; // Yay! I can access a variable in file 1
....


file3.c:
extern int myvariable; // Yay! I can access a variable in file 1
....


I think you should be able to initialise myvariable anywhere in file1,
file2 or file3 but it has to be instantiated somewhere. In this case
file1. Also, from experience, initialising the variable twice will
sometimes get you a warning.

I'm sure the C gurus here can tell you more.
 
Suman

Tapeesh said:
I would like to know what the expected behaviour of C compilers is when
an extern declaration is initialized.

When the following code is compiled using gcc

//File extern.c
int arr[3] ;
int a ;


//File extern.h
extern int arr[3] = {1,2};
extern int a = 4;

//File main.c
#include "extern.h"
#include <stdio.h>
int main()
{
printf("a = %d\n",a);
return 0;
}

An extern declaration means that a variable is declared and no memory
is allocated for it. So, why does such an initialization work and the
compiler only generates a warning for it?

N869 [C99 draft]:
6.9.2 External object definitions

Semantics
1 If the declaration of an identifier for an object has file scope
and an initializer, the declaration is an external definition for
the identifier.
....
 
pete

I've always used extern like this:

file1.c:
int myvariable = 100;
...

file2.c:
extern int myvariable; // Yay! I can access a variable in file 1
...

file3.c:
extern int myvariable; // Yay! I can access a variable in file 1
...

This is the normal way to do that:

file1.h:
extern int myvariable;
....

file1.c:
#include "file1.h"
int myvariable = 100;
....

file2.c:
#include "file1.h" // Yay! I can access a variable in file 1
....

file3.c:
#include "file1.h" // Yay! I can access a variable in file 1
....

Immediately upon defining any external object
is a good time to decide whether to make it static storage class.
If not, then that might be a good time
to write an extern declaration for it.
 
Kenneth Brody

I've always thought extern means: the variable is declared 'somewhere
else' but I want to use it in this file. Not: declared but no memory is
allocated. If you try to extern an 'unallocated' variable the compiler
will not complain but the linker will complain something like object
not found.

Well, I have seen systems where the linker will treat the case of all
"extern" definitions and no actual definition as "allocate some memory
in the global variables area for it". (ie: as if there were a single,
uninitialized, definition of that variable.) Whether such behavior is
good or evil is left to the opinion of the reader. Of course, this was
only for data, not code.

I've also seen systems where the linker is smart enough to detect a
difference in size of extern variables. So, if you have "extern int i;"
in module A, and "long i;" in module B, the linker will complain that
the sizes don't match. (Assuming, of course, that sizeof(int) and
sizeof(long) are different.)

But, my understanding of "extern" is the same as yours -- it tells the
compiler that the actual variable is to be found elsewhere.

[...]

--
+-------------------------+--------------------+-----------------------------+
| Kenneth J. Brody | www.hvcomputer.com | |
| kenbrody/at\spamcop.net | www.fptech.com | #include <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------------+
 
Chris Torek

I would like to know what the expected behaviour of C compilers is when
an extern declaration is initialized.

See any of about a dozen of my old articles on this, e.g.,
<http://groups.google.com/group/comp.lang.c/msg/2fba661c2f7b2ad5>
(not sure if this URL will work), or search for message-ID

An extern declaration means that a variable is declared and no memory
is allocated for it.

The problem is, this is not what "extern" means. It does *sometimes*
mean that, but not in any case in which there is an initializer.
 
pete

Chris Torek wrote:
The problem is, this is not what "extern" means. It does *sometimes*
mean that, but not in any case in which there is an initializer.

I'm unfamiliar with extern declarations with initializers.
Is there a good use for that,
or is it something extra that's allowed like "auto"
or is it usually associated with a bug?
 
lawrence.jones

Tapeesh said:
I would like to know what the expected behaviour of C compilers is when
an extern declaration is initialized.

It is treated as a definition. If there is another definition of the
same object in another file (as there is in your example), the behavior
is undefined. Some systems quietly accept the code as long as there is
at most one explicit initialization, some systems quietly accept the
code even with multiple initializations (with some rules as to which
initialization wins), and some systems reject the code no matter what.

-Larry Jones

My life needs a rewind/erase button. -- Calvin
 
Jack Klein

This is the normal way to do that:

file1.h:
extern int myvariable;
...

file1.c:
#include "file1.h"
int myvariable = 100;
...

file2.c:
#include "file1.h" // Yay! I can access a variable in file 1
...

file3.c:
#include "file1.h" // Yay! I can access a variable in file 1
...

Immediately upon defining any external object
is a good time to decide whether to make it static storage class.

No, you can't possibly do that. An external object always has static
storage duration. Period. No exceptions.

If you apply the "static" keyword to a file scope declaration or
definition, it is no longer an external object. But it still has
static storage duration.
 
Jack Klein

Well, I have seen systems where the linker will treat the case of all
"extern" definitions and no actual definition as "allocate some memory
in the global variables area for it". (ie: as if there were a single,
uninitialized, definition of that variable.) Whether such behavior is
good or evil is left to the opinion of the reader. Of course, this was
only for data, not code.

In actuality it makes little difference, at least as far as
comp.lang.c and standard C are concerned. That is because a program
with more than one external definition for any object or function has
undefined behavior.

I've also seen systems where the linker is smart enough to detect a
difference in size of extern variables. So, if you have "extern int i;"
in module A, and "long i;" in module B, the linker will complain that
the sizes don't match. (Assuming, of course, that sizeof(int) and
sizeof(long) are different.)

But, my understanding of "extern" is the same as yours -- it tells the
compiler that the actual variable is to be found elsewhere.

The actual requirements of the C standard for external objects or
functions:

If the external identifier is not referenced by the program, there
must be exactly 0 or 1 actual definitions of it.

If the external identifier is referenced by the program, there must be
exactly 1 definition of it.

Anything else is undefined.
 
pete

Jack said:
No, you can't possibly do that. An external object always has static
storage duration. Period. No exceptions.

You got me there.

If you apply the "static" keyword to a file scope declaration or
definition, it is no longer an external object.

I think it still is.

N869
6.9.2 External object definitions
Semantics
[#1] If the declaration of an identifier for an object has
file scope and an initializer, the declaration is an
external definition for the identifier.

The term "external object" isn't defined,
but I think it should mean an object with an external definition.
 
Chris Torek

I'm unfamiliar with extern declarations with initializers.
Is there a good use for that,
or is it something extra that's allowed like "auto"
or is it usually associated with a bug?

I consider it "most similar to auto", out of that list. But note
that:

static int i;
extern int i = 3;

(considered as a complete translation unit) has exactly the same
meaning as:

static int i;
static int i = 3;

That is, "extern" sometimes means "static"!
 
pete

Chris said:
I consider it "most similar to auto", out of that list. But note
that:

static int i;
extern int i = 3;

(considered as a complete translation unit) has exactly the same
meaning as:

static int i;
static int i = 3;

That is, "extern" sometimes means "static"!

Thank you.
 
Joe Wright

pete said:
Jack Klein wrote:

You got me there.

I think it still is.

Nonsense. At file scope the static keyword turns off external linkage.

N869
6.9.2 External object definitions
Semantics
[#1] If the declaration of an identifier for an object has
file scope and an initializer, the declaration is an
external definition for the identifier.

The term "external object" isn't defined,
but I think it should mean an object with an external definition.

But it still has
static storage duration.

It always had static storage class. The point is you can't "see" it now
from outside the translation unit.
 
Joe Wright

Chris said:
I consider it "most similar to auto", out of that list. But note
that:

static int i;
extern int i = 3;

(considered as a complete translation unit) has exactly the same
meaning as:

static int i;
static int i = 3;

That is, "extern" sometimes means "static"!

That's obfuscating. "extern" is a reference to a variable with static
storage class and external linkage. "extern" never means "static" as you
suggest above. I don't think I've ever replied critically to a Torek
post. Be kind.

Consider a C program called 'foo' consisting of three translation units,
foo.c, bar.c and baz.c and a general header, foo.h for common
declarations which is #include'd in all three translation units. I, as
designer of 'foo' determine that a single variable, 'int global' shall
be visible and available to functions in all three translation units.

In only one translation unit, foo.c, I define and initialize at file scope..

int global = 0;

...which gives global static storage and external linkage, and in foo.h I
declare..

extern int global;

As all translation units #include "foo.h", all functions in the program
can read and write 'int global'.

That's the way I do it. YMMV.
 
pete

Joe said:
Nonsense. At file scope the static keyword turns off external linkage.

"External" is also used in the standard in contexts
that have nothing to do with linkage.

In translation phase 8
"All external object and function references are resolved."
I think that sentence refers to objects and functions
with internal linkage as well as external linkage,
but I'm not sure.
 
Chris Torek

That's obfuscating.

It is certainly not something I would ever recommend writing (and
indeed, I would and do recommend against it). But:

"extern" is a reference to a variable with static storage class
and external linkage.

Unfortunately, this is not the case (though it would be nice if it
were).

The "extern" keyword has two effects, one on "linkage" and one on
"definition-ness".

In order to explain this, we have to define the terms "scope" and
"linkage", and describe what "definition-ness" is. We might as
well also mention "duration".

There are three durations for "objects" (objects being regions of
memory that hold bit patterns): static, automatic, and allocated.

A static-duration object -- a variable, or the contents of a string,
or in C99, the contents of some (but not all) compound literals --
is created at program startup (or possibly even before then, but
the C standards are little concerned with anything before or after
a program runs). It persists in memory until the program exits
(and, again, possibly even afterward).

An automatic-duration object is created by entering the block that
defines it, and destroyed (at least potentially) upon exiting that
block.

An allocated-duration object is created by calling malloc(), and
destroyed by calling free() with the value malloc() returned. (I
prefer to gloss over realloc() at this point. :) ) Since pointers
are required in order to make any sensible use of allocated-duration
objects, we can ignore them here.
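
As a rough illustration of the three durations, here is a minimal sketch. The names static_obj, auto_obj, alloc_obj, and sum_durations are hypothetical and appear nowhere in the discussion above.

```c
#include <stdlib.h>

/* Static duration: exists for the whole run of the program. */
static int static_obj = 1;

/* Returns 1 + 2 + 3 when the allocation succeeds, or -1 if malloc() fails. */
int sum_durations(void)
{
    int auto_obj = 2;                            /* automatic: lives until the block exits */
    int *alloc_obj = malloc(sizeof *alloc_obj);  /* allocated: lives until free() */
    int sum;

    if (alloc_obj == NULL)
        return -1;
    *alloc_obj = 3;
    sum = static_obj + auto_obj + *alloc_obj;
    free(alloc_obj);                             /* the allocated object ends here */
    return sum;
}
```

Only static_obj can be named from other functions in the file; the other two objects are reachable only through names or pointers local to sum_durations().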

The "scope" of an "identifier" (a variable or function name, etc.;
in this case we are only really interested in variables) describes
the locations in which the identifier is visible, within a single
translation unit (roughly, "C source file after expanding #includes").
Given something like:

void f(void) {
    int i;
    ... code section 1 ...
    { double d; ... code section 2 ... }
    ... code section 3 ...
}

the scope of "i" is most[%] of the body of "f", including all three
code sections. We can refer to "i" in any of those areas and get
the variable "i" that is local to f(). (This assumes there are no
new definitions of an inner-scope "i"; in C89 those would require
braces, while in C99 we could define additional "i"s in "for" loops.)
The scope of "d", however, is restricted to code section 2.

[% I say "most of" rather than "all of" because i does not come
into scope until partway through the declaration. If we were to
declare another variable before "i", we could not refer to "i"
yet:
void f(void) { int *ip = &i; int i = 42; ... } /* ERROR */
so i is not visible in *all* of the body of f().]
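
A small sketch of how an inner declaration hides an outer one for the length of its own block. The function name shadow_demo and the variable seen_inner are hypothetical, chosen only for illustration.

```c
/* Returns 21: the inner block sees its own i (2), the outer code sees 1. */
int shadow_demo(void)
{
    int i = 1;            /* scope: the rest of the function body */
    int seen_inner;

    {
        int i = 2;        /* a new i, visible only inside these braces */
        seen_inner = i;   /* refers to the inner i */
    }
    /* the inner i is out of scope here; i names the outer variable again */
    return seen_inner * 10 + i;
}
```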

Scopes are, in effect, numbered: open braces (and in C99, "for"
loops with declarations) increment a number, while close braces
(and the ends of those C99 for loops) decrement it, and a variable
"goes out of scope" when its number is used up:

int a0;                 /* scope number = 0 */
void f(void)
{                       /* now scope # = 1 */
    int a1;             /* a1 at scope 1 */
    {
        int a2;         /* a2 at scope 2 */
        for (int a3 = 0; a3 < N; a3++) /* a3 at scope 3 */
            ...;
        /* scope 3 terminated, a3 vanishes */
    } /* scope 2 terminated, a2 vanishes */
    ...
} /* scope 1 terminated, a1 vanishes */

Variables named as part of a function prototype are "smuggled down"
into scope 1. (Some compilers actually reserve scope level 1 for
"goto" labels, making function-level variables, including formal
parameters, occupy scope 2. In effect, the counter goes 0, 2, 3,
4, ..., 3, 2, 0. But this is just an implementation gimmick that
achieves what the standard refers to as "function scope" for goto
labels.)

The "linkage" of an identifier has to do with whether the identifier
is visible *outside* a given translation unit. File-scope variables
always have some sort of linkage. There are only two real linkages,
"internal" and "external". Block-scope variables *can* have linkage,
but usually do not (or as the standard says, the linkage they have
is "no linkage", which is kind of Zen-ish I guess :) -- the text
of the standard says three linkages, internal, external, and none).

We can get block-scope variables that have linkage when we use
"extern":

void f(void) {
extern int a;
...
}

though I generally recommend that any identifiers with linkage be
named at file scope. (I find this less confusing, in general,
though there are a reasonable number of toss-up cases as exceptions,
i.e., the code is equally unclear whether the declaration is block
or file. External-linkage variables tend to make code harder to
read, so are best avoided unless doing so also makes the code harder
to read. "... All courses may run ill.")
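
To make the block-scope case concrete, here is a minimal sketch with hypothetical names (bump_total, total). The definition is deliberately placed after the function, so the only declaration visible inside the function is the block-scope one.

```c
/* The block-scope declaration has external linkage: it refers to the
   object named total that is defined elsewhere (here, later in the file). */
int bump_total(void)
{
    extern int total;     /* declaration only; no storage is reserved here */
    return ++total;
}

/* The single definition of total. */
int total = 10;
```

In a real program the definition would more likely live in another translation unit, which is where a file-scope declaration in a shared header tends to be easier to follow.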

At file scope, the "static" keyword *always* means "internal linkage"
(because file scope variables invariably have static duration
already, so there is no need for "static" to mean "static duration").
At block scope, the "static" keyword means "static duration" instead,
and has no effect on linkage (which remains "no linkage"). Hence,
in:

static int a0;
void f(void) {
static int a1;
...
}

a0 has file scope, static duration, and internal linkage; a1 has
block scope (numerically "level 1" or "level 2" inside a typical
compiler), static duration, and no linkage.
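
The two meanings of "static" can be seen side by side in a short sketch. The names file_hits, next_id, and hits are hypothetical.

```c
static int file_hits;      /* file scope: internal linkage, static duration */

/* Returns 1, 2, 3, ... on successive calls. */
int next_id(void)
{
    static int id;         /* block scope, no linkage, static duration; starts at 0 */

    file_hits++;
    return ++id;           /* id keeps its value between calls */
}

/* file_hits is visible to every function in this file, id only inside next_id(). */
int hits(void)
{
    return file_hits;
}
```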

The "extern" keyword is more problematic. It would make sense if
it always meant "external linkage" -- but file scope variables
already get external linkage unless we suppress it with "static".
So from a linkage standpoint, writing the "extern" on b1 here is
redundant:

int b0; /* file scope, static duration, external linkage */
extern int b1; /* file scope, static duration, external linkage */

In this case, the real use for extern is to suppress "definition-ness".

In Standard C, a variable that is initialized where it is declared
is always a definition of that variable:

int c0 = 3;

Here c0 has the same scope, duration, and linkage as usual (file,
static, and external), but we have definitely defined c0 and given
it an initial value (3). No other translation unit should attempt
to define c0. As Joe Wright notes, we can use "extern" -- preferably
in a header file -- thus:

extern int c0;

to tell the compiler "there is a c0 out there, defined in one
translation unit; use that one."
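
A minimal sketch of that split, collapsed into one file for brevity. In practice the first line would sit in a header and the last in exactly one source file. The helper twice_c0 is a hypothetical name added for illustration.

```c
/* What a shared header would contain: a declaration, not a definition. */
extern int c0;

/* Code that has seen only the declaration can already use c0. */
int twice_c0(void)
{
    return 2 * c0;
}

/* What exactly one source file would contain: the definition. */
int c0 = 3;
```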

Now, also in Standard C, if we declare a variable at file scope,
but without the "extern" keyword *and* without an initializer, the
Standard refers to this as a "tentative definition". That is, this
declaration is a definition if and only if the same translation
unit does not define the same variable later, using an actual
initializer:

int c1; /* tentative definition of c1 */
int c2; /* tentative definition of c2 */
...
int c2 = 42; /* actual definition of c2 */

At the end of any given translation unit, the compiler is required
to "go back" to all the tentative definitions that have not been
supplanted by actual definitions, and initialize those variables to
zero (0, '\0', 0.0, NULL, whatever is appropriate given the type
of the variable). (Many compilers for Unix-like systems achieve
this without actually "going back", by using a linker trick in
which an uninitialized variable is marked as being in "BSS" space.
BSS stands for Block Started by Symbol, and dates back to ancient
IBM assembler. The linker combines bss symbols with data symbols
as needed, making things particularly easy for the compiler. But
again, this is just an implementation trick -- the C standard says
that tentative definitions become actual definitions, initialized
to zero.)
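
The effect can be checked with a small sketch. The variable and function names below are hypothetical, and the code must be compiled as C: C++ has no tentative definitions and rejects the repeated declaration.

```c
int only_tentative;          /* tentative definition, never given an initializer */
int later_defined;           /* tentative definition ... */
int later_defined = 42;      /* ... supplanted by this actual definition */

/* At the end of the translation unit only_tentative behaves as if it
   had been written "int only_tentative = 0;". */
int read_only_tentative(void) { return only_tentative; }
int read_later_defined(void)  { return later_defined; }
```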

Tentative definitions work with internal-linkage identifiers too:

static int c3;
static int c3 = 6 * 9;

The first declaration is a tentative definition, which is replaced
by the second declaration, which is an explicit definition.

[Begin sidebar]

One might wonder why C should have tentative definitions at all.
The answer is, they allow you to create circular data structures
at compile time. Consider the following:

struct queue {
    struct queue *forw, *back; /* doubly linked queue */
    ... additional data ...
};

static struct queue qelem1, qelem2; /* tentative definitions */

/* distinguished dummy queue head (queue element #0) */
static struct queue qhead = { &qelem1, &qelem2 };
static struct queue qelem1 = { &qelem2, &qhead, ... };
static struct queue qelem2 = { &qhead, &qelem1, ... };

Here, queue element 1 points forward to element 2 and backward to
the dummy queue head; element 2 points forward to the head and
backward to element 1; and the head points forward to element 1
and backward to element 2. We have a classic doubly-linked queue
with two elements, all created at compile time, rather than by
some runtime initialization.

Try to write the above without using tentative definitions. Note
that if we were to drop the "static"s and use external linkage
instead, we *could* do it, using the "extern" keyword!
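
One way the external-linkage variant might look, as a sketch: the int member value stands in for the elided additional data, and queue_length is a hypothetical helper added only to show that the links are consistent.

```c
struct queue {
    struct queue *forw, *back;   /* doubly linked queue */
    int value;                   /* stand-in for the additional data */
};

/* Declarations only: extern suppresses even tentative definition. */
extern struct queue qelem1, qelem2;

struct queue qhead  = { &qelem1, &qelem2, 0 };
struct queue qelem1 = { &qelem2, &qhead,  1 };
struct queue qelem2 = { &qhead,  &qelem1, 2 };

/* Walks forward from the head, counting elements until it is back at the head. */
int queue_length(void)
{
    int n = 0;
    const struct queue *p;

    for (p = qhead.forw; p != &qhead; p = p->forw)
        n++;
    return n;
}
```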

[End sidebar]

Thus, at file scope, C generally uses the "extern" keyword to mean
"suppress even tentative-definition-ness of this external-linkage
identifier". C does not need "extern" to give the identifier
external linkage, because that is the default: we have to use
"static" to give it internal linkage.

But now take a look at the following wording in the C standard:

[#4] For an identifier declared with the storage-class
specifier extern in a scope in which a prior declaration of
that identifier is visible, if the prior declaration
specifies internal or external linkage, the linkage of the
identifier at the later declaration becomes the linkage
specified at the prior declaration. If no prior declaration
is visible, or if the prior declaration specifies no
linkage, then the identifier has external linkage.

Although this is written in Standard-ese, it really says: "If the
programmer uses the extern keyword, the compiler has to check to
see if the variable is already declared using the static keyword
so as to give it internal linkage. If this is the case, the compiler
should pretend that the extern keyword is not there, and the static
keyword is there instead. Otherwise -- when no prior declaration
is visible or the prior declaration specifies no linkage -- the
compiler should go ahead and do external linkage."

Hence:

static int d0;
extern int d0;

gives d0 internal linkage.
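
A compilable sketch of the well-defined case follows; the initializer and the function read_d0 are hypothetical additions. A single file cannot observe linkage at run time, so this shows only that both declarations name the same object. Inspecting the object file's symbol table, for example with a tool such as nm where one is available, is one way to confirm that d0 is not exported.

```c
static int d0 = 7;    /* internal linkage */
extern int d0;        /* the prior declaration is visible, so d0 keeps internal linkage */

int read_d0(void)
{
    return d0;        /* both declarations refer to the same object */
}
```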

However, consider the following translation unit:

static int d1;
void f(void)
{
    int d1;
    {
        extern int d1; /* ERROR -- DO NOT DO THIS! */
    }
}

Here, we use a block-scope d1 to hide the internal-linkage d1. This
block-scope d1 has no linkage (and automatic duration, and block
scope of course). So the "visible" "prior declaration" that the
Standard-ese above talks about has "no linkage", and the third "d1"
above has external linkage. This, according to the same text in
the standard (three paragraphs on), has undefined behavior:

[#7] If, within a translation unit, the same identifier
appears with both internal and external linkage, the
behavior is undefined.

To avoid the problem, I strongly recommend against ever allowing
any identifier to be defined once with "static" and then again
later with "extern", in the same translation unit. The well-defined
case, illustrated above with d0, can be handled by using the "static"
keyword (at file scope). The undefined case, illustrated above
with d1, occurs when "extern" is used in block scope. Avoiding
using "extern" in block scope is also possible, and often sensible;
if you do both, you will certainly be safe from this particular
pitfall ("belt and suspenders", or -- for the British -- "belt and
braces", as it were).
 
