How is it possible to typedef a struct before it has been declared/defined?

Shriramana Sharma

Hello. The following code compiles (and links with a dummy main) quite well on both GCC 4.6.3 and Clang 3.0:

Code:
typedef struct MyStruct_Tag MyStruct ;
struct MyStruct_Tag { int x, y ; } ;
MyStruct a ;

and this surprises me very much because I would have expected the compiler to complain if the first type in a typedef statement (or the second in a C++11 using-style typedef) is not yet even declared.

Can anyone please explain how this is legal?

Thanks.
 
Eric Sosman

C (I won't speak for C++) has the notion of an "incomplete
type," a type that has been specified only in part. Early in
the first line you have `struct MyStruct_Tag', and this -- all
by itself -- is enough to inform the compiler of the existence
of the `struct MyStruct_Tag' type. The type is "incomplete" at
this point: The compiler knows the tag name, and knows that it
is a struct type, but doesn't know any of the details.

Despite the name, `typedef' doesn't actually define a new
type: It just registers an alias for a type already known. So
by the end of the first line, the compiler knows that `MyStruct'
is an alias for the `struct MyStruct_Tag' type -- the type is
still incomplete, either under its "official" name or under
its `MyStruct' alias.
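As a minimal sketch of the point above (the helper names `sum_fields` and `demo_alias` are made up for illustration and are not from the original post), the alias can be used for pointers and prototypes while the type is still incomplete, and both names refer to the same type once it is completed:

```c
#include <assert.h>
#include <stddef.h>

typedef struct MyStruct_Tag MyStruct;   /* alias registered; type still incomplete */

/* A prototype may mention the incomplete type through a pointer. */
int sum_fields(const MyStruct *s);

struct MyStruct_Tag { int x, y; };      /* the type is now complete */

int sum_fields(const MyStruct *s)
{
    assert(s != NULL);
    return s->x + s->y;                 /* members are visible from here on */
}

/* Hypothetical helper: shows that both names denote one and the same type. */
int demo_alias(void)
{
    MyStruct a = { 3, 4 };
    struct MyStruct_Tag *by_tag = &a;   /* no cast needed */
    MyStruct *by_alias = by_tag;
    return sum_fields(by_alias);
}
```

Declaring an object such as `MyStruct a;` inside a function placed before the struct definition would be rejected, since the compiler would not yet know the size.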

The second line "completes" the incomplete declaration,
filling in the missing details. This ability to separate "naming"
the type from "describing" it has two useful consequences:

- As soon as the type name is known you have the ability to write
the names of other related types, most especially pointers to the
named type. That's why you can do things like:

typedef struct node_tag {
    struct node_tag *left;
    struct node_tag *right;
    int keyValue;
} TreeNode;

The type has not been "completely" declared at the point where
`left' and `right' are described, but since the compiler already
knows that `struct node_tag' is a type, it also knows that
`struct node_tag *' is a type. You could separate it further:

typedef struct node_tag TreeNode;
struct node_tag {
    TreeNode *left;
    TreeNode *right;
    int keyValue;
};

... with exactly the same effect.
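To see the self-referential declaration in use, here is a rough sketch of a binary search tree built on `TreeNode`. The functions `tree_insert`, `tree_count` and `tree_free` are hypothetical helpers invented for this illustration, and the ordering rule (smaller keys go left) is an assumption:

```c
#include <stdlib.h>

typedef struct node_tag TreeNode;
struct node_tag {
    TreeNode *left;
    TreeNode *right;
    int keyValue;
};

/* Hypothetical helper: insert key into a binary search tree and return
   the root. If allocation fails, the existing tree is left unchanged. */
TreeNode *tree_insert(TreeNode *root, int key)
{
    if (root == NULL) {
        TreeNode *n = malloc(sizeof *n);
        if (n != NULL) {
            n->left = NULL;
            n->right = NULL;
            n->keyValue = key;
        }
        return n;
    }
    if (key < root->keyValue)
        root->left = tree_insert(root->left, key);
    else
        root->right = tree_insert(root->right, key);
    return root;
}

/* Hypothetical helper: number of nodes in the tree. */
int tree_count(const TreeNode *root)
{
    if (root == NULL)
        return 0;
    return 1 + tree_count(root->left) + tree_count(root->right);
}

/* Hypothetical helper: release every node. */
void tree_free(TreeNode *root)
{
    if (root != NULL) {
        tree_free(root->left);
        tree_free(root->right);
        free(root);
    }
}
```

No `void *` or cast is needed anywhere: `left` and `right` already have the right pointer type.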

- You can leave the type incomplete, telling the compiler only
that a struct type with such-and-such tag name exists but never
describing its innards. This is useful for libraries that want
to have "opaque" or "abstract" types; their headers can write

/* opaque.h */
typedef struct opaque_tag OpaqueData;
OpaqueData *opaqueFactory(void);
void opaqueDestructor(OpaqueData *ptr);
void opaqueSetName(OpaqueData *ptr, const char *name);
const char *opaqueGetName(const OpaqueData *ptr);
// ... and so on

... and then the library's private implementation files can
write, internally:

/* opaque.c */
#include "opaque.h"
struct opaque_tag {
    const char *name;
    double trouble;
    int whatnot;
};

With a setup like this, clients can deal with pointers to the
incompletely-described type, but can never see or mess with
its innards; a new version of the library can rearrange
or expand those innards without disturbing the clients.
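For illustration only, the two files can be collapsed into one translation unit to show how the pieces fit. The function bodies below are guesses at a plausible implementation, not the library's actual code. In particular, storing the caller's pointer in `opaqueSetName` rather than copying the string is an assumption:

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what opaque.h exposes to clients ---- */
typedef struct opaque_tag OpaqueData;
OpaqueData *opaqueFactory(void);
void opaqueDestructor(OpaqueData *ptr);
void opaqueSetName(OpaqueData *ptr, const char *name);
const char *opaqueGetName(const OpaqueData *ptr);

/* ---- what opaque.c keeps private ---- */
struct opaque_tag {
    const char *name;
    double trouble;
    int whatnot;
};

OpaqueData *opaqueFactory(void)
{
    OpaqueData *p = malloc(sizeof *p);  /* sizeof works: type is complete here */
    if (p != NULL) {
        p->name = "";
        p->trouble = 0.0;
        p->whatnot = 0;
    }
    return p;                           /* NULL if allocation failed */
}

void opaqueDestructor(OpaqueData *ptr)
{
    free(ptr);
}

void opaqueSetName(OpaqueData *ptr, const char *name)
{
    assert(ptr != NULL);
    ptr->name = name;                   /* assumption: pointer stored, not copied */
}

const char *opaqueGetName(const OpaqueData *ptr)
{
    assert(ptr != NULL);
    return ptr->name;
}
```

A client that includes only the header part can call these functions, but `sizeof(OpaqueData)` or `ptr->name` in client code would not compile, because the type is incomplete there.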

The same thing can be done with `union' types, too. Also,
an array of unknown size is incomplete:

extern int array[]; // incomplete, remains so
// ...
double vector[]; // incomplete
// ...
double vector[] = { 1.2, 2.3, 3.4 }; // completion
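A small sketch of the array case, reusing `vector` from the lines above. The functions `first_element`, `vector_length` and `element_at` are hypothetical names added for the example:

```c
#include <assert.h>
#include <stddef.h>

double vector[];                          /* incomplete: element count unknown */

/* Indexing is allowed on the incomplete array type, but `sizeof vector`
   would not compile at this point. */
double first_element(void)
{
    return vector[0];
}

double vector[] = { 1.2, 2.3, 3.4 };      /* completion: three elements */

/* From here on the size is known, so sizeof is usable. */
size_t vector_length(void)
{
    return sizeof vector / sizeof vector[0];
}

/* Hypothetical helper: bounds-checked access. */
double element_at(size_t i)
{
    assert(i < vector_length());
    return vector[i];
}
```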
 
James Kuyper

The typedef is actually irrelevant to this issue. You could remove the
'typedef' keyword, and declare 'a' as "struct MyStruct_Tag", and you'd
still have the same issues.

The declaration

struct MyStruct_Tag MyStruct;

declares "struct MyStruct_Tag" to be an incomplete struct type. The
contents of that type are unspecified, and therefore so is the size of
the type. You can't declare an object of an incomplete type, nor an
array of it. However, it can still be used in limited ways. In
particular, you can declare an object to be a pointer to that type - the
standard requires that all pointers to struct types have the same
representation and same alignment requirements, so it's possible to pass
around struct pointers without knowing anything about the contents of
the struct they refer to.

A declaration of an incomplete type can be completed at a later point,
and that's exactly what your code does on the very next line. From that
point onward, the type can be used just as if it had been complete from
the beginning.
 
Les Cargill

'C' is not inherently a single-pass compiled language.

Makes perfect sense to me - otherwise a self-referential struct
would require a void * and cast.

It solves a catch-22.
 
Keith Thompson

Les Cargill said:
'C' is not inherently a single-pass compiled language.

Well, sort of. It's designed for single-pass compilation, but in a few
special cases later declarations can cause earlier declarations to be
"completed". For example:

struct foo;
/* Compiler creates a symbol table entry for "struct foo" as an
incomplete type. */

struct foo { int n; };
/* Makes "struct foo" a complete type */

The compiler doesn't do a second pass over the source file; it just goes
back and updates a symbol that it has previously partially processed.
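One place where this matters in practice is a pair of structs that point at each other. The following is only a sketch: `struct bar`, `link_pair` and `round_trip` are invented for the example, and `struct foo` is given an extra member compared with the snippet above:

```c
struct foo;                       /* incomplete: only the tag is known */

struct bar {
    struct foo *partner;          /* pointer to an incomplete type is fine */
    int id;
};

struct foo {                      /* completes struct foo */
    struct bar *partner;
    int n;
};

/* Hypothetical helper: make the two objects point at each other. */
void link_pair(struct foo *f, struct bar *b)
{
    f->partner = b;
    b->partner = f;
}

/* Hypothetical helper: follow foo -> bar -> foo and read n. */
int round_trip(const struct foo *f)
{
    return f->partner->partner->n;
}
```

By the time `round_trip` is compiled, the entry for `struct foo` has been updated, so the member access through `struct bar` works even though `struct bar` was declared while `struct foo` was still incomplete.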

It's a special-case tweak to the single-pass model, needed because, as
you say, a self-referential struct would otherwise require a void * and
a cast.
 
