Benign typedef definition

jacob navia

Document: WG14/N1360
Date: 2009/02/27
References: WG14/N1321, WG14/N1336, WG14/N1346
Authors: Jim Thomas
Reply to: Jim Thomas <[email protected]>

Subject: benign typedef redefinition

C++ allows a typedef redefinition with the same name as a previous
typedef to appear in the same scope, as long as it names the same type.
Some C compilers allow similar typedef redefinition as an extension,
though C99 does not allow it. Adding benign typedef redefinition to C1x
would enhance consistency with C++, standardize some existing practice,
and safely eliminate a constraint that is unhelpful and an occasional
nuisance to users.

Recommended change (to C1x draft N1336): Change 6.7 #3 from:

If an identifier has no linkage, there shall be no more than one
declaration of the identifier (in a declarator or type specifier) with
the same scope and in the same name space, except for tags as specified
in 6.7.2.3.

to:

If an identifier has no linkage, there shall be no more than one
declaration of the identifier (in a declarator or type specifier) with
the same scope and in the same name space, except a typedef specifier
can be used to redefine the name of any type declared in that scope to
refer to the type to which it already refers, and except for tags as
specified in 6.7.2.3.
--------------------------------------------------------------------

This is exactly the behavior of lcc-win. If a typedef is repeated,
and it names the same type, lcc-win will silently accept this.

The need arises mostly when the same data types are defined in several
headers and you have to include more than one of them.

I think this is a good idea.
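
A minimal sketch of the situation described above, assuming two hypothetical headers (`a.h` and `b.h`, shown inline here) that both define the same convenience type. The names `ulong_id` and `next_id` are illustrative only. A compiler that implements the proposed rule (C11 mode or later, or a C99 compiler with the extension) should accept it; a strict C99 compiler is expected to diagnose the second typedef.

```c
/* --- what a hypothetical "a.h" would contain --- */
typedef unsigned long ulong_id;

/* --- what a hypothetical "b.h" would contain --- */
typedef long unsigned int ulong_id;   /* same name, same type: benign */
/* typedef unsigned int ulong_id; */  /* different type: still an error */

/* Uses the typedef so the example has observable behavior. */
ulong_id next_id(ulong_id current)
{
    return current + 1;
}
```

The second typedef spells the type differently but names the same type, which is all the proposed wording requires.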
 
Kaz Kylheku

Document: WG14/N1360
Date: 2009/02/27
References: WG14/N1321, WG14/N1336, WG14/N1346
Authors: Jim Thomas
Reply to: Jim Thomas <[email protected]>

Subject: benign typedef redefinition

C++ allows a typedef redefinition with the same name as a previous
typedef to appear in the same scope, as long as it names the same type.

This is not a bad feature idea. Suppose we have this header:

#ifndef IDENTIFIER
#define IDENTIFIER

#define macro expansion
int function(void);
typedef int type;

#endif

If the #ifndef include guard is removed, then the only thing which prevents
this header from being included twice is the typedef.

If redundant typedefs are allowed, then headers which declare functions, define
typedefs and define macros will not require guards.
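
To make the claim concrete, here is a sketch of what the translation unit looks like after such a guardless header has been included twice. The macro `ANSWER` is a stand-in chosen for this illustration (the header above uses the placeholder `macro expansion`). Identical macro redefinitions and repeated compatible function declarations are already permitted, so the repeated typedef is the only line a C99 compiler has to reject.

```c
/* First inclusion of the hypothetical guardless header. */
#define ANSWER 42       /* object-like macro */
int function(void);     /* function declaration */
typedef int type;       /* typedef */

/* Second inclusion: the same three lines again. */
#define ANSWER 42       /* identical redefinition: already allowed */
int function(void);     /* compatible redeclaration: already allowed */
typedef int type;       /* needs the benign-redefinition rule */

/* A definition elsewhere in the program, using all three names. */
int function(void)
{
    type result = ANSWER;
    return result;
}
```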

The guards are ugly, so anything which eliminates some of them is a good thing.

The vast majority of header files in C programs declare functions, define types
and define macros. So this change would in fact eliminate the need for include
guards from vast reams of C code. One thumb up!

One rationale for the include guards is that they don't only prevent errors
from multiple definitions, but speed up compilation, since the definitions
don't have to be processed twice through all of the translation phases.
Moreover, smart compilers can recognize the include guard and not even pass the
header through /any/ translation phases.

However, a compiler can /still/ perform the same optimization, if it can prove
the idempotency of a header, which seems easy to do. So even without the
include guard, a redundant inclusion of a header can be skipped entirely.
Two thumbs up!

Some C compilers allow similar typedef redefinition as an extension,
though C99 does not allow it. Adding benign typedef redefinition to C1x
would enhance consistency with C++, standardize some existing practice,
and safely eliminate a constraint that is unhelpful and an occasional
nuisance to users.

Jim Thomas' concern with C++ compatibility is laughable when considered
together with the degree to which the goal of C compatibility is flouted
by the C++ people.

For instance, C++ does not allow:

int x; // tentative definition in C, definition in C++
int x; // tentative definition in C, error in C++

If an identifier has no linkage, there shall be no more than one
declaration of the identifier (in a declarator or type specifier) with
the same scope and in the same name space, except a typedef specifier
can be used to redefine the name of any type declared in that scope to
refer to the type to which it already refers, and except for tags as
specified in 6.7.2.3.

This is an unnecessarily strong fix, because it allows redefinitions of
typedefs in a block scope too, which isn't that useful, and is at odds
with other names that have no linkage which cannot be redefined.

Another way to fix this would be to introduce ``typedef linkage''.

- file scope typedefs could introduce names which have
``typedef linkage''.

- typedef linkage has ref/def rules similar to static linkage (confined
to a translation unit)

- if the storage class of a declaration is extern, and the name has been
declared already, then it inherits the prior linkage, including typedef
linkage:

// file scope
typedef int x; // x has typedef linkage
static int y; // y has internal linkage

extern int x; // allowed: same x, typedef linkage
extern int y; // allowed: same y, internal linkage

I.e. the relationship between extern and typedef is similar to
that between extern and static, reducing the number of special cases
in the language.

- A typedef declaration has no initializer and so if it appears at file
scope, it can be regarded to be a tentative definition. This is what
will allow multiple file scope typedefs for the same identifier:

typedef int x; // tentative type definition
typedef int x; // ditto

A block scope typedef is a definition, not tentative, and cannot be repeated.

- Each tentative definition must have a type which is compatible with the
previous type, though not necessarily identical. After each such definition,
the type name denotes the composite type of the previously known type,
and the type given in the new definition.

typedef int (*fptr)(); // tentative typedef, arguments not known
typedef int (*fptr)(void); // fptr now known to be no-argument function
typedef int (*fptr)(int); // error, incompatible

Thus the final type is whatever it is at the end of the translation unit,
at which point all tentative typedefs become definitions.
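
For comparison, the existing rules for objects already behave the way the bullets above want file-scope typedefs to behave. The sketch below uses only current C; the names `counter`, `handler`, `seven` and `call_handler` are illustrative. Repeated file-scope declarations without initializers are tentative definitions, and each compatible redeclaration refines the declared type to the composite type.

```c
int counter;               /* tentative definition */
int counter;               /* tentative again: allowed at file scope in C */

int (*handler)();          /* parameters unspecified (before C23) */
int (*handler)(void);      /* compatible; composite type is int (*)(void) */
/* int (*handler)(int); */ /* incompatible with the composite type: error */

static int seven(void)
{
    return 7;
}

int call_handler(void)
{
    handler = seven;
    return handler() + counter;   /* counter is zero-initialized */
}
```

The "typedef linkage" idea would extend this same pattern from object names to type names.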
 
Spiros Bousbouras

This is not a bad feature idea. Suppose we have this header:

#ifndef IDENTIFIER
#define IDENTIFIER

#define macro expansion
int function(void);
typedef int type;

#endif

If the #ifndef include guard is removed, then the only thing which prevents
this header from being included twice is the typedef.

If redundant typedefs are allowed, then headers which declare functions, define
typedefs and define macros will not require guards.

The guards are ugly, so anything which eliminates some of them is a good thing.

The vast majority of header files in C programs declare functions, define types
and define macros. So this change would in fact eliminate the need for include
guards from vast reams of C code. One thumb up!

If the objective is to eliminate multiple inclusions of the same
header then a "only once" pragma is a much cleaner solution.

One rationale for the include guards is that they don't only prevent errors
from multiple definitions, but speed up compilation, since the definitions
don't have to be processed twice through all of the translation phases.
Moreover, smart compilers can recognize the include guard and not even pass the
header through /any/ translation phases.

However, a compiler can /still/ perform the same optimization, if it can prove
the idempotency of a header, which seems easy to do. So even without the
include guard, a redundant inclusion of a header can be skipped entirely.
Two thumbs up!

I don't see why it is generally easy for a compiler to deduce
that if a header has been included once then it does not need to
be included again. What would the algorithm be ?
 
Keith Thompson

Spiros Bousbouras said:
If the objective is to eliminate multiple inclusions of the same
header then a "only once" pragma is a much cleaner solution.
[...]

The tricky part would be defining the semantics. Assuming that a
header is a file, consider systems where multiple files can have the
same name (if they're in different directories), and a single file can
have multiple names (links, symbolic links, mount points, etc.).
 
Kaz Kylheku

If the objective is to eliminate multiple inclusions of the same
header then a "only once" pragma is a much cleaner solution.

How can a solution with extraneous syntax be cleaner than a solution
with no extraneous syntax? :)

The cleanest solution is to allow harmless redefinitions and leave it
up to market forces to dictate whether implementations care about optimizing
such cases.

Anyway, if redefinition of typedef is allowed, then some programmers will take
advantage of that and choose not to employ the pragma once, even if it is
available.

The inability to redefine typedef should not force a programmer to use pragma
once.

The pragma once would still be useful in the remaining situations when a header
cannot or should not be included more than once.

I don't see why it is generally easy for a compiler to deduce
that if a header has been included once then it does not need to
be included again. What would the algorithm be?

The task of the algorithm, for determining that a header need not be included,
would be:

a) checking that the token sequence arising from processing the
header, after the expansion of macros, was produced without any dependencies
on macros which have since been redefined.

b) checking that no macros defined by the header have been undefined
since the last time it was included.

c) checking that no macros which are #undef'd in that header have been
defined since the last time it was included.

d) checking that the header doesn't /define/ any names with linkage.

For (a), it would be necessary to associate an object representing the header
in the compiler's memory with a list of the macros which are invoked by that
header's token sequence. Each such macro could be tagged with an integer
version which increments when it is redefined or undefined, and the list would
maintain the version that was sampled at the time the header was included. If
any such prerequisite macro has a new version, then this affects the
macro-expanded token sequence of the header, and it must be processed in
earnest.

Example:

#undef A
#define A() X Y Z
#include "foo.h"

#undef A
#define A() R S T
#include "foo.h"

If "foo.h" depends on the macro call A() anywhere, then both includes should be
processed in earnest. (Or, alternatively, a cached representation of the
raw token sequence of "foo.h" held in memory should be added to the
translation unit, and processed again from the macro-expanding translation
phase onward.)

For (b), a list of the macros defined by the header is needed, along with the
version information. Simply visit the macros and check that they still have the
same version. (c) can be combined with (b), since the undefined state of a
macro can just be a special case in the representation.

#include "foo.h" // foo.h provides A, #undef-s B.

...
A(); // dependency on A
B(); // function call or other syntax, B macro not defined
...

#undef A
#define B() X

#include "foo.h" // foo.h must reinstall A, #undef B.
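
The version-stamp bookkeeping for checks (a) to (c) could be sketched roughly as follows. This is not taken from any existing compiler: every type and function name here (`struct macro`, `struct macro_snapshot`, `struct header_record`, `macro_redefine`, `header_can_be_skipped`) is hypothetical, and check (d) is reduced to a single flag.

```c
#include <stddef.h>

/* One entry per macro name known to the preprocessor (hypothetical). */
struct macro {
    const char *name;
    unsigned version;   /* bumped on every #define or #undef */
    int defined;        /* 0 while the name is undefined */
};

/* A macro the header used, defined or undefined, plus the version
   that was sampled when the header was last processed. */
struct macro_snapshot {
    const struct macro *macro;
    unsigned version;
};

struct header_record {
    const struct macro_snapshot *deps;
    size_t ndeps;
    int defines_linkage_names;   /* check (d) */
};

/* Record a #define (defined = 1) or #undef (defined = 0). */
void macro_redefine(struct macro *m, int defined)
{
    m->defined = defined;
    m->version++;
}

/* Returns 1 if a repeated inclusion may be skipped, 0 if the header
   has to be processed in earnest. */
int header_can_be_skipped(const struct header_record *h)
{
    size_t i;

    if (h->defines_linkage_names)
        return 0;   /* err on the side of caution */
    for (i = 0; i < h->ndeps; i++)
        if (h->deps[i].macro->version != h->deps[i].version)
            return 0;   /* a prerequisite macro has changed */
    return 1;
}
```

Treating the undefined state as just another version, as suggested above, is what lets one loop cover (a), (b) and (c).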

Test (d) is necessary for emitting a diagnostic. If a redundant inclusion of a
header, if actually processed, would emit a diagnostic (whether
standard-required or not), the same diagnostic should be emitted even if the
header is not actually processed.

I might have missed some things, but if I were doing this work seriously, of
course I would dot my proverbial i's and cross t's.

Intuitively, I don't see any huge impediment against deducing that processing a
header can be culled. The algorithm doesn't have to be perfect; it just has to
be right sufficiently often to produce a worthwhile speedup in compiling, and
err on the side of caution (process the header unnecessarily) when it's wrong.
 
Kaz Kylheku

Spiros Bousbouras said:
If the objective is to eliminate multiple inclusions of the same
header then a "only once" pragma is a much cleaner solution.
[...]

The tricky part would be defining the semantics. Assuming that a
header is a file, consider systems where multiple files can have the
same name (if they're in different directories), and a single file can
have multiple names (links, symbolic links, mount points, etc.).

The simple answer is to use object equality for files rather than name
equality. A file is the same item if it is the same object.
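
On a POSIX system (an assumption; the C standard itself offers no such facility), object equality can be sketched by comparing the device and inode numbers reported by `stat`. The function name `same_file_object` is made up for this illustration.

```c
#include <sys/stat.h>

/* Returns 1 if both paths name the same file object, 0 if they name
   different objects, and -1 if either path cannot be examined. */
int same_file_object(const char *path_a, const char *path_b)
{
    struct stat a, b;

    if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

Because `stat` follows symbolic links, a link and its target compare equal here, which is the behavior a "once only" mechanism would want.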

One reasonable assumption is that features of the filesystem structure such as
links, symbolic links and mount points do not change during the compilation of
a program.

In practice, obtaining object equality for files may present problems to
implementors. What if the filesystem doesn't provide any kind of unique ID for
a file? Implementations may have to suffer along with some kind of
weaker equivalence based on names, such as using a canonicalized full path
name (path name to the object, with symlinks substituted for their targets)
or not even that.
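
The weaker, name-based fallback might look like this on a POSIX system, using `realpath` to canonicalize both names before comparing them. Again this is only a sketch, and `same_canonical_name` is a hypothetical helper.

```c
#define _XOPEN_SOURCE 700   /* asks the POSIX headers to declare realpath */
#include <stdlib.h>
#include <string.h>

/* Returns 1 if both paths canonicalize to the same absolute name,
   0 if they do not, and -1 if either path cannot be resolved. */
int same_canonical_name(const char *path_a, const char *path_b)
{
    char *a = realpath(path_a, NULL);   /* malloc'd result, or NULL */
    char *b = realpath(path_b, NULL);
    int result = -1;

    if (a != NULL && b != NULL)
        result = strcmp(a, b) == 0;
    free(a);
    free(b);
    return result;
}
```

Unlike the identity test, this cannot see that two hard links name one file, which is the kind of weakness described above.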

This could be left as implementation-defined behavior, and a quality of
implementation issue.

C programs have to be contrived in order for mechanisms like pragma once to
break on name equivalence.

A simple C program that just uses portable header names like "foobar.h"
will not break.

A program which assumes that #include "../foobar.h" in one place is the same thing
as #include "abc/foobar.h" is not highly portable; and it is contrived in such
a way as to break simple implementations of pragma once.

I think this is something that the compiler marketplace can sort out for
itself.
 
Mark Wooding

Kaz Kylheku said:
One rationale for the include guards is that they don't only prevent
errors from multiple definitions, but speed up compilation, since the
definitions don't have to be processed twice through all of the
translation phases. Moreover, smart compilers can recognize the
include guard and not even pass the header through /any/ translation
phases.

However, a compiler can /still/ perform the same optimization, if it
can prove the idempotency of a header, which seems easy to do. So even
without the include guard, a redundant inclusion of a header can be
skipped entirely. Two thumbs up!

This is easy to do in a compiler which has an integrated and interleaved
preprocessor, but it's harder with a separate preprocessor.

A preprocessor can easily determine that there's no non-whitespace
non-comment text outside of a standard header-file guard. Working out
whether the contents of a header file are idempotent, on the other hand,
means that the preprocessor needs to know quite a bit about the syntax
of C.
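
To illustrate how little a guard check needs to know about C, here is a deliberately simplified sketch. It is not how any particular preprocessor does it: it ignores comments, line continuations and `#else` branches, and the function names are invented for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static const char *skip_blanks(const char *p)
{
    while (*p == ' ' || *p == '\t')
        p++;
    return p;
}

static int is_ident_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* If the line starting at p is the directive `name`, return a pointer
   just past the directive name; otherwise return NULL. */
static const char *match_directive(const char *p, const char *name)
{
    size_t n = strlen(name);

    p = skip_blanks(p);
    if (*p != '#')
        return NULL;
    p = skip_blanks(p + 1);
    if (strncmp(p, name, n) != 0 || is_ident_char(p[n]))
        return NULL;
    return p + n;
}

/* Returns 1 if src has the shape #ifndef G / #define G / ... / #endif
   with only blank lines outside the guard, else 0. */
int is_guarded_header(const char *src)
{
    enum { WANT_IFNDEF, WANT_DEFINE, INSIDE, AFTER } state = WANT_IFNDEF;
    const char *guard = NULL;   /* start of the guard name inside src */
    size_t guard_len = 0;
    int depth = 0;

    while (*src != '\0') {
        const char *line = src;
        const char *eol = strchr(src, '\n');
        const char *rest;

        src = eol ? eol + 1 : src + strlen(src);
        rest = skip_blanks(line);
        if (*rest == '\n' || *rest == '\0')
            continue;   /* blank line */

        switch (state) {
        case WANT_IFNDEF:
            rest = match_directive(line, "ifndef");
            if (rest == NULL)
                return 0;
            guard = skip_blanks(rest);
            while (is_ident_char(guard[guard_len]))
                guard_len++;
            if (guard_len == 0)
                return 0;
            depth = 1;
            state = WANT_DEFINE;
            break;
        case WANT_DEFINE:
            rest = match_directive(line, "define");
            if (rest == NULL)
                return 0;
            rest = skip_blanks(rest);
            if (strncmp(rest, guard, guard_len) != 0 ||
                is_ident_char(rest[guard_len]))
                return 0;
            state = INSIDE;
            break;
        case INSIDE:
            if (match_directive(line, "endif")) {
                if (--depth == 0)
                    state = AFTER;
            } else if (match_directive(line, "if") ||
                       match_directive(line, "ifdef") ||
                       match_directive(line, "ifndef")) {
                depth++;
            }
            break;
        case AFTER:
            return 0;   /* text after the final #endif */
        }
    }
    return state == AFTER;
}
```

Nothing here parses declarations; it only counts conditional directives, which is why a standalone preprocessor can afford it.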

That said, I support the idea of allowing benign redefinitions of
typedefs, and I rather like the idea of guardless header files.

However, there is a fly in the ointment. Many header files -- even most
of the ones that I write -- define structures, and you're not allowed to
redefine them. Permitting this seems easy for tagged structures -- if
the redefinition has members of the same types, with the same names, and
in the same order (and with the same lengths, for bitfields) then the
redefinition is permitted.

Unfortunately,

typedef struct { int mumble; } foo;
typedef struct { int mumble; } foo;

defines the same type `foo' twice to be different structures.
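
A sketch of the contrast, with illustrative names (`foo_tag`, `get_mumble`): each untagged `struct { ... }` specifier introduces a new type, so the pair above is a genuine conflict, whereas typedefs that refer to a tagged structure defined once name the same type every time.

```c
/* Each untagged struct specifier declares a distinct type, so this pair
   conflicts even under the proposed rule:

       typedef struct { int mumble; } foo;
       typedef struct { int mumble; } foo;
*/

/* With a tag, the structure is defined once and the typedef can repeat. */
struct foo_tag { int mumble; };
typedef struct foo_tag foo;
typedef struct foo_tag foo;   /* same type: benign under the proposal */

int get_mumble(foo value)
{
    return value.mumble;
}
```

Note that the body of `struct foo_tag` still appears only once; repeating the structure definition itself remains the problem described above.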

(Of course, we can just encourage programmers to tag their structures.)

- Each tentative definition must have a type which is compatible with the
previous type, though not necessarily identical. After each such definition,
the type name denotes the composite type of the previously known type,
and the type given in the new definition.

typedef int (*fptr)(); // tentative typedef, arguments not known
typedef int (*fptr)(void); // fptr now known to be no-argument function

Those two don't look compatible to me. In particular, the first
mentions a function type which doesn't include a prototype, whereas the
second mentions a function type which does include a prototype.

And note that the compatible type exception you've introduced here
doesn't fix the problem with structures I mentioned above, since,
according to 6.2.7, structural equality is used to determine whether
structure types are compatible if they are `declared in separate
translation units'.

-- [mdw]
 
Keith Thompson

Kaz Kylheku said:
A simple C program that just uses portable header names like "foobar.h"
will not break.

A program which assumes that #include "../foobar.h" in one place is the
same thing as #include "abc/foobar.h" is not highly portable; and it
is contrived in such a way as to break simple implementations of
pragma once.
[...]

Consider a (non-contrived) program whose sources are scattered across
multiple directories, using conventional names for certain headers.
The same header might be referred to as "foobar.h" in one file, and as
"../another_dir/foobar.h" in another. To avoid breaking such a
program, the implementation would have to recognize that different
names refer to the same file -- which should be doable on most
systems, but perhaps non-trivial on some. If the implementation
handles it incorrectly, tracking down what the problem is is likely to
be, shall we say, an interesting experience for the programmer.

I think I'm less concerned with the difficulty of *implementing*
something like "#pragma once" than with the difficulty of defining its
semantics in the standard.
 
Hallvard B Furuseth

Han said:
The proposal is a good one, but what I can't understand is why all
these good proposals have to be described in terms of consistency
with other languages.

Adding to Francis' answer: Inventions may have unintended side effects
that are not clear from looking at the spec. So most standards do their
best to stick to features that have been implemented, and which people
have gained experience with.

A working implementation is commonly a minimum requirement. Another
standard (in C's case a language standard) with the feature is another.
Effects in the different language may differ, but the language likely
has a wider user base and thus provides wider experience than just a
compiler or two.
 
Hallvard B Furuseth

I said:
(...) So most standards do their best to stick to features that have
been implemented, and which people have gained experience with.

A working implementation is commonly a minimum requirement. Another
standard (in C's case a language standard) with the feature is another.

Er, the latter is another argument in favor, not another minimum
requirement. Otherwise nobody would ever get anywhere :)
 
lawrence.jones

In comp.std.c Han from China said:
The proposal is a good one, but what I can't understand is why all
these good proposals have to be described in terms of consistency
with other languages.

They don't *have* to be described that way, but both the C and C++
committees have a stated goal of keeping the common subset language as
large as practicable, so it's a strong argument for the proposal.
 
Hallvard B Furuseth

Francis said:
Well, we are supposed to standardise existing practice :)

Yes, and I don't think you came up with that rule by throwing dice.

Anyway I thought the OP was talking about more than the C language,
but I may have parsed his question wrong.
 
JosephKK

Before looking any further, my initial stance is to allow the new name
with a compiler warning. I consider it entirely possible that
different names of the same structural type may not be conformable or
equivalent.
 
Ben Bacarisse

JosephKK said:
Before looking any further, my initial stance is to allow the new name
with a compiler warning. I consider it entirely possible that
different names of the same structural type may not be conformable or
equivalent.

I don't think there is a "new name" -- the proposal is about duplicate
type definitions. Giving the same type (struct or otherwise) more
than one name is commonplace and should, in my view, go unremarked
upon by the compiler.

On a technical note: the standard can't require a warning (I am not
saying that you were suggesting that). All the standard can do is
note something as a "constraint". A program that violates one or more
constraints must be diagnosed and then all bets are off about what
might happen. The standard can also note something as being undefined
with very similar consequences except that no diagnostic is required.
 
David R Tribble

jacob said:
Subject: benign typedef redefinition

C++ allows a typedef redefinition with the same name as a previous
typedef to appear in the same scope, as long as it names the same type.
Some C compilers allow similar typedef redefinition as an extension,
though C99 does not allow it. Adding benign typedef redefinition to C1x
would enhance consistency with C++, standardize some existing practice,
and safely eliminate a constraint that is unhelpful and an occasional
nuisance to users.

Just to remind everyone that this has been discussed before.
See:
http://tinyurl.com/cnfb3a
http://groups.google.com/group/comp.std.c/browse_frm/thread/c5844d722331004a?scoring=d

-drt
 
David R Tribble

David said:

In this old thread (from 1999), there was much discussion
about typedefs of VLAs, which could continue to be a problem
at this point. At any rate, it warrants further discussion.

Quoting from one of the posts (1999-10-05):

| Clive D.W. Feather wrote:
| >> int n;
| >> ....
| >> typedef int vector [n++];
| >> typedef int vector [n++];
| >
|
| Tore Lund wrote:
| >> Sorry for being dense, but it is this valid C? And how would such a
| >> typedef be used in a program? Surely I must be missing something.
| >
|
| James Kuyper Jr. wrote:
| > It will be valid in C99, which adds the concept of a Variable Length
| > Arrays (VLAs).
| > Based upon n868, the 1999-01-18 draft of the C99 standard:
| > VLA's can't be declared static, nor 'extern'. sizeof() becomes a
| > run-time expression which evaluates its operand, if that operand is a
| > VLA. The typedef given above can only be declared and used within the
| > scope of 'n'. The 'n++' expression gets re-evaluated each time a
| > statement 'n++;' in place of the declaration would be executed. It does
| > NOT get re-evaluated each time the typedef is used. It's unspecified
| > whether side-effects of evaluating the expression actually occur - 'n'
| > might or might not increase in value. As a result, I can't think of any
| > good reason for putting expressions like this that have side-effects in
| > a VLA declaration.
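
A small sketch of the C99 behavior described in the quoted post, using a hypothetical helper `vector_bytes`: the size expression of a VLA typedef is evaluated where the typedef is declared, not where it is used, so later changes to `n` do not alter the type. It assumes a compiler that supports VLAs, and it avoids side effects in the size expression.

```c
#include <stddef.h>

size_t vector_bytes(int n)
{
    typedef int vector[n];   /* size expression evaluated here */

    n += 10;                 /* does not change the type named by vector */
    return sizeof(vector);   /* run-time sizeof: the original n elements */
}
```

Two such typedefs written with `n++` as the size could name arrays of different lengths, which is why variably modified types sit awkwardly with a "same type" redefinition rule.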

-drt
 
David Thompson

Those two don't look compatible to me. In particular, the first
mentions a function type which doesn't include a prototype, whereas the
second mentions a function type which does include a prototype.

As Kaz said, they aren't identical, but they are compatible, in this
case due to 6.7.5.3p15, previously 6.5.4.3 (a niftier number!),
plus 6.7.5.1p2, previously 6.5.4.1, for the pointer.

And note that the compatible type exception you've introduced here
doesn't fix the problem with structures I mentioned above, since,

(namely duplicated typedef for untagged. Also unions and enums.)

according to 6.2.7, structural equality is used to determine whether
structure types are compatible if they are `declared in separate
translation units'.

Yes, structural equality isn't currently available within a t.u. --
necessarily across scopes, e.g. local to two different functions.
Whether it would be worth changing the rule, or making an exception,
for that case of this new feature, would be an issue. I'm very
skeptical any implementation actually needs this freedom.
 
