And that may be the central point of disagreement between me and you.
Agreed, types (typically) have no existence in object modules. I don't
think that implies that they're purely nomenclature. After all, objects
don't (necessarily) exist in object modules either.
Given:
int arr[10];
int *ptr = arr+3;
arr[3], *(arr+3), and *ptr are exactly the same object, referred to
by different names. Given:
typedef int word;
int and word are exactly the same type, referred to by different names.
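To make that concrete, here's a small runnable sketch of both halves of that
analogy (just an illustration; the names are arbitrary):

#include <stdio.h>

typedef int word;                /* "word" and "int" name the same type */

int main(void) {
    int arr[10] = {0};
    int *ptr = arr + 3;

    *ptr = 42;                   /* store through one name of the object... */
    printf("%d\n", arr[3]);      /* ...read it back through another: prints 42 */

    word w = arr[3];             /* no conversion happens here... */
    int *wp = &w;                /* ...and no cast is needed here, because
                                    word and int are one and the same type */
    printf("%d\n", *wp);         /* 42 again */
    return 0;
}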
True. But consider:
#include <limits.h>

#if INT_MAX > 65536
typedef int word;
#else
typedef long word;
#endif
Now... After this, I think it's quite reasonable for someone to say that
"word" is not the same type as int or long, but rather, is the same type
as one of them on any given system but possibly not on other systems. At
this point, "word" has acquired a meaning which is definitely different
from the definitions of either int or long. And if you're talking about
the program as a C program, rather than as a program built with a specific
implementation (including compiler options), it is incorrect to claim that
it is int, and also incorrect to claim that it is long.
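A sketch of the consequence, as I understand it: the last line below is
accepted on an implementation where word ends up being long, and is a
constraint violation (diagnostic required) on one where it ends up being int,
so portable code can't assert either identity.

#include <limits.h>

#if INT_MAX > 65536
typedef int word;
#else
typedef long word;
#endif

word w = 0;
long *lp = &w;   /* fine where word is long; constraint violation where
                    word is int, since int * and long * are not compatible */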
It gets weirder. So far as I can tell, given:
foo.c:
typedef struct { int a; } word;
extern word x;
bar.c:
typedef struct { int a; } word;
word x = { 1 };
The two declarations of x are of DIFFERENT types. Which happen to be
compatible.
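Spelled out as a complete pair of files (the main is just something added
here to make it linkable), this builds and prints 1 precisely because the
two distinct types are compatible:

/* foo.c */
#include <stdio.h>
typedef struct { int a; } word;
extern word x;

int main(void) {
    printf("%d\n", x.a);   /* both declarations designate the same object */
    return 0;
}

/* bar.c */
typedef struct { int a; } word;
word x = { 1 };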
There is a coherent concept of "type". (I don't think the C Standard
has a single definition of the term, but one of the normative
references probably does.) I assert that the name is not part of the
type, and that two different names can refer to the same type, just
as (to be a bit more precise than I was in the previous paragraph)
two different lvalues can designate the same object. A "type" is
an abstract concept, but it exists independently of its name.
In trivial cases, yes. In more complex cases, though, you have a name
which maps onto some "type" in that sense, but you don't know which one,
and you also don't need to, because you can just treat it as being its
own type.
I think the model of viewing the alias as "merely" an alias makes it harder
to work effectively with such a type.
As for many other concepts, we often gloss over the distinction between
a name and the entity that the name refers to. But the distinction is
still there.
Agreed.
But when you are looking at logical types, like size_t, it turns out that
you get a much better model of their behavior by glossing over the
distinction, I think.
I agree. It's not the terminology I would have chosen, but it requires
a more strained reading to conclude that a typedef doesn't "define a
type" than to say that it does. Which, as I've said, leaves me with the
conclusion that "defining a type" does not *create* a type.
Hmm.
Okay, let's try an experiment.
Let us label terms:
type1: The underlying logical mapping of storage space to interpretation,
plus any magic flags that allow you to distinguish, e.g., between char and
signed/unsigned char, things like that.
type2: The mapping from a name to a type1.
type3: The name that is mapped to a type1 by a type2.
With these, we can now say:
* typedef defines type3s, and creates type2s.
* typedef does not create a type1.
* typedef does not define a type1.
* "size_t" is a type3.
* <stddef.h> defines a type2 named "size_t".
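For instance (with a made-up alias name), the reason to say typedef creates
a type2 but no new type1 is that nothing ever needs a conversion or a cast
between the alias and the type it names:

typedef unsigned long ticks;   /* new type3 ("ticks"), new type2 (the mapping),
                                  no new type1 */

int main(void) {
    unsigned long raw = 5;
    ticks *tp = &raw;          /* no cast, no diagnostic: ticks and
                                  unsigned long are the same type1 */
    return (*tp == 5UL) ? 0 : 1;
}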
I think that as soon as we have the distinction between the different
senses in which people talk about types, it's easy to agree on everything.
I'm still wrestling with what the standard really means by "define"; I
suspect that in at least some cases the standard is just a bit sloppily
worded.
I think so.
A puzzle for you: Can you propose a strictly conforming program which can
determine which standard unsigned integer type, if any, size_t is?
If not, I think that explains why the standard is loose with the terminology
here; *for the purposes of strictly conforming programs*, size_t is its own
type, and you can never have a strictly conforming program which assumes that
it is the same as any other type. You can *know* that it must be the same
as some other type, but since you don't know which one, you can't write a
reasonably sane C program which uses that information in any way.
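The closest I can come up with is a sketch like the one below, and it rather
gives the game away: the moment its output depends on the answer, it's no
longer strictly conforming, and even then it can only establish "same size
and range", never "same type".

#include <stddef.h>
#include <stdio.h>

int main(void) {
    if (sizeof(size_t) == sizeof(unsigned long)
        && (size_t)-1 == (unsigned long)-1)
        printf("size_t has the same size and range as unsigned long\n");
    else
        printf("size_t does not match unsigned long here\n");
    return 0;
}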
Knowing that typedef "does not create a new type" is perhaps important when
you're thinking about typedefs *you* create, because it's why you know
that you can do:
typedef int banana;
banana main(banana argc, char **argv) {
    return (banana) 0;
}
(Okay, maybe that SPECIFIC example isn't important, but the principle is.)
But when it comes to typedefs provided by the language, so far as I can tell,
it's *not* important to know that they don't create new types, because there's
no way for portable code to rely on or make use of that information.
But yet again, char and signed char are distinct in a very real
way that size_t and unsigned long are not (replace signed char and
unsigned long by the appropriate types for a given implementation).
I don't see how your type model recognizes this distinction.
I think it's that parenthetical that matters. Since my model of C is,
for the most part, abstracted away from any specific implementation, even
though I know that size_t is almost certainly the same as a standard
unsigned integer type, there is no code I can ever reasonably write which
is affected by this information.
So the weakness of my model is that it doesn't help me much when I want to
write code that is specific to a given implementation and that I deliberately
plan to keep people from porting.
And no,
it's not just a matter of either the language or any compiler
not being clever enough to tell things apart; it's about whether
they're defined *by the language* to be the same thing. It's not
a distinction that should necessarily affect how you write code,
but it's a very real distinction in terms of how the language is
defined, and how the standard uses the word "type".
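A sketch of that distinction in code, assuming (for the second half) an
implementation where size_t happens to be unsigned long: the first
initialization requires a diagnostic no matter how char is signed, while the
second is accepted there because no distinct type was ever created.

#include <stddef.h>

signed char sc;
char *cp = &sc;            /* constraint violation: char and signed char are
                              distinct types even where char is signed */

size_t n;
unsigned long *ulp = &n;   /* accepted without a diagnostic on an implementation
                              where size_t is unsigned long, because there the
                              two names refer to one type */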
Lemme give you another example. I firmly believe that, intentionally
or otherwise, C89 invoked undefined behavior if you accessed an uninitialized
object of type 'unsigned char', because all access to uninitialized
objects was undefined behavior. (I'm also pretty sure this was not
intentional.)
In C99, that's gone, because the undefined behavior from accessing
uninitialized objects is now handled through trap representations (and
let me say, I think that it's a beautiful solution). And unsigned char
doesn't *have* any trap representations.
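(That exemption is, as far as I can tell, exactly what makes byte-wise
inspection of arbitrary objects well-defined; a quick sketch:)

#include <stdio.h>

int main(void) {
    double d = 1.0;
    unsigned char *p = (unsigned char *) &d;
    size_t i;

    /* Reading any object's bytes as unsigned char is well-defined:
       no bit pattern is a trap representation for unsigned char. */
    for (i = 0; i < sizeof d; i++)
        printf("%02x ", (unsigned) p[i]);
    putchar('\n');
    return 0;
}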
That said:
#include <stdio.h>
int main(void) {
    unsigned char u;
    printf("%d\n", (int) u);
    return 0;
}
This code still looks wrong to me. I can show you chapter and verse to prove
that this code does NOT invoke undefined behavior, only unspecified
behavior. Assuming that INT_MAX is greater than UCHAR_MAX, I can even
tell you that it's guaranteed to print a number from 0 to UCHAR_MAX
(inclusive).
But my primary working model of C ignores this in favor of saying "that
accesses an uninitialized value, and it's wrong."
Which is to say... My primary working model of C is noticeably more
conservative than the language spec. I avoid writing some lines of code
which are technically not constraint violations, *as if* they were in
fact constraint violations, because it produces better code, and because
I consider it a mere artifact of circumstance or necessity that the
compiler won't catch and flag those things.
Hmm. I suppose the idea of two types being the same (or of two type
names referring to the same type) is limited in scope to a single
translation unit.
I think it's slightly fancier. I think it's a single translation unit *or*
the built-in types. I am pretty sure that "int" is the same type in every
translation unit. And so are all the aliases of the built-in types, I think.
I think structures, unions, enums, and functions (and function pointers)
have unique types in each translation unit, but that arrays and basic types
don't. I am not, however, totally sure of this.
-s