If you could change the C or C++ or Java syntax, what would you like different?

Keith Thompson

Ben Bacarisse said:
The thing is I don't think that it's an error in K&R C. There was no
concept of "undefined" and in most cases people seemed to assume that,
provided they have the right picture of what the compiler is doing,
their code will work.

Right.

On the other hand, I can easily imagine the above code working as
expected with one K&R or pre-K&R compiler, and failing in some
arbitrarily bad way on another. For example, there certainly could have
been compilers on which long is not exactly twice as wide as short,
and/or where the order in which the parameters are stored does not match
the high-order vs. low-order halves of a long object.
 
Seebs

But it's not a new type, and it wasn't created by the typedef.

I am not convinced. See next example.
An existing type changed from not being denoted by the name "foo" to
being denoted by the name "foo".

What existing type?

typedef struct { int x; } foo;

What is the "existing type"? What do you mean by "existing", anyway? In
what way did that type have existence prior to this line of code executing?

Can you suggest a way in which I could then declare something else, without
using "typedef foo bar" or the equivalent, such that (somethingelse *) and
(foo *) are considered compatible types?
I'd say that you're mangling the meaning of the word "two".
A name is not the thing it refers to.

No, but when we're talking about language, the name is more interesting.
"size_t" is a type name, and "unsigned int" is a type name. They are
two different names for the same type. There are not two types here,
just one.

The thing you put in front of x to declare x is the name of a type, but it's
just as useful to say that it's the type of x.
I disagree. I don't think a type-name is part of the type; it's just a
name that refers to a type.

I think the existence of pairs of types with identical storage and
representation, but which are considered distinct, suggests that "typeness"
has to do with something other than representation or storage.

For historical reasons, C in some cases allows you to have two types
which happen to be considered interchangeable, but I think they're still
two types.
Yes, "ln" created a link (which refers to a file), just as "typedef"
creates a type name (which refers to a type).
Okay.
So after those two commands, do you have two files? I don't think so.
If you had two files, for example, you could append data to one without
affecting the other.

Well, then. Tell me which one of them is the file, and which one isn't a
file.

Or are they both not-a-file? If neither of them is a file, then we're going
to confuse a lot of people when we say that the thing where "touch a" creates
a directory entry isn't creating a file. Now, we could argue that really,
"touch a" is doing two things; first, it creates a file, and second, it
creates a link to that file. But that's not what anyone actually says when
discussing it; they say it creates a file.

So we do something, which creates a thing which is definitely a file. Then
we do another thing, which creates a thing which is precisely equivalent
to the first thing. There is no way in which the first thing is different
from the second; you can delete either one and the other continues to be
exactly the same as it was before. So if one of them is a file, so's the
other one...

Anyway, I think the whole thing is killed by the heading for 6.7. If
typedef is introduced under the heading "Type Definitions", then the standard
has clearly committed to the notion that typedef is defining types.

-s
 
Rui Maciel

Jon said:
Sounds reasonable: "A typedef defines a type specification (aka,
"definition") but does not define a type". Or said tersely (and for
maximum confusion as C aficionados seem to like), "A typedef defines a
type but does not define a type".

You are wrong. You are probably confusing identifiers with types. If a typedef identifier were a
new type, then typing restrictions would be applied to that new identifier. As a typedef
identifier is not a new type, you can use any object defined through a typedef identifier as if it
were defined explicitly through the original type.
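
A minimal sketch of that point, assuming a C11 compiler (for `_Generic`) and a hypothetical typedef name `word` chosen only for illustration. It is not from the original post; it just shows a typedef'd object being used as if it had been declared with the original type.

```c
#include <assert.h>

/* 'word' is a hypothetical typedef name used only for illustration. */
typedef int word;

/* Accepts a pointer to the original type. */
int read_int(const int *p) { return *p; }

/* Returns 1 if 'word' and 'int' are the same type, 0 otherwise.
   _Generic (C11) selects a branch by the type of its first operand. */
int word_is_int(void)
{
    return _Generic((word)0, int: 1, default: 0);
}

/* A 'word' object is passed where an 'int' is expected, with no cast. */
int typedef_demo(void)
{
    word w = 42;
    int *p = &w;              /* &w already has type int * */
    assert(word_is_int());
    return read_int(p);
}
```

Calling `typedef_demo()` returns the stored value; no typing restriction separates `word` from `int`.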

A new language is necessary just to
<snip/>

Nonsense. For this issue alone, the only thing that is needed is that people get a grasp on the
language and learn the basics. If someone doesn't know this particular language then they will not
be in a better position if they start to invest their time in learning (or learning partially) some
other language, let alone a new one.


Rui Maciel
 
Jon

(Aside: I've been reading the standards documents so I am armed
(reloaded?) and dangerous). ;)
I am not convinced. See next example.


What existing type?

typedef struct { int x; } foo;

The above is a type name (re)definition of an anonymous(?) struct.
What is the "existing type"? What do you mean by "existing", anyway?
In what way did that type have existence prior to this line of code
executing?

Can you suggest a way in which I could then declare something else,
without using "typedef foo bar" or the equivalent, such that
(somethingelse *) and (foo *) are considered compatible types?




No, but when we're talking about language, the name is more
interesting.


The thing you put in front of x to declare x is the name of a type,
but it's just as useful to say that it's the type of x.

The things in front of an identifier are declaration specifiers. One kind
of declaration specifier is a type specifier, the set of which includes,
e.g., 'int' and also typedef-name. Another kind of specifier is a
storage-class specifier, the set of which includes, e.g., 'extern' and
also 'typedef'.*

extern int x; // type of ident 'x' is 'extern int'
typedef int foo; // type of ident 'foo' is 'typedef int'
foo x; // type of 'x' is int

According to the rules specified in the standard, the type of ident 'x'
in the last example above should be 'foo', but according to examples in
the standard, the type of ident 'x' is 'int', which means that typedef
declarations are treated differently from other declarations as far as I grok.

*"The typedef specifier is called a "storage-class specifier" for
syntactic convenience only".
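
For what it is worth, a C11 compiler can be asked what type an identifier has. The sketch below assumes `_Generic` is available; `shared`, `IS_INT` and `specifier_demo` are hypothetical names. It suggests that storage-class specifiers such as `extern` are not part of the type that the compiler sees.

```c
#include <assert.h>

extern int shared;   /* declaration: external linkage, no storage reserved */
int shared = 3;      /* definition of the same object */

typedef int foo;     /* 'typedef' sits in the storage-class position */
/* static typedef int bar;  -- rejected: at most one storage-class specifier */

/* Evaluates to 1 when the expression has type int. */
#define IS_INT(e) _Generic((e), int: 1, default: 0)

int specifier_demo(void)
{
    foo x = 1;
    /* 'extern' and 'typedef' affect linkage and what the identifier names,
       not the type that _Generic observes. */
    assert(IS_INT(shared));
    assert(IS_INT(x));
    return shared + x;
}
```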

Types are descriptions of objects (e.g., data objects) or functions.
Types are a subdivision of a particular thing, e.g., "integer types". It
is probably proper to think in terms of "type specification" but not
proper to think in terms of "type name". It is probably proper to
synonymously use: "type", "type specification", "type description". It is
a stretch to say: "'int' is the name of a type", because 'int' *is* a
type. Things that can have names are: objects, functions, tags, members,
typedefs, labels, macros, macro parameters. Declarations introduce or
redeclare names into the translation unit. Declarations do not introduce
types.
I think the existence of pairs of types with identical storage and
representation, but which are considered distinct, suggests that
"typeness" has to do with something other than representation or
storage.

Yes, it has to do only with the description, which just so happens to
*be* the type.
For historical reasons, C in some cases allows you to have two types
which happen to be considered interchangeable, but I think they're
still two types.

If the description is different ("synonymous" means "the same"), then
they are different.
Anyway, I think the whole thing is killed by the heading for 6.7. If
typedef is introduced under the heading "Type Definitions", then the
standard has clearly committed to the notion that typedef is defining
types.

Actually, no. I thought that too at first but see it differently now
unless I'm getting into the proverbial "analysis paralysis".
 
Jon

Rui said:
<snip/>

Nonsense. For this issue alone, the only thing that is needed is
that people get a grasp on the language and learn the basics.

Many of us have for decades been fixing up the shortcomings in a number
of ways (not to mention idioms, processes, libraries, and yes, macros) and
are taking the next obvious step. So, soon new programmers won't have to
learn archaic languages (unless they want to be historians?).
If someone doesn't know this particular language then they will not be
in a better position if they start to invest their time in learning
(or learning partially) some other language, let alone a new one.

Read what you wrote just above and you tell me what's silly about what
you said. (Hint: types, what are they?).
 
Jon

Seebs said:
It is entirely possible. I would be a little surprised, given that I
was one of the people who worked on the language in question in the C
standard, but you never know, sometimes we miss stuff.


I am quoting the standard. The standard calls typedef statements
"Type Definitions". That's the heading. That's what they're listed
as in the table of contents.

typedef declarations introduce typedef-names, not "type definitions". A
constituent in a declaration containing 'typedef' is a type. A "type" is
a description. A typedef declaration introduces a synonymous (with the
type) name for a type. Ident 'foo' in the example below is a
typedef-name, not a type-name, but the declaration does name a type.

typedef int foo; // ident 'foo' has type 'int'
 
Jon

Jon said:
typedef declarations introduce typedef-names,

Hyphen is incorrect.
not "type definitions".
A constituent in a declaration containing 'typedef' is a type. A
"type" is a description. A typedef declaration introduces a
synonymous (with the type) name for a type. Ident 'foo' in the
example below is a typedef-name,

Hyphen is incorrect.
not a type-name, but the declaration
does name a type.

does give a (synonymous) name for a type.
typedef int foo; // ident 'foo' has type 'int'

Ok, that's enough... bedtime for bonzo now.
 
BartC

Given the debate, I suggest the term be deprecated and replaced in the
standard. :) Someone, I don't remember who, suggested 'typealias'. Sounds
like a big win all around for so small a change. Existing code will not be
affected. Of course that would mean the committee and language designers
would have to "suck it up" a bit.

So if you were granted one wish to change something in the C language, you
would replace 'typedef' with something else?
 
BartC

<Snip my explanation to Keith Thompson of what exactly typedef does. Just in
case he didn't know...>

(What a crap newsreader Windows Live Mail is: when it fails to send and you
delete the post, it will secretly try again with a hidden copy, twice.
Which meant I couldn't edit down a re-post later on.)
 
Keith Thompson

Tim Rentsch said:
It turns out the standard does define the word "definition" (I should
have checked that earlier). C99 6.7p5:

A declaration specifies the interpretation and attributes of a set
of identifiers. A definition of an identifier is a declaration for
that identifier that:
-- for an object, causes storage to be reserved for that object;
-- for a function, includes the function body;
-- for an enumeration constant or typedef name, is the (only)
declaration of the identifier.
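
A small sketch of the three cases in that paragraph may help. All names here (`counter`, `next_id`, `color`, `ticket`, `definitions_demo`) are hypothetical and not from the standard's text.

```c
#include <assert.h>

extern int counter;        /* declaration only: no storage reserved */
int counter = 0;           /* definition: causes storage to be reserved */

int next_id(void);         /* declaration only: no function body */
int next_id(void)          /* definition: includes the function body */
{
    return ++counter;
}

enum color { RED, GREEN }; /* the (only) declaration of RED and GREEN */
typedef unsigned long ticket; /* the (only) declaration of 'ticket' */

int definitions_demo(void)
{
    counter = 0;
    ticket first = (ticket)next_id();
    ticket second = (ticket)next_id();
    assert(first == 1 && second == 2);
    return (int)second + GREEN;   /* 2 + 1 */
}
```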

Which isn't quite as coherent as I'd like. I'd rather be able
to say that a definition is a declaration that creates the entity
it declares. [snip elaboration]

The word 'definition', as the Standard uses the term, assigns a
meaning to an identifier. It is the identifier that is being
defined; the "content" or "value" (ie, the /definiens/) given to
the identifier provides the "meaning" of the identifier. Notice
the phrasing in the passage quoted above: "A definition /of an
identifier/ ..." (emphasis added).

This usage is perfectly consistent with how 'definition' is used
in ordinary English. See for example any of the regular online
dictionaries, or this entry

http://en.wikipedia.org/wiki/Definition

on wikipedia.

Hmm, interesting point, I'll have to give that further thought.

But then why is
typedef int word;
considered to be a definition, but
struct foo { int x; };
isn't? Surely they "define" (in the sense you state) the identifiers
"word" and "foo", respectively.

I'm also not entirely sure about applying the English meaning of
"definition" (presenting the meaning of a word that already exists,
as in a dictionary) to C. An English dictionary documents existing
words and their meanings; dictionaries rarely introduce new words
that did not previously exist. Even neologisms that did not appear
in previous editions are not generally invented by the editors of
the dictionary itself.

In C, on the other hand, we write "declarations" and "definitions"
that create new things that never existed before all the time.

Stepping back for a moment, it seems to me that in C we have two
different ways of introducing a new identifier. One is to apply a
new identifier to an entity that already existed. Another is to
introduce a new identifier as the name of an entity that we are
creating, i.e., that didn't previously exist. [*]

I'd like to have clear terms that distinguish between these two things.
My thought (though obviously the standard doesn't agree with me, and I
am therefore definitively wrong) is that "definition" would be a good
term for something that creates an entity.

Is there a simple term for a declaration that causes the declared
entity (not just an identifier that refers to it) to be created?
Do we even need such a term?

[*] For that matter, we can apply an existing name to a newly created
entity, such as a function definition that follows a declaration
of the same function.
 
Keith Thompson

Seebs said:
I am not convinced. See next example.


What existing type?

typedef struct { int x; } foo;

What is the "existing type"? What do you mean by "existing", anyway? In
what way did that type have existence prior to this line of code executing?

The existing type is struct { int x; }. No, it didn't exist prior
to this line of code. But the way I think of it is that the type
struct { int x; } comes into existence when we reach the closing
brace, and the name "foo" for that type doesn't come into existence
until the end of the declaration.

Each of the following lines of code declares two distinct things:

int x = 0, y = x + 1;
/* declares x and y (note that x exists and can be accessed
prior to the end of the declaration) */

struct { int x; } obj;
/* declares a struct type and an object of that type */

typedef struct { int x; } foo;
/* declares a struct type and a typedef for that struct type */
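
The three declarations above can be dropped into a function to check that each one really introduces two usable things. This is only a sketch; the function name `two_things_demo` and the initial values are made up for the example.

```c
#include <assert.h>

int two_things_demo(void)
{
    /* x is declared and initialized before y's initializer is evaluated. */
    int x = 0, y = x + 1;

    /* An unnamed struct type and an object of that type. */
    struct { int x; } obj = { 10 };

    /* An unnamed struct type and a typedef name for it. */
    typedef struct { int x; } foo;
    foo f = { 20 };

    assert(x == 0 && y == 1);
    return y + obj.x + f.x;   /* 1 + 10 + 20 */
}
```
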
Can you suggest a way in which I could then declare something else, without
using "typedef foo bar" or the equivalent, such that (somethingelse *) and
(foo *) are considered compatible types?

Not off the top of my head. Can you clarify the relevance of the
question?
No, but when we're talking about language, the name is more interesting.

It's not a matter of which is more interesting, it's about the fact that
they're distinct.
The thing you put in front of x to declare x is the name of a type, but it's
just as useful to say that it's the type of x.

I consider that misleading, unless it's in an informal context where
it's clear that we're taking verbal shortcuts.
I think the existence of pairs of types with identical storage and
representation, but which are considered distinct, suggests that "typeness"
has to do with something other than representation or storage.

For historical reasons, C in some cases allows you to have two types
which happen to be considered interchangeable, but I think they're still
two types.

Certainly.

char and signed char are two types. int and long are two types.
struct { int x; } and foo are one type.
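
One way to probe this, assuming a C11 compiler: a `_Generic` association list may not name two compatible types, so the macros below only compile because `char`, `signed char` and `unsigned char` (and likewise `int` and `long`) are distinct. The names `bar`, `CHAR_KIND`, `INT_KIND` and `distinct_types_demo` are hypothetical.

```c
#include <assert.h>

typedef struct { int x; } foo;
typedef foo bar;              /* hypothetical second name for the same type */

/* These lists would be rejected if any two entries were the same type. */
#define CHAR_KIND(e) \
    _Generic((e), char: 0, signed char: 1, unsigned char: 2, default: -1)
#define INT_KIND(e) \
    _Generic((e), int: 0, long: 1, default: -1)

int distinct_types_demo(void)
{
    char c = 'a';
    signed char sc = 'a';

    assert(CHAR_KIND(c) == 0);    /* char is its own type... */
    assert(CHAR_KIND(sc) == 1);   /* ...distinct from signed char */
    assert(INT_KIND(0) == 0);
    assert(INT_KIND(0L) == 1);    /* long is distinct from int */

    /* foo and bar name one type, so bar * selects the foo * branch. */
    return _Generic((bar *)0, foo *: 1, default: 0);
}
```
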
Well, then. Tell me which one of them is the file, and which one isn't a
file.

Or are they both not-a-file? If neither of them is a file, then we're going
to confuse a lot of people when we say that the thing where "touch a" creates
a directory entry isn't creating a file. Now, we could argue that really,
"touch a" is doing two things; first, it creates a file, and second, it
creates a link to that file. But that's not what anyone actually says when
discussing it; they say it creates a file.

Certainly they say that, because describing what *really* happens would
take too long. And in most cases, saying that "touch a" creates a file
is sufficiently clear.

But it's precisely when we're discussing the distinctions among files,
directory entries, inodes, sets of disk blocks, and so forth that it's
*not* enough to say that "touch a" creates a file.
So we do something, which creates a thing which is definitely a file. Then
we do another thing, which creates a thing which is precisely equivalent
to the first thing. There is no way in which the first thing is different
from the second; you can delete either one and the other continues to be
exactly the same as it was before. So if one of them is a file, so's the
other one...

If one of them were a file, the other one would be. And if we don't
happen to have multiple hard links to the same file, it's perfectly
reasonable to talk about touch and rm acting on files. But why
would you gloss over the distinction between files and links *while
discussing the distinction between files and links*?
Anyway, I think the whole thing is killed by the heading for 6.7. If
typedef is introduced under the heading "Type Definitions", then the standard
has clearly committed to the notion that typedef is defining types.

Agreed. And it's not the wording I would have chosen, but I accept it.

But that means that "defining" a type doesn't necessarily *create* a
type. "typedef struct { int x; } foo;", whatever it may or may not
define, *creates* one type, not two.
 
Keith Thompson

Keith Thompson said:
Right.

On the other hand, I can easily imagine the above code working as
expected with one K&R or pre-K&R compiler, and failing in some
arbitrarily bad way on another. For example, there certainly could have
been compilers on which long is not exactly twice as wide as short,
and/or where the order in which the parameters are stored does not match
the high-order vs. low-order halves of a long object.

In other words, K&R and pre-K&R C didn't explicitly have the concept of
"undefined behavior", but it had plenty of cases where the behavior was
not defined.
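
The code under discussion is not shown in this part of the thread, so the following is only a sketch of the two assumptions mentioned above: that long is exactly twice as wide as short, and that byte order matches the expected high/low halves. The function names are hypothetical, and the byte-order probe assumes no padding bits in `unsigned long`.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if long is exactly twice as wide as short here. */
int long_is_two_shorts(void)
{
    /* The standard only guarantees that long has at least 32 bits. */
    assert(sizeof(long) * CHAR_BIT >= 32);
    return sizeof(long) == 2 * sizeof(short);
}

/* Returns 1 if the lowest-addressed byte of a long holds its low-order
   bits (little-endian byte order), 0 otherwise. */
int low_order_byte_first(void)
{
    unsigned long v = 1UL;
    return *(unsigned char *)&v == 1;
}
```

Both answers vary between implementations, which is exactly why code relying on either one worked on some compilers and not others.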
 
Seebs

But then why is
typedef int word;
considered to be a definition, but
struct foo { int x; };
isn't? Surely they "define" (in the sense you state) the identifiers
"word" and "foo", respectively.

Probably historical quirk.
I'd like to have clear terms that distinguish between these two things.
My thought (though obviously the standard doesn't agree with me, and I
am therefore definitively wrong) is that "definition" would be a good
term for something that creates an entity.
Hmm.

Is there a simple term for a declaration that causes the declared
entity (not just an identifier that refers to it) to be created?
Do we even need such a term?

I think we do for things that will take up space in the binary, such as
functions and objects. For types, not so much, because types don't ever
really exist. There is no point at which a type is really created; it's
just a definition in the dictionary sense. Here is the name by which
we will refer to things with the following qualities...

It's just nomenclature, there's never an actual thing created. There is
no "struct foo" in the executable code, there's just code generated according
to a naming convention that was adopted during compilation. (Ignoring,
for now, debugging symbols.)

-s
 
Seebs

The existing type is struct { int x; }. No, it didn't exist prior
to this line of code. But the way I think of it is that the type
struct { int x; } comes into existence when we reach the closing
brace, and the name "foo" for that type doesn't come into existence
until the end of the declaration.

Okay, I think I buy this interpretation.
Not off the top of my head. Can you clarify the relevance of the
question?

Not anymore, it seemed relevant when I asked it, but I forgot why.
char and signed char are two types. int and long are two types.
struct { int x; } and foo are one type.

I think size_t is the interesting case. I would argue that it's always
the case that size_t and unsigned int are two types, and that size_t and
unsigned long are two types. It's just that, on most systems, the compiler
won't be able to tell one of the pairs apart.
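
For anyone who wants to see what their own compiler does with size_t, here is a hedged C11 sketch. The function name `size_t_underlying` is hypothetical; the answer differs between platforms, so no particular result is claimed.

```c
#include <assert.h>
#include <stddef.h>

/* Names the standard unsigned type, if any, that the compiler treats
   size_t as being the same type as. */
const char *size_t_underlying(void)
{
    const char *name = _Generic((size_t)0,
        unsigned short:     "unsigned short",
        unsigned int:       "unsigned int",
        unsigned long:      "unsigned long",
        unsigned long long: "unsigned long long",
        default:            "some other type");
    assert(name != NULL);
    return name;
}
```

Exactly one branch is selected on any given implementation, which is the sense in which the compiler cannot tell that pair apart.
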
Certainly they say that, because describing what *really* happens would
take too long. And in most cases, saying that "touch a" creates a file
is sufficiently clear.
But it's precisely when we're discussing the distinctions among files,
directory entries, inodes, sets of disk blocks, and so forth that it's
*not* enough to say that "touch a" creates a file.

Ah-hah!

Okay, here's the thing.

Outside of, say, working on a compiler's implementation of typedef, I would
consider any discussion of "what typedef does" to be happening at the ordinary
level where it's sufficient to say "touch creates a file", or "typedef defines
a type".

So far as I can tell, I have never, ever, outside of this debate or chatter
while working on the standard, been in a situation where I cared about that
distinction -- just as, outside of actual work on filesystem tools, I pretty
much never have reason to distinguish between creating links and creating
files.
If one of them were a file, the other one would be. And if we don't
happen to have multiple hard links to the same file, it's perfectly
reasonable to talk about touch and rm acting on files. But why
would you gloss over the distinction between files and links *while
discussing the distinction between files and links*?

I wouldn't.

But when I'm talking about typedef, normally I'm working at the level of
"how do you use this", and just as it's usually fine to say "rm removes
files", it's usually fine to say "typedef defines types". I've never had
reason to care about the distinction before. I presumably knew it, and
I certainly knew that typedefs are interchangeable with the things
they're defined in terms of, but... I never cared.

It's always been perfectly fine, from my point of view, to say that "typedef
defines a type". It's easy to remember, it's consistent with the choice of
keyword, and it adequately explains what's happening.
But that means that "defining" a type doesn't necessarily *create* a
type. "typedef struct { int x; } foo;", whatever it may or may not
define, *creates* one type, not two.

I think I agree with this.

I would view defining a type as in the same category as defining a word.

I do think it's sort of odd that the language spec overloads "definition"
then when talking about functions and objects, but...

Huh. I will say this much for this topic: This has revealed a large and
complicated territory where it seems pretty clear that experienced programmers
don't all share the same cognitive map.

-s
 
Rui Maciel

Kenneth said:
<nit>
One could say the same thing about "int", "float", and "char".
</nit>

Not quite. The key aspect here is that the typedef feature lets the programmers attribute any type
definition they see fit to a set of customized identifiers. You don't get any of that by using any
of C's type specifiers.


Rui Maciel
 
ImpalerCore

[snip]
Anyway, I think the whole thing is killed by the heading for 6.7.  If
typedef is introduced under the heading "Type Definitions", then the standard
has clearly committed to the notion that typedef is defining types.

Agreed.  And it's not the wording I would have chosen, but I accept it.

But that means that "defining" a type doesn't necessarily *create* a
type.  "typedef struct { int x; } foo;", whatever it may or may not
define, *creates* one type, not two.

For myself, I tend to explain it by distinguishing types as either
'physical' or 'logical'. The basic types like char, int, float plus
the types created from the struct keyword define what I would call
'physical' types (they define how much space in memory to allocate and
have sharply defined semantics on what is represented in them). I
view types created from typedef as 'logical' types; they have an
underlying 'physical' type, but with added cognitive meaning that
allows the programmer to mentally distinguish important
characteristics that are not readily apparent from its underlying
'physical' type. I contrast defining a 'logical' type with creating
an 'alias', even though they both use typedef to create them, an
'alias' doesn't add any cognitive meaning.

/* A couple aliases */
typedef int Integer;
typedef struct document document;

/*
 * A logical type, using an integer to store the representation
 * of a year in the Gregorian calendar starting from 1 AD and up.
 */
typedef int greg_year;

/* A logical type, describing a boolean (physical type 'int') */
typedef enum
{
  false = 0,
  true = 1
} bool;
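
A sketch of what the 'logical' label does and does not buy you. `Integer` and `greg_year` are from the declarations above; `strict_year`, `is_leap`, `is_leap_strict` and `logical_type_demo` are hypothetical names added for contrast.

```c
#include <assert.h>

typedef int Integer;     /* alias, as above */
typedef int greg_year;   /* 'logical' type, as above */

/* Hypothetical wrapper: a struct is a genuinely distinct type. */
typedef struct { int value; } strict_year;

int is_leap(greg_year y)
{
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

int is_leap_strict(strict_year y)
{
    return is_leap(y.value);
}

int logical_type_demo(void)
{
    Integer apples = 2000;
    strict_year y = { 1900 };

    /* Accepted without a diagnostic: greg_year and Integer are both int. */
    int a = is_leap(apples);
    /* is_leap_strict(apples) would be a constraint violation. */
    int b = is_leap_strict(y);

    assert(a == 1 && b == 0);
    return a * 10 + b;
}
```

The cognitive meaning of `greg_year` lives in the reader's head; only the struct wrapper gets the compiler to enforce it.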

Whether you see typedef defining a type or an alias is in the eye of
the beholder. I personally don't have an issue with the 'typedef'
name as it's defined, but I admit that I have to add 'physical' and
'logical' decorations to explain what it creates to another person,
similar to the link and file analogy being tossed around. And there
may be subtleties that I miss in this kind of explanation, but it's the
best I've got.

Best regards,
John D.
 
Keith Thompson

Seebs said:
Ah-hah!

Okay, here's the thing.

Outside of, say, working on a compiler's implementation of typedef, I would
consider any discussion of "what typedef does" to be happening at the ordinary
level where it's sufficient to say "touch creates a file", or "typedef defines
a type".

So far as I can tell, I have never, ever, outside of this debate or chatter
while working on the standard, been in a situation where I cared about that
distinction -- just as, outside of actual work on filesystem tools, I pretty
much never have reason to distinguish between creating links and creating
files.

But the assumption that "rm removes a file" is false in a way that's
visible to an ordinary user.

If I have a 1-gigabyte file named "foo.dat", and I type "rm foo.dat"
(and it succeeds), I may or may not have increased my free disk space by
1 gigabyte. If there's another link to the same file, or if some
process has the file open, the name "foo.dat" stops being a valid name
for that file, but the file itself still exists, perhaps under another
name, perhaps with no name at all.

Or if, instead of removing it, I append data to it, that data might be
appended to a file with some other name.

This is why "ln" and "cp" are different commands.
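
The append case can be sketched with the POSIX calls behind those commands. This assumes a POSIX system with a writable /tmp that supports hard links; the scratch path and the function name `hard_link_demo` are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if both names refer to one file, 0 if they do not, and -1 if
   the system refused an operation (for example, no hard-link support). */
int hard_link_demo(void)
{
    char a[] = "/tmp/linkdemoXXXXXX";   /* hypothetical scratch path */
    char b[sizeof a + 4];
    struct stat sa, sb;
    int result = -1;

    int fd = mkstemp(a);                /* like "touch a": a file plus one name */
    if (fd < 0)
        return -1;
    snprintf(b, sizeof b, "%s.ln", a);

    if (link(a, b) == 0) {              /* like "ln a b": a second name */
        /* Append through the first name... */
        if (write(fd, "data", 4) == 4 &&
            stat(a, &sa) == 0 && stat(b, &sb) == 0) {
            /* ...and the change is visible through the second name. */
            result = sa.st_ino == sb.st_ino && sa.st_dev == sb.st_dev &&
                     sb.st_size == 4;
            assert(result == 0 || result == 1);
        }
        unlink(b);
    }
    close(fd);
    unlink(a);                          /* like "rm a": removes a name */
    return result;
}
```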

Similarly, the fact that typedef doesn't create a distinct type is
visible to C programmers, for example in the failure of the compiler to
warn you if you assign a size_t* value to an unsigned * object.

Given:
struct foo { int x; };
these two declarations:
typedef struct foo t1;
typedef struct { int x; } t2;
do *visibly* different things. The first creates a name, t1, for a type
that is in some sense the same as struct foo. The second creates a
name, t2, for something that looks very much like a struct foo, but is a
distinct type.

If you insist that struct foo and t1 are two types, it's difficult
to describe the difference between the t1 and t2 declarations.
struct foo and t1 share something that struct foo and t2 do not.
What they share, I'd say, is *identity*, the fact that they're the
same type.
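
A minimal sketch of that visible difference, assuming a C11 compiler for `_Generic`. The macro `IS_FOO_PTR` and the function `identity_demo` are hypothetical names.

```c
#include <assert.h>

struct foo { int x; };
typedef struct foo t1;            /* another name for struct foo */
typedef struct { int x; } t2;     /* a new, unnamed struct type */

/* Evaluates to 1 when the pointer has type struct foo *. */
#define IS_FOO_PTR(p) _Generic((p), struct foo *: 1, default: 0)

int identity_demo(void)
{
    struct foo f = { 7 };
    t1 *a = &f;                   /* fine: t1 * is struct foo * */
    /* t2 *b = &f;  would be a constraint violation (incompatible pointers) */

    assert(IS_FOO_PTR((t1 *)0) == 1);
    assert(IS_FOO_PTR((t2 *)0) == 0);
    return a->x;
}
```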

[...]
Huh. I will say this much for this topic: This has revealed a large and
complicated territory where it seems pretty clear that experienced programmers
don't all share the same cognitive map.

Indeed.
 
Seebs

But the assumption that "rm removes a file" is false in a way that's
visible to an ordinary user.

Oh, certainly. But it turns out that ordinary users can go ten or twenty
years of daily use of a Unix machine without ever encountering that.
Similarly, the fact that typedef doesn't create a distinct type is
visible to C programmers, for example in the failure of the compiler to
warn you if you assign a size_t* value to an unsigned * object.

This raises a fascinating question: Does any compiler warn for such
things? It seems like it'd be useful.

But... I guess that's the thing. I don't expect the types defined by typedef
to be distinct. I just expect them to have a new name. My underlying model
of C does not lead me to expect the compiler to notice every possible case
where I use something of the wrong type, only the subset where the wrongness
is of a sort that compilers are obliged to diagnose.

I tend to view types in C as notation, rather than reality. The actual
space allocated by malloc is just unformed space. I can use it to hold
objects of a particular type, but really all I'm doing is viewing it through
type-filtered lenses. The space is still just space.
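
A rough sketch of that view, relying on the rule that allocated storage has no declared type and takes its effective type from whatever was last stored into it. The function name `lenses_demo` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 on success, 0 if the allocation fails. */
int lenses_demo(void)
{
    size_t n = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *space = malloc(n);
    if (space == NULL)
        return 0;

    /* Storing an int makes the space behave as an int object... */
    *(int *)space = 42;
    int i = *(int *)space;

    /* ...and a later store of a float re-purposes the same bytes. */
    *(float *)space = 1.5f;
    float f = *(float *)space;

    free(space);
    assert(i == 42);
    return f == 1.5f;
}
```

Reading the space through a type other than the one last stored is a different matter and generally not allowed, apart from character types.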

-s
 
Keith Thompson

Seebs said:
Oh, certainly. But it turns out that ordinary users can go ten or twenty
years of daily use of a Unix machine without ever encountering that.

Sure, but it's still part of an understanding of the file system
semantics, even without reference to the underlying implementation.
For example, it remains relevant even if you don't know or care
whether you're using ext3, tmpfs, or claytabletfs.
This raises a fascinating question: Does any compiler warn for such
things? It seems like it'd be useful.

I've wondered about that myself. gcc doesn't seem to. Which also
means that some things that are constraint violations on some
platforms don't even trigger warnings on others.
But... I guess that's the thing. I don't expect the types defined by typedef
to be distinct. I just expect them to have a new name. My underlying model
of C does not lead me to expect the compiler to notice every possible case
where I use something of the wrong type, only the subset where the wrongness
is of a sort that compilers are obliged to diagnose.

I tend to view types in C as notation, rather than reality. The actual
space allocated by malloc is just unformed space. I can use it to hold
objects of a particular type, but really all I'm doing is viewing it through
type-filtered lenses. The space is still just space.

Types obviously (?) have no physical reality, and are very likely
not reflected by anything you could see in memory as the program
is executed. But I don't see types as notation; I see them as
an abstraction. And a name like "size_t" can *refer* to such
an abstraction, but "size_t" is a name, not the thing the name
refers to. (And I'm probably beating a dead horse.)
 
