If you could change the C or C++ or Java syntax, what would you like different?


Rui Maciel

BartC said:
This would be opening a can of worms (with inheritance and all that crap).

Consider:

typedef int newint;

Can you pass a newint to a function that takes an int? (Apparently not,
according to your other post).

My only gripe is about accepting implicit conversions between custom types. Conversion would only
be possible through an explicit cast, and that would be a good thing. Once the person writing the
code is forced to explicitly convert to/from their custom objects, that person is forced to stop
for a moment and think about whether that is really what he wants to do.

Can you do newint+newint?

Yes, provided that the person writing the code explicitly casts the objects to the intended final
type.

If not, then this will make life difficult.

Well, that's the gist of it. If the C programming language supported this "new type" feature then
programmers would be able to easily declare these custom types, and use them to better
compartmentalize their code at a conceptual level. The compiler would then force them, and others,
to follow better coding practices (or to explicitly state their intention of not following them).

If
yes, then it means sometimes newint is treated as int, and sometimes it
isn't.

What about newint+int?

It would depend on the resulting type the programmer intended to get. So, if the programmer
intended that expression to return a newint then he should cast the int to newint. And vice-
versa.

What type is the literal 1234?

The current rules regarding standard integer types would apply. Regarding types derived from
standard types such as those derived from standard integer types (i.e., the newint example), it
would be possible to initialize a derived type from a constant without an explicit cast, provided
it made sense (i.e., don't initialize a newint with a float constant). However, explicit casts
would be needed if a derived type was initialized from an object of a standard type.

Keep in mind that the purpose of this feature would be to take advantage of C's type rules in
order to better compartmentalize the code by taking advantage of the compiler's ability to enforce
the relations between types. So, if a programmer declares a new type such as newint, he is
explicitly stating his intention to handle an object of that type as a different type.

Can one do newint=int without a cast?

I believe that a cast should be needed for that, assuming the right-hand side is an object of type
int. If it is an integer constant then no cast would be needed.


Rui Maciel
 

Joshua Maurice

I think that's a little strong.

If you include the library as part of the language, the notion of
typedef as "defining types" clearly makes sense at that level.

How to put this nicely. I am flabbergasted that we are still
continuing on this train of thought.

To emphasize this point: types are distinct only so far as a type
system says they are. A type and its type system are inseparable
concepts.

Let me put it like this. There is the C type system and type checker
which exist in reality. It is well defined and well specified in the C
standard and in practice at large.

Then there is the type system in your world, such as "at the
programmer's level of abstraction" or "at the library's level of
abstraction". Such a thing is nebulous at best. It is not well defined
by the C standard, nor by any other standard. In fact, it can change
depending on what's convenient to the programmer.

You are saying that, as a programmer, you have your own internal type
system and type checker which says that "size_t" is a different type
than all other types. As such, the following is a type error in your
"programmer's abstraction level" type system.
unsigned int * x;
size_t * y = x;
However, according to the C type system, that little code fragment on
a particular implementation may not be a type error. (If it makes you
feel any better, I could replace size_t with one of the stdint.h
"fast" integer types.)

I personally believe that it is highly inappropriate to have a
discussion or do reasoning with two different inconsistent
abstractions and terminologies at once. Your "programmer's
abstraction level" type system is inconsistent with the C type system.
Moreover, as your programmer's type system is whatever the programmer
feels like at the time, it's completely open to abuse as it's defined
by individual fiat. No. That is not an appropriate way to pick names
of things in a programming language standard. That is not an
appropriate way to decide that "typedef" is a reasonable English
name.

The programmer may want a way to define new types, but typedef is not
that, and lying to him does him no service, nor does letting him
entertain the delusion that it does.

Your idea of "the C standard library's abstraction level" is just as
bogus. It is not documented anywhere, so we're again left to
individual fiat of just what is the type system, and it's also
inconsistent with the actual C type system and C terminology.
 

Joshua Maurice

Give me a moment to ponder the rest of it, but could you answer this
one thing to guide my future discussion points?
Initially, you were of the opinion that MILES, METERS, and int were
all distinct types, (for whatever appropriate definition of
"distinct"). I again hope we're talking at a strictly formal language
level. In that context, MILES, METERS, and int are not distinct types.
They are the same type. The size of the set of types in this discussion
is one.
I again bring up 6.7.7 Type definitions / 4
  typedef int MILES, KLICKSP();
  [...]
  The type of distance is int, [...]
So, do you agree? Or do you want to use language which says that there
are 3 distinct types, MILES, METERS, and int?

Err, that snipped quote should read:

I again bring up 6.7.7 Type definitions / 4
  typedef int MILES, KLICKSP();
  [...]
  MILES distance;
  [...]
  The type of distance is int, [...]

So, I went ahead and did the rest of my reasoning. Here's what I
have:

Point 1

Let me show more conclusively that a name which refers to a type is a
different thing than the type to which it refers.

Let's look at the C standard. Again, I am referring to this
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
publicly available draft for want of the real thing, and so that
others can follow along.

"A type name is different from its type" is the simple meaning of
"6.7.7 Type definitions / 3". This is the meaning that any programmer
would take away from the English words of this section.

This is also explicitly supported by the example "6.7.7 Type
definitions / 4".

Finally, I appeal to common sense in the following way:
struct foo { int x; };
int main()
{
typedef struct foo bar;
struct foo x;
bar y;
x = y;
}
I think that all C implementations will compile this without a required
diagnostic, and I hope that we can all agree that this is a well-formed
program with no undefined behavior. Thus we can look at the simple
assignment rules in "6.5.16.1 Simple assignment / 1". From this, we
can conclude that the only applicable constraint is "the left operand
has a qualified or unqualified version of a structure or union type
compatible with the type of the right;". From this, we can conclude
that the "types" (quote unquote) 'struct foo' and 'bar' are
compatible. Try as I might, I can find no rules in the C standard
which says that they are compatible besides "6.2.7 Compatible type and
composite type / 1" - "Two types have compatible type if their types
are the same.". Thus we are left to conclude that 'struct foo' and
'bar' "are" the same type.

Secondly, let's take a moment and ask ourselves if this is reasonable
according to the usage of the terms in the world at large. The answer
is yes, it is. The world at large, specifically technical type theory,
generally treats two types as compatible iff they are the same type.
Obviously this is not the case for C because of separate translation
units. It's also not the case because of type qualifiers (ex: const).
It's also not the case for C because type declarations can be
"incomplete", as referenced in "6.2.7 Compatible type and composite
type / 3 and 5". However, by and large, for a single translation unit
where definitions and scope and type (for C) "make sense", it is
largely consistent.

So, a type is not the same thing as its type name, and a typedef name
does not specify a new type nor create a new type, to use the common
meaning - which is consistent with the meaning of the C standard.
Specifically, for the example
typedef int MILES;
MILES and int are both names which name the same thing. They are both
type names. They both name a type which we commonly call "int" using a
form of shorthand. In fact, this very looseness of terms is
acknowledged in "6.2.1 Scopes of identifiers / 4".

--

Point 2

Hopefully we can agree that a "type definition" is a category of
thing, where that thing "defines a type". I merely mean that the two
English formations are equivalent, and I don't mean to imply any
particular meaning to "define", "definition", "type", and so on.

--

Point 3

There is no specific passage or passages in the C standard which give
sufficient meaning to the phrase "define a type" or "type definition".
By sufficient, I mean it should be possible to look at a program and
decide if some aspect of that program does define a type, or does not
define a type, and furthermore we should be able to ascertain which
type is being defined. I find no such clear language in the C
standard. I've looked twice now.

It's hard to prove a negative -- short of proof by exhaustion. As
such, all I practically do is make this claim (again), and wait for
someone to supply chapter and verse in the standard where it does
define "type definition" or "define a type" clearly.

Yes, I do recognize that many places actively refer to such a thing.
In fact, there is even a whole section of the standard with that name,
"6.7.7 Type definitions". However, neither that section nor any other
gives a clear meaning to the phrase "type definition" or "define a
type".

Again, I have not found that clear definition of "to define a type",
but I have found clear definitions for:
* "object definitions", "function definitions", "enumeration constant
definition", "typedef name definition", from "6.7 Declarations / 5"
* "contents of a type definition", from "6.7.2.3 Tags / 1"
* and "macro definition".

Moreover, of the text which refers to type definitions (but does not
give meaning to the term "type definition"), there is great
inconsistency and there appears to be great confusion. Some places
clearly refer to a type's definitions, but then other places will
refer to that exact thing as that type's declaration. There is no
rhyme or reason which I can uncover in this. (In fact, more often than
not, such things are referred to as type declarations and not
definitions.)

Here's a somewhat comprehensive, but not complete, listing:

"5.1.2.2.2 Program execution / 1". The library clause (clause 7)
defines types.

"6.2.5 Types / 14 and 15". The implementation defines the types char,
unsigned char, and signed char.

"6.4.4.4 Character constants / 11". The header <stddef.h> defines the
integer type wchar_t.

"6.5.3.4 The sizeof operator / 4". The header <stddef.h> defines the
unsigned integer type size_t.

"6.5.6 Additive operators / 9". The header <stddef.h> defines the
signed integer type ptrdiff_t.

"6.2.5 Types / 1". Types are clearly defined in terms of values and
the types of expressions which access those values.

"6.7 Declarations / 5" clearly talks about definitions of type names,
but not types. In fact, according to this, a definition is always a
declaration, which means that a definition is something which can be
applied to an identifier, "aka a name", thus never a type. Thus, under
this definition of definition, there cannot be such a thing as a type
definition.

"6.7.2.1 Structure and union specifiers / 7" discusses when a new type
is declared, which seems to coincide with when its content is defined
(to use the term from "6.7.2.3 Tags / 1").

6.7.2.3 Tags
Constraints
1 A specific type shall have its content defined at most once.
[...]
Semantics
4 All declarations of structure, union, or enumerated types that have
the same scope and use the same tag declare the same type. The type is
incomplete (footnote 111) until the closing brace of the list defining
the content, and complete thereafter.
5 Two declarations of structure, union, or enumerated types which are
in different scopes or use different tags declare distinct types. Each
declaration of a structure, union, or enumerated type which does not
include a tag declares a distinct type.

"6.7.7 Type definitions". Again, contrary to title, this does not
define "type definition". It does define "type name definition". It
seems non-obvious to me that a type definition is necessarily a type
name definition from the text here. In fact, this would contradict any
earlier sense of definition from "6.7 Declarations / 5" which says
that definitions only apply to names. Also, this would contradict the
meaning of "type definition" in the programming culture at large.

"6.7.7 Type definitions / 3 and 4" also clearly show that typedef does
not specify new types. First, I'd like to note that this is wrong on
some level.
typedef struct { int x; } foo;
This declaration definitely specifies a new type. However, presumably
the text of "6.7.7 Type definitions / 3" was thinking only about the
situation where there was no type content specification (to borrow the
term from "6.7.2.3 Tags / 1").

"7.1.2 Standard headers / 4". Some headers define types.

"7.6 Floating-point environment <fenv.h> / 1". This header declares
two types.

"7.8 Format conversion of integer types <inttypes.h> / 2". This header
declares some functions, and it declares the type imaxdiv_t, and that
type is a structure.

"7.11 Localization <locale.h> / 1 and 2". This header declares two
functions, a type, and several macros. That type is
struct lconv

"7.12 Mathematics <math.h> / 1 and 2". This header declares two
floating point types and many functions.

"7.13 Nonlocal jumps <setjmp.h> / 1". This header defines one macro,
declares one function, and declares one array type.

"7.14 Signal handling <signal.h> / 1 and 2". This header declares an
integer type, declares two functions, and defines several macros.

"7.15 Variable arguments <stdarg.h> / 1". This header declares a type
and defines four macros.

"7.17 Common definitions <stddef.h> / 1". This header defines some
integer types and defines some macros.

"7.17 Common definitions <stddef.h> / 4". The recommended practice is
that the defined types are "made" using typedef.

"7.18 Integer types <stdint.h> / 1". This header declares some integer
types.

"7.18 Integer types <stdint.h> / 4". The header shall declare the
types with typedef.

"7.19.1 Introduction / 1". The header <stdio.h> shall declare three
types, several macros, and many functions. One of the declared types
is size_t. (This is defined in the header <stddef.h> according to
other sections. Is this declaration different than those other
definitions?) One of the declared types (FILE) is an object type which
is capable of recording all the necessary state. One of the declared
types (fpos_t) is an object type and not an array type.

"7.20 General utilities <stdlib.h> / 1 and 2". This header declares 5
types, declares several functions, and defines several macros. Some of
the declared types are size_t and wchar_t (both of which are defined
in <stddef.h> according to other sections. Is this declaration
different than those other definitions?). The other three types are
structure types.

--

And hopefully you can see the pattern for the library clauses.
Specifically, for the same type, sometimes it's defined, sometimes
it's declared. Sometimes it specifically mentions that the integer
type must be "made" with typedef. Sometimes it says that the
recommended practice is that the integer type is "made" with typedef.
Sometimes it doesn't say.

Moreover, sometimes it declares macros. Sometimes it defines macros.
Unlike type definitions, there is a clear definition for "macro
definition" and there is no definition of "macro declaration", yet it
occasionally says that macros are declared, and sometimes they're
defined.

I say that you should not read into any of this at all. I claim that
when the standard talks about defining a type and declaring a type,
they're using a much more colloquial meaning of those terms. They're
generally using "definition" in the English dictionary sense to define
words and phrases (which is the exact purpose of a programming
language standard btw), though sometimes it's clear that they really
mean to say define or declare the type /name/.

Finally, I can find /clear/ language in the standard which defines
"type declarations" as opposed to type "name definitions": "6.2.7
Compatible type and composite type / 1" and "6.7.2.1 Structure and
union specifiers / 7". "6.7.2.1 Structure and union specifiers / 7" is
quite clear that certain syntax creates a new type. However, I can
still find no definition for "type definition".

What we see - arbitrary usage of declare a type and define a type - is
exactly what we would expect if the standard implicitly conflated
"names" for "the things which they name" and gave no clear meaning to
"type definition" but did give clear meaning to "type
declaration" (which is exactly what the standard does).

--

Point 4.

Having (again) concluded that the standard makes no formal definition
for "type definition", the only reasonable definition can come from
one of the standard's normative sources, or from another "reliable"
source, such as the well accepted usage in the programming community.

I do plan to get those normative sources eventually and see if they
give any better sense on type theory and type definitions. I half
expect that they would. Is there anyone offhand who has access to
those ISO docs who can review them for me?
 

BartC

Inheritance? We're talking about C, remember.

If I were proposing this, I wouldn't change the general behavior of
numeric types (or at least that would be a separate proposal).

Somebody suggested a new keyword "newtype" that acts like typedef except
that the new type is actually distinct. Consider:

newtype int newint; /* changing the existing semantics of typedef is not
a reasonable option */

I won't argue with that.
Suppose int and long happen to be the same size. Then int, long,
and newint are all distinct signed integer types with the same
representation.


Yes, just as you can pass a long to such a function.

I think this was the whole point of defining a new type, to pick up type
mismatches:

newtype float Angle;
newtype float Length;

Length d;
Angle a;

rotate(d); /* type error */

rotate((Angle)d);

This now passes, but how is the conversion done? I.e. is it a no-op
reinterpretation, is there some actual conversion, or should Length->Angle be
disallowed unless you do Length->float->Angle?
Yes (otherwise there's not much point).

Well, distinct types are useful where there are also facilities for
overloading built-in operators, and perhaps also user-functions.

If you forget about these, allow derived types to inherit the built-in ops
available to their 'base' types, but not to do the same with user-functions,
and use casts everywhere so that type-matching is a case of 100% or nothing,
then it might be workable, although it might just look like a poor imitation
of C++.
<OT>
Ada has predefined integer types similar to C's, but they're
distinct types, and there are no implicit conversions between
....

Ada is a much more complex language; I read somewhere that an Ada compiler
takes 50 man-years of effort (compared with 10 man-years for C++, which is
itself harder to write a compiler for than C). And some of this complexity,
I think, also makes it harder to program in.
But I wouldn't suggest changing C in this way; it would break too
much existing code (and too many C programmers' minds).

At the moment there is some disagreement too about exactly what it means and
how it might work.
 

Robert A Duff

BartC said:
Ada is a much more complex language; I read somewhere that an Ada
compiler takes 50 man-years of effort (compared with 10 man-years for
C++, which is itself harder to write a compiler for than C). And some
of this complexity, I think, also makes it harder to program in.

I'd say C++ is somewhat more complicated than Ada,
and C is rather less complicated than either C++ or Ada.
But the C standard is pretty big these days.
I wouldn't call any of these three "simple".
Look at Smalltalk, for example, or Scheme.

Anyway, I don't know where the 50 and 10 numbers come from;
they sound sort of "made up", but anyway, how much effort
has gone into the gcc C compiler so far? It's got to be far,
far more than 50 man-years. And it's still not done. ;-)

- Bob
 

Robert A Duff

Keith Thompson said:
Hello Bob, welcome to the *other* language whose name is a hexadecimal
palindrome!

;-)

Hello, Keith.
C allows any numeric type to be converted implicitly to any other
numeric types. For example, int and long are distinct types (which
may or may not have the same representation) but given:

Right, there wouldn't be much point in a new typedef-like feature
if you can implicitly convert willy-nilly.

Also, the way operators work in C doesn't quite fit
with such a new feature.

I'm not interested in spending a lot of time designing
such a C feature. I'm just saying that such a feature
can make sense, and if somebody is interested, it would
be wise to look at other languages. Especially languages
that aren't directly descended from C.

- Bob
 

BartC

Robert A Duff said:
....
Anyway, I don't know where the 50 and 10 numbers come from;
they sound sort of "made up", but anyway, how much effort
has gone into the gcc C compiler so far? It's got to be far,
far more than 50 man-years. And it's still not done. ;-)

The Ada figure probably came from here:

http://www.infres.enst.fr/~pautet/Ada95/intro.htm (3rd paragraph).

The 10 man-years for C++ was actually for OCaml, and got mixed up with this
comment from Walter Bright:

"For someone to start a C++ compiler today, I'd give it 10 man years
minimum just to do the front end (not optimizer, code generator, linker,
or library), and that's if you can find a good compiler writer."

So perhaps Ada and C++ aren't that far apart in implementation effort.
 

Keith Thompson

Rui Maciel said:
My only gripe is about accepting implicit conversions between custom
types. Conversion would only be possible through an explicit cast,
and that would be a good thing. Once the person writing the code is
forced to explicitly convert to/from their custom objects, that
person is forced to stop for a moment and think about whether that is
really what he wants to do.



Yes, provided that the person writing the code explicitly casts the
objects to the intended final type.



Well, that's the gist of it. If the C programming language supported
this "new type" feature then programmers would be able to easily
declare these custom types, and use them to better compartmentalize
their code at a conceptual level. The compiler would then force them,
and others, to follow better coding practices (or to explicitly state
their intention of not following them).

Ok, so a type created by "newtype" would be distinct in a way that
existing integer types such as int and long are not.

Hmm. I'm sure this could be done consistently, but I'm not sure I'm
comfortable with this added level of complexity in the C type system.

Still, the ability to create truly distinct types that are "clones" of
existing types is extremely useful in languages that do support it.

[...]
The current rules regarding standard integer types would apply.

So the type of 1234 is int.
Regarding types derived from standard types such as those derived from
standard integer types (i.e., the newint example), it would be
possible to initialize a derived type from a constant without an
explicit cast, provided it made sense (i.e., don't initialize a newint
with a float constant). However, explicit casts would be needed if a
derived type was initialized from an object of a standard type.

Interesting idea, but it could lead to some counterintuitive results.

For example:

newtype int newint;
newint a = 42; /* 42 is of type int, implicit conversion */
const int answer = 42;
newint b = answer; /* no implicit conversion, constraint violation */

But maybe that's ok. I explicitly declared "answer" as an int,
so I shouldn't be mixing it with newints.

We'd have to have a more detailed description of which expressions can
be converted implicitly. Almost certainly if
newint a = 42;
is legal, then
newint c = 6 * 9;
should be as well. The simplest reasonable solution is to say that
an integer constant expression can be implicitly converted to newint.
One odd corner case is that this:
newint d = (int)42;
is legal, because integer constant expressions can include cast
operators.

And the "usual arithmetic conversions" would have to be modified
to permit ``n + 1'' while still forbidding ``n + i'' (where n is
a newint and i is an int).

[...]

It occurs to me that cast operators would become necessary in more
contexts. Currently, a cast is a warning sign, either that it's
used unnecessarily (perhaps due to the author's inexperience),
or that something strange is going on (such as deliberately
non-portable code doing pointer conversions). But again, maybe
that's ok. Well-written code using "newtype" types should *still*
have a minimum of casts; if you need to cast an operand, you probably
should have declared it with the right type in the first place.
Keep in mind that the purpose of this feature would be to take
advantage of C's type rules in order to better compartmentalize the
code by taking advantage of the compiler's ability to enforce the
relations between types. So, if a programmer declares a new type such
as newint, he is explicitly stating his intention to handle an object
of that type as a different type.

I think I like it.

But I'm not at all optimistic that something like this could actually
be added to a new C standard. Even if it were, it would be difficult
or impossible to change existing predefined types such as size_t from
typedefs to newtypes; any code that mixes size_t and (non-constant)
int in an expression would break. So newint would be the recommended
way to create new types, but the existing standard library would
not use the new mechanism (except maybe for new headers).

Incidentally, could you keep your lines below 80 columns, preferably
down to about 72? A lot of people like to read Usenet in 80-column
windows.
 

Joshua Maurice

Yes it is. It's why it's called typedef. For anyone with half a
brain. Create a variable of type MyType, where MyType is a typedef.

You really don't have to be too clever to see this how real people work
and communicate.

The entire point of this discussion is that I believe that typedef
does not specify new types, which then implies that it does not define
new types, which then implies that it does not define types. Could you
respond to any of my particular arguments on the subject to show how I
am wrong besides simply stating that I am wrong, or relying on the
simple meaning of the word "typedef" which I claimed is a misnomer?

To repeat my arguments here, they are:

1- typedef does not specify new types because that's the very clear
and explicit reading of "6.7.7 Type definitions / 3".

2- typedef does not specify new types because that's the very clear
and explicit reading of the example "6.7.7 Type definitions / 4".
Specifically: "The type of distance is int."

3- typedef does not specify new types because its well understood
semantics and usage create compatible types, and the only possible
rule in the C standard which makes them compatible types is "6.2.7
Compatible type and composite type / 1", which says that, quote "Two
types have compatible type if their types are the same." Example:
struct foo { int x; };
typedef struct foo bar;
int main()
{
struct foo x;
bar y = x;
}

4- typedef does not specify new types because its well understood
semantics and usage, when viewed under general type theory, clearly
shows that it does not specify new types. In a type system, generally
types are compatible for things like simple assignment iff the types
are the same type. (Obviously, this is not strictly true in C
[depending on your interpretation of the terms]. This is because of
the rules governing separate translation units, implicit conversions,
and incomplete type declarations.)

The interesting part of this discussion is whether we can ferret out
an agreed upon meaning of a type definition, or what it means to
define a type, not whether typedef specifies new types.
 

Felix Palmen

* Joshua Maurice said:
The interesting part of this discussion is whether we can ferret out
an agreed upon meaning of a type definition, or what it means to
define a type, not whether typedef specifies new types.

No, we can't, because we apply different levels of abstraction. Some
people in this discussion (including you) are convinced that the C
language standard is the only level of abstraction that should be used
for naming language keywords, some others (including me) object to that
-- in my case because I think it's much better to give the programmer
tools that likely match his conceptual model of what he's doing.

But, there's probably one thing we could/should agree: As Seebs and some
others worked out of the standard texts, typedef indeed defines
something on that abstraction layer, too: the identifiers (aka type
names).

Regards,
Felix
 

Seebs

But, there's probably one thing we could/should agree: As Seebs and some
others worked out of the standard texts, typedef indeed defines
something on that abstraction layer, too: the identifiers (aka type
names).

The more I look at this, the more I think the problem is that the word
"type" is being used in C much the way the word "definition" is used when
talking about a glossary. It's used both for the text that explains
the meaning of a given word, and for the entry that links a word to
that explanation.

Poking at this more... Consider:
typedef int foo;

After this has been done, "foo" is a type-name that refers to a particular
mapping from storage space to interpretation. It's not "really" a type,
because it's really just a name that points to a type.

But... How is that any different from "int" or "signed int"? Those are,
also, both just names that refer to that interpretation-of-data. They are
not more privileged in any particularly significant way. A variable
declared as "int" is not somehow better tied to that interpretation-of-data
than a variable declared as "foo" or a variable declared as "signed int".

We don't have any problem calling "int" a type, but really, "int" is the
same kind of thing that "foo" is; it's a word which can be used to invoke
a particular mapping of storage. Sure, it's got additional language magic,
but in terms of declaring objects, it's not any better, it's not any more
real.

So in the standard, there is a certain tendency to conflate the name of
the set of rules and the set of rules, because really, they are effectively
interchangeable in most contexts. Perhaps more significantly, in any given
context, only one is usually relevant, so if you just use the relevant one
you always know what people meant.

-s
 

Joshua Maurice

No, we can't, because we apply different levels of abstraction. Some
people in this discussion (including you) are convinced that the C
language standard is the only level of abstraction that should be used
for naming language keywords, some others (including me) object to that
-- in my case because I think it's much better to give the programmer
tools that likely match his conceptual model of what he's doing.

But, there's probably one thing we could/should agree: As Seebs and some
others worked out of the standard texts, typedef indeed defines
something on that abstraction layer, too: the identifiers (aka type
names).

Yes. typedef does define type names. I very clearly agree with this.

I guess there's not much more I can do on this argument though. I
think it's an affront to good practice in any discipline to start
mixing inconsistent ontologies in a single document. The C standard
should be defined in terms of a single ontology, preferably one which
makes sense to the programmer. The programmer should not need to adopt
a different inconsistent ontology to be a good programmer. In fact, I
don't think you do. It works perfectly well just to think that "The
definition of size_t is unspecified and dependent on implementation.
It's likely a typedef for an unsigned integer type. However, as it's
unspecified, don't rely on any particular implementation." There's no
need to think that "size_t is a distinct type", and I think that's a
very bad way to think about it because it implies certain untrue
things, such as:
unsigned int * x;
size_t * y = x;
is guaranteed to produce a diagnostic (specifically a type error) on a
conforming implementation.
 
K

Keith Thompson

No, we can't, because we apply different levels of abstraction. Some
people in this discussion (including you) are convinced that the C
language standard is the only level of abstraction that should be used
for naming language keywords, some others (including me) object to that
-- in my case because I think it's much better to give the programmer
tools that likely match his conceptual model of what he's doing.

You're making an unwarranted assumption about the programmer's
conceptual model.

What if I *want* to define a name that's nothing more than an alias for
an existing type?

Consider, for example, the rather confusing declaration of the signal()
function:

void (*signal(int sig, void (*func)(int)))(int);

I might reasonably declare a typedef for the second argument:

typedef void (*signal_handler)(int);

It's not intended *on any level of abstraction* to be a distinct type,
just an alias for the already existing type of the second parameter of
signal().

Is it the language's business to tell me that I should be using typedef
(which does not create a new language-level type) to create a new and
distinct logical type?

[...]
 
S

Seebs

Is it the language's business to tell me that I should be using typedef
(which does not create a new language-level type) to create a new and
distinct logical type?

I don't think so.

I still want to know whence comes the inference that "define" implies
"create a new and distinct". That interpretation is clearly not
unique, as I've seen more than one person arguing for it, but it would
simply never have occurred to me to expect it from "define".

-s
 
K

Keith Thompson

Seebs said:
The more I look at this, the more I think the problem is that the word
"type" is being used in C much the way the word "definition" is used when
talking about a glossary. It's used both to refer to the thing you
use to explain the meaning of a given word, and to the entry that links
a word with a thing that explains the meaning of a word.

I disagree. A type name is not a type.
Poking at this more... Consider:
typedef int foo;

After this has been done, "foo" is a type-name that refers to a particular
mapping from storage space to interpretation. It's not "really" a type,
because it's really just a name that points to a type.

Right.

But... How is that any different from "int" or "signed int"? Those are,
also, both just names that refer to that interpretation-of-data. They are
not more privileged in any particularly significant way. A variable
declared as "int" is not somehow better tied to that interpretation-of-data
than a variable declared as "foo" or a variable declared as "signed int".

Agreed; "int", "signed int", and "foo" are three different names for the
same type. The names have different syntactic forms (a keyword, a pair
of keywords, and an identifier, respectively), but they all satisfy the
grammar production "type-name".

(An aside: This caused me substantial headaches when I was working
on a tool that had to parse C source code. In effect, a typedef
declaration causes the identifier to be treated as a keyword; in
the absence of the typedef, trying to use "foo" as a type name is
actually a syntax error. This is not particularly relevant to the
point we're discussing.)

We don't have any problem calling "int" a type, but really, "int" is the
same kind of thing that "foo" is; it's a word which can be used to invoke
a particular mapping of storage. Sure, it's got additional language magic,
but in terms of declaring objects, it's not any better, it's not any more
real.

Oh, but I do have a problem calling "int" a type. "int" is a keyword.
It has the syntactic form of an identifier consisting of three letters.
If preceded (or followed) by "short", "long", or "unsigned", it doesn't
even refer to the type whose name is "int".

The *name* int is a thing that appears in C source code. The *type*
int (i.e., the type whose name is "int") is an abstraction that
exists in a running program.

Much of the above applies equally to "foo".

Similarly, the token ``42'' appearing in a C source file only *refers*
to the numeric value 42 that might exist in a running program. For
example, the token is decimal; the numeric value is not.

So in the standard, there is a certain tendency to conflate the name of
the set of rules and the set of rules, because really, they are effectively
interchangeable in most contexts. Perhaps more significantly, in any given
context, only one is usually relevant, so if you just use the relevant one
you always know what people meant.

I think you're misreading it. The statement

    int is a type.

is verbal shorthand for

    The entity whose name is "int" is a type.

Such shorthand is why we have names for things in the first place.

This doesn't make it any less important to distinguish clearly between
names and the things they refer to.

When I read the Standard, I don't see anything that's clearly
inconsistent with my "a type-name is not a type" mental model
(though, again, the stuff about "defining" types is still unclear).
 
I

Ian Collins

I don't think so.

I still want to know whence comes the inference that "define" implies
"create a new and distinct". That interpretation is clearly not
unique, as I've seen more than one person arguing for it, but it would
simply never have occurred to me to expect it from "define".

Especially in the context of C where a definition describes something
that has already been declared. Yes a definition can be a declaration,
but it can't exist in isolation.
 
S

Seebs

I disagree. A type name is not a type.

But it can be used as one, since the jump from type-name to named type
is unambiguous.
Agreed; "int", "signed int", and "foo" are three different names for the
same type. The names have different syntactic forms (a keyword, a pair
of keywords, and an identifier, respectively), but they all satisfy the
grammar production "type-name".

Right. And the thing is... If we can say that "int is a type", then we
can just as accurately say that "foo is a type". They're both actually
names that refer to a type, but it turns out that you can simply use the
type name to refer to the type when talking about code, just as you
can in the code.

It's like identifiers.

int x;

This declares a variable, x, of type int... but wait! Actually, x isn't a
variable; it's the name that identifies the variable. And it's not of
type int. It's of the type denoted by the name int.

But it's completely unconfusing to just say "x is a variable of type int"
instead of "the name x denotes a variable of the type denoted by int".

The *name* int is a thing that appears in C source code. The *type*
int (i.e., the type whose name is "int") is an abstraction that
exists in a running program.

I would argue that the abstraction exists only during compilation, and
types don't exist at runtime.

I think you're misreading it. The statement

    int is a type.

is verbal shorthand for

    The entity whose name is "int" is a type.

Such shorthand is why we have names for things in the first place.

Ahh, but here's where we get into fuzziness. That entity's name isn't "int"
any more than it's "signed int". Or, after a typedef, "foo".
This doesn't make it any less important to distinguish clearly between
names and the things they refer to.

I think that's a thing which is only situationally important. I've
never before this conversation had any reason to say "the type which
is denoted by the name int" rather than "the type int". And, similarly,
I've never once needed the phrase "the type which is denoted by the
name size_t".

When I read the Standard, I don't see anything that's clearly
inconsistent with my "a type-name is not a type" mental model
(though, again, the stuff about "defining" types is still unclear).

I think it's not so much that there's not a distinction there, as that
you can pretty much ignore that distinction while programming, and you
only really need it when trying to explain the "under the hood" parts
of typedef. The rest of the time, "x is a variable of type int" is good
enough.

-s
 
J

Joshua Maurice

The more I look at this, the more I think the problem is that the word
"type" is being used in C much the way the word "definition" is used when
talking about a glossary.  It's used both to refer to the thing you
use to explain the meaning of a given word, and to the entry that links
a word with a thing that explains the meaning of a word.

Poking at this more... Consider:
        typedef int foo;

After this has been done, "foo" is a type-name that refers to a particular
mapping from storage space to interpretation.  It's not "really" a type,
because it's really just a name that points to a type.

But... How is that any different from "int" or "signed int"?  Those are,
also, both just names that refer to that interpretation-of-data.  They are
not more privileged in any particularly significant way.  A variable
declared as "int" is not somehow better tied to that interpretation-of-data
than a variable declared as "foo" or a variable declared as "signed int".

We don't have any problem calling "int" a type, but really, "int" is the
same kind of thing that "foo" is; it's a word which can be used to invoke
a particular mapping of storage.  Sure, it's got additional language magic,
but in terms of declaring objects, it's not any better, it's not any more
real.

So in the standard, there is a certain tendency to conflate the name of
the set of rules and the set of rules, because really, they are effectively
interchangeable in most contexts.  Perhaps more significantly, in any given
context, only one is usually relevant, so if you just use the relevant one
you always know what people meant.

I agree in large part. Generally it's clear whether the word "int"
carries meaning as a type or as a type name from context. There is
little fear of ambiguity. Of course, I myself use the English word int
to refer to both a type name and a type.

However, if you accept my position that "to define a type" is
equivalent in most programming contexts to "to fully specify a new
type to the compiler and type checker so that it is distinct from all
other currently defined types (for that translation unit)", then we
have a problem. With that, there is a crucial distinction between "to
define the type foo" and "to define the type name foo".
 
K

Keith Thompson

Seebs said:
I don't think so.

I still want to know whence comes the inference that "define" implies
"create a new and distinct". That interpretation is clearly not
unique, as I've seen more than one person arguing for it, but it would
simply never have occurred to me to expect it from "define".

For both functions and objects, a "definition" creates a new and
distinct function or object, respectively, whereas a "declaration"
may or may not do so. It seemed (and still seems) perfectly natural
to apply the same distinction to type declarations.

It doesn't match the way a dictionary "defines" a word, but then
dictionaries don't create anything; they just document the existing
language. C programs create new things all the time.

The authors of the standard obviously had a different idea of what
is or is not a "definition" (C99 6.7p5).
 
S

Seebs

For both functions and objects, a "definition" creates a new and
distinct function or object, respectively, whereas a "declaration"
may or may not do so. It seemed (and still seems) perfectly natural
to apply the same distinction to type declarations.

Ah-hah!

I see. See, I was coming at it the other way; I assumed that "definition"
had its normal meaning except when explicitly overloaded for functions
and objects (because they have the declare/define problem between modules).

Since types don't have that problem -- you don't refer to a type that's
defined by another translation unit, you just declare compatible types in
all units -- it falls back on the general English meaning.
The authors of the standard obviously had a different idea of what
is or is not a "definition" (C99 6.7p5).

Note that there's two kinds; there's the kind for functions and objects,
where a definition is the thing that actually makes a thing which takes
storage space or generates code, and there's the kind for names, which
work a lot more like dictionary definitions or the general-English usage.

A declaration specifies the interpretation and attributes of a set
of identifiers. A definition of an identifier is a declaration for
that identifier that:
- for an object, causes storage to be reserved for that object;
- for a function, includes the function body;
- for an enumeration constant or typedef name, is the (only)
declaration of the identifier.

If you compare these, the division can be summarized as: For things which
actually result in some kind of generated code, the "definition" is the
one which generates the code or reserved storage or whatever, and is
distinct from a declaration because you *can't* have two definitions of
the same thing in two different modules. You have to have one definition,
and the rest have to be declarations.

But for things like enum constants and typedef names, they're unique for
each translation unit, and don't show up in any way in generated code
(again, ignoring debug symbols). And then, the "definition" is just the
thing that tells you what they mean.

Another way of looking at it: The definition is always in the dictionary
sense, the thing that really tells you the specifics of what something is,
not just roughly what kind of thing it is. For typedef names and
enumeration constants, you can have exactly one per name per translation
unit. For objects and functions, you can have exactly one per name per
final program.

Basically, the weirdness is that definitions of objects and functions
*are* shared, and as such, you have to know which one "creates" the
thing and that other references to it are just letting you know that
such a thing will have been created. For types and enum constants,
you don't care -- the mere fact of defining them effectively "creates"
them.

... I am clearly rambling here. I perceive a distinction between these
cases, as a result of which it seems unsurprising to me that enumeration
constants and typedef names have different "definition" rules than
objects and functions. And as a result of that, it seems unexceptional
to me to refer to typedef as "defining" types in the way that types are
defined, which is not at all like the way that objects and functions
are defined, but is a great deal like the way that enumeration constants
are defined.

Consider:
typedef int foo;
typedef int bar;
enum foo_enum { FOO };
enum bar_enum { BAR };

I'm no more surprised by "foo" and "bar" denoting the same underlying
type than I am by FOO and BAR denoting the same value. I still feel
that foo and bar are both defined, and FOO and BAR are both defined,
and that it's quite reasonable to refer to foo and bar as types, and
FOO and BAR as constants. Even though, really, FOO and BAR aren't
constants, they're names that denote the same constant...

-s
 
