C as a Subset of C++ (or C++ as a superset of C)

BartC · Sep 12, 2012

Malcolm McLean said:
On Wednesday, 12 September 2012 16:22:06 UTC+1, Bart wrote:
Consider this

int16 *readaudiosamples(void *context);
void playaudiosamples(short *wave, int N);

What about it?

Audio probably *is* best handled as blocks of 16-bit data. But the example
uses 3 different int denotations, with probably two different widths; is
that the trouble?

I've already made the point that C could be improved, with regard to
managing these various different types. But I doubt whether you're seriously
proposing using actual C syntax, rather than C-style, as the basis for your
universal language.

James Kuyper · Sep 12, 2012

Proliferation? Typically there are 8 combinations of signed/unsigned, and
width (8,16,32,64 bits).

I count a lot more than 8. There's a minimum of 12 integer types that
must be distinct on all implementations (even though one pair of them
must have the same size, alignment requirement and representation, and
other pairs might also match):
_Bool
[unsigned | (plain) | signed] char: 3
[unsigned | signed] [short | int | long | long long]: 8

Each of the following types could be the same as one of the above, or a
distinct extended integer type:
intN_t: none mandatory
int_leastN_t: 8 mandatory
int_fastN_t: 8 mandatory
intmax_t: 2
intptr_t: 2
size_t, ptrdiff_t, wchar_t, char16_t, char32_t
atomic_ versions of all of the above: total of 37 mandatory.
sig_atomic_t, wint_t

That's a total of 74 mandatory types that are required to be integers,
and in principle they could all be distinct types.

In addition, all of the following types could be integer types (though
they don't have to be), and don't have to be the same as any of the above:
fpos_t, cnd_t, thrd_t, tss_t, mtx_t, clock_t, time_t, max_align_t,
mbstate_t, wctrans_t, wctype_t

The important thing is not that they could all be distinct - that's
highly unlikely - in all likelihood they span not much more than a dozen
distinct types. The important point is that it's difficult, in general,
to determine which ones are the same. It used to be impossible, but the
new _Generic feature of C2011 makes it possible, at least at run time,
if not at compile time. However, it's still not easy.
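
A minimal sketch of how _Generic can answer the "are these two the same type?" question, assuming a C11 compiler. The macro IS_TYPE and the function names are hypothetical, chosen only for illustration. The selection itself is made during translation, but it is not available to the preprocessor, so it cannot drive #if.

```c
#include <stddef.h>
#include <stdint.h>

/* IS_TYPE is a hypothetical helper: it yields 1 when the expression has
   exactly the named type and 0 otherwise. Each selection lists a single
   type plus default, so no two associations can name compatible types. */
#define IS_TYPE(expr, type) _Generic((expr), type: 1, default: 0)

/* Is size_t the same type as unsigned long on this implementation? */
int size_t_is_unsigned_long(void)
{
    return IS_TYPE((size_t)0, unsigned long);
}

/* Is size_t the same type as unsigned int on this implementation? */
int size_t_is_unsigned_int(void)
{
    return IS_TYPE((size_t)0, unsigned int);
}

/* Is intmax_t the same type as long long on this implementation? */
int intmax_t_is_long_long(void)
{
    return IS_TYPE((intmax_t)0, long long);
}
```

Even where unsigned int and unsigned long have the same width, at most one of the two size_t checks returns 1, which is a distinction that sizeof alone cannot reveal.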

Keith Thompson · Sep 12, 2012

Malcolm McLean said:
On Wednesday, 12 September 2012 09:01:53 UTC+1, Nick Keighley wrote:
I'd say it needs essentially C syntax - loops, curly brackets,
non-flow control non-arithmetical functionality implemented as
functions rather than the core language. That's generally been
accepted.

No, it hasn't. There are a number of different syntaxes for
delimiting compound statements in modern languages: {/} (C, C++,
Perl et al), begin/end (Ada et al), and indentation (Python et al)
spring to mind, and there are variations on each. I don't find
any of these clearly superior to the others.

[...]

Also the proliferation of integer types is unacceptable.

It's nearly unavoidable if you're going to (a) support types
matching the various widths of integers supported by hardware, and
(b) permit the use of types whose width can vary from platform to
another (size_t, for example).

BartC · Sep 12, 2012

James Kuyper said:
Proliferation? Typically there are 8 combinations of signed/unsigned, and
width (8,16,32,64 bits).

I count a lot more than 8. There's a minimum of 12 integer types that
must be distinct on all implementations (even though one pair of them
must have the same size, alignment requirement and representation, and
other pairs might also match):
_Bool
[unsigned | (plain) | signed] char: 3
[unsigned | signed] [short | int | long | long long]: 8

Each of the following types could be the same as one of the above, or a
distinct extended integer type:
intN_t: none mandatory
int_leastN_t: 8 mandatory
int_fastN_t: 8 mandatory
intmax_t: 2
intptr_t: 2
size_t, ptrdiff_t, wchar_t, char16_t, char32_t
atomic_ versions of all of the above: total of 37 mandatory.
sig_atomic_t, wint_t

That's a total of 74 mandatory types that are required to be integers,
and in principle they could all be distinct types.

This is just C trying to make things difficult.

I thought Malcolm was suggesting the use of a single int type instead of the
four common widths supported by many CPUs (plus signed/unsigned variations).
Perhaps dealing with all my eight types isn't a big deal after all..

In your list, I wouldn't include _Bool as an 'int' type; it can be
considered a type in its own right. While the char types are just synonyms
for ints, really (unless they behave in any way differently to an int of the
same width and sign?).

And on typical machines I use, char is 8 bits, short is 16 bits, int is 32
bits, and long long is 64 bits.

'long' seems to be either 32 or 64 bits. (Maybe there are machines that use
odd sizes such as 24/48 bits, or that support 96/128-bit ints, but not
'typical'.)

Everything else is imposed by the language (including all those UINT_MINs
and INT_MAXs). I doubt a 'universal' language would use the same approach.

(I use 'int' and 'word' for signed/unsigned integers of natural width; int:N
and word:N for specific bitwidths, and int*N and byte*N for specific
bytewidths; while 'byte' is a synonym for word:8. For value ranges, I use
int.min, word.max and so on.)

James Kuyper · Sep 12, 2012

I count a lot more than 8. There's a minimum of 12 integer types that
must be distinct on all implementations (even though one pair of them
must have the same size, alignment requirement and representation, and
other pairs might also match):
_Bool
[unsigned | (plain) | signed] char: 3
[unsigned | signed] [short | int | long | long long]: 8

Each of the following types could be the same as one of the above, or a
distinct extended integer type:
intN_t: none mandatory
int_leastN_t: 8 mandatory
int_fastN_t: 8 mandatory
intmax_t: 2
intptr_t: 2
size_t, ptrdiff_t, wchar_t, char16_t, char32_t
atomic_ versions of all of the above: total of 37 mandatory.
sig_atomic_t, wint_t

That's a total of 74 mandatory types that are required to be integers,
and in principle they could all be distinct types.

This is just C trying to make things difficult.

I thought Malcolm was suggesting the use of a single int type instead of the

In a C context, 'int' is the name of a specific type, and a component
(usually optional) of the names of several other types, but it is not an
adjective. The adjective you're looking for is "integer", not "int".

four common widths supported by many CPUs (plus signed/unsigned variations).
Perhaps dealing with all my eight types isn't a big deal after all..

Malcolm does indeed believe in having a single integer type. I don't. I
agree with him that there are too many integer types, but he would trim
the type system far more than I would, if either of us had the power
to do so without worrying about backwards compatibility.

In your list, I wouldn't include _Bool as an 'int' type; it can be
considered a type in its own right.

It is considered a type in its own right, but like most other types,
that type is also defined by the standard as being a member of a
specific type category, the standard unsigned integer types (6.2.5p6).
As a result, it is also a member of several other standard-defined type
categories: unsigned integer types, integer types, basic types, real
types, arithmetic types, and scalar types. (6.2.5)

... While the char types are just synonyms
for ints, really (unless they behave in any way differently to an int of the
same width and sign?).

The char types are integer types (6.2.5p17). The int8_t and
int_*8_t types are likely to be typedefs for char types on any
machine with CHAR_BIT==8, but calling "unsigned char" a synonym for
uint_least8_t reverses the roles of the two types.

And on typical machines I use, char is 8 bits, short is 16 bits, int is 32
bits, and long long is 64 bits.

I've already acknowledged that the 12 integer types that a conforming
implementation of C is required to support as distinct types need not
have distinct characteristics. In fact, plain char is required to have
exactly the same characteristics as either "signed char" or "unsigned
char". The important point is that, while some of them might be
implemented identically, the standard still requires that they all be
treated as distinct types, and there are no guarantees as to which pairs
of types are implemented identically. For the full set of types I listed
above, many of them are likely to be aliases for one of the 12 mandatory
types. On implementations with extended integer types, many pairs of the
other types I listed are likely to be aliases for the same extended
integer type. However, the key point is that there are no pairs in that
list which are guaranteed to be aliases for the same type; portable code
must assume that any two of them might be distinct types.
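
The point about plain char can be sketched as follows, assuming a C11 compiler. The enum and macro names are hypothetical. The generic selection below only compiles because the three character types are distinct; which of the other two plain char resembles is a separate, implementation-defined matter.

```c
#include <limits.h>

/* Hypothetical classification codes for the three character types. */
enum char_kind {
    KIND_OTHER,
    KIND_PLAIN_CHAR,
    KIND_SIGNED_CHAR,
    KIND_UNSIGNED_CHAR
};

/* This selection is only valid because char, signed char and unsigned char
   are three distinct types; listing two compatible types in one _Generic
   would be a constraint violation. */
#define CHAR_KIND(x) _Generic((x),          \
    char:          KIND_PLAIN_CHAR,         \
    signed char:   KIND_SIGNED_CHAR,        \
    unsigned char: KIND_UNSIGNED_CHAR,      \
    default:       KIND_OTHER)

/* Plain char shares its range with exactly one of the other two. */
int plain_char_behaves_as_signed(void)
{
    return CHAR_MIN < 0;
}
```

Note that a character constant such as 'a' has type int in C, so CHAR_KIND('a') falls through to the default branch.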

....

(I use 'int' and 'word' for signed/unsigned integers of natural width; int:N
and word:N for specific bitwidths, and int*N and byte*N for specific
bytewidths; while 'byte' is a synonym for word:8. For value ranges, I use

I strongly recommend against using non-standard definitions for terms
defined by the C standard such as 'int' and 'byte' in this newsgroup -
it can only lead to confusion. 'int' is a C standard type, which need
not have what you consider the "natural width". The C standard defines
'byte' as an "addressable unit of data storage large enough to hold any
member of the basic character set of the execution environment"; if
CHAR_BIT != 8, that doesn't match your definition.
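
A small sketch of the standard's notion of a byte, for illustration only. STORAGE_BITS and bits_per_byte are hypothetical names. The count includes any padding bits, so it describes storage rather than value range.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: storage bits occupied by a type, counted as
   bytes (in the C sense) times bits per byte. Includes any padding bits. */
#define STORAGE_BITS(type) (sizeof(type) * CHAR_BIT)

/* sizeof(char) is 1 by definition, so a C byte is CHAR_BIT bits wide. */
size_t bits_per_byte(void)
{
    return STORAGE_BITS(char);
}
```

The standard only guarantees CHAR_BIT >= 8, so code that needs exactly 8 bits per byte has to check for it.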

The standard doesn't define a meaning for 'word', so that's one you can
freely muck around with.

Malcolm McLean · Sep 12, 2012

On Wednesday, 12 September 2012 20:57:10 UTC+1, Bart wrote:

I thought Malcolm was suggesting the use of a single int type instead of the
four common widths supported by many CPUs (plus signed/unsigned variations).

Perhaps dealing with all my eight types isn't a big deal after all..

Most data is integers, strings, reals or enums. Beyond that level of granularity
it becomes very domain-specific. There's a tension between the needs of the
machine and the needs of the data. In a universal computer language, the line
would have to be drawn to emphasise the needs of the data.

mike3 · Sep 12, 2012

Kaz Kylheku wrote:

Most are actually pretty darned smart. But when they start on their "spiel",
I give it right back to them and then some, but they do not seem to be
learning from it. I think they see everything as a pissing contest. Have you
ever seen a C or C++ programmer admit that they were wrong or that someone
else actually has a good point? They see everything as a threat. I guess
that is what a focus on sports in schools and fraternities results in. Sigh.
I think this is why much of the world sees "Americans" as stupid. Which, of
course, begs the question, "Is USA's largest export, stupidity?".

1. So does this mean everyone in the entire "rest of the world" is "not stupid"
in this way, and _ONLY_ Americans are?

2. Are _all_ the C/C++ programmers who refuse to admit they're wrong or that
others' points are good, American?

BartC · Sep 13, 2012

James Kuyper said:
On 09/12/2012 03:36 PM, BartC wrote:

I strongly recommend against using non-standard definitions for terms
defined by the C standard such as 'int' and 'byte' in this newsgroup -
it can only lead to confusion.

That was an example of a different approach (in my own language) to dealing
with families of integer types. But this morning I had better luck in
finding the datatypes used by Go (a modern language with C-style syntax) and
that uses:

uint8, uint16, uint32, uint64
int8, int16, int32, int64

While C# uses:

byte, ushort, uint, ulong of sizes 8, 16, 32, 64
sbyte, short, int, long of sizes 8, 16, 32, 64

All pretty much corresponding to the 8 integer types that I said were
typical.

So where are size_t and ptrdiff_t amongst that lot? See, it's possible to
manage without them! So C's approach *does* seem untidy.

However, the context in this subthread was whether C was a suitable starting
point for a hypothetical universal language. Since such a language would
need to incorporate, amongst many other extremes, the type handling of Ada,
with that of Python, plus all the manipulations allowed by C, then that's
obviously a non-starter.

Ben Bacarisse · Sep 13, 2012

BartC said:
That was an example of a different approach (in my own language) to dealing
with families of integer types. But this morning I had better luck in
finding the datatypes used by Go (a modern language with C-style syntax) and
that uses:

uint8, uint16, uint32, uint64
int8, int16, int32, int64

Go also has int, uint and uintptr with implementation-defined widths.

While C# uses:

byte, ushort, uint, ulong of sizes 8, 16, 32, 64
sbyte, short, int, long of sizes 8, 16, 32, 64

(C# has a type char which is also considered to be an integral type.)

All pretty much corresponding to the 8 integer types that I said were
typical.

So where are size_t and ptrdiff_t amongst that lot? See, it's possible to
manage without them! So C's approach *does* seem untidy.

Go has some of these.

<snip>

Nick Keighley · Sep 13, 2012

On Wednesday, 12 September 2012 14:31:18 UTC+1, Nick Keighley wrote:
On Sep 12, 12:40 pm, Malcolm McLean wrote:

Most modern new languages go for curly braces and a superficially at least
C-like syntax.

but considerable differences in semantics

But there are exceptions, of course. If there were no exceptions
at all then saying "we need essential C-like syntax" would be as fatuous as
saying "we should standardise on Arabic numerals".

it wasn't the syntax I had a problem with. It was the semantic wish-list.
I suspect C-like syntax isn't the "best possible" syntax but it's
well known and widely used. I'm not sure what SPL is for.

James Kuyper · Sep 13, 2012

That was an example of a different approach (in my own language) to dealing
with families of integer types.

I strongly recommend conspicuously labeling all uses of such a different
approach in a C-oriented forum to prevent any possibility of confusion
with C's standard-defined terms that have the same spelling. For
instance, instead of 'int', write something like mylanguage::int.

....

So where are size_t and ptrdiff_t amongst that lot? See, it's possible to
manage without them! So C's approach *does* seem untidy.

C90's integer types have sizes that vary from one implementation to
another, which is why a typedef is needed for a type like C99's int32_t,
that has the same size on all platforms (or at least, on all platforms
where there is any supported integer type of that size). The example you
gave appeared to be of languages where the basic types have a fixed
size, which would remove the need for such typedefs. However, by the
same token, that creates the need for other typedefs: one for the
natural int type for a given platform, regardless of what size that type
is - call it natural_int or nint, corresponding to C's built-in 'int'
type (in C99, int_fast16_t is a clumsier representation of roughly the
same idea). Either way, if you have multiple similar types, you
sometimes need to have the type chosen vary with context, and you then
need something like C's typedef to record which of those types was
chosen. And that's where things like size_t and ptrdiff_t come in.
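
A minimal sketch of types named for how they are used rather than how they are represented. The function names are hypothetical; size_t and ptrdiff_t come from <stddef.h>.

```c
#include <stddef.h>

/* Count the elements equal to target. size_t can hold the size of any
   object, so it is wide enough to index any array on the platform. */
size_t count_equal(const int *a, size_t n, int target)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] == target)
            count++;
    }
    return count;
}

/* Offset, in elements, of p from base. Both pointers must point into
   (or one past the end of) the same array. */
ptrdiff_t element_offset(const int *base, const int *p)
{
    return p - base;
}
```

The same source compiles unchanged whether the implementation makes size_t 16, 32 or 64 bits wide; the typedef records the choice so the caller does not have to.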

However, the context in this subthread was whether C was a suitable starting
point for a hypothetical universal language. Since such a language would
need to incorporate, amongst many other extremes, the type handling of Ada,
with that of Python, plus all the manipulations allowed by C, then that's
obviously a non-starter.

If you're giving those languages as examples because you consider them
to be the best in each of those areas, you're probably aiming too high.
An "everything" language will necessarily involve a lot of compromises;
you'll be lucky if it handles types as well as the average language;
it's not likely to handle them as well as the language (whichever one
that is) that handles types best. The same is likely to be true of every
other desirable feature of the language.

Nick Keighley · Sep 13, 2012

Most data is integers, strings, reals or enums.

chars, bools

Nick Keighley · Sep 13, 2012

1. So does this mean everyone in the entire "rest of the world" is "not stupid"
in this way, and _ONLY_ Americans are?

2. Are _all_ the C/C++ programmers who refuse to admit they're wrong or that
others' points are good, American?

not all C or C++ programmers are American

BartC · Sep 13, 2012

Ben Bacarisse said:
Go also has int, uint and uintptr with implementation-defined widths.

OK, int and uint are described a few lines further down. They are just
integers with default width for that machine (which I suspect will be either
32 or 64 bits).

That's even closer to my approach (if no width is specified, it uses a default,
currently 32 bits).

uintptr will follow from that.

(That could be said to be similar to what C does with its int and intxx_t,
except that I believe that intxx_t types are defined in terms of ints
(short, long int etc) instead of the other way around.)

John Bode · Sep 13, 2012

On Wednesday, 12 September 2012 09:01:53 UTC+1, Nick Keighley wrote:

I'd say it needs essentially C syntax - loops, curly brackets, non-flow
control non-arithmetical functionality implemented as functions rather
than the core language. That's generally been accepted.

Haskell seems to do pretty well without loops or curly brackets.

Quicksort in Haskell:

quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
  where
    lesser = filter (< p) xs
    greater = filter (>= p) xs

Of course, there *is* no universal hammer, because the universe is too
goddamned big. What works really well for OS kernels and device drivers
would be painful to use for Web apps, which would in turn be painful to use
for serious number crunching.

Ben Bacarisse · Sep 13, 2012

BartC said:
OK, int and uint are described a few lines further down. They are just
integers with default width for that machine (which I suspect will be either
32 or 64 bits).

That's even closer to my approach (if no width is specified, it uses a default,
currently 32 bits).

uintptr will follow from that.

I wasn't commenting on your approach (I don't know what that is). I
thought you were citing Go as an example of a language that did without
implementation-defined integer sizes.

<snip>

Keith Thompson · Sep 13, 2012

BartC said:
That was an example of a different approach (in my own language) to dealing
with families of integer types. But this morning I had better luck in
finding the datatypes used by Go (a modern language with C-style syntax) and
that uses:

uint8, uint16, uint32, uint64
int8, int16, int32, int64

While C# uses:

byte, ushort, uint, ulong of sizes 8, 16, 32, 64
sbyte, short, int, long of sizes 8, 16, 32, 64

All pretty much corresponding to the 8 integer types that I said were
typical.

So where are size_t and ptrdiff_t amongst that lot? See, it's possible to
manage without them! So C's approach *does* seem untidy.

[...]

Sure, it's untidy, but IMHO necessary, at least for C.

Having a range of integer types with specified sizes can be
convenient, but they don't eliminate the need for types defined
in terms of how they're used rather than how they're represented.
C's size_t, in particular, is defined by its purpose (the result type
of sizeof); its width is chosen by the implementation.

There are several possible approaches that can be used in defining
built-in integer types for a language.

You can start with a set of types with exactly specified sizes,
as Go and C# do; you might then define other types in terms of those.

Or you can define a set of types with implementation-defined sizes,
as C has always done; C99 then added the intN_t types that are
defined in an implementation-defined manner in terms of those.
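
As a sketch of how code can cope with the exact-width types being optional, assuming a C99 <stdint.h>: the name sample32 is hypothetical, and the fallback relies on int_least32_t, which every conforming implementation must provide.

```c
#include <limits.h>
#include <stdint.h>

/* sample32 is a hypothetical name. <stdint.h> defines INT32_MAX exactly
   when it provides int32_t, so the macro doubles as a feature test. */
#ifdef INT32_MAX
typedef int32_t sample32;        /* exactly 32 bits, no padding */
#define SAMPLE32_IS_EXACT 1
#else
typedef int_least32_t sample32;  /* always available, at least 32 bits */
#define SAMPLE32_IS_EXACT 0
#endif

/* Storage width of the chosen type in bits. */
unsigned sample32_storage_bits(void)
{
    return (unsigned)(sizeof(sample32) * CHAR_BIT);
}
```

On a machine with 9-bit bytes, for example, the fallback branch would be taken and the type could be 36 bits wide.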

The former approach probably would not have been practical, say,
40 or 50 years ago, because there were existing systems whose
hardware-supported types were multiples of 6, 8, or 9 bits.
That's the environment in which C was developed.

The industry seems to have settled on 8, 16, 32, and 64 bits as the
usual sizes for hardware-supported integers, and 2's-complement
as the representation for signed integers. The designers of Go
and C# have chosen to build that assumption into their languages.
That probably made good sense, but I wonder how that decision will
look 40 or 50 years from now.
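
For comparison, C code that wants to know which signed representation it is running on can test for it. This is a sketch of a well-known idiom, with hypothetical names; it relies on the three representations that C11 permits.

```c
/* The two low bits of -1 differ under each representation that C11
   allows for signed integers, so -1 & 3 identifies which one is in use. */
enum signed_repr {
    SIGN_AND_MAGNITUDE = 1,  /* -1 is the sign bit plus ...0001 */
    ONES_COMPLEMENT    = 2,  /* -1 is ...1110 */
    TWOS_COMPLEMENT    = 3   /* -1 is ...1111 */
};

enum signed_repr signed_representation(void)
{
    return (enum signed_repr)(-1 & 3);
}
```

On mainstream hardware this returns TWOS_COMPLEMENT; the check exists because the language itself does not assume so.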

Keith Thompson · Sep 13, 2012

BartC said:
OK, int and uint are described a few lines further down. They are just
integers with default width for that machine (which I suspect will be either
32 or 64 bits).

That's even closer to my approach (if no width is specified, it uses a default,
currently 32 bits).

uintptr will follow from that.

(That could be said to be similar to what C does with its int and intxx_t,
except that I believe that intxx_t types are defined in terms of ints
(short, long int etc) instead of the other way around.)

The word "int" is not a collective term for the types short, int, and
long. Yes the word does appear in some forms of the names of the types
(short int, long int), but IMHO it's best *not* to think of "short int"
as a phrase in which "short" modifies "int". short int, int, and long
int are all *integer* types.

"int" is a single type (also known as "signed int"). "short" and "long"
are distinct types, whichever of their several names you use.

To address your point, the intN_t and uintN_t types are defined
as typedefs, i.e., as aliases for existing types. In most
implementations, the chosen existing types are predefined types
such as char, short, int, long, or long long, or their signed
or unsigned variants. They can also be defined as typedefs for
implementation-defined "extended" types, but I don't know of any
compiler that does so. Types such as short, int, and long are built
into the language, and syntactically, their names are composed of
keywords. See section 6.2.5 of the C standard for more information
about predefined types.
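
One way to see which predefined type a typedef such as int32_t resolves to is a C11 generic selection. The following is a sketch under the assumption that the implementation provides int32_t; SIGNED_TYPE_NAME and int32_underlying_name are hypothetical names.

```c
#include <stdint.h>

/* Name the standard signed integer type an expression has. If int32_t
   is a typedef for one of these, the matching branch is selected;
   otherwise the default branch reports something else. */
#define SIGNED_TYPE_NAME(x) _Generic((x),   \
    signed char: "signed char",             \
    short:       "short",                   \
    int:         "int",                     \
    long:        "long",                    \
    long long:   "long long",               \
    default:     "extended or other type")

/* Assumes the implementation provides int32_t at all. */
const char *int32_underlying_name(void)
{
    return SIGNED_TYPE_NAME((int32_t)0);
}
```

The answer differs between implementations, which is the reason portable code uses the typedef instead of the underlying type's name.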

Nick Keighley · Sep 14, 2012

On 09/13/2012 06:36 AM, BartC wrote:

If you're giving those languages as examples because you consider them
to be the best in each of those areas, you're probably aiming too high.
An "everything" language will necessarily involve a lot of compromises;
you'll be lucky if it handles types as well as the average language;
it's not likely to handle them as well as the language (whichever one
that is) that handles types best. The same is likely to be true of every
other desirable feature of the language.

which leaves me wondering what purpose this "everything language"
serves

Nick Keighley · Sep 14, 2012

chars, bools

functions, monads, continuations..
