Does your C compiler support "//"? (was Re: using structures)


Douglas A. Gwyn

Keith said:
The fundamental problem, I think, is that C's integer type system
isn't uniform enough. The type names are all keywords or sequences of
keywords with a very limited grammar, and it's far too easy for unwary
programmers to create dependencies on certain types being certain
sizes.

All width dependencies should be expressed in terms of the
<stdint.h> typedefs.
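A concrete sketch of what this looks like in practice (the variable
names are illustrative, not from the thread): the widths are stated in
the declarations themselves, and the matching <inttypes.h> macros
supply printf formats.

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint8_t octet = 0xFF;          /* exactly 8 bits, where the type exists */
    int32_t count = -123456;       /* exactly 32 bits, two's complement */
    uint_least16_t port = 8080;    /* at least 16 bits, always available */

    /* the PRI* macros give printf formats that match the typedefs */
    printf("%" PRIu8 " %" PRId32 " %" PRIuLEAST16 "\n", octet, count, port);
    return 0;
}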

It is certainly true that as machines have grown, C's type
system has become strained. It already caused trouble when
C89 legislated sizeof(char)==1, making it necessary to add
a whole duplicate set of character/string facilities.

Another problem is that as a historical artifact of the way
the PDP-11 implementation worked, the smallest types get
promoted to intermediate sizes way too readily. Personally
I'm not a fan of "mixed mode arithmetic", preferring that
all conversions be explicit, but that's not going to happen
for C. The integer promotion rules are a problem when
introducing new integer types, and there was quite an
argument about what they should be for long long and the
(optional) extended integer types.
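As a small illustration of the promotions at work (a sketch, not text
from the post): both unsigned char operands are widened to int before
the addition, so the intermediate result does not wrap at 8 bits.

#include <stdio.h>

int main(void)
{
    unsigned char a = 200, b = 100;
    int wide = a + b;              /* both operands promote to int: 300 */
    unsigned char narrow = a + b;  /* 300 wraps to 44 only on conversion back */
    printf("%d %d\n", wide, narrow);   /* prints: 300 44 */
    return 0;
}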

We discussed for a while the possibility of adding much
better support for "requirements-based integer type
specification", including endianness, representation, and
other attributes as well as the usual signedness and width,
but it was deemed too radical a change to be making until
we have seen results from a similar experimental
implementation (which we encouraged, but haven't yet seen).
We were able to agree upon <inttypes.h> and <stdint.h> due
to there being sufficient existing practice to judge its
merits. Now all we have to do is to get some recalcitrant
folks to use them when appropriate, i.e. work with us
instead of fighting us.
Keith said:
One possible solution that's consistent with what's been done so far
might be to allow a "short short" type. ...
(and maybe "long char" rather than "wchar_t").

In 1986 I proposed "short char" as the unit for sizeof.
At the time some new computer architectures were being
designed, and many of us wanted to see direct bit
addressability, but lacking any way to get at it from
portable high-level languages, management wouldn't allow
it.

While we could still add more general integer type
specification to C, it is really too late to fix all its
problems. I sincerely hope that designers of future
procedural programming languages will learn from the
mistakes of the past and do it right, but experience
suggests otherwise.
 

Douglas A. Gwyn

Keith said:
The point is that choosing to make "int" 16 bits on the 68000 is a
perfectly reasonable choice, though not the only one.

Indeed, the MC68000 compiler that I still use and maintain
has 16-bit ints.
 

Alexander Terekhov

Douglas A. Gwyn said:
That is even less reasonable! What POSIX *should* have done,
if there was a requirement for an octet type, was to specify
support for int8_t.

Funny. "The restriction that a byte is now exactly eight bits
was a conscious decision by the standard developers. It came
about due to a combination of factors, primarily the use of the
type int8_t within the networking functions and the alignment
with the ISO/IEC 9899:1999 standard, where the intN_t types are
now defined.

According to the ISO/IEC 9899:1999 standard:

The intN_t types must be two's complement with no padding
bits and no illegal values.

All types (apart from bit fields, which are not relevant here)
must occupy an integral number of bytes.

If a type with width W occupies B bytes with C bits per byte
(C is the value of {CHAR_BIT}), then it has P padding bits
where P + W = B * C.

Therefore, for int8_t P=0, W=8. Since B>=1, C>=8, the only
solution is B=1, C=8.

The standard developers also felt that this was not an undue
restriction for the current state-of-the-art for this version
of IEEE Std 1003.1, but recognize that if industry trends
continue, a wider character type may be required in the
future."

regards,
alexander.
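The arithmetic quoted above can be confirmed at compile time. A minimal
sketch (not part of Alexander's post), using the C99-era
negative-array-size trick: wherever int8_t exists, B must be 1 and C
must be 8.

#include <limits.h>
#include <stdint.h>

/* each typedef fails to compile (negative array size) if the
   corresponding equation from the rationale does not hold */
typedef char check_B_is_1[sizeof(int8_t) == 1 ? 1 : -1];
typedef char check_C_is_8[CHAR_BIT == 8 ? 1 : -1];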
 

Douglas A. Gwyn

Alexander said:
Funny. "The restriction that a byte is now exactly eight bits
was a conscious decision by the standard developers. It came
about due to a combination of factors, primarily the use of the
type int8_t within the networking functions and the alignment
with the ISO/IEC 9899:1999 standard, where the intN_t types are
now defined.

According to the ISO/IEC 9899:1999 standard:

The intN_t types must be two's complement with no padding
bits and no illegal values.

All types (apart from bit fields, which are not relevant here)
must occupy an integral number of bytes.

If a type with width W occupies B bytes with C bits per byte
(C is the value of {CHAR_BIT}), then it has P padding bits
where P + W = B * C.

Therefore, for int8_t P=0, W=8. Since B>=1, C>=8, the only
solution is B=1, C=8.

The standard developers also felt that this was not an undue
restriction for the current state-of-the-art for this version
of IEEE Std 1003.1, but recognize that if industry trends
continue, a wider character type may be required in the
future."


Yet another example (starting with the "feature test" macros
in 1988) where the POSIX people have not understood what was
suggested by the Standard C people. The networking functions
certainly should be using uint8_t since they explicitly deal
with octet streams, and POSIX can reasonably require support
for that typedef in the C implementation. The *logical*
consequence is that, as the C standard currently stands,
uint8_t is most likely implemented as a synonym for unsigned
char on POSIX-supporting systems. (int8_t might not be a
synonym for signed char, if the underlying hardware encourages
ones-complement or sign-magnitude representation.) The talk
about a "wider character type" seems oblivious to the
difference between the byte type, which is called "char" in C,
and the type used to encode the full character set, which is
called "wchar_t" in C. So long as "char" is the basic unit
of object addressability and POSIX insists on uint8_t, they
are going to be the same size on a POSIX-supporting platform.
We might give implementations more freedom in some future
revision of the C standard; at present, if they want a wider
byte type, as is reasonable when the memory exists in 16-bit
wide units, as it does on some systems I work with, or when
it is desired to use 16-bit Unicode with the older style of
character constants, string literals, and the "char" type,
then {u}int8_t cannot be supported. Frankly, this is an error
in the original C standard (byte=character) that we ought to
have fixed in 1989, and could conceivably fix some day
(although a lot of code now relies on that property and would
need to be cleaned up as implementations start to take
advantage of the postulated new flexibility.)

The problem with specifying that type "char" has a width of
8 bits is that then the programmers following those
guidelines will use "char" when "uint8_t" is what was meant.
That causes loss of portability to a potential future where
the two aren't necessarily coupled, and it hides the code's
requirements for an exactly-8 type by using a name that does
not universally have that meaning. (Thus, the code would
compile quietly on a different platform, and the inherent
bugs could be mysterious and hard to track down.) It is only
good programming practice to "say what you mean", thus uint8_t.
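To make the "say what you mean" advice concrete, a small sketch (an
illustration, not from the post): the exactly-8-bits requirement is
stated in the type itself, so an implementation that cannot satisfy it
rejects the code at compile time instead of quietly compiling it with
different semantics.

#include <stdint.h>
#include <stddef.h>

/* an octet checksum: uint8_t says the data is exactly 8 bits wide,
   which "char" does not promise on every platform */
uint8_t checksum(const uint8_t *buf, size_t len)
{
    uint8_t sum = 0;
    while (len-- > 0)
        sum = (uint8_t)(sum + *buf++);
    return sum;
}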
 

Dan Pop

Keith Thompson said:
If you assume that a "16-bit host" can only have 16-bit addresses (and
therefore no more than 64 kbytes of memory -- perhaps 64k each I and D
spaces), that's probably true. But it can make sense for a system to
have 16-bit ints and larger addresses. The 68000 is one example; I
think some of the earlier members of the x86 family also qualify.

As far as hosted implementations are concerned, both the 68000 and the
earlier members of the x86 family have been dead and buried for the last
15 years.

Dan
 

Dan Pop

Keith Thompson said:
Note that I was referring to the 68000, not necessarily to the later
versions of the 68k family.

Motorola's documentation and instruction format names refer to 16-bit
quantities as "words", and to 32-bit quantities as "long-words". That
suggests it's a 16-bit processor.

On the other hand, the data and address registers are 32 bits each,
which argues for it being a 32-bit processor.

Things are quite clear: the 68000 is a 32-bit processor at the
architecture level and a 16-bit processor at the microarchitecture
level. It has 32-bit general purpose registers and instructions operating
on them, but the ALU and internal data paths are 16-bit, therefore 32-bit
instructions are slower than their 16-bit counterparts.

If I were to implement C on the 68000, I'd support both 16-bit and
32-bit ints and let the user choose whichever int size suits his needs best.

Dan
 

Dan Pop

Douglas A. Gwyn said:
No, you mean only to conforming freestanding implementations.

Obviously: who cares about non-conforming implementations in comp.std.c?
There is a trade-off such that requiring too much doesn't
improve portability but rather discourages standard
conformance, and for freestanding implementations conformance
to the language portion is quite important while conformance
to the library is secondary.

Apparently, requiring double precision floating point support for
conforming freestanding implementations was already too much, as this
seems to be the first standard C feature that many freestanding
implementations don't support. So, if the committee was trying for
a trade-off, it failed!
The committee was not "stupid" just because they decided on
the basis of factors you personally don't care about. Since
the computing environment changes over time, trade-offs also
change, and after 20 years perhaps a different decision can
be made.

Some of the processors used in embedded control applications haven't
changed since the release of C89.
I suppose you must mean type-generic functions.

I'm not stupid enough to write <math.h> when I mean <tgmath.h>.
I was talking about the single precision versions of the usual math
functions that were missing from the C89 <math.h> but which are present
in the C99 <math.h>.

Douglas A. Gwyn said:
That raises the bar for compilers, and imposing it on small systems
that might not even have substantial use of floating point is
something that must be weighed carefully.

Nope, it doesn't raise any bar: many embedded control applications that
need floating point also need such functions and the implementor is
often in a better position for providing highly optimised versions
than the programmer. And the vast majority of embedded control
applications that need floating point arithmetic are perfectly happy
with single precision.
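For reference, these are the f-suffixed functions C99 added to
<math.h>, such as sinf and sqrtf. A minimal sketch of the kind of code
that benefits (an illustration, not from Dan's post): sqrtf keeps the
whole computation in single precision, with no round trip through
double.

#include <math.h>

/* root-mean-square of a sample buffer, entirely in float;
   on a float-only FPU this avoids software-emulated double */
float rms(const float *v, int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += v[i] * v[i];
    return sqrtf(sum / (float)n);
}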

So, a clever trade-off would have been to drop double precision support
and include the single precision contents of <math.h> in the requirements
for conforming freestanding implementations. And the clever way of doing
that without creating other problems (see the default argument promotion
rules) is to allow DBL_DIG and LDBL_DIG to be as low as 6 on freestanding
implementations (so that supporting double and long double comes at no
real cost to the runtime support). This way, if it makes sense on the
underlying hardware, the freestanding implementation can still provide
genuine double and long double support, while if it doesn't, double
and long double merely become aliases for float.

Dan
 

Kevin Easton

In comp.lang.c Keith Thompson said:
The fundamental problem, I think, is that C's integer type system
isn't uniform enough. The type names are all keywords or sequences of
keywords with a very limited grammar, and it's far too easy for unwary
programmers to create dependencies on certain types being certain
sizes.

One possible solution that's consistent with what's been done so far
might be to allow a "short short" type. Then you could have:

Who likes short shorts?

- Kevin.
 

Keith Thompson

Ben Pfaff said:
How about "long char"? "short long"?

You snipped my suggestion of "long char" as a replacement for "wchar_t".

"short long" is just too ugly and ambiguous.
 

Douglas A. Gwyn

Dan said:
So, a clever trade-off would have been to drop double precision support
and include the single precision contents of <math.h> in the requirements
for conforming freestanding implementations. And the clever way of doing
that without creating other problems (see the default argument promotion
rules) is to allow DBL_DIG and LDBL_DIG to be as low as 6 on freestanding
implementations (so that supporting double and long double comes at no
real cost to the runtime support). This way, if it makes sense on the
underlying hardware, the freestanding implementation can still provide
genuine double and long double support, while if it doesn't, double
and long double merely become aliases for float.

That proposal, if it were made at an appropriate time
(which it was not), would certainly merit serious consideration.
One likely counterargument is that programmers use type double
when they need more precision than C guarantees for type float,
and if they don't need the extra precision then they need to use
the appropriate type, rather than changing the rules for what
the types mean.
 

Paul Eggert

Douglas A. Gwyn said:
you don't have to waste time worrying about
them. Just use C's data types portably instead of making
unnecessary assumptions.

Making the assumptions saves us time. So they're useful.

The assumptions may not be strictly "necessary" by your definition of
"necessary", since we obviously could spend more of our time and
remove them. But we've got many more important things to do than
worrying about porting to 16-bit int hosts.

There is a similar situation with K&R C. Some people still insist on
writing code that is portable to K&R C compilers. They are few, but
valiant. They have to do extra work to maintain portability to those
ancient, nonstandard hosts. But I gave up porting to K&R C some time
ago, just as I gave up (long ago) on porting to 16-bit hosts. It's
not worth my time any more, in either case.

Nobody would even think that int always has 32 bits if you hadn't
been misinforming them.

The GNU project hasn't misinformed anybody; it's always been quite
clear in the coding standards that 16-bit int machines exist, but
they're not supported.

Perhaps you think POSIX 1003.1-2001 is misinforming people? I
seriously doubt it. I'd guess that fewer than 1 in 1000 C programmers
even know that POSIX 1003.1-2001 requires 32-bit int. And of the few
that do, I'd guess all of them know that C allows 16-bit int.
If they need 32 bits they should use another type.

In the POSIX world, 'int' is fine for that. 'long' has more problems
than 'int', as it might be too wide and might take up too much space.
The C99 int zoo isn't universal yet, so portable programs can't assume
those types yet. (And besides, most C programmers probably have
trouble keeping track of them all; they're too confusing.)
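A hedged sketch of how such a program can document the
at-least-32-bit-int assumption once, at compile time, rather than
auditing every use (an illustration, not from Eggert's post):

#include <limits.h>

/* refuse to build on hosts where int is narrower than 32 bits,
   mirroring the POSIX 1003.1-2001 requirement */
#if INT_MAX < 2147483647
#error "this program assumes int is at least 32 bits"
#endif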
there are better ways to obtain that property than by
limiting one's programs to only POSIX platforms.

Porting to non-POSIX platforms is not a goal for POSIX programmers,
so we by and large don't care about what you are calling "better ways".
They cost us extra work, so they're not better for us.
 

Paul Eggert

Douglas A. Gwyn said:
It hardly seems like a problem if some 16-bit-int POSIX
platforms happened to exist, and there was no logical
reason to preclude them.

This is not simply a matter of logic; it is a matter of pragmatics.
Any logical argument against POSIX 1003.1-2001 requiring at-least-32-bit int
applies equally well against C99 requiring at-least-16-bit int.
 

Keith Thompson

Paul Eggert said:
At Wed, 24 Sep 2003 04:35:14 -0400, "Douglas A. Gwyn"


This is not simply a matter of logic; it is a matter of pragmatics.
Any logical argument against POSIX 1003.1-2001 requiring at-least-32-bit int
applies equally well against C99 requiring at-least-16-bit int.

C99 requires at-least-16-bit int because the standard on which it's
based, C90, requires at-least-16-bit int. POSIX may have good reasons
for requiring at-least-32-bit int, but it doesn't have that one.
 

Dan Pop

Douglas A. Gwyn said:
That proposal, if it were made at an appropriate time
(which it was not), would certainly merit serious consideration.
One likely counterargument is that programmers use type double
when they need more precision than C guarantees for type float,
and if they don't need the extra precision then they need to use
the appropriate type, rather than changing the rules for what
the types mean.

Agreed. But the purpose of the proposal is to accommodate implementations
where double precision support doesn't make sense (and is not provided,
in practice) with a minimum of changes to the standard. Code that
requires genuine double precision support is not ported to such
implementations and it is trivial to check whether the double precision
support of a given implementation is enough for the program's needs or
not:

#if DBL_DIG < 14 || DBL_MAX_10_EXP < 100
#error This program requires better double precision support.
#endif

which is a good idea, anyway, given that the standard only guarantees
10 for DBL_DIG and 37 for DBL_MAX_10_EXP.

Dan
 

Nick Keighley

Douglas A. Gwyn said:
Keith Thompson wrote:

All width dependencies should be expressed in terms of the
<stdint.h> typedefs.

It is certainly true that as machines have grown, C's type
system has become strained. It already caused trouble when
C89 legislated sizeof(char)==1, making it necessary to add
a whole duplicate set of character/string facilities.

Another problem is that as a historical artifact of the way
the PDP-11 implementation worked, the smallest types get
promoted to intermediate sizes way too readily. Personally
I'm not a fan of "mixed mode arithmetic", preferring that
all conversions be explicit, but that's not going to happen
for C. The integer promotion rules are a problem when
introducing new integer types, and there was quite an
argument about what they should be for long long and the
(optional) extended integer types.

We discussed for a while the possibility of adding much
better support for "requirements-based integer type
specification", including endianness, representation, and
other attributes as well as the usual signedness and width,
but it was deemed too radical a change to be making until
we have seen results from a similar experimental
implementation (which we encouraged, but haven't yet seen).
We were able to agree upon <inttypes.h> and <stdint.h> due
to there being sufficient existing practice to judge its
merits. Now all we have to do is to get some recalcitrant
folks to use them when appropriate, i.e. work with us
instead of fighting us.


In 1986 I proposed "short char" as the unit for sizeof.
byte?


At the time some new computer architectures were being
designed, and many of us wanted to see direct bit
addressability, but lacking any way to get at it from
portable high-level languages, management wouldn't allow
it.

While we could still add more general integer type
specification to C, it is really too late to fix all its
problems. I sincerely hope that designers of future
procedural programming languages will learn from the
mistakes of the past and do it right, but experience
suggests otherwise.

what would be the right way to do it? Have any other languages
managed better? Ada?

Perhaps someone should write a book on "Arithmetic Types For
Procedural Languages". I'd read it...


--
Nick Keighley

Most Ada programmers would consider going out of your way to construct
an Ada program that had a potential buffer overflow not as a challenge,
but as a kind of pornography.
 
