C language portability support


Mike Wahler

asm said:
Hi All,

Like typedef, does C have further support for portability?


Yes, C has 'support' for portability, in that it
is by definition a portable, platform-independent language.
However, one must follow the language rules in order
to achieve said portability.

-Mike
 

Malcolm

asm said:
Like typedef, does C have further support for portability?
"typedef" is sort of a support for portability, in that it is often possible
to integrate with non-C libraries on different platforms by typedefing
parameters. For instance, Windows 3.1 used a 16-bit "WPARAM" and a 32-bit
"LPARAM". When Microsoft upgraded to Windows 95, processors were 32 bit, so
it made sense to make the "WPARAM" 32 bits as well. Because of the typedef,
programmers didn't need to rewrite code.
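A rough sketch of the idea (the names here are illustrative, not the actual
Windows declarations): the typedef is widened in one place, and client code
that uses the name recompiles unchanged.

#ifdef BUILD_16BIT
typedef unsigned short wparam_t;   /* 16-bit builds */
#else
typedef unsigned long  wparam_t;   /* 32-bit builds */
#endif

/* Callers use the name and never spell out the width. */
void post_message(wparam_t wparam);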

However this is an exception. Generally you should use int for numbers which
are unlikely to go above 30,000 (such as the number of pupils in a school),
and long for numbers unlikely to go above 2 billion (such as the number of
students in a university). Then the code will run acceptably on any
conforming system.

You will quite commonly see conditional defines in code. #ifdef
SUN_MICROSYSTEMS #define endianness 1 etc.
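In full, such a block might look roughly like this (the macro names are
illustrative only):

#if defined(SUN_MICROSYSTEMS)
#define HOST_BIG_ENDIAN 1
#elif defined(_WIN32)
#define HOST_BIG_ENDIAN 0
#else
#error "unknown platform: define HOST_BIG_ENDIAN by hand"
#endif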

Again this is largely a mistake, since it makes programs very difficult to
test, to read, and to maintain. However sometimes it is necessary,
particularly because the C runtime libraries are somewhat deficient, and
provide no support for display management.
 

Paul Hsieh

"typedef" is just a way of naming an arbitrary abstract data type. It
has nothing to do with portability per se (except that all C compilers
support it). You may be thinking of "#ifdef" which you misheard from
some conversation.
Yes, C has 'support' for portability, in that it
is by definition a portable, platform-independent language.

Excuse me, but C is *NOT* a platform-independent language. It
specifically states in the ANSI standard that certain operations such
as the right shift of signed integers, or the signedness of char are
platform specific. By the same token C is not a portable language.

Virtually every C compiler exposes important platform specific
functionality *by necessity* and the vast majority of C programs
specifically leverage these extensions which makes them non-portable.

Compare this with Java, where the only way to make a non-portable Java
program is to chew through insane amounts of memory (different
platforms will fault at different times), or have a race condition
(different timings will cause the race condition to manifest in
different ways).
However, one must follow the language rules in order to achieve said
portability.

No real world programmers deliberately follow any "rules" that would
make a C program completely portable. People may do so inadvertently
and usually if their program is completely trivial. Remember that the
real purpose of the C language is for making UNIX-style command line
utilities. Anything else you try to do with the language will, by
itself, usually lead you away from portability (just using floating
point on x86, where "double"s can have intermediate calculations of
either 64 or 80 bits depending on the compiler *version*), and just
ordinary "expected usage" (<isspace(0x80)> is UB, <int digit =
(char)cdig - '0';> is not portable) will drive you toward "platform
dependence" with respect to the C language.
 

Chris Torek

No real world programmers deliberately follow any "rules" that would
make a C program completely portable.

But good real-world C programmers will often follow many "rules" that
will make their C programs far more portable than if they fail to
follow them.
People may do so inadvertently

Sensible people do so deliberately.
and usually if their program is completely trivial.

"Complete" portability will indeed only result for a quite limited
subset of "all possible C programs", but not all such programs
should be labeled "trivial", in my opinion. (For instance, yacc/bison
and lex/flex are not "trivial", yet can be written entirely in
portable C.)
... Anything else you try to do with the language will, by
itself, usually lead you away from portability (just using floating
point on x86, where "double"s can have intermediate calculations of
either 64 or 80 bits depending on the compiler *version*),

True enough -- but the same problem occurs in other languages as
well. The x86 in particular is especially troublesome, because
"fast" code invariably leaves values inside the FPU, where they
have "too much" precision, even if you use precision-control code
to fiddle with the FPU control word. (In particular, the exponent
is always held in 80-bit format, even when the mantissa is rounded,
so that single and double precision values do not overflow to
infinity correctly.)
and just ordinary "expected usage" (<isspace(0x80)> is UB, <int digit =
(char)cdig - '0';> is not portable)

The isspace() function is defined over the domain union{EOF,
all_possible_unsigned_chars}, and UCHAR_MAX must be at least 0xff.
Since 0x80 (128) is between 0 and 255 inclusive, isspace(128) is
well-defined.
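(For what it is worth, the usual portable idiom when the argument starts
life as a plain char is to convert through unsigned char first; the
function name below is just for illustration.)

#include <ctype.h>

/* Convert plain char to unsigned char before handing it to <ctype.h>,
   so that negative char values never reach isspace(). */
int is_space_char(char c)
{
    return isspace((unsigned char)c);
}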

Likewise, if "cdig" is a value in '0' through '9', the line:

int digit = (char)cdig - '0';

is guaranteed to set digit to 0 through 9 correspondingly. Moreover,
if you write instead:

unsigned int digit = (unsigned)cdig - '0';

you can then test whether cdig was a valid digit character:

if (digit > 9)
    printf("%c was not a valid digit\n", cdig);

also with complete portability (given UCHAR_MAX <= UINT_MAX and
the rules for unsigned arithmetic).
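Put together as a self-contained sketch (the function name is just for
illustration):

/* Returns 0..9 for a digit character, -1 otherwise. */
int digit_value(char cdig)
{
    unsigned int digit = (unsigned int)cdig - '0';
    return digit > 9u ? -1 : (int)digit;
}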

There is nothing inherently wrong with using platform- or
machine-dependent code in C programs, of course. But comp.lang.c
is not the best place to talk about such code; instead, here we
can talk about how to *avoid* such code if and when it is reasonable
to do so (e.g., when dealing with text files, rather than graphical
interfaces).
 

Malcolm

Paul Hsieh said:
No real world programmers deliberately follow any "rules" that would
make a C program completely portable. People may do so inadvertently
and usually if their program is completely trivial.
You are confusing "trivial" with "doesn't manipulate hardware". A large
number of programs use a GUI, or network, or similar, however not all,
particularly if you are writing scientific or similar programs. Often the
only "non-portable" feature needed is directory support.
Also, a lot of programs have inherently portable components. For instance a
spell-checker is most useful when integrated with a word processor, but the
routine itself needn't have any platform dependencies.
 

pete

Chris Torek wrote:
Likewise, if "cdig" is a value in '0' through '9', the line:

int digit = (char)cdig - '0';

is guaranteed to set digit to 0 through 9 correspondingly.

I think (int) would be a better cast, if a cast is needed.
 

Mark F. Haigh

"typedef" is just a way of naming an arbitrary abstract data type. It
has nothing to do with portability per se (except that all C compilers
support it). You may be thinking of "#ifdef" which you misheard from
some conversation.

Are you being deliberately obtuse? Whether it has anything to do with
portability "per se", it's widely used to aid portability. The
current C standard even explicitly uses it for that purpose:

7.18.1.1 Exact-width integer types

1 The typedef name intN_t designates a signed integer type with
width N. Thus, int8_t denotes a signed integer type with a
width of exactly 8 bits.

2 The typedef name uintN_t designates an unsigned integer type
with width N. Thus, uint24_t denotes an unsigned integer type
with a width of exactly 24 bits.
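As a small sketch of the sort of thing these are for (assumes a C99
<stdint.h> is available; the function here is purely illustrative):

#include <stdint.h>   /* C99 */

/* Reassemble a little-endian 32-bit field from a byte buffer; the
   exact-width typedef documents the width independently of the
   platform's native int size. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}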
Excuse me, but C is *NOT* a platform-independent language. It
specifically states in the ANSI standard that certain operations such
as the right shift of signed integers, or the signedness of char are
platform specific. By the same token C is not a portable language.

Virtually every C compiler exposes important platform specific
functionality *by necessity* and the vast majority of C programs
specifically leverage these extensions which makes them non-portable.

What's your point? C can be used to write portable programs that can
run on anything from the largest supercomputers to tiny
microcontrollers. It can also be used to write hideously unportable
programs that break every time the compiler vendor issues a point
release.
Compare this with Java, where the only way to make a non-portable Java
program is to chew through insane amounts of memory (different
platforms will fault at different times), or have a race condition
(different timings will cause the race condition to manifest in
different ways).

To me, C is *much* more portable than Java. There are many, many
platforms out there that do not have anything even remotely
approximating a Java runtime, but have a competent C compiler.
No real world programmers deliberately follow any "rules" that would
make a C program completely portable. People may do so inadvertently
and usually if their program is completely trivial. Remember that the
real purpose of the C language is for making UNIX-style command line
utilities.
<snip>

Now you're just being silly. You follow the portability rules as much
as possible, and isolate platform specific code as much as possible.
Avoid undefined behavior, and isolate implementation defined behavior.

I suppose you'll find some reason that does not qualify as a "rule".


Mark F. Haigh
(e-mail address removed)
 

Chris Torek

I think (int) would be a better cast, if a cast is needed.

The cast is indeed not needed at all; I used "char" simply because that
was the original example I was quoting.

Note that while it *is* OK to assume that the digit-characters '0'
through '9' are contiguous and sequential -- i.e., '0' + 1 gives '1',
'1' + 1 gives '2', and so on through '8' + 1 == '9' -- it is *not*
OK, in terms of C portability, to assume that the same holds for
alphabetic characters. In particular, EBCDIC (used on some IBM
mainframe machines, for instance) has gaps between 'I' and 'J', and
again between 'R' and 'S': while 'H' + 1 gives 'I', you then have
to add 7 or 8 (I forget which) to 'I' to get 'J'.
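A sketch of the difference, under the assumptions just described (the
function names are illustrative):

#include <string.h>

/* Portable for digits ('0'..'9' are contiguous by definition): */
int digit_from_char(char c)
{
    return c - '0';
}

/* 'c - ''a''' is not portable for letters (it breaks on EBCDIC);
   an explicit table lookup is: */
int letter_index(char c)
{
    const char *letters = "abcdefghijklmnopqrstuvwxyz";
    const char *p = (c == '\0') ? NULL : strchr(letters, c);
    return p ? (int)(p - letters) : -1;
}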

Again, it is not *wrong* to simply "assume ASCII", but it does
limit portability. If the tradeoff you gain (whoever "you" may be
at the time, and whatever it is you gain) is worth the loss in
portability (however large or small you may think it is), go ahead
and give up the portability. Just be aware that you are making
the assumption "this system uses ASCII", and if possible, document
it or test for it.
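One cheap way to "test for it" is an assertion at startup; this is only a
sketch, and the specific values simply spell out the ASCII assumption.

#include <assert.h>

/* Document the "this system uses ASCII" assumption in executable form. */
void check_charset_assumption(void)
{
    assert('A' == 65 && 'a' == 97 && '0' == 48 && ' ' == 32);
}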

(These days one might even want to "assume Windows character set Q"
or "assume UTF-8" or some such, in various applications. Just be
aware that not every system has these things. If you *need* them,
you need them, and you have to discard portability in that part of
the program. If you can isolate the non-portable parts of the code
from the portable parts, that will help when it comes time to port
the program.)
 

Paul Hsieh

Are you being deliberately obtuse? Whether it has anything to do with
portability "per se", it's widely used to aid portability.

You mean the typedefs used in the header files for some C compilers? This is
a very marginal claim about creating portability. And more to the
point you typically can't use the include files from one compiler in
another. When you have platform specific functionality which you want
to expose in a portable way, you end up having to leverage #ifdef's in
combination with already existing portable interfaces. There's no
sense in which you rely in particular on typedef, other than using it
to potentially expose a common naming.
[...] The
current C standard even explicitly uses it for that purpose:

7.18.1.1 Exact-width integer types

1 The typedef name intN_t designates a signed integer type with
width N. Thus, int8_t denotes a signed integer type with a
width of exactly 8 bits.

2 The typedef name uintN_t designates an unsigned integer type
with width N. Thus, uint24_t denotes an unsigned integer type
with a width of exactly 24 bits.

What the hell has this got to do with portability? First of all, it's
in reference to a C standard that is not widely adopted at all. So in
fact use of intN_t will specifically destroy any chance of
portability. Secondly, integers in C don't have a prescribed
implementation (like 2s complement, for example), so the int16_t on
one platform doesn't have to behave the same way as it might on any
other platform (except maybe having the same "sizeof").
What's your point? C can be used to write portable programs that can
run on anything from the largest supercomputers to tiny
microcontrollers. It can also be used to write hideously unportable
programs that break every time the compiler vendor issues a point
release.

The point is that there is no clear dividing line between these two
classes of programs in C. And the vast majority of the functionality
of your platform will mostly be in these non-portable extensions.
To me, C is *much* more portable than Java. There are many, many
platforms out there that do not have anything even remotely
approximating a Java runtime, but have a competent C compiler.

You are confusing availability with portability. By this reasoning
assembly/machine language is even more portable since you have to
start there to bootstrap any C compiler.

To the best of my knowledge, in fact, there are no two C compilers in
existence which are portable with each other (Intel C++ and MSVC are
close since they actually share libraries and include files, but each
contains different semantics and extensions). By contrast there are
no two Java compilers/implementation which are *not* portable (modulo
bugs).
<snip>

Now you're just being silly. You follow the portability rules as much
as possible, and isolate platform specific code as much as possible.
Avoid undefined behavior, and isolate implementation defined behavior.

I suppose you'll find some reason that does not qualify as a "rule".

No, my claim is that such rules are just not followed. Who the hell
rigorously tests for signedness independence of "char"? C specifically
says that char can be either signed or unsigned. And who keeps track
of the difference between POSIX and ANSI? And if you are doing heavy
floating point, you can't do anything to isolate such platform
specific behaviors.
 

Paul Hsieh

Malcolm said:
You are confusing "trivial" with "doesn't manipulate hardware". A large
number of programs use a GUI, or network, or similar, however not all,
particularly if you are writing scientific or similar programs.

But I am also including the signedness of char, indeterminacy of
sizeof(int) and sizeof(double) and sizeof(float), what happens when
you right shift a signed integer, how do integers overflow, POSIX, and
so on.
[...] Often the
only "non-portable" feature needed is directory support.
Also, a lot of programs have inherently portable components. For instance a
spell-checker is most useful when integrated with a word processor, but the
routine itself needn't have any platform dependencies.

But a spell checker is not a complete application -- and like I said,
is really just comparable to a command line UNIX utility (in fact, it
*IS* a UNIX utility, as I recall.)
 

Paul Hsieh

Chris Torek said:
But good real-world C programmers will often follow many "rules" that
will make their C programs far more portable than if they fail to
follow them.


Sensible people do so deliberately.

But my point is that they *fail*. The reason is simple -- the code
continues to work and is correct for their platform even as they drift
to non-portability. So there is no feedback mechanism by which a
programmer can realize that what they are doing is non-portable.
Compare this to Java, where there is no choice about portability --
it just is.
"Complete" portability will indeed only result for a quite limited
subset of "all possible C programs", but not all such programs
should be labeled "trivial", in my opinion. (For instance, yacc/bison
and lex/flex are not "trivial", yet can be written entirely in
portable C.)

I said *usually*. Later on in my post I also made mention of UNIX
command line utilities (which include yacc/bison.)
True enough -- but the same problem occurs in other languages as
well. The x86 in particular is especially troublesome, because
"fast" code invariably leaves values inside the FPU, where they
have "too much" precision, even if you use precision-control code
to fiddle with the FPU control word. (In particular, the exponent
is always held in 80-bit format, even when the mantissa is rounded,
so that single and double precision values do not overflow to
infinity correctly.)

This is not exactly a correct characterization of the x86. The x86 has a
64-bit "mode flag" which, in fact, has been turned on in the modern x86
compilers. The problem is with the C *STANDARD* that says x86
compilers could do whatever they want (and did, since in the early 90s
the extra precision was seen as an advantage).
 

E. Robert Tisdale

Paul said:
"typedef" is just a way of naming an arbitrary abstract data type.

If you are *really* concerned about portability,
none of the built-in types should appear in the body of your code.
Instead, you should substitute your own type definitions
for all of the built-in types -- in a header file:

#include <stddef.h>   /* for size_t */
#include <limits.h>   /* for INT_MAX */

typedef int myInteger;
typedef float mySingle;
typedef double myDouble;
typedef size_t myExtent;
// etc.
#define myIntegerMax INT_MAX
// etc.

Remember that
portability is *not* about writing code that compiles everywhere.
It's about writing code that is easy to modify
for each target platform.
The best strategy is to *sequester* platform dependent code
into header files and subprograms
that are easy to locate and modify or replace.
 

Keith Thompson

E. Robert Tisdale said:
If you are *really* concerned about portability,
none of the built-in types should appear in the body of your code.
Instead, you should substitute your own type definitions
for all of the built-in types -- in a header file:

#include <stddef.h>   /* for size_t */
#include <limits.h>   /* for INT_MAX */

typedef int myInteger;
typedef float mySingle;
typedef double myDouble;
typedef size_t myExtent;
// etc.
#define myIntegerMax INT_MAX
// etc.

The "typedef double myDouble;", for example, is not useful unless
there's a possibility that myDouble could be defined as something
other than double (and if there is, myDouble is a poor choice of
name).

Similarly, if you need size_t, use size_t; that's what it's for.
Hiding behind your own typedef just obfuscates the code.
 

Malcolm

Paul Hsieh said:
What the hell has this got to do with portability? First of all, it's
in reference to a C standard that is not widely adopted at all. So in
fact use of intN_t will specifically destroy any chance of
portability. Secondly, integers in C don't have a prescribed
implementation (like 2s complement, for example), so the int16_t on
one platform doesn't have to behave the same way as it might on any
other platform (except maybe having the same "sizeof").
That is sadly the case. If C99 was widely implemented we could use int32_t
in the rare cases we actually require 32 bits, safe in the knowledge that
any platform that didn't support it was so weird that it wouldn't be worth
trying to run on it anyway.
The point is that there is no clear dividing line between these two
classes of programs in C. And the vast majority of the functionality
of your platform will mostly be in these non-portable extensions.
Of course. The stdin / stdout model isn't acceptable to many users, who will
demand a graphical interface. For a lot of applications, like games, this is
required.
ANSI could have done the development community a big favour by declaring a
struct POINT, so that 3d libraries all use the same structure for
co-ordinates, and a struct COLOUR (I suppose that would have to be COLOR) so
that images all use the same system of colour naming.

It then could have declared a few very simple functions, like OpenWindow()
to allow for management of graphics. Third parties would soon build
comprehensive libraries on the top of drawpixel(), querymouse(), and so on.
It still wouldn't be easy to implement a fully-functional GUI app, but
something like a Mandelbrot could be very simply written.
No, my claim is that such rules are just not followed. Who the hell
rigorously tests for signedness independence of "char"? C specifically
says that char can be either signed or unsigned. And who keeps track
of the difference between POSIX and ANSI? And if you are doing heavy
floating point, you can't do anything to isolate such platform
specific behaviors.
The question is, by "portability" do we mean bit-for-bit correspondence
between input and output, or do we mean equivalent functionality? For
instance I might write a calculator-type program that calculates pi and
stores it in a double. Only occasionally would you demand the exact same
output; normally, as long as the last digit is within one of the answer and
there is a reasonable degree of precision, the program would be accepted as
"correct".

On the other hand, if you are writing a video game, a common problem is
"stitching", or one-pixel wide holes in the image caused by rounding errors.
This does mean that it is impossible to test routines that use floating
point on one platform, and expect them to work correctly on another.
 

CBFalconer

Paul said:
.... snip ...

But my point is that they *fail*. The reason is simple -- the
code continues to work and is correct for their platform even as
they drift to non-portability. So there is no feedback mechanism
by which a programmer can realize that what they are doing is
non-portable. Compare this to Java, where there is no choice
about portability -- it just is.

Nonsense. If they simply use a C compiler with the appropriate
warning level most of those non-portabilities will insist on being
noticed. The simple expedient of aliasing the compiler to a run
with the appropriate options set tends to enforce this. Now the
user has to make an effort to defeat the checks and warnings.
 

Flash Gordon

On 25 Sep 2004 21:24:50 -0700
(e-mail address removed) (Mark F. Haigh) wrote:


You are confusing availability with portability. By this reasoning
assembly/machine language is even more portable since you have to
start there to bootstrap any C compiler.

To the best of my knowledge, in fact, there are no two C compilers in
existence which are portable with each other (Intel C++ and MSVC are
close since they actually share libraries and include files, but each
contains different semantics and extensions). By contrast there are
no two Java compilers/implementation which are *not* portable (modulo
bugs).

That would be why one Java application I have won't run on the Java
implementations available for SCO 5.0.5 and earlier (it's not due to
bugs, it's due to changes in the JVM between versions), but only on the
Java implementations available on SCO 5.0.6 & 5.0.7. This is in stark
contrast to many C programs, some of which I regularly compile on SCO
5.0.7 and customers then run on SCO 5.0.5 without problems.
No, my claim is that such rules are just not followed. Who the hell
rigorously tests for signedness independence of "char"? C specifically
says that char can be either signed or unsigned.

I use unsigned char when I need it, when I don't care I use plain char.
And who keeps track
of the difference between POSIX and ANSI?

The subset of C programmers here who program for POSIX systems, for a
start. I always know when a function I am using is POSIX rather than
ANSI.
And if you are doing heavy
floating point, you can't do anything to isolate such platform
specific behaviors.

If you need more than the minimum provided by the standard you can test
(at compile time) for it.
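For example, with <limits.h> (the particular thresholds and macro names
here are only illustrative):

#include <limits.h>

/* Detect at compile time whether plain char is signed. */
#if CHAR_MIN < 0
#define PLAIN_CHAR_IS_SIGNED 1
#else
#define PLAIN_CHAR_IS_SIGNED 0
#endif

/* Refuse to compile where int cannot hold +/- 2 billion
   (the standard only guarantees a 16-bit int). */
#if INT_MAX < 2147483647
#error "this code assumes int is at least 32 bits"
#endif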

Portability is not a simple issue, I know because I and others in the
company I work for do cross-platform development in a variety of
languages including Java.
 

Keith Thompson

Malcolm said:
That is sadly the case. If C99 was widely implemented we could use int32_t
in the rare cases we actually require 32 bits, safe in the knowledge that
any platform that didn't support it was so weird that it wouldn't be worth
trying to run on it anyway.

You can use int32_t anyway; <inttypes.h>, or an equivalent, is easy to
implement in C90 (with the possible exception of the 64-bit types).
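A sketch of such an equivalent (the underlying types are an assumption
about the target; a real header would choose them per platform, typically
via the build system):

/* A C90 stand-in for the C99 exact-width names (assumes short is
   16 bits and int is 32 bits on the target). */
#ifndef FALLBACK_STDINT_H
#define FALLBACK_STDINT_H

typedef short          int16_t;
typedef unsigned short uint16_t;
typedef int            int32_t;
typedef unsigned int   uint32_t;

#endif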
Of course. The stdin / stdout model isn't acceptable to many users, who will
demand a graphical interface. For a lot of applications, like games, this is
required.
ANSI could have done the development community a big favour by declaring a
struct POINT, so that 3d libraries all use the same structure for
co-ordinates, and a struct COLOUR (I suppose that would have to be COLOR) so
that images all use the same system of colour naming.

It then could have declared a few very simple functions, like OpenWindow()
to allow for management of graphics. Third parties would soon build
comprehensive libraries on the top of drawpixel(), querymouse(), and so on.
It still wouldn't be easy to implement a fully-functional GUI app, but
something like a Mandelbrot could be very simply written.

I'm not convinced that functions like OpenWindow() and drawpixel() can
be specified portably enough to justify inclusion in the language
standard. Interfaces have been specified that are supposed to enable
code that's portable between MS Windows and X11, but as far as I know
they haven't really caught on. If there's some existing practice,
perhaps it could be added to the standard -- or, perhaps more
sensibly, made into a separate standard. (Perhaps there is such a
widespread de facto standard that I don't know about; does the Java
stuff qualify?)
 

Dan Pop

In said:
The "typedef double myDouble;", for example, is not useful unless
there's a possibility that myDouble could be defined as something
other than double (and if there is, myDouble is a poor choice of
name).

Not necessarily. Imagine an implementation where float and double share
the same representation and you need long double to get the additional
precision implied by the (English) word "double".

E.g. on a system with 48-bit words, one machine word would match the
C requirements for both float and double, so the implementor chooses
to provide more than that (a two word floating point type) only for
long double, because this type is a lot slower (no hardware support).

Dan
 

Keith Thompson

Not necessarily. Imagine an implementation where float and double share
the same representation and you need long double to get the additional
precision implied by the (English) word "double".

E.g. on a system with 48-bit words, one machine word would match the
C requirements for both float and double, so the implementor chooses
to provide more than that (a two word floating point type) only for
long double, because this type is a lot slower (no hardware support).

Perhaps, but I think a lot of programmers would automatically
associate the name "myDouble" with C's type "double". If I wanted to
create typedefs for small and large floating-point types, regardless
of what the C compiler chooses to call them, I might use terms like
"small_float" and "large_float" to avoid confusion.

A typedef whose name is too close to a predefined type name is
probably either redundant (if it's for the same type) or confusing (if
it isn't).

To take another example, "typedef unsigned char byte;" is more
reasonable than "typedef unsigned char uchar;", IMHO. (One might
argue that "typedef unsigned char byte;" is redundant, but I might use
such a typedef to distinguish between an unsigned char holding a
character value and an unsigned char holding a fundamental unit of
storage, a distinction that C's type system, for convoluted historical
reasons, doesn't make very well.)
 
