Typecasting in C

Keith Thompson

Harti Brandt said:
On Tue, 29 Jun 2004, Arthur J. O'Dwyer wrote: [...]
AJO> Who's been using "%d" or "%x" to print *pointer* values?

%p is a quite new feature for printf(). Neither V7 nor BSD had this, so
the natural way of printing pointers was %x. Don't assume that everybody
out there does a daily update of its compilers and libraries to the
current gcc.

It's not *that* new; it was introduced in the 1989 ANSI C standard, at
the same time prototypes were introduced. It did take several years
for ANSI C to catch on widely enough for "%p" to be considered
portable, but certainly any code written in the last decade or so
should use it.
 
Jens.Toerring

jacob navia said:
I think lcc-win32 will eventually migrate to
sizeof(void *) == 32, sizeof(int) == 32
sizeof( __long void *) == 64, sizeof (long long) == 64.
For the few data items where you may need more than
4GB of addressing space a special pointer type would be more efficient

Do we have to expect the return of the dreaded far and near pointer
issues in a new disguise? Please tell me you don't even consider
something like that....
Regards, Jens
 
jacob navia

> Do we have to expect the return of the dreaded far and near pointer
> issues in a new disguise? Please tell me you don't even consider
> something like that....

Well, if you do not want that, you should use all 64 bit
pointers. I will allow a switch for that since anyway I have
implemented a 64 bit only mode first.

A 32 bit mode with all pointers 32 bit by default is
more efficient in space and also in time for many applications.

This will not work very well if the added complexity doesn't justify
the performance gains. But for *many* existing and many
complex programs, a 64 bit pointer is overkill and 32 bits
will suffice wonderfully. Not all software is handling big
database applications. Why carry all those zeroes around?
 
Andrey Tarasevich

jacob said:
The expression

printf("the address is: 0x%x\n",p);

where p is some pointer appears in several million lines in
existing code.
...

This simply means that authors of such code are in desperate need of
additional education in C programming (even though they might not
realize that). In this particular case the problem with the code is not
hypothetical, it is very real. Just consider what will happen on 64-bit
platform with 32-bit ints.
 
Keith Thompson

jacob navia said:
> "Keith Thompson" wrote: [...]
>
> If they set the debug level to high then yes. If not, the compiler
> accepts it. I think this discussion has shown me that
> maybe a warning is needed, if the programmer wishes.

In my opinion, the warning is needed whether the programmer asks for
it or not.

> lcc-win32 (as its name implies) is a 32 bit system. I am working
> in the 64 bit version already, but that is another topic.

The name also implies that it's a C compiler. In C, calling printf()
with a "%d" or "%x" format and a pointer argument invokes undefined
behavior.

[...]
> Yes. I started porting the code and it is hard. Warnings like this
> could improve the situation. In any case in the 64 bit system a warning
> will be issued since sizeof(int) != sizeof(void *).

A call like
printf("p = 0x%x\n", some_pointer_value);
isn't correct on a 32-bit system and incorrect on a 64-bit system.
It's incorrect in C, regardless of the particular implementation; it
just happens to "work" on some 32-bit systems (or more generally, on
some, but not all, systems where ints and pointers happen to be the
same size).

[...]
> Yes, I think I will be forced to write that in the 64 bit system.

Why wait?

> I am wary of forcing casts to please the compiler... In a 32
> bit system where sizeof void * is the same as sizeof int, this
> warning makes no sense really, and should be optional.

Your wariness about casts is appropriate. Arguments to printf() are
among the rare cases where they're necessary (the prototype doesn't
force an implicit conversion, so you need an explicit one). Even
passing a pointer argument for a "%p" format calls for a cast to void*
(though it's probably not strictly necessary for char*).

> Do not clutter output. Filter it. Make more verbose
> options available but choose a sensible default:
>
> do not clutter...

A warning about incorrect code is not clutter.
 
Christian Bau

jacob navia said:
The rationale is that most programs do have a certain use of the
extra registers provided by the new architecture, but will never
need more than 4GB address range for most pointers.

So why would you need sizeof (void *) == 32? Don't you think that 256
bit pointers are a bit excessive and 256 bit int is a bit much as well?
 
Arthur J. O'Dwyer

> Well, if you do not want that, you should use all 64 bit
> pointers. I will allow a switch for that since anyway I have
> implemented a 64 bit only mode first.
>
> A 32 bit mode with all pointers 32 bit by default is
> more efficient in space and also in time for many applications.
>
> This will not work very well if the added complexity doesn't justify
> the performance gains. But for *many* existing and many
> complex programs, a 64 bit pointer is overkill and 32 bits
> will suffice wonderfully. Not all software is handling big
> database applications. Why carry all those zeroes around?

I second Jens' comment. Have you had any experience with the
'far'/'near' morass? It's exactly isomorphic to your '__long'
proposal, except that 'far' and 'near' were dinosaurs of the
16-to-32-bit transition, and your proposal is a dinosaur of the
32-to-64-bit transition. It ought never to see the light of day.

(On the other hand, I don't think there's much real danger that
anyone will use lcc-win32 as their primary source-developing
platform, in the same way Borland[1] dominated many areas during the
last ice age. And I do have a perverse fascination with archaic
hacks like 'far'/'near'. So maybe I would prefer to see Jacob
struggle this one out... ;)

The "correct" solution, as far as I'm concerned, is to go ahead
and use 32-bit pointers whenever possible --- but *hide this fact
from the user*! That is, have only a single type 'void*', but let
it be 32 bits most of the time and 64 bits when necessary. If
it's too hard to figure out when 32 bits are truly sufficient, then
go the "memory model" route (which you already suggested as an
alternative). But steer clear of 'far' and 'near'!

-Arthur

[1] - It *was* "Turbo <PLOC>" that introduced 'far' and 'near',
wasn't it, and then Microsoft picked it up? Or was it invented
several times independently?
 
Arthur J. O'Dwyer

Keith Thompson said:
> "jacob navia" writes:
> [re: warning about 'printf("%d", cptr)']
>
> In my opinion, the warning is needed whether the programmer asks for
> it or not.

Especially for a reason Dan Pop pointed out twice, which seems to
have been ignored in all the 64-bit "clutter";) ...

    char *p = "foo";
    if (should_we_print_foo)
        printf("%d\n", p);

Whoops! The programmer meant to type "%s", but his finger slipped
from the 's' to the adjacent 'd' key, and his program will have a
subtle bug. Not so subtle if "foo" is an important prompt, but
suppose "foo" is an error message that only appears during weekly
progress meetings? ;)

"Oh, that's no problem... I'll fix the typo and recompile," says
the programmer.

    char *p = "foo";
    if (should_we_print_foo)
        printf("%x\n", p);

Whoops! Dang that 's' key --- this time I hit 'x' by mistake instead!
And again lcc-win32 conspires to hide my mistake.

This is a system of compiler diagnostics truly worthy of the DS9000,
if it will hide *only* those 'printf' errors which could conceivably
be typing mistakes! (I bet it warns about printf("%d") on a float,
or vice versa, though, so it's not yet perfect. ;)

-Arthur
 
Keith Thompson

jacob navia said:
I think lcc-win32 will eventually migrate to

sizeof(void *) == 32, sizeof(int) == 32
sizeof( __long void *) == 64, sizeof (long long) == 64.

Assuming that by sizeof(foo) you mean sizeof(foo)*CHAR_BIT ...

So you're not planning to allow code compiled by lcc-win32
(lcc-win64?) to interface easily to code compiled by other compilers?
Like, say, the operating system?
 
Richard Bos

jacob navia said:
In any case this discussion was positive for me (and lcc-win32).
I have been able to improve lcc-win32 a bit.

I hope you have the decency to credit comp.lang.c in your documentation
for any correct parts of that compiler.

Richard
 
Harti Brandt

On Tue, 29 Jun 2004, Default User wrote:

DU>Harti Brandt wrote:
DU>
DU>> %p is a quite new feature for printf().
DU>
DU>You must have an interesting definition of "quite new" in that it
DU>appeared in the 1989 standard and K&R 2.

In the standards world 15 years is quite new :). From time to time I even
compile v7 code on modern platforms. Don't forget that there is an awful
lot of old code still running. There are still pdp11 systems running
production code. In the bank I was working a couple of years ago they
discovered (while preparing for Y2K) that they have code from the 60s
running since then every day with most of the programmers already dead or
almost so.

So definitely there is a lot of code written when %p wasn't there yet in
many compilers.

DU>> Neither V7 nor BSD had this, so
DU>> the natural way of printing pointers was %x. Don't assume that everybody
DU>> out there does a daily update of its compilers and libraries to the
DU>> current gcc.
DU>
DU>One needn't have a daily update, just one from, oh, the last 10 years or
DU>so.

I just tried to explain that if you have a program that works and compiles
with an old compiler you just don't want to port your program to
adhere to the standards.

Sure, when writing new software or re-writing you better do what the
standards say - there is already too much crappy (from the porting
point of view) software out there. Many programmers unfortunately
didn't learn the 16 -> 32 bit lesson and happily stuff pointers into ints.

harti
 
Dan Pop

Harti Brandt said:
> %p is a quite new feature for printf().

Yeah, it's "only" 15 years old.

> Neither V7 nor BSD had this,

How much code developed back then is still used *as such*?

> so the natural way of printing pointers was %x.

Nope, this was a mistake even back then. Read K&R1. The *correct* way
of printing pointers was converting them to unsigned integers and using
whatever conversion was appropriate for unsigned integers.

> Don't assume that everybody
> out there does a daily update of its compilers and libraries to the
> current gcc.

OTOH, it is a fair assumption that all code originally developed before
1989 that is still being used *and* maintained, has been converted to
ANSI C long ago. Especially considering the existence of tools that
automate most of the process and the advantages of maintaining standard
C code.

Dan
 
Dan Pop

jacob navia said:
> As I mentioned in another thread, I started learning C using those printf
> statements for debugging since there wasn't any debugger in those times.

Without casting the pointers, those printf calls have *always* been
incorrect and worked by pure accident.

> And the usage became a habit.

There are good habits and bad habits. It is sheer stupidity to defend
the bad ones.

> It has been working for more or less 20 years.

And it stopped working (portably) 12 years ago, when DEC released an
implementation with 32-bit integers and 64-bit pointers.

Not to mention the AS/400 oddity...

> Is that a terrible sin?

YES!

> Those printf statements never survived a lot anyway. Why
> should I bother?

But you *do* bother, as proved in this very thread.

> Is this extremely important?

The quality of your diagnostics is entirely up to you. But if you make
the wrong choices, expect your users to discover the benefits of gcc.

> Maybe. The fact that I check printf (as few compilers do
> actually) is ignored, and an oversight is amplified.

Your main competition, gcc, does a better job than you do and this is the
only thing that matters. No one is going to leave your implementation in
favour of a worse one ;-)

Dan
 
Dan Pop

> [1] - It *was* "Turbo <PLOC>" that introduced 'far' and 'near',
> wasn't it, and then Microsoft picked it up?

Nope, they were a Microsoft invention that became standard in the MSDOS
world. Believe it or not, they served an excellent purpose at the time.
People who didn't care about the performance of their code on those
slow 8088 chips, could always use the huge memory model and pretend that
they're programming on an architecture with a linear address space.

Strictly speaking, the 8086 has a 20-bit linear address space.
It's just that it cannot access it as such, in software, because its
address registers have only 16 bits, hence the need for the segment-offset
pairs.

Dan
 
Keith Thompson

Harti Brandt said:
In the standards world 15 years is quite new :). From time to time I even
compile v7 code on modern platforms. Don't forget that there is an awful
lot of old code still running. There are still pdp11 systems running
production code. In the bank I was working a couple of years ago they
discovered (while preparing for Y2K) that they have code from the 60s
running since then every day with most of the programmers already dead or
almost so.

So definitely there is a lot of code written when %p wasn't there yet in
many compilers.

If I compile such code with a modern compiler, I still want to see
warnings about printf format mismatches; some of them are going to
indicate real problems. If I don't want the warnings for some reason,
I can always set an option to disable them (or run "grep -v" on the
log file).

The default behavior shouldn't be optimized for compiling 20-year-old
code.
 
Harti Brandt

On Wed, 30 Jun 2004, Dan Pop wrote:

DP>
DP>>%p is a quite new feature for printf().
DP>
DP>Yeah, it's "only" 15 years old.
DP>
DP>>Neither V7 nor BSD had this,
DP>
DP>How much code developed back then is still used *as such*?

Well, although this now starts to be off-topic: there is a lot of old code
out there. There is still a company marketing pdp11 and the associated OSs
(RSX, RT) because there is still a lot of software that does its duty.
While I was working for a bank some 10 years ago, they discovered (while
preparing for year 2k) that they still had a lot of assembler code from
the 60s running that does some of the nightly jobs (with programmers not
available anymore of course). Yes, that's not C-code, but there is also
old C-code. Only two or three years ago FreeBSD removed the _P() macro
from the kernel after several years of yearly discussion. The port to
sparc64 uncovered a lot of instances of the sizeof(int) == sizeof(void *)
problem and I assume there are still such problems lingering around. Some
of the FreeBSD programs still have no prototypes...

DP>>so the natural way of printing pointers was %x.
DP>
DP>Nope, this was a mistake even back then. Read K&R1. The *correct* way
DP>of printing pointers was converting them to unsigned integers and using
DP>whatever conversion was appropriate for unsigned integers.

So that would be %x, wouldn't it? I assumed that you would write

printf("%x", (unsigned)ptr);

But, OTOH, if you look at the v7 code (utilities I mean, the kernel was a
bit cleaner in this regard), you'll notice that they didn't much
care about casts. It was natural to stuff integers into pointers and
pointers into integers, applying -> to pointers that point to a different
struct (the compiler had a single name space for field names), even to ->
integers (as far as I remember).

I once needed the dmr compiler to be compiled by gcc. That was a hard job
:)

DP>
DP>>Don't assume that everybody
DP>>out there does a daily update of its compilers and libraries to the
DP>>current gcc.
DP>
DP>OTOH, it is a fair assumption that all code originally developed before
DP>1989 that is still being used *and* maintained, has been converted to
DP>ANSI C long ago. Especially considering the existence of tools that
DP>automate most of the process and the advantages of maintaining standard
DP>C code.

I don't think so. My experience (at least in the industry, not the
academic world) is that one won't change a program just to conform to a
standard, but rather use the old compiler, if there really needs to be a
change in the program. Nobody (at least in a German company) would give
you a EU-cent for this :-(

But sure, there is no reason to make these mistakes nowadays.

harti
 
Harti Brandt

On Wed, 30 Jun 2004, Dan Pop wrote:

DP>
DP>
DP>>It has been working since more or less 20 years.
DP>
DP>And it stopped working (portably) 12 years ago, when DEC released an
DP>implementation with 32-bit integers and 64-bit pointers.
DP>
DP>Not to mention the AS/400 oddity...

Or the segmented compiler (based on pcc) running on a Z8000 under
SystemIII. That one had 16bit ints and 32bit (really 24 bit with an unused
byte in the middle :) pointers. I still have one of those.

harti
 
Harti Brandt

On Wed, 30 Jun 2004, Keith Thompson wrote:

KT>[...]
KT>> In the standards world 15 years is quite new :). From time to time I even
KT>> compile v7 code on modern platforms. Don't forget that there is an awful
KT>> lot of old code still running. There are still pdp11 systems running
KT>> production code. In the bank I was working a couple of years ago they
KT>> discovered (while preparing for Y2K) that they have code from the 60s
KT>> running since then every day with most of the programmers already dead or
KT>> almost so.
KT>>
KT>> So definitely there is a lot of code written when %p wasn't there yet in
KT>> many compilers.
KT>
KT>If I compile such code with a modern compiler, I still want to see
KT>warnings about printf format mismatches; some of them are going to
KT>indicate real problems. If I don't want the warnings for some reason,
KT>I can always set an option to disable them (or run "grep -v" on the
KT>log file).
KT>
KT>The default behavior shouldn't be optimized for compiling 20-year-old
KT>code.

Sure.

harti
 
Dan Pop

Harti Brandt said:
> On Wed, 30 Jun 2004, Dan Pop wrote:
>
> DP>
> DP>>%p is a quite new feature for printf().
> DP>
> DP>Yeah, it's "only" 15 years old.
> DP>
> DP>>Neither V7 nor BSD had this,
> DP>
> DP>How much code developed back then is still used *as such*?
>
> Well, although this now starts to be off-topic: there is a lot of old code
> out there. There is still a company marketing pdp11 and the associated OSs
> (RSX, RT) because there is still a lot of software that does its duty.

You're talking about legacy applications, most of them existing only in
binary form, not about applications that are under active maintenance.
Therefore, in most cases, you don't even need a C compiler for the
machines you're talking about. Not to mention that C has never been the
programming language of choice under either RSX or RT (yes, I used
DECUS C for RSX-11 and I know what I'm talking about).

> While I was working for a bank some 10 years ago, they discovered (while
> preparing for year 2k) that they still had a lot of assembler code from
> the 60s running that does some of the nightly jobs (with programmers not
> available anymore of course). Yes, that's not C-code, but there is also
> old C-code.

Is this old C code used on new platforms and being maintained? Its
mere existence has no bearing on what compilers developed *today* should
do.

> Only two or three years ago FreeBSD removed the _P() macro
> from the kernel after several years of yearly discussion. The port to
> sparc64 uncovered a lot of instances of the sizeof(int) == sizeof(void *)
> problem and I assume there are still such problems lingering around. Some
> of the FreeBSD programs still have no prototypes...

They will get them, as soon as they need to be maintained (if ever).
Unless they are legacy applications, superseded by other programs (which
could explain why no one bothered to touch them).

> DP>>so the natural way of printing pointers was %x.
> DP>
> DP>Nope, this was a mistake even back then. Read K&R1. The *correct* way
> DP>of printing pointers was converting them to unsigned integers and using
> DP>whatever conversion was appropriate for unsigned integers.
>
> So that would be %x, wouldn't it?

Or %o (the usual choice on the PDP-11) or even %u, depending on the
programmer's taste.

> I assumed that you would write
>
> printf("%x", (unsigned)ptr);

ONLY if the implementation didn't support the unsigned long type. The
most sensible choice would be

printf("%lx", (unsigned long)ptr);

Think about "weird" platforms like the 8086, with its 16-bit int's and
32-bit pointers... It predated standard C by more than one decade.

> But, OTOH, if you look at the v7 code (utilities I mean, the kernel was a

Why should I bother? Is such code in any way relevant to the behaviour
of compilers developed today? Are you sure you remember the topic
of this subthread?

> I once needed the dmr compiler to be compiled by gcc. That was a hard job
> :)

Why would you expect otherwise? The dmr compiler was written in whatever
happened to be the definition of the C language at the time. gcc
implements a different language specification, even in -traditional mode.

> DP>>Don't assume that everybody
> DP>>out there does a daily update of its compilers and libraries to the
> DP>>current gcc.
> DP>
> DP>OTOH, it is a fair assumption that all code originally developed before
> DP>1989 that is still being used *and* maintained, has been converted to
                                   ^^^^^^^^^^^^^^^^
> DP>ANSI C long ago. Especially considering the existence of tools that
> DP>automate most of the process and the advantages of maintaining standard
> DP>C code.
>
> I don't think so. My experience (at least in the industry, not the
> academic world) is that one won't change a program just to conform to a
> standard, but rather use the old compiler, if there really needs to be a
> change in the program. Nobody (at least in a German company) would give
> you a EU-cent for this :-(

If you're talking about *legacy* applications, we're in perfect agreement.
No one wants to touch them more than *strictly* necessary. And certainly
not with a modern compiler.

But I was talking about code under active maintenance, i.e. code that has
been ported to modern platforms and whose specifications are still being
changed. Not to be confused with V7 code that is still used on a PDP-11
running V7.

Dan
 
