Preprocessor and sizeof()

Amigo

Hello all!

I started working on an embedded project a few days ago on Freescale 16-bit
micro with an IAR toolset. Running PolySpace for the project code
highlighted amongst other things a peculiar construct in one of the
compiler's header files. Here's the code snippet:

#ifndef __SIZE_T_TYPE__
#if sizeof((char*)0 - (char*)0) <= sizeof(int)
#define __SIZE_T_TYPE__ unsigned int
#else
#define __SIZE_T_TYPE__ unsigned long
#endif
#endif

I was very suspicious about this construct in the first place, but
coming from a compiler vendor baffled me even more. Although I kind of
understand what the author wanted, the combination of sizeof and
pre-processing is bound to not comply with the ANSI C specification.

My question is: have you ever encountered such a construct in a file
delivered by a tool vendor? It definitely "works", in that the compiler
does not throw errors or warnings. But because this code goes into a
safety-related project I need to understand exactly what's happening in
the code and why. Any thoughts appreciated.

Cheers,
Romeo

PS: I was dumped in the project team just last week, so I did not have
time to look through the whole of the project (documentation is
"standard", i.e. poor)
 
Chris Torek

I started working on an embedded project a few days ago on Freescale 16-bit
micro with an IAR toolset. Running PolySpace for the project code
highlighted amongst other things a peculiar construct in one of the
compiler's header files. Here's the code snippet:
#ifndef __SIZE_T_TYPE__
#if sizeof((char*)0 - (char*)0) <= sizeof(int)
[snippage; this is a somewhat bizarre way of "re-guessing" the
compiler's internal type for size_t]
I was very suspicious about this construct in the first place ...

It is good to be suspicious of this.

The C standard tells us that, during the preprocessing phases,
keywords do not exist. (This is because "pp-tokens" have not yet
been converted to "tokens".) If "sizeof" has not been "#define"d,
it is just an ordinary identifier, just like every other ordinary
identifier. Likewise for the type-names "char" and "int" -- so
the above is equivalent to:

#if burble((gank*)0 - (gank*)0) <= burble(mud)

Each identifier is then replaced with the integer value 0,
giving:

#if 0((0*)0 - (0*)0) <= 0(0)

This line requires a diagnostic, if you write it in your own
code (and do not do anything that turns on compiler-specific
extensions and so on, i.e., the diagnostic is only required from
a Standard C compiler, not from a Language-Almost-But-Not-Entirely-
Unlike-C compiler).
... but coming from a compiler vendor baffled me even more.

Well, compiler-vendors can get away with all kinds of sneakiness.

Although I kind of understand what the author wanted, the combination
of sizeof and pre-processing is bound to not comply with the ANSI C
specification.

It does not; and as I noted, a diagnostic is required, at least if
a *user* writes the above. But compiler-vendors can depend on
compiler-specific hacks. For instance, suppose the above is
preceded by:

_Pragma("impl++")

(with a corresponding _Pragma("impl--") afterward, one hopes).
The C standard leaves the effect of such a _Pragma to the
implementation; presumably this implementation would define it.
It would disable the diagnostic for the "#if" line -- and after
emitting the diagnostic (or not, in this case), any C compiler
can go on to "magically re-interpret" the #if test.

This is, of course, a pretty silly way to achieve the desired
result. Instead of using "#if" to test, in the preprocessing
phase, something that is already set in later stages of the
compiler, why not use a vendor-specific extension that simply
accesses those already-set things in a more direct manner? In
this case, since the goal is to use an internal, compiler-specific
name for "size_t", the compiler can simply pre-define the
(internal-only) type-name __size_t, and the compiler's headers
can then include lines like:

struct __some_internal_struct {
unsigned char *__p;
__size_t __len;
};

as needed.
My question is: have you ever encountered such a construct in a file
delivered by a tool vendor? It definitely "works", in that the compiler
does not throw errors or warnings. But because this code goes into a
safety-related project I need to understand exactly what's happening in
the code and why.

Because __-prefixed names (like __SIZE_T_TYPE__) are already
restricted to the implementor, you have to refer to the implementation
to find out what they mean. In order to be *sure* about this, you
must consult vendor documentation (which is unlikely to say anything
anyway, thus leaving you stranded, but "them's the breaks").

(While the *intent* of the vendor's header is clear enough, it is
not at all obvious that whoever wrote that header for that vendor
knew whether the "#if" line actually works in that compiler.
Perhaps after skipping the required diagnostic, the #if test is
replaced with "#if 0", regardless of whether the later pieces of
the compiler use "unsigned long" internally for its size_t. In
this case, the parts of the snippet I deleted would define
__SIZE_T_TYPE__ as unsigned long, even if the compiler actually
uses unsigned int. If __SIZE_T_TYPE__ is not actually used in
those places where size_t is required, this is even OK!)
 
Ancient_Hacker

If you google for "preprocessor sizeof" you'll get a bunch of hits, all
of them agreeing the preprocessor has no clue what sizeof() is. And
once you think about it, it can't know about types, pointers, casts, or
what the sizeof the difference between two pointers is either.

So the line is not only bogus, your C preprocessor isn't too good at
detecting this.

Also note what they're supposedly checking for is not being done too
exactly. I think they're checking to see if a pointer will fit into an
integer or a long, but the test isn't very precise. It might not even
handle the case of some weird processor where pointers might be 48 to
64 bits, but "long" might be 32 bits. Also note it won't work on
platforms where there are different size pointers, like NEAR, FAR,
SHORT, or HUGE.
 
Stephen Sprunk

Ancient_Hacker said:
If you google for "preprocessor sizeof" you'll get a bunch of hits, all
of them agreeing the preprocessor has no clue what sizeof() is. And
once you think about it, it can't know about types, pointers, casts, or
what the sizeof the difference between two pointers is either.

So the line is not only bogus, your C preprocessor isn't too good at
detecting this.

Unless, of course, the C preprocessor that's part of this implementation
does somehow know what sizeof() means.
Also note what they're supposedly checking for is not being done too
exactly. I think they're checking to see if a pointer will fit into an
integer or a long, but the test isn't very precise. It might not even
handle the case of some weird processor where pointers might be 48 to
64 bits, but "long" might be 32 bits.

Note that the test is actually attempting to determine
sizeof(ptrdiff_t), not the size of pointers. It's entirely possible
that ptrdiff_t (and presumably size_t) could fit in an int whereas a
full pointer might require a long or long long (or more).
Also note it won't work on platforms where there are different size
pointers, like NEAR, FAR, SHORT, or HUGE.

Since this snippet was found in the compiler's headers, the folks who
wrote it were free to assume such things since it's part of the
implementation. If their implementation doesn't work on platforms with
variable-sized pointers (or even pointers longer than long), accounting
for that is unnecessary.

Obviously, user code isn't permitted to make the same assumptions, at
least if the writer wishes the code to be portable.

Still, any compiler that doesn't know the size of a size_t is rather
broken; one shouldn't have to rely on the preprocessor for that.

S
 
Jack Klein

I started working on an embedded project a few days ago on Freescale 16-bit
micro with an IAR toolset. Running PolySpace for the project code
highlighted amongst other things a peculiar construct in one of the
compiler's header files. Here's the code snippet:
#ifndef __SIZE_T_TYPE__
#if sizeof((char*)0 - (char*)0) <= sizeof(int)
[snippage; this is a somewhat bizarre way of "re-guessing" the
compiler's internal type for size_t]

No, I think not. Actually it is attempting to test the size of a
ptrdiff_t, although it is quite common for that to be the signed type
that corresponds to the unsigned type used for size_t.
 
Jack Klein

Hello all!

I started working on an embedded project a few days ago on Freescale 16-bit
micro with an IAR toolset. Running PolySpace for the project code
highlighted amongst other things a peculiar construct in one of the
compiler's header files. Here's the code snippet:

#ifndef __SIZE_T_TYPE__
#if sizeof((char*)0 - (char*)0) <= sizeof(int)
#define __SIZE_T_TYPE__ unsigned int
#else
#define __SIZE_T_TYPE__ unsigned long
#endif
#endif

I was very suspicious about this construct in the first place, but
coming from a compiler vendor baffled me even more. Although I kind of
understand what the author wanted, the combination of sizeof and
pre-processing is bound to not comply with the ANSI C specification.

My question is: have you ever encountered such a construct in a file
delivered by a tool vendor? It definitely "works", in that the compiler
does not throw errors or warnings. But because this code goes into a
safety-related project I need to understand exactly what's happening in
the code and why. Any thoughts appreciated.

Cheers,
Romeo

PS: I was dumped in the project team just last week, so I did not have
time to look through the whole of the project (documentation is
"standard", i.e. poor)

The compiler is allowed to perform all sorts of non-standard tricks in
its headers and libraries, and in some cases is required to, as some
things an implementation is required to do are not possible in
standard C.

I tested PolySpace on one of our code bases some years back and ran
into an issue with one such case, namely the offsetof macro. The
compiler's <stddef.h> provides a typical implementation:

#define offsetof(s, m) (int)(&((s*)0)->m)

PolySpace complained about the dereference of a null pointer, and also
about a signed/unsigned mismatch when the result of the macro was
assigned to a size_t, because the macro contains a specific cast to
int.

That prompted a discussion where I maintained they needed to special
case either that macro, or macros from standard headers in general, or
at least standard macros from standard headers.

The code is acceptable in the implementation's own header because the
standard requires the implementation to provide a macro that works with
its compiler, and such a macro cannot be written in portable standard C.
And the result of the macro is a size_t, because the standard says it is
a size_t.
 
Ark

Jack said:
The compiler is allowed to perform all sorts of non-standard tricks in
its headers and libraries, and in some cases is required to, as some
things an implementation is required to do are not possible in
standard C.

I tested PolySpace on one of our code bases some years back and ran
into an issue with one such case, namely the offsetof macro. The
compiler's <stddef.h> provides a typical implementation:

#define offsetof(s, m) (int)(&((s*)0)->m)

PolySpace complained about the dereference of a null pointer, and also
about a signed/unsigned mismatch when the result of the macro was
assigned to a size_t, because the macro contains a specific cast to
int.

That prompted a discussion where I maintained they needed to special
case either that macro, or macros from standard headers in general, or
at least standard macros from standard headers.

The code is acceptable in the implementation's own header because the
standard requires the implementation to provide a macro that works with
its compiler, and such a macro cannot be written in portable standard C.
And the result of the macro is a size_t, because the standard says it is
a size_t.

... and recently I publicly made a fool of myself in this NG because
offsetof was redefined in my environment with something like above, and
therefore ceased being "a constant integral expression". What's allowed
in the vendor's headers is not necessarily good for the user code, as
has been mentioned in this thread.

But let's feel for an (embedded) compiler vendor. They need to support a
variety of chips, and the only cost-effective way of doing this is to
have a compiler engine as common as possible. So I would not be
surprised if the compiler didn't know size_t, ptrdiff_t, etc., other
than from some compiler configuration files. There is nothing wrong if
the compiler configuration is actually contained in the vendor's system
headers.
 
