unsigned char *

j0mbolar

#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", ((unsigned char *)&dword)[0]);
return 0;
}

does the standard say this is legal?
if so or if not, what section?
 
Emmanuel Delahaye

j0mbolar said:
#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", ((unsigned char *)&dword)[0]);
return 0;
}

does the standard say this is legal?
if so or if not, what section?

No. An unsigned char will be converted to an int, but "%u" wants an unsigned
int. The behaviour is undefined.

#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", (unsigned) ((unsigned char *)&dword)[0]);
return 0;
}
 
Jack Klein

In said:
#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", ((unsigned char *)&dword)[0]);
return 0;
}

does the standard say this is legal?
if so or if not, what section?

No. An unsigned char will be converted to an int, but "%u" wants an unsigned
int. The behaviour is undefined.

An unsigned char will be converted to signed int on many platforms,
but to unsigned int on some where sizeof(int) == 1. There are such
platforms, mostly Digital Signal Processors and some exotic RISC
architectures. You may not have encountered them, but I work on them
regularly.

So if the code is compiled on such a platform (such as the TI 2812 DSP
I am working on currently), the result is perfectly defined.
#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", (unsigned) ((unsigned char *)&dword)[0]);
return 0;
}

Of course the above is much more portable.
 
Jack Klein

#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", ((unsigned char *)&dword)[0]);
return 0;
}

does the standard say this is legal?
if so or if not, what section?

It is not legal except on platforms where sizeof(int) == 1, and there
are some of those, but not in the desktop world.

In most common implementations, signed int can hold all possible
values of unsigned char, so the value is converted to signed int, not
unsigned int which the "%u" conversion specifier requires.

The C standard suggests that signed and unsigned integer types should
be acceptable substitutions for each other in function calls, but does
not guarantee that this is so.
 
Peter Nilsson

Jack Klein said:
In said:
#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", ((unsigned char *)&dword)[0]);
return 0;
}

does the standard say this is legal?
if so or if not, what section?

No. An unsigned char will be converted to an int, but "%u" wants an unsigned
int. The behaviour is undefined.

An unsigned char will be converted to signed int on many platforms,

Indeed.

but to unsigned int on some where sizeof(int) == 1.

ITYM: _all_ where sizeof(int) == 1.

The precise criterion for promotion to unsigned int is, of course...

UCHAR_MAX > INT_MAX

There are such
platforms, mostly Digital Signal Processors and some exotic RISC
architectures. You may not have encountered them, but I work on them
regularly.

So if the code is compiled on such a platform (such as the TI 2812 DSP
I am working on currently), the result is perfectly defined.
#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", (unsigned) ((unsigned char *)&dword)[0]);
return 0;
}

Of course the above is much more portable.

It's portable, but not very useful as is. The printed value is unspecified.
 
Richard Delorme

Jack Klein wrote:
An unsigned char will be converted to signed int on many platforms,
but to unsigned int on some where sizeof(int) == 1. There are such
platforms, mostly Digital Signal Processors and some exotic RISC
architectures. You may not have encountered them, but I work on them
regularly.

So if the code is compiled on such a platform (such as the TI 2812 DSP
I am working on currently), the result is perfectly defined.

The lack of portability makes such a construct undefined behaviour:

3.4.3

1 undefined behavior

behavior, upon use of a *nonportable* or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements.
 
j0mbolar

Jack Klein said:
#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", ((unsigned char *)&dword)[0]);
return 0;
}

does the standard say this is legal?
if so or if not, what section?

It is not legal except on platforms where sizeof(int) == 1, and there
are some of those, but not in the desktop world.

In most common implementations, signed int can hold all possible
values of unsigned char, so the value is converted to signed int, not
unsigned int which the "%u" conversion specifier requires.

The C standard suggests that signed and unsigned integer types should
be acceptable substitutions for each other in function calls, but does
not guarantee that this is so.

well, I was more concerned with the
casting to unsigned char *
 
Dan Pop

In said:
In said:
#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", ((unsigned char *)&dword)[0]);
return 0;
}

does the standard say this is legal?
if so or if not, what section?

No. An unsigned char will be converted to an int, but "%u" wants an unsigned
int. The behaviour is undefined.

An unsigned char will be converted to signed int on many platforms,
but to unsigned int on some where sizeof(int) == 1. There are such
platforms, mostly Digital Signal Processors and some exotic RISC
architectures. You may not have encountered them, but I work on them
regularly.

Do they have *conforming* *hosted* implementations, to be relevant in the
context of Emmanuel's comments: the %u conversion specifier of printf?
So if the code is compiled on such a platform (such as the TI 2812 DSP
I am working on currently), the result is perfectly defined.

If the implementation is not hosted, calling printf without defining it
invokes undefined behaviour, anyway. Anything the C standard says about
printf is irrelevant on such an implementation.

Dan
 
Dan Pop

In said:
#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", ((unsigned char *)&dword)[0]);
return 0;
}

does the standard say this is legal?
if so or if not, what section?

It is not legal except on platforms where sizeof(int) == 1, and there
are some of those, but not in the desktop world.

Care to mention ONE conforming hosted implementation with sizeof(int) == 1
???

It would break tons of existing code that assumes that EOF is different
from any byte value:

while ((c = getc(fp)) != EOF) /* do whatever */ ;

Dan
 
Dan Pop

In said:
Jack Klein wrote:


The lack of portability makes such a construct undefined behaviour:

3.4.3

1 undefined behavior

behavior, upon use of a *nonportable* or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements.

Nope, it's undefined behaviour *only* on implementations where the type
resulting from the integral promotions does not match the type expected
by the conversion specifier.

The correct statement is that the lack of portability prevents this
construct from being used in strictly conforming programs.

If you still don't get it, consider a simple example: 12345 * 3.
Undefined behaviour if INT_MAX == 32767, well defined behaviour otherwise.

And if you still don't get it: you cannot use the definition of undefined
behaviour to decide when a program invokes undefined behaviour. It is a
separate paragraph of the standard dealing with this issue:

2 If a ``shall'' or ``shall not'' requirement that appears outside
of a constraint is violated, the behavior is undefined. Undefined
behavior is otherwise indicated in this International Standard
by the words ``undefined behavior'' or by the omission of any
explicit definition of behavior. There is no difference in
emphasis among these three; they all describe ``behavior that
is undefined''.

Dan
 
Old Wolf

Emmanuel Delahaye said:
j0mbolar said:
#include <stdio.h>

int main(void)
{
long dword = 0xff;

printf("%u\n", ((unsigned char *)&dword)[0]);
return 0;
}

does the standard say this is legal?
if so or if not, what section?

No. An unsigned char will be converted to an int, but "%u" wants an unsigned
int. The behaviour is undefined.

If an unsigned char is converted to a signed int, it must be a
non-negative signed int. (otherwise it would have been converted
to unsigned int).

The standard says that non-negative signed ints must have the same
size, alignment and representation as unsigned ints. It can't be
a trap representation, or represent a different value. Therefore, as
far as a variadic function is concerned, it was passed a signed int.
Even the DS9000 could not go wrong here AFAICS, so shouldn't that
mean the behaviour is well-defined?
 
Richard Delorme

Dan Pop wrote:
Nope, it's undefined behaviour *only* on implementations where the type
resulting from the integral promotions does not match the type expected
by the conversion specifier.

I don't think so. Undefined behaviour is judged from the Standard's point
of view, not from the implementation's. An implementation may define
a behaviour for what the standard leaves undefined.

The correct statement is that the lack of portability prevents this
construct from being used in strictly conforming programs.

Do you mean the word "nonportable" in the definition of "undefined
behaviour" is not correct?

If you still don't get it, consider a simple example: 12345 * 3.
Undefined behaviour if INT_MAX == 32767, well defined behaviour otherwise.

I know implementations where INT_MAX == 32767 and 12345 * 3 has a well
defined behaviour (the result is -28501); however, this behaviour has to
be found in the implementation documentation, not in the standard, which
neither specifies the value of INT_MAX nor defines the behaviour of
12345 * 3 in case of overflow.

And if you still don't get it: you cannot use the definition of undefined
behaviour to decide when a program invokes undefined behaviour.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I do not agree with this sentence. Undefined behaviour is a lack of
definition within the standard, not a property of a program. A program
may be erroneous in a way that seems too complicated for the Standard to
impose a particular behaviour on an implementation.

It is a
separate paragraph of the standard dealing with this issue:

2 If a ``shall'' or ``shall not'' requirement that appears outside
of a constraint is violated, the behavior is undefined. Undefined
behavior is otherwise indicated in this International Standard
by the words ``undefined behavior'' or by the omission of any
explicit definition of behavior. There is no difference in
emphasis among these three; they all describe ``behavior that
is undefined''.


If you still don't think that undefined behaviour is a portability
issue, guess what the title of Annex J is, where the undefined
behaviours are summarised in J.2.
 
Dan Pop

In said:
Dan Pop wrote:

I don't think so. Undefined behaviour is judged from the Standard's point
of view, not from the implementation's. An implementation may define
a behaviour for what the standard leaves undefined.


Do you mean the word "nonportable" in the definition of "undefined
behaviour" is not correct?

It is perfectly correct, it is your interpretation of it that is
incorrect. It is NOT enough for a bit of code to be non-portable to
invoke undefined behaviour, it must satisfy other conditions, clearly
documented in the standard, in the text I have quoted. Why is that so
difficult to understand for you?

Example of non-portable code that does not invoke undefined behaviour:

#include <stdio.h>

int foo(void) { puts("I am foo"); return 1; }
int bar(void) { puts("I am bar"); return 1; }
int main() { return foo() - bar(); }

This is clearly non-portable code, with unpredictable output, as it relies
on unspecified behaviour. Do you claim that it invokes undefined
behaviour? Or that the definition of undefined behaviour is broken?
I know implementations where INT_MAX == 32767 and 12345 * 3 has a well
defined behaviour (the result is -28501);

From the standard's point of view, this is undefined behaviour and, no
matter what the implementation defines/documents, nasal demons are still
allowed by the C standard.
however, this behaviour has to
be found in the implementation documentation, not in the standard, which
neither specifies the value of INT_MAX nor defines the behaviour of
12345 * 3 in case of overflow.

And, because it doesn't define it, from its point of view *anything* can
happen after 12345 * 3 is evaluated, the definition provided by the
implementation doesn't change the program status, as far as the C standard
is concerned.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I do not agree with this sentence.

Then, you're a patent idiot. I have quoted you the exact chapter and
verse which specifies WHEN undefined behaviour is invoked. It is sheer
stupidity to invent other situations, based on the definition of undefined
behaviour.
Undefined behaviour is a lack of
definition within the standard, not a property of a program.

Undefined behaviour cannot exist in the absence of a program. Furthermore
in some cases, it is the implementation that decides whether the program
invokes undefined behaviour or not, as in the example with 12345 * 3
(if the implementation defines INT_MAX as 32767, the attempt to evaluate
this expression *unconditionally* invokes undefined behaviour, regardless
of what the documentation says).

Another example:

6 Any pointer type may be converted to an integer type. Except as
previously specified, the result is implementation-defined. If the
result cannot be represented in the integer type, the behavior
is undefined. The result need not be in the range of values of
any integer type.

According to your interpretation, this is undefined behaviour. Then, why
didn't the standard simply say:

6 Converting any pointer type to an integer type invokes undefined
behaviour.

??? What are all the other extra words for?
A program
may be erroneous in a way that seems too complicated for the Standard to
impose a particular behaviour for an implementation.

So what? If it matches any of the conditions listed in the text I have
quoted, it invokes undefined behavior. Otherwise, its behaviour is
unspecified, implementation-defined or well defined.
If you still don't think that undefined behaviour is a portability
issue, guess what the title of Annex J is, where the undefined
behaviours are summarised in J.2.

Your reasoning capabilities are severely damaged: undefined behaviour *is*
a portability issue, but this doesn't mean that *any* portability issue
is undefined behaviour. What about the *other* sections of appendix J?
Aren't they portability issues?

Dan
 
Dave Thompson

Emmanuel Delahaye said:
In 'comp.lang.c', j0mbolar wrote: [ (punned) uchar passed to printf %u ]
No. An unsigned char will be converted to an int, but "%u" wants an unsigned
int. The behaviour is undefined.

If an unsigned char is converted to a signed int, it must be a
non-negative signed int. (otherwise it would have been converted
to unsigned int).

The standard says that non-negative signed ints must have the same
size, alignment and representation as unsigned ints. It can't be
a trap representation, or represent a different value. Therefore, as
far as a variadic function is concerned, it was passed a signed int.
Even the DS9000 could not go wrong here AFAICS, so shouldn't that
mean the behaviour is well-defined?

(Same representation only for values in the intersection of ranges,
but such a promoted value must be in the intersection. Same
representation is *nonnormatively*, in two footnotes, "meant to imply
interchangeability as arguments ....".)

For a user-written variadic function, (necessarily) using va_arg, this
is guaranteed explicitly in C99, as is the interchangeability of pointers
to character types and void *, and the same cases for unprototyped
functions. There have been at least two rounds of discussion on
comp.std.c about whether the va_arg guarantees apply to standard
library functions, which aren't required to be implemented in C; there
appears to be a majority, but not necessarily a consensus, that this
was expected, but it's not clear it was actually specified.

And for *printf in particular there is the more local and thus
presumably overriding requirement in 7.19.6.1p9:
If any argument is not the correct type for the corresponding
conversion specification, the behavior is undefined.

In practice it would be exceedingly unlikely, and unhelpful, for any
implementor to break this. But that's not a formal requirement.

- David.Thompson1 at worldnet.att.net
 
