Very difficult floating point question

amit

This is a homework question so please don't give full answer, but I
really need a hint, I have no idea where to start...

Recall that in C, int and float are both 4 byte types. Consider the
following code.

main()
{
float f, g;
union { float f; int i; } u;
srand48(time(0));
f = drand48();
u.f = f;
u.i++;
g = u.f;
// ===== POINT A ===== //
}

At point A, will g be greater than f? Will it always be the next
representable floating point value after f? Can you explain your answers?

I tried printf'ing f and g at point A, but they both show up as equal...
I'm really confused. I don't really understand what the union does.

Thanks for any help!
 
bartc

amit said:
This is a homework question so please don't give full answer, but I
really need a hint, I have no idea where to start...

Recall that in C, int and float are both 4 byte types. Consider the
following code.

main()
{
float f, g;
union { float f; int i; } u;
srand48(time(0));
f = drand48();
u.f = f;
u.i++;
g = u.f;
// ===== POINT A ===== //
}

At point A, will g be greater than f? Will it always be the next
representable floating point value after f? Can you explain your answers?

I tried printf'ing f and g at point A, but they both show up as equal...

To how many decimals? Try as many as possible.

Or try printing the difference.
 
Stefan Ram

amit said:
Recall that in C, int and float are both 4 byte types.

I do not think so.
u.i++;
g = u.f;

»When a value is stored in a member of an object of
union type, the bytes of the object representation that
do not correspond to that member but do correspond to
other members take unspecified values«

ISO/IEC 9899:1999 (E), 6.2.6.1#7

Unless one can show that a float value does not have more
bytes than an int value in C, this means that some bytes of
the float value might have unspecified values after the
assignment. Therefore, the value of f might be unspecified
now. Possibly, it might even be an illegal representation.
 
Eric Sosman

amit said:
This is a homework question so please don't give full answer, but I
really need a hint, I have no idea where to start...

Recall that in C, int and float are both 4 byte types.

Maybe. Every `float' I happen to have seen had four bytes,
but I've encountered both two- and four-byte `int'. Other sizes
are possible, and not even the four-byte `float' is guaranteed.
Consider the
following code.

Missing some #include directives here, I think.
main()
{
float f, g;
union { float f; int i; } u;
srand48(time(0));
f = drand48();

No declarations for time(), srand48(), or drand48(). The
first is a Standard library function (for which you should have
#include'd <time.h>). The other two are not.

You get undefined behavior here for calling the time()
function via an expression of the wrong type, and with an
argument of the wrong type. You may also be in trouble with
srand48() and drand48(), depending on what they are and how
they expect to be called.
u.f = f;
u.i++;

Undefined behavior. In a union, only the element most
recently stored has a predictable value. You've stored the
`f' element, so the `i' is indeterminate. There's no telling
what you may get when you try to fetch, increment, and re-store
that indeterminate value.
g = u.f;
// ===== POINT A ===== //
}

At point A, will g be greater than f? Will it always be the next
representable floating point value after f? Can you explain your answers?

The program is not even guaranteed to *get* to point A.
That said, it probably will get there, despite having invoked
undefined behavior three times. But there's no telling what
kind of curdled value you'll find in `g'.
I tried printf'ing f and g at point A, but they both show up as equal...
I'm really confused. I don't really understand what the union does.

It does what you told it, which is "Do something undefined."
Or, in other words, "Anything at all that you do is fine with me."
 
mohangupta13

This is a homework question so please don't give full answer, but I
really need a hint, I have no idea where to start...

Recall that in C, int and float are both 4 byte types. Consider the
following code.

main()
{
  float f, g;
  union { float f; int i; } u;
  srand48(time(0));
  f = drand48();
  u.f = f;
  u.i++;
  g = u.f;
  // ===== POINT A ===== //

}

At point A, will g be greater than f? Will it always be the next
representable floating point value after f? Can you explain your answers?

I tried printf'ing f and g at point A, but they both show up as equal...
I'm really confused. I don't really understand what the union does.

Thanks for any help!

When I saw your question I did some googling and came across this page:

http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

This should answer your question. I just read it and it looks great, but
reading the other comments in this thread makes me skeptical about its
validity in C.

With reference to the same link, can anyone please tell me what
"aliasing optimization" actually means?
Thanks, Mohan
 
Nick Keighley

It sounds like the class is investigating how floating point works on their
machines, which presumably they already know have 32-bit floats and ints.

And they're using C as one way of doing that.

They're probably not (at the minute anyway), looking at writing C code that
will work portably on every conceivable class of hardware in the world. And
this task is necessarily specific to their machine anyway.

But does the instructor know that's what he's doing? I don't think so.

(I've tried a similar program on my machine, which also has 32-bit floats
and ints, and I've found a couple of interesting things. I wouldn't have
found out those things by religiously following everything in C standard.
Sometimes you either have to learn to read between the lines, or throw the
thing out the window.)

But why not do it in a portable fashion? Why not use an array of
unsigned char? You can then do the union trick (or use an unsigned
char* or memcpy it into an array of unsigned char).
 
D Yuniskis

amit said:
Recall that in C, int and float are both 4 byte types. Consider the
following code.

[note that which version of The Standard you read will refine
some of these issues, to some extent]

This isn't necessarily true. In *some* implementations, ints
(unqualified -- i.e., not shorts or longs) are 4 bytes. In
*some* implementations, floats are 32 bits (IEEE 754-ish).

But, none of these are guaranteed. For example, a float
can be the same size as a double. And a double can be
the same size as a long double. I.e., a float can be
implemented AS IF it was a long double (8 bytes?).

Likewise, an int needs only be at least as large as a short int.
So, an int can be 2 bytes!

Having said this, just keep it in the back of your mind,
since it further muddies the situation explained below...
(i.e., let's pretend your statement *is* true)
main()
{
float f, g;
union { float f; int i; } u;
srand48(time(0));
f = drand48();
u.f = f;
u.i++;
g = u.f;
// ===== POINT A ===== //
}

How about we write this simpler?

function()
{
float x, y;
union {
float f;
int i;
} u;

x = 3.1415926; /* some random floating point value */

u.f = x;
u.i = u.i + 1;

y = u.f;

// ===== POINT A ===== //
}

I.e., you are storing some "bit pattern" (depending on how
your compiler represents the floating point number 3.1415926)
in the "float" called x.

You are then *copying* that bit pattern into a float called
f that is located within a union called u. How this looks
in memory now not only depends on how the compiler chose to
represent that value, but, also, on how it arranges the storage
requirements for f *within* u!

You are then modifying some *portion* of the union using
access through some *other* object in that union (i.e., the i).

Then, you are reexamining the bit pattern from the original
means by which you accessed it (f).

Now, let's come up with a *similar* example:

function()
{
int x, y;
union {
float f;
int i;
} u;

x = 512; /* some random integer value */

u.i = x;
u.f = u.f + 1.0;

y = u.i;

// ===== POINT A ===== //
}

Note that this is essentially the same problem: storing
a bit pattern in a member of the union, modifying some
*other* member of that same union, then reexamining the
original member's "value". Right?

Now, a third example:

function()
{
int x, y;
union {
unsigned char a[4];
int i;
} u;

x = 27; /* some random integer */

u.i = x;
u.a[0] = u.a[0] + 1;

y = u.i;

// ===== POINT A ===== //
}

This is the same problem as the first two.

Now (guessing as to what you know of C), what do
you expect the results in the third example to be?
In what circumstances do your assumptions make sense?

[apologies if I've let some typos slip through]
 
Chris M. Thomasson

Eric Sosman said:
amit wrote: [...]
u.f = f;
u.i++;

Undefined behavior. In a union, only the element most
recently stored has a predictable value.

I am wondering if you could tell me about a platform in which the following
would compile, but fail at the assertion:
__________________________________________________________
#include <assert.h>


typedef char static_assert
[
sizeof(unsigned int) == sizeof(unsigned long int) ? 1 : -1
];


union foo
{
unsigned int a;
unsigned long int b;
};


int main(void)
{
union foo f = { 0 };

++f.a;

assert(f.b == 1);

return 0;
}
__________________________________________________________




Or perhaps even this example:
__________________________________________________________
#include <assert.h>
#include <limits.h>


#define UCHAR_PER_UINT \
(sizeof(unsigned int) / sizeof(unsigned char))


#if (UCHAR_MAX != 0xFFU)
# error could not compile
#endif


typedef char static_assert
[
sizeof(unsigned char) *
UCHAR_PER_UINT == sizeof(unsigned int) ? 1 : -1
];


union foo
{
unsigned int value;
unsigned char parts[UCHAR_PER_UINT];
};


int main(void)
{
    unsigned int i;
    union foo f = { 0U };
    unsigned char parts[4] = { 0U };

    /* increment every byte of the union through the char array */
    for (i = 0; i < UCHAR_PER_UINT; ++i)
    {
        ++f.parts[i];
    }

    /* extract each byte again through the unsigned int member */
    for (i = 0; i < UCHAR_PER_UINT; ++i)
    {
        unsigned int offset = i * CHAR_BIT;
        unsigned int mask = 0xFFU << offset;
        parts[i] = (f.value & mask) >> offset;
    }

    for (i = 0; i < UCHAR_PER_UINT; ++i)
    {
        assert(parts[i] == 1);
    }

    return 0;
}
__________________________________________________________


[...]
 
Chris M. Thomasson

Gordon Burditt said:
I am wondering if you could tell me about a platform in which the
following
would compile, but fail at the assertion:
__________________________________________________________ [...]
assert(f.b == 1);

If unsigned int is big-endian, and unsigned long is something else,
this assertion might fail.

Is there anything prohibiting a bit layout such that f.a and f.b
(assuming that both are at least 32 bits, which unsigned long would
have to be and the static_assert would fail if unsigned int isn't
the same size) have the bits with values 2**3, 2**7, 2**13, 2**29,
and 2**30 (those expressions are in math, not C, and use the FORTRAN
exponentiation ** operator), and *no others*, line up in corresponding
positions?

There are 32! different ways to map 32 bits in a register to a
32-bit area in storage. You have to be going out of your way to
map these differently for unsigned int and unsigned long, but it's
possible, and probably works that way on the DS9K.

I could see it now... I write code like that, and the FIRST time it runs
happens to be on a hardcore weirdo platform and the damn thing launches
missiles or something.

;^o
 
James

[...]
If unsigned int is big-endian, and unsigned long is something else,
this assertion might fail.

Is there anything prohibiting a bit layout such that f.a and f.b
(assuming that both are at least 32 bits, which unsigned long would
have to be and the static_assert would fail if unsigned int isn't
the same size) have the bits with values 2**3, 2**7, 2**13, 2**29,
and 2**30 (those expressions are in math, not C, and use the FORTRAN
exponentiation ** operator), and *no others*, line up in corresponding
positions?

There are 32! different ways to map 32 bits in a register to a
32-bit area in storage. You have to be going out of your way to
map these differently for unsigned int and unsigned long, but it's
possible, and probably works that way on the DS9K.

Holy Shi%


Humm... Is there a real platform in use today that would cause the assertion
to fail in the first and/or second one of Chris' examples?
 
James Dow Allen

I could see it now... I write code like that, and the FIRST time it runs
happens to be on a hardcore weirdo platform and the damn thing launches
missiles or something.

I don't think you need be worried about this.
The highly secure U.S. missile launching systems
run Microsoft Vista on Intel boxes.

James
 
Chris M. Thomasson

I don't think you need be worried about this.
The highly secure U.S. missile launching systems
run Microsoft Vista on Intel boxes.

One would think that the U.S. military could have developed a highly
classified operating system by now!

:^|
 
bartc

James said:
I don't think you need be worried about this.
The highly secure U.S. missile launching systems
run Microsoft Vista on Intel boxes.

Of course just for extra safety the programmers have absolutely no idea of
the word sizes, byte-orientation, alignment rules or typical memory sizes of
the processors inside the missile.

(Programming 'blind' like this is considered good practice in this group.)
 
Eric Sosman

Chris said:
Eric Sosman said:
amit wrote: [...]
u.f = f;
u.i++;

Undefined behavior. In a union, only the element most
recently stored has a predictable value.

I am wondering if you could tell me about a platform in which the
following would compile, but fail at the assertion:
[... type-punning int and same-size long via union ...]

It fails, of course, on the DeathStation 9000. ;-)

The gotcha for unionized type-punning isn't so much the
remote possibility that types with identical sizes and similar
semantics might have different representations, but that an
aggressive optimizer might cache one member's value in a register
while the program stores something to the other member, making
the cached value stale. The optimizer might reason that storing
an int value somewhere "couldn't possibly" affect the already-
cached value of a long or a double or whatever.

But there are at least two possible objections to my gloomy
assessment: First, unions are so commonly used for type-punning
that an implementor might well "make it work" even if the code
is (technically) wrong. Second, I may have misunderstood the
matter; sometimes the language of the Standard requires study
of an intensity exceeding my ability.
 
Rich Webb

Chris said:
Eric Sosman said:
amit wrote: [...]
u.f = f;
u.i++;

Undefined behavior. In a union, only the element most
recently stored has a predictable value.

I am wondering if you could tell me about a platform in which the
following would compile, but fail at the assertion:
[... type-punning int and same-size long via union ...]

It fails, of course, on the DeathStation 9000. ;-)

The gotcha for unionized type-punning isn't so much the
remote possibility that types with identical sizes and similar
semantics might have different representations, but that an
aggressive optimizer might cache one member's value in a register
while the program stores something to the other member, making
the cached value stale. The optimizer might reason that storing
an int value somewhere "couldn't possibly" affect the already-
cached value of a long or a double or whatever.

But there are at least two possible objections to my gloomy
assessment: First, unions are so commonly used for type-punning
that an implementor might well "make it work" even if the code
is (technically) wrong. Second, I may have misunderstood the
matter; sometimes the language of the Standard requires study
of an intensity exceeding my ability.

To carry this forward a bit, the specific Thou Shalt Not sentence is in
informative appx J.1, Unspecified Behavior: "The value of a union member
other than the last one stored into (6.2.6.1)."

But if we go back to 6.2.6.1, the discussion on unions states: "When a
value is stored in an object of structure or union type, including in a
member object, the bytes of the object representation that correspond to
any padding bytes take unspecified values."

Given this statement, if the objects used in the type punning are the
same size and thus no padding bytes are involved, does that imply that
the unspecified behavior is not invoked?
 
Phil Carmody

Chris M. Thomasson said:
Eric Sosman said:
amit wrote: [...]
u.f = f;
u.i++;

Undefined behavior. In a union, only the element most
recently stored has a predictable value.

Point of order - not UB, just unspecified.
I am wondering if you could tell me about a platform in which the
following would compile, but fail at the assertion:
__________________________________________________________
#include <assert.h>


typedef char static_assert
[
sizeof(unsigned int) == sizeof(unsigned long int) ? 1 : -1
];

Your indenting is bizarre to say the least.
Why didn't you write

typedef char static_assert
[
sizeof
(
unsigned int
)
== sizeof
(
unsigned long int
)
? 1 : -1
];

if you have a strange obsession to split and indent on brackets not
of the curly variety.
union foo
{
unsigned int a;
unsigned long int b;
};

int main(void)
{
union foo f = { 0 };
++f.a;
assert(f.b == 1);
return 0;
}

It's quite possibly an implementation where the least significant
byte of the uint is at address 3, and that of the ulong is at 7.
I've used at least three different platforms like that in the past,
quite possibly more, but as I do my best to avoid non-portable
constructs, such issues have been abstracted away into irrelevance.

Phil
 
Eric Sosman

Rich said:
[... "type-punning with unions needn't work" ...]
But there are at least two possible objections to my gloomy
assessment: First, unions are so commonly used for type-punning
that an implementor might well "make it work" even if the code
is (technically) wrong. Second, I may have misunderstood the
matter; sometimes the language of the Standard requires study
of an intensity exceeding my ability.

To carry this forward a bit, the specific Thou Shalt Not sentence is in
informative appx J.1, Unspecified Behavior: "The value of a union member
other than the last one stored into (6.2.6.1)."

But if we go back to 6.2.6.1, the discussion on unions states: "When a
value is stored in an object of structure or union type, including in a
member object, the bytes of the object representation that correspond to
any padding bytes take unspecified values."

Given this statement, if the objects used in the type punning are the
same size and thus no padding bytes are involved, does that imply that
the unspecified behavior is not invoked?

That's the source of my doubt: I'm not 100% sure the
normative text makes the punning undefined in the "perfect
overlap" case. The appendix says so, but it's non-normative.
And, as you point out, it seems to overstate the normative
text's case a little bit.

Even with perfect overlap, of course, there's still the
possibility that storing one member could produce something
that would be a trap representation when viewed via a different
member. The O.P.'s code stored a float and then manipulated
its representation as an int; integers usually don't have trap
representations but the result when viewed as a float might be
a signalling NaN for all anyone knows to the contrary.
 
James Dow Allen

Of course just for extra safety the programmers have absolutely no idea of
the word sizes, byte-orientation, alignment rules or typical memory sizes of
the processors inside the missile.

Lest any lurker think we're joking, strategic nuclear-weapons
submarines *do* run Windows:
http://m.linuxjournal.com/content/blue-screen-megadeath
http://www.tomshardware.com/news/Submarines-Windows-Royal-Navy,6718.html

Engineering and maintenance run on documentation.
Every bolt on a nuclear submarine has, I'm sure,
drawings and specifications. Yet the source code
to their computers' OS isn't bundled with the other
documentation -- that's Microsoft trade secret!

I dimly recall that Microsoft eventually agreed
to provide source to U.S. government. I don't know
about U.K., but for them to use closed-source in
critical weapons system would not be without precedent.
Their Chinook helicopters were grounded for several
years:
http://news.bbc.co.uk/2/hi/uk_news/7923341.stm :
"But the aircraft have never been able to fly
because the MoD failed to secure access to
key software source code."

It seems odd that Trident missiles, awesome weapons
indeed, would be controlled by source code offlimits
to the Pentagon (or Britain's MoD) but it seems fitting
in an intellectual environment where governments
cannot test voting machines because that might violate
the machine manufacturer's intellectual property rights.

James Dow Allen
 
Flash Gordon

bartc said:
Of course just for extra safety the programmers have absolutely no idea
of the word sizes, byte-orientation, alignment rules or typical memory
sizes of the processors inside the missile.

Most of the time you don't need to know that stuff, and I *have* done C
development in the defense industry. Not on missile launching systems,
but...
(Programming 'blind' like this is considered good practice in this group.)

No, what is considered good practice is programming so that those things
don't matter *except* when you have specific *need* to know them. Or
more generally, so that you are not dependent on the specifics of your
implementation except where you need to be.

Oh, and I have debugged software on hardware with drastically different
characteristics to the target hardware. Specifically 8 bit bytes where
the target had 16 bit bytes and I don't (and didn't even then) know what
other differences. All I replaced was the actual routines which
interfaced to the HW, which was a minuscule amount of the code (two
small functions, one of which needed to be written in assembler for
speed on the target HW).
 
