float to string to float, with first float == second float

James Kanze

[...]
The equality operator applied to floating point types simply
isn't affordable. It may work in such a particular case, but
you're not guaranteed it will.

Bullshit. (OK, I suppose that there may be problems if one or
both of the floating point values is a NaN---if one is a
trapping NaN, there are guaranteed to be problems. But this has
nothing to do with the equality operator---anything you do with
a trapping NaN is going to cause problems.)
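
A minimal sketch of that point, assuming IEEE floats and a compiler that is not in a "fast math" mode. The names `sameValue` and `isNaN` are hypothetical, chosen only for illustration:

```cpp
#include <limits>

// Exact comparison: true only when both operands hold the same value.
// A (quiet) NaN compares unequal to everything, including itself.
bool sameValue( float lhs, float rhs )
{
    return lhs == rhs ;
}

// The classic self-comparison test: only a NaN differs from itself.
bool isNaN( float f )
{
    return f != f ;
}
```

For any non-NaN value `sameValue( x, x )` is true; for `std::numeric_limits<float>::quiet_NaN()` it is false, which is the only special case mentioned above.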
Built-in floating point types simply don't give such kind of
certainty,

I'd suggest that you learn how floating point types work before
stating such inanities.
 
James Kanze

You should choose between *either* human readability *or* data
integrity. You can't have both.

Sure you can. You just have to define "human readability" and
"data integrity" appropriately, then write the correct code. In
practice, just outputting with seven digits precision will
fulfill both definitions most of the time.

[...]
Some arguments why *not* to do what you want:

1) If data integrity is a priority, store the data on a
binary file format: While imperfect, the binary format
is consistent. Of course, if the usual number formats
are unacceptable to you, you could use some binary encoded
decimal format.

Before starting to state what should be done, define "data
integrity". The original poster did, and for his definition, it
can be proven that seven decimal digits are sufficient for an
IEEE float.
 
James Kanze

On 6 Oct, 15:04, Victor Bazarov wrote:

[...]
No. There are infinitely many real numbers between any two
consecutive FP numbers.

In usual English usage, some three billion exactly representable
numbers isn't a "few". Even if it is an infinitely small
percent of the possible numbers.

[...]
As I understand the question, the OP wants to break out
of those limitations: Exact conversions between base-10
and base-2 numbers, eliminating approximation errors etc.

And where did you get that? (It's possible, of course, but not
generally useful.) He most specifically defined what he wanted:
a round trip conversion. All he's asking for is:

1. the decimal value appear close to the floating point value
to human readers, without taking too many "unnecessary"
digits (his "readability" criterion), and
2. when converted back to float, the decimal value results in
the exact same value that was used to convert it---for IEEE
float, seven decimal digits should suffice (or maybe
eight---I'd have to check to be sure).

Those are reasonable constraints, and I've written code for
double in the past which met them, more or less. (Arguably,
outputting something like 1.0000000000000000, when 1 or 1.0
would be sufficient, violates his first criterion. We considered
it acceptable, however.)
 
Francesco S. Carta

James Kanze said:
[snip]
There are a couple of problems with your code.
First, you should never compare floats or doubles for equality
that way.

Why not? And what should he do instead? He wants to know if
the two values are exactly equal.
Secondarily, extracting (>>) non-character data (integers,
doubles and so on) from streams can hang.

Since when? A correct implementation of stringstream will never
hang. (A correct implementation of an fstream can hang if the
"file" is actually a device which can hang, but that's beyond
the power of the library to control.)
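
A small sketch of what extraction from a string stream does on bad input, for illustration only (`tryParseDouble` is a hypothetical helper, not code from the thread): the stream goes into a failed state and returns, it does not block.

```cpp
#include <sstream>
#include <string>

// Tries to read a double from the start of text. Returns false, rather
// than waiting for anything, when no number can be extracted.
bool tryParseDouble( std::string const& text, double& result )
{
    std::istringstream in( text ) ;
    in >> result ;
    return ! in.fail() ;
}
```

Calling it with "1.5" succeeds; calling it with "abc" or an empty string simply returns false.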
[snip]
The equality operator applied to floating point types simply
isn't affordable.  It may work in such a particular case, but
you're not guaranteed it will.

Bullshit.  (OK, I suppose that there may be problems if one or
both of the floating point values is a NaN---if one is a
trapping NaN, there are guaranteed to be problems.  But this has
nothing to do with the equality operator---anything you do with
a trapping NaN is going to cause problems.)
Built-in floating point types simply don't give such kind of
certainty,

I'd suggest that you learn how floating point types work before
stating such inanities.

Well, all of this will serve as a lesson for the times to come.

I normally try to express my thoughts thoroughly and accurately, but
here in this thread I've done more harm than good, clear as the purest
water.

Although I understand these subjects, I made my points in a very
weird manner, both for floating points and for non-character data
extraction from streams.

But all of this has been already and promptly sorted out, you're still
seeing these threads perfectly updated and correctly sorted via
Google Groups, yes? No weird NNTP delay/ordering issues here, no?

Rhetorical questions? Yes.
 
James Kanze

The problem is to come up with a pattern that is guaranteed
to reproduce the original binary pattern:
1) Approximations might occur when the value is first loaded
2) Approximations might occur when the value is serialized
3) Approximations might occur when the value is de-serialized
The problem is steps 2) and 3): Unless you can guarantee
that either
a) No approximations occur in steps 2) and 3)
b) The approximation in 3) exactly cancels the
approximation introduced in 2)
you can not guarantee that you end up with the same bit
pattern as you started out with.

With IEEE floating point, b is guaranteed for "normal" numbers
if there are at least 9 decimal digits. (I looked it up this
time---my original statement that 7 suffices was wrong.) There
may be problems with +/-0.0, if e.g. the implementation always
outputs 0.0 (even for a negative 0), or reads -0.0 as a positive
zero, and of course, nothing is said about infinity and NaNs.
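
A minimal sketch of such a round trip through text, assuming an IEEE float and a standard library whose decimal conversions are correctly rounded. `toText` and `fromText` are hypothetical names; since C++11 the literal 9 can be replaced by `std::numeric_limits<float>::max_digits10`.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Writes f with 9 significant decimal digits (default float format).
std::string toText( float f )
{
    std::ostringstream out ;
    out << std::setprecision( 9 ) << f ;
    return out.str() ;
}

// Reads the value back; returns 0.0f if the text is not a number.
float fromText( std::string const& s )
{
    std::istringstream in( s ) ;
    float f = 0.0f ;
    in >> f ;
    return f ;
}
```

Under those assumptions `fromText( toText( x ) ) == x` should hold for every normal `x`; the sketch makes no attempt to handle negative zero, infinities or NaNs.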
Understanding the term 'serialize' as 'store binary data on
text-based format', I would have dropped the requirement for
human readability and stored the HEX pattern of the float.

Why? The whole point of using text instead of binary is human
readability. Of course, human readability isn't a binary
condition; there's a range for something like:
1.9
1.899999976
15938355*2^-23
3FF33333
All of the above are output from the same float, initialized
with the floating point literal 1.9. The first is the default
format, the second using fixed and a precision of 9, the third
using:

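std::string
asInts( float const& f )
{
    // View the float's bits as an unsigned (assumes both are 32 bits).
    unsigned const& p = reinterpret_cast<unsigned const&>(f);
    int sign = (p & 0x80000000) == 0 ? 1 : -1;
    // Remove the bias (127) and scale for an integral 24-bit mantissa (23).
    int exponent = ((p & 0x7F800000) >> 23) - 127 - 23;
    // Restore the implicit leading bit (valid for normalized values only).
    int mantissa = (p & 0x007FFFFF) | 0x00800000;
    // Strip trailing zero bits so that the mantissa is odd.
    while ( (mantissa & 1) == 0 ) {
        mantissa >>= 1;
        ++ exponent;
    }
    std::ostringstream result ;
    result << mantissa*sign << "*2^" << exponent ;
    return result.str() ;
}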

and the last a simple hex dump of the bytes. The last two are
an exact representation, and the last three guarantee round trip
conversion.
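
For reference, a sketch of how the first two formats can be produced with stream manipulators (the function names are hypothetical). One caveat: `std::fixed` with a precision of 9 means 9 digits after the decimal point, not 9 significant digits, so for very small magnitudes the round trip needs scientific format instead.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Default stream format: at most 6 significant digits.
std::string asDefault( float f )
{
    std::ostringstream out ;
    out << f ;
    return out.str() ;
}

// Fixed format: exactly 9 digits after the decimal point.
std::string asFixed9( float f )
{
    std::ostringstream out ;
    out << std::fixed << std::setprecision( 9 ) << f ;
    return out.str() ;
}

// Scientific format: 1 digit before and 8 after the point,
// i.e. 9 significant digits whatever the magnitude.
std::string asScientific9( float f )
{
    std::ostringstream out ;
    out << std::scientific << std::setprecision( 8 ) << f ;
    return out.str() ;
}
```

With a float initialized from 1.9, `asDefault` gives 1.9 and `asFixed9` gives 1.899999976, as in the list above. For a value like 1.0e-10f, `asFixed9` gives only zeros, while `asScientific9` still carries enough digits to read the value back.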
Something like

float a = 0.3;
ss << std::hex() << a << std::endl;
ss >> std::hex() >> a;
should go a long way to meet your requirements - EXCEPT
for the 'human readability' issue.

And the fact that std::hex doesn't affect floating point
output:). To get the output I have above, I used:
ss.setf( std::ios::hex, std::ios::basefield );
ss.setf( std::ios::uppercase );
ss << std::setw(8) << reinterpret_cast<unsigned const&>(a);
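
A variant sketch that copies the bits with `std::memcpy` instead of `reinterpret_cast`, which sidesteps aliasing questions, and that also reads the value back. It assumes a 32-bit float and a C++11 compiler; `toHex` and `fromHex` are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>
#include <iomanip>
#include <sstream>
#include <string>

static_assert( sizeof( float ) == sizeof( std::uint32_t ),
               "this sketch assumes a 32-bit float" ) ;

// Dumps the bit pattern of f as 8 uppercase hex digits.
std::string toHex( float f )
{
    std::uint32_t bits = 0 ;
    std::memcpy( &bits, &f, sizeof bits ) ;
    std::ostringstream out ;
    out << std::hex << std::uppercase
        << std::setfill( '0' ) << std::setw( 8 ) << bits ;
    return out.str() ;
}

// Rebuilds the float from a hex dump produced by toHex.
float fromHex( std::string const& s )
{
    std::istringstream in( s ) ;
    std::uint32_t bits = 0 ;
    in >> std::hex >> bits ;
    float f = 0.0f ;
    std::memcpy( &f, &bits, sizeof f ) ;
    return f ;
}
```

Because the bits are copied unchanged, this form also preserves the sign of a negative zero, at the cost of not being readable by a human.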
 
James Kanze

OK, so let's play with numbers:
The number D of digits needed to represent a number N in
base 10 notation is
D = ceil(log10(N)).
For a single-precision floating-point number, there are some
20 bits in the mantissa, so the number of digits D to represent
the mantissa becomes
D = ceil(log10(2^20))
= ceil(20*log10(2))
So one needs *at*least* 60 digits to represent the number.

Exactly. There's no requirement for an exact representation,
however; only one that guarantees a round trip conversion. For
that, 9 digits suffice, see the section "Binary to Decimal
Conversion" in http://www.validlab.com/goldberg/paper.pdf.
(Until you've read and understood this paper, you have no
business using floating point in a program.)
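
The figure 9 follows from the usual bound for a 24-bit significand, ceil(1 + 24*log10(2)) = ceil(8.22...) = 9. A rough sketch for checking it empirically, assuming IEEE float and correctly rounded `snprintf`/`strtof` (the function names are hypothetical):

```cpp
#include <cmath>
#include <cstdio>
#include <cstdlib>

// True if f survives float -> decimal text -> float when the text
// carries the given number of significant digits.
bool roundTrips( float f, int digits )
{
    char buffer[ 64 ] ;
    std::snprintf( buffer, sizeof buffer, "%.*e", digits - 1, f ) ;
    return std::strtof( buffer, nullptr ) == f ;
}

// Counts the floats in [low, high) that fail to round trip.
int countFailures( float low, float high, int digits )
{
    int failures = 0 ;
    for ( float f = low ; f < high ; f = std::nextafter( f, high ) ) {
        if ( ! roundTrips( f, digits ) ) {
            ++ failures ;
        }
    }
    return failures ;
}
```

The interval [1000, 1001) is a convenient test case: it contains 16384 floats but only about 10000 decimals of 8 significant digits, so 8 digits cannot distinguish them all, whereas with 9 digits `countFailures` should return 0.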
 
