A question: Convert double to string

Aman JIANG

hi

I need to do this (convert double to string) in a fast, safe and
portable way. Is there any way to do this?

Other than the following:
1. C++ I/O streams, stringstream (and boost::lexical_cast)
2. sprintf (and its relatives)
3. any nonstandard C/C++ functions

PS: if any implementation of I/O streams is faster than
sprintf, please tell me. Thank you.
 
Rolf Magnus

Aman said:
hi

I need to do this (convert double to string) fast, safe and
portable. Is there any way to do this ?

Except the ways following:
1. C++ I/O stream, stringstream (and boost::lexical_cast)
2. sprintf (and its brothers)
3. any nonstandard C/C++ functions

How is that supposed to be possible? You don't want any standard way of
doing it and you don't want any non-standard way of doing it. I guess that
limits the choices a bit.
 
Aman JIANG

How is that supposed to be possible? You don't want any standard way of
doing it and you don't want any non-standard way of doing it. I guess that
limits the choices a bit.

Thanks.

I suppose it is possible because I am asking a question. I have this
question because:
A. sprintf is hard to use safely
B. iostream is pretty slow. It is hard to believe it can be used in a
program that performs a very large number of conversions; it does a lot
of unwanted work just to convert one numerical value.

So I want to find a better way.
(sorry for my poor English)
 
Ivan Vecerina

: > Aman JIANG wrote:
: > > hi
: >
: > > I need to do this (convert double to string) fast, safe and
: > > portable. Is there any way to do this ?
: >
: > > Except the ways following:
: > > 1. C++ I/O stream, stringstream (and boost::lexical_cast)
: > > 2. sprintf (and its brothers)
: > > 3. any nonstandard C/C++ functions
: >
: > How is that supposed to be possible? You don't want any standard
: > way of doing it and you don't want any non-standard way of doing
: > it. I guess that limits the choices a bit.
:
: Thanks.
:
: I suppose it is possible because i am asking a question.
: And I have this question because:
: A. sprintf was hard to be safe
It is. Safer variants of sprintf can help, as can writing
directly to a file. But ultimately, it might make sense
to check the range and validity of the converted number.

: B. iostream was pretty slow, it's hard to believe it can be used
: on a program what have lots and lots of operation, It has many
: unwanted actions just for a numerical value conversion.
The stringstream approach has its overheads, agreed, but is safe.

: So, i want to find a better way.
: (sorry for my poor english)
If iostream is too slow on your platform for your application, the
performance-sensitive way to go is the printf family of functions.
Because you might want to use some platform-specific extension
to make it safe, it could be a good idea to wrap it behind
your own function - using the buffer-allocation and format-
specification policy you want.
As to rewriting or adapting a dtoa function for your application,
I doubt it would be a sensible priority in terms of performance
optimization (there should be more important things to tune...)

hth -Ivan
 
Aman JIANG

If iostream is too slow on your platform for your application, the
performance-sensitive way to go is the printf family of functions.
Because you might want to use some platform-specific extension

I don't want to use any 'platform-specific extension'; that would not be
cross-platform code :( Otherwise maybe I could research the floating point
format on all platforms...
to make it safe, it could be a good idea to wrap it behind
your own function - using the buffer-allocation and format-
specification policy you want.
As to rewriting or adapt a dtoa function into your application,
I doubt it would be a sensible priority in terms of performance
optimization (there should be more important things to tune...)

It's hard. I have no idea how to determine the size of the buffer. For a
double it can be 316 bytes on my platforms, and I don't know the
sizes on other platforms...
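
For the "%f" format, one portable way to bound the buffer size is to derive it from std::numeric_limits<double> rather than hard-coding a number. The following is a minimal sketch, assuming the "C" locale and a precision chosen by the caller; the helper name fixed_format_bound is made up for illustration and is not a standard function.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Upper bound on the bytes "%.<precision>f" can produce for any finite
// double, including the terminating '\0'. Hypothetical helper.
inline std::size_t fixed_format_bound(int precision) {
    assert(precision >= 0);
    // The largest finite double has max_exponent10 + 1 decimal digits
    // in front of the decimal point.
    const std::size_t int_digits =
        static_cast<std::size_t>(std::numeric_limits<double>::max_exponent10) + 1;
    return 1                                    // sign
         + int_digits                           // digits before the point
         + 1                                    // decimal point
         + static_cast<std::size_t>(precision)  // digits after the point
         + 1;                                   // terminating '\0'
}
```

With IEEE 754 doubles (max_exponent10 == 308) and the default precision of 6 this gives 318: the 316 characters mentioned above, plus one for a sign and one for the terminator.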
 
Ben Rudiak-Gould

Aman said:
I need to do this (convert double to string) fast, safe and
portable. Is there any way to do this ?

You could roll your own printing code, something like this:

1. Test for NaN with x != x
2. Test for infinity by comparing with DBL_MAX (in <cfloat>)
3. Split into integer and fractional parts with modf (<cmath>)
4. Repeatedly div-mod the integer part to extract the digits
left of the decimal point, and repeatedly multiply and modf
the fractional part to get the digits to the right. It will
probably be faster to work in base 10^9 rather than 10.

-- Ben
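
The steps above can be sketched roughly as follows. This is only an illustration, not a correct general-purpose converter: it works in base 10 rather than 10^9, truncates instead of rounding, compares against DBL_MAX because the argument is a double, and makes no attempt to guarantee that every digit is exact for very large values. The name naive_dtoa and the frac_digits parameter are assumptions for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>
#include <cmath>
#include <string>

// Naive fixed-notation conversion; truncates after frac_digits digits.
std::string naive_dtoa(double x, int frac_digits) {
    assert(frac_digits >= 0);
    if (x != x) return "nan";                 // step 1: NaN is the only value != itself
    std::string out;
    if (x < 0) { out += '-'; x = -x; }
    if (x > DBL_MAX) return out + "inf";      // step 2: larger than any finite double

    double int_part;
    double frac = std::modf(x, &int_part);    // step 3: split the value

    // Step 4a: peel decimal digits off the integer part, least significant first.
    std::string digits;
    do {
        double d = std::fmod(int_part, 10.0);
        digits += static_cast<char>('0' + static_cast<int>(d));
        int_part = (int_part - d) / 10.0;     // may round for huge values
    } while (int_part >= 1.0);
    std::reverse(digits.begin(), digits.end());
    out += digits;

    // Step 4b: multiply the fraction by 10 and take the integer part each time.
    if (frac_digits > 0) {
        out += '.';
        for (int i = 0; i < frac_digits; ++i) {
            double d;
            frac = std::modf(frac * 10.0, &d);
            out += static_cast<char>('0' + static_cast<int>(d));
        }
    }
    return out;
}
```

For example, naive_dtoa(123.25, 3) yields "123.250".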
 
Ivan Vecerina

: On Sep 22, 4:59 pm, "Ivan Vecerina"
: > If iostream is too slow on your platform for your application, the
: > performance-sensitive way to go is the printf family of functions.
: > Because you might want to use some platform-specific extension
:
: I don't want to use any 'platform-specific extension', it isn't cross-
: platform code :( Otherwise maybe I can research the floating point
: format on all platforms...
:
: > to make it safe, it could be a good idea to wrap it behind
: > your own function - using the buffer-allocation and format-
: > specification policy you want.
: > As to rewriting or adapt a dtoa function into your application,
: > I doubt it would be a sensible priority in terms of performance
: > optimization (there should be more important things to tune...)
:
: It's hard. I have no idea to confirm the size of the buffer. for
: double, it can be 316 bytes, on my platforms, and i don't know the
: sizes on other platforms...

If you want a more precise answer than what you've obtained so far,
you need to also say more about what you want:
- what allocation strategy would you like to rely on for the
returned character buffer?
- what output formats do you want to support?
I personally use a wrapper over platform-specific variants
of sprintf, which returns the result as an std::string
(using a stack buffer for the initial output if possible, and
a dynamically allocated buffer for larger outputs - relying
e.g. on _vscprintf to determine buffer size).
But if I am writing to a file, fprintf does the job well.
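
A wrapper of the kind described here might look like the sketch below. It is not the wrapper used by the poster: it swaps the Microsoft-specific _vscprintf for vsnprintf with C99 semantics (available as std::vsnprintf since C++11), whose return value gives the required length. The function name format_to_string and the 64-byte stack buffer are assumptions for the example.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// printf-style formatting into a std::string: stack buffer first,
// heap buffer only when the output does not fit.
std::string format_to_string(const char* fmt, ...) {
    assert(fmt != 0);
    char stack_buf[64];
    va_list args;
    va_start(args, fmt);
    va_list args_copy;
    va_copy(args_copy, args);  // a va_list may only be traversed once

    int n = std::vsnprintf(stack_buf, sizeof stack_buf, fmt, args);
    va_end(args);

    std::string result;
    if (n >= 0) {
        const std::size_t len = static_cast<std::size_t>(n);
        if (len < sizeof stack_buf) {
            result.assign(stack_buf, len);       // fast path, no allocation here
        } else {
            std::vector<char> heap_buf(len + 1); // room for the terminator
            std::vsnprintf(heap_buf.data(), heap_buf.size(), fmt, args_copy);
            result.assign(heap_buf.data(), len);
        }
    }
    va_end(args_copy);
    return result;                               // empty on formatting error
}
```

As with printf itself, nothing checks that the arguments match the format string, so the wrapper only addresses the buffer-size half of the safety problem.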

If you really want a custom solution, you should be able to find
some open source implementation to start from... look
for dtoa() or fcvt() as common low-level function names.
 
Kai-Uwe Bux

Aman said:
I don't want to use any 'platform-specific extension', it isn't cross-
platform code :( Otherwise maybe I can research the floating point
format on all platforms...


It's hard. I have no idea to confirm the size of the buffer. for
double, it can be 316 bytes, on my platforms, and i don't know the
sizes on other platforms...

Do you have any platform independent method (however involved) to compute an
upper bound for the length of the buffer at runtime? If so, you could use a
trick like the following:

#include <string>
#include <cassert>

template < unsigned N >
struct buffer_length {
  static unsigned const value = 1 + buffer_length<N-1>::value * 2;
};

template <>
struct buffer_length<1> {
  static unsigned const value = 1;
};

template <>
struct buffer_length<0> {
  static unsigned const value = 1;
};

template < unsigned N >
std::string function_needing_a_buffer ( double d ) {
  char buffer [ buffer_length<N>::value ];
  // do something
  buffer[0] = 'c';
  buffer[1] = 0;
  return ( std::string( buffer ) );
}


typedef std::string(* function_ptr )( double );

function_ptr get_function ( unsigned needed_size ) {
  unsigned N = 0;
  while ( needed_size != 0 ) {
    ++N;
    needed_size /= 2;
  }
  switch ( N ) {
    case 0 : { return &function_needing_a_buffer<0>; }
    case 1 : { return &function_needing_a_buffer<1>; }
    case 2 : { return &function_needing_a_buffer<2>; }
    case 3 : { return &function_needing_a_buffer<3>; }
    case 4 : { return &function_needing_a_buffer<4>; }
    case 5 : { return &function_needing_a_buffer<5>; }
    case 6 : { return &function_needing_a_buffer<6>; }
    case 7 : { return &function_needing_a_buffer<7>; }
    case 8 : { return &function_needing_a_buffer<8>; }
    case 9 : { return &function_needing_a_buffer<9>; }
    case 10 : { return &function_needing_a_buffer<10>; }
    // ...
  }
  assert( false );
}

std::string true_function ( double d ) {
  static function_ptr the_ptr = get_function( 325 );
  // in real life, use some magic to determine the needed
  // buffer size
  return ( the_ptr(d) );
}

#include <iostream>

int main ( void ) {
  std::cout << true_function( 1 ) << '\n';
}


Note that this executes the code to select the right buffer size only once
and not every time a double is converted.


Since speed seems to be of the essence in your application, using the
sprintf family is probably the way to go. It is usually heavily optimized
and should be almost impossible to beat. I would concentrate on writing a
wrapper around it that makes it safe and convenient to use without
compromising efficiency.


Best

Kai-Uwe Bux
 
Aman JIANG

You could roll your own printing code, something like this:

1. Test for NaN with x != x
2. Test for infinity by comparing with DBL_MAX (in <cfloat>)
3. Split into integer and fractional parts with modf (<cmath>)
4. Repeatedly div-mod the integer part to extract the digits
left of the decimal point, and repeatedly multiply and modf
the fractional part to get the digits to the right. It will
probably be faster to work in base 10^9 rather than 10.

-- Ben

Thank you.
That's valuable; I'll write some code and run some tests.
By the way, what does 'NaN' mean, please?
 
Aman JIANG

If you want a more precise answer than what you've obtained so far,
you need to also say more about what you want:
- what allocation strategy would you like to rely on for the
returned character buffer?

Maybe stack, maybe heap, I have no idea now. I think this isn't the
point.
Anyway, It must be fast and safe...
- what output formats do you want to support?

Maybe chars. It's easy to convert to other formats and other character
sets.
I personally use a wrapper over platform-specific variants
of sprintf, which returns the result as an std::string
(using a stack buffer for the initial output if possible, and
a dynamically allocated buffer for larger outputs - relying
e.g. on _vscprintf to determine buffer size).
But if I am writing to a file, fprintf does the job well.

That's good. Sometimes the stack is enough.
But I do not know what '_vscprintf' is. I have never seen it...
If you really want a custom solution, you should be able to find
some open source implementation to start from... look
for dtoa() or fcvt() as common low-level function names.

I'll try.

Thank you :)
 
Kai-Uwe Bux

Ben said:
You could roll your own printing code, something like this:

1. Test for NaN with x != x
2. Test for infinity by comparing with DBL_MAX (in <cfloat>)
3. Split into integer and fractional parts with modf (<cmath>)
4. Repeatedly div-mod the integer part to extract the digits
left of the decimal point, and repeatedly multiply and modf
the fractional part to get the digits to the right. It will
probably be faster to work in base 10^9 rather than 10.

a) The numerical analysis involved to make sure that all digits are correct
is far from trivial.

b) Performancewise, it will be close to impossible to beat the sprintf
family. Those library functions are heavily optimized and crucial parts are
probably coded in assembler.


I would suggest to wrap snprintf within a nice safe layer:

template < typename A >
bool strprintf ( std::string & buffer, char const * format, A a );

template < typename A, typename B >
bool strprintf ( std::string & buffer, char const * format, A a, B b );

...

When passing a buffer whose capacity is high enough, the function should not
need dynamic allocation. Moreover, using the contiguity requirement for
std::string from the next revision of the standard, already implemented by
most (all?) STL-implementations, one does not even need to use an internal
buffer. Something like:

#include <cstdio>
#include <string>

template < typename A >
bool strprintf ( std::string & buffer, char const * format, A a ) {
  while ( true ) {
    buffer.append( 1, '\0' ); // make room for the terminating 0-char
    int length_needed =
      snprintf( &buffer[0], buffer.size(), format, a );
    if ( length_needed < 0 ) {
      return ( false );
    }
    // snprintf truncates unless length_needed < buffer.size()
    bool fits =
      static_cast<std::string::size_type>( length_needed ) < buffer.size();
    buffer.resize( length_needed );
    if ( fits ) {
      return ( true );
    }
  }
}

Assuming that std::string actually holds a buffer that always keeps a 0-char
at the end, one could even do:

template < typename A >
bool strprintf ( std::string & buffer, char const * format, A a ) {
  while ( true ) {
    int old_length = static_cast<int>( buffer.size() );
    // relies on buffer[old_length] being a writable 0-char
    int length_needed =
      snprintf( &buffer[0], old_length + 1, format, a );
    if ( length_needed < 0 ) {
      return ( false );
    }
    buffer.resize( length_needed );
    if ( length_needed <= old_length ) {
      return ( true );
    }
  }
}

However, I think that will have undefined behavior even according to
the current working draft.

In order to make double calls to snprintf as rare as possible, one could
also resize the buffer to its current capacity (beware, this also has
undefined behavior but I expect it to be portable).

template < typename A >
bool strprintf ( std::string & buffer, char const * format, A a ) {
  while ( true ) {
    buffer.resize( buffer.capacity() ); // use all storage already allocated
    int old_length = static_cast<int>( buffer.size() );
    int length_needed =
      snprintf( &buffer[0], old_length + 1, format, a );
    if ( length_needed < 0 ) {
      return ( false );
    }
    buffer.resize( length_needed );
    if ( length_needed <= old_length ) {
      return ( true );
    }
  }
}

Anyway, to print millions of doubles, I would try a loop like this:

std::string buffer;
while ( whatever ) {
strprintf( buffer, "%f", my_function() );
std::cout << buffer << '\n';
}

and measure which one works best (according to my measurements, the last
version is slightly faster than the other two).
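
One way to take such measurements is a small timing harness like the sketch below. It is not the benchmark referred to above; it compares a plain snprintf call with std::ostringstream, and the names seconds_for, via_snprintf, via_ostringstream and g_total_chars are made up for the example. Results depend heavily on the library, compiler flags and input values, so it should be run on the target platform.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <ios>
#include <sstream>
#include <string>

// Accumulates output lengths so the compiler cannot discard the conversions.
static std::size_t g_total_chars = 0;

void via_snprintf(double d) {
    char buf[512];  // ample for "%f" of an IEEE 754 double
    int n = std::snprintf(buf, sizeof buf, "%f", d);
    if (n > 0) g_total_chars += static_cast<std::size_t>(n);
}

void via_ostringstream(double d) {
    std::ostringstream os;
    os << std::fixed << d;
    g_total_chars += os.str().size();
}

// Calls convert(i * 0.5) for i in [0, iterations) and returns elapsed seconds.
template <typename F>
double seconds_for(F convert, int iterations) {
    assert(iterations > 0);
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) convert(i * 0.5);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling seconds_for(via_snprintf, 1000000) and seconds_for(via_ostringstream, 1000000) and printing both results gives a first impression; repeating the runs and using an optimized build makes the comparison more meaningful.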


Best

Kai-Uwe Bux
 
James Kanze

a) The numerical analysis involved to make sure that all digits are correct
is far from trivial.
b) Performancewise, it will be close to impossible to beat the sprintf
family. Those library functions are heavily optimized and crucial parts are
probably coded in assembler.

That's very implementation dependent. It's been a long time
since I worked on the C library, but at least then, the sources
I saw were designed to be portable. In theory, code designed to
handle just one specific floating point representation (that on
your machine), and one particular text representation (maybe the
equivalent of %.17E) could be faster.

In practice, of course, you'd have to be pretty experienced in
the domain of numerics just to handle your point a), and it would
take a real expert, and a lot of work, to improve on either sprintf
or std::ostringstream, assuming a reasonably good implementation in
both cases.
 
Aman JIANG

I would suggest to wrap snprintf within a nice safe layer:

Thank you :)
This is a smart new way, I think it's a good idea,
I will spend one or two hours to analyze and test it.
 
Ivan Vecerina

: On Sep 23, 2:37 am, "Ivan Vecerina"
: > If you want a more precise answer than what you've obtained so far,
: > you need to also say more about what you want:
: > - what allocation strategy would you like to rely on for the
: > returned character buffer?
:
: Maybe stack, maybe heap, I have no idea now.
: I think this isn't the point.

It can be. If you are ready to sacrifice generality
for speed, you could declare a function fast_dtoa as:
struct FloatStr { char buf[16]; };
FloatStr fast_dtoa(double d);
It is likely to beat most string-returning strategies,
but has its limitations.

Returning an std::string is more flexible, but has a
performance cost. Passing a buffer as an std::string&
can be relatively good performance-wise, but is
less easy to use.
Directly using fprintf on a FILE* is something
you might want to consider...


: Anyway, It must be fast and safe...
:
: > - what output formats do you want to support?
:
: Maybe chars. It's easy to convert to other formats
: and other character sets.

I was talking about floating-point formats.
If you only need to support fixed-point formats,
a specialized implementation might outrun sprintf -
especially if only a limited range of values needs
to be exported.
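
As an illustration of such a specialized routine, the sketch below formats a value with exactly three decimals using integer arithmetic, and refuses anything outside a limited range. The function name fixed3, the three-decimal format and the 1e9 range limit are all assumptions chosen for the example; whether this outruns sprintf on a given platform would have to be measured.

```cpp
#include <cmath>
#include <cstddef>
#include <string>

// Writes x with exactly 3 decimals into out.
// Returns false for NaN, infinity, or |x| >= 1e9.
bool fixed3(double x, std::string& out) {
    if (!(std::fabs(x) < 1e9)) return false;  // comparison is false for NaN too
    // Scale to thousandths and round to the nearest integer. The
    // multiplication itself can round, so exact ties may differ from printf.
    long long scaled = std::llround(x * 1000.0);
    const bool negative = scaled < 0;
    unsigned long long u =
        static_cast<unsigned long long>(negative ? -scaled : scaled);

    char buf[24];
    std::size_t pos = sizeof buf;
    for (int i = 0; i < 3; ++i) {             // three fractional digits
        buf[--pos] = static_cast<char>('0' + u % 10);
        u /= 10;
    }
    buf[--pos] = '.';
    do {                                      // at least one integer digit
        buf[--pos] = static_cast<char>('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (negative) buf[--pos] = '-';
    out.assign(buf + pos, sizeof buf - pos);
    return true;
}
```

For example, fixed3(-2.5, s) stores "-2.500" in s, while fixed3(1e12, s) returns false and leaves s untouched.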


To paraphrase a popular project management saying:
"fast, safe, general - pick two"

Standard library functions ought to be general:
ostream is general and safe, but slow.
sprintf is general and pretty fast, but unsafe.

If you want a general solution, it will be tough to
beat at its game the implementation of sprintf available
on your platform: they are usually pretty mature.

Beating the ostream implementation you use is likely
to be easier. But the right optimization will depend
on context, and on the trade-offs that one is willing
to make.


Cheers -Ivan
 
Aman JIANG

: > If you want a more precise answer than what you've obtained so far,
: > you need to also say more about what you want:
: > - what allocation strategy would you like to rely on for the
: > returned character buffer?
:
: Maybe stack, maybe heap, I have no idea now.
: I think this isn't the point.

It can be. If you are ready to sacrifice generality
for speed, you could declare a function fast_dtoa as:
struct FloatStr { char buf[16]; };
FloatStr fast_dtoa(double d);
It is likely to beat most string-returning strategies,
but has its limitations.

Returning an std::string is more flexible, but has a
performance cost. Passing a buffer as an std::string&
can be relatively good performance-wise, but is
less easy to use.
Directly using fprintf on a FILE* is something
you might want to consider...

At least I think it isn't the hard point...
: Anyway, It must be fast and safe...
:
: > - what output formats do you want to support?
:
: Maybe chars. It's easy to convert to other formats
: and other character sets.

I was talking about floating-point formats.
If you only need to support fixed-point formats,
a specialized implementation might outrun sprintf -
especially if only a limited range of values needs
to be exported.

Maybe this is a hot potato. It will depend on the way,
I guess.
To paraphrase a popular project management saying:
"fast, safe, general - pick two"

Standard library functions ought to be general:
ostream is general and safe, but slow.
sprintf is general and pretty fast, but unsafe.

If you want a general solution, it will be tough to
beat at its game the implementation of sprintf available
on your platform: they are usually pretty mature.

This is a classic problem :)
Beating the ostream implementation you use is likely
to be easier. But the right optimization will depend
on context, and on the trade-offs that one is willing
to make.

I don't want to go to an extreme; I would like to find a balanced
approach...

Thank you.
 
Ben Rudiak-Gould

Kai-Uwe Bux said:
a) The numerical analysis involved to make sure that all digits are correct
is far from trivial.

Yes, you're right. I don't know what I was thinking when I wrote that.
I would suggest to wrap snprintf within a nice safe layer:

snprintf would have been my first suggestion except that it isn't standard C++.

-- Ben
 
Aman JIANG

Yes, you're right. I don't know what I was thinking when I wrote that.


snprintf would have been my first suggestion except that it isn't standard C++.

Yes, I have this problem. But I think that 'vsnprintf' of standard C++
with a simple wrapper can provide the same functionality, right?
 
