A better way of formatting strings?

Ricky65

I was thinking about string formatting after reading this article by
Herb Sutter here:
http://www.gotw.ca/publications/mill19.htm

I thought to myself "Why haven't they overloaded the << operator for
the basic_string class?" like with the iostream. I don't think I'm the
first to have thought of this. I assume performance problems are the
reason this is not done.

This would eliminate the need to make a temporary variable to hold the
formatted data like a stringstream does and the problems with the
sprintf family. I know a lot of people shun stringstreams because they
are unacceptably slow in some cases.

For example, we could do something like this:
int cat_len = 120;
int mouse_len = 29;

std::string myformattedstring << "The cat is " << cat_len
    << "cm tall and the mouse is " << mouse_len << "cm tall.";

As you can see, this would format directly to the string. This would
be both type safe and length safe and the programmer wouldn't have to
allocate a temporary variable for a stream. However, I'm not sure how
efficient it would be. I'm intrigued to know if this can be done and,
if not, the reasons why.

Thanks

Ricky Marcangelo
 
Victor Bazarov

I was thinking about string formatting after reading this article by
Herb Sutter here:
http://www.gotw.ca/publications/mill19.htm

I thought to myself "Why haven't they overloaded the << operator for
the basic_string class?" like with the iostream. I don't think I'm the
first to have thought of this. I assume performance problems are the
reason this is not done.

This would eliminate the need to make a temporary variable to hold the
formatted data like a stringstream does and the problems with the
sprintf family. I know a lot of people shun stringstreams because they
are unacceptably slow in some cases.

For example, we could do something like this:
int cat_len = 120;
int mouse_len = 29;

std::string myformattedstring << "The cat is " << cat_len
    << "cm tall and the mouse is " << mouse_len << "cm tall.";

This syntax is invalid. You either declare an object (and initialize
it) or manipulate it using operator<<. You can't do both in the same
statement.

As you can see, this would format directly to the string. This would
be both type safe and length safe and the programmer wouldn't have to
allocate a temporary variable for a stream. However, I'm not sure how
efficient it would be. I'm intrigued to know if this can be done and,
if not, the reasons why.

It cannot, see above.

V
 
Ricky65

This syntax is invalid.  You either declare an object (and initialize
it) or manipulate it using operator<<.  You can't do both in the same
statement.


It cannot, see above.

V

Sorry, I meant

std::string myformattedstring;

myformattedstring << "The cat is " << cat_len
    << "cm tall and the mouse is " << mouse_len << "cm tall.";
 
Michael Doubez

I was thinking about string formatting after reading this article by
Herb Sutter here: http://www.gotw.ca/publications/mill19.htm

I thought to myself "Why haven't they overloaded the << operator for
the basic_string class?" like with the iostream.

Who is "they"? The standards committee?

I don't think I'm the
first to have thought of this. I assume performance problems are the
reason this is not done.

Doing so would bypass the formatting facilities.

The language definition is geared toward the general case. If you have
specific needs, you are expected to roll your own. This avoids
cluttering the standard with use cases that you can code yourself or
that libraries can provide.

This would eliminate the need to make a temporary variable to hold the
formatted data like a stringstream does and the problems with the
sprintf family. I know a lot of people shun stringstreams because they
are unacceptably slow in some cases.

I expect that using a temporary stringstream is more efficient than
appending into a string.

For example, we could do something like this:
int cat_len = 120;
int mouse_len = 29;

std::string myformattedstring << "The cat is " << cat_len
    << "cm tall and the mouse is " << mouse_len << "cm tall.";

As you can see, this would format directly to the string. This would
be both type safe and length safe and the programmer wouldn't have to
allocate a temporary variable for a stream. However, I'm not sure how
efficient it would be. I'm intrigued to know if this can be done and,
if not, the reasons why.

As you wrote it, it is impossible: you cannot mix variable definition
and initialization this way.

Something possible is:
std::string myformattedstring = string_formater()<<"The cat is "
<<...;

Some very fast formatting constructs based on templates can be designed
this way (especially if the maximum size of each element can be
computed beforehand). But they have some limitations.
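
For what it's worth, here is a minimal sketch of what such a string_formater() helper might look like. The class is hypothetical (only the name comes from the line above), and it simply wraps a std::ostringstream, so it shows the syntax rather than any of the fast template techniques. The describe() function is likewise just an illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: collects everything streamed into it and
// converts to std::string at the end of the expression.
class string_formater {
public:
    // Accept any type that std::ostream already knows how to print.
    template <typename T>
    string_formater& operator<<(const T& value) {
        buffer_ << value;
        return *this;
    }

    // Implicit conversion lets the whole chain initialize a std::string.
    operator std::string() const { return buffer_.str(); }

private:
    std::ostringstream buffer_;
};

// Illustrative use: builds the sentence from the original post.
std::string describe(int cat_len, int mouse_len) {
    return string_formater() << "The cat is " << cat_len
                             << "cm tall and the mouse is " << mouse_len
                             << "cm tall.";
}
```

Because operator<< here is a template taking const T&, stream manipulators such as std::endl will not compile with this sketch; supporting them would need extra overloads.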

If you want to see an example, you can look up the FastFormat library
by Matthew Wilson and its shims technique.


Concerning the form, I'd prefer:
std::string myformattedstring = ("The cat is ",fmt::_1,
"cm tall and the mouse is ",fmt::_2,"cm tall.")
, cat_len, mouse_len;
 
Victor Bazarov

Sorry, I meant

std::string myformattedstring;

myformattedstring << "The cat is " << cat_len
    << "cm tall and the mouse is " << mouse_len << "cm tall.";

That is *usually* done with stringstreams:

std::ostringstream myformattedstream;
myformattedstream << "The cat is in the hat";

std::string myformattedstring = myformattedstream.str();

There is no need to pollute std::string with the functionality that
already exists in another class.

V
 
Michael Doubez

That does not lend itself too well to localization; consider boost.format
instead.

It depends on what you want to achieve. Boost.Format is great for
complex formats (even tabulation, IIRC), while others won't need
internationalisation/localisation but prefer high speed (as in
logging or feeding another program).

See:
http://accu.org/index.php/journals/1539
 
James Kanze

On 24/03/2011 15:34, Michael Doubez wrote:
[...]
Concerning the form, I'd prefer:
std::string myformattedstring = ("The cat is ",fmt::_1,
"cm tall and the mouse is ",fmt::_2,"cm tall.")
, cat_len, mouse_len;

I'd be interested in seeing how that could be implemented
without varargs, and the resulting loss of type safety.
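
One possible direction, sketched under some loud assumptions: this is not the comma-operator syntax quoted above but a plain function call, it relies on C++11 variadic templates rather than C varargs, and the names to_text and format_positional and the %1..%9 placeholder syntax are all made up for illustration. Each argument is converted through operator<<, so an unprintable type is a compile error rather than undefined behaviour.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Convert one argument to text through operator<<; the compiler checks
// that the type is printable.
template <typename T>
std::string to_text(const T& value) {
    std::ostringstream os;
    os << value;
    return os.str();
}

// Hypothetical function: replaces %1..%9 in pattern with the matching
// argument. Placeholders without a matching argument are left as-is.
template <typename... Args>
std::string format_positional(const std::string& pattern, const Args&... args) {
    const std::vector<std::string> texts = {to_text(args)...};
    std::string result;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        const bool has_next = i + 1 < pattern.size();
        if (c == '%' && has_next && pattern[i + 1] >= '1' && pattern[i + 1] <= '9') {
            const std::size_t index = static_cast<std::size_t>(pattern[i + 1] - '1');
            if (index < texts.size()) {
                result += texts[index];
                ++i;  // skip the digit that named the argument
                continue;
            }
        }
        result += c;
    }
    return result;
}
```

A call would look like format_positional("The cat is %1cm tall and the mouse is %2cm tall.", cat_len, mouse_len). Since the placeholders are numbered, a translated pattern can reorder them without touching the call site.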

That does not lend itself too well to localization; consider boost.format
instead.

boost::format solves one small aspect of localization, but in
practice, it's rarely enough. I had a class which did the same
thing, long before there was boost, and I gradually stopped
maintaining it, because it didn't solve any real problems
satisfactorily. (And unlike boost::format, it didn't really
work with manipulators. And you really want user defined
manipulators to define semantic markup.) As soon as more than
one language is involved, you almost always have to have a
separate dll, with specially written code, for each language.
 
Ricky65

No problem if you really want this, just add a little helper function.
Note that this essentially duplicates stringstream functionality for
another class (std::string) actually meant for other purposes.

#include <iostream>
#include <sstream>
#include <string>

// Appends the stream-formatted form of x to s and returns s for chaining.
template<typename T>
std::string& operator<<(std::string& s, const T& x) {
   std::ostringstream os;
   os << x;
   s += os.str();
   return s;
}

int main() {
   std::string myformattedstring;
   int a = 10;
   double b = 3.5;
   myformattedstring << "The cat is " << a <<
       "cm tall and the mouse is " << b << "cm tall.";
   std::cout << myformattedstring << "\n";
}

Thanks for this example. This was what I had in mind in my original
post. However, am I right in saying that this wouldn't be any faster
than using a normal stringstream? If so, I may as well stick to good old
stringstreams.
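
For comparison, here is a rough sketch of a variant that constructs no stream at all, since the helper above builds a fresh ostringstream for every value inserted. Everything in it is an assumption for illustration: the direct_append namespace and describe_direct function are invented names, std::to_string needs C++11, and only a few types are covered. Whether it is actually faster is untested and would need measuring on the compiler and library in question.

```cpp
#include <string>

namespace direct_append {  // hypothetical namespace for this sketch

// Text is appended as-is; no stream object is constructed.
inline std::string& operator<<(std::string& s, const char* text) {
    return s += text;
}
inline std::string& operator<<(std::string& s, const std::string& text) {
    return s += text;
}

// Numbers go through std::to_string instead of a temporary stream.
// Note: std::to_string(double) uses a fixed "%f"-style format, so the
// result differs from what operator<< on a stream would print.
inline std::string& operator<<(std::string& s, int value) {
    return s += std::to_string(value);
}
inline std::string& operator<<(std::string& s, double value) {
    return s += std::to_string(value);
}

}  // namespace direct_append

// Illustrative use: same sentence as in the original post.
std::string describe_direct(int cat_len, int mouse_len) {
    using namespace direct_append;  // make the overloads visible here
    std::string out;
    out << "The cat is " << cat_len
        << "cm tall and the mouse is " << mouse_len << "cm tall.";
    return out;
}
```

The trade-off is that this gives up manipulators and user-defined operator<< overloads for streams, which is the functionality stringstream already provides.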
 
