Help add commas to int on console output

yogi_bear_79

I am passing a whole-number integer to this function. It converts the
integer to a string, and then I want to add commas so that a number like
1234567 shows as 1,234,567. I am a little stuck. I believe the loop is
reading the string from right to left, which led me to try decrementing
the loop, but a number like 12345 shows up as 123,45. Or is there a
better way altogether to achieve this?

string toStr(int &i)
{
std::string s;
std::stringstream out;
out << i;
s = out.str();
for(size_t x = s.size(); x > 0; x--){
if(x != s.size() && x%3 == 0)
s.insert(x,",");
}
return s;
}
 
Kai-Uwe Bux

For unsigned ints, you could do:

#include <string>
#include <sstream>
#include <iostream>

std::string to_string ( unsigned int i ) {
std::ostringstream out;
out << i;
std::string result = out.str();
for ( unsigned int i = 1 + ( result.size() + 2 ) % 3;
i < result.size();
i += 4 ) {
result.insert( i, "," );
}
return ( result );
}

int main ( void ) {
std::cout << to_string( 1 ) << '\n'
<< to_string( 12 ) << '\n'
<< to_string( 123 ) << '\n'
<< to_string( 1234 ) << '\n'
<< to_string( 12345 ) << '\n'
<< to_string( 123456 ) << '\n'
<< to_string( 1234567 ) << '\n';
}

Handling negative numbers will require some extra care.


Best

Kai-Uwe Bux
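
Following up on the remark about negative numbers, here is a minimal sketch of one way to take that extra care: split off the minus sign, group only the digits, and put the sign back afterwards. The name to_string_signed and the choice of long long are assumptions made for illustration, not something from the thread.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not from the thread): formats a signed value
// with a comma between every group of three digits.
std::string to_string_signed(long long value) {
    std::ostringstream out;
    out << value;
    std::string digits = out.str();

    // Split off a leading minus sign so it never counts as a digit.
    std::string sign;
    if (!digits.empty() && digits[0] == '-') {
        sign = "-";
        digits.erase(0, 1);
    }

    // Walk from the right in steps of three. Inserting at pos does not
    // move the characters to its left, so later positions stay valid.
    for (std::size_t pos = digits.size(); pos > 3; ) {
        pos -= 3;
        digits.insert(pos, ",");
    }
    return sign + digits;
}
```

With this sketch, to_string_signed(-12345) gives -12,345, and a short value such as -123 is returned unchanged.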
 
Martin York

You can use the locale objects to do this for you.

#include <sstream>
#include <iostream>
#include <locale>
#include <string>

// custom numeric punctuation facet
struct Punct: std::numpunct<char>
{
char do_thousands_sep () const
{
return ',';
}
std::string do_grouping () const
{
return "\3";
}
};

std::string toStr(int x,std::locale const& l)
{
std::stringstream stream;
stream.imbue(l);

stream << x;
return stream.str();
}


int main()
{
// construct a custom punctuation facet
std::numpunct<char>* punct = new Punct;

// construct a locale containing the custom facet
const std::locale locale(std::cout.getloc(),punct);

std::cout.imbue(locale);
std::cout << "Val: " << 1234556 << std::endl;
std::cout << "From String: " << toStr(123456789,locale) <<
std::endl;
}
 
Jim Langston

There are a few things with the code. For one, the size of your string is
going to be changing inside of the loop. Also, you are deciding where to put
the commas from x, the position counted from the left, not from the number
of digits you have passed counting from the right.

Consider "12345", which contains 5 characters. x starts at 5, which the
x != s.size() test skips, and 4 is not divisible by 3. At x = 3, 3 is evenly
divisible by 3, so a comma is inserted at index 3, producing your 123,45.

Now, the size of your string has also changed. It's no longer 5, it's now
6. This shouldn't be too much of a problem since you're going right to
left, because an insertion only shifts the characters to the right of x.

I would add another counter inside the loop keeping track of digits.

Here's something I threw together. It's not elegant but it works. It can
probably be cleaned up quite a bit. My intention was just to show the
algorithm, but I had to get it to work to do it and it's a bit of a mess.

std::string toStr( const int& i )
{
std::string s;
std::stringstream out;
out << i;
s = out.str();

size_t Count = ( s.size() - 1 ) / 3;

size_t Offset;
if ( s.size() % 3 == 0 )
Offset = 0;
else
Offset = 3 - s.size() % 3;
for ( size_t x = Count * 3; x != 0; x -= 3 )
{
s.insert(x - Offset,",");
}
return s;
}
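
The digit-counter idea mentioned above (as opposed to computing an offset up front) could be sketched as follows. This is only an illustration under stated assumptions: the name group_digits is hypothetical, and the input is assumed to contain decimal digits only, with no sign.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: expects a string of decimal digits only (no sign).
std::string group_digits(std::string s) {
    std::size_t digits_seen = 0;
    // x is the index of the digit being visited, moving right to left.
    for (std::size_t x = s.size(); x-- > 0; ) {
        ++digits_seen;
        // After every third digit, put a comma in front of it,
        // unless it is the leftmost digit of the string.
        if (digits_seen % 3 == 0 && x != 0) {
            s.insert(x, ",");
        }
    }
    return s;
}
```

Because the counter tracks digits seen from the right, the commas land in the same place regardless of how many digits remain on the left, which is what the original x % 3 test got wrong.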
 
James Kanze

You can use the locale objects to do this for you.

Of course, there's a good chance that there is already a locale
present which does it. Or not---it depends somewhat on the
compiler and the system. On my Sparc, about the only locale
installed, other than C, is en_US.UTF-8, which doesn't do the
job. And g++ won't find it anyway, although Sun CC does. On my
Linux machine, en_US and de_DE both insert commas, but fr_FR
doesn't. (Logically, it should insert spaces.) And I have no
idea how locales are named under Windows, in order to test it
there.
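
Since the availability of named locales varies this much between systems, a cautious way to try one is to catch the exception that std::locale throws for an unknown name and fall back to the classic locale. The sketch below assumes a hypothetical helper name, format_with_locale; whether a given name such as en_US.UTF-8 exists, and whether it groups digits, depends on the platform.

```cpp
#include <locale>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: formats value using the named locale if the
// implementation can construct it, otherwise falls back to the
// classic "C" locale (which never groups digits).
std::string format_with_locale(long value, const char* locale_name) {
    std::ostringstream out;
    try {
        // Throws std::runtime_error if the name is not recognized.
        out.imbue(std::locale(locale_name));
    } catch (const std::runtime_error&) {
        out.imbue(std::locale::classic());
    }
    out << value;
    return out.str();
}
```

Calling format_with_locale(1234567, "en_US.UTF-8") may or may not produce separators, for the reasons given above; the custom facet avoids that dependency.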

For the rest, excellent code, but I'd add a few comments,
because std::locale does have some strange and unexpected ways
of doing things.
#include <sstream>
#include <iostream>
#include <locale>
#include <string>
// custom numeric punctuation facet
struct Punct: std::numpunct<char>
{
char do_thousands_sep () const
{
return ',';
}
std::string do_grouping () const
{
return "\3";
}
};

The two functions above are virtual in the base class. I'd have
repeated the virtual here (and declared them protected, since
that's what they are in the base class).

I'd also explain why do_grouping returns a string with binary
values, except that I can't figure that one out myself; it just
does. (The equivalent function in C returns a char const*, but
in this case, "char" is not a character, but a small integer,
and the C++ equivalent would be std::vector<char>. Given the
actual use, of course, having the C++ function return a char
const* would make perfect sense as well.)
std::string toStr(int x,std::locale const& l)
{
std::stringstream stream;
stream.imbue(l);
stream << x;
return stream.str();
}
int main()
{
// construct a custom punctuation facet
std::numpunct<char>* punct = new Punct;

I'd also have added a comment here, to the effect that the
constructed locale will delete the object. Otherwise, it should
make anyone not aware of this fact wonder. (As it is, it looks
to someone unfamiliar with locales that you've been too
influenced by Java.)
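
Putting those suggestions together, the facet might look like the sketch below: the overrides are marked virtual and protected, and the ownership rule is spelled out in a comment. The wrapper withCommas is a hypothetical addition so that the comment has somewhere to live; the rest follows the code quoted above.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Same facet as above, with the overrides spelled out as virtual and
// kept protected, as they are in std::numpunct<char>.
class Punct : public std::numpunct<char>
{
protected:
    virtual char do_thousands_sep() const { return ','; }
    // Each char is a small integer, not text: "\3" means groups of 3.
    virtual std::string do_grouping() const { return "\3"; }
};

std::string toStr(int x, std::locale const& l)
{
    std::ostringstream stream;
    stream.imbue(l);
    stream << x;
    return stream.str();
}

// Hypothetical wrapper (not in the original post).
std::string withCommas(int x)
{
    // The locale takes ownership of the facet (its refs argument
    // defaults to 0) and deletes it once no locale refers to it any
    // more, so there is deliberately no matching delete here.
    std::locale const loc(std::locale::classic(), new Punct);
    return toStr(x, loc);
}
```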
 
James Kanze

Martin York posted the correct solution, but if you wanted to do
it by hand, wouldn't the simplest solution be to reverse the
string before and after:

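#include <algorithm>    // std::reverse
#include <string>

std::string
insertThousandsSep(
    std::string const&  source )
{
    std::string         result ;
    int                 count = 0 ;
    for ( std::string::const_reverse_iterator i = source.rbegin() ;
            i != source.rend() ;
            ++ i ) {
        // After every three digits, emit a separator before the next
        // one, but never between a leading '-' and the first digit.
        if ( count % 3 == 0 && count != 0 && *i != '-' ) {
            result += ',' ;
        }
        result += *i ;
        ++ count ;
    }
    // result was built backwards, so flip it into reading order.
    std::reverse( result.begin(), result.end() ) ;
    return result ;
}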


For output, you can easily create a decorator:

class IntWithThousandsSep
{
public:
explicit IntWithThousandsSep( int value )
: myValue( value )
{
}
friend std::ostream& operator<<(
std::ostream& dest,
IntWithThousandsSep const&
obj )
{
std::ostringstream s ;
s << obj.myValue ;
dest << insertThousandsSep( s.str() ) ;
return dest ;
}
private:
int myValue ;
} ;

so you can write things like:

std::cout << IntWithThousandsSep( i ) ;

(And I know, calling this a decorator is straining the term a
bit, compared to the way the pattern is usually implemented.)
 
Jerry Coffin

[ ... ]
I'd also explain why do_grouping returns a string with binary
values, except that I can't figure that one out myself; it just
does.

I'm not sure which you're referring to: that it returns a string, or
that it uses binary values, or both.

Using binary values makes them independent of the character encoding
used. Since the locale is supposed to encapsulate such things as the
character encoding, having it depend on the character encoding would
sort of defeat the purpose.

Using a string instead of a pointer to const char is a bit harder to
be certain about. I suspect it was just somebody who thought "we're
designing this cool string class, why not use it?"

Using a container (string or pointer to char) that allows multiple
values, as opposed to the single char returned by other members like
do_thousands_sep and do_decimal_point is because it supports different
grouping widths. For example, you could have the first group contain
two digits, and the remainder contain three digits. I'm not sure who
uses this, but given its difference from the other members, I'm pretty
sure somebody must have thought it was really needed.
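
To make the point about different grouping widths concrete, here is a small sketch of a facet whose grouping string has two elements. The first element applies to the rightmost group and the last element repeats for all groups to its left, so "\3\2" gives one group of three followed by groups of two (the pattern used in the Indian numbering convention). The names ThreeThenTwoPunct and formatGrouped are hypothetical.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: rightmost group has 3 digits, every group to
// its left has 2 (the last element of the grouping string repeats).
struct ThreeThenTwoPunct : std::numpunct<char>
{
protected:
    char do_thousands_sep() const { return ','; }
    std::string do_grouping() const { return "\3\2"; }
};

std::string formatGrouped(long value)
{
    std::ostringstream out;
    // The locale owns the facet and deletes it when no longer used.
    out.imbue(std::locale(std::locale::classic(), new ThreeThenTwoPunct));
    out << value;
    return out.str();
}
```

For example, formatGrouped(1234567) yields 12,34,567, which a single char return value could not express.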
 
James Kanze

On Apr 1, 2:42 am, James Kanze wrote:
[ ... ]
I'd also explain why do_grouping returns a string with binary
values, except that I can't figure that one out myself; it just
does.
I'm not sure which you're referring to: that it returns a
string, or that it uses binary values, or both.

The combination of the two.
Using binary values makes them independent of the character
encoding used. Since the locale is supposed to encapsulate
such things as the character encoding, having it depend on the
character encoding would sort of defeat the purpose.
Using a string instead of a pointer to const char is a bit
harder to be certain about. I suspect it was just somebody who
thought "we're designing this cool string class, why not use
it?"

*IF* it is appropriate to use a container here, that container
would be std::vector<>, not std::string. What is being returned
is not a string, it is an array. Calling it a string is just
obfuscation.

Arguably, given the use, it should have been an int const* in C.
Obviously, making it char const* saves some space (since no one
will ever have a thousands separation of more than 127), but
we're talking here of only a couple of bytes. But it is clear
that in C, what is being returned is an "array", not a string.

In all cases, I can't really imagine a case where it wouldn't be
a constant. Something like:

char const _thousands_sep[] = { 3, 0 } ;

So even in C++, I'd have gone with either char const* or int
const*. (Probably char const*, with the idea that this might
allow reusing some of the C implementation.)
Using a container (string or pointer to char) that allows
multiple values, as opposed to the single char returned by
other members like do_thousands_sep and do_decimal_point is
because it supports different grouping widths.
For example, you could have the first group contain two
digits, and the remainder contain three digits. I'm not sure
who uses this, but given its difference from the other
members, I'm pretty sure somebody must have thought it was
really needed.

I'm aware of this. I'm not personally aware of any locale with
such a grouping, but I seem to remember someone vaguely saying
that one existed using 4, 2, 0; or something like that. (Of
course, it may be a case of premature genericity. But in a
standard, you can't go back and make it more generic if the need
later arises.)
 
peter koch

I'm aware of this.  I'm not personally aware of any locale with
such a grouping, but I seem to remember someone vaguely saying
that one existed using 4, 2, 0; or something like that.  (Of
course, it may be a case of premature genericity.  But in a
standard, you can't go back and make it more generic if the need
later arises.)

http://en.wikipedia.org/wiki/Thousands_separator#Thousands_separator
gives the answer.
/Peter
 
Jerry Coffin

[ ... ]
*IF* it is appropriate to use a container here, that container
would be std::vector<>, not std::string. What is being returned
is not a string, it is an array. Calling it a string is just
obfuscation.

True -- I'm pretty sure that's a historical matter though. The string
class was added relatively early in the standardization process. The
vector template wasn't added until a _lot_ later. I suspect by the time
vector was added, nobody had the inclination to redesign locales to use
them -- especially since doing so probably would have delayed the
standard by quite a while (a year wouldn't surprise me at all...)
 
James Kanze

On 11 Apr., 11:32, James Kanze wrote:

[...]

Not to why the value is a string. It does raise some
interesting issues, however: depending on the context, you may
or may not want thousand separators after the decimal point.

Also, the usual thousands separators in France are spaces, which
according to the above, is what ISO recommends. This means,
however, that you can't reread what you've written. (Maybe
non-breaking spaces? 0xA0 in Unicode?)
 
James Kanze

[ ... ]
*IF* it is appropriate to use a container here, that container
would be std::vector<>, not std::string. What is being returned
is not a string, it is an array. Calling it a string is just
obfuscation.
True -- I'm pretty sure that's a historical matter though.

Maybe. But the locale stuff is a pure invention of the
committee; it didn't exist before, so backward compatibility was
no issue. And the locales don't mind using char* elsewhere
where string would really be more appropriate, e.g.
ctype<>::toupper.
The string class was added relatively early in the
standardization process. The vector template wasn't added
until a _lot_ later. I suspect by the time vector was added,
nobody had the inclination to redesign locales to use them --
especially since doing so probably would have delayed the
standard by quite a while (a year wouldn't surprise me at
all...)

Changing std::string to std::vector<char> certainly wouldn't
have delayed the standard, but as you say, probably no one
wanted to ever see <locale> again. The real question is why
std::string to begin with, rather than char const*.
 
peter koch

On 11 Apr., 11:32, James Kanze wrote:

    [...]

Not to why the value is a string.  It does raise some
interesting issues, however: depending on the context, you may
or may not want thousand separators after the decimal point.

Also, the usual thousands separators in France are spaces, which
according to the above, is what ISO recommends.  This means,
however, that you can't reread what you've written.  (Maybe
non-breaking spaces? 0xA0 in Unicode?)

Oh - I only intended to provide a list of countries, where there was
not always three characters in each group. I remembered Tibet or
Nepal, but the link above indicates that this is far more widespread.
It sounds like a good idea with the non-breaking space, but the
problem is that you will be unable to write formatted numbers in
ASCII, which does not contain such a character. On the other hand,
numbers formatted this way are intended for human reading, and I would
not mind so much if reading them by a program was not directly
supported.

/Peter
 
Jerry Coffin

On 11 Apr, 19:25, Jerry Coffin wrote:

[ ... ]
Maybe. But the locale stuff is a pure invention of the
committee; it didn't exist before, so backward compatibility was
no issue. And the locales don't mind using char* elsewhere
where string would really be more appropriate, e.g.
ctype<>::toupper.

Oh, I don't mean history before the standardization effort -- only
during it.

[ ... ]
Changing std::string to std::vector<char> certainly wouldn't
have delayed the standard,

By itself, no -- except that doing so would have reopened the whole
subject of locales, and I can hardly imagine a way to even discuss them
without leaving somebody (usually quite a few somebodys) quite rightly
feeling that their needs are being slighted or ignored completely.

Don't get me wrong: when C was new, even mandating that a character set
have both lower- and uppercase English characters was asking for a lot
(and, in fact, C89 didn't mandate it). Given those meager beginnings, I
think they've done an almost amazing job of grafting some degree of
support of I18n on long after the fact. Nonetheless, it is grafted on
and (as I'm sure you know better than I) for almost anybody outside the
US, things can get clumsy in a hurry. Heck, even for us inside the US,
things get clumsy in a hurry -- in fact, the original subject of this
very thread applies equally in the US as elsewhere.

The bottom line is that I'm reasonably certain that if the subject of
locales had been reopened at all, it would have been almost impossible
to just agree to use std::vector where appropriate, and leave it at
that. As to why they didn't use std::string elsewhere, I can't say for
sure, though as I recall Andrew Koenig once explained that passing an
std::string as the name of a file to open wasn't added because it led to
discussions of I18N of file names that nobody felt could be resolved at
that point, so it was dropped entirely. I suspect (even if it was never
made official) that much the same thinking went into deciding to just
leave locales as they were, and be done with it.
 
James Kanze

On 12 Apr., 09:58, James Kanze wrote:

[...]
It sounds like a good idea with the non-breaking space, but
the problem is that you will be unable to write formatted
numbers in ASCII, which does not contain such a character.

So when was the last time you saw anyone using ASCII?
ISO 8859-1 has been pretty much standard everywhere I've worked,
for the last 15 or so years, although UTF-8 seems to be
replacing it---very slowly. Windows has been Unicode for a long
time as well.
On the other hand, numbers formatted this way are intended for
human reading, and I would not mind so much if reading them by
a program was not directly supported.

I more or less agree. Anytime you are writing for the machine,
you should use the "C" locale (and limit yourself to characters
in the basic execution character set). But it does happen that
output is designed for both---our log files are definitely read
by humans (and contain large enough numbers that thousands
separators would be nice) and are parsed by various programs as
well.
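
The split between output for people and output for programs can be sketched by imbuing each stream explicitly instead of relying on the global locale. The names CommaPunct, forHuman and forMachine are hypothetical; the only library facilities assumed are std::numpunct and std::locale::classic().

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: comma separator, groups of three digits.
struct CommaPunct : std::numpunct<char>
{
protected:
    char do_thousands_sep() const { return ','; }
    std::string do_grouping() const { return "\3"; }
};

// Human-facing text: grouped digits for readability.
std::string forHuman(long value)
{
    std::ostringstream out;
    out.imbue(std::locale(std::locale::classic(), new CommaPunct));
    out << value;
    return out.str();
}

// Machine-facing text: always the classic "C" locale, so the result
// does not change if the global locale is changed elsewhere.
std::string forMachine(long value)
{
    std::ostringstream out;
    out.imbue(std::locale::classic());
    out << value;
    return out.str();
}
```

A log line meant for both audiences still has to pick one of the two forms, which is the tension described above.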
 
James Kanze

[ ... ]
Changing std::string to std::vector<char> certainly wouldn't
have delayed the standard,
By itself, no -- except that doing so would have reopened the
whole subject of locales, and I can hardly imagine a way to
even discuss them without leaving somebody (usually quite a
few somebodys) quite rightly feeling that their needs are
being slighted or ignored completely.

Yes. You're certainly right about that. I guess my real
question is more along the lines of why they even used string to
begin with, given that the abstraction behind the char* in C
wasn't a string.
Don't get me wrong: when C was new, even mandating that a
character set have both lower- and uppercase English
characters was asking for a lot (and, in fact, C89 didn't
mandate it). Given those meager beginnings, I think they've
done an almost amazing job of grafting some degree of support
of I18n on long after the fact. Nonetheless, it is grafted on
and (as I'm sure you know better than I) for almost anybody
outside the US, things can get clumsy in a hurry. Heck, even
for us inside the US, things get clumsy in a hurry -- in fact,
the original subject of this very thread applies equally in
the US as elsewhere.

Quite. Starting with the fact that plain char can be signed
(resulting in characters having negative values).

Globally, given the context, I think that the C committee did a
pretty good job. (The context was that the ANSI C committee was
ready to adopt the standard without any i18n support, and
someone from ISO mentioned that ISO would have to add it
anyway, thus making ISO C different from ANSI C. So the
committee went back and added all of the original i18n support
in a year, so that ISO C could be ANSI C.) I'm a lot less
enthusiastic about the C++ locales, although it does recognize
one thing missing in C, that you need a stream specific locale.
But the entire <locale> header seems to be designed to be
difficult to understand and to use.
 
Ben Bacarisse

Jerry Coffin said:
Don't get me wrong: when C was new, even mandating that a character set
have both lower- and uppercase English characters was asking for a lot
(and, in fact, C89 didn't mandate it).

Small point: C89 did mandate it. Both the execution and source "basic
character sets" must include both upper and lower case English
letters. In fact, it went further and allowed (different) multibyte
encodings in both the source and execution character sets.
 
Jerry Coffin

Small point: C89 did mandate it. Both the execution and source "basic
character sets" must include both upper and lower case English
letters. In fact, it went further and allowed (different) multibyte
encodings in both the source and execution character sets.

That much is true. What it didn't mandate is that you be able to
communicate those characters to the outside world -- for example, it
specifically allows input on the command line to always appear in one
case -- in which case (no pun intended) it's always supposed to look
like lowercase regardless of what the user actually entered.
 
Jerry Coffin

On 13 Apr, 03:52, Jerry Coffin wrote:

[ ... ]
Yes. You're certainly right about that. I guess my real
question is more along the lines of why they even used string to
begin with, given that the abstraction behind the char* in C
wasn't a string.

Right -- and I honestly doubt anybody really knows the answer to that
anymore. Even the people who designed it have probably lost conscious
memory of it (though their spouses might be able to tell us about times
they wake up in a cold sweat without an explanation...)

[ ... ]
Globally, given the context, I think that the C committee did a
pretty good job. (The context was that the ANSI C committee was
ready to adopt the standard without any i18n support, and
someone from ISO mentioned that ISO would have to add it
anyway, thus making ISO C different from ANSI C. So the
committee went back and added all of the original i18n support
in a year, so that ISO C could be ANSI C.) I'm a lot less
enthusiastic about the C++ locales, although it does recognize
one thing missing in C, that you need a stream specific locale.
But the entire <locale> header seems to be designed to be
difficult to understand and to use.

I think it tends to show some of the shortcomings of OOP in general --
it went along with the idea that you'd just derive a new class, override
a function or two, and be on your way.

In a way, they even did reasonably well: ignoring the overhead of a
class header and such, you can often create and use a facet (for
example) in a half dozen lines of code or so. The problem, of course, is
that you need to read and understand hundreds of pages of dense
documentation before you figure out how to write those half dozen lines
of code -- and even then, I think most of us who do it have just worked
out a template that works, and modify a few specific parts to do what we
want; real understanding of the whole mechanism is relatively rare.
 
