Defeated by basic_istringstream

Keith MacDonald

I've been trying to write generic procedures for persistence of the contents
of a standard collection. The output side works perfectly, but the input
side fails when the collection is of strings and a string contains
whitespace. The problem is that I don't know the type that's stored in the
collection, so can't treat strings as a special case. This is how I've
tried to solve the problem:

template <class COLL>
void readCollection(COLL& coll)
{
    typename COLL::value_type elem;

    while (MoreData()) {
        std::basic_istringstream<char>(GetData()) >> elem;
        coll.insert(coll.end(), elem);
    }
}

Assuming GetData() returns a string from somewhere, this can be successfully
invoked as:

std::vector<int> v1;
readCollection< std::vector<int> >(v1);

However, the following only works if GetData() returns a string without
embedded whitespace:

std::vector<std::string> v2;
readCollection< std::vector<std::string> >(v2);

Otherwise, each element only gets the first word of each string. Clearly,
this is a case for unformatted input, but that will only work for strings.
How can I implement a generic solution?

Thanks,
Keith MacDonald
 
John Harrison

Use template specialisation

template <class COLL>
void readCollection(COLL& coll)
{
    typename COLL::value_type elem;

    while (MoreData()) {
        // sic: this declaration is corrected in John's follow-up post
        std::basic_istringstream<char>(GetData()) buffer;
        read_element(buffer, elem);
        coll.insert(coll.end(), elem);
    }
}

// generic version
template <class ELEM>
void read_element(std::basic_istringstream<char>& buffer, ELEM& elem)
{
    buffer >> elem;
}

// specialized version for strings
template <>
void read_element<std::string>(std::basic_istringstream<char>& buffer,
                               std::string& elem)
{
    // whatever
}
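
A minimal sketch of how these pieces might fit together, assuming the string case should simply take the whole record with std::getline (the thread leaves that body as "whatever"). SetData(), the global record list, and example() are hypothetical stand-ins added for illustration; they are not part of the original code. std::istringstream is the same type as std::basic_istringstream<char>, and read_element is declared before readCollection so that the call resolves.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical data source standing in for the thread's MoreData()/GetData().
static std::vector<std::string> g_records;
static std::size_t g_next = 0;

void SetData(const std::vector<std::string>& records)
{
    g_records = records;
    g_next = 0;
}
bool MoreData() { return g_next < g_records.size(); }
std::string GetData() { return g_records[g_next++]; }

// Generic version: formatted extraction, which stops at whitespace.
template <class ELEM>
void read_element(std::istringstream& buffer, ELEM& elem)
{
    buffer >> elem;
}

// Specialisation for strings: take the whole record, whitespace included.
// (Assumption: one record holds exactly one string and no '\n'.)
template <>
void read_element<std::string>(std::istringstream& buffer, std::string& elem)
{
    std::getline(buffer, elem);
}

template <class COLL>
void readCollection(COLL& coll)
{
    while (MoreData()) {
        typename COLL::value_type elem;
        std::istringstream buffer(GetData());
        read_element(buffer, elem);
        coll.insert(coll.end(), elem);
    }
}

// Usage: strings with embedded whitespace survive intact.
void example()
{
    std::vector<std::string> raw;
    raw.push_back("word1 word2");
    raw.push_back("single");
    SetData(raw);

    std::vector<std::string> v;
    readCollection(v);
    assert(v.size() == 2);
    assert(v[0] == "word1 word2");
}
```

An ordinary overload taking std::string& would probably work just as well as the explicit specialisation; the specialisation is kept here only because it is what the post suggests.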

john
 
Evan Carew

Keith,

Right, that's because the input streams use whitespace as the default
delimiter on input. There are a number of solutions to this: you could
change the default delimiter, you could change the whitespace on the
way out, or, if your strings don't contain "\n" characters, you could add
one at the end of each and then use getline on the way in.
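
As a rough sketch of that last option, assuming no stored string contains a newline: write each string followed by '\n', then read the records back with std::getline. The function names writeLines, readLines, and example are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Write one string per line. Assumes no element contains '\n'.
void writeLines(std::ostream& out, const std::vector<std::string>& items)
{
    for (std::size_t i = 0; i < items.size(); ++i)
        out << items[i] << '\n';
}

// Read lines until the stream is exhausted; spaces and tabs are preserved.
std::vector<std::string> readLines(std::istream& in)
{
    std::vector<std::string> items;
    std::string line;
    while (std::getline(in, line))
        items.push_back(line);
    return items;
}

// Usage: round-trip through an in-memory stream.
void example()
{
    std::vector<std::string> items;
    items.push_back("word1 word2");
    items.push_back("second line");

    std::ostringstream out;
    writeLines(out, items);

    std::istringstream in(out.str());
    assert(readLines(in) == items);
}
```

std::getline also accepts a third argument, so a different terminator such as ';' could be used in the same way if newlines can occur in the data.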

Evan


 
John Harrison

while (MoreData()) {
std::basic_istringstream<char>(GetData()) buffer;

Sorry, that should be

std::basic_istringstream<char> buffer(GetData());

of course.

john
 
Keith MacDonald

Evan,

I like the idea of changing the default delimiter, but can't find out how to
do it. Please will you point me at the relevant documentation.

Thanks, Keith
 
Branimir Maksimovic

Sorry for jumping into the middle of the discussion, but I had a
similar problem years ago and found the solution on this group
too :)

One of the tricks is to customize the ctype facet, e.g.:
#include <cstring>
#include <string>
#include <locale>
#include <sstream>
#include <iostream>

using namespace std;

class my_ctype : public ctype<char>
{
public:
    my_ctype()
        : ctype<char>(my_table, false) // false means don't delete table
    {
        // NOTE: the size is corrected later in the thread to
        // table_size * sizeof(mask)
        memcpy(my_table, classic_table(), table_size);

        my_table[';'] = space; // e.g. if you want another delimiter
        my_table[' '] = alpha; // space is now counted as alpha
    }
private:
    mask my_table[table_size];
};

int main()
{
    const char* buf = "ffff;1.14;word1 word2";
    istringstream is(buf);
    is.imbue(locale(locale::classic(), new my_ctype));
    int i; double d; string s;
    is >> hex >> i >> dec >> d >> s;
    cout << "i=" << i << "\nd=" << d << "\ns=" << s << '\n';
    return 0;
}

Greetings, Bane.
 
Evan Carew

Keith,

Let me have a look. I saw this technique mentioned in this newsgroup
about a week ago, since this issue comes up regularly for one reason or
another.

On a side note, this issue has come across my desk several times too,
and I usually end up saving some metadata along with the string to
record how long it is. That being said, the following code should get
around the default delimiter... be forewarned though, any embedded 0s in
your serialized string will fool this method:

const int BUFLEN( ... );
char buffer[ BUFLEN ];
cin.get( buffer, BUFLEN, ',' );
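
The length-metadata approach mentioned above might look something like the sketch below. The "<length>:<bytes>" layout and the names writeString, readString, and example are assumptions made for illustration, not the format Evan actually uses. Because the reader takes exactly the stated number of bytes, the string may contain spaces, commas, or newlines.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Store the string as "<length>:<bytes>" (an assumed, illustrative layout).
void writeString(std::ostream& out, const std::string& s)
{
    out << s.size() << ':' << s;
}

// Read one length-prefixed string; returns false on malformed or missing input.
bool readString(std::istream& in, std::string& s)
{
    std::size_t len = 0;
    if (!(in >> len) || in.get() != ':')
        return false;

    std::string tmp(len, '\0');
    if (len > 0 && !in.read(&tmp[0], static_cast<std::streamsize>(len)))
        return false;

    s.swap(tmp);
    return true;
}

// Usage: a string containing the delimiter characters round-trips intact.
void example()
{
    std::ostringstream out;
    writeString(out, "word1, word2");

    std::istringstream in(out.str());
    std::string s;
    assert(readString(in, s));
    assert(s == "word1, word2");
}
```

The trade-off is that the stored data is no longer a plain delimited list, so anything else that reads the file has to understand the same layout.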

Evan Carew

 
Evan Carew

Oops, I meant any embedded character which matches your terminator. I
suspect this correction was obvious, but just in case...

Evan
 
Keith MacDonald

Bane,

You're always welcome to jump in, with a solution ;-)

I suspected that it would involve locales, but couldn't figure out how.
After comparing this solution with John Harrison's template specialisation
suggestion, I prefer the latter for solving the problem in hand, but now
that you've explained how to manipulate locales, I can use that technique to
tidy up some other code that's a bit of a hack.

Thanks,
Keith MacDonald

 
Branimir Maksimovic

Thanks.
I forgot to correct a bug (just in case). The memcpy call in my_ctype should be:
memcpy(my_table, classic_table(), table_size * sizeof(mask));
 
