Parsing a string using istringstream

Adam Parkin

Hello all, I'm trying to write a function that, given a std::string,
splits it into tokens on whitespace (\t, ' ', \n) and returns the
result as a vector of strings. Here's what I have so far:

std::vector<std::string> tokenize (std::string foo)
{
    std::istringstream s (foo);
    std::vector<std::string> v;
    std::string tok;

    for (;;) // infinite loop
    {
        // try to extract token
        s >> tok;

        // if string was read, push onto vector, else break out of loop
        if (s.good())
            v.push_back(tok);
        else
            break;
    }

    return v;
}

The problem is that, given a string with no trailing whitespace
(e.g. "this is a string"), the last token is skipped: reading it runs
into the end of the input, so s.good() returns false even though that
extraction succeeded. If I restructure the loop as:

while (s.good())
{
    s >> tok;
    v.push_back(tok);
}

then if there is whitespace after the last token (e.g. "this is a
string "), v ends up with the last token repeated. An extraction that
finds only whitespace left in the istringstream fails without modifying
tok, so tok keeps its old value (the last token successfully read) and
that value gets pushed onto the vector a second time.

Any suggestions?

Thanks,

Adam Parkin
 
John Harrison

while (s >> tok)
    v.push_back(tok);

john
 
