Adam Parkin
Hello all, I'm trying to write a function that, given a std::string,
splits it into tokens on whitespace (\t, ' ', \n) and returns the
result as a vector of strings. Here's what I have so far:
std::vector<std::string> tokenize (std::string foo)
{
    std::istringstream s (foo);
    std::vector <std::string> v;
    std::string tok;
    for (;;) // infinite loop
    {
        // try to extract token
        s >> tok;
        // if string was read, push onto vector else break out of loop
        if (s.good())
            v.push_back(tok);
        else
            break;
    }
    return v;
}
The problem is that if the string has no whitespace at the end
(e.g. "this is a string"), the last token gets dropped: s.good()
returns false not because the last extraction failed, but because
that extraction reached the end of the input in the istringstream
(which sets eofbit). If I restructure the loop as:
while (s.good())
{
    s >> tok;
    v.push_back(tok);
}
then if there is whitespace after the last token (e.g. "this is a
string "), v ends up with the last token repeated. An extraction that
finds only whitespace left in the istringstream fails without
modifying tok, so tok keeps its old value (the last token successfully
read) and that value gets pushed onto the end of the vector a second
time.
Any suggestions?
Thanks,
Adam Parkin
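
Both variants above test the stream state at the wrong moment: either after a successful read that happened to hit end of input, or before a read that may yet fail. One common way around this, shown here only as a minimal sketch, is to use the extraction itself as the loop condition, since operator>> returns the stream and the stream converts to false once an extraction fails. Passing the argument by const reference is an assumption made for this sketch, not something the original code does.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split foo on whitespace; a token is stored only if its extraction succeeded.
std::vector<std::string> tokenize(const std::string& foo)
{
    std::istringstream s(foo);
    std::vector<std::string> v;
    std::string tok;

    // s >> tok yields the stream, which tests false as soon as a read fails,
    // so a last token that ends exactly at end of input is still kept and a
    // trailing run of whitespace does not produce a repeated token.
    while (s >> tok)
        v.push_back(tok);

    return v;
}
```

With this loop, "this is a string" and "this is a string " should both give the same four tokens, and an empty or all-whitespace string should give an empty vector.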