Robbie Hatley
A couple of days ago I decided to force myself to really learn
exactly what "strtok" does, and how to use it. I figured I'd
just look it up in some book and that would be that.
I figured wrong!
Firstly, Bjarne Stroustrup's "The C++ Programming Language" said:
(nothing)
Ok, how about a C book? Stephen Prata's "C Primer Plus" said:
(nothing)
Aaarrrggg. Ok, how about good old Herbert Schildt and his book
"C++: The Complete Reference"? It said:
#include <cstring>
char *strtok(char *str1, const char *str2);
The strtok() function returns a pointer to the next token in
the string pointed to by str1. The characters making up the
string pointed to by str2 are the delimiters that determine
the token. A null pointer is returned when there is no token
to return. To tokenize a string, the first call to strtok()
must have str1 point to the string being tokenized. Subsequent
calls must use a null pointer for str1. In this way, the entire
string can be reduced to its tokens. It is possible to use a
different set of delimiters for each call to strtok().
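
To make that call pattern concrete, here is a minimal sketch. The
helper name CountTokens is made up for illustration; only strtok
itself comes from <cstring>.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from any of the books above): counts the
// tokens in Text. Text must be a writable, null-terminated array.
std::size_t CountTokens(char* Text, const char* Delimiters)
{
   std::size_t Count = 0;

   // First call: pass the string to be tokenized.
   char* Token = std::strtok(Text, Delimiters);
   while (Token != NULL)
   {
      ++Count;
      // Subsequent calls: pass a null pointer to continue in the same string.
      Token = std::strtok(NULL, Delimiters);
   }
   return Count;
}
```

Called on a char array holding "red green,blue" with the delimiter
string " ," (space and comma), this counts three tokens.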
Ok. But when I tried using the function, it didn't do what I
expected at all. For one thing, it severely alters the contents
of its first argument. Schildt's book doesn't mention that
little factoid at all. :-( Bad Schildt!
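
A small sketch of that alteration, under the assumption that the
buffer is an ordinary writable char array. The function name
BufferAfterFirstToken is hypothetical and exists only for this demo.

```cpp
#include <cstring>
#include <string>

// Hypothetical demo: returns what a plain C-string read of Buffer sees
// after strtok has pulled the first token out of it.
std::string BufferAfterFirstToken(char* Buffer, const char* Delimiters)
{
   // strtok overwrites the delimiter that ends the first token (if any)
   // with a null character.
   std::strtok(Buffer, Delimiters);

   // Reading Buffer as a C string now stops at that null character.
   return std::string(Buffer);
}
```

Starting from char Buffer[] = "alpha,beta,gamma" and the delimiter
string ",", the function returns "alpha": the first comma has been
replaced by a null character, so the original string is no longer
intact.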
I had to google this function and find info on it on the web in
order to find out how it really works. Turns out, there are lots
of things missing in Schildt's description. (But hey, at least
he tried. Most other C/C++ authors chicken out and won't even
touch strtok in their books.) This is how this function REALLY
works:
http://www.opengroup.org/onlinepubs/007908799/xsh/strtok_r.html
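
Two details of strtok's specified behaviour are easy to miss: leading
delimiters are skipped, and a run of consecutive delimiters counts as
a single separator, so strtok never hands back an empty token. A
rough sketch to see this (SplitInPlace is a hypothetical name, not a
library function):

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: collects every token strtok finds in Buffer.
// Buffer is modified in place by strtok.
std::vector<std::string> SplitInPlace(char* Buffer, const char* Delimiters)
{
   std::vector<std::string> Tokens;
   for (char* Token = std::strtok(Buffer, Delimiters);
        Token != NULL;
        Token = std::strtok(NULL, Delimiters))
   {
      Tokens.push_back(Token);
   }
   return Tokens;
}
```

Given a buffer holding ",a,,b," and the delimiter string ",", the
result has exactly two tokens, "a" and "b". Anyone who needs the
empty fields between adjacent commas has to look elsewhere.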
I wish more authors would cover this useful function in their
books. After all, it IS a part of both the C and C++ standard
libraries. Ok, I'm done ranting now.
For your amusement, here is a function I wrote to break a string
into tokens, given a string of "separator" characters, and put
the tokens in a std::vector<std::string>. I'm sure there are
various ways this could be improved. Comments? Slings? Arrows?
#include <cstring>
#include <string>
#include <vector>

void
Tokenize
   (
      std::string              const & RawText,
      std::string              const & Delimiters,
      std::vector<std::string>       & Tokens
   )
{
   // strtok writes null characters into the string it scans, so work on
   // a writable, null-terminated copy of the raw text. A vector releases
   // its memory even if push_back throws further down:
   std::vector<char> Buffer(RawText.begin(), RawText.end());
   Buffer.push_back('\0');

   // Clear the Tokens vector:
   Tokens.clear();

   // Get the tokens from the buffer and put them in the vector. The first
   // call passes the buffer; later calls pass NULL to continue scanning:
   char* TokenPtr = std::strtok(&Buffer[0], Delimiters.c_str());
   while (NULL != TokenPtr)
   {
      Tokens.push_back(std::string(TokenPtr));
      TokenPtr = std::strtok(NULL, Delimiters.c_str());
   }
}
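
One possible improvement, sketched here as an alternative rather than
a fix: skip strtok entirely and search the std::string directly with
find_first_not_of and find_first_of, so no copy of the text gets
modified. The name TokenizeNoCopy is hypothetical.

```cpp
#include <string>
#include <vector>

// Sketch of a strtok-free variant (TokenizeNoCopy is a made-up name).
void TokenizeNoCopy(std::string const & RawText,
                    std::string const & Delimiters,
                    std::vector<std::string> & Tokens)
{
   Tokens.clear();

   // Find the first character that is not a delimiter:
   std::string::size_type Begin = RawText.find_first_not_of(Delimiters);
   while (Begin != std::string::npos)
   {
      // The token ends at the next delimiter, or at the end of the text:
      std::string::size_type End = RawText.find_first_of(Delimiters, Begin);
      Tokens.push_back(RawText.substr(Begin, End - Begin));

      // Skip the run of delimiters that follows the token:
      Begin = RawText.find_first_not_of(Delimiters, End);
   }
}
```

Like strtok, this treats a run of delimiters as one separator, so
"  one two,,three " split on " ," gives "one", "two" and "three".
Unlike the strtok version, it does not stop at a null character
embedded in RawText.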
--
Cheers,
Robbie Hatley
East Tustin, CA, USA
lone wolf intj at pac bell dot net
(put "[usenet]" in subject to bypass spam filter)
http://home.pacbell.net/earnur/