strtok() and std::string

Alex Vinokur

Here is a program that uses strtok() and std::string.

Why does strtok() affect str2 in function func2()?

====== File foo.cpp : BEGIN ======
#include <cstring>
#include <string>
#include <iostream>
using namespace std;

void func1()
{
    string str1 ("abcd\nxyz");
    string str2 (str1);
    cout << "\tBEFORE" << endl;
    cout << "str1 = <" << str1 << ">" << endl;
    cout << "str2 = <" << str2 << ">" << endl;

    str1[4] = 0;

    cout << endl;
    cout << "\tAFTER assignment" << endl;
    cout << "str1 = <" << str1 << ">" << endl;
    cout << "str2 = <" << str2 << ">" << endl;
}

void func2()
{
    string str1 ("abcd\nxyz");
    string str2 (str1);
    cout << "\tBEFORE" << endl;
    cout << "str1 = <" << str1 << ">" << endl;
    cout << "str2 = <" << str2 << ">" << endl;

    strtok ((char*)str1.c_str(), "\n");

    cout << endl;
    cout << "\tAFTER strtok" << endl;
    cout << "str1 = <" << str1 << ">" << endl;
    cout << "str2 = <" << str2 << ">" << endl;
}


int main()
{
    func1();
    cout << "-------------" << endl;
    func2();
    return 0;
}
====== File foo.cpp : END ========



====== Run : BEGIN ======

BEFORE
str1 = <abcd
xyz>
str2 = <abcd
xyz>

AFTER assignment
str1 = <abcd xyz>
str2 = <abcd
xyz>
-------------
BEFORE
str1 = <abcd
xyz>
str2 = <abcd
xyz>

AFTER strtok
str1 = <abcd xyz>
str2 = <abcd xyz>

====== Run : END ========
 
Rolf Magnus

Alex said:
Here is some program with using strtok() and std::string.

Why does strtok() affect str2 in function func2()?

====== File foo.cpp : BEGIN ======
#include <cstring>
#include <string>
#include <iostream>
using namespace std;

void func1()
{
string str1 ("abcd\nxyz");
string str2 (str1);
cout << "\tBEFORE" << endl;
cout << "str1 = <" << str1 << ">" << endl;
cout << "str2 = <" << str2 << ">" << endl;

str1[4] = 0;

cout << endl;
cout << "\tAFTER assignment" << endl;
cout << "str1 = <" << str1 << ">" << endl;
cout << "str2 = <" << str2 << ">" << endl;

}

void func2()
{
string str1 ("abcd\nxyz");
string str2 (str1);
cout << "\tBEFORE" << endl;
cout << "str1 = <" << str1 << ">" << endl;
cout << "str2 = <" << str2 << ">" << endl;

strtok ((char*)str1.c_str(), "\n");

This is a bad thing. The pointer returned by c_str() points to const char
for a reason. What you are saying with your cast is: "I know this is
constant. For some reason, I don't want it to be const, but I promise, I
won't modify it". Then, you use strtok to modify it. So you lie to your
compiler.
The bottom line is: Don't EVER modify what the pointer returned by c_str()
points to. That's why it is const in the first place.
 
Niels Dybdahl

Why does strtok() affect str2 in function func2()?

str1 and str2 probably share the same storage for their contents. That is
probably one reason why c_str() returns a pointer to const: the contents are
not supposed to be modified behind the string's back.
By casting away the const and modifying the buffer anyway, you modify the
contents of both str1 and str2.

Niels Dybdahl
 
Alex Vinokur

Alex Vinokur said:
Here is some program with using strtok() and std::string.

Why does strtok() affect str2 in function func2()?
[snip]

Compiler-related notes.

1) GNU g++ 3.3.3 (Cygwin), GNU g++ 3.3.3 (Cygwin, Mingw32 interface), GNU gpp 3.4.1 (Djgpp), Borland C++ 5.5.1:
strtok() affects str2 in function func2()

2) Microsoft C++ 13.00.9466, Digital Mars 8.40n:
strtok() doesn't affect str2 in function func2()
 
Alex Vinokur

Niels Dybdahl said:
str1 and str2 probably share the same storage for their contents.
[snip]

But str1 and str2 are different instances.
Why do they share the same storage?
 
Stefan Näwe

Alex said:
Here is some program with using strtok() and std::string.

Why does strtok() affect str2 in function func2()?

...
void func2()
{
string str1 ("abcd\nxyz");
string str2 (str1);
cout << "\tBEFORE" << endl;
cout << "str1 = <" << str1 << ">" << endl;
cout << "str2 = <" << str2 << ">" << endl;

strtok ((char*)str1.c_str(), "\n");

This is definitely not allowed!
std::string::c_str() returns a const char*, which can't simply be cast
to a char* and then written through.

You're begging for trouble...

Stefan
 
Pete Becker

Alex said:
But str1 and str2 are different instances.
Why do they share the same storage?

Because they hold the same text. Some implementations use copy-on-write:
when you copy a string, the new one gets a pointer to the same internal
data structure as the old one; when you change either one through the
normal interface it first makes a copy, so that only the string that's
being changed gets changed. If an application spends a lot of time
copying strings around but not modifying them, this can make it faster,
because it doesn't have to reallocate internal storage for every copy.
Your code bypasses the internal checking, which lets this implementation
detail show through.
 
