I find a BIG bug of VS 2005 about string class!

Lighter

I find a BIG bug of VS 2005 about string class!

#include <iostream>
#include <string>

using namespace std;

string GetStr()
{
    return string("Hello");
}

int main(int argc, char *argv[])
{
    const char* p = GetStr().c_str();

    cout << p << endl;

    cin.get();
}

============================

When you run the code above, a Dev-C++ build outputs the string "Hello"
as expected; a VS 2005 build, however, outputs an empty string!

In my opinion, the string class is vital to almost every C++ program, and
VS 2005 is a C++ compiler widely used by programmers. If the string
class is wrongly implemented by Microsoft, that will be a disaster!
 
Lighter

I'm the OP. VS 2005 is right. I was wrong.

GetStr() returns a temporary object, and that object is destroyed at the
end of the statement that calls it, so the pointer obtained from c_str()
is no longer valid afterwards.

I'm very sorry for my misunderstanding.
 
benben

Lighter said:
I find a BIG bug of VS 2005 about string class!

#include <iostream>
#include <string>

using namespace std;

string GetStr()
{
return string("Hello");
}

int main(int argc, char *argv[])
{
const char* p = GetStr().c_str();

cout << p << endl;

cin.get();
}

I find a big bug in your code!

The GetStr() call creates a temporary string object, from which you ask
for a C string. The C string is managed by the temporary string object,
so when the temporary gets destroyed at the end of the first statement
in main(), the C string is invalidated.

The second statement in main() then reads through an invalid
pointer, and the result is UB (Undefined Behavior).

Regards,
Ben
 
Colander

benben said:
Lighter said:
I find a BIG bug of VS 2005 about string class!

#include <iostream>
#include <string>

using namespace std;

string GetStr()
{
return string("Hello");
}

int main(int argc, char *argv[])
{
const char* p = GetStr().c_str();

cout << p << endl;

cin.get();
}

I find a big bug in your code!

the GetStr() call creates a temporary string object from which you asked
for a c string. The c string is managed by the temporary string object
so when the temporary gets destroyed at the end of the first statement
in main(), the c string gets invalidated.

I wonder;

Somewhere (don't recall) I read that modern compilers rewrite the thing
like the following:

void GetStr(string &result)
{
result = string("hello");
}

and nicely rewrite the function call as well.

When tested this works fine, but can I rely on it?

Thanks,
colander
 
peter koch

Colander wrote:

[snip]
I wonder;

Somewhere (don't recall) I read that modern compilers rewrite the thing
like the following:

void GetStr(string &result)
{
result = string("hello");
}

and nicely rewrite the function call as well.

This is not quite correct. What often happens is that it is written as

void GetStr(void* preallocated_result)
{
    new(preallocated_result) string("hello");
}

When tested this works fine, but can I rely on it?

Also, the reason that the tested code works fine is that it differs from
the original code. Thus you cannot rely on it.

/Peter
 
Siam

benben said:
Lighter said:
I find a BIG bug of VS 2005 about string class!

#include <iostream>
#include <string>

using namespace std;

string GetStr()
{
return string("Hello");
}

int main(int argc, char *argv[])
{
const char* p = GetStr().c_str();

cout << p << endl;

cin.get();
}

I find a big bug in your code!

the GetStr() call creates a temporary string object from which you asked
for a c string. The c string is managed by the temporary string object
so when the temporary gets destroyed at the end of the first statement
in main(), the c string gets invalidated.

And the second statement in main() tries to access through an invalid
pointer, and the result is UB (Undefined Behavior.)

Regards,
Ben

I don't understand this... I thought string literals are stored in
memory for the duration of the program, and don't obey the normal
scoping/deallocation rules? For example, I've been told the following
code is perfectly valid:


#include <iostream>

using namespace std;

const char * GetStr()
{
    const char* t = "Hello";
    return t;
}

int main(int argc, char *argv[])
{
    const char* p = GetStr();

    cout << p << endl;

    cin.get();
}

In fact, the above code works in VC as well, while the previous code
doesn't. It seems that when creating a string object from a string
literal, the string object takes control of it and destroys it when its
destructor is called... Is this correct, and if so why?

Furthermore, I would have thought that string objects wouldn't even be
dependent on the C-style string literal used to construct them. When the
string is returned from GetStr(), I would have thought a (deep) copy
would have been made of the object, which contains all the information
it needs about the string. Calling c_str on it should then just return
a new string literal based on the string data, neh?

I've always hated this string stuff!

Siam

Cheers
 
Philip Potter

Siam said:
I don't understand this... I thought string literals are stored in
memory for the duration of the program, and don't obey the normal
scoping/deallocation rules? For example, I've been told the following
code is perfectly valid:


const char * GetStr()
{
const char* t = "Hello";
return t;
}

int main(int argc, char *argv[])
{
const char* p = GetStr();

cout << p << endl;

cin.get();
}

The above code *is* perfectly valid. Your confusion arises from the term
"string literal". A string literal is not of type std::string - it is an
array of const char with static duration (that is, program duration), and
in most expressions it converts to a pointer (const char *) to its first
element.
In fact, the above code works in VC as well, while the previous code
doesn't. It seems that when creating a string object from a string
literal, the string object takes control of it and destroys it when its
destructor is called... Is this correct, and if so why?

Furthermore, I would have thought that string objects wouldn't even be
dependent on the C-style string literal used to construct them?

What do you mean? It *has* to be dependent on the c-style string used to
construct it in order to get the characters out of it. Once it has done
that, though, the c-style string doesn't matter anymore - a copy is stored
in the std::string object.
When the
string is returned from GetStr(), I would have thought a (deep) copy
would have been made of the object, which contains all the information
it needs about the string. Calling c_str on it should then just return
a new string literal based on the string data, neh?

Nearly right, but not quite. You are right that a deep copy is made (though
the copy may be elided by an optimizer), and that this copy contains all the
information the std::string object needs. However, calling c_str() does not
return a string literal - it returns a const char *, pointing to an area of
memory managed by the std::string object. When that (temporary) object is
destroyed at the end of the full-expression, that memory is no longer valid.
And so when this saved pointer is used later, you invoke UB.
I've always hated this string stuff!

Learn pointers, and you learn C strings. A char * is not a char[]. (The
comp.lang.c FAQ is quite useful reading on this point.)

Philip
 
Siam

Philip said:
What do you mean? It *has* to be dependent on the c-style string used to
construct it in order to get the characters out of it. Once it has done
that, though, the c-style string doesn't matter anymore - a copy is stored
in the std::string object.

Yeah, that's what I meant... Surely after you've constructed the string
object, the object isn't dependent on the value held in the location of
the C-style string... Wouldn't all the characters have been copied
across into the string object? Would a string object ever hold a handle
to the original memory location of the C-style string used to construct
it?
You are right that a deep copy is made (though
it may be neglected by an optimizer), and that this copy contains all the
information the std::string object needs.

....including a handle to a region of memory containing the characters
of the string?
However, calling c_str() does not
return a string literal - it returns a char *, pointing to an area of memory
managed by the std::string object.

Is this area of memory the same area of memory held by the c-style
string used to construct it? Or does the string have its own area of
memory holding the characters?
When that (temporary) object is destroyed
at the end of the expression, that memory is no longer valid. And so when
this saved pointer is used later, you invoke UB.

When the temporary object is returned, isn't it copy-constructed to the
local scope where the function had been called - thus also allocating
its own region of memory to store the characters? Why should it then
matter if the original temporary object is destroyed? Surely that
wouldn't affect a copy-constructed string with its own (copied) data?

Siam
 
Philip Potter

Siam said:
Yeah, that's what I meant... Surely after you've constructed the string
object, the object isn't dependent on the value held in the location of
the C-style string... Wouldn't all the characters have been copied
across into the string object?

Yes, you are correct.
Would a string object ever hold a handle
to the original memory location of the c-style string used to construct
it?

No. Not all std::strings are constructed with c-style strings anyway.
...including a handle to a region of memory containing the characters
of the string?

Yes. The std::string needs to know its contents somehow, doesn't it?
Is this area of memory the same area of memory held by the c-style
string used to construct it?

No, because the std::string is no longer dependent on the c-style string it
was constructed from. The std::string uses its own piece of memory to hold
the contents, and its c_str() method returns a handle to this memory. This
handle becomes invalid when the std::string which supplied it is destroyed.
When the temporary object is returned, isn't it copy-constructed to the
local scope where the function had been called - thus also allocating
its own region of memory to store the characters? Why should it then
matter if the original temporary object is destroyed?

You're right, that doesn't matter. But the newly copy-constructed temporary
is *also* destroyed, because it is a temporary and therefore not guaranteed
to last longer than the current expression.

Philip
 
Siam

Philip said:
You're right, that doesn't matter. But the newly copy-constructed temporary
is *also* destroyed, because it is a temporary and therefore not guaranteed
to last longer than the current expression.

Aah, damit, now I got it! You weren't talking about the destruction of
the object in the function's scope, but the destruction of the
temporary created when it's called!

So is it always incorrect to do something like the following:

MyType t = SomeReturnByValueFunction(...).SomeMethod(...);

Does the object returned by SomeReturnByValueFunction get destroyed
once we reached the period?

Cheers!

Siam
 
Siam

Let me rephrase that!

Is it always incorrect to do something like the following:

SomeHandle t =
SomeReturnByValueFunction(...).SomeMethodReturningInternalHandle(...);

t->Something( ); //or any expression using t

Does the object returned by SomeReturnByValueFunction get destroyed
once we reached the semicolon?


Cheers,

Siam
 
Old Wolf

Siam said:
Aah, damit, now I got it! You weren't talking about the destruction of
the object in the function's scope, but the destruction of the
temporary created when it's called!
Yes

So is it always incorrect to do something like the following:

MyType t = SomeReturnByValueFunction(...).SomeMethod(...);

Does the object returned by SomeReturnByValueFunction get destroyed
once we reached the period?

The temporary object gets destroyed at the end of the
full-expression, i.e. when the semicolon is reached. Your code is
correct, as long as 't' only holds a copy of data from the temporary
object (e.g. if MyType is int). The original code was incorrect
because its pointer points inside the temporary object, and the
pointer is then used after the temporary object is gone.
 
Philip Potter

Siam said:
Let me rephrase that!

Is it always incorrect to do something like the following:

SomeHandle t =
SomeReturnByValueFunction(...).SomeMethodReturningInternalHandle(...);

t->Something( ); //or any expression using t

That depends precisely on the semantics of your code. However, in general,
this isn't safe. I think you've got the right idea.

To be precise: if SomeMethodReturningInternalHandle() makes no guarantee
that the handle will be valid after the object is destroyed, then this isn't
safe.

It's good of you to think in terms of handles rather than pointers here;
this applies also to references, streams which may be closed by the object's
destructor, and any other resource which the object shares. See the class's
documentation for details of what is safe and what isn't.
Does the object returned by SomeReturnByValueFunction get destroyed
once we reached the semicolon?

Yes. (I don't know if this is guaranteed, but it's certainly not safe to
assume that the object still exists after the end of the expression.)

Philip
 
benben

It's good of you to think in terms of handles rather than pointers here;
this applies also to references, streams which may be closed by the object's
destructor, and any other resource which the object shares. See the class's
documentation for details of what is safe and what isn't.

Just to add another "handle-like" category here that people too often
overlook: STL iterators.

Be very careful though: these "handles" are not guaranteed to stay valid
until the owner gets destroyed. They can be invalidated whenever the owner
object feels like it, most probably when the state of the object changes.
Pointers from std::string::c_str() and vector iterators are examples.

Enjoy,
Ben
 
dasjotre

Lighter said:
I find a BIG bug of VS 2005 about string class!

I find a bug in your code :)

const char* p = GetStr().c_str();

The temporary is destroyed and its memory deallocated, but it
is unlikely that the memory is returned to the OS; that is, you
can often still read it without causing a segmentation fault.

cout << p << endl;

If you're lucky you'll get away with it, but
either way you have undefined behaviour.

It is the same as saying there is a bug in auto_ptr
because you can do

auto_ptr<T> t(new T);
delete t.get();
use t

Just don't do it!
 
