Are std::string class instances immutable or not?

praveenraj1987

I have a question about the std::string class:
are its instances immutable or not, as in other languages?

.NET Framework:
system.string class instances are immutable.

Java:
java.lang.string class instances are immutable.

Is it the same in C++?
Are std::string class instances immutable or not?

I hope they are not,
since it is a typedef of template<class charT, class traits =
char_traits<charT>, class Allocator = allocator<charT> > class
basic_string.
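
A minimal sketch of what the typedef does and does not tell you. Being an alias of basic_string<char> says nothing about mutability by itself; the mutability comes from the non-const member functions such as operator+= and operator[]. The helper name shout is hypothetical and only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>

// std::string is the char instantiation of basic_string with the default
// traits and allocator.
static_assert(std::is_same<std::string,
                           std::basic_string<char, std::char_traits<char>,
                                             std::allocator<char>>>::value,
              "std::string is basic_string<char>");

// Hypothetical helper: changes its argument in place through the
// non-const interface of std::string.
inline void shout(std::string& s) {
    if (!s.empty()) s[0] = 'H';  // overwrite a single character
    s += '!';                    // append to the same object
}
```

Calling shout on a string holding "hello" leaves that same object holding "Hello!".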
 
praveenraj1987

Thanks a lot for your response. :)
Actually, there was a guy who did not agree with this.
 
SG

I have a question about the std::string class:
are its instances immutable or not, as in other languages?

.NET Framework:
system.string class instances are immutable.

The string class in C# is funny. It is a class (reference type) but
string is immutable and has many overloaded operators so you can use
it like a value type:

string a = foo();
a += bar(); // a points to another immutable string now (C#)

Assembling a string that way is very costly which is why in C# and in
Java you typically have a StringBuilder class which is like a string
but mutable.

In C++ a non-const string is mutable and has value semantics:

string a = foo();
a += bar(); // a will be the same string object (C++)

There's no need for a "StringBuilder" class.
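
A minimal sketch of the value semantics described above: += changes the object itself, and a copy is an independent object. The function name modify_copy is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Returns {a, b} where b started as a copy of a and was then modified.
inline std::pair<std::string, std::string> modify_copy() {
    std::string a = "foo";
    a += "bar";          // a is still the same object, now holding "foobar"
    std::string b = a;   // b is an independent copy (value semantics)
    b[0] = 'g';          // changes b only
    return std::make_pair(a, b);
}
```

Here the first element stays "foobar" while the second becomes "goobar", because the two strings do not share state as far as the program can observe.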

HTH,
SG
 
Ian Collins

Jeff said:
Ian said:
A simple bit of test code would have put him straight!

True in this case, but misleading in general. Suppose he had been
wondering about C-style string literals, rather than std::string:

int main() {
    char* p = "hello";
    p[1] = 'a';
}

"/tmp/z.cc", line 2: Warning: String literal converted to char* in
initialization.
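
For contrast, a minimal sketch of the well-defined way to get a modifiable version of a literal. Writing through a pointer to a string literal is undefined behavior, and the conversion from a literal to char* was deprecated in C++03 and is ill-formed since C++11. The function name patched_greeting is hypothetical.

```cpp
#include <cassert>
#include <string>

// Copies the literal into a std::string, which owns a writable buffer.
inline std::string patched_greeting() {
    const char* p = "hello";  // a literal is an array of const char
    std::string s = p;        // s holds its own copy of the characters
    s[1] = 'a';               // fine: modifies the copy, not the literal
    return s;
}
```

The literal itself is never written to; only the std::string copy changes.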
 
Default User

Ian said:
Jeff said:
Ian said:
(e-mail address removed) wrote:
Thanks a lot for your response. :)
Actually, there was a guy who did not agree with this.

A simple bit of test code would have put him straight!

True in this case, but misleading in general. Suppose he had been
wondering about C-style string literals, rather than std::string:

int main() {
    char* p = "hello";
    p[1] = 'a';
}

"/tmp/z.cc", line 2: Warning: String literal converted to char* in
initialization.

While a very useful diagnostic, it is not a required one.



Brian
 
Ian Collins

Default said:
Ian said:
Jeff said:
Ian Collins wrote:
(e-mail address removed) wrote:
Thanks a lot for your response. :)
Actually, there was a guy who did not agree with this.

A simple bit of test code would have put him straight!

True in this case, but misleading in general. Suppose he had been
wondering about C-style string literals, rather than std::string:

int main() {
    char* p = "hello";
    p[1] = 'a';
}

"/tmp/z.cc", line 2: Warning: String literal converted to char* in
initialization.

While a very useful diagnostic, it is not a required one.

Alas, true. Which is a shame.
 
Daniel Pitts

I have a question about the std::string class:
are its instances immutable or not, as in other languages?

.NET Framework:
system.string class instances are immutable.

Java:
java.lang.string class instances are immutable.

No such class exists, but the java.lang.String class is indeed
"immutable" (with the exception of the internally cached hashCode()
value; externally it is immutable).
I believe .NET String is also immutable, but I don't know.

Is it the same in C++?
Are std::string class instances immutable or not?

std::string is definitely mutable, "const std::string" on the other hand
is not.
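
A minimal sketch of that distinction, checked at compile time. The helper with_suffix is hypothetical; it shows that a const std::string can only be "modified" by working on a copy.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// A non-const string can be assigned to; a const one cannot, because
// operator= is a non-const member function.
static_assert(std::is_assignable<std::string&, const char*>::value,
              "std::string is mutable");
static_assert(!std::is_assignable<const std::string&, const char*>::value,
              "const std::string rejects assignment");

// Hypothetical helper: takes a const reference, so it has to copy first.
inline std::string with_suffix(const std::string& s) {
    // s += "!";            // would not compile: operator+= is non-const
    std::string copy = s;   // work on a separate object instead
    copy += "!";
    return copy;
}
```

The caller's const string is left untouched; only the returned copy carries the suffix.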

I hope they are not,

Your hope is answered.

since it is a typedef of template<class charT, class traits =
char_traits<charT>, class Allocator = allocator<charT> > class
basic_string.

What does this have to do with your hope?
 
praveenraj1987

Your hope is answered.

as it is typedef of template<class charT, class traits =

What does this have to do with your hope?

Because basic_string instances are mutable. :D
 
SG

This information is misleading, I believe.

Misleading was certainly not my intention.

How is the fact that in C# 'a' suddenly "points to another" string
different than the fact that in C++ 'a' *can* reallocate its buffer?

Yes, there's little difference. But in C# the string _object_ is a new
one (as in identity) and the object _reference_ has been mutated,
whereas in C++ the string _object_ is mutated.

To the programmer, 'a' is the same in both cases.

Yes.

So, semantically, C++ string and C# strings are the same, mutable.

If by "C# string" you mean a C# string reference then yes. The C#
string _object_ isn't. But we can say that a C# string object
_reference_ behaves like a "value type" (in .NET terminology). That's
why I said "C# string is funny". I didn't point out explicitly that to
a programmer there's no difference but it was part of what I tried to
communicate.
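
The object/reference distinction can be sketched in C++ by modelling a C#-style string reference as a shared pointer to a const string. This is only an analogy under that assumption, not how .NET implements strings; StringRef and append_rebind are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical alias: a rebindable reference to an immutable string object.
using StringRef = std::shared_ptr<const std::string>;

// "a += suffix" in the reference model: build a new string object and
// rebind the reference to it. The old object is never modified.
inline void append_rebind(StringRef& a, const std::string& suffix) {
    a = std::make_shared<std::string>(*a + suffix);
}
```

If a second StringRef aliases the original object before append_rebind is called, it keeps seeing the old text. With a plain std::string and a C++ reference to it, += changes the one object and every reference observes the new text.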

Cheers!
SG
 
Alf P. Steinbach

* Victor Bazarov:
std::string (that is, std::basic_string) is unfortunately not immutable.

This information is misleading, I believe. How is the fact that in C#
'a' suddenly "points to another" string different than the fact that in
C++ 'a' *can* reallocate its buffer? To the programmer, 'a' is the same
in both cases. So, semantically, C++ string and C# strings are the
same, mutable. It may be expensive to mutate them, true. It even can
be more expensive in C# than in C++, but *semantically* they are pretty
much the same.

Sorry, that's a misunderstanding.

Immutable strings can easily support assignment. Immutability for a string type
does not refer to the conceptual complete string value but to the individual
characters in a value. The reason it matters that the implementation knows about
this constraint (otherwise we could just typedef a const version of std::string)
is that it allows extreme optimization, and avoids a number of serious problems.

With immutable strings you have guaranteed efficiency, constant time for most
operations, including constructing a string from a literal, returning a string,
etc., depending on the implementation even for substring extraction, whereas
with std::string you have to just hope for the best, depending on the
implementation.
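
A minimal sketch of why immutability enables such guarantees, assuming a shared read-only buffer plus an offset and a length. ImmutableString is a hypothetical class written for this illustration, not a standard or library type, and unlike the design described above it still copies once when constructed from a std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

class ImmutableString {
public:
    // O(n): copies the characters once into a shared, read-only buffer.
    explicit ImmutableString(const std::string& s)
        : buf_(std::make_shared<std::string>(s)), off_(0), len_(s.size()) {}

    // O(1): shares the buffer and only adjusts offset and length. This is
    // safe because no one can ever change the characters.
    ImmutableString substr(std::size_t pos, std::size_t count) const {
        if (pos > len_) pos = len_;
        if (count > len_ - pos) count = len_ - pos;
        return ImmutableString(buf_, off_ + pos, count);
    }

    std::size_t size() const { return len_; }
    char operator[](std::size_t i) const { return (*buf_)[off_ + i]; }  // read-only
    std::string str() const { return buf_->substr(off_, len_); }        // O(n) copy out
    bool shares_buffer_with(const ImmutableString& other) const {
        return buf_ == other.buf_;
    }

private:
    ImmutableString(std::shared_ptr<const std::string> buf,
                    std::size_t off, std::size_t len)
        : buf_(std::move(buf)), off_(off), len_(len) {}

    std::shared_ptr<const std::string> buf_;
    std::size_t off_;
    std::size_t len_;
};
```

Copying an ImmutableString (the implicit copy constructor) and taking a substring both cost a reference-count update rather than a character copy.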


Cheers, & hth.,

- Alf
 
Alf P. Steinbach

* SG:
The string class in C# is funny. It is a class (reference type) but
string is immutable and has many overloaded operators so you can use
it like a value type:

string a = foo();
a += bar(); // a points to another immutable string now (C#)

Assembling a string that way is very costly which is why in C# and in
Java you typically have a StringBuilder class which is like a string
but mutable.

I'm sorry but that is incorrect.

If '+=' is "very costly" then you have a compiler or VM with an extremely lousy
implementation of the operation, something cooked up by an utter novice.

With any reasonable string implementation '+=' is the most efficient possible
way to do concatenation, and since it avoids at least one conversion call it's
then more efficient than using a string builder (buffer) object.



Cheers & hth.,

- Alf
 
SG

* SG:

I'm sorry but that is incorrect.

In Java it is the case (just tested on Sun's JVM/compiler 1.6.0_10), by
which I mean that

a = a + ".";

in a loop is horribly slow. You are supposed to use a
java.lang.StringBuilder for this.

I'm not familiar with C#/.NET and it looks like there might be
compiler/VM magic involved w.r.t. the string class. So, yes, I can
imagine that in the .NET world string's "+=" isn't as bad as Java's
version. Still, what's the purpose of StringBuilder in C# if not to
speed up string assembly?

With any reasonable string implementation '+=' is the most efficient possible
way to do concatenation, and since it avoids at least one conversion call it's
then more efficient than using a string builder (buffer) object.

I don't know what you mean by "conversion call" in this context
but ... Yes, I can imagine an implementation where string objects
share character buffers and only manage their own start/end pointers.
So, if there's some yet unused and big enough room left in that buffer
there's no need to allocate a new buffer for concatenation. But you
might need to do some locking/synchronization.

In Java there is also a String member function "concat" which could do
what I described above. Just for kicks and giggles I wrote a simple
test in Java:

1: String a = "";
   for (int k=0; k<0x10000; ++k) {
     a = a + ".";
   }

2: String a = "";
   for (int k=0; k<0x10000; ++k) {
     a = a.concat(".");
   }

3: StringBuilder sb = new StringBuilder();
   for (int k=0; k<0x10000; ++k) {
     sb.append(".");
   }
   String a = sb.toString();

Test | Runtime
-----+---------------
  1  | 10.369 seconds
  2  |  2.624 seconds
  3  |  0.076 seconds

As far as I know java.lang.StringBuilder doesn't do any kind of
locking/synchronization which is probably one reason it is so fast.

Alf, care to provide some C# test results just for the heck of it?

Cheers!
SG
 
SG

I'm sorry but that is incorrect.

In Java it is the case.
[...]

1: String a = "";
   for (int k=0; k<0x10000; ++k) {
     a = a + ".";
   }

2: String a = "";
   for (int k=0; k<0x10000; ++k) {
     a = a.concat(".");
   }

3: StringBuilder sb = new StringBuilder();
   for (int k=0; k<0x10000; ++k) {
     sb.append(".");
   }
   String a = sb.toString();

Some more accurate measurements on an Intel Core 2 Duo E6600 with
dual-channel memory access:

1: 10365.00 ms
2: 2568.00 ms
3: 1.46 ms (average on 100 passes)

It's odd that the compiler -- since it's already using its magic for
strings -- doesn't map operator+ to String.concat, which would be four
times faster.

Alf, care to provide some C# test results just for the heck of it?

No need. The following quote from Microsoft's C# documentation seems to
back me up here. ;)

"The String object is immutable. Every time you use one of
the methods in the System.String class, you create a new
string object in memory, which requires a new allocation
of space for that new object. In situations where you need
to perform repeated modifications to a string, the overhead
associated with creating a new String object can be costly.
The System.Text.StringBuilder class can be used when you
want to modify a string without creating a new object. For
example, using the StringBuilder class can boost performance
when concatenating many strings together in a loop."

Cheers!
SG
 
Alf P. Steinbach

* SG:
In Java it is the case (just tested on Sun's JVM/compiler 1.6.0_10), by
which I mean that

a = a + ".";

in a loop is horribly slow. You are supposed to use a
java.lang.StringBuilder for this.

I'm sorry but that is a misunderstanding and /very/ different from the example I
commented on.

It is however somewhat related, and using infix '+' is a very common newbie Bad
Coding Style (also reportedly employed by MS! :) ), so I'll comment on it.

a = a + ".";

is of necessity slow unless you have a supersmart optimizer, because it
constructs a separate new string instance. It therefore generally leads to
O(n^2) time when it's repeated in a loop. That is, the total time is
proportional to the /square/ of the final number of characters.

a += ".";

is on the other hand the potentially fastest way to do concatenation, and with
any reasonable implementation of '+=' yields O(n) time.

*However*, with Java and C# it seems (as you found, and as I also found now when
repeating your testing) that '+=' is really inefficient.

I.e., at least with the compilers used, it seems that those languages have really
lousy '+=' operator implementations.

The difference between infix '+' and the '+=' update operator is what the 'a'
object can do, what knowledge it has of what's going on. With '+=' it only has
to make itself the sole owner of its buffer (if it isn't already) and then append
in that buffer. A reasonable implementation then doubles the buffer in size
as necessary to achieve amortized O(n) behavior.
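
The O(n^2) versus amortized O(n) claim can be checked with a small counting model. This is a sketch under simplifying assumptions (one character appended per step, a capacity that starts at 1 and exactly doubles), not a description of any particular library; both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Characters written when every append builds a fresh string: the old
// contents are copied and one new character is added (a = a + ".").
inline std::size_t copies_fresh_each_time(std::size_t n) {
    std::size_t copied = 0;
    for (std::size_t len = 0; len < n; ++len) {
        copied += len + 1;  // len old characters plus the new one
    }
    return copied;          // n*(n+1)/2, i.e. quadratic in n
}

// Characters written when appending in place into a buffer whose
// capacity doubles whenever it is full (a += ".").
inline std::size_t copies_with_doubling(std::size_t n) {
    std::size_t copied = 0, cap = 1, len = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (len == cap) {   // buffer full: reallocate and move contents
            copied += len;
            cap *= 2;
        }
        ++copied;           // write the new character
        ++len;
    }
    return copied;          // stays below 3*n, i.e. linear in n
}
```

For n = 0x8000 the first model writes 536,887,296 characters and the second writes 65,535, which is the gap the paragraph above describes.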

I'm not familiar with C#/.NET and it looks like there might be
compiler/VM magic involved w.r.t. the string class. So, yes, I can
imagine that in the .NET world string's "+=" isn't as bad as Java's
version.

Have you really tested Java's version of '+=' or have you tested infix '+'?

Anyways, my own testing, shown below, yields that unexpectedly bad result.

But the compiler and VM I used are old.

Still, what's the purpose of StringBuilder in C# if it wasn't
for speeding up string assembly.


I don't know what you mean by "conversion call" in this context

Preparatory conversion from original string to string buffer, final conversion
from string buffer to string.

but ... Yes, I can imagine an implementation where string objects
share character buffers and only manage their own start/end pointers.

Yes, just about.

So, if there's some yet unused and big enough room left in that buffer
there's no need to allocate a new buffer for concatenation. But you
might need to do some locking/synchronization.

Only for the reference counting.

In Java there is also a String member function "concat" which could do
what I described above. Just for kicks and giggles I wrote a simple
test in Java:

1: String a = "";
   for (int k=0; k<0x10000; ++k) {
     a = a + ".";
   }

2: String a = "";
   for (int k=0; k<0x10000; ++k) {
     a = a.concat(".");
   }

3: StringBuilder sb = new StringBuilder();
   for (int k=0; k<0x10000; ++k) {
     sb.append(".");
   }
   String a = sb.toString();

Test | Runtime
-----+---------------
  1  | 10.369 seconds
  2  |  2.624 seconds
  3  |  0.076 seconds

#3 looks like C# not Java, and you're not testing '+='.

But I'm surprised at the results, which I've duplicated for both Java and C#,
and in particular for '+='.

Even though my compilers are old, it seems that both Java and C# have really
inefficient '+=' implementations.

As far as I know java.lang.StringBuilder doesn't do any kind of
locking/synchronization which is probably one reason it is so fast.

Alf, care to provide some C# test results just for the heck of it?

OK, but note that these are OLD compilers and VMs.

<versions>
C:\temp> java -version
java version "1.6.0_05"
Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
Java HotSpot(TM) Client VM (build 10.0-b19, mixed mode, sharing)

C:\temp> csc
Microsoft (R) Visual C# 2005 Compiler version 8.00.50727.1433
for Microsoft (R) Windows (R) 2005 Framework version 2.0.50727
Copyright (C) Microsoft Corporation 2001-2005. All rights reserved.

fatal error CS2008: No inputs specified

C:\temp> g++ --version
g++ (GCC) 3.4.5 (mingw-vista special r3)
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


C:\temp>
</versions>


<code language="Java">
import java.util.Date;
import java.text.DecimalFormat;

// Records the wall-clock time at the moment of construction.
class Time
{
    private long myMilliSecs;

    public Time()
    {
        myMilliSecs = new Date().getTime();
    }

    public long msecs() { return myMilliSecs; }
    public double secs() { return myMilliSecs/1000.0; }
}

class ConcatTest
{
    static final long n = 0x8000;

    // Infix '+': creates a new String on every iteration.
    static void doInfixAdd()
    {
        String a = "";
        for( int i = 1; i <= n; ++i )
        {
            a = a + ".";
        }
    }

    // String.concat member function.
    static void doConcatAdd()
    {
        String a = "";
        for( int i = 1; i <= n; ++i )
        {
            a = a.concat( "." );
        }
    }

    // Mutable buffer, converted to a String once at the end.
    static void doBuilder()
    {
        StringBuffer sb = new StringBuffer();
        for( int i = 1; i <= n; ++i )
        {
            sb.append(".");
        }
        String a = sb.toString();
    }

    // '+=' update operator.
    static void doUpdateAdd()
    {
        String a = "";
        for( int i = 1; i <= n; ++i )
        {
            a += ".";
        }
    }

    static void printResult( String s, Time start, Time end )
    {
        String num = new DecimalFormat( "00.00" ).format( end.secs() - start.secs() );
        System.out.println( s + ": " + num );
    }

    public static void main( String[] args )
    {
        Time startTime;

        startTime = new Time();
        doInfixAdd();
        printResult( "Infix '+'    ", startTime, new Time() );

        startTime = new Time();
        doConcatAdd();
        printResult( "concat       ", startTime, new Time() );

        startTime = new Time();
        doBuilder();
        printResult( "builder      ", startTime, new Time() );

        startTime = new Time();
        doUpdateAdd();
        printResult( "'+=' operator", startTime, new Time() );
    }
}
</code>

<results language="Java">
C:\temp> java ConcatTest
Infix '+' : 12,33
concat : 02,84
builder : 00,02
'+=' operator: 11,83

C:\temp> java ConcatTest
Infix '+' : 11,45
concat : 02,88
builder : 00,00
'+=' operator: 11,73

C:\temp> java ConcatTest
Infix '+' : 11,40
concat : 02,85
builder : 00,00
'+=' operator: 11,42

C:\temp>
</results>


<code language="C#">
using DateTime = System.DateTime;
using TimeSpan = System.TimeSpan;
using StringBuilder = System.Text.StringBuilder;

class ConcatTest
{
    private const long n = 0x8000;

    // Infix '+'.
    static void doInfixAdd()
    {
        string a = "";
        for( int i = 1; i <= n; ++i )
        {
            a = a + ".";
        }
    }

    // Static string.Concat method.
    static void doConcatAdd()
    {
        string a = "";
        for( int i = 1; i <= n; ++i )
        {
            a = string.Concat( a, "." );
        }
    }

    // Mutable builder, converted to a string once at the end.
    static void doBuilder()
    {
        StringBuilder sb = new StringBuilder();
        for( int i = 1; i <= n; ++i )
        {
            sb.Append( "." );
        }
        string a = sb.ToString();
    }

    // '+=' update operator.
    static void doUpdateAdd()
    {
        string a = "";
        for( int i = 1; i <= n; ++i )
        {
            a += ".";
        }
    }

    static void printResult( string s, DateTime start, DateTime end )
    {
        // TotalMilliseconds is the whole elapsed time; the Milliseconds property
        // is only the 0-999 component and would wrap for runs over one second.
        string num = string.Format( "{0:00.00}", (end - start).TotalMilliseconds/1000.0 );
        System.Console.WriteLine( s + ": " + num );
    }

    static void Main()
    {
        DateTime startTime;

        startTime = DateTime.Now;
        doInfixAdd();
        printResult( "Infix '+'    ", startTime, DateTime.Now );

        startTime = DateTime.Now;
        doConcatAdd();
        printResult( "concat       ", startTime, DateTime.Now );

        startTime = DateTime.Now;
        doBuilder();
        printResult( "builder      ", startTime, DateTime.Now );

        startTime = DateTime.Now;
        doUpdateAdd();
        printResult( "'+=' operator", startTime, DateTime.Now );
    }
}
</code>

<results language="C#">
C:\temp> concat
Infix '+' : 00,46
concat : 00,44
builder : 00,09
'+=' operator: 00,38

C:\temp> concat
Infix '+' : 00,43
concat : 00,40
builder : 00,00
'+=' operator: 00,74

C:\temp> concat
Infix '+' : 00,40
concat : 00,38
builder : 00,00
'+=' operator: 00,49

C:\temp>
</results>


<code language="C++">
#include <iostream>
#include <iomanip> // std::fixed
#include <string>
#include <time.h> // clock

using namespace std;

long const n = 0x4000;

// A duration measured in clock() ticks.
class TimeSpan
{
private:
    clock_t myTicks;
public:
    TimeSpan( clock_t ticks ): myTicks( ticks ) {}
    double secs() const { return double(myTicks)/CLOCKS_PER_SEC; }
};

// Records the clock() value at the moment of construction.
class Time
{
private:
    clock_t myTicks;
public:
    Time(): myTicks( clock() ) {}
    TimeSpan operator-( Time const& rhs ) const { return myTicks - rhs.myTicks; }
};

// Infix '+': builds a temporary string on every iteration.
void doInfixAdd()
{
    string a = "";
    for( int i = 1; i <= n; ++i )
    {
        a = a + ".";
    }
}

// append() modifies a in place and returns a reference to it,
// so the assignment is a self-assignment.
void doConcatAdd()
{
    string a = "";
    for( int i = 1; i <= n; ++i )
    {
        a = a.append( "." );
    }
}

// Not applicable.
//void doBuilder() {}

// '+=' update operator: appends in place.
void doUpdateAdd()
{
    string a = "";
    for( int i = 1; i <= n; ++i )
    {
        a += ".";
    }
}

void printResult( string const& s, Time const& start, Time const& end )
{
    cout << fixed << setprecision( 4 );
    cout << s << ": " << (end - start).secs() << endl;
}

int main()
{
    Time startTime;

    startTime = Time();
    doInfixAdd();
    printResult( "Infix '+'    ", startTime, Time() );

    startTime = Time();
    doConcatAdd();
    printResult( "concat       ", startTime, Time() );

    startTime = Time();
    doUpdateAdd();
    printResult( "'+=' operator", startTime, Time() );
}
</code>

<results language="C++">
C:\temp> a
Infix '+' : 0.0940
concat : 0.0000
'+=' operator: 0.0000

C:\temp> a
Infix '+' : 0.1560
concat : 0.0000
'+=' operator: 0.0000

C:\temp> a
Infix '+' : 0.0930
concat : 0.0000
'+=' operator: 0.0000

C:\temp>
</results>


So, in summary, at least for C++ programming it is, as I noted earlier, a really
good idea to use '+=' (or equivalently 'append') instead of infix '+', and a
really bad idea to do the opposite. And it is surprising that the Java and C#
compilers don't implement '+=' efficiently; it's easy to do. I thought they did.
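
The 0.0000 readings above suggest the appends finished within a single clock() tick. A minimal sketch of the same comparison using std::chrono::steady_clock (C++11), which usually has a finer resolution; seconds_taken, build_infix and build_update are hypothetical names, and no timings are claimed here.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical helper: runs f once and returns the elapsed seconds.
template <class F>
double seconds_taken(F f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Infix '+': creates a temporary string on every iteration.
inline std::string build_infix(std::size_t n) {
    std::string a;
    for (std::size_t i = 0; i < n; ++i) a = a + ".";
    return a;
}

// '+=': appends in place, reallocating only when capacity runs out.
inline std::string build_update(std::size_t n) {
    std::string a;
    for (std::size_t i = 0; i < n; ++i) a += ".";
    return a;
}
```

A call such as seconds_taken([]{ build_update(0x8000); }) gives the elapsed time for one variant; an optimizing compiler may drop work whose result is unused, so keeping the returned string alive is the safer way to measure.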


Cheers & hth.,

- Alf
 
Alf P. Steinbach

* Alf P. Steinbach:

Oops, I forgot to comment on that.

Yes, it seems '+=' has inefficient implementations in Java and C#, but the
inefficient implementations are not inefficient because '+=' is costly for
immutable strings, as you wrote; on the contrary, '+=', as opposed to infix '+',
allows for maximum efficiency.

The Java and C# '+=' ops are just inefficient silly implementations.

For a faster implementation of '+=' for immutable strings you might look at (old
code and never quite finished) <url: http://alfsstringvalue.sourceforge.net/>;
of course, when I wrote that page & code, pointing to Java and C# for
efficiency, I didn't know that Java and C# are so inefficient! :)

And by the way, in the test programs I now see that the C++ version had just
half the number of operations, which made C++ look very, very efficient in
comparison; doubling the number of operations to match Java and C#
makes it just somewhat more efficient than C#:

<results language="C++">
C:\temp> a
Infix '+' : 0.3910
concat : 0.0000
'+=' operator: 0.0000

C:\temp> a
Infix '+' : 0.3290
concat : 0.0160
'+=' operator: 0.0000

C:\temp> a
Infix '+' : 0.3280
concat : 0.0000
'+=' operator: 0.0000

C:\temp>
</results>


Cheers & hth.,

- Alf
 
SG

Hi Alf,

thanks for taking the time to do this test! I really was curious about
how the different variants in C# and in Java compare but I don't have
a C# compiler handy.

I'm sorry but that is a misunderstanding and /very/ different from the example I
commented on.

I'm aware of that. I seem to have forgotten about the existence of +=
on strings in Java. The method I believed to come closest is the
member function String.concat, which I included in the test.

Have you really tested Java's version of '+=' or have you tested infix '+'?

I tested the three cases: 1. infix +, 2. member function concat, and 3.
StringBuilder.

#3 looks like C# not Java, and you're not testing '+='.

It is indeed a fragment of valid Java (except for the "3:"). You were
able to compile it as far as I can tell. :)

C:\temp> java ConcatTest
Infix '+'    : 12,33
concat       : 02,84
builder      : 00,02
'+=' operator: 11,83

This matches my measurements. I guess the JVM/compiler builders don't
see the need to optimize infix + and += given that a StringBuilder is
available.

<results language="C#">
C:\temp> concat
Infix '+'    : 00,46
concat       : 00,44
builder      : 00,09
'+=' operator: 00,38
Interesting.

<results language="C++">
C:\temp> a
Infix '+'    : 0.0940
concat       : 0.0000
'+=' operator: 0.0000

a = a.append("."); // is probably equivalent to
a = a += ".";      // where the self-assignment check just returns
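
A minimal sketch that checks the premise: std::string::append returns a reference to the string it was called on, so assigning the result back to the same variable is a self-assignment. The function name append_returns_self is hypothetical.

```cpp
#include <cassert>
#include <string>

// True if append handed back the very object it was called on.
inline bool append_returns_self(std::string& a) {
    std::string& result = a.append(".");  // modifies a in place
    return &result == &a;                 // same address, same object
}
```

Whether the library short-circuits the self-assignment is an implementation detail; either way the string ends up with exactly one extra character per call.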
So, in summary, at least for C++ programming it is, as I noted earlier, a really
good idea to use '+=' (or equivalently 'append') instead of infix '+', and a
really bad idea to do the opposite. And it is surprising that the Java and C#
compilers don't implement '+=' efficiently; it's easy to do. I thought they did.

Thanks again for the thorough testing.

Cheers!
SG
 
