simple code performance question

galiorenye · Oct 30, 2007

Hi,

Given this code:

A** ppA = new A*[10];
A *pA = NULL;
for(int i = 0; i < 10; ++i)
{
pA = ppA[i];
//do something with pA
}

- is there some performance penalty if pA declaration and assignment
will be inside the for-block as:

A *pA = ppA[i];

Regards,
ren

Victor Bazarov · Oct 30, 2007

Given this code:

A** ppA = new A*[10];
A *pA = NULL;
for(int i = 0; i < 10; ++i)
{
pA = ppA[i];
//do something with pA
}

- is there some performance penalty if pA declaration and assignment
will be inside the for-block as:

A *pA = ppA[i];

Of course not. However, the declaration/definition/initialisation
inside "the for-block" is much better from the code maintenance POV.
Presence of 'pA' variable outside has *no merit*. It only pollutes
the scope. Unless there is a need to use 'pA' in the same scope as
'ppA', 'pA' should be defined inside.

V
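
A minimal sketch of the two variants under discussion, for readers who want to compare them side by side. The struct `A`, its `value` member, and both function names are hypothetical and not from the original post. The only difference between the functions is where `pA` is declared; for a plain pointer an optimizing compiler would be expected to emit the same code for both, though that is something to check on your own implementation.

```cpp
#include <cassert>
#include <cstddef>

struct A { int value; };  // hypothetical payload type, for illustration only

// Variant 1: the pointer is declared before the loop and reassigned each time.
int sum_declared_outside(A* const* ppA, std::size_t n) {
    assert(ppA != nullptr);
    A* pA = nullptr;
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        pA = ppA[i];
        total += pA->value;
    }
    return total;  // pA is still in scope here, although nothing needs it
}

// Variant 2: the pointer is declared where it is used.
int sum_declared_inside(A* const* ppA, std::size_t n) {
    assert(ppA != nullptr);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        A* pA = ppA[i];
        total += pA->value;
    }
    return total;  // pA does not exist outside the loop body
}
```

The second form keeps `pA` out of the enclosing scope, which is the maintenance argument made above.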

glory2 · Oct 31, 2007

Thanks for the answer, but can you supply some more details?
Why there is no performance penalty here, is it because it is a
pointer type?
TIA
Oren

Victor Bazarov · Oct 31, 2007

Thanks for the answer, but can you supply some more details?
Why there is no performance penalty here, is it because it is a
pointer type?

Yes. What details are you looking for?

Let's say this: with a very bad compiler (which doesn't emit the
same code in both cases) there might be some difference, but so
minuscule that it would not matter when the big[ger] picture is
concerned.

V

glory2 · Oct 31, 2007

What is the difference between these cases in a bad compiler?
Is it that the second case allocates the pointer's memory on each loop
and this causes the (negligible) performance degradation?

Victor Bazarov · Oct 31, 2007

What is the difference between these cases in a bad compiler?
Is it that the second case allocates the pointer's memory on each loop
and this causes the (negligible) performance degradation?

Something like that. You should not expect such allocation to take too
much time, of course, even with a bad compiler. Automatic storage is
usually not overly slow. That's why the degradation (if any) is truly
negligible.

V

James Kanze · Nov 1, 2007

Yes. What details are you looking for?

Let's say this: with a very bad compiler (which doesn't emit the
same code in both cases) there might be some difference, but so
minuscule that it would not matter when the big[ger] picture is
concerned.

More importantly, the difference can go both ways.

A while back, someone suggested declaring an std::string outside
the loop, on the grounds that that would improve performance,
since the constructor and the destructor would only have to be
invoked once, and not each time through the loop. I wrote up a
small benchmark to see just how much difference it would make,
and with g++ (2.95.2 at the time, I think), it turned out that
declaring the variable in the loop was actually significantly
faster. For my benchmark---it all depended on what you were
doing.

The correct answer is (and I know you know it): don't worry
about it until the profiler says you have to, and then try both,
measuring, to see what actually happens on your implementation.
Anything else is just pure stupidity.
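
As a rough illustration of "try both, measuring", here is a small timing sketch. It is not the benchmark mentioned above, which is not shown in the thread. The names `make_label`, `total_length_outside`, `total_length_inside` and `time_ms` are assumptions made up for this example, and a single timed run like this is only a starting point: a real measurement would repeat the runs and use the actual workload.

```cpp
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical stand-in for whatever work produces the string.
std::string make_label(std::size_t i) {
    return "item-" + std::to_string(i);
}

// The string is declared once and assigned on every iteration.
std::size_t total_length_outside(std::size_t n) {
    std::size_t total = 0;
    std::string label;
    for (std::size_t i = 0; i < n; ++i) {
        label = make_label(i);
        total += label.size();
    }
    return total;
}

// The string is constructed and destroyed on every iteration.
std::size_t total_length_inside(std::size_t n) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::string label(make_label(i));
        total += label.size();
    }
    return total;
}

// Times one call of f(n) in milliseconds and stores f's result in `result`,
// so the compiler cannot discard the work as unused.
template <typename F>
double time_ms(F f, std::size_t n, std::size_t& result) {
    const auto start = std::chrono::steady_clock::now();
    result = f(n);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_ms(total_length_outside, n, r1)` and `time_ms(total_length_inside, n, r2)` with the same `n` gives two numbers to compare; which one is smaller depends on the compiler, the library and the workload.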

Elias Salomão Helou Neto · Nov 2, 2007

Thanks for the answer, but can you supply some more details?
Why there is no performance penalty here, is it because it is a
pointer type?

Yes. What details are you looking for?
Let's say this: with a very bad compiler (which doesn't emit the
same code in both cases) there might be some difference, but so
minuscule that it would not matter when the big[ger] picture is
concerned.

More importantly, the difference can go both ways.

A while back, someone suggested declaring an std::string outside
the loop, on the grounds that that would improve performance,
since the constructor and the destructor would only have to be
invoked once, and not each time through the loop. I wrote up a
small benchmark to see just how much difference it would make,
and with g++ (2.95.2 at the time, I think), it turned out that
declaring the variable in the loop was actually significantly
faster. For my benchmark---it all depended on what you were
doing.

How could that be possible? With a minimum of smartness in memory
management, both ways should be equally efficient in this regard. The
execution of any kind of code in constructors/destructors implicit
calls could also be easily avoided for std::strings when gcc detects
such constructs (though such optimization may not always be possible
for user-defined classes). But this only means that gcc should have
the same performance in both cases, otherwise it must not be doing a
good job on optimizing the constructor-outside-the-loop case.

In my humble opinion, it is safer to believe that code like this will
run faster:

#include <string>

void get_string_somehow( std::string& str ); // Implemented elsewhere.
void do_something_with_string( std::string& str ); // Implemented elsewhere.

int main()
{
    {
        std::string str;
        for ( unsigned i( 0 ); i < 100; ++i )
        {
            get_string_somehow( str );
            do_something_with_string( str );
        }
    }

    return( 0 );
}

Notice that concerns about name leaking are addressed by simply placing
braces {} around the relevant code. When it comes to pointers and
other fundamental types, however, I pretty much agree that declaring
inside loops is the way to go.

The correct answer is (and I know you know it): don't worry
about it until the profiler says you have to, and then try both,
measuring, to see what actually happens on your implementation.
Anything else is just pure stupidity.

I would say that taking some precautions while coding, i.e. before
proper profiling is possible, cannot be considered stupidity, but I do
agree that no a priori assertion can be made in this respect.

Elias Salomão Helou Neto

James Kanze · Nov 2, 2007

(e-mail address removed) wrote:
Thanks for the answer, but can you supply some more details?
Why there is no performance penalty here, is it because it is a
pointer type?
Yes. What details are you looking for?
Let's say this: with a very bad compiler (which doesn't emit the
same code in both cases) there might be some difference, but so
minuscule that it would not matter when the big[ger] picture is
concerned.

More importantly, the difference can go both ways.
A while back, someone suggested declaring an std::string outside
the loop, on the grounds that that would improve performance,
since the constructor and the destructor would only have to be
invoked once, and not each time through the loop. I wrote up a
small benchmark to see just how much difference it would make,
and with g++ (2.95.2 at the time, I think), it turned out that
declaring the variable in the loop was actually significantly
faster. For my benchmark---it all depended on what you were
doing.

How could that be possible?

Who knows? Who cares? I didn't do an extensive analysis of the
implementation of g++. The point remains that you cannot say
which is faster (if either) until you've measured the case which
interests you.

With a minimum of smartness in memory management, both ways
should be equally efficient in this regard. The execution of
any kind of code in constructors/destructors implicit calls
could also be easily avoided for std::strings when gcc detects
such constructs (though such optimization may not always be
possible for user-defined classes). But this only means that
gcc should have the same performance in both cases, otherwise
it must not be doing a good job on optimizing the
constructor-outside-the-loop case.

Or the implementation of std::string does something funny. Or
whatever. All it means is that assignment is more expensive
than construction/destruction, which shouldn't really surprise
anyone.

In my humble opinion, it is safer to believe that code like
this will run faster:

#include <string>

void get_string_somehow( std::string& str ); // Implemented elsewhere.
void do_something_with_string( std::string& str ); // Implemented elsewhere.

int main()
{
    {
        std::string str;
        for ( unsigned i( 0 ); i < 100; ++i )
        {
            get_string_somehow( str );
            do_something_with_string( str );
        }
    }
    return( 0 );
}

I don't know what you mean by "safer", but there's absolutely no
reason to believe anything of the kind. Depending on the
implementation of std::string, what get_string_somehow does, and
possibly any number of other factors, it's impossible to say
without measuring whether the above will be faster or slower
than any other particular solution.

glory2 · Nov 2, 2007

Thanks to all posters.
Oren

Elias Salomão Helou Neto · Nov 2, 2007

More importantly, the difference can go both ways.

Who knows? Who cares? I didn't do an extensive analysis of the
implementation of g++. The point remains that you cannot say
which is faster (if either) until you've measured the case which
interests you.

I understand this.

Or the implementation of std::string does something funny. Or
whatever. All it means is that assignment is more expensive
than construction/destruction, which shouldn't really surprise
anyone.

I think it actually should surprise. Since memory is already allocated
(things are not that simple, I know, but it does not invalidate the
conclusion) assignment should be faster, or, at least as fast as
construction. If it is not, that is, in my view, a flaw in the
compiler. I would like to see the code you benchmarked.

I don't know what you mean by "safer", but there's absolutely no
reason to believe anything of the kind. Depending on the
implementation of std::string, what get_string_somehow does, and
possibly any number of other factors, it's impossible to say
without measuring whether the above will be faster or slower
than any other particular solution.

Sorry, I was not completely clear, but I meant that it is more likely
that the mentioned code would run faster than one with the constructor
inside the loop, i.e., just above the get_string_somehow().

What I do not understand are the reasons which could make assignment
slower than construction. Again, I think it cannot be justified unless
as a compiler issue.

James Kanze · Nov 3, 2007

[...]

I think it actually should surprise. Since memory is already
allocated (things are not that simple, I know, but it does not
invalidate the conclusion) assignment should be faster, or, at
least as fast as construction.

That's really a very naïve point of view.

As I said, I measured; the actual results don't support your
conclusion, at least in the specific case I measured. (I might
also add that I sort of suspected this, since I am familiar with
the g++ implementation of std::string.)

If it is not, that is, in my view, a flaw in the compiler. I
would like to see the code you benchmarked.

It was quite some time ago, but if I recall correctly, it was
something like:

std::string data ;
for ( ... ) {
data = someFunction() ;
}

vs.

for ( ... ) {
std::string data( someFunction() ) ;
}

[...]

What I do not understand are the reasons which could make
assignment slower than construction.

std::string is a complicated class, with not a few constraints.
Implementations try to optimize frequent operations, like the
above (with the definition in the loop). In the case of g++,
for example, construction of a copy of a string does NOT
generally allocate memory, nor copy any text---assignment to the
string will almost always copy text. Other implementations
don't allocate memory for smaller strings, but just copy the
data. And so on.

Again, I think it cannot be justified unless as a compiler
issue.

What can I say? You're wrong. (And there's nothing to
"justify". I would consider it a sign of a good implementation
that the cleaner, more frequent use runs faster.)
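
To see which operations each form of the loop asks for, independent of any particular std::string implementation, one can count calls on an instrumented type. The sketch below is an illustration only: `Tracked`, `make_tracked`, `reset_counters` and the two loop functions are hypothetical names, and counting calls says nothing about how expensive each call is, which is exactly what has to be measured.

```cpp
#include <cstddef>

// Hypothetical instrumented type: counts which special members run.
struct Tracked {
    static std::size_t constructions;
    static std::size_t assignments;
    static std::size_t destructions;

    Tracked() { ++constructions; }
    Tracked(const Tracked&) { ++constructions; }
    Tracked& operator=(const Tracked&) { ++assignments; return *this; }
    ~Tracked() { ++destructions; }
};

std::size_t Tracked::constructions = 0;
std::size_t Tracked::assignments = 0;
std::size_t Tracked::destructions = 0;

// Stand-in for someFunction(): returns a fresh object by value.
Tracked make_tracked() { return Tracked(); }

void reset_counters() {
    Tracked::constructions = 0;
    Tracked::assignments = 0;
    Tracked::destructions = 0;
}

// Variable defined outside the loop: every iteration builds a temporary,
// assigns it over the old value, then destroys the temporary.
void assign_in_loop(std::size_t n) {
    Tracked data;
    for (std::size_t i = 0; i < n; ++i) {
        data = make_tracked();
    }
}

// Variable defined inside the loop: every iteration constructs and destroys
// one object; with copy elision no separate temporary is needed.
void construct_in_loop(std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        Tracked data(make_tracked());
        (void)data;  // silence unused-variable warnings
    }
}
```

With the definition outside the loop there is one assignment per iteration on top of the constructions; with the definition inside there are no assignments at all. How that translates into time is implementation-specific.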

Elias Salomão Helou Neto · Nov 4, 2007

[...]

It was quite some time ago, but if I recall correctly, it was
something like:

std::string data ;
for ( ... ) {
data = someFunction() ;
}

vs.

for ( ... ) {
std::string data( someFunction() ) ;
}

[...]

What I do not understand are the reasons which could make
assignment slower than construction.

std::string is a complicated class, with not a few constraints.
Implementations try to optimize frequent operations, like the
above (with the definition in the loop). In the case of g++,
for example, construction of a copy of a string does NOT
generally allocate memory, nor copy any text---assignment to the
string will almost always copy text. Other implementations
don't allocate memory for smaller strings, but just copy the
data. And so on.

Well, IMHO whenever copy construction doesn't need copying, neither
should assignment.

But the real point here is that you were using something like str =
someFunction() instead of someFunction( str ). Do you see? In such a
case, it is much easier to optimize away the creation of the temporary
in std::string str( someFunction() ) than in str = someFunction(). This
is not the compiler's fault, nor does it fall under my example. In the
second case, a temporary needs to be created for the assignment to
be possible, while in the former it does not.

The bottleneck here most certainly was the creation of the temporary,
not the assignment operation versus the construction/destruction cycle.
It was not the compiler, after all!

As I see it, even gurus like you (and I always appreciate our
enlightening discussions) eventually get lost in C++ subtleties. I
am perhaps naive, but not as much as you may be thinking.

Also, in normal cases, optimizations may not be as simple as they are
with std::string, which is under the compiler's control (in fact, the
implementation of std::string should be near optimal without any
compiler optimization anyway). Within user-defined classes, copy
construction may execute startup code that will be cleaned up upon
destruction, which would not happen with assignment operations, so
the construction/destruction cycle within a loop is best
avoided in most cases when it is not needed, unless assignment is
especially poorly implemented. In such cases, however, it would be
better to reimplement the assignment properly. That is the reason my
advice is to place constructors outside the loop, though exceptions to
the rule may exist (yours doesn't seem to be one) AND to refrain from
returning large objects by value, which would avoid the creation of
temporaries.

Elias Salomão Helou Neto

Bo Persson · Nov 4, 2007

Elias Salomão Helou Neto wrote:
::
:: [...]
::
:: It was quite some time ago, but if I recall correctly, it was
:: something like:
::
:: std::string data ;
:: for ( ... ) {
:: data = someFunction() ;
:: }
::
:: vs.
::
:: for ( ... ) {
:: std::string data( someFunction() ) ;
:: }
::
:: [...]
:
::: What I do not understand are the reasons which could make
::: assignment slower than construction.
::
:: std::string is a complicated class, with not a few constraints.
:: Implementations try to optimize frequent operations, like the
:: above (with the definition in the loop). In the case of g++,
:: for example, construction of a copy of a string does NOT
:: generally allocate memory, nor copy any text---assignment to the
:: string will almost always copy text. Other implementations
:: don't allocate memory for smaller strings, but just copy the
:: data. And so on.
:
: Well, IMHO whenever copy construction doesn't need copying, neither
: should assignment.

But assignment has the additional problem of dealing with the old
value.

:
: But the real point here is that you were using something like str =
: someFunction() instead of someFunction( str ). Do you see?

No, I don't!

We have a perfectly good and idiomatic piece of code in section 2. It
is simple, easy to read, and actually runs faster. What more could we
ask??

In section 1, the programmer tries some kind of premature optimization
which just complicates the code. That he also gets worse run-time
performance, is well deserved!

: In such a
: case, it is much easier to optimize away the creation of the
: temporary in std::string str( someFunction ) than in str =
: someFunction().

Now you are just making the code even more complicated, attempting to
match the performance of the smaller and the simpler code. In
addition, you also force the user of the function to create the target
value before calling the function. This forces me to write

std::string data;
someFunction(data);

whether I have a loop to "optimize" or not!

: Also, in normal cases, optimizations may not be as simple as they
: are with std::string, which is under the compiler control (in fact,
: the implementation of std::string should be near optimal without any
: compiler optimization anyway). Within user-defined classes, copy
: construction may execute startup code that will be cleaned up upon
: destruction, which would not happen with assignement operations, so
: the construction/destruction cycle within a loop should best be
: avoided in most cases when it is not needed, unless assignement is
: specially poorly implemented. In such cases, however, it would be
: better to reimplement the assignement properly. That is the reason
: my advice is to place constructors outside the loop, though
: exceptions to the rule may exist (yours doesn't seem to be one) AND
: to refrain from returning large objects by value, wich would avoid
: the creation of temporaries.

The fact is that std::string has no overhead in its copy constructor,
all it does is store a copy of the other string. On many compilers, it
also has a definite advantage in combination with RVO/NRVO
optimization for value returning functions.

The assignment operator is much more complicated, as it also has to
decide what to do with the existing value. It doesn't help if you move
the assignment to inside the function, further complicating it by
passing a parameter.

Bo Persson

peter koch · Nov 4, 2007

[...]

It was quite some time ago, but if I recall correctly, it was
something like:

std::string data ;
for ( ... ) {
data = someFunction() ;
}

vs.

for ( ... ) {
std::string data( someFunction() ) ;
}

[...]
What I do not understand are the reasons which could make
assignment slower than construction.

std::string is a complicated class, with not a few constraints.
Implementations try to optimize frequent operations, like the
above (with the definition in the loop). In the case of g++,
for example, construction of a copy of a string does NOT
generally allocate memory, nor copy any text---assignment to the
string will almost always copy text. Other implementations
don't allocate memory for smaller strings, but just copy the
data. And so on.

Well, IMHO whenever copy construction doesn't need copying, neither
should assignment.

But the real point here is that you were using something like str =
someFunction() instead of someFunction( str ). Do you see? In such a
case, it is much easier to optimize away the creation of the temporary
in std::string str( someFunction() ) than in str = someFunction(). This
is not the compiler's fault, nor does it fall under my example. In the
second case, a temporary needs to be created for the assignment to
be possible, while in the former it does not.

I believe James gave a realistic example, where the object gets
initialised once for every entrance in the loop. After all, since the
object logically belongs inside the loop (or we would not have this
discussion), it should be initialised for every entry.

The bottleneck here most certainly was the creation of the temporary,
not the assignment operation versus the construction/destruction cycle. It
was not the compiler, after all!

Surely, RVO helps James here - no doubt about that. But I know of no
modern compiler that does not implement RVO and you would be a fool
not to exploit it.

As I see, even gurus like you (and I always appreciate our
enlightening discussions) eventually get lost with c++ subtleties. I
am perhaps naive, but not as much as you may be thinking.

Also, in normal cases, optimizations may not be as simple as they are
with std::string, which is under the compiler control (in fact, the
implementation of std::string should be near optimal without any
compiler optimization anyway). Within user-defined classes, copy
construction may execute startup code that will be cleaned up upon
destruction, which would not happen with assignement operations, so
the construction/destruction cycle within a loop should best be
avoided in most cases when it is not needed, unless assignement is
specially poorly implemented.

There is no reason copy construction should add any overhead compared
to assignment. Actually it is the other way around.

If you have the choice between
Class c; somefunction(c); and
Class c(somefunction());

The second choice will be faster than the first one. More important,
the second choice is shorter and clearer in intent (esp. when the
definition and the initialisation is separated in space). The second
choice is also much easier to write in case you consider any exception
guarantees. As a matter of fact, you probably end up writing
somefunction as

void somefunction(Class &c)
{
    Class tmp;
    // initialise tmp
    std::swap(c, tmp);
}

In such cases, however, it would be
better to reimplement the assignment properly. That is the reason my
advice is to place constructors outside the loop, though exceptions to
the rule may exist (yours doesn't seem to be one) AND to refrain from
returning large objects by value, which would avoid the creation of
temporaries.

The best practice is the opposite of yours: Prefer to return classes
by value, and declare your classes locally. This gives clearer code,
more stable code and - as a side effect - optimises the more common
case, where you actually want to construct and initialise the object
in one go.
There might be exceptions to this rule, of course, but do not exchange
clarity for perceived efficiency unless your profiler shows so.

Elias Salomão Helou neto.

/Peter
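
A short sketch of how the two interfaces discussed in this post can coexist, assuming an overload is acceptable. The function name `build_greeting` and its contents are hypothetical. The value-returning version is the primary one; the out-parameter overload is written on top of it with the build-then-swap pattern shown above, so the caller's string is left untouched if building the new value throws.

```cpp
#include <string>

// Hypothetical value-returning function: construct and initialise in one go.
std::string build_greeting(const std::string& name) {
    std::string result = "hello, ";
    result += name;
    return result;  // eligible for NRVO; otherwise the result is moved
}

// Out-parameter overload for callers who insist on one. The new value is
// built separately and swapped in, so `out` keeps its old contents if
// build_greeting(name) throws.
void build_greeting(const std::string& name, std::string& out) {
    std::string tmp = build_greeting(name);
    out.swap(tmp);
}
```

A caller can then write either `std::string s(build_greeting("world"));` or `build_greeting("world", s);`. Whether the second form is ever faster is, again, something to measure.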

Elias Salomão Helou Neto · Nov 4, 2007

But assignment has the additional problem of dealing with the old
value.

As I see it, copy on write with reference counting for the data is the
only reason for "the additional problem of dealing with the old value"
you mention (even though your wording is awful).

If so, why would it not hold as well for copying? Think about it!
Copying may be done on write, all right, but why could assignment not
do exactly the same, i.e. copy on write with reference counting for
the data? If it is done so, when a temporary argument is passed to
either the copy constructor or the assignment operator, actual
copying of the data would never take place.

: But the real point here is that you were using something like str =
: someFunction() instead of someFunction( str ). Do you see?

No, I don't!

We have a perfectly good and idiomatic piece of code in section 2. It
is simple, easy to read, and actually runs faster. What more could we
ask??

I ask that it be compared to what it should be compared to, not to an
oversimplified version. I mean that str( someFunction() ) is not to be
benchmarked against str = someFunction() when optimization is turned
on because, in the former, temporaries are rather easy to
eliminate, while in the latter they are not!

In section 1, the programmer tries some kind of premature optimization
which just complicates the code. That he also gets worse run-time
performance, is well deserved!

Is it well deserved to receive a performance penalty solely for
deviating from an idiom? Now, an affirmative answer to this would be
dumb!

You should realize, and nobody will be able to seriously disagree with
me on this, that there is no reason for (now, this is the correct
comparison) my RVO version to run slower than construction.

Remember that if we want to return an object, it must be created
within the function (and copied to be returned), but both extra
constructions can be optimized away. Actually, your glorified idiom is
nothing but a way to support such optimization to be done by the
compiler!

: In such a
: case, it is much easier to optimize away the creation of the
: temporary in std::string str( someFunction ) than in str =
: someFunction().

Now you are just making the code even more complicated, attempting to
match the performance of the smaller and the simpler code. In
addition, you also force the user of the function to create the target
value before calling the function. This forces me to write

std::string data;
someFunction(data);

whether I have a loop to "optimize" or not!

Overloading the someFunction would allow both options to be used.

: Also, in normal cases, optimizations may not be as simple as they
: are with std::string, which is under the compiler control (in fact,
: the implementation of std::string should be near optimal without any
: compiler optimization anyway). Within user-defined classes, copy
: construction may execute startup code that will be cleaned up upon
: destruction, which would not happen with assignement operations, so
: the construction/destruction cycle within a loop should best be
: avoided in most cases when it is not needed, unless assignement is
: specially poorly implemented. In such cases, however, it would be
: better to reimplement the assignement properly. That is the reason
: my advice is to place constructors outside the loop, though
: exceptions to the rule may exist (yours doesn't seem to be one) AND
: to refrain from returning large objects by value, wich would avoid
: the creation of temporaries.

The fact is that std::string has no overhead in its copy constructor,
all it does is store a copy of the other string.

I cannot figure out what you are meaning. Should not every copy-
constructed object store a copy of the copied object? I guess you
wanted to mean that they share data (copy on write).

On many compilers, it
also has a definite advantage in combination with RVO/NRVO
optimization for value returning functions.

Really? Why?

The assignment operator is much more complicated, as it also has to
decide what to do with the existing value.

Again! How is that possible that operator= (or any other function)
would need to know what to do with its argument after it has
completed execution (if this is what you mean)? The reason I see for
something vaguely like this is, again, copy on write, but I will not
discuss this again.

It doesn't help if you move
the assignment to inside the function, further complicating it by
passing a parameter.

That would be really smart, huh? Did you come up with that by
yourself? Why would anybody do it?

Elias Salomão Helou Neto

James Kanze · Nov 4, 2007

[...]
It was quite some time ago, but if I recall correctly, it
was something like:
std::string data ;
for ( ... ) {
data = someFunction() ;
}
vs.
for ( ... ) {
std::string data( someFunction() ) ;
}
[...]

What I do not understand are the reasons which could make
assignment slower than construction.

std::string is a complicated class, with not a few constraints.
Implementations try to optimize frequent operations, like the
above (with the definition in the loop). In the case of g++,
for example, construction of a copy of a string does NOT
generally allocate memory, nor copy any text---assignment to the
string will almost always copy text. Other implementations
don't allocate memory for smaller strings, but just copy the
data. And so on.

Well, IMHO whenever copy construction doesn't need copying,
neither should assignment.

Well, as you say, that's your opinion (humble or not). You're
free to believe that the earth is flat as well. An objective
analysis of the facts doesn't give you any reason to believe it,
but opinions are opinions. The issues are simply complex enough
that you can't make any assumptions.

But the real point here is that you were using something like
str = someFunction() instead of someFunction( str ). Do you
see?

I don't see where it makes any real difference in my argument.
My argument is simple: for any given case, you can't know until
you've measured it. Guessing is totally unreliable.

As it happens, I know exactly how g++ implements basic_string,
and I know that it wouldn't be too difficult to create a
benchmark in which using someFunction( data ) also runs slower.
Similarly, I could easily create cases where the reverse was
true. But that's neither here nor there. My point remains:
what you (or someone else) naïvely expects to be faster may not
be. Until you've measured the specific case which interests
you, you don't know which solution will be faster.

In such a case, it is much easier to optimize away the
creation of the temporary in std::string str( someFunction() )
than in str = someFunction(). This is not the compiler's fault,
nor does it fall under my example. In the second case, a temporary
needs to be created for the assignment to be possible, while
in the former it does not.

The bottleneck here most certainly was the creation of the
temporary, not the assignment operation versus the
construction/destruction cycle. It was not the compiler, after
all!

That's not what the profiler said. The compiler implements
NRVO, and the code in the function was designed to take
advantage of it.

As I see, even gurus like you (and I always appreciate our
enlightening discussions) eventually get lost with c++
subtleties. I am perhaps naive, but not as much as you may be
thinking.

I'm afraid in this case you're completely wrong.

Also, in normal cases, optimizations may not be as simple as they are
with std::string, which is under the compiler control

It's true in theory. In practice, all of the compilers I know
treat std::basic_string exactly like they do a user defined
class.

(in fact, the implementation of std::string should be near
optimal without any compiler optimization anyway). Within
user-defined classes, copy construction may execute startup
code that will be cleaned up upon destruction, which would not
happen with assignment operations,

In the most frequent idiom for complex classes, the user defined
assignment operator starts by constructing a temporary copy
using the copy constructor. So assignment is very, very likely
to be slower than copy construction. In general, in fact,
assignment is likely to be slower than copy construction.
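
The idiom referred to here is usually called copy-and-swap. A minimal sketch follows, using a hypothetical `Samples` class with a counter added purely for illustration. It shows that one assignment performs a full copy construction plus a swap and a destruction, so under this idiom assignment cannot cost less than copy construction.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class whose assignment uses the copy-and-swap idiom.
class Samples {
public:
    static std::size_t copy_constructions;  // instrumentation only

    Samples() = default;
    explicit Samples(std::vector<int> values) : values_(std::move(values)) {}
    Samples(const Samples& other) : values_(other.values_) {
        ++copy_constructions;
    }

    Samples& operator=(const Samples& other) {
        Samples tmp(other);         // the copy constructor does the real work
        values_.swap(tmp.values_);  // exchange new and old contents
        return *this;               // tmp's destructor releases the old value
    }

    std::size_t size() const { return values_.size(); }

private:
    std::vector<int> values_;
};

std::size_t Samples::copy_constructions = 0;
```

After `b = a;` the counter has gone up by one, exactly as it does for `Samples c(a);`, and the assignment has additionally paid for the swap and for destroying the old contents.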

so the construction/destruction cycle within a loop should
best be avoided in most cases when it is not needed, unless
assignment is especially poorly implemented.

And that is simply wrong. It's a classical example of naïve
premature optimization: replacing clean code with something less
clean on the grounds that it is faster, when you've not
measured, and when in fact it isn't necessarily faster.

In such cases, however, it would be better to reimplement the
assignment properly. That is the reason my advice is to place
constructors outside the loop, though exceptions to the rule
may exist (yours doesn't seem to be one) AND to refrain from
returning large objects by value, which would avoid the
creation of temporaries.

In sum, make your code as unreadable as possible, so that it
can't be optimized later, if the need really does exist, for what
are probably non-issues, and for what (in this case at least)
may actually be a pessimization.

James Kanze · Nov 4, 2007

As I see, copy on write with reference counting for the data
is the only reason for "the additional problem of dealing with
the old value" you mention (even though your wording is
awful).

(Strange, I didn't have any problem with his wording. But I'm
not sure any of us are native speakers here, although I did grow
up in America.)

By definition, the difference between a constructor and an
assignment operator is that the assignment operator has an old
value, which must be taken into consideration. Sometimes,
taking it into consideration can actually improve performance;
e.g. if you are using deep copy, and the source string fits in
the destination's existing buffer. Other times, it can slow
things down: anytime the buffer is not big enough, for example,
and so must be freed and re-allocated, or if you're using copy
on write.
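
A rough sketch of both situations with a deep-copying string follows. DeepString and its allocations counter are hypothetical names made up for this illustration; real std::string implementations differ in many details.

```cpp
#include <cassert>  // assert
#include <cstddef>  // std::size_t
#include <cstring>  // std::strlen, std::memcpy

// Hypothetical deep-copying string, for illustration only.
class DeepString {
public:
    static int allocations;  // number of buffers allocated so far

    explicit DeepString(const char* s)
        : len_(std::strlen(s)), cap_(len_), data_(allocate(cap_)) {
        assert(s != nullptr);
        std::memcpy(data_, s, len_ + 1);
    }

    // Construction has no old value: it always allocates exactly once.
    DeepString(const DeepString& o)
        : len_(o.len_), cap_(o.len_), data_(allocate(cap_)) {
        std::memcpy(data_, o.data_, len_ + 1);
    }

    DeepString& operator=(const DeepString& o) {
        if (this == &o) return *this;
        if (o.len_ > cap_) {
            // Old buffer too small: allocate a new one, free the old one.
            char* fresh = allocate(o.len_);
            delete[] data_;
            data_ = fresh;
            cap_ = o.len_;
        }
        // Otherwise the old buffer is simply overwritten, no allocation.
        std::memcpy(data_, o.data_, o.len_ + 1);
        len_ = o.len_;
        return *this;
    }

    ~DeepString() { delete[] data_; }

    const char* c_str() const { return data_; }

private:
    static char* allocate(std::size_t cap) {
        ++allocations;
        return new char[cap + 1];  // +1 for the terminating '\0'
    }

    std::size_t len_;
    std::size_t cap_;
    char* data_;
};

int DeepString::allocations = 0;
```

Assigning a short value to a long DeepString touches the allocator zero times, which is cheaper than constructing afresh. Assigning a long value to a short one pays for an allocation plus a deallocation, which is where assignment loses.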

Depending on what you're doing with std::string, copy on write
can represent a significant performance gain, or a small (or
maybe not so small, in the case of multi-threaded code) loss.
The actual measurements were made some time ago, with g++
2.95.2, which didn't support threading at all, and had a small,
simple and very, very robust implementation of basic_string. My
own experiments suggest that they probably made the right trade
off using copy on write in this case, at least for my code.

If so, why would it not hold as well for copying? Think about
it! Copying may be done on write, all right, but why could not
assignment do exactly the same, i.e. copy on write with
reference counting for the data?

That's what g++ does. It still means that you have to do
something with the old data.
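
As an illustration only, here is a toy copy-on-write string. It uses std::shared_ptr for the reference count, which is an assumption made to keep the sketch short; it is not how the g++ basic_string was implemented. CowString and owners() are hypothetical names.

```cpp
#include <cassert>  // assert, handy for checking the owner counts
#include <memory>   // std::shared_ptr, std::make_shared
#include <string>

// Toy copy-on-write string, for illustration only.
class CowString {
public:
    explicit CowString(const std::string& s)
        : rep_(std::make_shared<std::string>(s)) {}

    // Copy construction: there is no old value, just share the data.
    CowString(const CowString& o) : rep_(o.rep_) {}

    // Assignment: share the new data AND give up the reference to the
    // old data, freeing it if this object was the last owner.
    CowString& operator=(const CowString& o) {
        rep_ = o.rep_;
        return *this;
    }

    // Any write detaches from the shared data first.
    void append(const std::string& s) {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<std::string>(*rep_);
        rep_->append(s);
    }

    const std::string& str() const { return *rep_; }
    long owners() const { return rep_.use_count(); }

private:
    std::shared_ptr<std::string> rep_;
};
```

Even here, where neither operation copies any characters, the assignment operator has one extra job compared with the copy constructor: decrementing the old count and possibly freeing the old data.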

If it is done so, when a temporary argument is passed for
either the copy constructor or the assignment operator,
actual copying of the data would never take place.

I ask to compare it to what it should be compared to, not to
an oversimplified version. I mean that str( someFunction() )
is not to be benchmarked against str = someFunction() when
optimization is turned on because, in the former, temporaries
are rather easy to eliminate, while in the latter they are not!

And I ask you how much experience you've actually had compiling
and optimizing C++, in order to make such a blanket statement.
The standard contains a couple of special rules, just to permit
optimization of temporaries when constructing. They don't
necessarily apply when copying.
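
A small experiment along these lines can show the difference. Tracer and makeTracer are hypothetical names invented for this sketch. How many copy constructions actually happen in the first case depends on the compiler and the language version (many compilers elide them all, and C++17 requires that for this pattern), so the only hard claim in the comments is about the assignment.

```cpp
#include <cassert>  // assert, handy for checking the counters

// Hypothetical type that counts how it is copied.
struct Tracer {
    static int copy_constructions;
    static int assignments;

    Tracer() {}
    Tracer(const Tracer&) { ++copy_constructions; }
    Tracer& operator=(const Tracer&) {
        ++assignments;
        return *this;
    }
};

int Tracer::copy_constructions = 0;
int Tracer::assignments = 0;

// Returns a temporary by value.
Tracer makeTracer() { return Tracer(); }

// Case 1: Tracer a(makeTracer());
//   The compiler may construct the result directly in a, so the
//   copy constructor need not run at all.
//
// Case 2: Tracer b; b = makeTracer();
//   b already exists, so operator= must run exactly once, whatever
//   happens to the temporary.
```

Printing the two counters after each case, at different optimization levels, is a quick way to see what a given compiler really does before arguing about it.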

Is it well deserved to receive a performance penalty solely
for deviating from an idiom? Now, an affirmative answer to
this would be dumb!

It's expecting automatically that one idiom will be faster than
the other, without actually having measured the specific case in
question, which is dumb. It's choosing a particular idiom on
the grounds that it will be faster without having measured.

You should realize, and nobody will be able to seriously
disagree with me on this, that there is no reason for (now,
this is the correct comparison) my RVO version to run slower
than construction.

The problem is that you still don't understand. To start with,
your version doesn't use RVO, because there is no return value.
And anyone with any real experience will automatically disagree
when you claim differences without actually having measured, and
will disagree that the measurements you made for one case apply
to the next.

Bo Persson · Nov 4, 2007

Elias Salomão Helou Neto wrote:
:: But assignment has the additional problem of dealing with the old
:: value.
:
: As I see, copy on write with reference counting for the data is the
: only reason for "the additional problem of dealing with the old
: value" you mention (even though your wording is awful).

This has nothing to do with reference counting. The problem with
assignment is that the string assigned to already has a value. What
are we going to do with that? How long does that take?

(and I'm not trying to win a literature prize)

: If so, why would it not hold as well for copying? Think about it!
: Copying may be done on write, all right, but why could not
: assignment do exactly the same, i.e. copy on write with reference
: counting for the data? If it is done so, when a temporary argument
: is passed for either the copy constructor or the assignment
: operator, actual copying of the data would never take place.
:
::: But the real point here is that you were using something like str
::: = someFunction() instead of someFunction( str ). Do you see?
::
:: No, I don't!
::
:: We have a perfectly good and idiomatic piece of code in section 2.
:: It is simple, easy to read, and actually runs faster. What more
:: could we ask??
:
: I ask to compare it to what it should be compared to, not to an
: oversimplified version. I mean that str( someFunction() ) is not to
: be benchmarked against str = someFunction() when optimization is
: turned on because, in the former, temporaries are rather easy to
: eliminate, while in the latter they are not!

I don't get this one.

:
:: In section 1, the programmer tries some kind of premature
:: optimization which just complicates the code. That he also gets
:: worse run-time performance, is well deserved!
:
: Is it well deserved to receive a performance penalty solely for
: deviating from an idiom? Now, an affirmative answer to this would be
: dumb!

Ok.

Trying to outsmart the compiler, and getting slower code is well
deserved IMO.

:
::: Also, in normal cases, optimizations may not be as simple as they
::: are with std::string, which is under the compiler control (in
::: fact, the implementation of std::string should be near optimal
::: without any compiler optimization anyway). Within user-defined
::: classes, copy construction may execute startup code that will be
::: cleaned up upon destruction, which would not happen with
::: assignment operations, so the construction/destruction cycle
::: within a loop should best be avoided in most cases when it is not
::: needed, unless assignment is especially poorly implemented. In
::: such cases, however, it would be better to reimplement the
::: assignment properly. That is the reason my advice is to place
::: constructors outside the loop, though exceptions to the rule may
::: exist (yours doesn't seem to be one) AND to refrain from
::: returning large objects by value, which would avoid the creation
::: of temporaries.
::
:: The fact is that std::string has no overhead in its copy
:: constructor, all it does is store a copy of the other string.
:
: I cannot figure out what you mean. Should not every copy-
: constructed object store a copy of the copied object? I guess you
: meant to say that they share data (copy on write).

No, I mean that constructing a string object from scratch can be
faster than destroying the old value, and then copying the new one.

Bo Persson

Elias Salomão Helou Neto · Nov 4, 2007

This has nothing to do with reference counting. The problem with
assignment is that the string assigned to already has a value. What
are we going to do with that? How long does that take?

Now I see what you meant!

Release the memory, which takes no longer than destroying the object
on every iteration of the loop, right?

(and I'm not trying to win a literature prize)

But you are trying to be understood.

: If so, why would it not hold as well for copying? Think about it!
: Copying may be done on write, all right, but why could not
: assignment do exactly the same, i.e. copy on write with reference
: counting for the data? If it is done so, when a temporary argument
: is passed for either the copy constructor or the assignment
: operator, actual copying of the data would never take place.
:
::: But the real point here is that you were using something like str
::: = someFunction() instead of someFunction( str ). Do you see?
::
:: No, I don't!
::
:: We have a perfectly good and idiomatic piece of code in section 2.
:: It is simple, easy to read, and actually runs faster. What more
:: could we ask??
:
: I ask to compare it to what it should be compared to, not to an
: oversimplified version. I mean that str( someFunction() ) is not to
: be benchmarked against str = someFunction() when optimization is
: turned on because, in the former, temporaries are rather easy to
: eliminate, while in the latter they are not!

I don't get this one.

What can I say, then? Try to squeeze your brain.

:
:: In section 1, the programmer tries some kind of premature
:: optimization which just complicates the code. That he also gets
:: worse run-time performance, is well deserved!
:
: Is it well deserved to receive a performance penalty solely for
: deviating from an idiom? Now, an affirmative answer to this would be
: dumb!

Ok.

Trying to outsmart the compiler, and getting slower code is well
deserved IMO.

When did I try to outsmart the compiler?

::: Also, in normal cases, optimizations may not be as simple as they
::: are with std::string, which is under the compiler control (in
::: fact, the implementation of std::string should be near optimal
::: without any compiler optimization anyway). Within user-defined
::: classes, copy construction may execute startup code that will be
::: cleaned up upon destruction, which would not happen with
::: assignment operations, so the construction/destruction cycle
::: within a loop should best be avoided in most cases when it is not
::: needed, unless assignment is especially poorly implemented. In
::: such cases, however, it would be better to reimplement the
::: assignment properly. That is the reason my advice is to place
::: constructors outside the loop, though exceptions to the rule may
::: exist (yours doesn't seem to be one) AND to refrain from
::: returning large objects by value, which would avoid the creation
::: of temporaries.
::
:: The fact is that std::string has no overhead in its copy
:: constructor, all it does is store a copy of the other string.
:
: I cannot figure out what you mean. Should not every copy-
: constructed object store a copy of the copied object? I guess you
: meant to say that they share data (copy on write).

No, I mean that constructing a string object from scratch can be
faster than destroying the old value, and then copying the new one.

Well, again you forget that your idiom has an implied destruction of
the object at every loop iteration, resulting in the need to deal with
exactly the same problem! How could that be different?

I will give you an example. Take the following two simple programs:

//Program 1:
#include <string>

std::string myFunction()
{
    std::string str;
    for ( unsigned i( 0 ); i < 1000; ++i )
        str.append( "supercalifragilisomethingidonotremebmberandd"
                    "donotwantotsearchintheinternet" );

    return( str );
}

int main()
{
    for( unsigned i( 0 ); i < 100000; ++i )
        std::string str( myFunction() );

    return( 0 );
}

//Program 2:
#include <string>

void myFunction( std::string& str )
{
    str.clear();
    for ( unsigned i( 0 ); i < 1000; ++i )
        str.append( "supercalifragilisomethingidonotremebmberandd"
                    "donotwantotsearchintheinternet" );
}

int main()
{
    std::string str;
    for( unsigned i( 0 ); i < 100000; ++i )
        myFunction( str );

    return( 0 );
}

According to you, Program 1 should run faster, right? But it is just
the opposite. Compiled with no optimization (the default) using
gcc 4.1.2 20070502, Program 1 takes around 21 seconds to run, against
around 15 seconds for Program 2. Now, let us turn optimization up to
its highest level and see what happens. With the -O3 flag, Program 1's
execution time falls to around 19 seconds, while Program 2 goes down
to an amazing 12 seconds! Can you explain that to me?

It's time for another listing:

//Program 3:
#include <string>

std::string myFunction()
{
    std::string str;
    for ( unsigned i( 0 ); i < 1000; ++i )
        str.append( "supercalifragilisomethingidonotremebmberandd"
                    "donotwantotsearchintheinternet" );

    return( str );
}

int main()
{
    std::string str;
    for( unsigned i( 0 ); i < 100000; ++i )
        str = myFunction();

    return( 0 );
}

Program 3 takes a little more than 17 seconds to run without
optimization; explain that to me, please. When optimized, it takes
around 15 seconds.

Even though it is a contrived example, it shows who knows what he is
talking about here. And I certainly did not have to hunt for some odd
example; it was the first one I tried.

From now on, refrain from making statements without the required
knowledge, all right? And always remember to try things out before
blindly believing what people tell you (start by trying the
listings here).
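
For anyone who wants to try the three variants without timing whole processes, a single harness along the following lines may help. It is only a sketch: runVariant, Result, kWord and the iteration counts are made up here, the payload is a shortened placeholder rather than the string used in the listings, and the numbers it reports will depend heavily on compiler, flags and library.

```cpp
#include <cassert>  // assert
#include <chrono>   // std::chrono::steady_clock
#include <cstddef>  // std::size_t
#include <string>

const char kWord[] = "supercalifragilistic";  // placeholder payload

// Same shape as myFunction() in Programs 1 and 3.
std::string buildByValue(unsigned appends) {
    std::string str;
    for (unsigned i = 0; i < appends; ++i) str.append(kWord);
    return str;
}

// Same shape as myFunction( std::string& ) in Program 2.
void buildInPlace(std::string& str, unsigned appends) {
    str.clear();
    for (unsigned i = 0; i < appends; ++i) str.append(kWord);
}

struct Result {
    double seconds;        // wall-clock time for the whole loop
    std::size_t checksum;  // total characters built, keeps the work live
};

// variant 1: construct inside the loop (Program 1)
// variant 2: reuse one string through a reference (Program 2)
// variant 3: assign the returned value to one string (Program 3)
Result runVariant(int variant, unsigned iterations, unsigned appends) {
    assert(variant >= 1 && variant <= 3);
    std::size_t checksum = 0;
    std::string reused;
    const auto start = std::chrono::steady_clock::now();
    for (unsigned i = 0; i < iterations; ++i) {
        if (variant == 1) {
            std::string str(buildByValue(appends));
            checksum += str.size();
        } else if (variant == 2) {
            buildInPlace(reused, appends);
            checksum += reused.size();
        } else {
            reused = buildByValue(appends);
            checksum += reused.size();
        }
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return Result{elapsed.count(), checksum};
}
```

Calling, say, runVariant(1, 100000, 1000) and printing the seconds field for each variant gives the three figures side by side. The branch on variant is loop-invariant, but purists may prefer three separate loops so that it cannot affect the timings.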

Elias Salomão Helou Neto
