Why pass doubles as const references?

Rui Maciel

Ian said:
Where did I say I did?

This whole discussion is about how, somehow, passing doubles as const
references doesn't carry a performance penalty when compared with passing them
by value, and how the only basis that has been given to substantiate that
claim was that a corner case where a specific optimization technique is used
to deal with specific functions being used a specific way is somehow
representative of how all functions are always dealt with by all C++
compilers.

Claiming that compilers are free to implement optimization techniques for
some corner cases, such as inlining specific types of functions under
specific circumstances, says nothing about the relative efficiency of
passing doubles by value compared to passing doubles as const
references. It only means that under a set of very specific circumstances,
which are far from being general, it's possible that a specific compiler,
when passed a specific set of options, is able to mitigate the inefficiency
issues associated with passing primitive types by const reference.

Hence the "weasel words" remark about your "most code" claim, followed
by my pointing out that inlining is far from always being employed, let alone a
given.

You can only count on it if you invest your time making sure that a
specific compiler will be able to compile a specific function within
your project to match your specific requirements, but this is way past
C++'s territory and firmly within platform and
implementation-specifics.

Most C++ relies on the inlining of trivial functions, it's at the heart
of the language. Would you expect every call to std::vector's
operator[] to involve an actual call?

Trivial functions are a small subset of the whole domain of functions. A
corner case, if you will.

I don't think you can apply the term "corner case" to a large part of
the standard library!

I didn't. I'm talking about trivial functions, not specific libraries.

I'm sure no one does.

Exactly. Therefore, any claim that somehow passing doubles as const
references "does not affect performance at all" when compared to passing
them by value is simply nonsense. Which is what I have been saying.


Rui Maciel
 
Rui Maciel

Öö Tiib said:
I assume nothing. Test code demonstrating that oh so very special cornered
case was posted by you. ;)

That's obviously not true. You've assumed that "performance of parameter
passing does not affect overall performance by any percentage", which is
obviously false.

In addition, the case I've presented is far from being a corner case. Any
C++ programmer, even newbies, knows what a reference is and what
dereferencing involves. In C++, the fact that accessing an object which was
passed by reference involves dereferencing it is far from a corner case: it's
the norm.

I claimed that it does not likely matter.

Not true. You claimed that "it does not affect performance at all", which
was shown to be false.

What test demonstrates that it
does?

The test demonstrates that, contrary to what you've claimed, it does affect
performance, and it affects it significantly. The only cases where it's
possible to tone down this performance impact are ones which involve very
specific functions, when used in a very specific way, compiled with a
specific compiler by passing specific compiler flags. You know, corner
cases.

Stack operations (passing parameters) and indirection to value in
cache are so fast that those did not matter much even with older hardware
and compilers.

You know, no matter how fast the hardware is, it still needs to execute the
operations it is told to execute. This means that if we compare two
routines, and one requires at least an extra instruction, the hardware will
still spend time executing that instruction.

In addition, dereferencing is all about memory access, which is where the
performance bottleneck is in today's hardware.

Modern stuff does them in parallel
<snip/>

Irrelevant.



Rui Maciel
 
Ian Collins

Rui said:
This whole discussion is about how, somehow, passing doubles as const
references doesn't carry a performance penalty when compared with passing them
by value, and how the only basis that has been given to substantiate that
claim was that a corner case where a specific optimization technique is used
to deal with specific functions being used a specific way is somehow
representative of how all functions are always dealt with by all C++
compilers.

You have a very vivid imagination. If you care to wander back up the
thread and hang a left at my first reply to Scott Lurndal, you will see
I wrote: "The calls are still made, the function bodies are optimised.".

This was in relation to:

g++ x.cc -m64 -O1 && ./a.out
time pass by value: 2410000
time pass by reference: 2410000
You can only count on it if you invest your time making sure that a
specific compiler will be able to compile a specific function within
your project to match your specific requirements, but this is way past
C++'s territory and firmly within platform and
implementation-specifics.

Most C++ relies on the inlining of trivial functions, it's at the heart
of the language. Would you expect every call to std::vector's
operator[] to involve an actual call?

Trivial functions are a small subset of the whole domain of functions. A
corner case, if you will.

I don't think you can apply the term "corner case" to a large part of
the standard library!

I didn't.

You did, two posts up.
I'm talking about trivial functions, not specific libraries.

A large part (probably most of the templates) of the standard library is
made up from trivial functions.
 
Öö Tiib

That's obviously not true. You've assumed that "performance of parameter
passing does not affect overall performance by any percentage", which is
obviously false.

Yes, take excerpts from my words and pose them as proof that I expected
the special case that you posted. Nonsense. Those words were certainly not about
the code you posted.
In addition, the case I've presented is far from being a corner case. Any
C++ programmer, even newbies, knows what a reference is and what
dereferencing involves. In C++, the fact that accessing an object which was
passed by reference involves dereferencing it is far from a corner case: it's
the norm.

Modern compilers and processors optimize a lot. Noobs know what dereference
means, not what it costs. Timings done with optimizations turned off and
debugging turned on are pointless, since in the worst of such tests a C++
program may easily lose to JavaScript.


posts timings done with debug build.
Modern compilers and processors do things in parallel and optimize.
Not true. You claimed that "it does not affect performance at all", which
was shown to be false.

"Most likely it does not affect performance at all either way." Where was it
shown? By timings done with a debug build?
The test demonstrates that, contrary to what you've claimed, it does affect
performance, and it affects it significantly. The only cases where it's
possible to tone down this performance impact are ones which involve very
specific functions, when used in a very specific way, compiled with a
specific compiler by passing specific compiler flags. You know, corner
cases.

Paavo and Ian measured on different hardware and with different compilers.
Both got zero impact. These days you have to measure before claiming that
the constructs you use have any impact on performance. In real code I
have not measured a difference between the two ways that reaches 1%.
You know, no matter how fast the hardware is, it still needs to execute the
operations it is told to execute. This means that if we compare two
routines, and one requires at least an extra instruction, the hardware will
still spend time executing that instruction.

What extra instruction? Both have to push something onto the stack. Indirection
has been part of the instruction for a long time. Modern hardware manages to do it
without extra time.
In addition, dereferencing is all about memory access, which is where the
performance bottleneck is in today's hardware.

It is irrelevant who fetches the double in question from outside the
processor cache (caller or callee). So that "addition" only makes the effect
of the different calling schemes less significant.
 
army1987

Is there any good reason to declare a function parameter as `const
double &foo` rather than just `double foo`? I can see the point of that
when passing a very large object, but with a double I'd expect any
improvement in performance to be negligible. I've seen code using the
former, but I guess that's because it was translated from Fortran, where
all function arguments are passed by reference -- or am I missing
something?

(Turns out it was because the address of that function was in turn passed
to a function which expected a `double (*func)(const double&)` argument.)
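
A minimal sketch of that constraint, assuming a hypothetical callback-taking routine. Only the `double (*func)(const double&)` parameter type comes from the post; `integrate`, `square_ref`, `square_val` and `square_val_adapter` are names invented for illustration.

```cpp
#include <cassert>

// Hypothetical routine: its parameter type fixes the signature that any
// function passed to it must have.
double integrate(double (*func)(const double&), double a, double b, int steps)
{
    assert(steps > 0);
    const double h = (b - a) / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; ++i) {
        const double x = a + (i + 0.5) * h;  // midpoint of the i-th interval
        sum += func(x) * h;
    }
    return sum;
}

// Matches the expected pointer type, so it can be passed directly.
double square_ref(const double& x) { return x * x; }

// Takes its argument by value: a different function type, so
// integrate(square_val, ...) would not compile.
double square_val(double x) { return x * x; }

// A thin adapter lets the by-value function be used anyway.
double square_val_adapter(const double& x) { return square_val(x); }
```

With these definitions, `integrate(square_ref, 0.0, 1.0, 1000)` compiles as is, while the by-value function has to go through the adapter.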
 
Dombo

On 11-Feb-13 22:50, Paavo Helde wrote:
Why -O1 and not -O2?

For reference: Visual Studio 2010, standard x64 Release build (/Zi
/nologo /W3 /WX- /MP /O2 /Oi /D "WIN32" /D "NDEBUG" ...)

Both value() and reference() were in a separate source file so they were
not inlined (explicit call instructions present in disassembly of
main()).

Note that having functions in separate source files does not necessarily
mean they cannot be inlined these days. In the case of Visual Studio
there is the 'Link Time Code Generation' option which may inline
functions even if they are defined in a different translation unit. When I
compiled Rui's example in Visual Studio 2005 (32-bit) with the
reference() and value() functions in separate translation units, the
functions were still inlined. It is a bit surprising that apparently
this doesn't happen with Visual Studio 2010 (x64).
 
Rui Maciel

Ian said:
You have a very vivid imagination. If you care to wander back up the
thread and hang a left at my first reply to Scott Lurndal, you will see
I wrote: "The calls are still made, the function bodies are optimised.".

This was in relation to:

g++ x.cc -m64 -O1 && ./a.out
time pass by value: 2410000
time pass by reference: 2410000

You can't blame anyone's alleged "very vivid imagination" for your memory
problems, or at the very least a very selective memory. Or did you already
forget that you are talking about a very specific corner case, where you
were only able to fabricate the result you intended by forcing a specific
compiler to optimize a specific function in a very specific way, and even
then you had to make at least 3 separate attempts?

You can only count on it if you invest your time making sure that a
specific compiler will be able to compile a specific function within
your project to match your specific requirements, but this is way
past C++'s territory and firmly within platform and
implementation-specifics.

Most C++ relies on the inlining of trivial functions, it's at the heart
of the language. Would you expect every call to std::vector's
operator[] to involve an actual call?

Trivial functions are a small subset of the whole domain of functions.
A corner case, if you will.

I don't think you can apply the term "corner case" to a large part of
the standard library!

I didn't.

You did, two posts up.

Go read the post.

A large part (probably most of the templates) of the standard library is
made up from trivial functions.

Do you believe that your compiler optimizes every single function this way
every time a const reference is passed? In fact, do you actually believe
that the C++ standard mandates any of the optimization tricks you forced a
selected compiler to perform on this example?

There is a huge difference between what the C++ language is, and what a
specific implementation, under a set of very specific circumstances, is able
to do with a specific example. The C++ language doesn't change to fit
anyone's sweeping and baseless assumptions.


Rui Maciel
 
Rui Maciel

Paavo said:
It may or may not carry. Who cares?

Some people do care, particularly those who are forced to care about
efficiency and best practices. If not, Scott Meyers wouldn't have bothered
to specifically cover this in "Effective C++". IIRC, Andrei Alexandrescu
also gave a talk at Facebook (posted online somewhere) on the lessons he
learned about C++ efficiency and today's hardware, where he specifically
covers this topic.

The effect in any direction would be
visible only for trivial functions. But for trivial functions the ref-vs-
value issue is totally dwarfed by the inlining (or the lack of it). The
performance penalty for ref-or-value double is arguably at most 2x in
either direction (and much less by the actual measurements), but inlining
can yield many times more.

It isn't possible to assume that inlining is always a given, or even possible.
Number crunching code tends to rely on routines provided by shared
libraries, and it isn't possible to assume that it's possible to inline those
routines.

In addition, this is fundamentally a best practices issue. If we know
beforehand that passing doubles by value, when compared to passing them by
const reference, improves performance somewhere between 2x and zero, there
is no reason to opt for the const reference solution.

The same example again (Visual Studio 2010,
x64 Release build):

No inlining (trivial functions in the different .cpp):

time pass by value: 3197
time pass by reference: 3206

Inlining: everything in the same .cpp:

time pass by value: 968
time pass by reference: 956

Inlining (different .cpp-s, whole program optimization applied (/GL)):

time pass by value: 966
time pass by reference: 956

From here you see that the effect of inlining is over 3x, which makes the
ref-or-value question totally meaningless.

Conclusion: if you have trivial functions like these called in
performance-critical code, then better make sure they get inlined. When
inlined, the whole ref-vs-value passing issue evaporates so there is
nothing to talk about. When not inlined, the loss from ref-vs-value is
insignificant compared to the non-inlining penalty, so there is also
nothing to talk about.
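
The test program itself is not reproduced in this part of the thread, so the following is only a sketch of the kind of measurement being discussed. The function names value() and reference() are the ones mentioned above; their bodies, the `time_calls` helper and the output format are assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>

// Assumed bodies: the thread only names the functions value() and reference().
// To time real calls, these would live in a separate translation unit, built
// without whole-program optimization; in one file they will likely be inlined.
double value(double x)            { return x * 0.5 + 1.0; }
double reference(const double& x) { return x * 0.5 + 1.0; }

// Calls f `iterations` times, feeding each result into the next call so the
// loop cannot be discarded, prints the elapsed time and returns the result.
template <typename F>
double time_calls(F f, long iterations, const char* label)
{
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    double acc = 0.0;
    for (long i = 0; i < iterations; ++i)
        acc = f(acc);
    const auto stop = std::chrono::steady_clock::now();
    const auto us =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    std::printf("time %s: %lld us\n", label, static_cast<long long>(us));
    return acc;
}
```

Calling `time_calls(value, 100000000L, "pass by value")` and then the same with `reference` prints two lines in the shape quoted in the thread. The numbers depend entirely on the compiler, its flags and the hardware, which is the point being argued here.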

I agree that in some cases inlining does bring performance benefits. Yet,
it's also necessary to be aware of when inlining isn't possible, and this
tends to pop up a lot. In fact, when shared libraries are used to provide
primitive operations (e.g., BLAS) then this is the norm.


Rui Maciel
 
Ian Collins

Rui said:
You can't blame anyone's alleged "very vivid imagination" for your memory
problems, or at the very least a very selective memory. Or did you already
forget that you are talking about a very specific corner case, where you
were only able to fabricate the result you intended by forcing a specific
compiler to optimize a specific function in a very specific way, and even
then you had to make at least 3 separate attempts?

I'm not sure if you are being obtuse, an arse or a troll.

All I did was compile the code *you posted* with the *same compiler* you
used and post the results. I made no mention of any corner case. If
you make another attempt at reading the few simple words that appear to
be causing you trouble, you'll see I contradicted the assumption that the
functions were inlined. Just to help you out, I'll quote them again:

"The calls are still made, the function bodies are optimised."

There, do you get it now?
 
Rui Maciel

Andy said:
Surely this whole discussion is implementation dependent anyway?

Exactly. All these fine-tuning tricks which specific compilers, under
precise circumstances, are able to perform are as implementation dependent as
it gets. Therefore, it makes no sense to pass off these corner cases as the
standard behavior which should be expected from standard C++.


My rule of thumb is to pass POD types by value and other types by const
reference. Then modify if I need to copy the data, or modify the source.
And I've not yet had to tune something so hard it really mattered - it's
always been better to tune the algorithm not these fine implementation
details.

Precisely. That's the crux of the matter. If we are talking about standard
C++ then this is the only thing that can be counted on. Anything beyond
this is, at best, platform-dependent, and only possible to pull in specific
corner cases, and only after a significant amount of tweaking and testing,
which obviously cannot be passed off as what is expected from standard C++.


Rui Maciel
 
Fred Zwarts (KVI)

"Rui Maciel" wrote:
Exactly. All these fine-tuning tricks which specific compilers, under
precise circumstances, are able to perform are as implementation dependent
as it gets. Therefore, it makes no sense to pass off these corner cases as
the standard behavior which should be expected from standard C++.

As far as standard C++ is concerned passing by value may be faster or slower
than passing by reference.

So choosing passing by value because of performance improvement is only
applicable for specific compilers, under precise circumstances, and is as
implementation dependent as it gets. Therefore, it makes no sense to select
passing by value for performance improvements as the standard behavior which
should be expected from standard C++.
 
Rui Maciel

Paavo said:
Seems like a very specific corner case to me, and looks like if BLAS is
indeed encapsulating single floating-point operations in shared library
functions it could really benefit from templates/inlining/macros whatever
(cf. qsort() vs std::sort()).

BLAS is a C API for optimized routines which might not even be written in C.
You'd be hard pressed to use templates, inlining, or macros in a shared
library written in Fortran and wrapped in a C API, for example.

It is also not viable to simply implement a custom C++ version of that API,
because the people who write that stuff, as a rule, do know an awful lot
about what they are doing, and it simply isn't realistic to expect that all
the man-hours spent by experts in the field can be adequately replicated by
the average Joe in his spare time.

And BLAS is only a single example, among many.

Again, there are reasons why people do pay attention to this sort of stuff.

You see, I have nothing against passing doubles by value - it is simpler,
less error-prone (think const_cast) and as you have argued repeatedly,
most probably not less efficient than pass-by-const-reference even on
exotic hardware. I would most definitely pass doubles by value myself.

However, if there is any slightest preference for pass by reference (a
general template function or a function typedef like in the case of OP) I
would not lose any sleep over it. The actual measurements suggest that any
loss of performance is nonexistent or really minor. The only evidence
pointing in the opposite direction is from your own first test where you
accidentally measured unoptimized code.

It's perfectly fine if someone doesn't care if their code isn't efficient,
even if all it takes is being aware of a handful of coding best practices
and how a programming language works. As you've said, in some cases
performance really doesn't matter.

What's not acceptable is having people make stuff up and state bold
assertions that fly in the face of reality. In this case, a claim that in
standard C++ passing doubles by reference does not affect overall
performance is simply not true, and it isn't possible to dance around this
fact even if we aren't talking about performance bottlenecks.


Rui Maciel
 
Öö Tiib

Your comment wasn't about any specific code: it was a sweeping
generalization. And a baseless (and patently wrong) one.

One of the major benefits of using C++ as a programming language is
performance. Not knowing how language features affect performance
means that you are not paying attention to a major benefit of your
tool. Work done by someone who just knows the syntax may need to be
repaired by someone else who knows the performance. So you have
to work in a team with better specialists.

Things change on a yearly basis. The most unpredictable (by any logic)
processor that I have seen was Intel's Pentium 4. Just switching
two seemingly unrelated lines could give a significant boost or loss
there. Performance therefore has to be analyzed by measuring.

An experienced C++ developer has to have gained access to performance
information to know his tool. He can do measurements on all platforms
targeted. He can have someone on the team who does those. He can have
those automated. A single try with a lot of bla-bla attached displays
incompetence and lopsidedness.
"If it is a complex algorithm then performance of parameter passing does not
affect overall performance by any percentage."

Come on.

I meant that in the cases I measured the effect has been under 1%. The effect has
been in *both* directions, so curiously by value is not "better" as a rule.

I have measurements not only for doubles but for various other small things
like std::pair<wchar_t const*,int> or std::complex<float>. Similar results:
insignificant for anybody but devoted scientists, and often varying in
both directions, sometimes even on the same platform. Go read the thread
yourself; both Ian and Paavo posted such results.

So on most platforms it does not matter "at all". I added "most likely"
because an exotic platform (or even a debug build on one) being the subject
without being mentioned is "most unlikely".

PODs under 5 bytes seem to be worth passing by value on most platforms
because of performance. double is not under 5 bytes on any platform
within my reach.

I generally suggest to *prefer* passing every POD that has size under 25
bytes by value, but the reason is that the declaration is shorter and more
elegant, while I have yet to see a major performance effect either way:

void foo( Pod24 x ); // less bloat, no performance effect
void foo( Pod24 const& x ); // more bloat, no performance effect

24 bytes seems to be the break-point after which passing by reference starts
to tend to give regular and noticeable benefit on most platforms. double
is far under it. I would be very interested if someone had any data
contradicting mine either way for any popular platform for
any sane case.
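
For anyone who wants to repeat such a measurement, here is a sketch of a 24-byte candidate with both signatures. `Pod24` is only a name in the declarations above, so the three-double layout and the function names here are assumptions, as is the size check (8-byte double, no padding).

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical 24-byte aggregate standing in for "Pod24".
struct Pod24 {
    double x, y, z;
};

// POD in the sense used in this thread: trivial and standard layout.
static_assert(std::is_trivial<Pod24>::value &&
              std::is_standard_layout<Pod24>::value,
              "Pod24 must be trivial and standard layout");
static_assert(sizeof(Pod24) == 24, "assumes 8-byte double and no padding");

// The same computation with the two parameter-passing styles, ready to be
// timed against each other on the platform of interest.
double norm2_by_value(Pod24 p)      { return p.x * p.x + p.y * p.y + p.z * p.z; }
double norm2_by_ref(Pod24 const& p) { return p.x * p.x + p.y * p.y + p.z * p.z; }
```

Both functions return the same result for the same input; only a measurement on the target platform says whether one of them is faster.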

Everything said can change this year with a new popular chip or compiler
coming out, and the standard does not matter at all.

Why have you turned an interesting discussion into a pointless row of
baseless insults, and pointless attempts to apply "common sense" and "logic"
and "come on" and "you have to understand" and "C++ standard" and "don't
play fool" to things that *should be measured* and that *I have measured*?

Your empty words just insult yourself. CLC++ is about all sides of C++
usage, performance is one very important perk of it, performance is
platform dependent and mentioning the standard is just so pathetic?!?
 
James Kanze

On 13/02/2013 08:52, Rui Maciel wrote:
My rule of thumb is to pass POD types by value and other types by const
reference. Then modify if I need to copy the data, or modify the source.

Do you mean POD types, or non-class types? Something like

struct S { double d[ 1000000 ]; };

is a POD type, but I wouldn't recommend passing it by value.
The usual rule is class types by const reference, everything
else by value.
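
The point about S can be checked at compile time. This is a sketch using the struct from the post; the helper `first_by_ref` is a hypothetical name added for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct S { double d[ 1000000 ]; };

// S is a POD (trivial and standard layout) ...
static_assert(std::is_trivial<S>::value && std::is_standard_layout<S>::value,
              "S is a POD type");
// ... yet every by-value pass would copy at least this many bytes.
static_assert(sizeof(S) >= 1000000 * sizeof(double),
              "S holds a million doubles");

// Hypothetical accessor: by const reference, nothing is copied at the call.
double first_by_ref(S const& s) { return s.d[0]; }
```

So "POD" by itself says nothing about size, which is why the rule is phrased in terms of class types rather than POD types.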
 
Dombo

On 12-Feb-13 22:23, Paavo Helde wrote:
One needs to switch on the whole program optimization option (/GL) in the
compiler section of the project settings, setting only the linker /LTCG
option is not enough.

True, but I wasn't referring to a particular compiler or linker switch,
but to a feature of the Visual Studio toolchain.
According to MS documentation
(http://msdn.microsoft.com/en-us/library/0zza0de8%28v=vs.110%29.aspx)
the /GL option is off by default:

"Whole program optimization is off by default and must be explicitly
enabled."

However, when I created a blank new project it had indeed switched on /GL
by itself, so I am not sure any more if this documentation means anything
at all or not.

This documentation is about the compiler, not the IDE. What it says is
that if /GL option is not specified on the command line of the compiler
the compiler (and linker) will not perform this optimization (regardless
of how the compiler is invoked). The default compiler- and linker
settings for Release and Debug build configurations are determined by
the IDE.
Given the quality of MS documentation (see e.g. _lock_file
description up to VS2005) I am not surprised.

The Visual Studio documentation leaves a lot to be desired and seems to
be getting less useful with every release (a simple Google search
usually yields more useful and better presented information than the
built-in help system). That being said, Microsoft is far from the worst
as far as documentation goes. Not that Microsoft does a good job, it
doesn't, but way too many others do a far worse job, especially when it
comes to documenting the corner cases.
 
88888 Dihedral

On Wednesday, February 13, 2013 at 5:10:04 PM UTC+8, Rui Maciel wrote:
Paavo Helde wrote:

Some people do care, particularly those who are forced to care about
efficiency and best practices. If not, Scott Meyers wouldn't have bothered
to specifically cover this in "Effective C++". IIRC, Andrei Alexandrescu
also gave a talk at Facebook (posted online somewhere) on the lessons he
learned about C++ efficiency and today's hardware, where he specifically
covers this topic.

It isn't possible to assume that inlining is always a given, or even possible.
Number crunching code tends to rely on routines provided by shared
libraries, and it isn't possible to assume that it's possible to inline those
routines.

In addition, this is fundamentally a best practices issue. If we know
beforehand that passing doubles by value, when compared to passing them by
const reference, improves performance somewhere between 2x and zero, there
is no reason to opt for the const reference solution.

I agree that in some cases inlining does bring performance benefits. Yet,
it's also necessary to be aware of when inlining isn't possible, and this
tends to pop up a lot. In fact, when shared libraries are used to provide
primitive operations (e.g., BLAS) then this is the norm.

Rui Maciel

I think passing one or more arrays of doubles by reference or by address
for numeric computations is the default way in heavy number crunching jobs.

Of course, whether or not a value type is passed in a way that fits in a
hardware register sometimes does matter.
 
Stefan Ram

army1987 said:
Is there any good reason to declare a function parameter as `const double

»const« is always nice, so the decision could be
»double const & value« versus »double const value«.

A solution might be to use

CONST(double) value

and then to #define CONST(T) as »T const &« or »T const«
depending on profiling. Or, possibly, to use

DOUBLE_CONST value

when the definition would be different for »int«, and
DOUBLE_CONST is »double const« or »double const &«.
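
A sketch of how that macro switch might look. The post only describes the idea; the configuration macro `PASS_SMALL_BY_REFERENCE` and the function `scale` are hypothetical names.

```cpp
#include <cassert>

// Flip the (hypothetical) configuration macro after profiling.
#ifdef PASS_SMALL_BY_REFERENCE
#define CONST(T) T const &
#else
#define CONST(T) T const
#endif

// The signature reads the same either way; only the macro decides whether
// the doubles arrive by value or by const reference.
double scale(CONST(double) value, CONST(double) factor)
{
    return value * factor;
}
```

One limitation of this design: a type whose name contains a comma, such as std::pair<int, int>, cannot be passed to a single-argument macro directly and would need a typedef first.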
 
Öö Tiib

My rule of thumb is to pass POD types by value and other types by const
reference. Then modify if I need to copy the data, or modify the source.

Do you mean POD types, or non-class types? Something like

struct S { double d[ 1000000 ]; };

is a POD type, but I wouldn't recommend passing it by value.
The usual rule is class types by const reference, everything
else by value.

My apologies. The simple types, such as double, bool etc. Is non-class
the correct term? I can't find the definition, only people complaining
about it! I think I'm using the same rule as you, but not the correct name.

The narrowest set of types that contains both bool and double is called
"arithmetic types". Next, a bit wider, is "fundamental types", which basically
adds void and nullptr_t to the arithmetic types. Then there are "scalar types":
the arithmetic types and nullptr_t plus enumeration types, pointer types and
pointer-to-member types (but not void).

POD (plain old data) is the next, wider set. It contains everything that "is trivial
and standard layout". That means a POD is trivial to copy and to destroy and
its memory layout is so well standardized that you can use memcpy() to copy
it. Some arrays and structs qualify, but the details get verbose here.

Just use:

#include <type_traits>
#include "X.hpp" // your header that declares type X
static_assert( std::is_pod<X>::value && sizeof(X)<25
, "X must be POD and less than 25 bytes long!" );

For me such trivial stuff that is under 25 bytes long is fine to pass by
value: it is never a show-stopper defect AND the performance issues are
elsewhere as a rule.
 
James Kanze

On 13/02/2013 08:52, Rui Maciel wrote:
My rule of thumb is to pass POD types by value and other types by const
reference. Then modify if I need to copy the data, or modify the source.
Do you mean POD types, or non-class types. Something like
struct S { double d[ 1000000 ]; };
is a POD type, but I wouldn't recommend passing it by value.
The usual rule is class types by const reference, everything
else by value.
My apologies. The simple types, such as double, bool etc. Is non-class
the correct term? I can't find the definition, only people complaining
about it! I think I'm using the same rule as you, but not the correct name.

POD is, very roughly speaking, anything you could write in C.
The simple types are the built-in types which can be defined
without any operators, so don't include pointers (or enums).
Non-class is anything which isn't a class (and don't forget that
in C++, a union is a sort of a class, and the keyword struct
also defines a class). The other thing we don't want to pass by
value is a C-style array, but the language takes care of that one.
 
James Kanze

Just use:

#include <type_traits>
#include "X.hpp" // your header that declares type X
static_assert( std::is_pod<X>::value && sizeof(X)<25
, "X must be POD and less than 25 bytes long!" );
For me such trivial stuff that is under 25 bytes long is fine to pass by
value, it is never show-stopper defect AND the performance issues are
elsewhere as rule.

Why be so complicated? The usual rule (at least everywhere I've
been) is "class types by const reference, everything else by
value". There are a few obvious exceptions (e.g. when we have
to deal with C style arrays, and one does tend to conform to the
standard library conventions, and pass iterators and functional
objects by value as well), but the general rule is simple,
and as you say, the performance issues are generally elsewhere
(if there are performance issues).

There is one important exception, of course: in a template, you
don't necessarily know whether a type is a class type or not.
Since the cost of passing an int or a double by const reference
isn't overwhelming, whereas passing some classes by
value can be extremely expensive, the convention here is usually
to use const reference (again, except for iterators and
functional objects).
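
A sketch of both options in a template. The first function follows the convention just described. The `param` helper is a hypothetical alternative, similar in spirit to boost::call_traits<T>::param_type, for code that wants to apply the non-template rule inside a template; all names here are invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// The usual convention: in a template, take T by const reference.
template <typename T>
bool equal_by_cref(T const& a, T const& b) { return a == b; }

// Hypothetical helper: class types (including unions) by const reference,
// everything else by value.
template <typename T>
struct param {
    typedef typename std::conditional<
        std::is_class<T>::value || std::is_union<T>::value,
        T const&,
        T>::type type;
};

static_assert(std::is_same<param<double>::type, double>::value,
              "double is passed by value");
static_assert(std::is_same<param<std::string>::type, std::string const&>::value,
              "class types are passed by const reference");

// The parameters are in a non-deduced context, so T must be spelled out
// at the call site, e.g. pick_equal<double>(a, b).
template <typename T>
bool pick_equal(typename param<T>::type a, typename param<T>::type b)
{
    return a == b;
}
```

The price of the helper is visible in the last function: template argument deduction no longer works, which is one reason the plain const reference convention is the common choice.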
 
