virtual functions vs speed


Stijn Oude Brunink

Hello,

I have the following trade off to make:

A base class with 2 virtual functions would be really helpful for the
problem I'm working on. Still, the functions that my program will use
a lot are the ones that are virtual, and thus they will slow down the
calculation; at least that is what I have read in several places on the
internet. It makes sense: the program has to work with some kind of lookup
table for the virtual functions to know exactly what function to call.

How much is the slow down and is there a workaround?

I have been thinking about using function pointers, but that seems like a
nasty trick which will only create a lot of development problems later on.

Thanks for the response,

stijn
 

John Harrison

Stijn Oude Brunink said:
Hello,

I have the following trade off to make:

A base class with 2 virtual functions would be really helpful for the
problem I'm working on. [snip]

How much is the slow down and is there a workaround?

The slow down compared to what? Unless you have a workaround, you have
nothing to compare it with.

I guess the usual workaround is to have the base class carry an extra field
that says what the derived class is, and then have each 'virtual' function
test that field and directly call the derived-class function (using a cast).
But that is slow as well. So all you can do is try both techniques and time
them. Any difference is likely to be measured in fractions of milliseconds.

The usual advice is to first get your code working in the most
straightforward way (you said virtual functions would be 'really useful').
Then, when you have a working program, see if it is running fast enough.
ONLY when you have determined that it is not should you look at ways of
speeding it up. Any other way is a waste of effort and produces ugly code as
well.
I have been thinking about using function pointers but that seems like a
nasty trick which will only create a lot of development problems later on.

Right, it's easier to speed up a well-written program than it is to fix bugs
in a badly written program.

john
 

David Fisher

Stijn said:
I have the following trade off to make:

A base class with 2 virtual functions would be really helpful for the
problem I'm working on. [snip]

How much is the slow down and is there a workaround?

I would recommend "Inside the C++ Object Model" by Stanley Lippman - it
addresses exactly this sort of question.

What do the functions do?

David Fisher
Sydney, Australia
 

Niels Dybdahl

A base class with 2 virtual functions would be really helpful for the
problem I'm working on. [snip]

How much is the slow down and is there a workaround?

For each call the compiler needs to load 2 values: first the address of the
table of virtual functions, and then the address of the function itself.
However, if you really call these functions a lot, then these values will be
in the processor's cache and the delay will be very small. It will probably
be less than assigning the value of one variable to another.
So either use virtual functions, or try to avoid function calls altogether
(make them inline).

Niels Dybdahl
 

PKH

Stijn Oude Brunink said:
[snip]

How much is the slow down and is there a workaround?
[snip]

I did a timing test of virtual functions vs. static calls once, and the
difference was minimal in my case. You could make a small test program
and see for yourself.

PKH
 

Stijn Oude Brunink

Some derived classes hold an array of pointers to other instances of classes
derived from the base class. My function does some simple addition and
multiplication, and then calls the same function via the pointers on the
other instances. Since I don't know which derived class each pointer is
pointing to, a virtual function is handy.
 

JKop

I have been thinking about using function pointers but that seems like a
nasty trick which will only create a lot of development problems later on.


For the love of God, that's exactly how virtual functions work!

Why do you think that you can access memory any faster than the compiler
can?! Running shoes?

Virtual functions work via pointers (just to cover my bases here: no, it
doesn't say this in the Standard); as such, they'll be exactly the same
speed as if you used function pointers *yourself*, plus they're more
convenient.

One last thing:

If you're running Windows, have a look for "ramdrive.sys" on your machine.
It's been around since about DOS 6, I think. Anyway, let's say you have 128MB
of RAM: set up a 50MB ramdisk and copy 50MB from your hard disk to it.
Play around with the data on the ramdisk; moving, copying, renaming. Note
how quick it is, and note that that's 50MB you're dealing with. On 32-bit
Windows, a pointer is 32 bits, which equals 4 bytes.

50MB = 51,200 KB = 52,428,800 bytes

which is the equivalent of 13,107,200 pointers.

Now, how many pointers are *you* dealing with? 2? You have exactly *no*
worries.

Just a little side note here... ever played the game Mortal Kombat? There's
a character in it called Shang Tsung who can morph into every other
character in the game. You hit a certain button sequence (forward, forward,
X) and he morphs. This was grand on the Super Nintendo: the new character
wasn't really loaded at all, since the cartridge was accessed directly as if
it *was* memory. But then you had the PlayStation, which used CDs. CDs
aren't nearly as fast as RAM or a cartridge; as such, the new character had
to be loaded. So you'd hit forward, forward, X and the game would stall for
about 10 seconds before you could play again. There was the same problem on
the PC, loading the character from a hard disk. *My remedy*? Copy the game
to a ramdisk and run it. Forward, forward, X and voila: instant morph!

Again, just to stress it, RAM is fast! And by fast I mean *FAST*. So don't
be worrying about 2 miserable little pointers!


-JKop
 

JKop

JKop posted:
[snip]

Also, your compiler might work out at compile time what kind of
object it's dealing with, alleviating the need for pointers
at all!

But then again... you might have a shit compiler.


-JKop
 

Alex Vinokur

Stijn Oude Brunink said:
[snip]

How much is the slow down and is there a workaround?
[snip]

Look at "Comparative performance measurement : virtual vs. ordinary methods" test:
http://article.gmane.org/gmane.comp.lang.c++.perfometer/66
 

Peter Koch Larsen

Alex Vinokur said:
[snip]

Look at "Comparative performance measurement : virtual vs. ordinary methods" test:
http://article.gmane.org/gmane.comp.lang.c++.perfometer/66

Hi Alex

I really can't believe those numbers. 600 clock cycles to call one nonvirtual
function?
I believe you should examine the assembly code and find out what's going on.
Did you optimize? Did you subtract the time taken outside the
function call?

I would have expected something like 2 or 3 clocks for the nonvirtual call
and 10 clocks for the virtual call.

/Peter
 

Arno Huetter

Stijn Oude Brunink said:
[snip]

How much is the slow down and is there a workaround?
[snip]

Hi there,

I think you have to compare the penalty of the virtual function call
to the complexity of the function itself.

Possible penalties: one vtable indirection, plus the fact that the
compiler cannot inline the function code (only relevant for
short functions) -> parameters must be passed using registers and/or
the call stack, call and return overhead, setting/restoring the stack
base, stack frame pointers and program counter.

I am guessing here, but in a worst-case scenario this might imply
something like 20+ additional CPU cycles in comparison to non-virtual,
inlined code. Compare that to the CPU cycles your function itself costs,
and you can evaluate the tradeoff you are facing.

Kind regards,
Arno Huetter
 

Karl Heinz Buchegger

Peter said:
I really can't believe those numbers. 600 clock cycles to call one nonvirtual
function?

As I understand the tables, the numbers you see are:

~600 ms for 50000000 calls (i.e. roughly 12 ns per call)

<Quote>
#=======================================================
# Comparison Group : virtual vs. ordinary methods
#-------------------------------------------------------
# Resource Name : CPU-time used
# Resource Cost Unit : clock
# Resource State Unit : clock
#-------------------------------------------------------
# Total repetitions : 50000000
# Performance metrics : clock / 50000000 repetitions
#=======================================================
</Quote>

with virtual functions taking approximately twice that time.
Pretty much what most people would expect.
 

Alex Vinokur

Peter Koch Larsen said:
"Alex Vinokur" <[email protected]> wrote in a message

[snip]
Look at "Comparative performance measurement : virtual vs. ordinary methods" test:
http://article.gmane.org/gmane.comp.lang.c++.perfometer/66
[snip]


Hi Alex

I really can't believe those numbers. 600 clock cycles to call one nonvirtual
function?

657 milliseconds per 50000000 calls of one nonvirtual function of class Foo00,
which has both virtual and nonvirtual methods (i.e. roughly 13 ns per call).

I believe you should examine the assembly code and find out what's going on.
Did you optimize?

No.

GNU g++ version 3.3.3 (cygming special)

Compilation :
$ g++ -mno-cygwin *.cpp // i.e. No optimization

Did you subtract the time taken outside the
function call?

What time is worth subtracting here?
 

Greg Comeau

I have the following trade off to make:

A base class with 2 virtual functions would be really helpful for the
problem I'm working on. Still, the functions that my program will use
a lot are the ones that are virtual and thus will slow down the
calculation; at least that is what I have read in several places on the
internet. It makes sense the program has to work with some kind of lookup
table for the virtual functions to know exactly what function to call.
Probably.

How much is the slow down

Depends upon the platform.
and is there a workaround?

Yes, but that isn't the question; the question is: can you
produce an alternative that is superior?
I have been thinking about using function pointers but that seems like a
nasty trick which will only create a lot of development problems later on.

IOWs, you were thinking of making some kind of lookup table, and
some "index" you maintain, for your simulated virtual functions to
exactly know what function to call. Hey wait, that's exactly what
you just said the compiler is doing for you!

This is a common faux pas when comparing, say, C with C++.
It's easy to say "virtual functions have a penalty".
But you don't use virtuals just because they are there;
you use them because you need them. So the question
is: what are the alternatives, and what are their penalties?
IOWs, if you want the behavior, there is probably a penalty
to all alternatives. And sometimes you will find that
there is no detectable difference and no clear winner,
except that going with the automated way may make sense,
since it avoids having to do it by hand. Note that
my comments are generalizations. Another issue is
whether or not you have a speed concern at all.
The only way to know is to test it, and also to test the
alternatives if action is deemed necessary,
also noting that such results could be platform,
compiler, and even command-line-switch dependent.
 

Greg Comeau

I think you have to compare the penalty of the virtual function call
to the complexity of the function itself.

Sure, but that's also an incomplete comparison, because
then it's only looking at the dynamics of the virtual call,
and not the issue of how to avoid or better anything.
Possible penalties: one vtable indirection, plus the fact that the
compiler cannot inline the function code (only relevant for
short functions) -> parameters must be passed using registers and/or
the call stack, call and return overhead, setting/restoring the stack
base, stack frame pointers and program counter.

I am guessing here, but in a worst case scenario this might imply
something like 20+ additional cpu cycles in comparison to non-virtual,
inlined code. Compare that to the cpu cycles your function causes, and
you can evaluate the tradeoff you are facing.

But the tradeoff isn't virtual vs non-virtual; it's virtual
with its automatic underlying machinery vs non-virtual with
equivalent machinery added by hand.
 

Peter Koch Larsen

"Peter Koch Larsen" <[email protected]> wrote in a message
[snip]
I really can't believe those numbers. 600 clock cycles to call one nonvirtual
function?
I believe you should examine the assembly code and find out what's going on.
Did you optimize? Did you subtract the time taken outside the
function call?

My surprise was caused by this documentation from the URL:
#=======================================================
# Comparison Group : virtual vs. ordinary methods
#-------------------------------------------------------
# Resource Name : CPU-time used
# Resource Cost Unit : clock
# Resource State Unit : clock
#-------------------------------------------------------
# Total repetitions : 50000000
# Performance metrics : clock / 50000000 repetitions
#=======================================================

I assumed that clock meant clock cycles (and not "wall clock time").
Understanding it otherwise is a little difficult, seeing "Resource Cost Unit"
given as clock instead of e.g. "seconds elapsed time". Also the line

"# Performance metrics : clock / 50000000 repetitions"

was interpreted as the number of clock cycles for 50000000 repetitions.
I would have expected something like 2 or 3 clocks for the nonvirtual call
and 10 clocks for the virtual call.

/Peter

Am I really the only one to be confused?

/Peter
 

Peter Koch Larsen

Karl Heinz Buchegger said:
As I understand the tables, the numbers you see are:

~600 ms for 50000000 calls

[snip]

with virtual functions taking approximately twice that time.
Pretty much what most people would expect.

I realise this now - see my post elsewhere in this thread. That the overhead
of calling a virtual function is twice that of calling a nonvirtual one is
quite reasonable.

/Peter
 

Peter Koch Larsen

Alex Vinokur said:
[snip]

What time is worth subtracting here?

the entire loop could be something like

L1:   DEC  DWORD PTR [LOOP_CNT]
      JZ   DONE
      CALL FOO_F0
      JMP  L1
DONE:

Thus the loop overhead plays a major part of the total time (depending a lot
on the CPU, of course).

If you had e.g. 100 calls inside the loop body, the loop overhead should be
negligible.


/Peter
 

Nicholas Salerno

Stijn Oude Brunink said:
[snip]

I have some comments to make that may help:

1) Don't worry about the speed of virtual functions until you know for
a fact that it is a problem. Also, don't trust everything you read on
the Internet. It boils down to this: virtual functions are *relatively*
slow. This has some very important implications. First, virtual
functions are not slow in absolute terms. Second, depending on the
application in question, the relative difference may not be significant.
How can you tell if the relative difference is significant? By using a
code profiler. Let's assume that you are not meeting your time
requirements. Run the profiler and let it tell you where the bottlenecks
are. Chances are you will be surprised when it indicates
something other than what you thought.

2) OK, let's assume that for whatever reason you don't want to use
virtual functions. The only solution that I know of, with respect to
speed, is to use templates. Templates have the ability to take
polymorphism from the run-time level to the compile-time level, which
means the dispatch is resolved during compilation. The end
result is that the call becomes a non-virtual function call.
However, this is easier said than done. I have mixed feelings about
template programming; it can be very tricky. Also, depending on how
much transparency your client code needs, it may not be possible to use
the compile-time polymorphic technique of templates. If you are still
interested in using templates, then look up and read about the
"Curiously Recurring Template Pattern" (CRTP).

I'd say don't resort to templates unless you absolutely have to. In
my experience run-time polymorphism is not the performance evil that
many make it out to be. I currently work on a project that involves
live video feed. The code base is very object-oriented. It has many
strategy design patterns. Simply stated, I have two main tasks I want
to accomplish: grabbing a frame and preprocessing each frame. Both of
these tasks have nothing to do with the main application logic. If I
want to use different cameras then I have to be able to switch the frame
grabbing and preprocessing strategy as seamlessly as possible. During
live capture my main application logic is calling virtual functions
from an abstract interface. Does the overhead of the virtual
functions slow down the frame rate? No. But then again it depends on
context. If the frame rate is more than 100 frames per second (fps)
then maybe the virtual functions will hurt performance. However, at fewer
than 32 fps the virtual-call overhead is negligible. I'm not
saying this is true for all applications of this type. I'm saying
that it is true for my application. The profiler tells me that my
bottleneck, by far, is the preprocessor. Specifically, it is the
convolution operation (a standard image processing technique).
Calling the preprocessor through a function or virtual function would
make no difference in the total execution time for the convolution
operation. What did we have to do to speed up the live capture? We
used MMX instructions for the pixel processing inside the
preprocessor. No need to remove the virtual functions. They are
still there and my code base remains simple and maintainable. Just
something to think about.

Nicholas
 

E. Robert Tisdale

Stijn said:
I have the following trade off to make:

A base class with 2 virtual functions would be really helpful
for the problem I'm working on.
Still, the functions that my program will use a lot
are the ones that are virtual and thus will slow down the calculation;
at least that is what I have read in several places on the internet.
It makes sense the program has to work with some kind of lookup
table for the virtual functions to know exactly what function to call.

Nonsense!
The difference between a virtual function call
and a regular external function call
is on the order of a few nanoseconds.
The difference is so small that you can't measure it.

The problem with a virtual function is that
the actual function is not resolved until run-time
so the compiler can't inline it and take advantage
of the optimizations possible with inline code.
How much is the slow down and is there a workaround?

There is no workaround.
I don't think that you need to worry about this
unless your virtual functions do only a trivial amount of work
and you were seriously considering implementing them
as inline functions.
 
