Writing Scalable Software in C++


Stephen Sprunk

Skybuck Flying said:
That's kinda interesting/cool... virtual overloaded operators.

However: IT'S HELLISH SLOW when looking at the assembler.

It might not be as slow as Delphi's dynamic array reference counting and
such, but still... (however, it's not fair to compare, because this C++ example
isn't dynamically, infinitely scalable ;))

Way too slow to be usable for my purposes.

It was interesting to see C++ virtual overloaded operators in action...
maybe making the operators virtual would not be necessary if other tricks
were used? I am not good enough in C++ to try other tricks.

I also explored some other ideas, none of which are satisfying.

Really, really sad.

I will probably have to convert my code to 64-bit emulated ints and say
goodbye to performance for 32-bit cases!

And, if you'd bothered to read my posts, you'd see that's exactly what I
told you you'd see: "Simply running in 64-bit mode (even emulated) all the
time will be faster on modern CPUs than trying to decide at runtime which is
better."

I didn't have to run any tests to know that; merely understanding how
CPUs and compilers actually work is enough. You might try investigating those
things before posting, as it'll save you (and us) a lot of time and effort.

S
 

Stephen Sprunk

MooseFET said:
That is simply false. Why do you think it can't be done with
virtual methods?

It can, of course.
Virtual methods, done the way Borland Pascal, Delphi
and a good implementation of C++ do them, add very little extra time to
the total run time. The virtual method dispatch code takes fewer
instructions than the entry code of most routines.

True. However, in this particular example, we're comparing the cost of
using virtual methods to select 32- and 64-bit code paths vs. the cost of
emulating 64-bit all the time.

* You have to do a vtable lookup
* You have to get the parameters into the right registers or, worse, in the
right places on the stack
* You have to call the function
* You have to do a function prolog
* Do the work
* You have to do a function epilog
* You have to return from the function
* You have to get the results from the return register or stack to where you
want them.

All of those steps need to be done in series, because they depend on each
other. You also lose the ability to schedule multiple such operations in
parallel or one operation in parallel with other code, greatly increasing
latency and reducing performance. Finally, there are significant additional
costs if you have L1-I misses, BHT misses, stack-based arguments, etc.

Compare all of that vs. just emulating a 64-bit type (assuming a 32-bit CPU)
for all math. It's obvious to anyone who understands CPU architecture which
will win. Skybuck's the only one who doesn't get it, for obvious reasons.

S
 

Frederick Williams

Skybuck said:
Hello,

This morning I had an idea ...

I hope that this doesn't sound impolite, but why are you posting to
sci.electronics.design and alt.math?
 

David Brown

Skybuck said:
For those that missed the other threads here is the explanation why I want
something like this:

For 32 bit compilers:

int (32-bit signed integer) is fast; it's translated to a single 32-bit CPU
instruction.

long long (64-bit signed integer) is slow; it's translated to multiple 32-bit
CPU instructions.

For 64-bit compilers:

long long (64-bit signed integer) should be fast; it's translated to a
single 64-bit CPU instruction.

I want to write code just once, not three times! And I want maximum speed!

Look, it's quite simple - if you want 32-bit data, use 32-bit integers.
If you want 64-bit data, use 64-bit integers. There is virtually no
situation where 64-bit integers are faster than 32-bit integers on a
64-bit processor. On such rare occasions when there *is* a difference,
coding specifically for the algorithm in question will make more difference.
 

Skybuck Flying

MooseFET said:
That is simply false. Why do you think it can't be done with virtual
methods?

Because virtual methods are not operator overloading.

Writing code such as:

Div( Multiply( Add( A, B ), C ), D )

is impractical and of course slow, ^^^ call overhead.
Virtual methods when done the way that Borland Pascal, Delphi and a
good implementation of C++ add very little extra time to the total run
time. The virtual method dispatch code takes less instructions than
the entry code of most routines.

A good operator overloading implementation does not even have call overhead.

I think that's how Delphi's operator overloading for records works.

^^^ No overhead ^^^

(Not sure though, but I thought that's how it works)

No, it doesn't. It only gives the appearance of doing what you want.
There is nothing scalable going on.

Apparently you see it differently.

I already used these techniques to scale to infinity.

So you're simply wrong about that one.

Bye,
Skybuck.
 

hagman

Well, we can pretty safely forget about this "solution".

It's not really a solution.

The problem is with the data.

Different data types are needed.

32 bit data and 64 bit data.

Trying to cram those into one data type is not possible.

Not with classes, not with records; maybe with variants, but those are too slow.

If you do try you will run into all kinds of problems, code problems.

It was an interesting experience though.

I played around with DLLs, then packages: LoadPackage, UnloadPackage,
TPersistent (Delphi stuff). Then I realized: let's just copy & paste the code
and try to use unit_someversion.TsomeClass. But nope.

The problem with the data remains.

I really do want one data type to perform operations on, and this data type
should scale when necessary.

I want one piece of code for this data type, and it should change when necessary.

It looks simple to do, but currently in Delphi it's probably impossible to do
it fast; even slow versions create problems.

The best solution is probably my own solution:

TgenericInteger = record
  mInteger : int64;
end;

Overloaded add operator:

if BitMode = 32 then
begin
  int32(mInteger) := int32(mInteger) + etc;
end else
if BitMode = 64 then
begin
  int64(mInteger) := int64(mInteger) + etc;
end;

Something like that.

Introduces a few if statements... which is overhead...

Question remains: how much overhead is it really?

Yeah good question:

Which one is faster:

add
adc

Or:
mov al, bitmode
cmp al, 32
jne lab1
add eax, ecx        ; 32-bit path
jmp lab2
lab1:
cmp al, 64
jne lab2
add eax, ecx        ; 64-bit path: add the low halves...
adc edx, ebx        ; ...then the high halves, with carry
lab2:

Something like that...

Well, I think always executing add, adc is faster than the compares and jumps
:) LOL.

End of story? Not yet... this is a simple example... what about mul and div?
<- those are complex for emulated int64.

Maybe using an if statement to switch to 32-bit when possible would be much
faster after all?!

Bye,
Skybuck.

I think I still don't get what you want.
If you want 32 bits, use "int".
If you want 64 bits, use "long long".
If you want the biggest type that the target CPU can mul/div in a
single instruction, use "long".
At least this seems to work out correctly with gcc and Intel 32/64-bit
machines.
In any case, you don't save *memory* by adding a 4-byte vtable pointer just
to distinguish between the case of an additional 4 bytes of int or 8 bytes
of long long (not to mention alignment), and I doubt you save much
*time* either.

Moreover, I have the impression that you don't treat mixed cases like
int32 + int64 well.
 

Skybuck Flying

What makes you believe I don't get it?

Please stop your unnecessary insults.

Bye,
Skybuck.
 

Skybuck Flying

Absolute nonsense.

If I want, I can write a computer program that runs 32-bit when possible and
64-bit emulated when needed.

My computer program will outperform your "always 64 emulated" program WITH
EASE.

The only problem is that I have to write each piece of code twice.

A 32 bit version and a 64 bit version.

I simply instantiate the necessary object and run it.

Absolutely no big deal.

The only undesirable property of this solution is two code bases.

Your lack of programming language knowledge and experience is definitely
showing.

Bye,
Skybuck.
 

Skybuck Flying

Math was an accident, but it's probably related anyway.

Electronics.design might be related as well ;)

Bye,
Skybuck.
 

pan

Because virtual methods are not operator overloading.
Writing code such as:
Div( Multiply( Add( A, B ), C ), D )
is impractical and of course slow, ^^^ call overhead.

A user-defined overloaded operator call is as fast as a user-defined
function call, simply because they're both the same thing.
A good operator overloading implementation does not even have call
overhead.

That is usually called "inlining", and it can be applied both to
functions and overloaded operators.
Inlining is unlikely to happen for a virtual function or operator, though.
 

Skybuck Flying

There is definitely a speed difference, especially for mul and div, in the
modes I described.

Why do I have to choose the data type?

Why can't the program choose the data type at runtime?

Bye,
Skybuck.
 

Skybuck Flying

Yes, you missed the other threads, so I shall explain again, lol:

I want:

1. One code base which adapts at runtime.

2. Uses 32 bit instructions when possible.

3. Switches to 64 bit instructions when necessary (true or emulated).

4. No extra overhead.

As far as I can tell, the CPUs for PCs are inflexible:

32 bit data types require 32 bit instructions.

64 bit data types require 64 bit instructions or alternatively:

64 bit data types require multiple 32 bit instructions.

This means it's necessary to code 3 code paths!

I do not want to write code 3 times!

I want to express my formulas and algorithms just one time!

I want the program/code base to adapt to the optimal instruction sequences
without actually having to code those three times!

I suggested a "feature extension" to processors: "Flexible Instruction Set".

The idea is to use a BitMode variable to specify to the CPU how it is
supposed to interpret the coded instruction sequences.

That way I could write a single instruction sequence and only need to change
one variable.

Many people started bitching that the current CPUs can already do this for
16/32/64.

I have seen no proof whatsoever.

Can you provide proof?

Bye,
Skybuck.
 

David Brown

Skybuck said:
There is definitely a speed difference, especially for mul and div, in the
modes I described.

Why do I have to choose the data type?

Why can't the program choose the data type at runtime?

If *you* are writing the program, *you* should know what sort of data is
stored in each variable. *You* can then tell the compiler by choosing
an appropriate data type. Is that so hard to grasp? It is up to *you*
to figure out what limits there will be on the size of the data you
are using, and therefore pick 32-bit or 64-bit (or whatever) integers
for your program. If you think there could be large variations in the
sizes, then either use a data type that will certainly be big enough, or
pick one with no arbitrary limit (there are multiple precision integer
libraries available for most languages), or use a dynamically typed
language.
 

David Brown

Skybuck said:
Because virtual methods are not operator overloading.

Writing code such as:

Div( Multiply( Add( A, B ), C ), D )

is impractical and of course slow, ^^^ call overhead.


A good operator overloading implementation does not even have call overhead.

I think that's how Delphi's operator overloading for records works.

^^^ No overhead ^^^

(Not sure though, but I thought that's how it works)



Apparently you see it differently.

I already used these techniques to scale to infinity.

So you're simply wrong about that one.

Bye,
Skybuck.

When you are looking at operator overloading, and the compiler sees an
expression such as "a * b", it considers it *exactly* the same as a
function call "multiply(a, b)". There is absolutely no difference to the
compiler, and you can use virtual methods, overloading, inlining, and
any other tricks to get the effect you want.

I presume you also know that compilers do not have to use virtual calls
just because a function is a virtual method of a class? If the compiler
knows what class an object is, then it can short-circuit the virtual
method table and call the method directly. And if it knows the
definition of the method in question, it can automatically inline the
call - resulting in zero overhead.
 

Skybuck Flying

David Brown said:
If *you* are writing the program, *you* should know what sort of data is
stored in each variable. *You* can then tell the compiler by choosing an
appropriate data type. Is that so hard to grasp? It is up to *you* to
figure out what limits there will be on the size of the data you are
using, and therefore pick 32-bit or 64-bit (or whatever) integers for your
program. If you think there could be large variations in the sizes, then
either use a data type that will certainly be big enough, or pick one with
no arbitrary limit (there are multiple precision integer libraries
available for most languages), or use a dynamically typed language.

Well, that clearly sucks.

The world is not completely 64-bit. The world is not static; it fluctuates.

Sometimes the program only needs 32 bits, sometimes 64 bits.

Always choosing 64 bits would hurt performance, LOL.

Bye,
Skybuck.
 

Skybuck Flying

Which is of course impossible.

The compiler does not know what the program wants at compile time.

Does it want 32 bits or 64 bits?

Only the program knows at runtime !

Depends on the situation.

Bye,
Skybuck.
 

MooseFET

Because virtual methods are not operator overloading.

Any operator overloading that allows the type of the variable to be
determined at run time most certainly is virtual methods. You need to
look into how it is done.

Operator overloading that is not virtual (i.e., where the variable type can't
be changed at run time) can be inlined. A smart compiler will do this
for small functions.
Writing code such as:

Div( Multiply( Add( A, B ), C ), D )

is impractical and of course slow, ^^^ call overhead.

There is nothing impractical about what you coded. People write code
like that all the time.

A good operator overloading implementation does not even have call overhead.

A non-virtual function can be inlined.
I think that's how Delphi's operator overloading for records works.

^^^ No overhead ^^^

(Not sure though, but I thought that's how it works)

Why don't you know? Go read up on it. You will find out that a lot
of what you are assuming is wrong.

Apparently you see it differently.

I already used these techniques to scale to infinity.

So you're simply wrong about that one.

No, you are the one who is wrong. You are suggesting that there is run
time type determination that is different from the virtual method
dispatch. This is simply false. Get your compiler to spit out the
assembly and look at it. You will see what it really does.
 

J de Boyne Pollard

h> If you want 32 bits, use "int".
h> If you want 64 bits, use "long long".

If one specifically wants 32 bits, use "int32_t"/"uint32_t". If one
specifically wants 64 bits, use "int64_t"/"uint64_t".
 

MooseFET

Absolutely nonsense.

If I want I can write a computer program that runs 32 bit when possible and
64 bit emulated when needed.

My computer program will outperform your "always 64 emulated" program WITH
EASE.

The only problem is that I have to write each piece of code twice.

This statement is incorrect. C, C++, Borland Pascal and its
descendants, and just about every other language I can think of allow
you to declare a new type to be the same as a simple type, allow
conditional compiles, and allow include files. You don't need to have
two copies of the source code.
A 32 bit version and a 64 bit version.

I simply instantiate the necessary object and run it.

Absolutely no big deal.

The only undesirable property of this solution is two code bases.

Your lack of programming language knowledge and experience is definitely
showing.

Right back at you.
 

Ron Natalie

MooseFET said:
This statement is incorrect. C, C++, Borland Pascal and its
descendants, and just about every other language I can think of allow
you to declare a new type to be the same as a simple type, allow
conditional compiles, and allow include files. You don't need to have
two copies of the source code.

Incorrect. C and C++ certainly do not. You can #define or typedef
something that appears to be a type, but they aren't distinct types.
You're just conditionally compiling which type you are using (which
accomplishes what you want). The distinction is an important one:
a typedef isn't separately resolvable from the type it aliases.

All that being said, we have produced versions of our product for
a wide variety of machines in C++ and C and to this day provide
Win32 and Win64 versions. The difference in code is a handful of
conditional compiles and typedefs. We spend more time dealing
with interfaces to other people's products (ESPECIALLY FREAKING
MICROSOFT) who haven't bothered to provide 64-bit versions of
all their interfaces.
 
