How much overhead for a new templated class?

J

jacob navia

I am writing a containers library in C. The generic list container
adds for a new type:

code: 2495 bytes
data: 488 bytes

for a list container containing around 60 entry points.

This means that you will pay 2983 bytes for each instatiation
of the list template (one of the biggest around)

I would be interested to know the corresponding vaue of the
C++ stl. How many bytes of code it will cost you to instantiate
a list template for a new type?

Thanks in advance.
 
I

Ian Collins

I am writing a containers library in C. The generic list container
adds for a new type:

code: 2495 bytes
data: 488 bytes

for a list container containing around 60 entry points.

This means that you will pay 2983 bytes for each instatiation
of the list template (one of the biggest around)

I would be interested to know the corresponding vaue of the
C++ stl. How many bytes of code it will cost you to instantiate
a list template for a new type?\

I'm sure you can check that for yourself, but with g++, about 1K to
instantiate a list<int> and push_back a value.
 
D

Dombo

Op 21-May-12 22:37, jacob navia schreef:
I am writing a containers library in C. The generic list container
adds for a new type:

code: 2495 bytes
data: 488 bytes

for a list container containing around 60 entry points.

This means that you will pay 2983 bytes for each instatiation
of the list template (one of the biggest around)

I would be interested to know the corresponding vaue of the
C++ stl. How many bytes of code it will cost you to instantiate
a list template for a new type?

That depends on the compiler, compiler settings, the type for which the
list template is instantiated and the other types are instantiated.

For example:

#include <list>
#include <string>

class A
{
public:
A(): foo(0) {}
long foo;
};

class B
{
public:
B() {}
std::string bar;
};

int main()
{
std::list<void*> lpv;
std::list<A*> lpa;
std::list<B*> lpb;
std::list<A> la;
std::list<B> lb;

lpv.push_back(new A);
lpa.push_back(new A);
lpb.push_back(new B);
la.push_back(A());
lb.push_back(B());

return 0;
}


If you compile this code Visual C++ 2010 with optimizations enabled all
push_back() method calls result in the same code being called regardless
of the list type (in other words the code is not replicated for each
template instantiation). The code for the constructors is also shared
between std::list<void*>, std::list<A*>, std::list<B*> and std::list<A>,
but not for std::list<B>; std::list<B> has its own constructor code.

My observation is that template instantiation for pointer types often
share code with template instantiations for other pointer types. I.e.
instantiating a list template with another pointer type in itself does
not add code. With non-pointer types only some parts, if any, are shared
between template instantiations. Of course YMMV because of all the
variables involved that can affect the outcome.
 
I

Ian Collins

I am writing a containers library in C. The generic list container
adds for a new type:

code: 2495 bytes
data: 488 bytes

for a list container containing around 60 entry points.

This means that you will pay 2983 bytes for each instatiation
of the list template (one of the biggest around)

I would be interested to know the corresponding vaue of the
C++ stl. How many bytes of code it will cost you to instantiate
a list template for a new type?

Rather than fussing over a few bytes, how about a performance comparison?

Here's a simple (related to some work I have been doing recently) benchmark:

Generate a list of 10,000,000 random longs (call it random)
use that list to instantiate another list (call it list)
sort list
sum and empty list by removing the first entry until empty.

What would be your containers library solution and how does it compare
to this C++ solution?

#include <list>
#include <iostream>
#include <stdint.h>
#include <stdlib.h>

// For system hi-res timer, add your own here.
//
int64_t hiresTime();

const unsigned entries = 10000000;

typedef std::list<long> List;

void fill( List& list )
{
srand48(42);
for( unsigned n = 0; n < entries; ++n )
list.push_back(lrand48());
}

uint64_t get( List& list )
{
uint64_t result(0);
while( !list.empty() )
{
result += list.front();
list.pop_front();
}
return result;
}

double delta( int64_t start, int64_t now )
{
return (now-start) / 1000000000.0;
}

int main()
{
using std::cout;
using std::endl;

List random;

int64_t start = hiresTime();
fill( random );
int64_t now = hiresTime();

cout << "To generate " << delta(start, now) << 'S' << endl;

start = now;
List list(random);
now = hiresTime();

cout << "To create " << delta(start, now) << 'S' << endl;

start = now;
list.sort();
now = hiresTime();

cout << "To sort " << delta(start, now) << 'S' << endl;

start = now;
uint64_t n = get( list );
now = hiresTime();

cout << "To empty " << delta(start, now) << 'S' << endl;

cout << n << endl;
}
 
I

Ian Collins

Rather than fussing over a few bytes, how about a performance comparison?

Here's a simple (related to some work I have been doing recently) benchmark:

Generate a list of 10,000,000 random longs (call it random)
use that list to instantiate another list (call it list)
sort list
sum and empty list by removing the first entry until empty.

I forget to add include error checking for allocation failure.
 
J

Juha Nieminen

jacob navia said:
I would be interested to know the corresponding vaue of the
C++ stl. How many bytes of code it will cost you to instantiate
a list template for a new type?

It depends on which member functions of the templated class you are
calling. The compiler will instantiate only those functions that get
called and skip the rest.

(This is, in fact, not just an optimization, but a *necessity*. The compiler
*must* instantiate only what is being called because some templated classes
actually depend on this behavior.)
 
J

Juha Nieminen

Ian Collins said:
Rather than fussing over a few bytes, how about a performance comparison?

A performance comparison would be quite moot because the biggest bottleneck
in the whole thing is memory allocation. By far. A performance comparison
would only make sense if one version uses more memory allocations than the
other (for example, I have seen "generic" linked lists in C that actually
make two allocations per element rather than the one that std::list makes,
hence basically making the C version twice as slow as std::list).

It can be quite surprising how much time the program spends in memory
allocation alone. For instance, if you instantiate a std::set and add
some tens of millions of elements to it, it will take many seconds on
a typical computer. You'd think that the majority of the time is spent
re-balancing the tree after each insertion. You'd be wrong. Something
like 80-90% of the time is spent allocating memory!

Using a very fast allocator with std::set or std::list can speed it up
quite considerably, if many insertions and deletions are performed.
(The great thing about the C++ standard data containers is that they
support user-created memory allocators, something that's way more
difficult to do in a "generic" C container; as everything else.)
 
J

jacob navia

Le 22/05/12 08:05, Juha Nieminen a écrit :
A performance comparison would be quite moot because the biggest bottleneck
in the whole thing is memory allocation. By far.

Yes. With a fast allocator, my software runs at the same
speed than C++. This due to bottleneck is in:

Memory I/O
Allocation
A performance comparison
would only make sense if one version uses more memory allocations than the
other (for example, I have seen "generic" linked lists in C that actually
make two allocations per element rather than the one that std::list makes,
hence basically making the C version twice as slow as std::list).

My software makes one allocation per element.
It can be quite surprising how much time the program spends in memory
allocation alone. For instance, if you instantiate a std::set and add
some tens of millions of elements to it, it will take many seconds on
a typical computer. You'd think that the majority of the time is spent
re-balancing the tree after each insertion. You'd be wrong. Something
like 80-90% of the time is spent allocating memory!

That number is correct, that's why I decided to use (for single linked
lists)

struct element {
struct element *Next;
char data[1];
};
Using a very fast allocator with std::set or std::list can speed it up
quite considerably, if many insertions and deletions are performed.
(The great thing about the C++ standard data containers is that they
support user-created memory allocators, something that's way more
difficult to do in a "generic" C container; as everything else.)

My software supports user supplied allocators. The default allocator
object is:

typedef struct tagAllocator {
void *(*malloc)(size_t); /* Function to allocate memory */
void (*free)(void *); /* Function to release it */
void *(*realloc)(void *,size_t);/* Function to resize a block of
memory */
void *(*calloc)(size_t,size_t);
} ContainerAllocator;

You can replace it with your own allocator. For instance you can
supply a GC (Boehm's for instance) allocator, or a custom one,
as you wish.
 
I

Ian Collins

A performance comparison would be quite moot because the biggest bottleneck
in the whole thing is memory allocation. By far. A performance comparison
would only make sense if one version uses more memory allocations than the
other (for example, I have seen "generic" linked lists in C that actually
make two allocations per element rather than the one that std::list makes,
hence basically making the C version twice as slow as std::list).

Hence my simple use of std::list, I just wanted some idea how the two
compare (particularly with regard to the sort) before any optimisations
are added.
 
I

Ian Collins

Le 22/05/12 08:05, Juha Nieminen a écrit :

Yes. With a fast allocator, my software runs at the same
speed than C++. This due to bottleneck is in:

Can you provide a simple, unoptimised example? I'm genuinely curios how
the two compare both before and after allocator optimisation.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,994
Messages
2,570,223
Members
46,813
Latest member
lawrwtwinkle111

Latest Threads

Top