reserve() for map

R

Ralf Goertz

Hi,

I frequently have to read in very many elements (about 60 Million) of a
map. I assume that using insert() or the []-operator leads to many
allocations of memory which I understand to be slow. Is there a way to
tell the map in advance to allocate enough memory for all elements? Do I
need to write my own allocator or can I use allocate() of the standard
allocator?
 
Ö

Öö Tiib

Hi,

I frequently have to read in very many elements (about 60 Million) of a
map. I assume that using insert() or the []-operator leads to many
allocations of memory which I understand to be slow. Is there a way to
tell the map in advance to allocate enough memory for all elements? Do I
need to write my own allocator or can I use allocate() of the standard
allocator?

There are tons of ways to avoid using map if that is causing
performance issues. Most obvious is to use vector of pairs. If later
linear search is too slow use std::sort or better to sort and
std::lower_bound to binary search.
 
R

Ralf Goertz

Öö Tiib said:
Hi,

I frequently have to read in very many elements (about 60 Million) of a
map. I assume that using insert() or the []-operator leads to many
allocations of memory which I understand to be slow. Is there a way to
tell the map in advance to allocate enough memory for all elements? Do I
need to write my own allocator or can I use allocate() of the standard
allocator?

There are tons of ways to avoid using map if that is causing
performance issues. Most obvious is to use vector of pairs. If later
linear search is too slow use std::sort or better to sort and
std::lower_bound to binary search.

The map was chosen because it is used in different threads where one
thread may erase an element which is irrelevant for the other threads
but which does not invalidate iterators. I wouldn't know how to do that
with vector. A list would be the other option where iterators remain
valid but then the same question as in the OP arises.
 
A

Alf P. Steinbach

Öö Tiib said:
Hi,

I frequently have to read in very many elements (about 60 Million) of a
map. I assume that using insert() or the []-operator leads to many
allocations of memory which I understand to be slow. Is there a way to
tell the map in advance to allocate enough memory for all elements? Do I
need to write my own allocator or can I use allocate() of the standard
allocator?

There are tons of ways to avoid using map if that is causing
performance issues. Most obvious is to use vector of pairs. If later
linear search is too slow use std::sort or better to sort and
std::lower_bound to binary search.

The map was chosen because it is used in different threads where one
thread may erase an element which is irrelevant for the other threads
but which does not invalidate iterators.

You need exclusive access to the map for the whole of that operation.

Removing an element may adjust several pointers in the internal data structure
(usually a red-black tree), involving elements used by other threads.

If you're not locking - bang.

I wouldn't know how to do that with vector.

A simple solution is to set a boolean flag in the item. Depends how many
elements will usually be set as "to be ignored".

A list would be the other option where iterators remain
valid but then the same question as in the OP arises.

I'm guessing that you could always try block insertions, if std::map supports
that (I'm pretty sure that it does but check). Read e.g. 2048 elements into a
vector, then block insert into the map, which /might/ allow it to do things more
efficiently. Measurement is the important thing, though.


Cheers & hth.,

- Alf
 
J

Juha Nieminen

Ralf Goertz said:
I frequently have to read in very many elements (about 60 Million) of a
map. I assume that using insert() or the []-operator leads to many
allocations of memory which I understand to be slow. Is there a way to
tell the map in advance to allocate enough memory for all elements? Do I
need to write my own allocator or can I use allocate() of the standard
allocator?

Try using this allocator:

http://warp.povusers.org/FSBAllocator/

(You don't need to tell the allocator anything, just use it as the
allocator for your map. You will probably notice a singificant improvement
in allocation performance.)
 
R

Ralf Goertz

Juha said:
Try using this allocator:

http://warp.povusers.org/FSBAllocator/

(You don't need to tell the allocator anything, just use it as the
allocator for your map. You will probably notice a singificant improvement
in allocation performance.)

Hm, this gave me a segfault upon termination of the program.
Interestingly, the segfault seems to have appeared in a
boost::unordered_set for which I hadn't used the FSBAllocator (it didn't
compile when I tried it for a boost::unordered_map, so I didn't try it
for unordered_set). This is the backtrace:

#0 0x00007fd195ea5efd in ?? () from /lib64/libc.so.6
#1 0x00007fd195ea75d8 in ?? () from /lib64/libc.so.6
#2 0x00007fd195eaa96c in free () from /lib64/libc.so.6
#3 0x000000000040e6f8 in deallocate (this=<value optimized out>,
__p=<value optimized out>)
at /home/usr/bin/../lib64/gcc/../../include/c++/4.4/ext/new_allocator.h:95
#4 delete_buckets (this=<value optimized out>, __p=<value optimized out>)
at /usr/include/boost/unordered/detail/buckets.hpp:101
#5 ~hash_buckets (this=<value optimized out>, __p=<value optimized out>)
at /usr/include/boost/unordered/detail/buckets.hpp:135
#6 ~hash_table (this=<value optimized out>, __p=<value optimized out>)
at /usr/include/boost/unordered/detail/fwd.hpp:492
#7 ~hash_unique_table (this=<value optimized out>, __p=<value optimized out>)
at /usr/include/boost/unordered/detail/fwd.hpp:604
#8 boost::unordered_set<std::basic_string<wchar_t, std::char_traits<wchar_t>,
std::allocator<wchar_t> >, boost::hash<std::basic_string<wchar_t,
std::char_traits<wchar_t>, std::allocator<wchar_t> > >,
std::equal_to<std::basic_string<wchar_t, std::char_traits<wchar_t>,
std::allocator<wchar_t> > >, std::allocator<std::basic_string<wchar_t,
std::char_traits<wchar_t>, std::allocator<wchar_t> > > >::~unordered_set
(this=<value optimized out>, __p=<value optimized out>)
at /usr/include/boost/unordered/unordered_set.hpp:154
#9 0x00007fd195e68065 in ?? () from /lib64/libc.so.6
#10 0x00007fd195e680b5 in exit () from /lib64/libc.so.6
 
B

Brian

Öö Tiib wrote:

The map was chosen because it is used in different threads where one
thread may erase an element which is irrelevant for the other threads
but which does not invalidate iterators. I wouldn't know how to do that
with vector. A list would be the other option where iterators remain
valid but then the same question as in the OP arises.

Boost has a stable_vector class.
http://bannalia.blogspot.com/2008/09/introducing-stablevector.html


Brian Wood
http://webEbenezer.net
(651) 251-9384
 
J

Juha Nieminen

Ralf Goertz said:
Hm, this gave me a segfault upon termination of the program.
Interestingly, the segfault seems to have appeared in a
boost::unordered_set for which I hadn't used the FSBAllocator (it didn't
compile when I tried it for a boost::unordered_map, so I didn't try it
for unordered_set).

unordered_map/unordered_set usually don't need specialized allocators
because the usually already use arrays (rather than allocating individual
elements), which is efficient.

Are you using multithreading? Note that FSBAllocator is not thread-safe
by default (although thread-safety can be turned on).

A minimal program which exhibits the problem would be nice.
 
R

Ralf Goertz

Juha said:
unordered_map/unordered_set usually don't need specialized allocators
because the usually already use arrays (rather than allocating individual
elements), which is efficient.

Are you using multithreading? Note that FSBAllocator is not thread-safe
by default (although thread-safety can be turned on).

Yes I read that. No for this particular test I didn't use mt. I just
wanted to see whether inserting would be faster, after all pairs are
read I returned from main().
A minimal program which exhibits the problem would be nice.

I will try that but probably not before Monday.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,146
Messages
2,570,832
Members
47,374
Latest member
EmeliaBryc

Latest Threads

Top