Optimal buffer growth

Angel Tsankov

Hi,

I remember reading in a book (or in an article) that the optimal buffer
growth factor is about 1.6. Now I need to find this book but I can't
remember its title. Can someone help me with this?
 
acehreli

Angel said:
Hi,

I remember reading in a book (or in an article) that the optimal buffer
growth factor is about 1.6. Now I need to find this book but I can't
remember its title. Can someone help me with this?

I think you are thinking of Andrew Koenig.

Search for the thread titled "vector growth factor of 1.5" at
groups.google.com for a discussion of it.

Ali
 
Kai-Uwe Bux

Angel said:
I remember reading in a book (or in an article) that the optimal buffer
growth factor is about 1.6. Now I need to find this book but I can't
remember its title. Can someone help me with this?

What do you mean by "optimal"? It's a space-time tradeoff. If you increase
the multiplication factor, you decrease the expected number of move or copy
operations but you increase the expected memory overhead.

Assuming that the capacity multiplies by a factor of d > 1 at each
reallocation, I get (using Benford's law, i.e., assuming that the final
size is distributed log-uniformly):

a) The expected ratio size() / capacity() is:

d - 1
--------- (for d = 2, approx = 0.72)
d ln(d)


b) Filling the vector by a series of push_back() operations will involve on
average

1
------- (for d = 2, approx = 1.44)
ln(d)

moves per object arising from reallocating the vector (i.e., not including
initial copy construction of the object into the vector).


For d = 1.6, that would be an expected memory usage of 80% and each object
would be reallocated 2.13 times on average.
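
For anyone who wants to check those numbers, here is a quick Monte Carlo
sketch (my own illustration, not code from any book; the file name is
arbitrary). It draws the final size log-uniformly, which is exactly the
Benford-style assumption above, grows the capacity by d until it covers
that size, and averages the two quantities:

-------------- growth.cc ----------------
#include <cmath>
#include <cstdlib>
#include <iostream>

int main()
{
    double const d = 1.6 ;            // growth factor under test
    int const trials = 100000 ;
    double ratioSum = 0.0 ;           // accumulates size() / capacity()
    double movesSum = 0.0 ;           // accumulates moves per object
    for ( int t = 0 ; t < trials ; ++ t ) {
        // Final size, log-uniform between 1e3 and 1e6 (the Benford
        // assumption).
        double n = std::pow( 10.0,
                             3.0 + 3.0 * std::rand() / (double)RAND_MAX ) ;
        double cap = 1.0 ;
        double moves = 0.0 ;
        while ( cap < n ) {           // each reallocation moves cap objects
            moves += cap ;
            cap *= d ;
        }
        ratioSum += n / cap ;
        movesSum += moves / n ;
    }
    std::cout << "E[size/capacity] = " << ratioSum / trials
              << "  (formula: " << (d - 1.0) / (d * std::log( d )) << ")\n"
              << "E[moves/object]  = " << movesSum / trials
              << "  (formula: " << 1.0 / std::log( d ) << ")\n" ;
    return 0 ;
}
----------------------------------------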


Best

Kai-Uwe Bux
 
Angel Tsankov

Hi,

acehreli said:
I think you are thinking of Andrew Koenig.

Search for the thread titled "vector growth factor of 1.5" at
groups.google.com for a discussion of it.

Thanks a lot for the reference; I'll take a look. However, I still need to
track down that book...
 
James Kanze

Kai-Uwe Bux said:
What do you mean by "optimal"? It's a space-time tradeoff. If
you increase the multiplication factor, you decrease the
expected number of move or copy operations but you increase
the expected memory overhead.

The issue was that if the factor is larger than (1+sqrt(5))/2
(roughly 1.618), the memory freed after a reallocation could
never be reused by the vector; if you double the size at each
allocation, the total memory freed until that point will always
be less than the size requested for the new allocation. If the
factor is smaller, you can hope that sooner or later, the
underlying allocator will be able to merge previously filled
blocks, and fulfill the request from them.
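
To see where (1+sqrt(5))/2 comes from: while copying into block k,
block k-1 is still live, so only blocks 0 through k-2 have been freed.
The new block fits into the recycled space once
1 + d + ... + d^(k-2) >= d^k, and for large k that is achievable exactly
when d^2 - d - 1 <= 0, i.e. d <= (1+sqrt(5))/2. A minimal sketch that
just iterates this condition (my illustration, not code from the thread):

-------------- golden.cc ----------------
#include <iostream>

int main()
{
    double const factors[] = { 1.5, 1.6, 1.7, 2.0 } ;
    for ( int f = 0 ; f < 4 ; ++ f ) {
        double d = factors[ f ] ;
        double freed = 0.0 ;    // total size of blocks 0 .. k-2
        double prev = 0.0 ;     // block k-1, still live during the copy
        double cur = 1.0 ;      // block k, the new request
        int k = 0 ;
        while ( k < 100 && freed < cur ) {
            freed += prev ;     // block k-1 is released ...
            prev = cur ;        // ... block k becomes the live one
            cur *= d ;          // ... and block k+1 is requested
            ++ k ;
        }
        std::cout << "d = " << d << ": "
                  << ( freed >= cur
                        ? "freed space can eventually be reused"
                        : "freed space never catches up" )
                  << " (k = " << k << ")\n" ;
    }
    return 0 ;
}
----------------------------------------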
 
Kai-Uwe Bux

James said:
The issue was that if the factor is larger than (1+sqrt(5))/2
(roughly 1.618), the memory freed after a reallocation could
never be reused by the vector; if you double the size at each
allocation, the total memory freed until that point will always
be less than the size requested for the new allocation. If the
factor is smaller, you can hope that sooner or later, the
underlying allocator will be able to merge previously filled
blocks, and fulfill the request from them.

Cool, the golden ratio strikes again.

I have to wonder, though, whether this has a measurable impact (a) on modern
architectures where memory is organized in pages or (b) in typical programs
where one probably has more than one dynamic data structure growing at a
time anyway.


Best

Kai-Uwe Bux
 
James Kanze

Kai-Uwe Bux said:
Cool, the golden ratio strikes again.
I have to wonder, though, whether this has a measurable impact
(a) on modern architectures where memory is organized in pages
or (b) in typical programs where one probably has more than
one dynamic data structure growing at a time anyway.

I don't really know. Nominally, with any of the "classical"
allocation algorithms, if you have one vector which just grows
and grows, it eventually migrates to the end of the free space
arena (because it becomes bigger than any of the holes), and
this factor could possibly affect just how large you could make
it grow. Except, of course, that you'll likely bring the
machine to its knees through paging before that. And that there
are enough additional factors involved that it's not certain
that the rule really changes anything.

Just out of curiosity, during my lunch hour, I wrote a simple
allocator and tested the principles. Using some simple
multipliers, I get the following output:
For 1.10: max size = 389582583 (39.0%)
For 1.20: max size = 389586745 (39.0%)
For 1.30: max size = 513088587 (51.3%)
For 1.40: max size = 477760691 (47.8%)
For 1.50: max size = 419279977 (41.9%)
For 1.60: max size = 432051256 (43.2%)
For 1.70: max size = 360273482 (36.0%)
For 1.80: max size = 307547665 (30.8%)
For 1.90: max size = 328691801 (32.9%)
For 2.00: max size = 268435456 (26.8%)
Change just about any of the parameters, however, and you get
something different: using an initial size of 500 (rather than
128) results in:
For 1.10: max size = 374691238 (37.5%)
For 1.20: max size = 519586870 (52.0%)
For 1.30: max size = 420105341 (42.0%)
For 1.40: max size = 488743519 (48.9%)
For 1.50: max size = 323374783 (32.3%)
For 1.60: max size = 259493561 (25.9%)
For 1.70: max size = 288397093 (28.8%)
For 1.80: max size = 371637543 (37.2%)
For 1.90: max size = 357024500 (35.7%)
For 2.00: max size = 262144000 (26.2%)
Change the size of the arena from 1000000000 to 500000000, and
you get:
For 1.10: max size = 219909214 (44.0%)
For 1.20: max size = 225455293 (45.1%)
For 1.30: max size = 233540550 (46.7%)
For 1.40: max size = 174111040 (34.8%)
For 1.50: max size = 279519985 (55.9%)
For 1.60: max size = 168770022 (33.8%)
For 1.70: max size = 124662105 (24.9%)
For 1.80: max size = 170859814 (34.2%)
For 1.90: max size = 172995685 (34.6%)
For 2.00: max size = 134217728 (26.8%)
For the moment, I'm not sure what one can really conclude :-).

Anyhow, for those interested, here's the code. It uses my
library, but only a few simple things from it, which can easily
be replaced. Also, it's entirely possible that I've got an
error somewhere in it (it was written very quickly), which could
explain the randomness of the results as well.

-------------- fill.cc ----------------
#include <cstdlib>
#include <iostream>
#include <iomanip>
#include <new>
#include "gb/FFmt.hh"
#include "gb/CommandLine.hh"
#include "gb/NumericOption.hh"

Gabi::BoundNumericOption arenaSize( 'a', 1000000000 ) ;
Gabi::BoundNumericOption startSize( 's', 128 ) ;
Gabi::BoundNumericOption intervalCount( 'i', 10 ) ;

class Pool
{
public:
    explicit Pool( size_t size = arenaSize ) ;
    ~Pool() ;
    void* allocate( size_t n ) ;
    void free( void* p ) ;

private:
    struct BlockHeader
    {
        BlockHeader* next ;
        bool isFree ;
    } ;
    void* base ;
    BlockHeader* first ;

    inline void* add( void* p, size_t n )
    {
        return static_cast< char* >( p ) + n ;
    }
    inline size_t diff( void* p1, void* p2 )
    {
        return static_cast< char* >( p2 ) - static_cast< char* >( p1 ) ;
    }
} ;

void
testPool( double ratio )
{
    Pool p ;
    size_t s = startSize ;
    void* v = p.allocate( s ) ;
    while ( v != NULL ) {
        // As with std::vector: the new block must be obtained while
        // the old one still exists (the data have to be copied), so
        // allocate first, then free.
        size_t s2 = (size_t)( ratio * s ) ;
        void* v2 = p.allocate( s2 ) ;
        p.free( v ) ;
        v = v2 ;
        if ( v != NULL ) {
            s = s2 ;
        }
    }
    std::cout << "For " << Gabi::FFmt( 4, 2 ) << ratio
              << ": max size = " << std::setw( 9 ) << s
              << " (" << Gabi::FFmt( 4, 1 ) << 100.0 * s / arenaSize
              << "%)"
              << std::endl ;
}

int
main( int argc, char** argv )
{
    Gabi::CommandLine::instance().parse( argc, argv ) ;
    for ( int i = 1 ; i <= intervalCount ; ++ i ) {
        testPool( 1.0 + i / static_cast< double >( intervalCount.value() ) ) ;
    }
    return 0 ;
}

Pool::Pool( size_t size )
{
    base = std::malloc( size ) ;
    if ( base == NULL ) {
        throw std::bad_alloc() ;
    }
    // One free block covering the whole arena, terminated by a
    // sentinel header which is never free (so coalescing stops there).
    first = static_cast< BlockHeader* >( base ) ;
    BlockHeader* last
        = static_cast< BlockHeader* >(
            add( first, size - sizeof( BlockHeader ) ) ) ;
    first->next = last ;
    first->isFree = true ;
    last->next = NULL ;
    last->isFree = false ;
}

Pool::~Pool()
{
    std::free( base ) ;
}

void*
Pool::allocate( size_t n )
{
    // Make room for the header and round up to a multiple of 8.
    n += sizeof( BlockHeader ) ;
    n = (n + 7) & (static_cast< size_t >( -1 ) << 3) ;
    BlockHeader* result = NULL ;
    BlockHeader* b = first ;
    // First fit, merging runs of adjacent free blocks as we scan.
    while ( result == NULL && b != NULL ) {
        if ( b->isFree ) {
            while ( b->next->isFree ) {
                b->next = b->next->next ;
            }
            if ( diff( b, b->next ) > n ) {
                result = b ;
            }
        }
        b = b->next ;
    }
    if ( result != NULL ) {
        // Split off the unused tail if it can hold another header.
        if ( diff( result, result->next ) > n + sizeof( BlockHeader ) ) {
            BlockHeader* newNext
                = static_cast< BlockHeader* >( add( result, n ) ) ;
            newNext->next = result->next ;
            newNext->isFree = true ;
            result->next = newNext ;
        }
        result->isFree = false ;
        ++ result ;     // user memory starts just past the header
    }
    return result ;
}

void
Pool::free( void* p )
{
    if ( p != NULL ) {
        BlockHeader* b = static_cast< BlockHeader* >( p ) - 1 ;
        b->isFree = true ;
    }
}
----------------------------------------
 
