Allocating large blocks

Francine.Neary

I'm experimenting with a program that manipulates a large matrix, i.e.
a 2-deep array. I started by just allocating this in one go, but later
it occurred to me that, as there are still some very old systems
around where such a large allocation might fail, perhaps it would be
better to split the allocation into many allocations, one for each
row, to increase portability to legacy systems.

The advantage I see is that the memory allocator or operating system
might not be able to find a contiguous block, but several smaller
blocks might be OK. The downside would be that there is likely to be
more overhead in the allocator, and my program itself needs to store
an extra thousand pointers to keep track of the individual rows.

What do people think is best?

Here's some example code in case what I say isn't clear.

#include <stdlib.h>

#define ROWS (1<<10)
#define COLS (1<<20)

/* approach 1 */

static int *M;

void init(void)
{
    if(!(M = malloc(ROWS*COLS*sizeof(int))))
        abort();
}


/* approach 2 */

static int *r[ROWS];

void init(void)
{
    int i;
    for(i = 0; i < ROWS; i++)
        if(!(r[i] = malloc(COLS*sizeof(int))))
            abort();
}
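
If the per-row route is taken, a matching teardown helps; here's a minimal sketch (my addition, not part of the original question) that frees whatever rows were allocated. Since free(NULL) is a no-op, it's safe to call even after a partial failure:

```c
#include <stdlib.h>

#define ROWS (1<<10)

static int *r[ROWS];

/* Free whatever rows were allocated. free(NULL) is a no-op, so this
   is safe to call even if allocation stopped partway through. */
void cleanup(void)
{
    int i;
    for (i = 0; i < ROWS; i++) {
        free(r[i]);
        r[i] = NULL;
    }
}
```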
 
Ian Collins

I'm experimenting with a program that manipulates a large matrix, i.e.
a 2-deep array. I started by just allocating this in one go, but later
it occurred to me that, as there are still some very old systems
around where such a large allocation might fail, perhaps it would be
better to split the allocation into many allocations, one for each
row, to increase portability to legacy systems.

The advantage I see is that the memory allocator or operating system
might not be able to find a contiguous block, but several smaller
blocks might be OK. The downside would be that there is likely to be
more overhead in the allocator, and my program itself needs to store
an extra thousand pointers to keep track of the individual rows.

What do people think is best?
You'll have to try it on your implementation to see which works best for
you.
 
Stephen Sprunk

I'm experimenting with a program that manipulates a large matrix,
i.e. a 2-deep array. I started by just allocating this in one go, but
later it occurred to me that, as there are still some very old
systems around where such a large allocation might fail, perhaps
it would be better to split the allocation into many allocations, one
for each row, to increase portability to legacy systems.

The advantage I see is that the memory allocator or operating
system might not be able to find a contiguous block, but several
smaller blocks might be OK. The downside would be that there
is likely to be more overhead in the allocator, and my program
itself needs to store an extra thousand pointers to keep track of
the individual rows.

What do people think is best? ....
#define ROWS (1<<10)
#define COLS (1<<20)

With these sizes, both approaches will almost certainly fail on all existing
32-bit and smaller systems. OTOH, both will likely work on 64-bit systems.

If you were exaggerating the dimensions, there is a window where the per-row
method will succeed where the single-chunk one will fail, but it depends on
the implementation and is typically fairly narrow.

The biggest advantage of doing it all at once is that you'll find out if
your program will get what it wants in one call to malloc() instead of after
(worst case) ROWS-1 calls.
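
One way to get the best of both (a sketch of mine, not something proposed in the thread; dimensions shrunk so it actually runs on modest machines) is to try the single call first and fall back to per-row allocation only if it fails:

```c
#include <stdlib.h>

#define ROWS 1024
#define COLS 1024   /* illustration only; the original uses (1<<20) */

static int *M;         /* approach 1: one contiguous block */
static int *r[ROWS];   /* approach 2: one pointer per row  */

/* Try one big malloc(); on failure, fall back to per-row allocation.
   Returns 1 on success, 0 if neither approach worked. */
int init_with_fallback(void)
{
    size_t i;

    M = malloc((size_t)ROWS * COLS * sizeof(int));
    if (M)
        return 1;

    for (i = 0; i < ROWS; i++) {
        r[i] = malloc((size_t)COLS * sizeof(int));
        if (!r[i]) {
            while (i > 0)        /* undo the partial allocation */
                free(r[--i]);
            return 0;
        }
    }
    return 1;
}
```

The cost is that the rest of the program has to cope with two possible layouts, which is why most people just pick one.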

S
 
Malcolm McLean

Stephen Sprunk said:
With these sizes, both approaches will almost certainly fail on all
existing 32-bit and smaller systems. OTOH, both will likely work on
64-bit systems.

If you were exaggerating the dimensions, there is a window where the
per-row method will succeed where the single-chunk one will fail, but it
depends on the implementation and is typically fairly narrow.
If you want to allocate about 1GB on a machine with around 2GB installed you
are in that window. That's 1000 * 1 million chars, and correspondingly fewer
ints.
 
SM Ryan

Francine.Neary wrote:
# I'm experimenting with a program that manipulates a large matrix, i.e.
# a 2-deep array. I started by just allocating this in one go, but later
# it occurred to me that, as there are still some very old systems
# around where such a large allocation might fail, perhaps it would be
# better to split the allocation into many allocations, one for each
# row, to increase portability to legacy systems.

Perhaps if you're talking about 8086 or 80286 processors.
At some point you can decide some hardware is too old for
you to support and not worry about. Like I doubt your code
will run on mercury delay lines or ferrite core memories.

Otherwise the fewest stable blocks regardless of size is
most likely to make the memory allocator happy.

# #define ROWS (1<<10)
# #define COLS (1<<20)
#
# /* approach 1 */
#
# static int *M;
#
# void init()
# {
# if(!(M = malloc(ROWS*COLS*sizeof(int))))

On some machines, this will become a single operating system request
to memory map 4 or 8 gigabytes into your program. Efficient,
low overhead. And then with all the array elements contiguous
you can more predictably page the array by how you index it.

# int i;
# for(i = 0; i < ROWS; i++)
#     if(!(r[i] = malloc(COLS*sizeof(int))))
#         abort();

This requires a thousand times as many calls, perhaps each a separate
operating system request. Also the rows will have no predictable
relation to each other, so the paging will be more unpredictable.
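
To make the layout difference concrete, here's a small sketch of mine (tiny dimensions, for illustration only) of how element (i, j) is addressed under each scheme:

```c
#include <stdlib.h>

#define NR 4
#define NC 8   /* tiny dimensions, for illustration only */

static int M[NR * NC];   /* one contiguous block: element (i,j) at M[i*NC + j] */
static int *rows[NR];    /* per-row pointers: element (i,j) at rows[i][j]      */

/* Fill both layouts with the same values; only the addressing differs. */
void fill(void)
{
    int i, j;
    for (i = 0; i < NR; i++) {
        rows[i] = malloc(NC * sizeof(int));
        if (!rows[i])
            abort();
        for (j = 0; j < NC; j++) {
            M[i * NC + j] = i * NC + j;   /* contiguous: row-major walk is sequential */
            rows[i][j]    = i * NC + j;   /* scattered: rows land wherever malloc puts them */
        }
    }
}
```

With the single block, walking the whole matrix in row-major order touches memory strictly sequentially; with per-row pointers, each row change may jump to an unrelated address.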
 
Stephen Sprunk

Malcolm McLean said:
If you want to allocate about 1GB on a machine with around 2GB
installed ...

While I grant it's possible to have that much RAM installed without virtual
memory, it's unlikely, so the physical RAM size is usually irrelevant.
... you are in that window.

Not necessarily. A 1GB single-chunk allocation may work on some 32-bit
machines -- and 1GB broken down into multiple chunks may fail on others.
(Then you get into fun issues like Linux's infamous tendency to
"successfully" return a large block from malloc() and then segfault when you
try to use it...)

But, yes, you are generally correct that 1-2GB is much more likely to
succeed if it's done in chunks; most implementations have "holes" at various
memory locations that make extremely large allocations difficult to
impossible. That's roughly the window I was talking about, though it
varies. However, I've found that folks tend to either be well below that
window, or their data will soon grow to exceed it significantly, making the
optimal behavior in that window mostly irrelevant. If you need to work with
>1GB of data at a time -- or will within a couple years -- you need to give
in and move to a 64-bit system. They're cheap enough these days it's not
worth the time to figure out how to make memory-hungry apps work on 32-bit
boxes.

S
 
christian.bau

#define ROWS (1<<10)
#define COLS (1<<20)

/* approach 1 */

static int *M;

void init(void)
{
    if(!(M = malloc(ROWS*COLS*sizeof(int))))
        abort();
}

Try to print the value of (size_t) (ROWS*COLS*sizeof(int)). You may be
surprised.
 
Francine.Neary

Try to print the value of (size_t) (ROWS*COLS*sizeof(int)). You may be
surprised.

peachy@sam:~$ echo '#include <stdio.h>
#include <stddef.h>
main() { return printf("%llu\n", (size_t)
((1<<10)*(1<<20)*sizeof(int))); }' | gcc -x c -
peachy@sam:~$ ./a.out
4294967296

Doesn't seem all that surprising?
 
Keith Thompson

peachy@sam:~$ echo '#include <stdio.h>
#include <stddef.h>
main() { return printf("%llu\n", (size_t)
((1<<10)*(1<<20)*sizeof(int))); }' | gcc -x c -
peachy@sam:~$ ./a.out
4294967296

Doesn't seem all that surprising?

You're probably using a 64-bit implementation. On a 32-bit
implementation, you most likely would have gotten a result of 0.

There's no clear definition of "64-bit implementation" or "32-bit
implementation". It typically refers to the width of the CPU
registers. In this case, what's relevant is the width of size_t (and
the value of sizeof(int)).
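
A defensive sketch (my addition, not from the thread) that catches the wrap before it happens, by checking against SIZE_MAX before multiplying:

```c
#include <stdlib.h>
#include <stdint.h>

/* malloc() wrapper with an explicit overflow check, so an oversized
   request fails with NULL instead of silently wrapping to a small
   (possibly zero) size on a narrow size_t. */
void *checked_alloc(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;   /* nmemb * size would overflow size_t */
    return malloc(nmemb * size);
}
```

With the original dimensions, checked_alloc((size_t)(1<<10) * (1<<20), sizeof(int)) would return NULL on a 32-bit size_t, rather than handing back a pointer to a zero-byte block.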
 
