sizeof([ALLOCATED MEMORY])

  • Thread starter ballpointpenthief
  • Start date
K

Keith Thompson

Howard Hinnant said:
I think that's an excellent idea. Perhaps there is some refinement
still to be done here. But the basic idea is fundamentally a win-win.

Thanks. On the other hand, there's always the danger of creeping
featurism. The potential complexity of a memory allocation system is
nearly unbounded. For example, it might sometimes make sense to make
a request like:

I need a chunk of at least 300 bytes of data. I'll probably need
to expand it to 500 bytes later on, so reserve 500 if you can; if
not, I'll settle for 300. I *might* need to expand it to 4000
bytes eventually, so optimize for that case if you can. Tell me
how many bytes you were actually able to allocate. And I'm also
going to need to allocate a 10000 byte chunk; if you can't satisfy
that request immediately, let me exactly what my options are for
freeing other chunks of allocated memory to make that allocation
possible.

(Substitute larger sizes and/or more allocations to make this scenario
realistic on a modern non-embedded system.)

In general, you want to optimize for your program's actual usage
pattern, without necessarily knowing in advance what that's going to
be. Just defining the data structures to describe what you want and
what the system can give you would be a daunting task, let alone
actually implementing it. There has to be a tradeoff between having a
simple interface and adding more functionality.

One could argue that the C's current malloc/realloc/free interface is
the best tradeoff. A malloc/realloc/xrealloc/free interface would
certainly provide some options that don't currently exist, but it's
not clear that it would be worth the added complexity (or that it
wouldn't).
 
H

Howard Hinnant

Keith Thompson said:
Just defining the data structures to describe what you want and
what the system can give you would be a daunting task, let alone
actually implementing it. There has to be a tradeoff between having a
simple interface and adding more functionality.

One could argue that the C's current malloc/realloc/free interface is
the best tradeoff. A malloc/realloc/xrealloc/free interface would
certainly provide some options that don't currently exist, but it's
not clear that it would be worth the added complexity (or that it
wouldn't).

Excellent points all.

I propose that a good saddle point between complexity and simplification
is somewhere near:

1. How much memory did you actually give me?

2. You gave me a N-byte block (and thanks). I'm interested in
expanding it (in-place) to at a minimum of (N+Nmin) bytes, and perhaps
to a maximum of (N+Nmax) bytes. Can you help me do that?

3. You gave me a N-byte block (and thanks). I no longer need a large
part of this memory and would like to recycle it. Can you shrink this
block in-place such that it is no smaller than M bytes?

My proposal focuses more on the immediate, and not so much on the "what
I might change my mind to in the future". Imho, the client is better
informed to predict future needs and should pose its questions such that
it hides those details from the allocation system and just asks more
specific questions.

-Howard
 
K

Kelsey Bjarnason

Kelsey Bjarnason said:
[snips]

The data structure as described above has an efficiency concern:
Repeated growth in size can lead to quadratic expense due to continually
reallocating for a slightly larger size.

So don't do that. In simplest terms, don't add, multiply.

Did you read my entire post or just stop here?

-Howard

I read the entire thing, but it seemed little better than the original
post; it completely fails to grasp, despite the discussion of multipliers,
that they're the actual solution to the problem at hand, and it fails to
grasp the obvious point that if you need N bytes, allocate N bytes, don't
allocate n-m and pray.

It also failed to grasp a trivial solution if it's the allocations, rather
than the consumption, that is the problem: allocate once, into a single
large buffer, and dole out parcels of it yourself.

So, basically, it was a lot of text, no content. What part did I miss?
 
R

Robert Latest

If I were going to suggest a new feature to address this (perhaps
straying off-topic a bit), I'd probably propose a new function similar
to realloc(), except that it's guaranteed not to relocate the object.
Let's call it xrealloc() (not a great name, but it will do for now).
If realloc() would have expanded or contracted the object in place,
xrealloc() is equivalent to realloc(). If realloc() would have
relocated the object, xrealloc() fails. (We wouldn't require
xrealloc() to succeed in *exactly* the same circumstances where
realloc() would expand or shrink the object in place, but it would
usually work out that way.)

If I've allocated space for 300 bytes, and I find that I need 302
bytes, I can call xrealloc(), asking for just 302 bytes. If this
succeeds, I'm done, and the call was probably very cheap. If it
fails, I can then do a more expensive realloc() call, requesting 400
or 500 bytes to leave room for future growth. And of course existing
code can ignore xrealloc() and continue to work as it always has.

This would allow an implementation to allocate more memory than you
ask for, while still permitting it to take back some of the extra
space if it needs it.

Well, I really can't see much of a point in this. Either you need
memory or you don't. Your proposed xrealloc() thing would make
programs behave like on a bazaar: "I'd like 1000 bytes more
memory" -- "Sure, it's gonna cost you though. But you can have
300 cheap if you want". Now what can a sensible program make of
that information?

The real issue is that you can optimize memory usage only if you
know the memory management strategy of your implementation. I
can't think there of a sensible way to learn much about this from
within a C program, let alone portably. Even if it existed -- if
memory performance was such a big issue with a particular
program, I woudn't leave it up to the implementation to decide on
strategies but rather write platform-specific modules based on
solid knowledge of the underlying mechanisms.

Graphics programs come to mind as software that needs hight
memory handling performance. If you ever used Adobe Photoshop
(Windows) and Gimp (Linux) on the same hardware you know what I'm
talking about. Photoshop is incredibly fast on images that nearly
bring the system down when opened in Gimp. I don't think that
Windows' memory system is that much better; I think Photoshop has
a dedicated, hightly optimized memory handling system built in.

robert
 
H

Howard Hinnant

Robert Latest said:
Well, I really can't see much of a point in this. Either you need
memory or you don't. Your proposed xrealloc() thing would make
programs behave like on a bazaar: "I'd like 1000 bytes more
memory" -- "Sure, it's gonna cost you though. But you can have
300 cheap if you want". Now what can a sensible program make of
that information?

I think I see the disconnect.

Some of us seem to be discussing writing custom code for a known task at
hand. Some of us are even discussing writing custom versions of malloc:
It also failed to grasp a trivial solution if it's the allocations, rather
than the consumption, that is the problem: allocate once, into a single
large buffer, and dole out parcels of it yourself.

What I am attempting to communicate (and obviously not doing a good job
of it) is writing a reusable (or semi-reusable) library in C that models
a dynamic array. For the definition of dynamic array I'm using that
found in Wikipedia, which does a much better job of describing it than
I've done here:

http://en.wikipedia.org/wiki/Dynamic_array

If you go through the exercise of writing a reusable dynamic array in C
(as people do, e.g.
http://geta.life.uiuc.edu/~gary/programs/CRITICA/critica105b/Array.c ),
then you eventually end up with code that looks something like
"addToArray" found at the above link:

void
addToArray (Array array, void *element)
{
<snip>
if (array->elements == array->maxSize) {
array->maxSize *= 2;
array->array = (void **) realloc (array->array, array->maxSize
* sizeof (void *));
<snip>
}
}

(a partial listing in an attempt to respect the author's GPL copyright).

What reasonable steps can the C standard take to help the average C
programmer write a more efficient version of "addToArray"? Common
implementations of malloc/realloc/free have properties that could easily
be exploited here if only those properties were exposed (e.g. how much
free memory currently resides after the allocation pointed to by
array->array?)

The dynamic array data structure is so prevalent in software design and
use today that it warrants this kind of attention to detail.

-Howard
 
B

Ben Pfaff

Howard Hinnant said:
(a partial listing in an attempt to respect the author's GPL copyright).

The GPL[*] encourages distribution of source code, so it's a
little weird to consider a partial listing as a way of respecting
it.

[*] Which is a license, not a copyright.
 
B

Ben Pfaff

Howard Hinnant said:
If you go through the exercise of writing a reusable dynamic array in C
(as people do, e.g.
http://geta.life.uiuc.edu/~gary/programs/CRITICA/critica105b/Array.c ),
then you eventually end up with code that looks something like
"addToArray" found at the above link:

I kind of like the x2nrealloc function that some GNU software
uses. It's a little more flexible and simpler to use than the
typical dynamic array. Here's the documentation:

/* If P is null, allocate a block of at least *PN such objects;
otherwise, reallocate P so that it contains more than *PN objects
each of S bytes. *PN must be nonzero unless P is null, and S must
be nonzero. Set *PN to the new number of objects, and return the
pointer to the new block. *PN is never set to zero, and the
returned pointer is never null.

Repeated reallocations are guaranteed to make progress, either by
allocating an initial block with a nonzero size, or by allocating a
larger block.

In the following implementation, nonzero sizes are doubled so that
repeated reallocations have O(N log N) overall cost rather than
O(N**2) cost, but the specification for this function does not
guarantee that sizes are doubled.

Here is an example of use:

int *p = NULL;
size_t used = 0;
size_t allocated = 0;

void
append_int (int value)
{
if (used == allocated)
p = x2nrealloc (p, &allocated, sizeof *p);
p[used++] = value;
}

This causes x2nrealloc to allocate a block of some nonzero size the
first time it is called.

To have finer-grained control over the initial size, set *PN to a
nonzero value before calling this function with P == NULL. For
example:

int *p = NULL;
size_t used = 0;
size_t allocated = 0;
size_t allocated1 = 1000;

void
append_int (int value)
{
if (used == allocated)
{
p = x2nrealloc (p, &allocated1, sizeof *p);
allocated = allocated1;
}
p[used++] = value;
}

*/
 
?

=?iso-8859-1?q?Ion_Gazta=F1aga?=

Ben Pfaff(e)k dio:
I kind of like the x2nrealloc function that some GNU software
uses. It's a little more flexible and simpler to use than the
typical dynamic array. Here's the documentation:

The C reallocation feature, in my opinion, misses some important points
that make memory allocation suboptimal, and disallows some C++
features:

-> Not all objects stored in the buffer can be binary copied. An struct
can have a pointer to itself, for example. This problem is obvious when
using C allocation functions to build C++ allocators for non POD
objects. Realloc binary copies automatically data, so we can't use
realloc for non binary copyable objects. We need a memory
reallocation/allocation function that has an option to disable data
copying.

-> We can't specify both a minimum size for allocation/reallocation and
a preferred size. We have to guess a reallocation size (for example,
doubling the size). Most of the times, we need to insert N extra
objects in a buffer with S objects, and currently we call realloc
doubling the size -> S*2. However, maybe the current block can be
expanded between N and S*2. Obviously, we prefer an expansion than a
reallocation:

allocate_at_least(p
,S+N /*min size*/
,S*2 /*preferred size*/
&allocated);

The meaning: if the current block can be expanded at least to S+N, do
it, otherwise try to allocate S*2, otherwise, allocate S+N, otherwise
return 0. Checking for expansion is very cheap, and if the next block
is free with a size between N and S, we can avoid the fragmentation and
reuse that space. This makes buffer expansion more probable, minimizes
allocations and improves performance.

-> In many realloc implementations, we can't expand backwards. Imagine
the following common situation, after some allocation/deallocation
operations:

| free1 | current_buffer | free2 |

free1 and free2 are not big enough to hold S+N elements, but free1 +
current_buffer + free2 is big enough. This reallocation is very fast
(surely constant time) in most implementations (for example using a
doubly linked list of memory blocks). Apart from this, locality is
improved since the previous block can be in the same memory page. Less
size overhead, less fragmentation, more locality and avoiding an
unneeded allocation. Unconditional backwards reallocation can disallow
any reallocation for complex c++ objects (for example, objects whose
constructor can throw) if we need strong exception guarantee. So I
would make backwards expansion optional.

----

The overhead of memory allocation is in many known applications the
biggest bottleneck. Minimizing unneeded allocations and using expansion
possibilities will reduce memory usage, will improve locality and will
improve speed.

Regards,

Ion
 
M

Malcolm

Howard Hinnant said:
Some of us seem to be discussing writing custom code for a known task at
hand. Some of us are even discussing writing custom versions of malloc:
I've got a whole armoury of memory allocation routines.
What I am attempting to communicate (and obviously not doing a good job
of it) is writing a reusable (or semi-reusable) library in C that models
a dynamic array. For the definition of dynamic array I'm using that
found in Wikipedia, which does a much better job of describing it than
I've done here:

http://en.wikipedia.org/wiki/Dynamic_array

If you go through the exercise of writing a reusable dynamic array in C
(as people do, e.g.
http://geta.life.uiuc.edu/~gary/programs/CRITICA/critica105b/Array.c ),
then you eventually end up with code that looks something like
"addToArray" found at the above link:

void
addToArray (Array array, void *element)
{
<snip>
if (array->elements == array->maxSize) {
array->maxSize *= 2;
array->array = (void **) realloc (array->array, array->maxSize
* sizeof (void *));
<snip>
}
}

(a partial listing in an attempt to respect the author's GPL copyright).

What reasonable steps can the C standard take to help the average C
programmer write a more efficient version of "addToArray"? Common
implementations of malloc/realloc/free have properties that could easily
be exploited here if only those properties were exposed (e.g. how much
free memory currently resides after the allocation pointed to by
array->array?)

The dynamic array data structure is so prevalent in software design and
use today that it warrants this kind of attention to detail.
Basically you don't do it like that.
The problem is that the generic AddToArray() function has an interface which
is too clunky considering the triviality of the underlying algorithm.

The answer is that the array will represent something in the real world,
which can be encapsulated.

eg
/* keep these opaque */
typedef struct
{
char *name;
char *title;
float salary;
} EMPOYEE;

typedef struct
{
EMPLOYEE *employees;
int Nemployees;
} PAYROLL;

Expose these

int getNemployees(PAYROLL *p);
void addemployee(PAYROLL *p, char *name, char *title);
void setsalary(PAYROLL *p, char *name, float salary);
void runpayroll(PAYROLL *p);

Now we need to decide at a general level what the program will do should it
run out of memory. Maybe we want to terminate with an error message, maybe
we want to pass back a flag, maybe we want to silently suppress the error.
The place to put this logic in is after the call to realloc - and messing
about with a general function means that we need to set error handling
strategies and so forth, and the whole thing becomes very difficult to use.

There could also be other errors, such as two employees with the same name,
or unrecognised job titles. Again, it makes sense to consider what to do at
more or less the same level.

Now let's say that our program runs too slowly, because of the continuous
reallocation of massive list of employees. Again, this is the level at which
to tackle the problem, and move to a tree or linked list structure, or maybe
store in extra capacity. Again, if we are running into efficiency problems,
the solution will be determined by the other characteristics of the problem
at hand - do we need to sort the employees, is it important to allow fast
random access, do we delete as well as add employees?
 
H

Howard Hinnant

The dynamic array data structure is so prevalent in software design and
use today that it warrants this kind of attention to detail.
Basically you don't do it like that.[/QUOTE]

Negating the usefulness of the dynamic array is unconvincing in the
extreme.

There are many useful data structures in modern software design. The
dynamic array is not only one of those useful data structures, it is one
of the most often used. That is not to say that it is a silver bullet
or anything like that. Indeed other data structures are often the right
choice, including the fixed size (even if the size is selected at
runtime) array.

But the dynamic array is an extremely useful data structure. It is not
always called "dynamic array". "Introduction to Algorithms" by Cormen,
Leiserson and Rivest (a very well respected book) refers to this data
structure as "dynamic table". Whatever you call it, it is a valuable
tool in the programmer's toolbox.

-Howard
 
R

Robert Latest

On 6 May 2006 03:08:17 -0700,
in Msg. said:
The overhead of memory allocation is in many known applications the
biggest bottleneck. Minimizing unneeded allocations and using expansion
possibilities will reduce memory usage, will improve locality and will
improve speed.

Absolutely, but the C language (and the C standard) is not the place for
it. If the the performance of your app hinges on efficient memory
managent, you must implement it yourself (or use an existing library).
And that is a good thing.

robert
 
R

Robert Latest

On 6 May 2006 03:08:17 -0700,
in Msg. said:
allocate_at_least(p
,S+N /*min size*/
,S*2 /*preferred size*/
&allocated);


The implementation of realloc is perfectly free to anticipate future
buffer growth in this way. Leave optimization to the implementation,
don't mandate it in the Standard.

robert
 
R

Robert Latest

On Fri, 05 May 2006 14:26:23 GMT,
in Msg. said:
What reasonable steps can the C standard take to help the average C
programmer write a more efficient version of [...]

It can't, and it shouldn't. Efficiency is beyond the scope of the C
language definition.

robert
 
?

=?iso-8859-1?q?Ion_Gazta=F1aga?=

Absolutely, but the C language (and the C standard) is not the place for
it. If the the performance of your app hinges on efficient memory
managent, you must implement it yourself (or use an existing library).
And that is a good thing.

Why not? If the C standard can offer a better mechanism than the
actual, it's a perfect place to implement it, IMHO.

Regards,

Ion
 
?

=?iso-8859-1?q?Ion_Gazta=F1aga?=

allocate_at_least(p
The implementation of realloc is perfectly free to anticipate future
buffer growth in this way. Leave optimization to the implementation,
don't mandate it in the Standard.

The implementation can't know what the application needs. If you want
to insert N new objects, you have 2 choices:

-> Realloc Current + N
-> Realloc Current*2

If you choose the first you have more chances to success, but you will
suffer a lot of more memory allocation because the growing factor will
be slow. If you choose the second, you are trying to avoid
reallocations but you have less chances to succed in expansion. The
implementation can never know what's the best option.

If proposed approach, the implementation is also free to anticipate.
You request between S+N and S*2 but the implementation is free to
allocate more if it wants so. Remember that the user obtains the
reallocated size.

I have implemented this approach in C++ and I can really tell you that
there is a big performance win in some applications. Obviously, you
might argue that this is not adequate for the standard. In my opinion,
however, a good memory allocation interface is a basic building block
for a language. Specially for C, since it's a language concerned with
speed and resource usage.

Regards,

Ion
 
H

Howard Hinnant

Robert Latest said:
On Fri, 05 May 2006 14:26:23 GMT,
in Msg.
What reasonable steps can the C standard take to help the average C
programmer write a more efficient version of [...]

It can't, and it shouldn't. Efficiency is beyond the scope of the C
language definition.

No one is discussing mandated efficiency.

The proposal is an expanded interface giving the programmer more
information about the runtime state of his program at little or no cost.
The information does not need to be generated, it already exists. It's
just that there is currently no interface with which to obtain this
information. It has been shown that this extra information can have a
positive impact on program performance (not standard library efficiency)
if utilized. Programs that do not need or want this extra information
are free to ignore it at no cost.

If you mean that an efficient library interface is beyond the scope of
the C language definition, I think you do the creators of this
definition a disservice. Personally I'm impressed that so many aspects
of C have withstood the test of time over the past 30 years. Given all
of the advances in software development in this time frame, it could not
have done so well without a fundamentally sound and useful interface.

I believe only a small part of this interface needs a minor tweak to
keep up with changing times.

-Howard
 
S

Stephen Sprunk

Ion Gaztañaga said:
The implementation can't know what the application needs. If you want
to insert N new objects, you have 2 choices:

-> Realloc Current + N
-> Realloc Current*2

If you choose the first you have more chances to success, but you will
suffer a lot of more memory allocation because the growing factor will
be slow. If you choose the second, you are trying to avoid
reallocations but you have less chances to succed in expansion. The
implementation can never know what's the best option.

If proposed approach, the implementation is also free to anticipate.
You request between S+N and S*2 but the implementation is free to
allocate more if it wants so. Remember that the user obtains the
reallocated size.

The reality is that most C implementations only allocate memory in
power-of-two sizes, so if you do many realloc()s with a size of current+N,
the vast majority of your calls will return within a few instructions.
I have implemented this approach in C++ and I can really tell you that
there is a big performance win in some applications. Obviously, you
might argue that this is not adequate for the standard. In my opinion,
however, a good memory allocation interface is a basic building block
for a language. Specially for C, since it's a language concerned with
speed and resource usage.

All this suggestion adds to any modern implemenation is a reduction in the
number of calls to realloc(), not the amount of real work done by the user's
code or by the implementation code. If all those no-op calls actually have
a measurable impact on your application's performance, then you're free to
either tinker with the implementation's code or to insert another layer on
top of the C interface that allows you to do away with them.

Standard C provides an interface that works very well in the vast majority
of cases, but if you're not happy with it, C also allows you to do your own
thing in defiance of the standard. That's the beauty (and ugliness) of C --
you're free to ignore it and do something different if needed.

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin


*** ***
 
?

=?iso-8859-1?q?Ion_Gazta=F1aga?=

The reality is that most C implementations only allocate memory in
power-of-two sizes, so if you do many realloc()s with a size of current+N,
the vast majority of your calls will return within a few instructions.

In the implementations I know, they use mixed algorithms with
merging/coalescing blocks that are not power of two. For example Doug
Lea allocator uses mixed algorithms including sizes that are not power
of two. Power of two only allocation (Kingsley type allocation) leads
to a huge memory waste, since we would waste the half of the memory in
internal fragmentation.
All this suggestion adds to any modern implemenation is a reduction in the
number of calls to realloc(), not the amount of real work done by the user's
code or by the implementation code. If all those no-op calls actually have
a measurable impact on your application's performance, then you're free to
either tinker with the implementation's code or to insert another layer on
top of the C interface that allows you to do away with them.

If I use a minimum size/preferred size approach I avoid real work. If
minimum size expansion is successful, I avoid an allocation + copy (and
fragmentation). So I'm not saving only calls but also new allocations +
copies. I'm also saving memory. It's always better to have a
minimum_size expansion than request a preferred_size (current*2)
allocation + a copy.
Standard C provides an interface that works very well in the vast majority
of cases, but if you're not happy with it, C also allows you to do your own
thing in defiance of the standard. That's the beauty (and ugliness) of C --
you're free to ignore it and do something different if needed.

You are right, but the changes in the implementation are minor. All the
information is already there: Realloc has to see if it can expand the
memory to preferred_size, so it knows current size and the maximum
expansion size. If this fails, it just has to try the same minimum
size. If that fails, it will try to allocate new memory. The size of
the block is there, so it can return it easily. Implementation of the
proposal is very easy, once we have realloc. We can change realloc in a
day to obtain this. However, programming your own allocation algorithm
is a huge work, since allocation/reallocations implementations are
complex to program and realloc will be perfectly tuned by vendors. I'm
proposing just a minor interface to get the best performance of the
already stored memory information.

Regards,

Ion
 
R

Robert Latest

Specially for C, since it's a language concerned with
speed and resource usage.

No it isn't: (n869.txt is a late draft of C99)

$ for f in {speed,efficiency,performance,resource}; do grep $f n869.txt; done
exceptions. The programmer can achieve the efficiency of
$

robert
 
R

Robert Latest

[ a lot of good stuff ]

You are right, and you know a lot about the subject of memory
allocation. But I think it would be wrong to burden the language
itself with these matters. Like I said before, an application
whose performance *really* depends critically on the details of
memory management should use a dedicated library that is
fine-tuned to deal with the problem at hand. The proposed
"minimum -- optimum" extension of realloc could potentially help
in this matter because it would be possible for an implementation
to provide performance-optimized memory management with this
function, but since the Standard couldn't possibly make a
statement about the performance gein, implementations would be
free to implement it as a trivial wrapper around realloc(). Which
would have gained you exactly nothing.

So if you don't need the performance just use realloc. If you
really need it, get a dedicated (and probably platform-specific)
library to do it for you.

robert
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,183
Messages
2,570,967
Members
47,517
Latest member
Andres38A1

Latest Threads

Top