On 7/28/2013 5:09 PM, Eric Sosman wrote:
correction: I re-ran the heap-statistics tool; currently the highest
point seems to be 16-31 bytes (followed by 32-47 bytes, ...).
the most common object types are currently "metadata leaves" and
"metadata nodes" (basically, structures related to a hierarchical
database), followed mainly by various other small-structure types.
in any case though, small allocations seem to be pretty common.
Skewed sample problem.
One program doesn't necessarily represent the typical situation. If you have
a tree-like structure which dominates the total number of objects in the
system, you either have lots of allocations of sizeof(NODE), or a few
large allocations of N * sizeof(NODE). (See my book Basic Algorithms
for how to write a fast fixed-block allocator.) It depends on whether
allocation performance is a concern or not.
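Roughly, a fixed-block allocator carves one big allocation into
equal-sized blocks and chains the free ones into a list; a minimal
sketch (block size, pool size, and names here are illustrative, not
taken from the book):

#include <stdlib.h>

/* block size must be >= sizeof(void *) so a free block can hold
   the free-list link; both sizes are arbitrary for the example */
#define BLOCK_SIZE  32
#define POOL_BLOCKS 4096

typedef struct pool {
    unsigned char *slab;   /* one large allocation */
    void *free_list;       /* singly-linked list of free blocks */
} pool;

int pool_init(pool *p)
{
    size_t i;
    p->slab = malloc((size_t)BLOCK_SIZE * POOL_BLOCKS);
    if (!p->slab)
        return -1;
    p->free_list = NULL;
    for (i = 0; i < POOL_BLOCKS; i++) {
        void *blk = p->slab + i * BLOCK_SIZE;
        *(void **)blk = p->free_list;  /* push block onto free list */
        p->free_list = blk;
    }
    return 0;
}

void *pool_alloc(pool *p)
{
    void *blk = p->free_list;
    if (blk)
        p->free_list = *(void **)blk;  /* pop */
    return blk;
}

void pool_free(pool *p, void *blk)
{
    *(void **)blk = p->free_list;      /* push */
    p->free_list = blk;
}

Both alloc and free are O(1) pointer pushes/pops, which is the whole
advantage over a general-purpose malloc when a tree's nodes dominate
the allocation count.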
Some programs have mainly dynamic strings, others mainly fixed fields.
If you're ultimately storing data in a database like SQL, you might as well
write char str[64], because SQL can't handle arbitrarily long strings.
However, if you're not, mallocing strings is generally neater and more robust.
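For the dynamic case it's just measure-allocate-copy; a trivial sketch
(dup_string is a stand-in name, since strdup is POSIX rather than
standard C):

#include <stdlib.h>
#include <string.h>

/* fixed-field version: length is capped by the schema */
struct row {
    char name[64];
};

/* dynamic version: allocate exactly what the string needs */
char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;     /* include the terminating NUL */
    char *p = malloc(n);
    if (p)
        memcpy(p, s, n);
    return p;
}

The dynamic version never truncates and wastes no padding, at the cost
of a malloc/free per string.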
this is not to say that they represent the bulk of the memory usage,
only that they hold top place (as the most-allocated object type).
they represent around 0.87% of the total memory usage (5MB / 576MB),
with an allocation count of around 1.93M.
they are followed by heap-allocated triangles for skeletal models (~ 21k
allocs), terrain-chunk headers (6k allocs), and around 116 other object
types.
I don't have a percentage for object counts; I would have to add these
up and calculate it manually.
yeah, there are heap-allocated strings and symbols in the mix as well,
but they don't hold as high a position.
there were previously lots of individually wrapped int/float/double
values as well, but these have since been moved over to using slab
allocators.
to explain the 32kB spike:
this has to do with the voxel-terrain logic, which has "chunks":
16x16x16 arrays of 8-byte values (voxels). each chunk represents part
of the locally active area in terms of 1-meter cubes, and each voxel
is basically a collection of bit-fields.
there are only about 5826 of them, but in the dump data, these represent
32% of the total memory usage (186MB / 576MB).
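presumably the in-memory layout is something like this (the masks and
field names are guesses; only the 8-byte voxel size and the 16x16x16
layout come from the above):

#include <stdint.h>

/* one voxel: an 8-byte value treated as a collection of bit-fields;
   the particular fields here are invented for illustration */
typedef uint64_t voxel;

#define VOX_TYPE(v)   ((unsigned)((v)         & 0xFFFu))  /* bits 0-11  */
#define VOX_LIGHT(v)  ((unsigned)(((v) >> 12) & 0xFFu))   /* bits 12-19 */

/* one chunk: 16*16*16 voxels * 8 bytes = 32768 bytes = 32kB,
   which is where the 32kB spike comes from */
typedef struct chunk {
    voxel vox[16][16][16];
} chunk;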
there are also serialized voxel regions, which, despite accounting for
only 8 allocations (in the dump), represent 7% of the memory use
(41MB / 576MB). regions store the voxels in an RLE-compressed format,
for parts of the terrain that are not currently active.
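the actual RLE format isn't spelled out here, but the general idea is
something like this (a generic run-length sketch, not the engine's
real format):

#include <stddef.h>
#include <stdint.h>

typedef uint64_t voxel;

/* encode n voxels as (run-length, value) pairs; the output arrays
   must have room for the worst case of n pairs; returns pair count */
size_t rle_encode(const voxel *in, size_t n,
                  uint16_t *run_out, voxel *val_out)
{
    size_t i = 0, pairs = 0;
    while (i < n) {
        voxel v = in[i];
        uint16_t run = 0;
        while (i < n && in[i] == v && run < 0xFFFF) {
            i++;
            run++;
        }
        run_out[pairs] = run;
        val_out[pairs] = v;
        pairs++;
    }
    return pairs;
}

inactive terrain compresses well under this sort of scheme, since
large spans tend to be all air or all the same solid material.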
then there are occasional other large things, like 9 console buffers
which use 2MB (currently for a 160x90 console with 4 bytes for each
character and its formatting).
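(so each cell is presumably something along these lines; the exact
byte split is a guess, only the 4-bytes-per-cell figure is given:)

#include <stdint.h>

/* one console cell: 4 bytes covering the character plus formatting */
typedef struct concell {
    uint8_t ch;    /* character code */
    uint8_t attr;  /* bold/underline/blink/etc. flags */
    uint8_t fg;    /* foreground color */
    uint8_t bg;    /* background color */
} concell;

/* 160*90 cells * 4 bytes = 57600 bytes for one screen of text */
typedef struct console {
    concell cells[90][160];
} console;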
....
note that some data is also stored in RAM in a "compressed" format, such
as audio data for the mixer.
originally, this data was stored in RAM as raw PCM audio, but this was
kind of bulky (audio data can use a lot of RAM at 16-bit 44.1kHz), so I
developed a custom audio codec which allows random-access decompression,
and stores the audio at 176kbps.
now audio is no longer a significant memory user.
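random-access decompression typically falls out of using fixed-size,
independently decodable blocks; a sketch of just the indexing
arithmetic (block size, header size, and names here are assumptions,
not the actual codec):

#include <stddef.h>
#include <stdint.h>

/* assume each compressed block holds SAMPLES_PER_BLOCK samples in a
   fixed BLOCK_BYTES, with enough decoder state in the block header
   to decode it without touching its neighbors; at roughly 4 bits
   per sample this lands near 176kbps for 16-bit 44.1kHz mono audio */
#define SAMPLES_PER_BLOCK 256
#define BLOCK_BYTES       132   /* 4-byte header + 256 * 4 bits */

/* map an absolute sample index to its block and in-block offset;
   with fixed-size blocks this is pure arithmetic, which is what
   makes random access cheap */
static const uint8_t *find_block(const uint8_t *stream,
                                 uint32_t sample, uint32_t *offset)
{
    uint32_t block = sample / SAMPLES_PER_BLOCK;
    *offset = sample % SAMPLES_PER_BLOCK;
    return stream + (size_t)block * BLOCK_BYTES;
}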
work was also going on recently to allow an alternate in-memory format
for the voxel chunks, which would basically exploit a property:
typically, each chunk only has a small number of unique voxel types;
so, in many cases, eligible chunks could be represented in a form where
they use 8-bit (byte) indices into a table of voxel values, which would
store an eligible chunk in 6kB rather than 32kB.
but, as-is, this is a fairly involved modification.
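the idea as described would look roughly like this (the names are made
up; the sizes follow directly from the description):

#include <stdint.h>

typedef uint64_t voxel;

/* palettized chunk: each cell is a byte index into a per-chunk table
   of up to 256 unique voxel values; 4096 index bytes + 256 * 8-byte
   table entries = 4096 + 2048 = 6144 bytes = 6kB, vs 32kB raw */
typedef struct chunk_idx {
    uint8_t index[16][16][16];  /* per-voxel palette indices */
    voxel   table[256];         /* this chunk's unique voxel values */
} chunk_idx;

/* look up the full voxel value for cell (x,y,z) */
static voxel chunk_idx_get(const chunk_idx *c, int x, int y, int z)
{
    return c->table[c->index[z][y][x]];
}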
....