Memory Allocation?

Chris S.

Is it possible to determine how much memory is allocated by an arbitrary
Python object? There doesn't seem to be anything in the docs about this,
but considering that Python manages memory allocation, why would such a
module be more difficult to design than say, the GC?
 
M.E.Farmer

Hello Chris,
I am sure there are many inaccuracies in this story, but hey, you asked instead of seeking your own answers, so...
In general you need not worry about memory allocation.
To be more specific, objects have a size and most of them are known (at least to a wizard named Tim), but it doesn't really matter, because it doesn't work like that in Python.
The CPython interpreter (I have never read a lick of the source; this is all from late-night memory) just grabs a chunk of memory and uses it as it sees fit. Jython uses Java's GC, and so on.
Now tell me, do you really want to take out the garbage, or just look at it?
Python does it for you so you don't have to.
But don't let me stop you.
You can probably still find some way to do it.
It is open source, after all.
M.E.Farmer
 
Gerrit Holl

Chris said:
Is it possible to determine how much memory is allocated by an arbitrary
Python object? There doesn't seem to be anything in the docs about this,
but considering that Python manages memory allocation, why would such a
module be more difficult to design than say, the GC?

Why do you want it?

regards,
Gerrit.

--
In the councils of government, we must guard against the acquisition of
unwarranted influence, whether sought or unsought, by the
military-industrial complex. The potential for the disastrous rise of
misplaced power exists and will persist.
-Dwight David Eisenhower, January 17, 1961
 
Donn Cave

Chris S. said:
Is it possible to determine how much memory is allocated by an arbitrary
Python object? There doesn't seem to be anything in the docs about this,
but considering that Python manages memory allocation, why would such a
module be more difficult to design than say, the GC?

Sorry, I didn't follow that - such a module as what?

Along with the kind of complicated internal implementation
details, you may need to consider the possibility that the
platform malloc() may reserve more than the allocated amount,
for its own bookkeeping but also for alignment. It isn't
a reliable guide by any means, but something like this might
be at least entertaining -
>>> class A:
...     def __init__(self, a):
...         self.a = a
...
>>> d = map(id, map(A, [0]*32))
>>> d.sort()
>>> k = 0
>>> for i in d:
...     print i - k
...     k = i
...

This depends on the fact that id(a) returns a's storage
address.

I get very different results from one platform to another,
and I'm not sure what they mean, but at a guess, I think
you will see a fairly small number, like 40 or 48, that
represents the immediate allocation for the object, and
then a lot of intervals three or four times larger that
represent all the memory allocated in the course of creating
it. It isn't clear that this is all still allocated -
malloc() doesn't necessarily reuse a freed block right
away, and in fact the most interesting thing about this
experiment is how different this part looks on different
platforms. Of course we're still a bit in the dark as
to how much memory is really allocated for overhead.
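A present-day rendering of the same experiment, for reference (CPython-specific: id() returning the storage address is an implementation detail, and the class name A mirrors the snippet above):

```python
# Re-running the id-spacing experiment in modern Python.
class A:
    def __init__(self, a):
        self.a = a

objs = [A(0) for _ in range(32)]            # keep references so nothing is freed
addrs = sorted(id(o) for o in objs)         # CPython: id() is the address
gaps = [b - a for a, b in zip(addrs, addrs[1:])]
print(min(gaps), max(gaps))                 # spacing between successive instances
```

As in the original, the small, repeated gaps hint at the per-instance allocation, while the large outliers reflect allocator layout rather than object size.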

Donn Cave, (e-mail address removed)
 
Chris S.

M.E.Farmer said:
Hello Chris,
I am sure there are many inaccuracies in this story, but hey, you asked instead of seeking your own answers, so...
In general you need not worry about memory allocation.
To be more specific, objects have a size and most of them are known (at least to a wizard named Tim), but it doesn't really matter, because it doesn't work like that in Python.
The CPython interpreter (I have never read a lick of the source; this is all from late-night memory) just grabs a chunk of memory and uses it as it sees fit. Jython uses Java's GC, and so on.
Now tell me, do you really want to take out the garbage, or just look at it?
Python does it for you so you don't have to.

Using similar logic, we shouldn't need access to the Garbage Collector
or Profiler. After all, why would anyone need to know how fast their
program is running or whether or not their garbage has been collected.
Python takes care of it.
 
Chris S.

Donn said:
Sorry, I didn't follow that - such a module as what?

GC == Garbage Collector (http://docs.python.org/lib/module-gc.html)
Along with the kind of complicated internal implementation
details, you may need to consider the possibility that the
platform malloc() may reserve more than the allocated amount,
for its own bookkeeping but also for alignment. It isn't
a reliable guide by any means, but something like this might
be at least entertaining -
>>> class A:
...     def __init__(self, a):
...         self.a = a
...
>>> d = map(id, map(A, [0]*32))
>>> d.sort()
>>> k = 0
>>> for i in d:
...     print i - k
...     k = i
...

This depends on the fact that id(a) returns a's storage
address.

I get very different results from one platform to another,
and I'm not sure what they mean, but at a guess, I think
you will see a fairly small number, like 40 or 48, that
represents the immediate allocation for the object, and
then a lot of intervals three or four times larger that
represent all the memory allocated in the course of creating
it. It isn't clear that this is all still allocated -
malloc() doesn't necessarily reuse a freed block right
away, and in fact the most interesting thing about this
experiment is how different this part looks on different
platforms. Of course we're still a bit in the dark as
to how much memory is really allocated for overhead.

Donn Cave, (e-mail address removed)

Are you referring to Python's general method of memory management? I was
under the impression that the ISO C specification for malloc() dictates
allocation of a fixed amount of memory. free(), not malloc(), handles
deallocation. Am I wrong? Does Python use a custom non-standard
implementation of malloc()?
 
Dima Dorfman

Is it possible to determine how much memory is allocated by an arbitrary
Python object?

The hardest part about answering this question is figuring out what
you want to count as being allocated for a particular object. That
varies widely depending on why you need this information, which is why
there isn't a "getobsize" or similar call.

For example, consider a dict. The expression

x = {'alpha': 1, 'beta': 2}

allocates a dict object, a hash table to hold the values [1], and the
strings and integers [2]. So how much memory is used by x? The dict
object structure and hash table obviously belong to x. How about the
contents? Those can be shared. If you count them, the answer is
misleading because deleting that object won't free up that much
memory; but if you don't count them, then your answer isn't very
useful because the large part of the object is probably the contents.
Another possibility is to count how much memory would be released if
the object were to go away, but this requires inspecting the reference
count of every object that can be reached from the subject, and it
might still be wrong if there are cycles.
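As an aside, later Python versions (2.6 and up) grew sys.getsizeof(), which embodies exactly this trade-off: it reports only the shallow size (the dict structure and its hash table) and leaves the possibly-shared contents out. A small illustration:

```python
import sys

x = {'alpha': 1, 'beta': 2}
shallow = sys.getsizeof(x)     # dict object + hash table only, no contents
deep = shallow + sum(sys.getsizeof(o) for o in list(x) + list(x.values()))
print(shallow, deep)           # "deep" can mislead: the contents may be shared
```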

I expect that if you want to know the size, you can decide on the
semantics you want for your application. The answer might even depend
on your application, since only you know which parts of the object
really count toward its size (e.g., the names of attributes on your
object probably don't count). But the answer won't be the same for my
application, so a generic "getobsize" doesn't help.

Once you know what you want, it's pretty easy to get an estimate.
You'll have to make some assumptions about Python internals and it
wouldn't be portable across versions. Write a function that dispatches
on the type of its argument. For simple objects, return their basic
size; for containers, return the sum of the basic size, aux storage
size, and sizes of the contents (call your function on each of them).
For any object, the basic size is type(x).__basicsize__. The size of
the auxiliary storage depends on the object and probably on the length
of its contents; e.g., a list allocates 4 bytes per element, and a
dict 12 bytes per slot [3]. Many do overallocation, so you have to
account for that too.
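A minimal sketch of such a dispatching estimator (the name approx_size and the container cases are illustrative only; note that sys.getsizeof, added in Python 2.6 after this thread, already folds in the basic size plus per-slot auxiliary storage, including overallocation, for builtin types):

```python
import sys

def approx_size(obj, seen=None):
    """Recursive size estimate; hypothetical sketch, CPython-specific."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # don't double-count shared or cyclic objects
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)    # basic size + auxiliary storage for builtins
    if isinstance(obj, dict):
        for k, v in obj.items():
            size += approx_size(k, seen) + approx_size(v, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += approx_size(item, seen)
    return size

print(approx_size({'alpha': 1, 'beta': 2}))
```

The seen set implements the "count each reachable object once" semantics discussed above; other applications may want different rules.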

The only thing the interpreter can really help with is to be able to
ask an object about how much auxiliary memory it allocated. Only
builtin types can do that, and only that type knows the answer. Having
that would save you from having to know, for example, that a dict
allocates 12 bytes per slot. Everything else doesn't need the
interpreter's support, and pure Python works just as well in your
module as in the standard library (that said, if you do write it, I'm
sure others might find it useful--you aren't the first one with a
desire for this kind of information).

(The above is about CPython. JPython probably has its own set of issues.)

Dima.

[1] Small dicts don't need this extra allocation. We can ignore that
for now.

[2] Assuming, for the moment, that this isn't a constant expression.
When it's a constant, the integers are preloaded as constants in
the code object. Right now, that's an unnecessary complication.

[3] Numbers for CPython 2.4.
 
Tim Hoffman

Hi Chris

Have a look at
http://www.python.org/doc/2.3.4/whatsnew/section-pymalloc.html

for a description of what is going on.

Basically malloc is used to grab a big chunk of memory, then
python objects use bits of it.
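Later CPythons can also dump pymalloc's internal arena/pool statistics directly (CPython-specific; the output format is not stable across versions):

```python
import sys

# Prints pymalloc arena, pool, and block usage to stderr (CPython only).
sys._debugmallocstats()
```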

Rgds

Tim
 
M.E.Farmer

Chris said:
Using similar logic, we shouldn't need access to the Garbage Collector
or Profiler. After all, why would anyone need to know how fast their
program is running or whether or not their garbage has been collected.
Python takes care of it.
Exactly! Even though there are a few who just won't listen ;)
I answered earlier that, given the *way* CPython allocates and uses memory, it is kind of useless. Similar objects in CPython, Jython, etc. have different implementations and consequently different memory footprints. Do you just want to keep track of average memory? Or do you want to know exactly when an object is collected and its memory freed? If so, that is also one of those grey areas of Python: objects are collected and memory is freed at differing times, depending on several factors plus the phase of the moon.
Lots of devils and plenty of details. If I am wrong, someone correct me ;)
If you search around you can find many pointers to memory-efficient coding style in Python (never build strings by repeated concatenation; use a list, append, then join, etc.).
Stick to profiling; it will serve you better. But do not let me stop you from writing a memory profiler ;)
M.E.Farmer
 
