Python C API String Memory Consumption

k3xji

When I run the following function, I seem to have a memory leak: about 20 MB
of memory is allocated and is not freed. Here is the code I run:

.... ss = esauth.penc('sumer')
....
.... ss = esauth.penc('sumer')
....

And here is the penc() function.

static PyObject *
penc(PyObject *self, PyObject *args)
{
        unsigned char *s = NULL;
        unsigned char *buf = NULL;
        PyObject *result = NULL;
        unsigned int v, len, i = 0;

        if (!PyArg_ParseTuple(args, "s#", &s, &len))
                return NULL;

        buf = strdup(s);
        if (!buf) {
                PyErr_SetString(PyExc_MemoryError,
                        "Out of memory: strdup failed");
                return NULL;
        }

        /* string manipulation */

        result = PyString_FromString(buf);
        free(buf);
        return result;
}

Am I doing something wrong?

Thanks,
 
Carl Banks

When I run the following function, I seem to have a memory leak: about 20 MB
of memory is allocated and is not freed. Here is the code I run:


...     ss = esauth.penc('sumer')
...


...     ss = esauth.penc('sumer')
...

And here is the penc() function.

static PyObject *
penc(PyObject *self, PyObject *args)
{
        unsigned char *s= NULL;
        unsigned char *buf = NULL;
        PyObject * result = NULL;
        unsigned int v,len,i = 0;

        if (!PyArg_ParseTuple(args, "s#", &s, &len))
                return NULL;

        buf = strdup(s);
        if (!buf) {
                PyErr_SetString(PyExc_MemoryError,
                        "Out of memory: strdup failed");
                return NULL;
        }

        /*string manipulation*/

        result = PyString_FromString(buf);
        free(buf);
        return result;

}

Am I doing something wrong?


It might just be an unfortunate case where malloc keeps allocating
memory higher and higher on the heap even though it frees all the
memory. And since it doesn't give it back to the OS, it runs out.

However, Python apparently does leak a reference if passed a Unicode
object; PyArg_ParseTuple automatically creates an encoded string but
never decrefs it. (That might be a necessary evil to preserve
compatibility, though. PyString_AS_STRING does it too.)


Carl Banks
 
k3xji

Interesting. I replaced the malloc()/free() usage with the PyMem_xx APIs
and the problem was resolved. However, I really cannot understand why the
first version does not work. Here is the latest code, which has no
problems at all:

static PyObject *
penc(PyObject *self, PyObject *args)
{
        PyObject *result = NULL;
        unsigned char *s = NULL;
        unsigned char *buf = NULL;
        unsigned int v, len, i = 0;

        if (!PyArg_ParseTuple(args, "s#", &s, &len))
                return NULL;

        buf = (unsigned char *) PyMem_Malloc(len);
        if (buf == NULL) {
                PyErr_NoMemory();
                return NULL;
        }

        /* string manipulation. */

        result = PyString_FromStringAndSize((char *)buf, len);
        PyMem_Free(buf);
        return result;
}
 
MRAB

k3xji said:
Interesting. I replaced the malloc()/free() usage with the PyMem_xx APIs
and the problem was resolved. However, I really cannot understand why the
first version does not work. Here is the latest code, which has no
problems at all:

static PyObject *
penc(PyObject *self, PyObject *args)
{
        PyObject *result = NULL;
        unsigned char *s = NULL;
        unsigned char *buf = NULL;
        unsigned int v, len, i = 0;

        if (!PyArg_ParseTuple(args, "s#", &s, &len))
                return NULL;

        buf = (unsigned char *) PyMem_Malloc(len);
        if (buf == NULL) {
                PyErr_NoMemory();
                return NULL;
        }

        /* string manipulation. */

        result = PyString_FromStringAndSize((char *)buf, len);
        PyMem_Free(buf);
        return result;
}

In general I'd say don't mix your memory allocators. I don't know
whether CPython implements PyMem_Malloc using malloc, but it's better to
stick with CPython's memory allocators when writing for CPython.
 
John Machin

In general I'd say don't mix your memory allocators. I don't know
whether CPython implements PyMem_Malloc using malloc,

The fantastic manual (http://docs.python.org/c-api/memory.html#overview)
says: """the C allocator and the Python memory manager ... implement
different algorithms and operate on different heaps""".
but it's better to
stick with CPython's memory allocators when writing for CPython.

for the reasons given in the last paragraph of the above reference.

HTH,
John
 
Benjamin Peterson

Carl Banks said:
However, Python apparently does leak a reference if passed a Unicode
object; PyArg_ParseTuple automatically creates an encoded string but
never decrefs it. (That might be a necessary evil to preserve
compatibility, though. PyString_AS_STRING does it too.)

Unicode objects cache a copy of themselves as default-encoded strings. The
cached copy is deallocated when the unicode object itself is.
 
Floris Bruynooghe

I assume you're doing a memcpy() somewhere in there... This is also
safer than your first version, since a Python string can contain an
embedded \0 and the strdup() of the first version would stop copying at
that byte. But maybe you're sure your input doesn't contain NUL bytes, so
it might be fine.
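
To illustrate the embedded \0 point, here is a minimal sketch in plain C, without the Python headers. Both function names are hypothetical helpers made up for this example: copy_c_string mimics what strdup() does (it stops at the first NUL byte), while copy_bytes copies an explicit length, the way a memcpy() with the len from "s#" would.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies like strdup(), i.e. only up to the first NUL. */
char *copy_c_string(const char *src)
{
    size_t n = strlen(src);          /* length stops at the first '\0' */
    char *buf = malloc(n + 1);
    if (buf == NULL)
        return NULL;
    memcpy(buf, src, n + 1);         /* includes the terminating NUL */
    return buf;
}

/* Hypothetical helper: copies exactly len bytes, embedded NULs included,
 * and appends a terminator so the result is still safe to print. */
char *copy_bytes(const char *src, size_t len)
{
    char *buf = malloc(len + 1);
    if (buf == NULL)
        return NULL;
    memcpy(buf, src, len);
    buf[len] = '\0';
    return buf;
}
```

For a 6-byte input such as "su\0mer", copy_c_string keeps only "su", whereas copy_bytes(data, 6) preserves all six bytes. The caller releases both results with free().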
The fantastic manual (http://docs.python.org/c-api/memory.html#overview)
says: """the C allocator and the Python memory manager ... implement
different algorithms and operate on different heaps""".


for the reasons given in the last paragraph of the above reference.

That document explicitly says you're allowed to use malloc() and free()
in extensions. There is nothing wrong with allocating things on
different heaps; I've done it and seen it many times and never had
trouble.

Why the original problem occurred I don't understand either, though.

Regards
Floris
 
Aahz

When I run the following function, I seem to have a memory leak: about 20 MB
of memory is allocated and is not freed. Here is the code I run:


... ss = esauth.penc('sumer')
...

... ss = esauth.penc('sumer')
...

BTW, note that if you're using Python 2.x, range(1000000) will cause a
"leak" because ints are never freed. Instead, use xrange().
 
Hrvoje Niksic

BTW, note that if you're using Python 2.x, range(1000000) will cause
a "leak" because ints are never freed. Instead, use xrange().

Note that using xrange() won't help with that particular problem.
 
Carl Banks

Note that using xrange() won't help with that particular problem.

I think it will because with xrange the integers will not all have to
exist at one time, so Python doesn't have to increase the size of the
integer pool to a million.


Carl Banks
 
Hrvoje Niksic

Carl Banks said:
I think it will because with xrange the integers will not all have
to exist at one time, so Python doesn't have to increase the size of
the integer pool to a million.

Good catch! I stand corrected.
 
