Creating a Python C module: passing double arrays to C functions. Segmentation fault. SWIG

kmgrds

Hello everybody,

I'm building a python module to do some heavy computation in C (for
dynamic time warp distance computation).

The module works perfectly most of the time, which is what makes the
error so difficult to track down. I finally figured out that the
strange segmentation faults I get from time to time have something to
do with the length of the vectors I'm looking at. (So in this minimal
example I'm doing completely useless distance measures on vectors with
the same entry in all dimensions, but the errors seem to have nothing
to do with the actual values in the vectors, only with the lengths.)

But somehow I haven't quite gotten the essential ideas of what one has
to do to debug this module and I hope someone can give me some advice.
The main problem is probably somewhere around passing the double
vectors to the c module or reserving and freeing memory.

so, for example
list1,list2 = [0.1 for i in range(555)],[0.2 for i in range(1874)]
ctimewarp(list1,list2) # see python file below
works perfectly fine
and if the vector has one more element
list1,list2 = [0.1 for i in range(555)],[0.2 for i in range(1875)]
ctimewarp(list1,list2)
it dies with a simple
Segmentation fault
nothing more said :-S

For very small lists again, no problem:
list1,list2 = [0.1 for i in range(3)],[0.2 for i in range(4)]
ctimewarp(list1,list2)
for intermediate size I get an error and more information:
list1,list2 = [0.1 for i in range(22)],[0.2 for i in range(99)]
ctimewarp(list1,list2)
gives:

*** glibc detected *** python: free(): invalid next size (fast): 0x0804d090 ***
======= Backtrace: =========
/lib/i686/libc.so.6[0xb7bed4e6]
/lib/i686/libc.so.6(cfree+0x90)[0xb7bf1010]
/home/kim/Documents/pythonprojects/alignator/_timewarpsimple.so[0xb7aadc0c]
/home/kim/Documents/pythonprojects/alignator/_timewarpsimple.so[0xb7aae0db]
/usr/lib/libpython2.5.so.1.0(PyCFunction_Call+0x107)[0xb7d5d8e7]
======= Memory map: ========
08048000-08049000 r-xp 00000000 03:05 2344438 /usr/bin/python
08049000-0804a000 rwxp 00000000 03:05 2344438 /usr/bin/python
0804a000-080f3000 rwxp 0804a000 00:00 0 [heap]
.....(truncated)


When looping over these lengths of vectors, e.g. as I did in the
python file below, it runs smoothly thousands of times before dying
at 938 for the first list length and 1110 for the second. I noticed
that the sum of the list lengths is 2048=2^11 and it often seems to
die around that, but that's just a guess...

If anyone has a little advice or a code snippet on how to pass float
lists to modules or anything of that kind, i would appreciate it very
much!

thanks a lot in advance
cheers
kim


__________________________________

Below you can see the c file, the .i file for swig, and a little
python script producing the errors:


_______________ .c file:

/* File : timewarpsimple.c */

#include <sys/param.h>
#include <math.h>
#include <stdlib.h>

double timewarp(double x[], int lenx, double y[], int leny) {

    // printf ("%d *******************************\n", lenx);
    // printf ("%d *******************************\n", leny);
    double prev;
    double recx[lenx+1];
    double recy[leny+1];
    double warp[lenx+2][leny+2];
    int i,j;
    prev = 0.0;
    for (i = 0; i < lenx; i++) {
        recx[i]=x[i]-prev;
        prev = x[i];
    }
    recx[lenx]=1.0-prev;
    prev = 0.0;
    for (i = 0; i < leny; i++) {
        recy[i]=y[i]-prev;
        prev = y[i];
    }
    recy[leny]=1.0-prev;
    // recency vectors are done

    // let's warp

    warp[0][0]=0.0;
    for (i = 1; i < lenx+2; i++) {
        warp[i][0]=1.0;
    }
    for (j = 1; j < leny+2; j++) {
        warp[0][j]=1.0;
    }

    for (i = 1; i < lenx+2; i++) {
        for (j = 1; j < leny+2; j++) {
            warp[i][j]=fabs(recx[i-1]-recy[j-1]) + MIN(MIN(warp[i-1][j],warp[i][j-1]),warp[i-1][j-1]);
        }
    }

    return warp[lenx+1][leny+1];
}



________________ .i file:

%module timewarpsimple
%include "carrays.i"
%array_functions(double, doubleArray);

%{
double timewarp(double x[], int lenx, double y[], int leny);
%}

double timewarp(double x[], int lenx, double y[], int leny);

________________ here's what I'm doing to compile:

swig -python timewarpsimple.i
gcc -c timewarpsimple.c timewarpsimple_wrap.c -I/usr/include/python2.5/
ld -shared timewarpsimple.o timewarpsimple_wrap.o -o _timewarpsimple.so

which goes through without any problem.

________________ .py file:

#!/usr/bin/python
# -*- coding: UTF-8 -*-

import timewarpsimple

def ctimewarp(list1,list2):
    """
    takes two lists of numbers between 0 and 1 and computes a timewarp
    distance
    """
    print "timewarping"

    alen = len(list1)
    blen = len(list2)
    a = timewarpsimple.new_doubleArray(alen*4) # Create the first array
    # Why I have to reserve 4 times more space I don't know, but it's the
    # only way to make it work...
    b = timewarpsimple.new_doubleArray(alen*4) # Create the second array

    for i,p in enumerate(list1):
        timewarpsimple.doubleArray_setitem(a,i,p) # Set a value. (a,i,0) gives identical errors
    for i,p in enumerate(list2):
        timewarpsimple.doubleArray_setitem(b,i,p) # Set a value

    warp = timewarpsimple.timewarp(a,alen,b,blen)
    print "result",warp
    timewarpsimple.delete_doubleArray(a)
    timewarpsimple.delete_doubleArray(b)
    #a,b = None,None
    return warp

########## the tests:

#list1,list2 = [0.1 for i in range(22)],[0.2 for i in range(99)]
#ctimewarp(list1,list2)

for x in range(888,1111):
    for y in range(888,1111):
        list1,list2 = [0.1 for i in range(x)], [0.2 for i in range(y)]
        print len(list1),len(list2),len(list1)+len(list2)
        ctimewarp(list1,list2)
 
sturlamolden

Hello everybody,

I'm building a python module to do some heavy computation in C (for
dynamic time warp distance computation).

Why don't you just use ctypes and NumPy arrays instead?


# double timewarp(double x[], int lenx, double y[], int leny);

import numpy
import ctypes
from numpy.ctypeslib import ndpointer
from ctypes import c_int

_timewarp = ctypes.cdll.timewarp.timewarp # timewarp.dll
array_pointer_t = ndpointer(dtype=double)
_timewarp.argtypes = [array_pointer_t, c_int, array_pointer_t, c_int]

def timewarp(x, y):
    lenx, leny = x.shape[0], y.shape[0]
    return _timewarp(x, lenx, y, leny)
 
kim

thanks a lot, sturlamolden, for the quick reply!

so i tried to use numpy and ctypes as you advised and i got it
working: with some little changes for my linux machine - i hope they
aren't the cause of the results:

to load it, i only got it going with LoadLibrary:
lib = numpy.ctypeslib.load_library("_timewarpsimple.so",".")
_timewarp = lib.timewarp

i replaced double by c_double:
array_pointer_t = ndpointer(dtype=c_double)

and i needed to define a restype, too:
_timewarp.restype = c_double

so, the strange errors with lots of stack traces are gone, but for
large lists i got the same problem: Segmentation fault. for example
for the following two lines:

list1,list2 = numpy.array([0.1 for i in range(999)]), numpy.array([0.7 for i in range(1041)])
print timewarp(list1,list2)

So it is maybe more of a c problem, but still, if i copy the big
arrays into my c code, everything works out just fine
(x[]={0.1,0.1,... }...) so it may still have something to do with the
python-c interface and memory allocation or something like that.

below i put my new shorter python script, the c and the rest hasn't
changed.
what can i try now?

thanks again
kim


import numpy
import ctypes
from numpy.ctypeslib import ndpointer
from ctypes import c_int,c_double

lib = numpy.ctypeslib.load_library("_timewarpsimple.so",".")
_timewarp = lib.timewarp
#_timewarp = ctypes.cdll.timewarp.timewarp # timewarp.dll
array_pointer_t = ndpointer(dtype=c_double)
_timewarp.argtypes = [array_pointer_t, c_int, array_pointer_t, c_int]
_timewarp.restype = c_double

def timewarp(x, y):
    lenx, leny = x.shape[0], y.shape[0]
    print lenx,leny
    return _timewarp(x, lenx, y, leny)

# testing:
list1,list2 = numpy.array([0.1 for i in range(999)]), numpy.array([0.7 for i in range(1041)])
print timewarp(list1,list2)

for x in range(999,1111):
    for y in range(999,1111):
        list1,list2 = [0.1 for i in range(x)], [0.9 for i in range(y)]
        print len(list1),len(list2),len(list1)+len(list2)
        list1,list2 = numpy.array(list1), numpy.array(list2)
        print timewarp(list1,list2)



 
sturlamolden

array_pointer_t = ndpointer(dtype=c_double)

This one is wrong. The dtype should be the datatype kept in the array,
which is 'float' (Python doubles) or 'numpy.float64'.

array_pointer_t = ndpointer(dtype=numpy.float64)


I'd take a good look at that C code. For example, this is not valid C:

double timewarp(double x[], int lenx, double y[], int leny) {

    double prev;
    double recx[lenx+1];
    double recy[leny+1];
    double warp[lenx+2][leny+2];
    int i,j;

It would be valid C99, but you are not compiling it as C99 (gcc does
not implement automatic arrays correctly in C99 anyway). If Fortran is
what you want, get gfortran or Intel Fortran. To make this valid C,
you will need to malloc these buffers, and free them when you are
done. Another option is to give them a fixed maximum size:

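#define MAXLEN 1024
/* static storage: as automatic arrays these would need about 8.4 MB of stack */
static double recx[MAXLEN + 1];
static double recy[MAXLEN + 1];
static double warp[MAXLEN + 2][MAXLEN + 2];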
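
For what it's worth, the malloc route could look roughly like the sketch below. The name timewarp_heap, the helper min3 and the -1.0 error return are assumptions made for illustration and are not part of the posted module; the recurrence itself is copied from timewarp() as posted. The matrix is kept as one flat row-major block so that a single malloc/free pair covers it.

```c
/* Sketch only: timewarp_heap and min3 are hypothetical names. */
#include <math.h>
#include <stdlib.h>

static double min3(double a, double b, double c) {
    double m = a < b ? a : b;
    return m < c ? m : c;
}

/* Same recurrence as timewarp(), but every work buffer lives on the heap.
   Returns -1.0 on negative lengths or failed allocation (assumed convention). */
double timewarp_heap(const double *x, int lenx, const double *y, int leny) {
    size_t rows, cols, i, j;
    double *recx, *recy, *warp;
    double prev, result = -1.0;

    if (lenx < 0 || leny < 0) return -1.0;
    rows = (size_t)lenx + 2;
    cols = (size_t)leny + 2;

    recx = malloc(((size_t)lenx + 1) * sizeof *recx);
    recy = malloc(((size_t)leny + 1) * sizeof *recy);
    warp = malloc(rows * cols * sizeof *warp);   /* flat row-major matrix */
    if (recx == NULL || recy == NULL || warp == NULL) goto done;

    /* recency vectors: successive differences, plus the gap up to 1.0 */
    prev = 0.0;
    for (i = 0; i < (size_t)lenx; i++) { recx[i] = x[i] - prev; prev = x[i]; }
    recx[lenx] = 1.0 - prev;
    prev = 0.0;
    for (i = 0; i < (size_t)leny; i++) { recy[i] = y[i] - prev; prev = y[i]; }
    recy[leny] = 1.0 - prev;

    /* border cells first, then the three-neighbour minimum */
    warp[0] = 0.0;
    for (i = 1; i < rows; i++) warp[i * cols] = 1.0;
    for (j = 1; j < cols; j++) warp[j] = 1.0;
    for (i = 1; i < rows; i++) {
        for (j = 1; j < cols; j++) {
            warp[i * cols + j] = fabs(recx[i - 1] - recy[j - 1])
                + min3(warp[(i - 1) * cols + j],
                       warp[i * cols + j - 1],
                       warp[(i - 1) * cols + j - 1]);
        }
    }
    result = warp[rows * cols - 1];

done:
    free(recx);   /* free(NULL) is a no-op, so partial failures are fine */
    free(recy);
    free(warp);
    return result;
}
```

The argument list is the same as for timewarp(), so on the Python side only the symbol name looked up in the shared library would change.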

There may be other errors as well, I did not look at it carefully.
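
One thing worth checking along those lines is how much stack the three automatic arrays ask for. The small helper below (vla_bytes is a made-up name) just adds up their sizes. With 8-byte doubles it gives about 8.38 MB for lengths 555 and 1875 and about 8.37 MB for 999 and 1041, both within roughly 20 KB of 8 MiB (8388608 bytes). That is a common default stack limit on Linux, but nothing in this thread states the actual limit on the machine in question, so treat the comparison as an assumption.

```c
/* Sketch only: vla_bytes is a hypothetical helper, not part of the module. */
#include <stddef.h>

/* Bytes needed by recx[lenx+1], recy[leny+1] and warp[lenx+2][leny+2]
   when they are declared as automatic (stack) arrays. */
size_t vla_bytes(size_t lenx, size_t leny) {
    size_t doubles = (lenx + 1) + (leny + 1) + (lenx + 2) * (leny + 2);
    return doubles * sizeof(double);
}
```

The footprint grows with the product of the two lengths rather than their sum, which would explain why the crashes cluster around length pairs whose product is close to a million.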
 
