Rouslan Korneychuk
It's still in the rough, but I wanted to give an update on my C++
extension generator. It's available at http://github.com/Rouslan/PyExpose
The documentation is a little slim right now, but there is a
comprehensive set of examples in test/test_kompile.py (replace the k
with a c; for some reason, if I post this message with the correct name,
it doesn't show up). The program takes an input file like
    <?xml version="1.0"?>
    <module name="modulename" include="vector">
        <doc>module doc string</doc>
        <class name="DVector" type="std::vector&lt;double&gt;">
            <doc>class doc string</doc>
            <init overload=""/>
            <init overload="size_t,const double&amp;"/>
            <property name="size" get="size" set="resize"/>
            <def func="push_back"/>
            <def name="__sequence__getitem__" func="at" return-semantic="copy"/>
            <def name="__sequence__setitem__" assign-to="at"/>
        </class>
    </module>
and generates the code for a Python extension.
The goal has been to generate code with zero overhead. In other words, I
wanted to eliminate the tedium of creating an extension without
sacrificing anything. In addition to generating a code file, the
input above would result in a header file with the following:
    extern PyTypeObject obj_DVectorType;
    inline PyTypeObject *get_obj_DVectorType() { return &obj_DVectorType; }

    struct obj_DVector {
        PyObject_HEAD
        storage_mode mode;
        std::vector<double,std::allocator<double> > base;

        PY_MEM_NEW_DELETE

        obj_DVector() : base() {
            PyObject_Init(reinterpret_cast<PyObject*>(this),get_obj_DVectorType());
            mode = CONTAINS;
        }
        obj_DVector(std::allocator<double> const & _0) : base(_0) {
            PyObject_Init(reinterpret_cast<PyObject*>(this),get_obj_DVectorType());
            mode = CONTAINS;
        }
        obj_DVector(long unsigned int _0,double const & _1,std::allocator<double> const & _2) : base(_0,_1,_2) {
            PyObject_Init(reinterpret_cast<PyObject*>(this),get_obj_DVectorType());
            mode = CONTAINS;
        }
        obj_DVector(std::vector<double,std::allocator<double> > const & _0) : base(_0) {
            PyObject_Init(reinterpret_cast<PyObject*>(this),get_obj_DVectorType());
            mode = CONTAINS;
        }
    };
so the object can be allocated in your own code as a single block of
memory rather than having a PyObject contain a pointer to the exposed type.
storage_mode is an enumeration, adding very little to the size of the
Python object (or maybe nothing, depending on alignment). If you add
new-initializes="true" to the <class> tag and the exposed type never
needs to be held by a pointer/reference (as it would be when the exposed
type lives inside another class/struct), even that variable gets omitted.
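To make the layout point concrete, here is a minimal sketch that builds
without Python.h. It is not PyExpose's output: FakeHead stands in for
whatever PyObject_HEAD expands to, and EmbeddedNoMode, PointerWrapper,
mode_overhead, base_is_inside, layout_demo and the UNINITIALIZED
enumerator are names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the fields PyObject_HEAD contributes.
struct FakeHead {
    std::ptrdiff_t refcount;
    void *type;
};

// CONTAINS appears in the generated header; UNINITIALIZED is assumed here.
enum storage_mode { UNINITIALIZED, CONTAINS };

// Layout in the style of the generated struct: the vector is a member,
// so wrapper and exposed object share one allocation.
struct EmbeddedWrapper {
    FakeHead head;
    storage_mode mode;
    std::vector<double> base;
};

// The same layout with the mode field dropped, as described for
// new-initializes="true".
struct EmbeddedNoMode {
    FakeHead head;
    std::vector<double> base;
};

// The conventional alternative: the wrapper only points at the exposed
// object, which needs a second allocation.
struct PointerWrapper {
    FakeHead head;
    std::vector<double> *base;
};

// Bytes the mode field costs after padding; the value is platform dependent.
constexpr std::size_t mode_overhead() {
    return sizeof(EmbeddedWrapper) - sizeof(EmbeddedNoMode);
}

// True when the vector object itself lies within the wrapper's own bytes.
inline bool base_is_inside(const EmbeddedWrapper &w) {
    const char *start = reinterpret_cast<const char *>(&w);
    const char *member = reinterpret_cast<const char *>(&w.base);
    return member >= start && member + sizeof(w.base) <= start + sizeof(w);
}

// Small usage example.
inline void layout_demo() {
    EmbeddedWrapper w{};
    assert(base_is_inside(w));
    assert(sizeof(EmbeddedWrapper) >= sizeof(EmbeddedNoMode));
}
```

Printing mode_overhead() on your own platform shows what the enumeration
actually costs once padding is taken into account.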
The code also never uses PyArg_ParseTuple or its variants. It converts
every argument and return value directly with the appropriate PyX_AsY or
PyX_FromY function. I noticed PyBindGen does the following when a
conversion is needed for one argument:
    py_retval = Py_BuildValue((char *) "(O)", value);
    if (!PyArg_ParseTuple(py_retval, (char *) "i", &self->obj->y)) {
        Py_DECREF(py_retval);
        return -1;
    }
    Py_DECREF(py_retval);
On the other hand, here's the implementation for __sequence__getitem__:
    PyObject * obj_DVector___sequence__getitem__(obj_DVector *self,Py_ssize_t index) {
        try {
            std::vector<double,std::allocator<double> > &base =
                cast_base_DVector(reinterpret_cast<PyObject*>(self));
            return PyFloat_FromDouble(base.at(py_ssize_t_to_ulong(index)));
        } EXCEPT_HANDLERS(0)
    }
(cast_base_DVector checks that base is initialized and returns a reference
to it according to how it is stored in obj_DVector. If the class is
new-initialized and only needs one means of storage, its code will just
be "return obj_DVector->base;" and should be inlined by an optimizing
compiler.)
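The shape of that getter can be sketched without Python.h. This is only
an illustration under assumptions, not the generated code: Wrapper,
cast_base, index_to_ulong, get_item, getitem_demo and last_error are
hypothetical names, std::ptrdiff_t stands in for Py_ssize_t, a string
stands in for Python's error indicator, and the catch block is a guess
at the kind of thing EXCEPT_HANDLERS does.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// CONTAINS appears in the generated header; UNINITIALIZED is assumed here.
enum storage_mode { UNINITIALIZED, CONTAINS };

// Hypothetical wrapper: just the mode flag and the embedded vector.
struct Wrapper {
    storage_mode mode;
    std::vector<double> base;
};

// Stand-in for Python's error indicator.
static std::string last_error;

// Checks that the embedded object is initialized, then hands it out.
inline std::vector<double> &cast_base(Wrapper &w) {
    if (w.mode != CONTAINS) throw std::logic_error("object not initialized");
    return w.base;
}

// Converts a signed index to unsigned long, rejecting negative values.
inline unsigned long index_to_ulong(std::ptrdiff_t i) {
    if (i < 0) throw std::out_of_range("negative index");
    return static_cast<unsigned long>(i);
}

// Converts the one argument directly, calls at(), and turns any C++
// exception into an error return instead of letting it escape.
inline bool get_item(Wrapper &self, std::ptrdiff_t index, double &out) {
    try {
        out = cast_base(self).at(index_to_ulong(index));
        return true;
    } catch (const std::exception &e) {
        last_error = e.what();
        return false;
    }
}

// Small usage example.
inline void getitem_demo() {
    Wrapper w{CONTAINS, {1.5, 2.5}};
    double v = 0.0;
    assert(get_item(w, 0, v) && v == 1.5);
    assert(!get_item(w, 2, v));  // at() throws std::out_of_range
}
```

The point of the sketch is that the single argument is converted in
place, with no temporary tuple and no format-string parsing.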
I'm really interested in what people think of this little project.