Stefan Behnel
Paul Rubin, 04.08.2012 20:18:
Calling CPython hardly counts as compiling Python into C.
CPython is written in C, though. So anything that CPython does can be done
in C. It's not like the CPython project used a completely unusual way of
writing C code.
Besides, I find your above statement questionable. You will always need
some kind of runtime infrastructure when you "compile Python into C", so
you can just as well use CPython for that instead of reimplementing it
completely from scratch. Both Cython and Nuitka do exactly that, and one of
the major advantages of that approach is that they can freely interact with
arbitrary code (Python or not) that was written for CPython, regardless of
its native dependencies. What good would it be to throw all of that away,
just for the sake of having "pure C code generation"?
You're going to compile the whole Python program into a single C
function so that you can do gotos inside of it? What happens if the
program imports a generator?
No, you are going to compile only the generator function into a function
that uses gotos, maybe with an additional in-out struct parameter that
holds its state. Then, on entry, you read the label (or its ID) from the
previous state, reset local variables and jump to the label. On exit, you
store the state back and return. Cython does it that way. Totally
straightforward, as I said.
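To make the scheme above concrete, here is a minimal sketch of the control flow, emulated in Python rather than shown as generated C. It is not Cython's actual output. The names `CountdownState`, `countdown_resume` and `drain` are hypothetical, and the `if` dispatch on `label` stands in for what would be a `switch` plus `goto` in C.

```python
def countdown(n):
    """The original generator, kept for comparison."""
    while n > 0:
        yield n
        n -= 1


class CountdownState:
    """Hypothetical in-out state record: resume label plus saved locals."""

    def __init__(self, n):
        self.label = 0   # 0 = start, 1 = just after the yield, -1 = finished
        self.n = n       # saved copy of the local variable `n`


def countdown_resume(state):
    """Run one step of the generator body and return the yielded value.

    Raises StopIteration once the body has run to completion.
    """
    # On entry: restore the locals and read the label from the previous state.
    n = state.n
    label = state.label

    # Dispatch on the saved label (generated C would jump with goto here).
    if label == -1:
        raise StopIteration
    if label == 1:
        # Resume point: the statement that follows `yield n`.
        n -= 1

    # Loop head; label 0 falls straight through to here.
    if n > 0:
        # On exit: store the state back and return the yielded value.
        state.n = n
        state.label = 1
        return n

    # Loop finished: mark the generator as exhausted.
    state.n = n
    state.label = -1
    raise StopIteration


def drain(state):
    """Resume repeatedly until exhaustion and collect the yielded values."""
    values = []
    while True:
        try:
            values.append(countdown_resume(state))
        except StopIteration:
            return values
```

Calling `drain(CountdownState(3))` should give the same values as `list(countdown(3))`. Only the generator body gets this treatment; the rest of the program is compiled as ordinary functions.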
You mean you're going to have all the same INCREF/DECREF stuff on every
operation in compiled data? Ugh.
If you don't like that, you can experiment with anything from a dedicated
GC to transactional memory.
What implementations would those be? There's the Boehm GC which is
useful for some purposes but not really suitable at large scale, from
what I can tell. Is there something else?
No idea - I'll look it up when I need one. Last I heard, PyPy had a couple
of GCs to choose from, but I don't know how closely they are tied into its
infrastructure.
You're going to let the program just leak memory until it crashes??
Well, it's not like CPython leaks memory until it crashes, now does it? And
it's written in C. So there must be ways to handle this also in C.
Remember that CPython didn't even have a cycle-collecting GC before something
around 2.0, IIRC. Plain reference counting worked quite ok in most cases and
simply left the tricky cases (reference cycles) to the programmers. It really
depends on what your requirements are. Small embedded systems, time-critical
code and real-time systems are often much better off without garbage
collection. It's pure convenience, after all.
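A small sketch of the difference, assuming CPython (other implementations manage memory differently): reference counting frees an acyclic object the moment its last reference goes away, while a reference cycle stays alive until the cycle collector runs. The class `Node` and both helper functions are made up for illustration.

```python
import gc
import weakref


class Node:
    """Plain object that can hold a reference to another Node."""

    def __init__(self):
        self.other = None


def acyclic_freed_immediately():
    """True if dropping the last reference frees the object right away."""
    obj = Node()
    probe = weakref.ref(obj)   # a weak reference does not keep obj alive
    del obj                    # refcount drops to zero (CPython behaviour)
    return probe() is None


def cycle_needs_collector():
    """Return (alive after del, freed after an explicit gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()               # emulate a runtime without automatic cycle GC
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a        # build a reference cycle
        probe = weakref.ref(a)
        del a, b                       # refcounts stay above zero
        alive_after_del = probe() is not None
        gc.collect()                   # an explicit collection still works
        freed_after_collect = probe() is None
    finally:
        if was_enabled:
            gc.enable()
    return alive_after_del, freed_after_collect
```

Without a cycle collector, breaking the cycle by hand (for example, setting `a.other = None` before dropping the references) is the "tricky case" that falls to the programmer.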
Compare that to the performance gain of LuaJIT and it starts to look
like something is wrong with that approach, or maybe some issue inherent
in Python itself.
Huh? LuaJIT is a reimplementation of Lua that uses an optimising JIT
compiler specifically for Lua code. How is that similar to the Jython
runtime that runs *on top of* the JVM with its generic bytecode-based JIT
compiler?
Basically, LuaJIT's JIT compiler works at the same level as the one in
PyPy, which is why both can theoretically provide the same level of
performance gains.
It seems very hard to do reasonable optimizations in the presence of
standard Python techniques like dynamically poking class instance
attributes. I guess some optimizations are still possible, like storing
attributes named as literals in the program in fixed slots, saving some
dictionary lookups even though the slot contents would have to still be
mutable.
Sure. Even when targeting the CPython runtime with the generated C code
(like Cython or Nuitka), you can still do a lot. And sure, static code
analysis will never be able to infer everything that a JIT compiler can see.
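As a rough Python-level analogy for the fixed-slot idea quoted above, `__slots__` gives instances a fixed set of attribute slots instead of a per-instance dictionary. The slot contents stay mutable, but no new attribute names can be added at runtime. This is only an illustration, not what Cython or Nuitka generate; the class names and the helper `can_add_attribute` are hypothetical.

```python
class DictPoint:
    """Ordinary class: attributes live in a per-instance __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlotPoint:
    """Attributes named up front are stored in fixed slots, with no __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def can_add_attribute(obj, name):
    """Try to poke a new attribute onto obj; report whether that worked."""
    try:
        setattr(obj, name, 0)
    except AttributeError:
        return False
    return True
```

Here `can_add_attribute(DictPoint(1, 2), "z")` succeeds while the same call on a `SlotPoint` fails, which is the trade-off: the fixed layout is only safe when nothing pokes new attributes into instances dynamically.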
Stefan