Compiling

Simon Faulkner

Pardon me if this has been done to death but I can't find a simple
explanation.

I love Python for its ease and speed of development especially for the
"Programming Challenged" like me but why hasn't someone written a
compiler for Python?

I guess it's not that simple eh?

Simon
 
Rocco Moretti

Simon said:
Pardon me if this has been done to death but I can't find a simple
explanation.

I love Python for its ease and speed of development especially for the
"Programming Challenged" like me but why hasn't someone written a
compiler for Python?

I guess it's not that simple eh?

The "simple" explanation for the lack of a Python compiler is the
massive dynamism in Python: since you can change practically
everything at any time, compiling a generic Python program
effectively requires including the entire interpreter. It's been done
before (Python2C was the name, I think), but there wasn't much of a
speed-up vs. CPython, and it hasn't been updated to work with recent
versions of Python.
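
To make the dynamism point concrete, here is a minimal sketch of how code that is already defined can change behaviour while the program runs. The names Greeter and call_greet are invented for this illustration and do not come from any project mentioned here.

```python
class Greeter:
    def greet(self):
        return "hello"


def call_greet(obj):
    # A static compiler would like to turn this into a direct call to
    # Greeter.greet, but the attribute is looked up every time it runs.
    return obj.greet()


g = Greeter()
before = call_greet(g)           # uses the original method

# Replace the method on the class while the program is running.
Greeter.greet = lambda self: "goodbye"
after = call_greet(g)            # same call site, different behaviour

# A single instance can also shadow the class attribute.
g.greet = lambda: "per-instance override"
shadowed = call_greet(g)

print(before, after, shadowed)
```

Because any of these rebindings can happen at any point, compiled code has to keep doing the lookups that the interpreter does, which is why the interpreter effectively has to come along.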

Recently there has been work on JIT-style dynamic compilation techniques,
and static compilation of a reduced Python subset. If you want to know
more, look up the PyPy project. http://www.codespeak.net/pypy
 
Scott David Daniels

Simon said:
... why hasn't someone written a compiler for Python?
I guess it's not that simple eh?


What would you call PyPy?

As to the idea of a python-to-machine code translator, the
benefit would not be very great without at least a PyPy
level of understanding of the code -- no simple translation
would be much faster than the CPython implementation.

By the way, be careful about your tone. It sounds like
brick-throwing in this generally friendly newsgroup.

--Scott David Daniels
(e-mail address removed)
 
bruno at modulix

Simon said:
ty Bruno, I must confess that I don't understand much of that chapter!

I will work harder... :)

Hint : You probably don't need to understand anything in this chapter.
This compiler compiles Python source code to Python bytecode, which is
then executed by the Python interpreter. You may not have noticed -
since the Python interpreter is smart enough to call the compiler when
needed - but Python is compiled to bytecode before execution. Just look
at all the .pyc files on your filesystem.
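
If you want to see the bytecode compiler at work, the following sketch drives it by hand. It assumes a current Python 3 interpreter (the thread predates it), and the file name example_mod.py is made up for the illustration.

```python
import dis
import os
import py_compile
import tempfile

source = "def add(a, b):\n    return a + b\n"

# compile() is the built-in entry point to the bytecode compiler.
code = compile(source, "<example>", "exec")
namespace = {}
exec(code, namespace)
result = namespace["add"](3, 4)

# The function body now exists as a sequence of bytecode instructions.
opnames = [ins.opname for ins in dis.get_instructions(namespace["add"])]

# py_compile writes the same kind of bytecode to a .pyc file on disk.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "example_mod.py")
    with open(src_path, "w") as fh:
        fh.write(source)
    pyc_path = py_compile.compile(
        src_path, cfile=os.path.join(tmp, "example_mod.pyc")
    )
    pyc_exists = os.path.exists(pyc_path)

print(result, opnames, pyc_exists)
```

The exact instruction names printed depend on the interpreter version, so treat the list as informative rather than fixed.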

In short: this was kind of a joke... I understand that what you were
looking for is a 'native code' compiler. AFAIK, this could of course be
done, but due to Python's very dynamic nature, it's not certain this would
lead to drastically better performance.

HTH
 
Paul Boddie

Simon said:
I love Python for its ease and speed of development especially for the
"Programming Challenged" like me but why hasn't someone written a
compiler for Python?

There are various compilers for Python, but they vary in terms of
capabilities, language support and method of operation. Disregarding
Jython and the jythonc tool, which I believe is somewhat inspired by
earlier tools for CPython, there has been a fair amount of activity and
a number of works and/or papers on the topic.

First of all, there have been conventional compilers such as "Python to
C" [1] and other early attempts to do similar things [2, 3, 4]; such
work possibly informed the construction of some of the
bundler/installer tools [5]. Then, there were later attempts such as
Starkiller [6] to produce low-level language code, and also compilers
which produce other high-level languages [7, 8]. Most of the
aforementioned projects have attempted to deal with "normal" Python,
possibly with the exception of Starkiller. Subsequent works which deal
with more restricted dialects of Python include ShedSkin [9].

Then, there are hybrid language compilers such as Pyrex [10] which
avoid the issues of turning highly dynamic Python code into some
low-level representation by allowing programmers to write
speed-critical sections in a special Python dialect which translates
better into C/C++. Finally, there are just-in-time compilers such as
Psyco [11] which make use of run-time information to generate machine
code specialised for the operations being performed; the PyPy project
[12] appears to have such concerns in mind in the design of possible
replacement virtual machines for Python.
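
The idea of specialising on run-time information can be sketched in pure Python. This is a toy and not how Psyco is implemented: a real just-in-time compiler emits machine code, whereas the hypothetical specialise helper below only picks a type-specific routine the first time it sees a pair of argument types.

```python
def specialise(generic):
    """Toy illustration: choose an implementation per pair of argument types."""
    cache = {}

    def wrapper(a, b):
        key = (type(a), type(b))
        impl = cache.get(key)
        if impl is None:
            # First time these types are seen: pick a specialised version
            # if one is known, otherwise fall back to the generic code.
            if key == (int, int):
                impl = int.__add__
            elif key == (float, float):
                impl = float.__add__
            else:
                impl = generic
            cache[key] = impl
        return impl(a, b)

    wrapper.cache = cache
    return wrapper


@specialise
def add(a, b):
    return a + b


print(add(3, 4), add(1.5, 2.5), add("a", "b"))
print(len(add.cache), "type combinations observed")
```

The point of the sketch is only that the decision is made from types observed while running, not from anything known when the source was written.
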

Simon said:
I guess it's not that simple eh?

Well, there have been other projects, too. Once upon a time there was a
promising alternative implementation of Python called Vyper [13, 14]
but work on that was discontinued. One prematurely-hyped work, pycore
[15], involved translating Python to Smalltalk but disappeared almost
instantly.

I believe that many Python programmers who are vocal on such topics
don't want to surrender any of the functionality that they've become
accustomed to, and with additional functionality arriving all the time
in CPython, it's arguably difficult to reconcile such functionality
with performant low-level code generation (at least in advance of
run-time). That said, there has been a movement to introduce static
typing, ostensibly for reliability purposes, but such justifications
have arguably arisen because claims about the various proposed typing
models and performance have largely gone unproven.

Paul

[1] http://sourceforge.net/projects/p2c/
[2] http://www.python.org/workshops/1996-06/papers/hugunin.IPCIV.html
[3]
http://www.foretec.com/python/workshops/1998-11/proceedings/papers/riehl/riehl.html
[4]
http://www.foretec.com/python/workshops/1998-11/proceedings/papers/aycock-211/aycock211.html
[5] http://davidf.sjsoft.com/mirrors/mcmillan-inc/install1.html
[6] http://www.python.org/pycon/dc2004/papers/1/
[7] http://perthon.sourceforge.net/
[8]
http://www.python.org/workshops/2000-01/proceedings/papers/aycock/aycock.html
[9] http://sourceforge.net/projects/shedskin/
[10] http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/
[11] http://psyco.sourceforge.net/
[12] http://codespeak.net/pypy/dist/pypy/doc/news.html
[13] http://monkeyfist.com/articles/266
[14] http://gnosis.cx/publish/programming/charming_python_8.html
[15]
http://webpages.charter.net/allanms/2004/08/you-dont-tug-on-supermans-cape.html
 
Magnus Lycka

Simon said:
Pardon me if this has been done to death but I can't find a simple
explanation.

I love Python for its ease and speed of development especially for the
"Programming Challenged" like me but why hasn't someone written a
compiler for Python?

I guess it's not that simple eh?

In case you really just want to run your Python programs
on computers without Python installed, there are several
tools that will create a myprogram.exe from your myprogram.py.
These tools don't make machine-code executables. Instead, they
basically bundle python.exe, your program and whatever else is
needed into a single .exe file.

Google for e.g. py2exe or cx_freeze.
 
Ravi Teja

bruno said:
In short: this was kind of a joke... I understand that what you were
looking for is a 'native code' compiler. AFAIK, this could of course be
done, but due to Python's very dynamic nature, it's not certain this would
lead to drastically better performance.

This is a standard response to a rather frequent question here. But I
am not sure I ever understood. Scheme / Lisp are about as dynamic as
Python. Yet they have quite efficient native compilers. Ex: Bigloo
Scheme.

Another standard response is "nobody felt the need or got around to
it". And yet a number of Lisp and Scheme compilers exist when these
languages have a much smaller user base. Am I missing something here?
 
Fredrik Lundh

Ravi said:
This is a standard response to a rather frequent question here. But I
am not sure I ever understood. Scheme / Lisp are about as dynamic as
Python.

if that were fully true, it would be fairly trivial to translate Python to
scheme or lisp and compile it.

Ravi said:
Another standard response is "nobody felt the need or got around to
it". And yet a number of Lisp and Scheme compilers exist when these
languages have a much smaller user base. Am I missing something here?

an endless supply of grad students ?

</F>
 
Diez B. Roggisch

Ravi said:
This is a standard response to a rather frequent question here. But I
am not sure I ever understood. Scheme / Lisp are about as dynamic as
Python. Yet they have quite efficient native compilers. Ex: Bigloo
Scheme.

If you provide the necessary annotations for optimization. Not sure about
runtime. But then that is what Psyco does at runtime (albeit for a limited
range of machines). And even in Bigloo you end up with guarding
statements for type checking and a code-size explosion.

Diez
 
Ravi Teja

Fredrik said:
if that were fully true, it would be fairly trivial to translate Python to
scheme or lisp and compile it.

I only dabble in Lisp / Scheme. I am curious about how Python is more
dynamic.

Fredrik said:
an endless supply of grad students ?

:).
 
gene tani

Simon said:
Pardon me if this has been done to death but I can't find a simple
explanation.

I love Python for its ease and speed of development especially for the
"Programming Challenged" like me but why hasn't someone written a
compiler for Python?

I guess it's not that simple eh?

Simon

read Brett Cannon's thesis about type inference

http://www.ocf.berkeley.edu/~bac/thesis.pdf
 
Ravi Teja

Actually optimizations are not what concern me. I am pretty happy with
Pyrex, SWIG, etc. for that. What I want is the ability to make a native
DLL/SO. That is a whole lot easier to integrate with or embed in other languages.
 
Martin v. Löwis

Ravi said:
This is a standard response to a rather frequent question here. But I
am not sure I ever understood. Scheme / Lisp are about as dynamic as
Python. Yet they have quite efficient native compilers. Ex: Bigloo
Scheme.

You might be missing two details here:
1. those compilers are less efficient than you might think
2. Scheme/Lisp are indeed less dynamic than Python

Consider the following Scheme example:

(module a)

(define (add a b)
  (+ a b))

(print (add 3.0 4))

Bigloo generates the following code (which I have cleaned up
quite a bit):

/* toplevel-init */
obj_t BGl_toplevelzd2initzd2zzaz00()
{
    AN_OBJECT;
    obj_t BgL_arg1078z00_10;
    BgL_arg1078z00_10 = BGl_addz00zzaz00(((double)3.0), ((long)4));
    obj_t BgL_list1080z00_11;
    BgL_list1080z00_11 = MAKE_PAIR(BgL_arg1078z00_10, BNIL);
    return BGl_printz00zz__r4_output_6_10_3z00(BgL_list1080z00_11);
}

/* add */
obj_t BGl_addz00zzaz00(double BgL_az00_1, long BgL_bz00_2)
{
    AN_OBJECT;
    obj_t BgL_list1084z00_12;
    obj_t BgL_arg1087z00_13;
    BgL_arg1087z00_13 = MAKE_PAIR(BINT(BgL_bz00_2), BNIL);
    BgL_list1084z00_12 = MAKE_PAIR(DOUBLE_TO_REAL(BgL_az00_1),
                                   BgL_arg1087z00_13);
    return BGl_zb2zb2zz__r4_numbers_6_5z00(BgL_list1084z00_12);
}


You can see several things from that:
1. The compiler was not able to, or did not choose to, implement
   the add operation using native C. Instead, it allocates
   two cons cells, and one object for the double; it then
   calls a generic implementation of +.
2. The compiler *did* infer that the procedure add is
   always called with (double long). This is because I
   didn't export it. If I exported the function, it
   would become

obj_t BGl_zc3anonymousza31077ze3z83zzaz00(obj_t BgL_envz00_16,
                                          obj_t BgL_az00_17,
                                          obj_t BgL_bz00_18)
{
    AN_OBJECT;
    obj_t BgL_az00_8;
    obj_t BgL_bz00_9;
    BgL_az00_8 = BgL_az00_17;
    BgL_bz00_9 = BgL_bz00_18;
    obj_t BgL_list1079z00_11;
    obj_t BgL_arg1081z00_12;
    BgL_arg1081z00_12 = MAKE_PAIR(BgL_bz00_9, BNIL);
    BgL_list1079z00_11 = MAKE_PAIR(BgL_az00_8, BgL_arg1081z00_12);
    return BGl_zb2zb2zz__r4_numbers_6_5z00(BgL_list1079z00_11);
}

In this case, all parameters are of type obj_t (which is a pointer
type).

Furthermore, looking at the actual implementation of 2+, it is
defined as

(define (2+ x y)
  (2op + x y))

and then 2op is defined as

(define-macro (2op op x y)
  (let ((opfx (symbol-append op 'fx))
        (opfl (symbol-append op 'fl))
        (opelong (symbol-append op 'elong))
        (opllong (symbol-append op 'llong)))
    `(cond
       ((fixnum? ,x)
        (cond
          ((fixnum? ,y)
           (,opfx ,x ,y))
          ((flonum? ,y)
           (,opfl (fixnum->flonum ,x) ,y))
          ((elong? ,y)
           (,opelong (fixnum->elong ,x) ,y))
          ((llong? ,y)
           (,opllong (fixnum->llong ,x) ,y))
          (else
           (error ,op "not a number" ,y))))
       ; more cases enumerating all possible
       ; combinations of fixnum, flonum, elong, and llong
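
For readers who do not read Scheme, the dispatch that 2op expands to can be approximated in Python as follows. This is only a sketch: add2 is a hypothetical name, int and float stand in for Bigloo's fixnum and flonum, and the elong and llong cases are left out.

```python
def add2(x, y):
    """Closed-world addition: every supported type pair is listed here."""
    if type(x) is int:
        if type(y) is int:
            return int.__add__(x, y)            # fixnum + fixnum
        if type(y) is float:
            return float.__add__(float(x), y)   # promote x, then flonum +
        raise TypeError("not a number: %r" % (y,))
    if type(x) is float:
        if type(y) is float:
            return float.__add__(x, y)          # flonum + flonum
        if type(y) is int:
            return float.__add__(x, float(y))   # promote y, then flonum +
        raise TypeError("not a number: %r" % (y,))
    raise TypeError("not a number: %r" % (x,))


print(add2(3.0, 4))
print(add2(3, 4))
```

Every pair of types the operator accepts is spelled out in one place, so nothing outside this function can teach it a new kind of number.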

Now, compare this with Python:

- In the general case of the add definition, the code Bigloo
generates is roughly equivalent to the sequence of function calls
the Python interpreter performs. For Python,

    def add(a,b):
        return a+b

translates into

  2           0 LOAD_FAST                0 (a)
              3 LOAD_FAST                1 (b)
              6 BINARY_ADD
              7 RETURN_VALUE
              8 LOAD_CONST               0 (None)
             11 RETURN_VALUE

The entire work is done in BINARY_ADD: It allocates a tuple with
the two arguments, then dispatches to the actual __add__
implementation. Compared to the code Bigloo generates, this might
be more efficient (a single memory allocation instead of two).

- the approach of collecting all implementations of + in a single
place, as Bigloo does, cannot be transferred to Python. Python's
implementation of + is dynamically extensible.

- the approach of directly calling add() in the toplevel init
cannot be applied to Python, either. The meaning of the name
"add" can change between the time of definition and the actual
call. Therefore, the simple function call must go through a
table lookup.

So while it would be possible to apply the same strategy to
Python, it likely wouldn't gain any performance increase over
the interpreter.
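
The three points above can be tried out directly. The sketch below assumes a current Python 3 interpreter, where the instruction names differ from the listing above (BINARY_ADD in older CPython versions, BINARY_OP in recent ones); the Money class and caller function are invented for the illustration.

```python
import dis


def add(a, b):
    return a + b


# 1. The body is a handful of bytecodes; the addition itself is one
#    generic instruction whose name starts with BINARY_.
opnames = [ins.opname for ins in dis.get_instructions(add)]


# 2. "+" is extensible: any class can supply its own __add__.
class Money:
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


total = add(Money(150), Money(275)).cents


# 3. The name "add" is looked up at call time, so rebinding it changes
#    what an existing caller does.
def caller():
    return add(3.0, 4)


first = caller()


def add(a, b):  # rebinds the global name "add"
    return "rebound"


second = caller()

print(opnames, total, first, second)
```

Neither the meaning of + nor the target of the call in caller is fixed when the source is written, so a translator cannot bake either one in.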

Regards,
Martin
 
Ravi Teja

Martin said:
So while it would be possible to apply the same strategy to
Python, it likely wouldn't gain any performance increase over
the interpreter.

Thanks,
That was quite illustrative. But as I posted elsewhere, I am looking at
the other advantages of native compilation rather than speed. Python's
ability to interface with C code is quite good and I am happy to simply
do the performance critical parts in it although I rarely end up
needing to do that. But more often I am looking to use Python libraries
in other languages since I am more familiar with them and they
typically tend to be more high level (the way I like it) than the
standard libraries of others.

Perhaps someone already has a tool that simplifies / automates creating
DLLs that embed and export Python functions as C prototypes. This was
discussed on the Pyrex mailing list a while ago.
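
As a partial illustration of what "exporting a Python function with a C prototype" involves, here is a sketch using the ctypes module. ctypes is not mentioned in this thread and is my own assumption as a vehicle for the example. It does not build a DLL: it only shows that, inside a process already hosting the interpreter, a Python function can be wrapped so that it is reachable through a real C function pointer. The names py_add and ADD_PROTO are hypothetical.

```python
import ctypes


def py_add(a, b):
    return a + b


# C prototype being modelled: int (*)(int, int)
ADD_PROTO = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)

# Wrapping the Python function yields an object backed by a C-callable thunk.
# Keep a reference to c_add for as long as foreign code may call it.
c_add = ADD_PROTO(py_add)

# The address is what a foreign caller would be handed as a function pointer.
address = ctypes.cast(c_add, ctypes.c_void_p).value

# Calling the wrapper goes through the C calling convention and back.
result = c_add(3, 4)

print(hex(address), result)
```

A tool of the kind described would still have to start the interpreter and publish such pointers under fixed symbol names, which is the part this sketch leaves out.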
 
Martin v. Löwis

Ravi said:
But more often I am looking to use Python libraries
in other languages since I am more familiar with them and they
typically tend to be more high level (the way I like it) than the
standard libraries of others.

Ah. In that case, the normal C API should work fine to call into Python
in most cases, no? If not, have you tried the CXX package, or
Boost::python?

Regards,
Martin
 
Ravi Teja

Martin said:
Ah. In that case, the normal C API should work fine to call into Python
in most cases, no? If not, have you tried the CXX package, or
Boost::python?

I am aware of those. But my "other" languages are not C/C++. Sure I
could first wrap them in a DLL written in C/C++ using CXX/Boost and
then call the DLL from the language in question. But that is more
trouble than it is worth.
 
