ANN: ParseTree 1.3.3 and ruby2c 1.0.0 beta 1

Ryan Davis

Actual announcements are on http://blog.zenspider.com/

Copy/paste job below:

=====

I am releasing ParseTree 1.3.3 today in preparation for our ruby2c
release (also today). Changes in ParseTree are minor, but necessary for
ruby2c.

ParseTree is a C extension (using RubyInline) that extracts the parse
tree for an entire class or a specific method and returns it as an
s-expression (aka sexp) using ruby's arrays, strings, symbols, and
integers.

As an example:
  def conditional1(arg1)
    if arg1 == 0 then
      return 1
    end
    return 0
  end

becomes:
  [:defn,
    :conditional1,
    [:scope,
      [:block,
        [:args, :arg1],
        [:if,
          [:call, [:lvar, :arg1], :==, [:array, [:lit, 0]]],
          [:return, [:lit, 1]],
          nil],
        [:return, [:lit, 0]]]]]
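
Because the sexp is nothing more than nested ruby arrays, it can be walked with a few lines of plain ruby and no ParseTree install. A minimal sketch, assuming only the array shape shown above; `SEXP`, `count_node_types`, and `NODE_COUNTS` are hypothetical names for illustration, not part of ParseTree:

```ruby
# Sexp copied from the example above.
SEXP = [:defn,
  :conditional1,
  [:scope,
    [:block,
      [:args, :arg1],
      [:if,
        [:call, [:lvar, :arg1], :==, [:array, [:lit, 0]]],
        [:return, [:lit, 1]],
        nil],
      [:return, [:lit, 0]]]]]

# Hypothetical helper: count how often each node type occurs.
# A node is an array whose first element is a symbol naming its type;
# anything else (symbols, integers, nil) is a leaf and is skipped.
def count_node_types(node, counts = Hash.new(0))
  return counts unless node.is_a?(Array)
  counts[node.first] += 1 if node.first.is_a?(Symbol)
  node.drop(1).each { |child| count_node_types(child, counts) }
  counts
end

NODE_COUNTS = count_node_types(SEXP)
NODE_COUNTS.each { |type, n| puts "#{type}: #{n}" }
```

For the example method this reports three :lit nodes and two :return nodes, alongside one each of the remaining node types.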

Features/Problems:
+ Uses RubyInline, so it just drops in.
+ Includes SexpProcessor and CompositeSexpProcessor.
  + Allows you to write very clean filters.
+ Includes show.rb, which lets you quickly snoop code.
+ Includes abc.rb, which lets you get abc metrics on code.
  + abc metrics = numbers of assignments, branches, and calls.
  + whitespace independent metric for method complexity.
+ Only works on methods in classes/modules, not arbitrary code.
+ Does not work on the core classes, as they are not ruby (yet).
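
The abc metric mentioned in the list can be approximated directly on a sexp. This is a rough sketch and not abc.rb itself: the grouping of node types into assignments, branches, and calls is an assumption made for illustration, and `CONDITIONAL1`, `abc_counts`, and `ABC_SCORE` are hypothetical names. The score is taken as the magnitude of the (assignments, branches, calls) vector, which is how the ABC metric is usually defined.

```ruby
# Sexp for conditional1, copied from the announcement.
CONDITIONAL1 = [:defn, :conditional1,
  [:scope,
    [:block,
      [:args, :arg1],
      [:if,
        [:call, [:lvar, :arg1], :==, [:array, [:lit, 0]]],
        [:return, [:lit, 1]],
        nil],
      [:return, [:lit, 0]]]]]

# Assumed, illustrative grouping of node types (not taken from abc.rb).
ASSIGNMENT_NODES = [:lasgn, :iasgn, :gasgn]
BRANCH_NODES     = [:if, :while, :until, :case]
CALL_NODES       = [:call, :fcall, :vcall]

# Walk the nested arrays and tally each category.
def abc_counts(node, totals = Hash.new(0))
  return totals unless node.is_a?(Array)
  head = node.first
  totals[:assignments] += 1 if ASSIGNMENT_NODES.include?(head)
  totals[:branches]    += 1 if BRANCH_NODES.include?(head)
  totals[:calls]       += 1 if CALL_NODES.include?(head)
  node.each { |child| abc_counts(child, totals) }
  totals
end

ABC_TOTALS = abc_counts(CONDITIONAL1)
ABC_SCORE  = Math.sqrt(ABC_TOTALS[:assignments]**2 +
                       ABC_TOTALS[:branches]**2 +
                       ABC_TOTALS[:calls]**2)
puts "A=#{ABC_TOTALS[:assignments]} B=#{ABC_TOTALS[:branches]} C=#{ABC_TOTALS[:calls]}"
```

Since the counts come from the tree rather than the source text, reformatting the method leaves them unchanged, which is the whitespace independence the list refers to.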

Changes:
+ 3 minor enhancements
  + Cleaned up parse_tree_abc output
  + Patched up null class names (delegate classes are weird!)
  + Added UnknownNodeError and switched SyntaxError over to it.
+ 2 bug fixes
  + Fixed BEGIN node handling to recurse instead of going flat.
  + FINALLY fixed the weird compiler errors seen on some versions of
    gcc 3.4.x related to type punned pointers.

=====

Releasing ruby2c 1.0.0 beta 1

After far too long, I finally have the dubious honor of releasing
ruby2c 1.0.0 beta 1 today. I'm itching to do it, we really need to get
it out there so people can get their eyes on it and give us feedback.
I'm also nervous as hell... the thing is a mess!

Understand what we mean by beta. It means we need eyes on it, it means
it was ready enough to put out in the wild, but it also means that it
isn't ready for any real use.

What can it do?

Well, currently it can pass all of its unit tests (325 tests with 512
assertions) and it can translate nice simple static algorithmic code
into C without much problem. For example:
  % cat x.rb
  class Something
    def blah; return 2+2; end
    def main; return blah; end
  end
  % ./translate.rb x.rb > x.c
  % gcc -I /usr/local/lib/ruby/1.8/powerpc-darwin x.c
  x.c: In function `main':
  x.c:17: warning: return type of `main' is not `int'
  % ./a.out
  % echo $?
  4
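
To give a feel for why simple static code is the easy case, here is a toy sketch of turning a sexp into a C statement. It is not ruby2c's implementation and does no type inference. The sexp for `return 2+2` is hand-written, assuming the same node shapes as the ParseTree example, and `RETURN_SEXP` and `to_c` are hypothetical names:

```ruby
# Hand-written sexp for `return 2+2` (an assumption; follows the node
# shapes shown in the ParseTree example).
RETURN_SEXP = [:return, [:call, [:lit, 2], :+, [:array, [:lit, 2]]]]

# Hypothetical toy translator: handles literals, binary operator calls,
# and return. Anything else is rejected outright.
def to_c(node)
  type, *rest = node
  case type
  when :lit
    rest.first.to_s
  when :call
    receiver, operator, args = rest
    right = args.drop(1).first          # args is [:array, arg1, ...]
    "#{to_c(receiver)} #{operator} #{to_c(right)}"
  when :return
    "return #{to_c(rest.first)};"
  else
    raise ArgumentError, "unsupported node: #{type.inspect}"
  end
end

C_STATEMENT = to_c(RETURN_SEXP)
puts C_STATEMENT
```

Every node here maps to exactly one C construct with no runtime lookup, which is what makes static code tractable; dynamic code has no such one-to-one mapping.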

What can it not do?

More than it can.

It can't (and won't) translate dynamic code. Period. That is simply not
the intent.

It probably can't translate a lot of static code that we simply haven't
come across or anticipated yet. Our tests cover a fair amount, our
validation runs cover a lot more than that, but it is still fairly
idiomatic ruby and that puts us at being better at certain styles of
coding and much worse at others.

It is also simply rough around the edges. We've rounded out the rdoc
but haven't done a thing for general documentation yet. These are on
our list, and rather high on our priority list, but we just haven't had
the time yet. For now, check out the rdoc and the PDF presentation that
we've had up for a while.

PLEASE: file bugs! We need feedback and we'd like to be able to track
it. The ruby2c project is on rubyforge and I'm getting the trackers set
up today as well.
 
George Moschovitis

This is what I've been waiting for a LOONG time :)
Can't wait to start playing with this!

thank you very much!
George.
 
Alexander Kellett

I guess the name Ruby2C and its goals are not well chosen, for in my
opinion it makes no sense to rely on type inference and conversion to
C-types in an inherently dynamic language like ruby. Furthermore, it is
a very restrictive subset you chose.

not my place to say really as i'm not involved
directly in the project, but... the idea of ruby2c
is to make it possible to write an interpreter in
a fairly idiomatic ruby subset. the aim is not to
be used for directly executing end user code, but
instead for making a maintainable interpreter written
in this subset ruby, and for making it *much* easier
for ruby coders to write fast extension modules
without forcing them to code c :)

In Smalltalk, although most things are written in Smalltalk itself, they
rely on a VM, which is able to interpret all kinds of smalltalk code. I
think this approach, which maybe YARV may realize, is much more
appropriate for a dynamic language like ruby.

maybe the paragraph is just confusing me :), but just in case,
the smalltalk vm doesn't directly execute smalltalk but instead
a fairly low level (though certainly not processor level) bytecode,
the smalltalk execution still requires compilation. much as with yarv.

yarv is written in c. that's already enough for me to
dislike it, unfortunately, though no offense to koichi: he's
doing an *excellent* job.

Alex
 
Alexander Kellett

congratulations on the release!!!!
Alex

 
Benedikt Huber

It can't (and won't) translate dynamic code. Period. That is simply not
the intent.

[This is just my personal opinion, and is not meant to be mean]

I guess the name Ruby2C and its goals are not well chosen, for in my
opinion it makes no sense to rely on type inference and conversion to
C-types in an inherently dynamic language like ruby. Furthermore, it is
a very restrictive subset you chose.

In your example, you end up with a method like
void hello1(long param) { .. }

Classes compiled in such a way:
* Need a very restrictive wrapper to be called from ruby
* Methods have to be final
* No dynamic binding
* Explicit type conversion before calling the method
* Cannot be extended from ruby
* Do not use the ruby C framework

Smalltalk was mentioned in the blog as an example of a language where
most functionality is written in the language itself, but:

In Smalltalk, although most things are written in Smalltalk itself, they
rely on a VM, which is able to interpret all kinds of smalltalk code. I
think this approach, which maybe YARV may realize, is much more
appropriate for a dynamic language like ruby.

But then again, maybe I did not understand your intent.

best regards,
 
martinus

This is very cool. Would it be possible to use ParseTree for a
refactoring toolkit?

martinus
 
Benedikt Huber

for making it *much* easier for ruby coders to write fast extension
modules without forcing them to code c :)

I understood this point. So RubyC would be a better name (i.e. a high
level description language for C with automatic type inference). If
this is the _main_ goal, i can see some benefits. Also, you would
have to supply some low-level IO mechanism if you want to write e.g.
hardware related extensions.

maybe the paragraph is just confusing me :), but just in case, the
smalltalk vm doesn't directly execute smalltalk but instead a fairly low
level (though certainly not processor level) bytecode, the smalltalk
execution still requires compilation. much as with yarv.

I apologize. I was talking about Smalltalk bytecode - but bytecode can do
the same things as sourcecode (if you have a compiler, of course).
 
Michael Walter

I understood this point. So RubyC would be a better name (i.e. a high
level description language for C with automatic type inference). If
this is the _main_ goal, i can see some benefits. Also, you would
have to supply some low-level IO mechanism if you want to write e.g.
hardware related extensions.

How about PreRuby? [1] :)

Winking to welcome everyone-ly yours,
Michael

[1] "The PreScheme compiler makes use of type inference, partial
evaluation and Scheme and Lisp compiler technology to compile the
problematic features of Scheme, such as closures, into C code without
significant run-time overhead."
 
Eric Hodel

It can't (and won't) translate dynamic code. Period. That is simply not
the intent.

[This is just my personal opinion, and is not meant to be mean]

I guess the name Ruby2C and its goals are not well chosen, for in my
opinion it makes no sense to rely on type inference and conversion to
C-types in an inherently dynamic language like ruby.

We're only human. Getting Ruby2C as far as it is has been a very large
investment of our time, and I think it's cool that we have any type
inferencing at all. We'd like to be able to translate more dynamic
things, but it will be easier if we have other eyeballs on this helping
us out.

You should, however, check out the propaganda document if you haven't
already, it gives a much better idea of our goals:

http://www.zenspider.com/~ryand/Ruby2C.pdf

Furthermore, it is a very restrictive subset you chose.

That's because we haven't yet had the time to make it any larger than
it is. We are releasing because we want to recruit people to help us
expand that subset. (And to do other things, check out the propaganda
document above.)

Instead, we focused on having a very helpful tool-chain and an
extensive suite of tests. These have helped us do very powerful things
to the Ruby AST in a very short amount of time.

In your example, you end up with a method like

Classes compiled in such a way:
* Need a very restrictive wrapper to be called from ruby
* Methods have to be final
* No dynamic binding

Check out the propaganda document... There's an interesting slide near
the very end.

* Explicit type conversion before calling the method

You have to do this when writing wrappers to C code anyhow, but again,
read that propaganda document.

* Cannot be extended from ruby

The propaganda document gives a good workaround for this, and is a good
example of the 90/10 rule.

* Do not use the ruby C framework

But they can! See the propaganda document.

Remember that extension writing is *not* our goal, it is more of a side
benefit that Ruby2C gives you.

Smalltalk was mentioned in the blog as an example of a language where
most functionality is written in the language itself, but:

In Smalltalk, although most things are written in Smalltalk itself, they
rely on a VM, which is able to interpret all kinds of smalltalk code. I
think this approach, which maybe YARV may realize, is much more
appropriate for a dynamic language like ruby.

But then again, maybe I did not understand your intent.

You catch it exactly, but you miss how our tool fits into what Squeak
Smalltalk does.

You could write all of the code in Ruby's core, even the VM, in Ruby.
Then you translate the absolute minimum to C you need automatically
with Ruby2C (eventually, just the VM).

In order to get there, however, you need to have Ruby2C working in
general.

Ruby2C is better suited to the C side of things, Array, Hash, String, a
VM, than it is to the Ruby side of things, because the C side of Ruby
is much less dynamic. These things can all be written in the Ruby2C
subset then translated automatically. As Ruby VMs get faster,
eventually Ruby2C will no longer be necessary, and hopefully will only
be used on the Ruby VM itself.

--
Eric Hodel - (e-mail address removed) - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04

 
Benedikt Huber

You should, however, check out the propaganda document if you haven't
already, it gives a much better idea of our goals:

I had read it, but I missed that page at the end. Sorry for that.
But inlining a method, and converting a whole program to plain C (without
the overhead from dynamic method dispatch etc.) are two different
things. In my defense: the latter is what you promote in the first 25
slides.

You could write all of the code in Ruby's core, even the VM, in Ruby.
Then you translate the absolute minimum to C you need automatically with
Ruby2C (eventually, just the VM).

Ok, this sounds very ambitious. The ruby core is well written and it's
hard to write an equivalent substitute.
And the VM part: I think it is very hard to write a fast VM in C. A
Ruby2C translator which generates a fast VM sounds like a miracle.

Anyway, good luck - I'm sure it is a lot of fun.
At least you do not have to write in C[1] ;)

[1] http://gnu.de.uu.net/wic.html
 
Ryan Davis

This is very cool. Would it be possible to use ParseTree for a
refactoring toolkit?

Hrm. Not having written a refactoring toolkit, I can only speculate.
I'd say it has an OK chance at it, but ParseTree itself is a very, very
thin layer under what I'm guessing is a lot of infrastructure needed to
do a full refactoring tool.

The problem is that the parse tree doesn't preserve comments at all.
This alone could make a good refactoring browser difficult. My guess is
that you'd use ParseTree just for analysis and something like emacs or
freeride to do the actual work.

As an aside, I wrote a (very) quick proof of concept for RubyInline
that uses ParseTree. It is called Ruby2Ruby. It allows you to
"translate" ruby into ruby by providing a to_ruby method on the Method
class. By doing something like: obj.method(:meth).to_ruby you get a
string back that is (as close as we can get it) a reconstruction of the
method's source code. to_ruby is actually pretty small. It grabs the
parse tree for the method in question, and runs it through a rather
small (because it is incomplete) and simple (because SexpProcessor's
architecture is really cool) class called RubyToRubyProcessor. Not much
work. It took me roughly 30 minutes to get the PoC done.
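
The approach described here, one small handler per node type, can be sketched in plain ruby without ParseTree. This is not the actual RubyToRubyProcessor: `ToyRubyGenerator` and its methods are hypothetical, the dispatch to `process_<type>` methods only imitates the SexpProcessor style, and it covers just the node types in the conditional1 example from the announcement.

```ruby
# Hypothetical sexp-to-ruby generator covering only the node types
# that appear in the conditional1 example.
class ToyRubyGenerator
  # Dispatch on the node type, e.g. [:lit, 0] goes to process_lit([0]).
  def process(node)
    type, *rest = node
    send("process_#{type}", rest)
  end

  def process_defn(rest)
    name, scope = rest
    block = scope[1]                   # [:scope, [:block, ...]]
    args, *body = block.drop(1)        # [:args, ...] then the statements
    params = args.drop(1).join(", ")
    lines = body.map { |stmt| indent(process(stmt)) }
    "def #{name}(#{params})\n#{lines.join("\n")}\nend"
  end

  def process_if(rest)
    cond, then_branch, else_branch = rest
    lines = ["if #{process(cond)} then", indent(process(then_branch))]
    lines.push("else", indent(process(else_branch))) if else_branch
    lines.push("end").join("\n")
  end

  def process_call(rest)
    receiver, name, args = rest
    arg_list = args ? args.drop(1).map { |a| process(a) } : []
    if name.to_s =~ /\A\W+\z/ && arg_list.size == 1   # binary operator
      "#{process(receiver)} #{name} #{arg_list.first}"
    else
      "#{process(receiver)}.#{name}(#{arg_list.join(', ')})"
    end
  end

  def process_lvar(rest)
    rest.first.to_s
  end

  def process_lit(rest)
    rest.first.inspect
  end

  def process_return(rest)
    "return #{process(rest.first)}"
  end

  private

  # Indent every line of a (possibly multi-line) fragment by two spaces.
  def indent(text)
    text.lines.map { |line| "  " + line }.join
  end
end

CONDITIONAL1_SEXP = [:defn, :conditional1,
  [:scope,
    [:block,
      [:args, :arg1],
      [:if,
        [:call, [:lvar, :arg1], :==, [:array, [:lit, 0]]],
        [:return, [:lit, 1]],
        nil],
      [:return, [:lit, 0]]]]]

RUBY_SOURCE = ToyRubyGenerator.new.process(CONDITIONAL1_SEXP)
puts RUBY_SOURCE
```

Feeding it the conditional1 sexp prints the method back out. Comments and original formatting are gone, which matches the limitation noted above for refactoring tools.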
 
Ryan Davis

Care to offer any comparisons with Python Pyrex?
http://nz.cosc.canterbury.ac.nz/~greg/python/Pyrex/

I've only taken a brief look, but from what I can gather Pyrex is a lot
more like RubyInline than it is like ruby2c. In RI you can do things
like this:

  require 'inline'
  class MyTest
    def factorial(n)
      f = 1
      n.downto(2) { |x| f *= x }
      f
    end

    inline do |builder|
      builder.include "<math.h>"
      builder.c "
      long factorial_c(int max) {
        int i=max, result=1;
        while (i >= 2) { result *= i--; }
        return result;
      }"
    end
  end

and call both MyTest.new.factorial(5) and MyTest.new.factorial_c(5).
Just by loading the "file" above, all of your argument and return type
conversion is done for you, the code is exported, compiled, linked, and
loaded back in (And only done when actual changes occur in the code I
might add). No install.rb or setup.rb. No waiting. No extra phases at
all.
 
Ryan Davis

[1] "The PreScheme compiler makes use of type inference, partial
evaluation and Scheme and Lisp compiler technology to compile the
problematic features of Scheme, such as closures, into C code without
significant run-time overhead."

Dood. Thanks. I'll have to check that out.
 
Ryan Davis

AFAIK, the rrb project, which adds some refactoring capabilities to
emacs, uses a library similar to ParseTree called ripper. BTW, ripper is
now part of Ruby 1.9.

Whoa. You give ParseTree a lot more credit than it deserves. Ripper is
big, and it is doing real work to do what it does. ParseTree is a
little brown stinky ferret that digs down a hole and violently rips the
AST away from the warm bosom of ruby. In other words, we cheat, they
don't.
 
Ryan Davis

not my place to say really as i'm not involved directly in the
project, but... the idea of ruby2c is to make it possible to write an
interpreter in a fairly idiomatic ruby subset. the aim is not to be
used for directly executing end user code, but instead for making a
maintainable interpreter written in this subset ruby, and for making
it *much* easier for ruby coders to write fast extension modules
without forcing them to code c :)

I'm torn. On one hand I'd like to solely focus on metaruby (keep the
eye on the ball). On the other, I think ruby2c has good potential to be
generally usable by a much wider audience to optimize bottlenecked
code. It seems to me a good way to recruit for a majority of the
toolset so we can then better balance our time between the two goals.
 
