Seven new VMs, all in a row

Peter Suk

Hello George,

G> I'm curious - how good a fit are Ruby semantics for Smalltalk VMs? Do
G> Smalltalk VMs/bytecode provide the following Ruby facilities, and, if
G> not, how easily can they be emulated?

I would like to add

- singleton methods

to this list, as I think this does not exist in Smalltalk, and this is
a feature that could knock out the whole method compilation algorithm.

Nope. (The rest is left as an exercise.)
 
Avi Bryant

George said:
I'm curious - how good a fit are Ruby semantics for Smalltalk VMs?

They fit almost exactly, which is why 20-30x is not an outrageous speed
claim, and why comparisons with Mono, JVM, Parrot etc are entirely
missing the point: this is not equivalent to compiling Ruby to Java
classes, say, but to realising that Sun's Hotspot team had been writing
a VM *specifically to run Ruby* all along - which they effectively were
before Sun crippled it, but that's another story.
Do Smalltalk VMs/bytecode provide the following Ruby facilities, and, if
not, how easily can they be emulated?

- blocks (with Ruby semantics in case they're different from
Smalltalk's)
Yes.

- mixins

Yes. Lothar also asked about singleton methods, to which the answer is
also Yes.
- instance variables that don't need to be predeclared with the class

Not exactly as Ruby does, but you can simply add new instance variables
to the class definition whenever you compile a method that uses a new
one.
- class variables & constants
Yes.

- throw/catch
Yes.

- exceptions
Yes.

- continuations

Yes (depending on the VM; Squeak, VisualWorks, and Dolphin all support
them).
- support for OS facilities like select, signals etc

Not portably; the OS facilities and binary extension mechanisms are
highly specific to VMs.
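For readers who want the mixin and singleton-method answers above made concrete, here is a minimal sketch in plain Ruby (nothing Smalltalk-specific; the names Loud, Person, alice and bob are invented for illustration). It shows that a mixin is simply linearized into the single-inheritance chain, and that a singleton method lives in a hidden per-object class whose superclass is the object's ordinary class. A Smalltalk-hosted Ruby could plausibly reuse the same shape, but that is an assumption, not something stated in this thread.

```ruby
# A mixin: ends up as one more link in the single-inheritance chain.
module Loud
  def shout
    hello.upcase
  end
end

class Person
  include Loud

  def hello
    "hi"
  end
end

alice = Person.new
bob   = Person.new

# A singleton method: defined on alice alone, stored in her hidden class.
def alice.hello
  "hi, I'm alice"
end

puts alice.shout                                  # uses alice's own hello
puts bob.shout                                    # uses Person#hello
p alice.singleton_class.instance_methods(false)  # methods held only for alice
p alice.singleton_class.superclass               # the ordinary class
p Person.ancestors.first(3)                      # Person, then Loud, then Object
```

Method lookup never needs anything beyond "walk one chain of classes", which is the part a Smalltalk VM already does.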

One thing nobody's asked about is variable and optional arguments for
methods. Some VMs (like Squeak's) can handle this just fine, but some
may require that the number of args pushed onto the stack match the
selector. In any case, to do it optimally it would require generating
bytecode, not Smalltalk code, which makes targeting a whole range of
VMs difficult; to do it portably would mean creating an extra Array on
every message send.
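
To illustrate the trade-off described above, here is a rough sketch in Ruby itself. The first method uses Ruby's native optional and rest parameters; the second emulates them the "portable" way, with a fixed arity of one and an Array built by the caller on every send. The names log and log_fixed are made up for the example, and this is only a model of the calling convention, not what any particular VM generates.

```ruby
# Native Ruby: the callee declares an optional and a rest parameter.
def log(level, message = "(none)", *tags)
  [level, message, tags]
end

# Portable emulation (hypothetical): every send passes exactly one Array,
# and the callee unpacks defaults and the rest arguments by hand.
def log_fixed(args)
  raise ArgumentError, "wrong number of arguments" if args.empty?

  level   = args[0]
  message = args.length > 1 ? args[1] : "(none)"
  tags    = args.drop(2)          # empty Array when there are no extras
  [level, message, tags]
end

p log(:warn, "disk full", :io, :hw)
p log_fixed([:warn, "disk full", :io, :hw])   # caller allocates the Array
```

The extra allocation at each call site is the cost mentioned above; generating bytecode directly would avoid it on VMs that tolerate a mismatch between selector and argument count.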

Avi
 
Peter Suk

I'm curious - how good a fit are Ruby semantics for Smalltalk VMs? Do
Smalltalk VMs/bytecode provide the following Ruby facilities, and, if
not, how easily can they be emulated?

- blocks (with Ruby semantics in case they're different from
Smalltalk's)
- mixins
- instance variables that don't need to be predeclared with the class
- class variables & constants
- throw/catch
- exceptions
- continuations
- support for OS facilities like select, signals etc

I still have to look into continuations and OS facilities in detail.
(Pointers to where these are described in detail for Ruby?) However I
believe continuations are supported for Smalltalk exceptions. Mixins
will be supported by the Compiler manipulating Single Inheritance when
Classes are defined. But as for everything else:
- blocks (with Ruby semantics in case they're different from
Smalltalk's)

Almost the same as Smalltalk's.
- instance variables that don't need to be predeclared with the class

Can be added dynamically in Smalltalk.
- class variables & constants
Check.

- throw/catch

Really the same as below.
- exceptions

Check.

From now on, if you are completely ignorant of Smalltalk, I will cease
answering these questions here.

--Peter
 
Robert Feldt

Hello everyone,

I thought I'd talk about my new project here, since there is a good
chance that someone might be interested in it. I'm planning to put
Ruby on top of Smalltalk VMs. Ruby and Smalltalk are very similar
under the covers, so Smalltalk VMs are a very good match for the
language. This will give Ruby a much faster execution environment
(perhaps 30X), VMs which are capable of incremental garbage collection,
generational garbage collection that is so fast your program still works
even with an infinite loop allocating new objects (I do this as a lark
sometimes), a wonderful debugger which will let programmers modify
methods on the fly & continue execution, a "workspace" window where you
can execute arbitrary code, a visual "inspect", a powerful "Refactoring
Browser," an industrial strength OODB (Gemstone) with objects and
methods you can define in Ruby, and a readily accessible meta-level
which will allow Rubyists to readily modify their own language. (For
example, you could then use Method wrappers to very quickly implement
an Aspect-Oriented Ruby.)

My strategy for doing this involves writing a Ruby parser (or, rather,
translating the existing one in JRuby to Ruby) then writing a Smalltalk
Parser object to request parsing of Ruby code into an AST from a Ruby
program outside of Smalltalk. We then reify the AST inside the image
and use it to compile Ruby methods into bytecodes. (Multiple syntaxes
can coexist in one Smalltalk image.) Once this is done, we can then
compile the external Ruby parser and bring it into Smalltalk.
Afterwards, we can use the Refactoring Browser Smalltalk parser plus a
little runtime type inferencing to incrementally transform the image
into pure Ruby.

We can do all of this without changing the structure of Ruby files &
Modules or requiring Rubyists to do Smalltalk style image oriented
development. And for those of you sharp enough to wonder: yes, we can
handle Modules, Mixins, and fully qualified Method names without
changing the Smalltalk VMs. (At least those that have Namespaces.)

If anyone is interested, please drop me a line.
I'm interested; sounds like a great idea. However, I haven't been very
lucky turning up (free) info about the ("commercial") Smalltalk VM's
and their instruction sets etc (apart from the classic smalltalk
papers and I guess squeak where everything is open). Do you have good
pointers to such info?

Best,

Robert Feldt
(e-mail address removed)
 
Avi Bryant

Sorry, my last post was meant to be in reply to Lothar's request for a
VM spec. I still haven't got the hang of this new google groups UI...

Avi
 
Lothar Scholz

Hello Avi,

AB> Sorry, my last post was meant to be in reply to Lothar's request for a
AB> VM spec. I still haven't got the hang of this new google groups UI...

Yes, nice, but this is Squeak. It seems that this is just a bytecode
machine without a JIT. Is there any document available for VisualWorks?
 
Lothar Scholz

Hello Lothar,

LS> Hello Avi,

AB>> Sorry, my last post was meant to be in reply to Lothar's request for a
AB>> VM spec. I still haven't got the hang of this new google groups UI...

LS> Yes, nice, but this is Squeak. It seems that this is just a bytecode
LS> machine without a JIT. Is there any document available for VisualWorks?

I just found this paragraph and then stopped reading the document:

---------------------------
In a typical system it often turns out that the same message is sent to instances of the
same class again and again; consider how often we use arrays of SmallInteger or
Character or String. To improve average performance, the VM can cache the
found method. If the same combination of the method and the receiver's class are
found in the cache, we avoid a repeat of the full search of the MethodDictionary
chain. See the method Interpreter > lookupInMethodCacheSel:class:
for the implementation.
VisualWorks and some other commercial Smalltalks use inline cacheing, whereby the
cached target and some checking information is included inline with the dynamically
translated methods. Although more efficient, it is more complex and has strong
interactions with the details of the cpu instruction and data caches.
---------------------------
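
As a rough illustration of the global method cache the excerpt describes, here is a toy model in Ruby. ToyClass and ToyVM are invented for this sketch and have nothing to do with Squeak's actual Interpreter code; the point is only that the slow walk up the class chain runs once per (class, selector) pair and later sends hit the cache.

```ruby
# A toy class: a method dictionary plus a link to its superclass.
class ToyClass
  attr_reader :name, :superclass, :method_dict

  def initialize(name, superclass, method_dict)
    @name = name
    @superclass = superclass
    @method_dict = method_dict
  end
end

class ToyVM
  attr_reader :full_searches

  def initialize
    @cache = {}          # [class, selector] => method found by full search
    @full_searches = 0
  end

  # Fast path: consult the cache; fall back to the full search on a miss.
  def lookup(klass, selector)
    key = [klass, selector]
    @cache.fetch(key) { @cache[key] = full_search(klass, selector) }
  end

  # Slow path: walk the chain of method dictionaries up the superclasses.
  def full_search(klass, selector)
    @full_searches += 1
    current = klass
    while current
      found = current.method_dict[selector]
      return found if found
      current = current.superclass
    end
    nil
  end

  def send_message(receiver_class, selector, *args)
    meth = lookup(receiver_class, selector)
    raise NoMethodError, "#{receiver_class.name} does not understand #{selector}" unless meth
    meth.call(*args)
  end
end

object_class  = ToyClass.new("Object", nil, { describe: -> { "an object" } })
integer_class = ToyClass.new("Integer", object_class, { double: ->(n) { n * 2 } })

vm = ToyVM.new
3.times { vm.send_message(integer_class, :double, 21) }
puts vm.full_searches   # the chain was walked only for the first send
```

Inline caching, as mentioned for VisualWorks, moves the same check into the translated call site instead of a shared table; this sketch does not attempt to model that.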

So message dispatching with Squeak is not much more efficient than what
I expect from YARV: a simple bytecode dispatcher with dynamic method
lookup tables and a small lookup cache. It now seems to be a simple question
of the GC, and there I would again vote for the Boehm-Demers-Weiser GC, which
is a quite fast incremental GC that works on all popular platforms and is much
easier to integrate (even with typing hints) into "gc.c" than your project.

Did you ever think about the legal problems of your project when
using the VisualWorks engine? I took a short look at the Cincom webpage and
I guess it is as expensive as ever, which means > 5000 US$ per
license. I hate it when the guys are not even publishing their prices
without a contact form.
 
Avi Bryant

Robert said:
Great, thanks Avi. Can I assume the commercial VMs are similar to this
or do you know of detailed descriptions of them?

I don't know of any detailed descriptions. Apart from the obvious fact
that they have JITs and the stock Squeak engine doesn't, one difference
I know is that VisualWorks still uses an object table rather than
direct object references, which makes #become: much faster and probably
helps with garbage collection. However, for the purpose of knowing how
to model Ruby on top of the VM (the object format, classes and method
dictionaries, how context stacks work), they should all be very
similar, at least on the surface - never mind what optimizations are
going on below.
BTW, wasn't Peter Deutsch himself onto a similar project for Python:

http://blog.amber.org/2004/08/12/python-on-smalltalks-vm/

Yes, but that's a much harder problem, since Python has very different
semantics from Smalltalk, whereas Ruby's are for most purposes
identical.

Avi
 
Peter Suk

Did you ever think about the legal problems of your project when
using the VisualWorks engine?

Yes. I used to work for them.
I took a short look at the Cincom webpage and
I guess it is as expensive as ever, which means > 5000 US$ per
license. I hate it when the guys are not even publishing their prices
without a contact form.

Yes. I wish they'd cut that out, but they can be quite "old-school"
software industry-wise.

The main purpose for the VisualWorks version is for commercial server
images that have to run fast. Certain organizations will prefer to
have a product with official support and will pay a premium for this
and speed. If/When this happens, I'll license their Object Engine as a
VAR, and pass on the licensing costs. (Also, they are looking for new
products to sell.) Otherwise, the Ruby image can just be a fast
platform for education and academic research in Ruby just as it is for
Smalltalk. Also, it is the platform that I know best, so it is a good
place for me to start. For purposes of just having the Refactoring,
Debugging, Browsing environment, people who want free as in speech will
probably opt for hosting on Squeak.

--Peter
 
Matt Lawrence

I'm not *building* a Ruby VM on top of a Smalltalk VM. The Smalltalk VM will
be allocating Objects and JIT-ing bytecodes natively. There is almost no
Smalltalk in the Smalltalk VMs, and what semantics there are are a very good
fit for Ruby anyhow. All I'm doing is providing enough meta-Ruby to enable
the Smalltalk VMs to run Ruby (including a Ruby compiler written in Ruby).

(But this language defined in itself is a weird concept to most programmers,
and I'll probably go blue in the face repeating this.)

Makes sense to me. All in all, this sounds like a really cool project and
I'm looking forward to trying it.

My Ruby skills are still a bit weak, but I know a lot about various
low-level architectures. Koichi's talk last year was very interesting.

-- Matt
Nothing great was ever accomplished without _passion_
 
Glenn Parker

Avi said:
Not exactly as Ruby does, but you can simply add new instance variables
to the class definition whenever you compile a method that uses a new
one.

Isn't that jumping the gun just a bit? An instance variable (in Ruby)
should not exist in an object until the line that assigns/creates it is
actually executed. It's a subtle point, but it could impact some types
of reflective programming. Maybe you can mask its existence somehow?
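
A short demonstration of the point in stock Ruby (the Account class is invented for the example): the instance variable @balance appears in the source of deposit, yet reflection does not report it until the assignment has actually run.

```ruby
class Account
  def initialize
    @owner = "nobody"
  end

  # @balance is mentioned here, but it does not exist in an instance
  # until this method has been executed on that instance.
  def deposit(amount)
    @balance = (@balance || 0) + amount
  end
end

acct = Account.new
p acct.instance_variables                       # only @owner so far
p acct.instance_variable_defined?(:@balance)    # not yet
acct.deposit(10)
p acct.instance_variables                       # now @balance is listed too
```

Any scheme that allocates the slot at compile time has to hide it from instance_variables and instance_variable_defined? until this moment.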
 
George

Avi / Peter -- thanks for providing the feedback I was looking for.
Ruby and Smalltalk do indeed sound like a good match. I remember Robert
Feldt saying a few years ago that he'd done some experiments with Ruby
on Smallscript's VM that looked promising.

From now on, if you are completely ignorant of Smalltalk,
I will cease answering these questions here.

Yikes! I _am_ almost completely ignorant of Smalltalk (I know the
basic concepts of the language, but I've never programmed in it), but
I've done some work on VMs and was genuinely interested in the answers.
 
Robert Feldt

Avi / Peter -- thanks for providing the feedback I was looking for.
Ruby and Smalltalk do indeed sound like a good match. I remember Robert
Feldt saying a few years ago that he'd done some experiments with Ruby
on Smallscript's VM that looked promising.
That's right George. However, I concluded Smallscript was not open
enough to build on at that time and haven't followed it since then.
Also I just did some ruby2smallscript (source to source) translation
experiments so it was a different approach.

Best,

Robert
 
Avi Bryant

Glenn said:
Isn't that jumping the gun just a bit? An instance variable (in Ruby)
should not exist in an object until the line that assigns/creates it is
actually executed. It's a subtle point, but it could impact some types
of reflective programming. Maybe you can mask its existence somehow?

Yes, pretty easily I'd think. For example: when you create a new
instance in Smalltalk, all the instance variables start out initialized
to Smalltalk's nil value (of class UndefinedObject). As soon as it was
referenced or assigned to, that would get replaced with some Ruby value
(possibly Ruby's nil, of class NilClass). So you could always tell for
a given instance which instance variables already "exist" inside the
Ruby semantics.

You could also just use a hashtable for each instance to hold all of
the variables, like Ruby does, but the fact that Smalltalk can do
direct instance variable access ends up being a nice speed and memory
gain, so I'd rather not give that up.
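
The masking idea can be modelled in a few lines of Ruby. This is only a sketch under assumptions: SlotObject, UNSET and the ivar_* method names are hypothetical, UNSET stands in for Smalltalk's nil in a freshly allocated slot, and the sketch treats a slot as "existing" only after an assignment (matching Ruby's behaviour), which is slightly stricter than "referenced or assigned" above.

```ruby
class SlotObject
  # Stand-in for Smalltalk's nil: marks a slot no Ruby code has assigned yet.
  UNSET = Object.new.freeze

  # slot_names: the ivars a (hypothetical) compiler saw in the class's methods.
  def initialize(slot_names)
    @index = slot_names.each_with_index.to_h   # name => fixed slot number
    @slots = Array.new(slot_names.length, UNSET)
  end

  def ivar_set(name, value)
    @slots[@index.fetch(name)] = value
  end

  # Reading an unassigned slot yields Ruby's nil but does not define it.
  def ivar_get(name)
    value = @slots[@index.fetch(name)]
    value.equal?(UNSET) ? nil : value
  end

  def ivar_defined?(name)
    !@slots[@index.fetch(name)].equal?(UNSET)
  end

  # Only slots that were assigned are visible to reflection.
  def ivar_names
    @index.keys.select { |name| ivar_defined?(name) }
  end
end

obj = SlotObject.new([:@x, :@y])
p obj.ivar_names          # nothing assigned yet
obj.ivar_set(:@x, nil)    # assigning Ruby's nil still counts as defining
p obj.ivar_names
```

Access stays a direct indexed load, and the distinction between "slot allocated" and "variable exists" costs one identity comparison.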

Avi
 
Patrick Down

What about systems like Rails where members are added to the class
definition at runtime from a database definition? (I assume that's how
it works.)
 
Glenn Parker

Avi said:
Yes, pretty easily I'd think. For example: when you create a new
instance in Smalltalk, all the instance variables start out initialized
to Smalltalk's nil value (of class UndefinedObject). As soon as it was
referenced or assigned to, that would get replaced with some Ruby value
(possibly Ruby's nil, of class NilClass). So you could always tell for
a given instance which instance variables already "exist" inside the
Ruby semantics.

You could also just use a hashtable for each instance to hold all of
the variables, like Ruby does, but the fact that Smalltalk can do
direct instance variable access ends up being a nice speed and memory
gain, so I'd rather not give that up.

The Smalltalk Ruby will still need to handle more dynamic methods of
instance variable creation:

class MyClass
  def add_ivar(name)
    instance_variable_set(name, nil)
  end
end

What happens to instances that have already been created when a new
instance variable is seen by the compiler?

There are also issues with using more memory than necessary if the
interpreter creates every instance variable the moment it is observed by
the compiler.

I'm guessing Ruby instance variables will have to be created dynamically.
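
One possible middle ground, sketched in Ruby purely as a thought experiment (GrowableLayout, Instance, UNSET and the method names are all hypothetical): the class keeps a growable name-to-slot map, and each instance extends its own slot array only when it actually assigns a variable. Instances created before a new variable was seen are left untouched and simply report it as undefined.

```ruby
class GrowableLayout
  UNSET = Object.new.freeze   # marks a padded slot that was never assigned

  def initialize
    @index = {}               # ivar name => slot number, shared by instances
  end

  # Assign the next free slot the first time a name is seen.
  def slot_for(name)
    @index[name] ||= @index.size
  end

  # Slot number, or nil if the name was never seen.
  def slot_of(name)
    @index[name]
  end

  def new_instance
    Instance.new(self)
  end

  class Instance
    def initialize(layout)
      @layout = layout
      @slots = []             # grows lazily, per instance
    end

    def ivar_set(name, value)
      i = @layout.slot_for(name)
      @slots << UNSET while @slots.length < i   # pad any gap before slot i
      @slots[i] = value
    end

    def ivar_defined?(name)
      i = @layout.slot_of(name)
      !i.nil? && i < @slots.length && !@slots[i].equal?(UNSET)
    end

    def ivar_get(name)
      ivar_defined?(name) ? @slots[@layout.slot_of(name)] : nil
    end
  end
end

layout = GrowableLayout.new
old = layout.new_instance
old.ivar_set(:@a, 1)
newer = layout.new_instance
newer.ivar_set(:@b, 2)             # a name the layout had not seen before
p old.ivar_defined?(:@b)           # the older instance is unaffected
```

This still wastes padded slots for low-numbered variables an instance never sets, so it only partly answers the memory concern.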
 
