Concurrent Ruby?


Kyle Murphy

Apologies if this is a really stupid question, I am new to programming,
but after reading about Erlang and its speed increase on multi-core
devices I had to ask.

With Matz supposedly making Ruby 2.0 right now, is it possible to make
it concurrent like Erlang so as to take advantage of the future
multi-core devices? Thank you.
 

David Masover

With Matz supposedly making Ruby 2.0 right now, is it possible to make
it concurrent like Erlang

Not like Erlang, no.

Erlang does a couple of things differently. The most obvious one, which makes
it so scalable, is the message-passing -- Erlang uses "processes" and
message-passing almost as a programming paradigm. We talk
about "Object-Oriented Programming"; Erlang people talk
about "Concurrency-Oriented Programming".

Erlang processes are much easier to write and scale than threads, and a
program built from many of them performs much better than a single thread.

There are a few of us working to rectify this situation, at least
semantically -- there's Revactor, Dramatis, and my own unreleased project
which I've been wasting a few weekend hours on.

Another reason, which I'm running into while working on the above project, is
that Erlang has no mutable data. It even goes so far as to make variables
single-assignment, which is just annoying, but the data structures themselves
are never changed. Take a simple (contrived) Ruby example:


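def some_function(options={})
  # ||= writes into the caller's hash, changing it in place
  options[:foo] ||= 'Foo'
  options[:bar] ||= 'Bar'
  options[:foobar] ||= options[:foo] + options[:bar]

  # some_file stands in for any open File object
  some_file.each_line do |line|
    line.chomp!                       # mutates the string in place
    line.gsub!(/curses/i, '******')   # so does this
    puts line
  end
end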


See, we're changing things. Arrays, strings, whatever -- it's actually the
characters inside the string that are changing.
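To make the point about in-place change concrete, here is a small sketch (the variable names are made up for illustration). Two variables refer to one String, so a bang method called through one is visible through the other, while the non-bang form builds a new String instead:

```ruby
# Two names, one String object. .dup is used so the example does not
# depend on whether string literals are frozen in your Ruby version.
original  = "curses, foiled again".dup
alias_ref = original                   # no copy is made here

alias_ref.gsub!(/curses/i, '******')   # mutates the shared object

puts original                          # the "other" variable changed too
puts original.equal?(alias_ref)        # same object identity

# The non-bang form leaves the receiver alone and returns a new String.
untouched = "curses, foiled again".dup
censored  = untouched.gsub(/curses/i, '******')

puts untouched
puts censored
```

This aliasing is what makes sharing Ruby data between threads or actors risky: nobody has to ask permission before changing an object someone else also holds.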

In Erlang, (almost) no data ever changes, you just create new data. Which
means that when you send a message to another process, it's as simple as
sending a pointer across -- which means it's not only a constant-time
operation, it's an absurdly cheap constant-time operation. So the data is
shared, but because it never changes, you don't have to lock it.
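A rough Ruby analogue of "send a pointer, never lock" can be sketched with a Queue and a frozen object. This is only an illustration under assumptions: the names mailbox, reply and receiver are invented here, and freeze is shallow, so the strings inside the array stay mutable.

```ruby
require 'thread'   # Queue; needed on older Rubies, harmless on newer ones

mailbox = Queue.new   # messages to the receiver
reply   = Queue.new   # reports back from the receiver

receiver = Thread.new do
  msg = mailbox.pop
  reply.push(msg.object_id)    # report which object arrived
  begin
    msg << "tampered"          # try to change the message
    reply.push(:mutated)
  rescue RuntimeError          # FrozenError is a RuntimeError on modern Rubies
    reply.push(:refused)
  end
end

message = ["hello", "world"].freeze
mailbox.push(message)          # pushes a reference, not a copy

same_object = (reply.pop == message.object_id)
outcome     = reply.pop
receiver.join

puts "same object on both sides: #{same_object}"
puts "receiver's attempt to mutate: #{outcome}"
```

Pushing onto the Queue is cheap because only a reference travels, which is the Erlang-like part. The safety, though, comes only from the explicit freeze, and only for the outer array.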

Which means that in Erlang, message-passing is so cheap we don't have to worry
about it. If we ported the message-passing to Ruby, it's either unreliable or
it's massively expensive and still somewhat unreliable. I'm not sure there's
a good way around this, though if there is, I intend to find it.

so as to take advantage of the future
multi-core devices? Thank you.

This might happen -- maybe, sort of. Keeping all of the above in mind,
threading in Ruby is modeled after the traditional C and Java model, which
means they're probably more expensive to create, and certainly more
dangerous, which means there won't be as many of them.

On top of all that...

Right now, Ruby shares a problem with Python called the GIL -- the Global (or
Giant) Interpreter Lock. What this means is that only one Ruby instruction
may execute at a time. So even though they're using separate OS threads, and
even though different Ruby threads might run on different cores, the speed of
your program (at least the Ruby part) is limited to the speed of a single
core.

The standard response, which you'll probably already see (since I'm taking the
time to write a longer answer), is that you can do threading in two ways:
Either fork off a whole new Ruby process, so you probably can't have any
shared-memory problems -- and/or write the expensive parts in C, and have
your C extension release the Ruby GIL.

(See, you can have more than one bit of C code running in a Ruby program at
once, even alongside all the Ruby stuff -- at least until they need to do
something with Ruby itself.)
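The "fork off a whole new Ruby process" option can be sketched roughly as follows. It assumes a Unix-like platform where fork is available; fork_job and collect are hypothetical helper names, and results come back through a pipe via Marshal:

```ruby
# Run a block in a child process; return the child's pid and the read
# end of a pipe that will carry the block's marshalled result.
def fork_job
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode
  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield))
    writer.close
    exit!(0)   # leave without running at_exit handlers inherited from the parent
  end
  writer.close
  [pid, reader]
end

# Wait for one child and decode what it sent back.
def collect(pid, reader)
  result = Marshal.load(reader.read)
  reader.close
  Process.wait(pid)
  result
end

# Both children are started before either result is read, so the two
# halves of the work can run on different cores.
jobs = [1..50_000, 50_001..100_000].map do |range|
  fork_job { range.inject(0) { |sum, n| sum + n * n } }
end

total = jobs.map { |pid, reader| collect(pid, reader) }.inject(:+)
puts total
```

Each child has its own interpreter and its own GIL, so nothing is shared. The price is that every result has to be serialized and copied back to the parent.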

There's also JRuby, which uses Java's native threads, and has no GIL. There
have been some problems with them lately, but they should work -- but again,
keep all of the above in mind. You'll be threading as well as Java does, not
as well as Erlang does.

As you can probably tell, I'm not really happy about all of this.

Now, unlike Python, it looks as though the Ruby GIL might eventually be
removed. And there is JRuby. And there's the various actor projects (mine
included). So it's conceivable that we'd get Ruby scalable to arbitrary
numbers of processors.

But again, I suspect Erlang is still going to do it better, if all you care
about is multicore and efficiency. (Ruby is doing a better job of Unicode,
has much more library support, and I much prefer its syntax.)
 

Florian Gilcher

a long mail

Nice writeup. You forgot one thing about Erlang, though: It is
(mostly) sideeffect-free while
object orientated languages always rely on sideeffects.
This makes it harder when it comes to concurrency.

Regards,
Skade
 

Robert Klemme

2008/7/29 Florian Gilcher said:
Nice writeup.

Absolutely agree. Thanks David!

You forgot one thing about Erlang, though: It is (mostly)
sideeffect-free while

Well, he said that data does not change which is basically the same.

object orientated languages always rely on sideeffects.

I'd rather say "usually" because immutable classes are quite common.

This makes it harder when it comes to concurrency.

Obviously.

Kind regards

robert
 

David Masover

Nice writeup.

Thanks!

You forgot one thing about Erlang, though: It is
(mostly) sideeffect-free while
object orientated languages always rely on sideeffects.

If I understand it right, side effects in Erlang simply take a different form.
Nothing's stopping me from sending random, spurious messages in the middle of
a supposedly-innocuous function.

I did talk about data not being mutable, which provides both a semantic
(lock-free) and a technical advantage (raw speed).

I'm trying to figure out how to at least partly duplicate the semantic
advantage in Ruby, but it's not easy -- I'm stuck either #freeze-ing
everything, or wrapping every message in an actor of its own, and both
approaches seem more obnoxious and error-prone than forcing the developer to
deal with it.
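For reference, the "#freeze everything" route might look something like this sketch. deep_freeze is a hypothetical helper, not part of Ruby's core; it assumes messages are built only from Hashes, Arrays and leaf objects, and that they contain no cycles:

```ruby
# Recursively freeze a message made of Hashes, Arrays and leaf objects.
# Assumes no cyclic references; a cycle would recurse forever.
def deep_freeze(obj)
  case obj
  when Hash
    obj.each do |key, value|
      deep_freeze(key)
      deep_freeze(value)
    end
  when Array
    obj.each { |element| deep_freeze(element) }
  end
  obj.freeze
end

# .dup keeps the example independent of frozen-string-literal settings.
message = deep_freeze({ :grid  => [["a".dup, "b".dup], ["c".dup]],
                        :label => "demo".dup })

begin
  message[:grid][0][0] << "!"   # any in-place change now raises
  outcome = :mutated
rescue RuntimeError             # FrozenError is a RuntimeError on modern Rubies
  outcome = :refused
end

puts outcome
```

It works, but it shows why the approach feels obnoxious: the sender permanently loses the ability to modify its own data, and every object type not listed in the case statement needs separate thought.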
 

Charles Oliver Nutter

David said:
There's also JRuby, which uses Java's native threads, and has no GIL. There
have been some problems with them lately, but they should work -- but again,
keep all of the above in mind. You'll be threading as well as Java does, not
as well as Erlang does.

I'm not sure what you mean by problems...there have not been problems
with them lately; they work as you'd expect native threads to work. They
do require a bit more diligence on your part if you're sharing data
across the threads, since for performance reasons we don't do any extra
synchronization of e.g. Array, Hash, String. But native threads work
fine on JRuby.
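The "bit more diligence" usually means guarding shared collections yourself. A minimal sketch using the standard Mutex (the names results, lock and workers are made up for illustration):

```ruby
require 'thread'   # Mutex; needed on older Rubies, harmless on newer ones

results = []       # shared between all worker threads
lock    = Mutex.new

workers = (1..4).map do |id|
  Thread.new do
    250.times do |i|
      # Only one thread at a time may touch the shared Array.
      lock.synchronize { results << [id, i] }
    end
  end
end

workers.each(&:join)
puts results.size   # 4 threads x 250 appends each
```

On an implementation with truly parallel threads and unsynchronized core classes, dropping the synchronize call leaves the outcome undefined, which is the kind of bug that only shows up under load.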

- Charlie
 

ara.t.howard

I'm trying to figure out how to at least partly duplicate the semantic
advantage in Ruby, but it's not easy -- I'm stuck either #freeze-ing
everything, or wrapping every message in an actor of its own, and both
approaches seem more obnoxious and error-prone than forcing the developer to
deal with it.

fan out multiple processes with a message queue each - easy to do with
drb. naive impl:


cfp:~> ruby a.rb
b got "hello" (pid=94677)
a got "hello" (pid=94676)




cfp:~> cat a.rb

a =
  actor {
    recv_msg { |msg|
      puts "a got #{ msg.inspect } (pid=#{ Process.pid })"
    }
  }


b =
  actor {
    recv_msg { |msg|
      puts "b got #{ msg.inspect } (pid=#{ Process.pid })"
      a.send_msg msg
    }
  }


b.send_msg 'hello'

STDIN.gets



BEGIN {

  require 'rubygems'
  require 'thread'
  require 'drb'
  require 'slave'

  # An Actor owns a mailbox (Queue) and a thread that runs its block.
  class Actor
    include ::DRb::DRbUndumped   # pass across DRb by reference, not by copy

    def initialize(&block)
      @q = Queue.new
      @block = block
      act!
    end

    def act!
      @thread = Thread.new do
        Thread.current.abort_on_exception = true
        instance_eval(&@block)
      end
    end

    def send_msg(message)
      @q.push message
    end

    # Blocks on the mailbox and yields each message as it arrives.
    def recv_msg
      while(( message = @q.pop ))
        yield message
      end
    end
  end

  # Start each Actor in its own process (via the slave gem) and return a
  # DRb handle to it.
  def actor(*a, &b)
    Slave.new{ Actor.new(*a, &b) }.object
  end

  STDOUT.sync = true

}


a @ http://codeforpeople.com/
 

David Masover

fan out multiple processes with a message queue each - easy to do with
drb.

That implies a full copy (I think), which isn't always what's needed.
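As far as I know, DRb serializes ordinary arguments with Marshal, so by-value passing amounts to a Marshal round trip. A sketch of what that does to a nested structure (grid and copy are illustrative names, and no DRb server is involved here):

```ruby
# A Marshal round trip stands in for what happens to an argument that is
# passed by value across a process boundary.
grid = [["a", "b"], ["c", "d"]]
copy = Marshal.load(Marshal.dump(grid))

copy[0][0] << "!"              # change the copy only

puts grid.inspect              # the original is untouched
puts copy.inspect
puts copy[0].equal?(grid[0])   # inner arrays are distinct objects as well
```

So nested arrays do survive, but every level is duplicated, and the cost grows with the size of the message rather than staying constant.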

Without actually testing your implementation, what happens when I send, say, a
reference to an actor? (Kind of an essential feature.)

And without actually doing any benchmarks (how's that for naive?), I still
find it hard to believe that DRb+Queue would scale better than Thread+Queue,
for large numbers of actors. (Keep in mind, it's not unusual for an Erlang
program to have thousands of processes.)

Given that I still have a vague hope that YARV will eventually remove the GIL,
I'd rather stick to Threads, if I can make them safe.
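For comparison, a Thread+Queue actor that stays inside one process could be sketched like this. ThreadActor, STOP and stop are hypothetical names; send_msg is borrowed from the DRb example earlier in the thread. Nothing here protects the messages themselves from being mutated by both sides.

```ruby
require 'thread'   # Queue; needed on older Rubies, harmless on newer ones

# One thread and one mailbox per actor, all in the same process.
class ThreadActor
  STOP = Object.new   # sentinel that ends the actor's loop

  def initialize(&handler)
    @mailbox = Queue.new
    @thread = Thread.new do
      loop do
        message = @mailbox.pop           # blocks until a message arrives
        break if message.equal?(STOP)
        handler.call(message)
      end
    end
  end

  def send_msg(message)
    @mailbox.push(message)               # a reference is queued, not a copy
    self
  end

  def stop
    @mailbox.push(STOP)
    @thread.join
  end
end

log = Queue.new
a = ThreadActor.new { |msg| log.push("a got #{msg.inspect}") }
b = ThreadActor.new do |msg|
  log.push("b got #{msg.inspect}")
  a.send_msg(msg)                        # an actor reference is just an object
end

b.send_msg('hello')
b.stop   # b has forwarded the message by the time this returns
a.stop

lines = []
lines << log.pop until log.empty?
puts lines
```

Sending a reference to another actor is trivial here because everything lives in one address space. That is also exactly why the shared-data problem comes back.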
 

David Masover

I'm not sure what you mean by problems...there have not been problems
with them lately;

Maybe it wasn't actually "lately".

And there's still the rest of it:

They do require a bit more diligence on your part if you're sharing data
across the threads,

That's the whole problem that I'm attacking right now -- while a pure actor
model wouldn't share any data, I'm not even sure I can safely clone
everything properly, if I was going that route. And I'd rather not, for
obvious performance reasons.
 

ara.t.howard

That implies a full copy (I think), which isn't always what's needed.

Without actually testing your implementation, what happens when I send, say, a
reference to an actor? (Kind of an essential feature.)

DRb handles references. DRbUndumped provides a means to pass
references to remote objects around.

And without actually doing any benchmarks (how's that for naive?), I still
find it hard to believe that DRb+Queue would scale better than Thread+Queue,
for large numbers of actors. (Keep in mind, it's not unusual for an Erlang
program to have thousands of processes.)

no doubt that's true. processes can help you now though - especially
since threads don't scale right now in ruby with multi processor
machines.

Given that I still have a vague hope that YARV will eventually remove the GIL,
I'd rather stick to Threads, if I can make them safe.


sure, but if you want to burn up processors you simply have to use
processes attm.

you might find this interesting

http://groups.google.com/group/ruby...76?lnk=gst&q=threadify+jruby#0cbc4a86f2237476

a @ http://codeforpeople.com/
 

David Masover

On Jul 29, 2008, at 10:33 PM, David Masover wrote:

DRb handles references. DRbUndumped provides a means to pass
references to remote objects around.

Alright. What if I send a complex datastructure? Strings, I can live with, but
what about multidimensional arrays?

no doubt that's true. processes can help you now though - especially
since threads don't scale right now in ruby with multi processor
machines.

I believe work is going on to make Threads scale in 1.9 -- current 1.9 still
has a GIL, though.

They do scale in JRuby, and probably in IronRuby (haven't tried).

sure, but if you want to burn up processors you simply have to use
processes attm.

Or I could use JRuby. Or IronRuby.

I don't want to burn up processors atm. I want to build an architecture which
will be able to burn up processors in the future. I want to solve concurrency
on a single machine once and be done with it -- without having to use Erlang.

http://groups.google.com/group/ruby-talk-google/browse_thread/thread/b4e346478eeeead4/0cbc4a86f2237476?lnk=gst&q=threadify+jruby#0cbc4a86f2237476

From that link:

"the sync overhead is prohibitive
for in memory stuff"

I am, specifically, interested in doing in-memory stuff. If I can solve that
problem, I'm not as worried about the network stuff, especially as others
have already solved that well enough (DRb and friends).
 

Charles Oliver Nutter

David said:
That's the whole problem that I'm attacking right now -- while a pure actor
model wouldn't share any data, I'm not even sure I can safely clone
everything properly, if I was going that route. And I'd rather not, for
obvious performance reasons.

Well if there are specific threading issues, we'd like to solve them.
And at this very moment we're debating and working on ways to make the
core collection types (String, Array, Hash) at least not dump a stack
trace when they're used unsafely. So I think there's little reason why
you couldn't implement a decent Actor framework on top of JRuby.

Also, we recently added Rubinius's MVM API atop our existing MVM
support, so that's another route you can go and really isolate
instances. But of course, they eat up more memory that way.

- Charlie
 

Tony Arcieri


Apologies if this is a really stupid question, I am new to programming,
but after reading about Erlang and its speed increase on multi-core
devices I had to ask.

With Matz supposedly making Ruby 2.0 right now, is it possible to make
it concurrent like Erlang so as to take advantage of the future
multi-core devices? Thank you.

Rubinius is able to spawn a VM per CPU core, and allow quasi-Erlang style
concurrency using Actor objects which can communicate across inter-VM
message buses.

It's not as elegant as Erlang's SMP scheduler (something like that really
isn't possible without a shared-nothing process architecture), but it more
or less provides the same approach Erlang uses for distributed systems (i.e.
each CPU is a "node")
 

Charles Oliver Nutter

Tony said:
Rubinius is able to spawn a VM per CPU core, and allow quasi-Erlang style
concurrency using Actor objects which can communicate across inter-VM
message buses.

It's not as elegant as Erlang's SMP scheduler (something like that really
isn't possible without a shared-nothing process architecture), but it more
or less provides the same approach Erlang uses for distributed systems (i.e.
each CPU is a "node")

It's worth mentioning JRuby also supports the MVM API, and sub-VMs share
nothing with their parents save the message queue. Sub-VMs also are
launched in their own native thread (though of course JRuby has native
threads within a given VM as well). It wouldn't be much of a leap to
implement the Actor model as well.

- Charlie
 
