Threading Performance?

Jeff McNeil

Greetings.

I apologize if this has been answered somewhere obvious. I did a fair
bit of Googling prior to piping up. If this has been addressed,
please feel free to simply point and grunt!

In a nutshell, I have concerns surrounding the Ruby threads
implementation as it appears to be a home-grown system. How
would performance stack up against FreeBSD 5.x KSE threads? I've run
into problems with the Python dummy_thread module before with respect
to performance, especially when dealing with intensive IO. I have a
bit of a fear that I'll have the same issue here.

I've been tasked with writing a dynamic HTTPS/HTTP gateway of sorts,
and as such, it ought to get quite busy in the network IO
department. I'd like to take advantage of thread pooling and whatnot.

I'd really love to use Ruby as I'm quickly falling in love with it -
what a nice language.

Thoughts?

Jeff
 
Rick Nooner

I wrote a data collection system that gathers statistics from over
2000 servers every five minutes, 24/7. This involves heavy network
usage as well as heavy disk usage.

I also had a 4 processor box (a Sun E4500). Ruby threads cannot
take advantage of a multiprocessor server.

In order to have each collection cycle finish within the 5 minute
window allotted, all 4 processors must be fully utilized.

The architecture that I settled on was similar to a threaded worker
pool, except instead of threads I used processes with the main
process acting as the scheduler and the child processes reading
work tasks from a distributed queue (Ruby makes this easy with
Rinda). This allows scaling either by adding more processors
(and processes) to the server or, because of Rinda (and DRb),
simply by adding more collection servers.
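
A minimal sketch of this kind of architecture, not the production
system described above. The tuple layout ([:task, host] and
[:result, host, data]), the collect method, and the loopback URI are
assumptions made for illustration. The workers are forked before the
DRb service starts so that they do not inherit a running server.

```ruby
require 'drb/drb'
require 'rinda/tuplespace'

# Hypothetical stand-in for the real work (e.g. polling one server).
def collect(host)
  "stats for #{host}"
end

# Worker process: take [:task, host] tuples, write [:result, host, data].
def worker_loop(uri)
  space = DRb::DRbObject.new_with_uri(uri)
  loop do
    _, host = space.take([:task, nil]) # blocks until a task is available
    break if host == :stop
    space.write([:result, host, collect(host)])
  end
end

# Scheduler process: owns the tuple space and hands out the work.
def run_pool(hosts, worker_count = 4)
  # Fork first; each worker waits on its own pipe for the scheduler's URI.
  workers = Array.new(worker_count) do
    reader, writer = IO.pipe
    pid = fork do
      writer.close
      worker_loop(reader.gets.chomp)
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    reader.close
    [pid, writer]
  end

  space = Rinda::TupleSpace.new
  DRb.start_service('druby://127.0.0.1:0', space) # port 0: any free port
  workers.each do |_, writer|
    writer.puts(DRb.uri)
    writer.close
  end

  hosts.each { |host| space.write([:task, host]) }
  results = {}
  hosts.size.times do
    _, host, data = space.take([:result, nil, nil])
    results[host] = data
  end

  # One stop tuple per worker, then reap the children.
  worker_count.times { space.write([:task, :stop]) }
  workers.each { |pid, _| Process.wait(pid) }
  DRb.stop_service
  results
end
```

Calling run_pool(%w[alpha beta gamma], 2) returns a hash keyed by
host. Binding the service to a routable address instead of loopback
would let workers on other machines take tuples from the same space.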

So far, this solution has been running over a year and a half
on the single 4 processor server with no unscheduled down time.

My take is: don't use the Ruby threading model for
network-intensive tasks. Rather, think about using DRb
and process-level parallelism. You might be surprised
at how well it works and how scalable this makes your
system.

Rick
 
Jeff McNeil

Interesting. I'll do a bit of comparison work between that approach
and native Python "threading" support. I'd assume the DRb approach to
be slower. Just out of curiosity, has any work been done towards
improving the threading system? The current system is great for
systems that might not fully support threading out of the box, but
I'd really like to see support for the POSIX threads at the system
level.

-Jeff
 
Rick Nooner

I hope we get native threads at some point. It would be nice for
things like you're talking about. I also hope that the threading
won't be like Python using a single global lock.

I've done quite a bit of work with Python and Python threading.
While it should be better than Ruby at using a multi-processor
system and at I/O of all forms, its global interpreter lock really
reduces performance when running threaded code. You either have to
use a process methodology like I described earlier or write the
threaded code in C/C++.

I have written several large systems in Python over the past 10
years, using Python threads, process-level parallelism, and threaded
C/C++ libraries with Python for the higher-level logic. The C/C++
libraries proved to be by far the fastest. The same ideas could be
used with Ruby.

Caveat: I haven't really used Python for about 2 years, so I don't
know what the current state of the art is there.

A good point about Ruby threads is that they are rock solid
and can be used in much greater numbers than native threads
in many circumstances.

Rick
 
Jos Backus

Jeff McNeil wrote:
> In a nutshell, I have concerns surrounding the Ruby threads
> implementation as it appears to be a home-grown system. How
> would performance stack up against FreeBSD 5.x KSE threads? I've run
> into problems with the Python dummy_thread module before with respect
> to performance, especially when dealing with intensive IO. I have a
> bit of a fear that I'll have the same issue here.
> ...

Have you checked out Sydney?

http://blog.fallingsnow.net/

An overview of Sydney is at:

http://blog.fallingsnow.net/articles?page=2

Hopefully Evan's modifications will make it into Matz' Ruby at some point
soon, before this task becomes too difficult.
 
nobuyoshi nakada

Hi,

At Fri, 30 Sep 2005 15:07:56 +0900,
Jos Backus wrote in [ruby-talk:158365]:
> An overview of Sydney is at:
>
> http://blog.fallingsnow.net/articles?page=2
>
> Hopefully Evan's modifications will make it into Matz' Ruby at some point
> soon, before this task becomes too difficult.

It is too radical to be incorporated into the stable version, but a
version against the CVS trunk is not available.

Instead, Sasada is struggling with pthread.
 
Paul

I've had similar experiences. As far as I remember from when we had a
peek under the hood to check the thread implementation, it's based on a
select() model. Ruby threads are not really threads in the sense of
scheduled, time-sliced processes.
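
To illustrate what a select() model means in practice, here is a
minimal sketch of multiplexing several IO objects from a single flow
of control. It is not the interpreter's actual scheduler; the drain
method and the 4096-byte chunk size are assumptions chosen for the
example.

```ruby
# Read every IO in `readers` to EOF, servicing whichever one is ready.
# Returns the collected data, in the same order as `readers`.
def drain(readers)
  chunks = Hash.new { |hash, io| hash[io] = +'' }
  pending = readers.dup
  until pending.empty?
    ready, = IO.select(pending) # sleeps until at least one IO is readable
    ready.each do |io|
      begin
        chunks[io] << io.read_nonblock(4096)
      rescue IO::WaitReadable
        next # not actually ready; go back to select
      rescue EOFError
        pending.delete(io) # this source is finished
      end
    end
  end
  readers.map { |io| chunks[io] }
end
```

Nothing here runs in parallel: while one chunk is being processed, no
other IO is serviced, which is why a CPU-bound step stalls everything
else in a model like this.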

In my case, we were writing something similar (an HTTP decoding engine)
and handled the network IO with Ruby threads (because that's what
select() is really good for), but then handled decoding in a forked child
(because it's all CPU, it would have blocked the other Ruby 'threads').
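
A minimal sketch of that split, assuming a pipe plus Marshal to get
the result back from the child. The in_child helper and the decode
method are hypothetical stand-ins, not the code we actually ran.

```ruby
# Run the given block in a forked child and return its result.
# The result must be something Marshal can serialize.
def in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield))
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close
  data = reader.read # waits for the child to finish writing
  reader.close
  Process.wait(pid)
  Marshal.load(data)
end

# Hypothetical CPU-bound step; a real decoder would go here.
def decode(raw)
  raw.bytes.sum
end

# One Ruby thread per payload waits on its child, while the decoding
# itself runs in separate processes.
def decode_all(payloads)
  payloads.map { |raw| Thread.new { in_child { decode(raw) } } }.map(&:value)
end
```

Forking once per payload keeps the sketch short; a long-lived pool of
child processes would avoid paying the fork cost on every request.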
 
Jos Backus

nobuyoshi nakada wrote:
> It is too radical to be incorporated into the stable version, but a
> version against the CVS trunk is not available.

Agreed. This should strictly be a HEAD endeavor, not stable.

> Instead, Sasada is struggling with pthread.

Evan: is there a Sydney patch available against HEAD?
 
Jeff McNeil

Yeah, a lot like pre-KSE FreeBSD threads, implemented via poll(). I'm
not an expert in the area, but I believe a lot of user space
threading libraries are implemented this way (people used to run some
apps under a linked-in LinuxThreads as an alternative).

I can work around it using a different design approach, more in tune
with what Rick suggested.

-Jeff
 
