Ruby interpreter as client-server

MiG

I have more than 10 Ruby programs running on a server. They sleep most
of the time, but each of them consumes 2MB of memory.

My idea is a client-server Ruby interpreter: a single interpreter
running as a daemon (but not detached; I'm a fan of D.J. Bernstein's
daemontools :)))

The only thing that would change is the header of the scripts:
#!/usr/bin/ruby-client

The ruby-client would send its stdin to the server and get the output
back. I've tried to write this server in Ruby as a UNIXServer, using the
"load" method to run scripts, but I don't know how to redirect
stdin/stdout. I wrote a $stdin/$stdout wrapper, but it changes every
puts/gets/read/write/etc. globally. I need to buffer them separately in
each server thread (one per client).

I think it would be useful:
1. less memory usage when many scripts are running at once
2. fast: no interpreter startup time
3. can use Ruby in a chroot

Is it possible to write this in Ruby, or would it have to be a low-level
feature?

thx for ideas!

Jan Molic
 
Austin Ziegler

Look into DRb.

-austin
 
MiG

OK, I then tried to use DRb instead of my own implementation of "RPC",
but I still don't know how to redirect the stdin/stdout of each script
run on the server to the client's stdin/stdout. I don't think DRb can do
it; it is "RPC", not a stdin/stdout wrapper.

Jan Molic
 
Florian Gross

MiG said:
but I still don't know how to redirect the stdin/stdout of each script
run on the server to the client's stdin/stdout.

Can't you do something like this?

client_stdin, client_stdout, client_stderr = # Fetch them here over DRb
# You might have to use DRb::DRbUndumped

file = # Fetch over DRb
Thread.new do
  $stdin, $stdout, $stderr = client_stdin, client_stdout, client_stderr

  load(file)
end

Of course there are still ARGV, ENV and the various global variables
that can be set with Ruby command-line options ($DEBUG, $-w...), which
you'd also need to handle. I think all the globals are scoped per
thread, which means they should be easy to handle. I suppose you'd need
to introduce a new Module with custom constants for changing ARGV and so
on, however. (And that still won't work with ::ARGV, but I've never seen
that used anyway.)

Regards,
Florian Gross
 
MiG

I don't think so, because $stdout is a global variable. Assigning it
redirects stdout in all threads, even if the assignment happens inside
another module (I think a module only protects its namespace, not
global variables).
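To illustrate the point, here is a small sketch (the method name is made up for this example) showing that an assignment to $stdout made in one thread is seen by a different thread that never touched it:

```ruby
require 'stringio'

# Hypothetical helper: returns true if a $stdout assignment made inside one
# thread also affects output written from another thread.
def stdout_leaks_across_threads?
  original = $stdout
  buffer = StringIO.new
  # Thread A redirects $stdout and finishes.
  Thread.new { $stdout = buffer }.join
  # Thread B never assigned $stdout, yet its puts goes to the buffer.
  Thread.new { puts "hello from thread B" }.join
  buffer.string.include?("hello from thread B")
ensure
  $stdout = original   # always restore the real stdout
end

leaked = stdout_leaks_across_threads?
```

Wrapping the assignment in a module would not change this, since a global variable does not belong to any namespace.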
 
Austin Ziegler

MiG said:
I don't think so, because $stdout is a global variable. Assigning it
redirects stdout in all threads, even if the assignment happens inside
another module (I think a module only protects its namespace, not
global variables).

http://www.rubygarden.org/ruby?WriteToAString

Note: you can also rewrite your scripts to read from/write to a
provided input/output object instead of using $stdout explicitly. See
bin/ldiff in Diff::LCS.

-austin
 
MiG

I've tried exactly that :)

But there is still the problem that $stdout is a global variable, and
puts/print/etc. called from the loaded script use those global variables.

I don't know how to pass my wrapper object into the load method and
keep the buffers separate between threads at the same time.
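One approach that might work is to leave $stdout global but make it a dispatcher: a single object that forwards each write to whatever IO the current thread has registered. The sketch below is only an illustration; the class name ThreadLocalOut, the key :client_out and the helper capture_for_this_thread are all invented for this example.

```ruby
require 'stringio'

# Hypothetical stand-in for $stdout: forwards every call to the IO that
# the *current thread* has registered, or to the real stdout otherwise.
class ThreadLocalOut
  def initialize(default)
    @default = default
  end

  def target
    Thread.current[:client_out] || @default
  end

  # Kernel#puts and Kernel#print end up calling write on $stdout.
  def write(*args)
    target.write(*args)
  end

  # Everything else (flush, <<, sync=, ...) is passed through unchanged.
  def method_missing(name, *args, &block)
    target.send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    target.respond_to?(name, include_private)
  end
end

# Runs the block with this thread's output collected in its own buffer.
def capture_for_this_thread
  buffer = StringIO.new
  Thread.current[:client_out] = buffer
  yield
  buffer.string
ensure
  Thread.current[:client_out] = nil
end

real_stdout = $stdout
$stdout = ThreadLocalOut.new(real_stdout)

# Two "clients", each in its own thread, each with a separate buffer.
results = %w[alpha beta].map { |name|
  Thread.new { capture_for_this_thread { puts "output of #{name}" } }
}.map(&:value)

$stdout = real_stdout
```

This only catches output that goes through $stdout at the Ruby level. A script that writes to the STDOUT constant, or anything that writes to file descriptor 1 directly (a child process, a C extension), would bypass it, so it is not a fully compatible solution.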

Jan Molic
 
Bill Kelly

Hi,

MiG said:
I've tried exactly that :)

But there is still the problem that $stdout is a global variable, and
puts/print/etc. called from the loaded script use those global variables.

I don't know how to pass my wrapper object into the load method and
keep the buffers separate between threads at the same time.

Sorry if I've misunderstood your current approach. It sounds like you
are using 'load' and separate threads to run the scriptlets? Since you're
on Unix, would fork be easier?

Once you fork, the global redirection of $stdin/$stdout should be OK
in the child process. Your parent process (server) could be a small
loop that receives requests, forks children to handle them, and pipes
the results from the child back to whoever originated the request?

Just a thought - sorry if I've misunderstood what you're doing . . .
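A minimal sketch of that idea, assuming a Unix system where fork is available. The helper name serve_in_child and the socket path in the comment are made up for illustration:

```ruby
require 'socket'

# Hypothetical helper: handle one client connection in a forked child.
# Inside the child, the standard streams are pointed at the socket, so an
# unmodified script that uses $stdin/$stdout talks to the client. The
# redirection is invisible to the parent and to other children.
def serve_in_child(client, script_path)
  pid = fork do
    $stdin.reopen(client)
    $stdout.reopen(client)
    $stdout.sync = true
    load(script_path)
  end
  client.close   # the parent no longer needs its copy of the socket
  pid
end

# A possible accept loop (the path is an assumption):
#   server = UNIXServer.new("/tmp/ruby-server.sock")
#   loop { Process.detach(serve_in_child(server.accept, "/path/to/script.rb")) }
```

How the client tells the server which script to run, and how ARGV and ENV get passed along, is left open here.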


Regards,

Bill
 
Austin Ziegler

MiG said:
I've tried exactly that :)

But there is still the problem that $stdout is a global variable, and
puts/print/etc. called from the loaded script use those global variables.

I don't know how to pass my wrapper object into the load method and
keep the buffers separate between threads at the same time.

What I'm saying is don't actually use puts/print, etc. Use:

@outp << foo # equivalent to "print foo"

You can do:

def process(argv, inp = $stdin, outp = $stdout, errp = $stderr)
  @inp = inp
  ...
end

process(ARGV)
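
Filled in, the pattern might look like the sketch below. The class name Upcaser and what it does with its input are invented for illustration; only the process signature comes from the snippet above.

```ruby
require 'stringio'

# Hypothetical script body written against injected streams instead of
# the $stdin/$stdout globals.
class Upcaser
  def process(argv, inp = $stdin, outp = $stdout, errp = $stderr)
    @inp, @outp, @errp = inp, outp, errp
    @outp << "args: #{argv.join(' ')}\n"              # equivalent to "print ..."
    @inp.each_line { |line| @outp << line.upcase }
  end
end

# A server thread can hand every client its own streams:
out = StringIO.new
Upcaser.new.process(%w[-v], StringIO.new("one\ntwo\n"), out)
```

Run from the command line, the same class would be called as Upcaser.new.process(ARGV) and fall back to the real standard streams.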

-austin
 
MiG

I think forking creates a new interpreter instance, so this is not what
I want (why not just run the normal /usr/bin/ruby then?).
I don't need client-server for something like RPC; I need lower memory
usage (run only one instance, with the scripts in its threads).

Austin Ziegler said:
What I'm saying is don't actually use puts/print, etc.

Of course, I can use $myout.puts, but again this is not what I want.
I'm trying to be 100% compatible, to be able to run every script without
changes. The only thing that would change is the header of the scripts,
from
#!/usr/bin/ruby
to
#!/usr/bin/ruby-client
or they could be run through ruby-client directly.

Jan Molic
 
Bill Kelly

Hi,

MiG said:
I think forking creates a new interpreter instance, so this is not what
I want (why not just run the normal /usr/bin/ruby then?).
I don't need client-server for something like RPC; I need lower memory
usage (run only one instance, with the scripts in its threads).

fork() is optimized, though. It's a standard Unix idiom for spawning
child server processes from the parent. It uses copy-on-write memory
management so the parent's memory space doesn't have to be
duplicated for the child. They share the "same" memory pages
until such time as the parent or child modifies a page, at which point
a private/unique copy of that page is made for the modifying
process; so the parent and child really are separate semantically,
but in a very efficient way.

So forking is much more efficient, both in speed and memory usage,
than running a separate /usr/bin/ruby.

You could try a quick test and see how many times you can fork()
in a second.... Should be a lot.
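
A rough way to run that test (the method name and the default count are arbitrary choices for this sketch):

```ruby
# Forks and reaps `count` trivial children, then returns forks per second.
def forks_per_second(count = 200)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  count.times do
    pid = fork { exit!(0) }   # child exits at once, skipping at_exit hooks
    Process.wait(pid)
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  count / elapsed
end

rate = forks_per_second
puts format("%.0f forks/sec", rate)
```

The number depends heavily on the machine and on how much memory the parent has mapped, so treat it as a ballpark only.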


Hope this helps,

Regards,

Bill
 
MiG

Bill Kelly said:
You could try a quick test and see how many times you can fork()
in a second.... Should be a lot.

Now I'm trying to find a way to get the real memory usage. I have
written a memory protector that kills any process that uses more than a
maximum amount of memory, and it doesn't work well with these forks
because it relies on the output of the ps command:
ps -e -o "vsz,pid,args"

2848 6228 ruby f
2848 6227 ruby f
2848 6226 ruby f
2848 6225 ruby f
2848 6224 ruby f
2848 6223 ruby f
2848 6222 ruby f
2848 6221 ruby f
...

It looks like every ruby interpreter needs 2MB, but if I run free, the
memory usage is almost unchanged.

Do you have any idea how to get the real memory usage of each process?

Thank you,
Jan Molic
 
MiG

I've just tried

F S UID   PID  PPID C PRI NI ADDR  SZ WCHAN TTY     TIME     CMD
5 S   0 24574 24438 0  85  0 -    712 pause pts/171 00:00:00 ruby
5 S   0 24575 24438 0  85  0 -    712 pause pts/171 00:00:00 ruby
5 S   0 24576 24438 0  85  0 -    712 pause pts/171 00:00:00 ruby
5 S   0 24577 24438 0  85  0 -    712 pause pts/171 00:00:00 ruby
5 S   0 24578 24438 0  85  0 -    712 pause pts/171 00:00:00 ruby
5 S   0 24579 24438 0  85  0 -    712 pause pts/171 00:00:00 ruby
5 S   0 24580 24438 0  85  0 -    712 pause pts/171 00:00:00 ruby
5 S   0 24581 24438 0  85  0 -    712 pause pts/171 00:00:00 ruby
5 S   0 24582 24438 0  85  0 -    712 pause pts/171 00:00:00 ruby
5 S   0 24583 24438 0  85  0 -    712 pause pts/171 00:00:00 ruby
...

and I really don't know how much memory ruby actually consumes :))
But still: if it is more than xxxKB it's too much. The point of my
client-server interpreter idea is to use only the memory that a single
interpreter uses.

Jan Molic
 
Yohanes Santoso

MiG said:
Have you any idea how to get the real memory usage of each process?

There is no easy way to get accurate real memory usage of each
process, especially those that are just freshly forked.

BTW, the 'real' memory usage is the RSS column, not VSZ. VSZ is the
size in virtual memory. Roughly, VSZ-RSS is the part that is not
resident in RAM: pages that are on swap, or mapped but not yet loaded.

Take for example:

dede:~$ ps -ao pid,vsz,rss,command|grep ruby
3086 3012 1364 ruby1.8 -e fork { sleep(9999) }; Process.wait
3087 3016 1376 ruby1.8 -e fork { sleep(9999) }; Process.wait

The above does not necessarily mean that these two processes combined
use 1364+1376=2740 KB. Most UNIX-like OSes perform copy-on-write
(CoW). Meaning, if you fork but do not modify anything, then you will
only incur the cost to track another process in the kernel (around
12KB or so in Linux).
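
For a closer approximation on Linux, kernels that provide /proc/<pid>/smaps with "Pss:" lines report a proportional set size, in which each shared page is divided among the processes sharing it. The sketch below assumes that file format; the helper name pss_kb is made up, and this is not available on other Unix systems.

```ruby
# Sums the "Pss:" lines of a Linux /proc/<pid>/smaps dump, in kB.
# Lines such as "Pss_Anon:" are deliberately not matched.
def pss_kb(smaps_text)
  smaps_text.scan(/^Pss:\s+(\d+) kB/).flatten.sum(&:to_i)
end

# Usage on Linux (reading another user's process may need privileges):
#   puts pss_kb(File.read("/proc/#{Process.pid}/smaps"))
```

Summing this value over a parent and its freshly forked children should come out far lower than summing their RSS columns, because the copy-on-write pages are counted only once in total.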

YS.
 
