Issue with drb....

Chris Sheppard

Hello all; I'm having a bit of a problem with distributed Ruby on
Win32; I'm using 1.8.2, from the installer, rc7a. I'm rather enjoying
ruby, and we're using it for a lot of testing in the company where I
work, but this problem is causing some grumbling about whether Ruby is
the right tool for the job... :-/

The issue is that sockets basically seem to be going away; I'm getting
a DRbConnectionError:Invalid argument in the 'read' function, (line
554) in 'load', coming from 'recv_reply'(611) from recv_reply(865),
from 'send_message' (1104), from method_missing (1015)

It looks like the handle it's trying to read is invalid somehow; this
seems very odd. What's even odder is that I still seem able to
communicate with the other end; when my testing script gets the
exception, it proceeds to shut down our software on all the remote
machines that are part of the actual 'test', seemingly without any
trouble. This includes the machine where we were trying to
communicate and got this invalid socket thing.

What's especially irritating is that it is looking like the message is
being sent; the error is occurring on the reception of the reply, so
this means I can't just put in a 'retry the whole thing on error'
(well, I probably CAN, at least in this case, because it's just asking
for a report, but I don't think it's a Good Idea in general...). OR
does the invalid handle suggest that the message never got through, so
we could be safe doing a Big Retry? Note that I'm having trouble
getting this to reproduce on anything smaller than like a 2 hour test,
which makes the 'make a little change, check if it's working' not work
quite so well..

Has anyone seen anything like this? I've been getting suggestions
here like 'replace the whole test infrastructure with an equivalent in
C#' which makes me itch, so I'd really like to get this solved!

Thanks, and if you need any more info, feel free to email me; I'm Very
Interested in getting this solved...

Chris
 
Lennon Day-Reynolds

Chris,

There are quite a few reasons this could be going wrong, both inside
and outside of your Ruby test harness. Since you said that the problem
was only reproducible on long (>1hr.) tests, though, I would suspect
socket connection (or other system-level resource) timeouts.

I would suggest adding a 'ping()' method to your DRb server, and then
having clients call it periodically (say, every 5-10 seconds) in a
background thread or process, as well as optionally before any call
with important data to be transferred. That way, both the client and
the server can detect connection failures before you have to worry
about losing data.

DRb is cheap wire-level scaffolding, but it's not a reliable messaging
system; that has to be handled at the application level.
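
To make the ping idea concrete, here is a minimal sketch. The names TestAgent and Heartbeat are made up for illustration and are not part of DRb or of Chris's harness; the only DRb pieces used are DRb::DRbObject and DRb::DRbConnError, which is the class the standard library raises for connection failures. I'm assuming the remote object can simply grow a ping method, and I haven't tried this on the 1.8.2 Win32 build, so treat it as a starting point rather than a fix.

```ruby
require 'drb/drb'

# Hypothetical server-side object. In a real harness this would be the
# existing front object; ping is the only addition.
class TestAgent
  def ping
    :pong
  end
end

# Hypothetical client-side helper that pings the remote object from a
# background thread and remembers the first connection failure it sees.
class Heartbeat
  attr_reader :last_error

  def initialize(remote, interval = 5)
    @remote = remote       # a DRb::DRbObject pointing at the server
    @interval = interval   # seconds between pings
    @last_error = nil
    @thread = nil
  end

  # One round trip. Returns true if the server answered, false (and
  # records the exception) if DRb reported a connection error.
  def alive?
    @remote.ping == :pong
  rescue DRb::DRbConnError => e
    @last_error = e
    false
  end

  # Ping until the first failure, then let the thread finish.
  def start
    @thread = Thread.new do
      sleep @interval while alive?
    end
    self
  end

  def stop
    @thread.kill if @thread
    @thread = nil
  end
end
```

On the client you would wrap the remote object once, e.g. Heartbeat.new(DRb::DRbObject.new_with_uri(uri), 5).start, and call alive? again right before any request that must not be lost. If last_error is set, you know the connection went bad before the important call was sent, rather than finding out while waiting for its reply.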
 
Robert Klemme

Chris Sheppard said:
Hello all; I'm having a bit of a problem with distributed Ruby on
Win32; I'm using 1.8.2, from the installer, rc7a. I'm rather enjoying
ruby, and we're using it for a lot of testing in the company where I
work, but this problem is causing some grumbling about whether Ruby is
the right tool for the job... :-/

The issue is that sockets basically seem to be going away; I'm getting
a DRbConnectionError:Invalid argument in the 'read' function, (line
554) in 'load', coming from 'recv_reply'(611) from recv_reply(865),
from 'send_message' (1104), from method_missing (1015)

Maybe the server does not enter an infinite loop and is gone by the time
the client tries to connect (a second time).

Difficult to answer without more input...

robert
 
Chris Sheppard

Robert Klemme said:
Maybe the server does not enter an infinite loop and is gone by the time
the client tries to connect (a second time).

Difficult to answer without more input...

robert


No, I'm reasonably certain the server is still running, because it
proceeds to shut down our piece of software on that machine, i.e.
FURTHER COMMUNICATION WORKS. Which is weird. Really, to boil it
down, what I want to know above all is what the Invalid argument
error is. A closed connection I can deal with, as well as a reset
connection and suchlike; I know what causes them. I don't know what
causes the Invalid argument error, so I have no idea how to handle
it...

Thanks!

Chris
 