Deadlock in DRb

Lars Christensen

In a program with two DRb servers running (start_service called twice), I
get the following deadlock after running for a while with a client
connecting to both servers:

deadlock 0x284c748: sleep:J(0x2c84f7c) (main) - server.rb:54
deadlock 0x2c84f7c: sleep:F(4) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:944
deadlock 0x2d01338: sleep:F(5) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:566
deadlock 0x2c854cc: sleep:F(3) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:944
deadlock 0x2cff81c: sleep:S - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:626
c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:626: Thread(0x2cff81c): deadlock (fatal)

How can I debug this issue? I don't understand why it is a deadlock at
all, since drb.rb:944 is a call to Socket#accept, which does not
depend purely on other Ruby threads.

Any ideas?

Lars
 
Robert Klemme

2008/4/22, Lars Christensen said:
> In a program with two DRb servers running (start_service called twice), I

Why do you have two servers?

> get the following deadlock after running for a while with a client
> connecting to both servers:
>
> deadlock 0x284c748: sleep:J(0x2c84f7c) (main) - server.rb:54
> deadlock 0x2c84f7c: sleep:F(4) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:944
> deadlock 0x2d01338: sleep:F(5) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:566
> deadlock 0x2c854cc: sleep:F(3) - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:944
> deadlock 0x2cff81c: sleep:S - c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:626
> c:/lang/ruby/lib/ruby/1.8/drb/drb.rb:626: Thread(0x2cff81c): deadlock (fatal)
>
> How can I debug this issue? I don't understand why it is a deadlock at
> all, since drb.rb:944 is a call to Socket#accept, which does not
> depend purely on other Ruby threads.
>
> Any ideas?

For a deadlock you need at least two resources that are locked in
different order. Maybe you have synchronized calls across the two
servers that deadlock.
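
A minimal sketch of that lock-ordering problem, assuming two plain Mutex objects stand in for whatever resources the application actually locks (lock_a, lock_b and the queues are hypothetical names, not taken from the original program). It uses the non-blocking Mutex#try_lock so that the script terminates; with a blocking lock in both places, each side would wait forever on the other.

```ruby
require "thread"

lock_a = Mutex.new
lock_b = Mutex.new

# Queues are used only to force a deterministic interleaving.
a_held      = Queue.new
b_held      = Queue.new
t1_probed   = Queue.new
main_probed = Queue.new

# Thread 1 takes A first and then wants B.
t1_got_b = nil
t1 = Thread.new do
  lock_a.synchronize do
    a_held << true                 # A is now held
    b_held.pop                     # wait until main holds B
    t1_got_b = lock_b.try_lock     # non-blocking probe for B
    t1_probed << true
    main_probed.pop                # keep A until main has probed
  end
end

# The main thread takes B first and then wants A (the opposite order).
main_got_a = nil
lock_b.synchronize do
  a_held.pop                       # wait until thread 1 holds A
  b_held << true
  t1_probed.pop
  main_got_a = lock_a.try_lock     # non-blocking probe for A
  main_probed << true
end
t1.join

puts "thread 1 got B: #{t1_got_b}, main got A: #{main_got_a}"
```

Both probes fail, which is the cycle: each thread holds one lock and wants the other. Taking the locks in the same order everywhere removes the cycle.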

You could use set_trace_func to trace program execution until the
deadlock and look at the execution flow.
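
As a rough sketch of what such tracing might look like: Kernel#set_trace_func takes a proc that receives the event name, file, line, method id, binding and class for each traced event. The handle_request method below is a hypothetical stand-in for application code, and the filter on "call" and "return" events is just one possible choice to keep the output small.

```ruby
events = []

# Record Ruby-level method calls and returns only.
tracer = proc do |event, file, line, id, _binding, classname|
  if event == "call" || event == "return"
    events << [event, classname, id, "#{file}:#{line}"]
  end
end

# Hypothetical application method, used only for illustration.
def handle_request(n)
  n * 2
end

set_trace_func(tracer)   # start tracing
handle_request(21)
set_trace_func(nil)      # stop tracing

events.each { |ev| puts ev.inspect }
```

In a threaded server it may help to also record Thread.current in each entry, so that the last events of every thread before the deadlock can be told apart.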

Kind regards

robert
 
Lars Christensen

> Why do you have two servers?

Well... legacy. I have converted my application to start only one DRb
service, but the same problem occurs. I still get a deadlock
after the clients have been connecting for a while.

> For a deadlock you need at least two resources that are locked in
> different order. Maybe you have synchronized calls across the two
> servers that deadlock.

My main thread is blocked by DRb.thread.join. All other threads are
inside the DRb library on either Socket#accept, #read or #write.

How can there be a deadlock if a thread is waiting in a Socket#accept
call? As I understand it, Ruby's deadlock detection simply fires when
there is no thread to run.

> You could use set_trace_func to trace program execution until the
> deadlock and look at the execution flow.

I have tried this, but it doesn't show anything other than the
deadlock report from Ruby, i.e. that the threads are calling
Socket#accept, #read or #write and Thread#join.

Lars
 
Robert Klemme

2008/4/23, Lars Christensen said:
> Well... legacy. I have converted my application to start only one DRb
> service, but the same problem occurs. I still get a deadlock
> after the clients have been connecting for a while.

Too bad.

> My main thread is blocked by DRb.thread.join. All other threads are
> inside the DRb library on either Socket#accept, #read or #write.

And are there any locks held?

> How can there be a deadlock if a thread is waiting in a Socket#accept
> call? As I understand it, Ruby's deadlock detection simply fires when
> there is no thread to run.
>
> I have tried this, but it doesn't show anything other than the
> deadlock report from Ruby, i.e. that the threads are calling
> Socket#accept, #read or #write and Thread#join.

These issues are next to impossible to debug without access to the code
and an understanding of what the app really does. I'm afraid I can't
help you further right now.

Kind regards

robert
 
Ezra Zygmuntowicz

What version and patch level of ruby do you have? If you have ruby
1.8.6 and the patch level is less than p111 then you have a faulty
ruby interpreter with broken threading that can cause these deadlocks.
Make sure you are using ruby 1.8.5 or ruby 1.8.6p111 minimum.
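
A quick way to check this from inside the interpreter, as a sketch: RUBY_VERSION and RUBY_PATCHLEVEL are built-in constants, while the helper name affected_by_threading_bug? is made up for this example and simply encodes the "1.8.6 below p111" rule stated above.

```ruby
# Hypothetical helper: true for Ruby 1.8.6 builds older than patch level 111.
def affected_by_threading_bug?(version, patchlevel)
  version == "1.8.6" && patchlevel < 111
end

puts "Running Ruby #{RUBY_VERSION} patchlevel #{RUBY_PATCHLEVEL}"

if affected_by_threading_bug?(RUBY_VERSION, RUBY_PATCHLEVEL)
  warn "This interpreter predates 1.8.6-p111; consider upgrading."
end
```

The same information is printed by running ruby -v on the command line.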

Cheers-


- Ezra Zygmuntowicz
-- Founder & Software Architect
-- (e-mail address removed)
-- EngineYard.com
 
Lars Christensen

> What version and patch level of ruby do you have? If you have ruby
> 1.8.6 and the patch level is less than p111 then you have a faulty
> ruby interpreter with broken threading that can cause these deadlocks.
> Make sure you are using ruby 1.8.5 or ruby 1.8.6p111 minimum.

Had the same problem with 1.8.6p111. I finally tracked down the
problem to a bug in Process.create from the 'win32-process' gem. Some
code added to this function after version 0.5.5 would call CloseHandle
on something that was not a handle but a process or thread ID. When
these happen to have the same value as a socket handle or similar, the
process would sometimes deadlock, sometimes simply close a listening
socket, fail in Socket#accept, or go into infinite loops.

http://rubyforge.org/tracker/index.php?func=detail&aid=19753&group_id=85&atid=411.

I was able to work around it by setting :close_handles => false in the
call to Process.create.

Lars
 
