ThreadWait problem

Vance A Heron

Rubyists,

I have a bug - probably in my code or understanding, but
possibly in the ThreadWait library. I'd appreciate any
help in understanding and/or fixing it.

My application retrieves and parses data from a tree
structured, XML/HTTP accessible datastore.

I retrieve objects which contain 0 or more leaf objects
and 0 or more container objects. Container objects are
queued for subsequent retrieval and processing. To improve
performance, I decided to use threads to process multiple
objects simultaneously.

Originally (non threaded), container objects were appended
to a list, and the main loop shifted targets off as long
as there were any in the list. With the threaded version,
I changed that to use a queue.

The program works fine for small to medium size trees,
but fails for large ones, with several threads
in the wait group having a status of nil.

Here's my original attempt at the thread dispatcher.
---
tgroup = ThreadsWait.new
$tgts = Queue.new

$tgts.enq(top_tgt)

# have more nodes to process
while $tgts.length > 0
  cur_tgt = $tgts.deq
  t = Thread.new do
    proc_tgt(cur_tgt, param, file_mode)
  end
  tgroup.join_nowait(t)

  # running threads can add targets to Q
  # only allow t_max threads to run
  while ((tgroup.threads.length > 0) && ($tgts.length == 0) ||
         (tgroup.threads.length >= t_max))
    tgroup.next_wait
  end
end
---

For large trees, the code never exited, and looking I found
tgroup to contain several threads with a status of nil.

I first added the line
abort_on_exception = true
thinking that some of the threads were dying unexpectedly,
and being ignored. Still no joy.

In desperation I've added the following
near the beginning of the file, after including thwait:
----
class ThreadsWait
  attr_accessor :threads
end
----

Immediately prior to the second while loop:
---
tgroup.threads.delete_if {|t| t.status == nil }
---

This seems to work, but is *UGLY*. I wish I could give
a short example, but the original code only fails on
data sets > ~10K items - which take > 30 minutes to run.

If it makes any difference, I've had t_max set to 10 and
20 for these runs.
 
nobu.nokada

Hi,

At Thu, 25 Nov 2004 04:09:24 +0900,
Vance A Heron wrote in [ruby-talk:121320]:
> For large trees, the code never exited, and looking I found
> tgroup to contain several threads with a status of nil.
>
> I first added the line
> abort_on_exception = true
> thinking that some of the threads were dying unexpectedly,
> and being ignored. Still no joy.

Yes, Thread#status returns nil if the thread died by an
exception. Does the following patch help you?

> In desperation I've added the following
> near the beginning of the file, after including thwait

Can you inspect @wait_queue of tgroup and Thread.list at that
time? I.e., !tgroup.threads.empty? and !tgroup.threads.all?


Index: lib/thwait.rb
===================================================================
RCS file: /cvs/ruby/src/ruby/lib/thwait.rb,v
retrieving revision 1.8
diff -U2 -p -d -r1.8 thwait.rb
--- lib/thwait.rb 18 Apr 2004 23:19:46 -0000 1.8
+++ lib/thwait.rb 24 Nov 2004 23:23:48 -0000
@@ -118,6 +118,9 @@ class ThreadsWait
     for th in threads
       Thread.start(th) do |t|
-        t.join
-        @wait_queue.push t
+        begin
+          t.join
+        ensure
+          @wait_queue.push t
+        end
       end
     end
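
To illustrate why the ensure matters, here is a minimal sketch that does
not use thwait at all. The method names (watch, demo) and the local
wait_queue are made up for this example and only stand in for what
join_nowait does internally. Thread#join re-raises the exception that
killed the joined thread, so without ensure the push after the join is
skipped and the dead thread is never reported.

```ruby
# Sketch only: shows why the push must sit in an ensure clause.
# The method and variable names are hypothetical, not from thwait.rb.
def watch(target, wait_queue, use_ensure)
  Thread.new do
    Thread.current.report_on_exception = false # silence the expected error
    if use_ensure
      begin
        target.join              # re-raises the target's exception
      ensure
        wait_queue.push(target)  # runs even when join raises
      end
    else
      target.join                # re-raises the target's exception
      wait_queue.push(target)    # skipped when join raises
    end
  end
end

def demo(use_ensure)
  wait_queue = Queue.new
  failing = Thread.new do
    Thread.current.report_on_exception = false
    raise "simulated failure"
  end
  watcher = watch(failing, wait_queue, use_ensure)
  begin
    watcher.join                 # the watcher dies with the same exception
  rescue RuntimeError
    # expected: the exception propagated through both joins
  end
  [failing.status, wait_queue.size]
end

p demo(false)  # unpatched behaviour => [nil, 0]
p demo(true)   # patched behaviour   => [nil, 1]
```

In both runs the failed thread reports a status of nil, but only the
ensure variant puts it on the queue. As far as I can tell, that is the
situation in which next_wait would keep waiting for a thread that has
already died.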
 
Vance A Heron

Thank you Nobu. The patch fixes the problem!
I hope it or something like it makes it into the
next "official" version of ruby.

Once again, excellent response from the
Ruby community.

In answer to your questions, I had the statement
tgroup.threads.each {|t| print "#{t} #{t.status}\n" }
which printed out several thread handles with no status.

The next_wait does a threads.empty? call, which
returned false (i.e. the group wasn't empty), which
was confirmed by seeing the list of thread handles
in the print statement. I did not check the wait_queue.

I had set the "abort_on_exception" variable to true
to catch any threads that died unexpectedly - it
never was executed, so I think all the threads completed
normally.

Vance
