ruby from command line timing out?

Jason N. Perkins · Jan 9, 2005

I'm running a script from the command line that's going to take a
couple of hours to complete. Between 15 and 20 minutes into its run,
the script fails with an "execution expired" Timeout::Error. Is there an
environment variable that I should be modifying? The full error
message is:

/usr/local/lib/ruby/1.8/timeout.rb:42:in `new': execution expired (Timeout::Error)
        from ./spider.rb:6334:in `join'
        from ./spider.rb:6334
        from ./spider.rb:6334:in `each'
        from ./spider.rb:6334

Francis Hwang · Jan 9, 2005

Is it safe to guess, based on the name of the script, that it spiders
web pages? If that's the case, Timeout::Errors are going to happen
quite frequently whenever a particular web page loads too slowly.

Francis Hwang
http://fhwang.net/

Jason N. Perkins · Jan 9, 2005

> Is it safe to guess, based on the name of the script, that it spiders
> web pages? If that's the case, Timeout::Errors are going to happen
> quite frequently whenever a particular web page loads too slowly.

I'm catching those errors with no problem with a 'rescue'. This seems
to be specific to the script itself.

Bill Atkins · Jan 9, 2005

Can you post the code?

Bill

Jason N. Perkins · Jan 9, 2005

> Can you post the code?

Sure. The blogs variable is an array of the urls of blogs - I intend to
eventually have these urls stored in MySQL, but for now an array works.
I emptied that array so that those sites that I have in it aren't
getting hit by too many people trying to help out. The threading is
derived from a sample in "Programming Ruby." I'd love any additional
feedback outside of dealing with the timeout issue.

#! /usr/local/bin/ruby -w

require 'open-uri'
require 'thread'

blogs = [ ]

buffer = Queue.new

# load the blogs into the queue
blogs.each do |blog|
  buffer.enq( blog )
end

consumers = (1..150).map do |i|
  Thread.new("consumer #{i}") do |name|
    begin
      blog = buffer.deq
      open( blog ) do |content|
        begin
          metas = content.read.scan( /<meta([^(>]*)>/m ).uniq
          metas.each do |current_meta|
            current_meta = current_meta.to_s

            if current_meta =~ /\s+name\s*=\s*[\"']([^\"']+)[\"']/
              name = $1
              current_meta =~ /\s+content\s*=\s*[\"']([^\"']+)[\"']/
              content = $1

              case name
              when "geo.position"
                print "#{blog} \t #{content} \n"

              when "ICBM"
                print "#{blog} \t #{content} \n"
              end
            end
          end
        rescue Exception
          p "#{blog}: $! \n"
        end
      end
    end until buffer == :END_OF_WORK
  end
end

begin
  consumers.size.times{ buffer.enq(:END_OF_WORK) }
  consumers.each{|th| th.join}
rescue Exception
  print $!
end

Francis Hwang · Jan 9, 2005

Jason,

Is the line 6334 that shows up in the traceback this line:

consumers.each{|th| th.join}

And one tip, which may not have anything to do with this problem but
might make your code easier to understand and/or debug: Since threading
is so bloody difficult, I try to make it affect as little of the
program as possible. In a case like your code, for example, I would've
let the threaded part simply handle the loading of the web pages, but
let the parsing happen afterward when all the threads have been joined
again. This is how FeedBlender (http://feedblender.rubyforge.org/) does
it, so that way if there's a bug I can figure out if it's because of
the threading or not.
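
A minimal sketch of that split, assuming nothing about how FeedBlender itself is written. The `fetch` method and the `FAKE_PAGES` table are hypothetical stand-ins for the real open-uri download so the example runs without touching the network, and `geo_metas` is an illustrative helper, not the original parsing code:

```ruby
require 'timeout'

# Hypothetical canned pages standing in for real downloads.
FAKE_PAGES = {
  "http://a.example/" => '<html><meta name="ICBM" content="40.7, -74.0"></html>',
  "http://b.example/" => '<html><meta name="author" content="nobody"></html>'
}

# Hypothetical fetcher; a real script would download the URL here.
def fetch(url)
  FAKE_PAGES.fetch(url)
end

# Single-threaded parsing: returns the content of geo-related meta tags.
def geo_metas(html)
  html.scan(/<meta([^>]*)>/m).flatten.map do |attrs|
    name    = attrs[/\sname\s*=\s*["']([^"']+)["']/, 1]
    content = attrs[/\scontent\s*=\s*["']([^"']+)["']/, 1]
    content if ["geo.position", "ICBM"].include?(name)
  end.compact
end

# Phase 1: the threads do nothing but download.
threads = FAKE_PAGES.keys.map do |url|
  Thread.new(url) do |u|
    begin
      [u, fetch(u)]
    rescue StandardError, Timeout::Error
      [u, nil]            # remember the failure, keep the thread alive
    end
  end
end

# Thread#value joins the thread and returns the block's result.
pages = threads.map { |t| t.value }

# Phase 2: all parsing happens here, after every thread has finished.
results = {}
pages.each do |url, body|
  next if body.nil?
  found = geo_metas(body)
  results[url] = found unless found.empty?
end

results.each { |url, found| puts "#{url}\t#{found.join(', ')}" }
```

With this layout, a parsing bug can be reproduced without any threads involved, which is the debugging benefit described above.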

Francis Hwang
http://fhwang.net/

Carlos · Jan 9, 2005

> begin
>   consumers.size.times{ buffer.enq(:END_OF_WORK) }
>   consumers.each{|th| th.join}
> rescue Exception
>   print $!
> end

I think that when a thread being "joined" raises a timeout error, the
exception propagates out of the each loop, so the program finishes and the
other threads are never joined. Maybe you should put the begin...rescue
around the join (inside the each).

Hope this helps. Good luck.
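
A small sketch of that suggestion, under stated assumptions: the three workers are hypothetical stand-ins for the spider's consumer threads, and the second one raises Timeout::Error by hand to imitate a slow HTTP read. The point is only that the rescue sits around each individual join:

```ruby
require 'timeout'

# Silence the per-thread error report that newer Rubies print to stderr.
Thread.report_on_exception = false if Thread.respond_to?(:report_on_exception=)

# Hypothetical workers; worker 2 fails the way a timed-out fetch would.
workers = (1..3).map do |i|
  Thread.new(i) do |n|
    raise Timeout::Error, "execution expired" if n == 2
    "worker #{n} done"
  end
end

failures = []
finished = []

workers.each_with_index do |th, idx|
  begin
    th.join                 # re-raises the exception that killed this thread
    finished << th.value
  rescue Timeout::Error => e
    # One failed thread no longer stops the loop from joining the rest.
    failures << "worker #{idx + 1}: #{e.message}"
  end
end

puts finished
puts failures
```

With the rescue outside the loop instead, the first failing join would end the each and the remaining threads would never be waited on.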

Jason N. Perkins · Jan 9, 2005

> Is the line 6334 that shows up in the traceback this line:
>
> consumers.each{|th| th.join}

Yeah, that's the line that's timing out and why I was wondering if
there's a global timeout value for the script that I can either modify
up or turn off completely.

> And one tip, which may not have anything to do with this problem but
> might make your code easier to understand and/or debug: Since
> threading is so bloody difficult, I try to make it affect as little of
> the program as possible. In a case like your code, for example, I
> would've let the threaded part simply handle the loading of the web
> pages, but let the parsing happen afterward when all the threads have
> been joined again. This is how FeedBlender
> (http://feedblender.rubyforge.org/) does it, so that way if there's a
> bug I can figure out if it's because of the threading or not.

OK, I'll give that a try. Thanks, Francis!

Eric Hodel · Jan 10, 2005

> Yeah, that's the line that's timing out and why I was wondering if
> there's a global timeout value for the script that I can either modify
> up or turn off completely.

Timeout::Error comes from timeout.rb.

Your Timeout::Error probably comes out of HTTP. open-uri doesn't
require timeout and has no timeout blocks.

Try Thread.abort_on_exception = true at the top of your script, and
remove the begin/end block inside the thread.
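
To illustrate what that setting changes, here is a hedged sketch rather than a fix for the spider itself. The worker thread is a hypothetical stand-in that raises Timeout::Error by hand. With `Thread.abort_on_exception = true`, the exception is re-raised in the main thread as soon as the worker dies, instead of waiting until something calls join on it:

```ruby
require 'timeout'

Thread.abort_on_exception = true

caught = nil
begin
  Thread.new do
    # Stand-in for a slow HTTP read that times out inside a worker.
    raise Timeout::Error, "execution expired"
  end
  sleep 5   # the main thread is busy elsewhere and never calls join
rescue Timeout::Error => e
  # The worker's exception interrupted the sleep in the main thread.
  caught = e
end

puts "main thread saw: #{caught.inspect}"
```

In a real script the begin/rescue would be left out, so the failure stops the program right away and the backtrace points at the code that actually raised.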

--
Eric Hodel - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04

