Mark Probert
On top of the memory leak issue, I have been trying to track down unhandled
exceptions in my code. I have run across some very strange behavior that I
will try to explain.
Problem(?) code (line numbers from bsn_a.rb):
141  def alive?
142    t = TCPSocket.new(@host, @port)
143    return true
144
145  rescue Errno::ETIMEDOUT
146    @exception = " Timed out (#{@host}:#{@port})"
147  rescue SocketError => e
148    @exception = " Socket error - #{e}"
149  rescue Exception => e
150    @exception = e
151    return false
152  end
So, in a test driver, it all works as expected with junk data:
11:42 (kant)$ ruby test.rb
.. trying to go to Foo (Foo:10.10.10.5:foo:bar)
--> failed Foo:10.10.10.5:foo:bar
.. trying to go to Foo (Bar:10.10.10.6:foo:bar)
--> failed Bar:10.10.10.6:foo:bar
However, in my actual program, something really bizarre happens:
11:43 (kant)$ ruby healthcollect.rb -g -n eeua.txt -c flow.txt -d data
.. Running 10 commands on 2 nodes.
.. Data going into directory --> data/20050210_1145_eeua
.. processing the nodes... (thread count=35)
.. threading now ...
.. trying to go to Foo (Foo:10.10.10.5:foo:bar)
Exception `SocketError' at ./bsn_a.rb:142 - getaddrinfo: hostname nor
servname provided, or not known
.. trying to go to Bar (Bar:10.10.10.6:foo:bar)
Exception `SocketError' at ./bsn_a.rb:142 - getaddrinfo: hostname nor
servname provided, or not known
Exception `SocketError' at /usr/local/lib/ruby/1.8/net/telnet.rb:352 -
getaddrinfo: hostname nor servname provided, or not known
Exception `SocketError' at /usr/local/lib/ruby/1.8/net/telnet.rb:352 -
getaddrinfo: hostname nor servname provided, or not known
Exception `SocketError' at /usr/local/lib/ruby/1.8/net/telnet.rb:360 -
getaddrinfo: hostname nor servname provided, or not known
--> failed Foo:10.10.10.5:foo:bar
Exception `SocketError' at /usr/local/lib/ruby/1.8/net/telnet.rb:360 -
getaddrinfo: hostname nor servname provided, or not known
--> failed Bar:10.10.10.6:foo:bar
Telnet is throwing a 'SocketError' and line 142 is throwing one, and neither
is being caught!
Now, if I comment out lines 147-148, I get the following from the program:
11:44 (kant)$ ruby healthcollect.rb -g -n eeua.txt -c flow.txt -d data
.. Running 10 commands on 2 nodes.
.. Data going into directory --> data/20050210_1144_eeua
.. processing the nodes... (thread count=35)
.. threading now ...
.. trying to go to Foo (Foo:10.10.10.5:foo:bar)
Exception `SocketError' at ./bsn_a.rb:142 - getaddrinfo: hostname nor
servname provided, or not known
.. trying to go to Bar (Bar:10.10.10.6:foo:bar)
Exception `SocketError' at ./bsn_a.rb:142 - getaddrinfo: hostname nor
servname provided, or not known
--> failed Bar:10.10.10.6:foo:bar
--> failed Foo:10.10.10.5:foo:bar
So, it throws the exception at line 142, but the Telnet exception goes away!?!
Can anyone shed any light on what is happening here? I really have no clue
how to proceed at this point.
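One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: a Ruby method returns the value of the last expression it evaluates, so a rescue clause that ends in an assignment returns the assigned string, and a string is truthy. If the caller of alive? tests the result as a boolean, the SocketError branch (lines 147-148) would then look like success. The Probe class and both method names below are hypothetical stand-ins, and the failure is simulated with raise instead of a real TCPSocket.

```ruby
require 'socket' # defines SocketError

# Hypothetical stand-in for the class in bsn_a.rb; no real socket is opened.
class Probe
  attr_reader :exception

  # Same shape as the posted alive?: only the last rescue clause returns false.
  def alive_as_posted?
    raise SocketError, "simulated lookup failure"
  rescue SocketError => e
    @exception = " Socket error - #{e}" # method value: this (truthy) String
  rescue Exception => e
    @exception = e
    return false
  end

  # Variant in which the failure path ends with an explicit false.
  def alive_explicit?
    raise SocketError, "simulated lookup failure"
  rescue SocketError => e
    @exception = " Socket error - #{e}"
    false
  end
end

probe    = Probe.new
posted   = probe.alive_as_posted?
explicit = probe.alive_explicit?

puts "as posted: #{posted.inspect}"
puts "explicit:  #{explicit.inspect}"
```

In this sketch the first method hands back the error string and the second hands back false. Whether that accounts for the Telnet errors depends on how the real caller uses the result of alive?, which is not shown here.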
As far as I can tell, the test driver is an accurate model of the 'real'
program: it is threaded, it has the same class hierarchy, and it includes the
same libraries. It just doesn't have all the pre- and post-processing in it.
Both include the same 'bsn_a.rb'.
11:52 (kant)$ ruby -v
ruby 1.8.2 (2004-12-25) [i386-freebsd5.3]
Regards,