EPIPE results when writing to a socket for which writing has been shut down.
This most commonly occurs when the socket has been closed. You need to handle
this exception, since you can't absolutely prevent the socket from being
closed.
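
As a rough illustration of what handling EPIPE can look like, here is a
minimal sketch. It assumes a modern Python 3, where socket errors are raised
as OSError (older versions raise socket.error instead), and it uses
socket.socketpair() as a stand-in for a real connection. The names try_send
and demo are hypothetical helpers, not part of any library.

```python
import errno
import socket


def try_send(sock, data):
    """Send data; return True on success, False if the socket reports EPIPE."""
    try:
        sock.sendall(data)
        return True
    except OSError as exc:
        # Python ignores SIGPIPE by default, so the failure surfaces here
        # as an exception carrying errno.EPIPE instead of killing the process.
        if exc.errno == errno.EPIPE:
            return False
        raise  # anything else is unexpected; let the caller see it


def demo():
    """Show one successful write and one write after the write side is shut down."""
    a, b = socket.socketpair()  # assumption: stands in for a real connection
    try:
        first = try_send(a, b"hello")
        a.shutdown(socket.SHUT_WR)  # writing is now shut down on this socket
        second = try_send(a, b"again")
        return first, second
    finally:
        a.close()
        b.close()
```

Catching the error this way only tells you the connection is unusable; it
does not make the socket writable again.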
The exception is already caught and logged, but this is really not
good enough. By "handling this exception", do you mean that there is a
way to handle it such that the connection still works? I found some
code that attempts to retry when SIGPIPE was received, but this only
results in the same error all over again.
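
Retrying on the same socket is expected to fail again, because a socket that
has reported EPIPE does not become writable later. The usual alternative is
to discard it and open a new connection before resending. The sketch below
assumes a modern Python 3 and a caller-supplied connect() factory; both
send_with_reconnect and connect are hypothetical names used only for
illustration.

```python
import socket


def send_with_reconnect(sock, data, connect, max_attempts=2):
    """Send data, replacing the socket with a fresh one if the pipe is broken.

    connect is a zero-argument callable returning a new connected socket
    (an assumption of this sketch). Returns the socket that accepted the data.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            sock.sendall(data)
            return sock
        except (BrokenPipeError, ConnectionResetError):
            sock.close()  # the old socket cannot be revived
            if attempt == max_attempts - 1:
                raise  # out of attempts; report the failure
            sock = connect()  # open a brand-new connection and try again
```

Whether resending is safe depends on the protocol: with a telnet session, for
example, any login or other session state would have to be re-established on
the new connection first.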
Why can this not be prevented (in the general case)? Unless something
fancy happened, what can cause the socket to close? Looking at the raw
data received by the connected host, the connection gets lost in
mid-stream; I cannot see anything that might cause the remote side to
close the connection (in which case I'd expect a "connection reset by
peer" or something).
There might be some other change that would be appropriate, though,
if something your application is doing is causing the socket to be
closed (for example, sending a message that the remote side decides is
invalid, causing it to close the socket explicitly from its end).
The program is doing the same thing repeatedly and it works 95% of the
time, so I am fairly sure that nothing special is sent.
It's difficult to make any specific suggestions in that area without
knowing exactly what your program does.
Unfortunately the application is rather complex and a simple test case
is not possible.
Basically, it creates a number of daemon threads, each of which
creates a (thread-local, non-shared) instance of telnetlib.Telnet and
connects to a remote host. Are there any special conditions that must
be taken care of when opening a number of sockets in threads? (The
code runs on AIX 4.1, where Python supports native OS threads.)
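
For reference, the structure described above can be sketched roughly as
follows. This is not the real application: it assumes a modern Python 3,
uses socket.socketpair() in place of a telnet connection to a remote host,
and the names worker and run_workers are hypothetical.

```python
import socket
import threading


def worker(host_id, results, lock):
    """Create a connection owned by this thread only, and use it once."""
    # Assumption: socketpair() stands in for connecting to a remote host.
    local, remote = socket.socketpair()
    try:
        local.sendall(b"hello %d" % host_id)
        data = remote.recv(64)
        with lock:  # only the shared results dict needs locking
            results[host_id] = data
    finally:
        local.close()
        remote.close()


def run_workers(count):
    """Start one daemon thread per connection and wait for all of them."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(i, results, lock), daemon=True)
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

No socket is ever touched by more than one thread here; the only shared
object is the results dictionary, which is guarded by a lock.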
-Samuel