TCP reset caused by socket.py

O

Object01

I've been working with the source code for Trac (http://
trac.edgewall.org/) lately and have run across a bizarre problem. It
seems that all POST requests to Trac's standalone server (tracd) have
a random chance of causing the server to issue a TCP RST packet that
resets the connection.

Running Trac 10.3.1 on Win2K3 using Python 2.4, watching traffic with
Wireshark 0.99.5.

I've been stepping through the code using Winpdb 1.3.2 and have
isolated the problem to the socket.py that's included in Python 2.4.
Line 157, in _socketobject.close():

self.send = self.recv = self.sendto = self.recvfrom =
self._sock._dummy

is what's causing the TCP RST to be issued. If I set a breakpoint on
that line and step over it on a POST request, there's about an 80%
chance the server will issue a TCP RST. When debugging, the entire
response makes it onto the wire before TCP RST is issued. If I'm -
not- debugging, it's anybody's guess as to how much of the response
makes it out. The interruption, when it occurs, always seems to be
between calls to _fileobject.write(). This indicates a timing issue:
perhaps buffered data isn't being waited on properly prior to
the .close() method doing its work.

I can't tell if this is a problem with the way Trac was coded (i.e.
are they violating the rules of sockets?) or whether it indicates a
problem in Python's socket implementation. Either way, isn't this a
strange statement (an assignment) for a TCP RST to occur? I can only
figure that the garbage collector is unpredictably disposing of a
socket at this opportunity. And why only for POST requests?

I'm looking for any insight anyone can provide about this!
 
G

Gabriel Genellina

I've been working with the source code for Trac (http://
trac.edgewall.org/) lately and have run across a bizarre problem. It
seems that all POST requests to Trac's standalone server (tracd) have
a random chance of causing the server to issue a TCP RST packet that
resets the connection.

Running Trac 10.3.1 on Win2K3 using Python 2.4, watching traffic with
Wireshark 0.99.5.

I've been stepping through the code using Winpdb 1.3.2 and have
isolated the problem to the socket.py that's included in Python 2.4.
Line 157, in _socketobject.close():

self.send = self.recv = self.sendto = self.recvfrom =
self._sock._dummy

is what's causing the TCP RST to be issued. If I set a breakpoint on
that line and step over it on a POST request, there's about an 80%
chance the server will issue a TCP RST. When debugging, the entire
response makes it onto the wire before TCP RST is issued. If I'm -
not- debugging, it's anybody's guess as to how much of the response
makes it out. The interruption, when it occurs, always seems to be
between calls to _fileobject.write(). This indicates a timing issue:
perhaps buffered data isn't being waited on properly prior to
the .close() method doing its work.

I think the trigger is the line just above that, where the "real" socket
is deleted. A RST when you close a socket is OK. From your description it
appears that the close method should not have been called at that time;
I'd look at the stack and see why is it being called when it shouldn't.

I don't believe Python itself randomly calls close() so it looks like a
Trac problem, or in the webserver used. Is there any timer used? Is the
server multithreaded?
 
O

Object01

I think the trigger is the line just above that, where the "real" socket
is deleted. A RST when you close a socket is OK. From your description it
appears that the close method should not have been called at that time;
I'd look at the stack and see why is it being called when it shouldn't.

I don't believe Python itself randomly calls close() so it looks like a
Trac problem, or in the webserver used. Is there any timer used? Is the
server multithreaded?

The server is multithreaded, handling each request on its own thread.
But is a RST really a part of a valid close operation? It was my
understanding that the RST is used to indicate a problem with the
connection, usually a half-open connection. I never see RSTs unless
this error occurs.

Looking at the stack, it appears the socket is being closed at the
right time. All data has been sent and the request handler is in its
tear-down phase, expecting no more data from the client (who never
sends any). I don't see different stack traces between error and no-
error requests.

And on line 156, how is the real socket being deleted? It (the _sock
member) appears to be set to a dummy class that has no real
functionality.
 
B

Bjoern Schliessmann

Object01 said:
The server is multithreaded, handling each request on its own
thread.
Ugh.

But is a RST really a part of a valid close operation?

Depends on the state of the parties :) The proper way to close
non-defunct connections is using FIN segments.
It was my understanding that the RST is used to indicate a problem
with the connection, usually a half-open connection. I never see
RSTs unless this error occurs.

RFC793:

| Reset Generation
|
| As a general rule, reset (RST) must be sent whenever a segment
| arrives which apparently is not intended for the current
| connection. A reset must not be sent if it is not clear that this
| is the case.

Regards,


Björn
 
O

Object01

Depends on the state of the parties :) The proper way to close
non-defunct connections is using FIN segments.


RFC793:

| Reset Generation
|
| As a general rule, reset (RST) must be sent whenever a segment
| arrives which apparently is not intended for the current
| connection. A reset must not be sent if it is not clear that this
| is the case.

Regards,

Björn

Is there something I can look for in the packet traffic that would
indicate one party is misbehaving? The sequence numbers seem ok.
*shrug* I'd expect to see data sent from server to client and then
see a sequence of packets that close the connection gracefully.
Instead I see the server send data to the client and then a RST,
nothing more.

The crux of this strangeness seems to be Python sending a RST without
any prompting from the client. It's just send to client, send to
client, send to client, reset. OH! Maybe the client isn't
acknowledging properly or the server is incorrectly expecting
acknowledgment and resets when it doesn't arrive?

Rrr...
 
G

Gabriel Genellina

The server is multithreaded, handling each request on its own thread.
But is a RST really a part of a valid close operation?

No! Sorry, just nonsense.
Looking at the stack, it appears the socket is being closed at the
right time. All data has been sent and the request handler is in its
tear-down phase, expecting no more data from the client (who never
sends any). I don't see different stack traces between error and no-
error requests.

Is this applicable in your case?:
http://brad.livejournal.com/2152593.html?thread=10832273#t10832273
(closing a nonblocking socket with a nonempty output queue generates a RST)
And on line 156, how is the real socket being deleted? It (the _sock
member) appears to be set to a dummy class that has no real
functionality.

Indirectly: if the _sock attribute was the last reference to the real
socket object (and that's likely the case), assigning anything to _sock
will decrement its reference count to 0, then becoming a candidate for
garbage collection.
 
O

Object01

Is this applicable in your case?:http://brad.livejournal.com/2152593.html?thread=10832273#t10832273
(closing a nonblocking socket with a nonempty output queue generates a RST)

Based on my stepping through the code, everything passed to
_fileobject.write() makes it out onto the wire just fine. Now, if the
debugger isn't attached, like I said, it's anybody's guess how much
data gets out before the RST shows up. I need to delve deeper into
Trac's use of blocking vs. non-blocking sockets.
Indirectly: if the _sock attribute was the last reference to the real
socket object (and that's likely the case), assigning anything to _sock
will decrement its reference count to 0, then becoming a candidate for
garbage collection.

I don't know much about the timing of Python's garbage collection. Is
it pretty aggressive?
 
B

Bjoern Schliessmann

Object01 said:
Is there something I can look for in the packet traffic that would
indicate one party is misbehaving? The sequence numbers seem ok.
*shrug* I'd expect to see data sent from server to client and
then see a sequence of packets that close the connection
gracefully. Instead I see the server send data to the client and
then a RST, nothing more.

I'm not sure, without any context. RSTs also seem to be used if a
TCP port is "closed", though from your description it seems that a
connection has been established when those RSTs come in.

Regards,


Björn
 
G

Gabriel Genellina

I don't know much about the timing of Python's garbage collection. Is
it pretty aggressive?

As soon as the reference count reaches zero (at least for current CPython
version). Objects that are part of a reference cycle may take a while but
are detected and collected eventually.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,982
Messages
2,570,190
Members
46,740
Latest member
AdolphBig6

Latest Threads

Top