Sockets and TCP Data Segments

Gordon Beaton

We are writing data to a socket and in most cases, a TCP segment is
then passed down to the IP layer and a separate packet is sent
across the network. However, occasionally we get two or three lots
of data being sent in the same TCP segment, which the receiving
station cannot handle. We have enabled TCP_NODELAY, but this hasn't
resolved the problem.

Is there a way of forcing each piece of data to be sent in its own
packet?

No there isn't. TCP sends a continuous stream of bytes. It knows
nothing about messages, and doesn't attempt to respect any message
boundaries you might have defined in your application. Depending on
the network MTU, the rate of transmission, the size of your messages
and the phase of the moon, it might send several messages at once or
break a single message into several transmissions.

If you want to use TCP as if it were a message based protocol, you
need to delimit your messages in the application. A simple and common
solution involves sending each message as a short header with the
length of the message body that follows. The recipient then reads the
length, followed by the specified number of bytes.
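
For illustration (this sketch is not from the thread itself), that framing might look like the following in Java; the class and method names are made up for the example:

    import java.io.*;

    // Sketch of length-prefixed framing; sendMessage/readMessage are
    // illustrative names, not an existing API.
    public class Framing {

        // Sender: a 4-byte length header followed by the message body.
        public static void sendMessage(DataOutputStream out, byte[] body)
                throws IOException {
            out.writeInt(body.length);  // header: length of the body
            out.write(body);            // the body itself
            out.flush();
        }

        // Receiver: read the header, then exactly that many bytes, however
        // TCP happened to split or merge the segments on the wire.
        public static byte[] readMessage(DataInputStream in)
                throws IOException {
            int length = in.readInt();
            byte[] body = new byte[length];
            in.readFully(body);         // blocks until the whole body has arrived
            return body;
        }
    }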

TCP_NODELAY is not a solution to your problem; in fact, by turning it
on you disable an optimization (the Nagle algorithm) that the protocol
normally performs.

/gordon
 
Pete Mainwaring

We are writing data to a socket and in most cases, a TCP segment is
then passed down to the IP layer and a separate packet is sent across
the network. However, occasionally we get two or three lots of data
being sent in the same TCP segment, which the receiving station cannot
handle. We have enabled TCP_NODELAY, but this hasn't resolved the
problem.

Is there a way of forcing each piece of data to be sent in its own
packet?

(By the way, I'm not a Java programmer - I'm the network
administrator. I've been capturing the packets across the network
because we have this problem, but the programmer can't see anything in
the available socket calls to control this).

Thanks,

Pete
 
Chris Smith

Pete said:
We are writing data to a socket and in most cases, a TCP segment is
then passed down to the IP layer and a separate packet is sent across
the network. However, occasionally we get two or three lots of data
being sent in the same TCP segment, which the receiving station cannot
handle. We have enabled TCP_NODELAY, but this hasn't resolved the
problem.

Is there a way of forcing each piece of data to be sent in its own
packet?

Sorry to hear that you're working with a horribly broken network
application. There is no way to guarantee that each data set is sent in
a different TCP packet in Java. The only sure-fire approach would be to
write a platform-specific application, probably in C, that uses a lower-
level raw network interface and reimplements a fake TCP stack within the
app. Depending on the operating system, part of that application may
have to run as a driver inside the kernel.

That said, there are ways to increase the *probability* that a Java
application (or C, or any other application using a standard TCP
interface) will send each data set in its own packet. That is:

1. Set TCP_NODELAY on the socket. Looks like you've already done this.

2. Use a BufferedOutputStream directly wrapping the Socket's
OutputStream. This prevents messages from being split.

3. Call flush() on the OutputStream immediately following any set of
data written.

Note that you should do all of these. For example, doing just #2
without #3 will make the problem far worse.
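
As an illustration of all three steps together on the sending side (this sketch is not part of the original post; the host name, port and message contents are placeholders):

    import java.io.*;
    import java.net.Socket;

    public class SenderSketch {
        public static void main(String[] args) throws IOException {
            Socket socket = new Socket("server.example", 9000);
            socket.setTcpNoDelay(true);   // 1. TCP_NODELAY: disable Nagle's algorithm

            OutputStream out =
                new BufferedOutputStream(socket.getOutputStream()); // 2. buffer the raw stream

            byte[] message = "one data set".getBytes("US-ASCII");
            out.write(message);           // write the whole message in a single call
            out.flush();                  // 3. flush immediately after each message

            socket.close();
        }
    }

Even with all three in place, this only makes separate packets more likely; it still cannot be guaranteed.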

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
Pete Mainwaring

Gordon Beaton said:
No there isn't. TCP sends a continuous stream of bytes. It knows
nothing about messages, and doesn't attempt to respect any message
boundaries you might have defined in your application. Depending on
the network MTU, the rate of transmission, the size of your messages
and the phase of the moon, it might send several messages at once or
break a single message into several transmissions.

If you want to use TCP as if it were a message based protocol, you
need to delimit your messages in the application. A simple and common
solution involves sending each message as a short header with the
length of the message body that follows. The recipient then reads the
length, followed by the specified number of bytes.

TCP_NODELAY is not a solution to your problem; in fact, by turning it
on you disable an optimization (the Nagle algorithm) that the protocol
normally performs.

/gordon

Thanks Chris and Gordon for your replies. I had a feeling that I was
going to get the answer that it can't be done. I'd already had a dig
around for any other ways of doing it, for example, as you say, it
could be done in C (using tcpsend I think).

I'll pass this back to our Java programmer for him to worry about.

Thanks again,

Pete
 
Lawrence Kirby

On Thu, 11 Nov 2004 00:20:53 -0800, Pete Mainwaring wrote:

....
Thanks Chris and Gordon for your replies. I had a feeling that I was
going to get the answer that it can't be done. I'd already had a dig
around for any other ways of doing it, for example, as you say, it
could be done in C (using tcpsend I think).

Fundamentally it can't be done in any language because the issue is not
language related, it is the nature of TCP which simulates a reliable byte
stream, not a sequence of messages/records. It may be that you can use
"typical" TCP stack implementation characteristics to get each message
sent in a separate packet most of the time, but that's only going to
happen if you can be sure that the previous packet has been sent on the
network (and not subsequently lost causing a retransmit) before you supply
the next. Then there's the receiver side. If another packet is received
before the reader application gets the first then the TCP stack will pass
as much data as it can when the application does perform a read, i.e. data
from both messages if it can and maybe partial data from the second.

Your problem is that the receiver application is making invalid
assumptions about TCP and is ending up with race conditions. The only real
solution is to fix the receiver application. Using some sort of message
pacing (ick) i.e. a minimum time interval between writing messages might
improve the situation you have (i.e. make "losing" the race less likely)
but it isn't a real solution.
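
To show what coping with that looks like in practice (this sketch is not from Lawrence's post), a receiver has to keep reading until it has a complete message, because a single read() may return part of one message or data from several:

    import java.io.*;

    public class ReadExactly {

        // Read exactly 'length' bytes, however the data was segmented on the wire.
        public static byte[] readExactly(InputStream in, int length)
                throws IOException {
            byte[] buf = new byte[length];
            int got = 0;
            while (got < length) {
                int n = in.read(buf, got, length - got);
                if (n < 0) {
                    throw new EOFException("connection closed mid-message");
                }
                got += n;
            }
            return buf;
        }
    }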

Lawrence
 
Pete Mainwaring

Lawrence Kirby said:
On Thu, 11 Nov 2004 00:20:53 -0800, Pete Mainwaring wrote:

...


Fundamentally it can't be done in any language because the issue is not
language related, it is the nature of TCP which simulates a reliable byte
stream, not a sequence of messages/records. It may be that you can use
"typical" TCP stack implementation characteristics to get each message
sent in a separate packet most of the time, but that's only going to
happen if you can be sure that the previous packet has been sent on the
network (and not subsequently lost causing a retransmit) before you supply
the next. Then there's the receiver side. If another packet is received
before the reader application gets the first then the TCP stack will pass
as much data as it can when the application does perform a read, i.e. data
from both messages if it can and maybe partial data from the second.

Your problem is that the receiver application is making invalid
assumptions about TCP and is ending up with race conditions. The only real
solution is to fix the receiver application. Using some sort of message
pacing (ick) i.e. a minimum time interval between writing messages might
improve the situation you have (i.e. make "losing" the race less likely)
but it isn't a real solution.

Lawrence

Thanks for your follow-up information Lawrence. I have since looked in
more depth at the packet trace (following through the TCP sequence and
acknowledgement numbers) and found what the problem is. The data in
question IS actually being sent in separate packets initially, but the
receiving end never acknowledges any of the 3 packets that are sent.
Consequently, the transmitting end times out and retransmits the TCP
data, but because it now has 3 lots of data in the buffer, as you
quite rightly say, it sends them in one packet (exactly as it should
do). So the problem is indeed in the receiving end application and as
it turns out, nothing to do with the operation of the TCP stack.

Thanks again for all the replies.

Pete
 
Esmond Pitt

Pete said:
Thanks for your follow-up information Lawrence. I have since looked in
more depth at the packet trace (following through the TCP sequence and
acknowledgement numbers) and found what the problem is. The data in
question IS actually being sent in separate packets initially, but the
receiving end never acknowledges any of the 3 packets that are sent.
Consequently, the transmitting end times out and retransmits the TCP
data, but because it now has 3 lots of data in the buffer, as you
quite rightly say, it sends them in one packet (exactly as it should
do). So the problem is indeed in the receiving end application and as
it turns out, nothing to do with the operation of the TCP stack.

Actually this is all down to the receiver's TCP stack, which is where
acknowledgements come from; the application plays no part in sending
ACKs. The receiver is probably just doing delayed ACKs (unless it has
a strange TCP stack implementation).

The fact remains that the application must be coded to cope with a data
stream rather than discrete messages.
 
