Should I disable Nagle's Algorithm in my HTTP client?

phus

I wrote a Perl script to download many small files (<1 MB each) from the
Internet (using Socket 1.78), but the script seems to run slowly.
Should I set TCP_NODELAY with setsockopt?
BTW: I found that while the script is running, garbage collection in Perl
makes it run very slowly. Can anybody help? Thanks very much!
 
Thrill5

Do not turn on TCP_NODELAY; it will not make your script any faster. Nagle's
algorithm, TCP_NODELAY and TCP delayed ACKs are widely misunderstood. As a
general rule, setting this option does NOTHING for application performance,
and could actually make the application run slower.

TCP is implemented as a stream. You write to the stream and the data is
buffered. The data in the buffer is packaged and sent immediately UNLESS
previously sent data is still unacknowledged. If there is unacknowledged
data, the data in the buffer is sent when the buffer reaches the MSS
(maximum segment size, which is 1460 bytes on an Ethernet network) or larger,
or when the previously sent data is ACKed by the receiver, or when you
close the socket. If you set TCP_NODELAY, a packet is sent every time you
write to the stream. If "TCP Delayed Acknowledgement" is enabled on
the receiver, the receiver can wait up to approximately 200 milliseconds
before ACKing the data: it waits until it has data of its own to send, or
until the 200 ms expires. Note that most operating systems offer a way to
turn off delayed acknowledgements or adjust the delay time.
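To make the rule above concrete, here is a rough sketch of the send decision as a plain Perl function. It is a toy model, not a real TCP stack: nagle_should_send() is a made-up name, and the 1460-byte MSS is the Ethernet figure assumed above.

```perl
use strict;
use warnings;

# Hypothetical helper, not a real TCP stack: models the send decision
# described above for a socket with Nagle's algorithm enabled.
#   $buffered - bytes written by the application but not yet sent
#   $unacked  - bytes already sent and still awaiting an ACK
#   $mss      - maximum segment size (1460 assumed for Ethernet)
sub nagle_should_send {
    my ($buffered, $unacked, $mss) = @_;
    return 0 if $buffered == 0;       # nothing to send
    return 1 if $unacked == 0;        # nothing in flight: send at once
    return 1 if $buffered >= $mss;    # a full segment is ready: send it
    return 0;                         # otherwise hold until the ACK arrives
}

my $mss = 1460;
for my $case ([1, 0], [999, 1], [1460, 1]) {
    my ($buffered, $unacked) = @$case;
    printf "buffered=%4d unacked=%d -> %s\n", $buffered, $unacked,
        nagle_should_send($buffered, $unacked, $mss) ? "send" : "hold";
}
```

The middle case is the one that matters in practice: a small write made while earlier data is still unacknowledged sits in the buffer.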

Now, you might think that setting TCP_NODELAY is a GREAT idea because it
will eliminate a 200 ms delay between packets, but this is not the case: it
will eliminate only the delay between the first and second packets. Unless you
are writing an application that is extremely time sensitive, like a telnet
client or telnet server where ANY delay would be noticeable because of user
interaction, NEVER EVER EVER set TCP_NODELAY. Why? Because if you write
data to the stream in small increments, you send a packet to the network on
EVERY write, with a TCP/IP header on every packet as a penalty.
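For reference, this is roughly how the option can be inspected (and, for the rare telnet-like case, set) from Perl. It is only a sketch: nodelay_flag() is a hypothetical helper, and an unconnected socket is used just so the snippet runs without a server.

```perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM IPPROTO_TCP TCP_NODELAY);

# Hypothetical helper: return the TCP_NODELAY flag of a socket
# (0 means Nagle's algorithm is in effect, non-zero means it is disabled).
sub nodelay_flag {
    my ($sock) = @_;
    my $packed = getsockopt($sock, IPPROTO_TCP, TCP_NODELAY)
        or die "getsockopt TCP_NODELAY: $!";
    return unpack("I", $packed);
}

# An unconnected TCP socket is enough to look at the option.
socket(my $sock, PF_INET, SOCK_STREAM, IPPROTO_TCP) or die "socket: $!";
my $before = nodelay_flag($sock);

# Only for the rare interactive, telnet-like application:
setsockopt($sock, IPPROTO_TCP, TCP_NODELAY, 1) or die "setsockopt: $!";
my $after = nodelay_flag($sock);

print "TCP_NODELAY before: $before, after: $after\n";
close $sock;
```

A freshly created socket has the flag off, so a download script that never touches the option is already using Nagle's algorithm.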

Let's look at a simplified, extreme example to see why TCP_NODELAY doesn't
really help you and could hurt you.

Let's say you need to send 1,000 bytes of data, written to the stream one
byte at a time with putc(). With TCP_NODELAY enabled, you would need to send
41,000 bytes of traffic to the receiver (40-byte header plus 1 byte of data =
41 bytes, times 1,000 = 41,000 bytes). With TCP_NODELAY off, you would send the
same data in 1,080 bytes of traffic: 40-byte header plus 1 byte = 41 bytes
for the first packet; the next 999 times you call putc() the data would be
buffered until the first packet was acknowledged 200 ms later, at which time
you would send the second packet with a 40-byte header plus 999 bytes of
data = 1,039 bytes; 41 bytes for the first packet plus 1,039 bytes for
the second = 1,080 bytes. Is the first scenario any faster or slower than
the second? It depends on the network. It is faster by about 200
milliseconds in a LAN environment where the serialization delay is
virtually non-existent, but in a WAN environment it could be MUCH slower
because the serialization delay could be longer than the delayed ACK timer
on the receiver. A T1 (1.544 Mb/s) would take approximately 212 ms to send
41,000 bytes of data, and a 256 Kb/s link has a serialization delay of about
1.28 seconds for the same amount of data. The serialization delay on the
256 Kb/s circuit is about 34 ms for 1,080 bytes. I'm only counting the
serialization delay here, not the delay added by routers for processing or
queuing due to congestion (if any).
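If you want to check the arithmetic yourself, a small sketch like the following reproduces it. The 40-byte header and the two link speeds come from the example; wire_bytes() and serialization_ms() are made-up helper names, and 256 Kb/s is taken as 256,000 bits per second.

```perl
use strict;
use warnings;

my $HEADER = 40;    # TCP/IP header bytes per packet, as in the example

# Total bytes on the wire; @payloads lists the data bytes in each packet.
sub wire_bytes {
    my @payloads = @_;
    my $total = 0;
    $total += $HEADER + $_ for @payloads;
    return $total;
}

# Serialization delay in milliseconds for $bytes on a link of $bps bits/s.
sub serialization_ms {
    my ($bytes, $bps) = @_;
    return $bytes * 8 / $bps * 1000;
}

my $nodelay = wire_bytes((1) x 1000);    # 1,000 one-byte packets
my $nagle   = wire_bytes(1, 999);        # one 1-byte packet, then the rest

printf "TCP_NODELAY on : %d bytes\n", $nodelay;
printf "TCP_NODELAY off: %d bytes\n", $nagle;
for my $link (["T1 1.544 Mb/s", 1_544_000], ["256 Kb/s", 256_000]) {
    my ($name, $bps) = @$link;
    printf "%-14s %8.1f ms vs %6.1f ms\n", $name,
        serialization_ms($nodelay, $bps), serialization_ms($nagle, $bps);
}
```

Changing the payload list or the link speed shows how quickly the per-packet header overhead dominates on slow links.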

The delayed ACK of the second packet doesn't have any effect because the
data sent to the receiver is given to the listener immediately. For HTTP
traffic, the listener is the HTTP server and would reply immediately with
the requested page. Your application doesn't care or know about ACKs so the
last packet of data never has a penalty. As a matter of fact, the first
packet of data back to the requestor will ACK the last data packet sent by
the requestor.

If you change the write size, or the total amount of data to send, the most
you are ever going to eliminate is 200 ms, the delayed ACK time.

If you change the way the data is written to the stream by doing a single
write of the 1,000 bytes, TCP_NODELAY does nothing for you, because in both
cases only a single packet is written and neither has a delayed ACK penalty.
With this in mind, don't use TCP_NODELAY and don't use the stream as a
string buffer to send commands. TCP_NODELAY will only fix situations where
you "write twice, read once". Instead of turning on TCP_NODELAY, the
application should "write once, read once". Need to send the command "GET
index.html\n\n"? Don't do this:

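my $url = "index.html";
$sock->print("GET ");            # first write goes out immediately
$sock->print("$url\n\n");        # second write waits for the first ACK
$sock->read(my $data, 4096);     # read() needs a length; 4096 is arbitrary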

Two packets sent, with a 200ms delay between the first and second.

Do this instead:
my $url = "index.html";
$sock->print("GET $url\n\n");    # the whole request in a single write
$sock->read(my $data, 4096);     # read() needs a length; 4096 is arbitrary
One packet, no delay.
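If the request is assembled from several pieces, one way to stay with "write once, read once" is to join the pieces first and write the result in one go. The sketch below uses a made-up helper, send_request(), and an in-memory filehandle standing in for the socket so that it runs without a server.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical helper: join the pieces of a request and hand them to the
# handle in a single print, so the request leaves the application as one
# write instead of several.
sub send_request {
    my ($fh, @parts) = @_;
    my $request = join '', @parts;
    $fh->print($request) or die "write failed: $!";
    $fh->flush;
    return length $request;
}

# An in-memory handle stands in for a connected socket in this sketch.
open my $fh, '>', \my $sent or die "open: $!";
my $bytes = send_request($fh, "GET ", "index.html", "\n\n");
close $fh;

print "sent $bytes bytes in one write\n";
```

With a real socket the call would look the same, passing the connected handle in place of $fh.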

As a network engineer, I can tell you that in looking at "application
slowness" problems, never once was the solution to turn on TCP_NODELAY.
When looking at these problems, seeing a run of tiny segments, each with the
"PSH" bit set (a pattern that usually means TCP_NODELAY is enabled), is
ALWAYS a huge RED flag. Turning it off has always made the application
faster, or made NO difference at all. In those cases where it makes no
difference, the problem is always due to poor application design with
respect to how the client and server communicate.

Scott
 
