Definition of a socket on Sun's website

lewmania942

Hi,

there's an original definition of a socket at the following URL:

http://java.sun.com/docs/books/tutorial/networking/sockets/definition.html

The page is called "What is a Socket?" and the description is not long.

The author states that:
If everything goes well, the server accepts the connection. Upon
acceptance, the server gets a new socket bound to a different port.
It needs a new socket (and consequently a different port number) so
that it can continue to listen to the original socket for connection
requests while tending to the needs of the connected client.

I don't understand that. Moreover, there's a drawing where, upon
acceptance of a new connection, the *server* is using a new port.

I always thought that a TCP connection was uniquely identified by a
*pair* of sockets (server:port and client:port) and that a "server"
socket could be simultaneously used in several connections (and I
double checked by reading the RFCs).

That page got me so confused that I decided to do some testing.

So I did a test here on my local network. I set up Tomcat on one
machine, and configured the stateful firewall of this machine to
drop *every* packet except those coming from another machine
destined for port 8080 and those leaving port 8080 and going
to another machine. So the server, for that simple test, wasn't
even allowed to talk to itself (i.e. no traffic allowed on 127.0.0.1).

I really wanted to be sure that no other port than 8080 was used
on the server.

And sure enough I could connect from several other machines to the
server... Using only port 8080 on the server.

Now I'm puzzled and I've got several questions for the
gurus out there...

1. Isn't the following phrase simply plain wrong:
It needs a new socket (and consequently a different port number)
so that it can continue to listen to the original socket for
connection requests while tending to the needs of the connected
client.

!?


2. Is there a theoretical limit on how many simultaneous TCP
connections a single server (no load-balancing) can handle
on a single socket? For example, could a web server accept
100.000 simultaneous connections on port 80, from 100.000
different clients (and hence have 100.000 different
"single server socket" <--> socket pairs identifying
100.000 connections)? (given it has enough bandwidth/processing
power, etc.)


3. In one thread posted on cljh a few months ago, called "Socket woes",
someone said:
After a socket is closed the port number remains unavailable
for a time (four minutes "by statute," IIRC, although it's
fairly common for Web servers to use shorter intervals); this
is to allow time for stale packets to expire from the network.
(You wouldn't want a packet that had been temporarily trapped
in a routing loop to escape and disrupt a new unrelated
connection that happened to use the same port number ...)

The message can be found here:

http://groups-beta.google.com/group/comp.lang.java.help/msg/a74f1800615b9b6f?dmode=source

Does a web server prevent a client from reconnecting from the exact
same IP, with the exact same (client) port number before some
time interval? Or is this precaution taken on the client's side?

Could some hypothetical "stale packets" (in the case of the same
client, on the same IP, using the same "client socket") really disrupt
an HTTP connection? Doesn't "HTTP over TCP over IP" prevent this?
(I mean, wouldn't those "stale packets" simply be discarded?)

Thanks in advance for any explanations,

Lew
 
Alan Krueger

I don't understand that. Moreover, there's a drawing where, upon
acceptance of a new connection, the *server* is using a new port.

I always thought that a TCP connection was uniquely identified by a
*pair* of sockets (server:port and client:port) and that a "server"
socket could be simultaneously used in several connections (and I
double checked by reading the RFCs).

That seems confusing to me as well. A separate socket with a separate
underlying file descriptor is usually created from the listening socket
to handle the ongoing separate connection, but I didn't think that it
used a different port.
 
Murat Tasan

okay. basically, a server "listens" on some port. web servers, for
example, listen on port 80 most often.

once a connection is established with client A, the server machine assigns
a new port for the specific communication with client A, say arbitrarily
port 37000.

port 80 continues to listen for new client connections, while port 37000
now does the talking to client A. this new port number is handled most
often lower in the network stack than you need to worry about, but you can
witness this by monitoring a web server closely (install your own web
server with session support, and start two simultaneous connections
(sessions), and you should be able to see that both will be using
different ports, and both will be different than port 80).

now say client B comes along and requests something through port 80. then
the server assigns a new port to talk directly to client B. say port
37001.

now the server can talk to both clients through dedicated ports, while
still listening on port 80 for new connections.

for a great talk on this stuff, you can try the book:

Interprocess Communications in Linux
by John Shapley Gray.

it doesn't cover java, but lower level unix IPC stuff (i actually have the
older version for true unix, not linux). an understanding at that level
helps a great deal when programming at a higher level in the network stack.

another good text is:

Computer Networks
by Tanenbaum.

hope that helps,

murat
 
Gordon Beaton

okay. basically, a server "listens" on some port. web servers, for
example, listen on port 80 most often.

once a connection is established with client A, the server machine
assigns a new port for the specific communication with client A, say
arbitrarily port 37000.

This is patently false.

There is no need for the server to do any such juggling with port
numbers, simply because a port number is not the same thing as a socket
and does not uniquely identify the connection.

If the client connects to port 80, it will remain connected to port
80. There can and will be multiple, distinct sockets open at the
server, all using port 80. There is no conflict since, as the original
poster writes, the connection is uniquely identified by a pair of
sockets, or a 4 tuple consisting of the server IP and port number, and
the client IP and port number.

Check for yourself, using netstat or a similar tool to see the
connections, or a network sniffer like tcpdump to see the packets.
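
The same check can also be made from Java itself. The following is only a minimal sketch, assuming a loopback test on an OS-assigned port; the class name SamePortDemo and the method observePorts are made up for illustration and do not come from the tutorial or from this thread. It connects two clients to one listening socket and reports the ports seen on the server side.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SamePortDemo {
    /**
     * Connects two clients to one listening socket and returns
     * { listening port,
     *   local port of accepted socket A, local port of accepted socket B,
     *   remote (client) port of A, remote (client) port of B }.
     */
    static int[] observePorts() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port, so the sketch does not
        // collide with a real service. The backlog of 50 is arbitrary.
        try (ServerSocket listener = new ServerSocket(0, 50, loopback)) {
            int port = listener.getLocalPort();
            try (Socket clientA = new Socket(loopback, port);
                 Socket acceptedA = listener.accept();
                 Socket clientB = new Socket(loopback, port);
                 Socket acceptedB = listener.accept()) {
                return new int[] {
                    port,
                    acceptedA.getLocalPort(),
                    acceptedB.getLocalPort(),
                    acceptedA.getPort(),
                    acceptedB.getPort()
                };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] p = observePorts();
        System.out.println("listening port:          " + p[0]);
        System.out.println("accepted A local port:   " + p[1]);
        System.out.println("accepted B local port:   " + p[2]);
        System.out.println("accepted A remote port:  " + p[3]);
        System.out.println("accepted B remote port:  " + p[4]);
    }
}
```

The first three numbers should be identical, while the two remote ports differ; that difference in the client half of the 4-tuple is what keeps the two connections apart.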

/gordon
 
Steve Horsley

Hi,

there's an original definition of a socket at the following URL:

http://java.sun.com/docs/books/tutorial/networking/sockets/definition.html

The page is called "What is a Socket?" and the description is not long.

The author states that:




I don't understand that. Moreover, there's a drawing where, upon
acceptance of a new connection, the *server* is using a new port.

I'm not surprised. It's complete bollocks. I wrote to Sun years
ago pointing out that they were wrong, but they didn't bother to
answer. I don't know why they have a "feedback" link at all.

I always thought that a TCP connection was uniquely identified by a
*pair* of sockets (server:port and client:port) and that a "server"
socket could be simultaneously used in several connections (and I
double checked by reading the RFCs).

And you were right.

That page got me so confused that I decided to do some testing.

So I did a test here on my local network. I set up Tomcat on one
machine, and configured the stateful firewall of this machine to
drop *every* packet except those coming from another machine
destined for port 8080 and those leaving port 8080 and going
to another machine. So the server, for that simple test, wasn't
even allowed to talk to itself (i.e. no traffic allowed on 127.0.0.1).

I really wanted to be sure that no other port than 8080 was used
on the server.

And sure enough I could connect from several other machines to the
server... Using only port 8080 on the server.

Good work. Can I recommend Ethereal? It's an excellent open
source protocol analyser, and I think you would find it very
interesting and educational. It will also re-confirm your results
above.

Now I'm puzzled and I've got several questions for the
gurus out there...

1. Isn't the following phrase simply plain wrong:

It's not just wrong. It is also very misleading and wastes many
people's time.

2. Is there a theoretical limit on how many simultaneous TCP
connections a single server (no load-balancing) can handle
on a single socket? For example, could a web server accept
100.000 simultaneous connections on port 80, from 100.000
different clients (and hence have 100.000 different
"single server socket" <--> socket pairs identifying
100.000 connections)? (given it has enough bandwidth/processing
power, etc.)

Yes. By my reckoning, one other machine can maintain 65535
connections (avoid port 0), and there are almost 2^32 other IP
addresses, giving a theoretical limit of nearly 2^48 connections. With
IPv6 it's an altogether bigger number.
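
The reckoning above can be written out as arithmetic. This is only a sketch of that estimate, assuming every 32-bit IPv4 address could act as a client (an overestimate, since many addresses are reserved); the class name ConnectionBound and the method ipv4Bound are made up for illustration.

```java
class ConnectionBound {
    /**
     * Upper bound on distinct (client address, client port) pairs that can
     * connect to one fixed server address and port over IPv4.
     */
    static long ipv4Bound() {
        long portsPerClient = 65_535L;    // ports 1..65535, port 0 excluded
        long clientAddresses = 1L << 32;  // all 32-bit addresses, an overestimate
        return portsPerClient * clientAddresses;
    }

    public static void main(String[] args) {
        System.out.println("bound: " + ipv4Bound());
        System.out.println("2^48:  " + (1L << 48));
    }
}
```

The product is 2^48 minus 2^32, which is where the "nearly 2^48" figure comes from. In practice memory, file-descriptor limits and similar per-connection resources run out long before this bound is reached.
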
3. In one thread posted on cljh a few months ago, called "Socket woes",
someone said:




The message can be found here:

http://groups-beta.google.com/group/comp.lang.java.help/msg/a74f1800615b9b6f?dmode=source

Does a web server prevent a client from reconnecting from the exact
same IP, with the exact same (client) port number before some
time interval? Or is this precaution taken on the client's side?

I have a feeling that both ends are required to implement this
hold-down, but I'm not certain. Consult the RFC.

Could some hypothetical "stale packets" (in the case of the same
client, on the same IP, using the same "client socket") really disrupt
an HTTP connection? Doesn't "HTTP over TCP over IP" prevent this?
(I mean, wouldn't those "stale packets" simply be discarded?)

Yes. If a stale packet disrupted the TCP connection, that
connection will then get reset, effectively slamming the phone
down. Certain denial of service attacks work by sending such
packets. Of course, a smart application could simply reconnect
and carry on.
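
On the hold-down itself: in TCP this is the TIME-WAIT state, which is entered by the end that closes the connection first. Java exposes the related SO_REUSEADDR option through ServerSocket.setReuseAddress, which the Java API documentation describes as allowing a bind even though a previous connection on that address is still in the timeout state. The following is only a sketch of how a server might set it, assuming a loopback listener; the class name ReuseAddressSketch and the methods openListener and demo are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

class ReuseAddressSketch {
    /** Opens a listening socket with SO_REUSEADDR enabled before binding. */
    static ServerSocket openListener(int port) throws IOException {
        ServerSocket listener = new ServerSocket();  // created unbound
        try {
            // The option has to be set before bind() to have a defined effect.
            listener.setReuseAddress(true);
            listener.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
            return listener;
        } catch (IOException e) {
            listener.close();  // do not leak the socket if setup fails
            throw e;
        }
    }

    /** Opens a listener on an OS-assigned port and reports whether the option stuck. */
    static boolean demo() {
        try (ServerSocket listener = openListener(0)) {
            return listener.isBound() && listener.getReuseAddress();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bound with SO_REUSEADDR: " + demo());
    }
}
```

This only affects whether the listening port can be bound again; it does not change how TCP treats stale segments on an existing connection.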

Steve
 
Esmond Pitt

http://java.sun.com/docs/books/tutorial/networking/sockets/definition.html
The page is called "What is a Socket?" and the description is not long.
The author states that:

The author is wrong: the port number is the same for listening socket
and accepted socket. There is no reason for it to be different and there
is nowhere in the TCP/IP protocol where the server can tell the client
about a new port number for the connection.

I first reported this about five years ago but nothing has changed.
 
Remon van Vliet

Yes, a TCP connection is uniquely defined by (remote_addr, remote_port,
local_addr, local_port). Any variation in any of those fields enables a new
connection.
That seems confusing to me as well. A separate socket with a separate
underlying file descriptor is usually created from the listening socket
to handle the ongoing separate connection, but I didn't think that it
used a different port.
It does, once connection is accepted the connection gets a new local port on
the server.
 
Remon van Vliet

Remon van Vliet said:
Yes, a TCP connection is uniquely defined by (remote_addr, remote_port,
local_addr, local_port). Any variation in any of those fields enables a new
connection.
It does, once connection is accepted the connection gets a new local port on
the server.

Woopsy, made a mistake there, that obviously isn't true. Same local port on
the server side (namely the port number it's listening on). Sorry ;)
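
The 4-tuple rule can be modelled as a lookup table keyed on all four fields. This is only an illustrative sketch, not how any particular TCP stack is implemented; the class FourTupleDemo, the key type ConnKey and the documentation-range addresses are all made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class FourTupleDemo {
    /** Hypothetical connection identifier: all four fields take part in equality. */
    static final class ConnKey {
        final String localAddr;
        final int localPort;
        final String remoteAddr;
        final int remotePort;

        ConnKey(String localAddr, int localPort, String remoteAddr, int remotePort) {
            this.localAddr = localAddr;
            this.localPort = localPort;
            this.remoteAddr = remoteAddr;
            this.remotePort = remotePort;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ConnKey)) return false;
            ConnKey k = (ConnKey) o;
            return localPort == k.localPort && remotePort == k.remotePort
                    && localAddr.equals(k.localAddr) && remoteAddr.equals(k.remoteAddr);
        }

        @Override
        public int hashCode() {
            return Objects.hash(localAddr, localPort, remoteAddr, remotePort);
        }
    }

    /** Registers connections that all share local port 80 and counts the distinct ones. */
    static int countConnections() {
        Map<ConnKey, String> table = new HashMap<>();
        table.put(new ConnKey("192.0.2.1", 80, "198.51.100.7", 40001), "client A, first");
        table.put(new ConnKey("192.0.2.1", 80, "198.51.100.7", 40002), "client A, second");
        table.put(new ConnKey("192.0.2.1", 80, "203.0.113.9", 40001), "client B");
        // Identical 4-tuple: this replaces the previous entry instead of adding one.
        table.put(new ConnKey("192.0.2.1", 80, "203.0.113.9", 40001), "client B again");
        return table.size();
    }

    public static void main(String[] args) {
        System.out.println(countConnections()); // prints 3
    }
}
```

Three distinct connections coexist on local port 80 because each differs in the remote address or remote port; only a tuple that matches in all four fields refers to the same connection.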
 
Raymond DeCampo

Esmond said:
The author is wrong: the port number is the same for listening socket
and accepted socket. There is no reason for it to be different and there
is nowhere in the TCP/IP protocol where the server can tell the client
about a new port number for the connection.

I first reported this about five years ago but nothing has changed.

Anybody have any idea where this confusion originated from? It seems
that quite a few people have bad information; the ultimate source is
probably not Sun's Java What is a Socket? page.

Ray
 
John C. Bollinger

Raymond said:
Anybody have any idea where this confusion originated from? It seems
that quite a few people have bad information; the ultimate source is
probably not Sun's Java What is a Socket? page.

I imagine the confusion arises partially from the fact that when the
server accepts a connection, it _does_ get a new socket with which to
communicate with the connected client. The new socket uses the same
local port as the listening socket, however (and is bound to a specific
local address, whereas the listening socket does not need to be). That
that's the origin is speculation, however.

In Java the situation is somewhat less confusing because the two
behaviors (accepting connections and communicating with connected
clients) are implemented by different classes (ServerSocket and Socket,
respectively), at least for TCP.
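
The split between the two classes, and the point about addresses, can be observed directly. This is only a sketch, assuming a loopback connection and a listener created without an explicit bind address; the class name ListenVsAccepted and the method inspect are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class ListenVsAccepted {
    /**
     * Returns { listener is bound to the wildcard address,
     *           accepted socket is bound to the loopback address,
     *           accepted socket uses the listener's port }.
     */
    static boolean[] inspect() {
        // No bind address given, so the ServerSocket listens on the wildcard address.
        try (ServerSocket listener = new ServerSocket(0);
             Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                        listener.getLocalPort());
             Socket accepted = listener.accept()) {
            return new boolean[] {
                listener.getInetAddress().isAnyLocalAddress(),
                accepted.getLocalAddress().isLoopbackAddress(),
                accepted.getLocalPort() == listener.getLocalPort()
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = inspect();
        System.out.println("listener on wildcard address:  " + r[0]);
        System.out.println("accepted socket on loopback:   " + r[1]);
        System.out.println("same local port:               " + r[2]);
    }
}
```

The ServerSocket only accepts; the Socket returned by accept() carries the data, pinned to the specific local address the client reached but still on the listening port.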
 
Thomas Weidenfeller

Raymond said:
Anybody have any idea where this confusion originated from? It seems
that quite a few people have bad information; the ultimate source is
probably not Sun's Java What is a Socket? page.

My guess here would be that people have difficulty distinguishing how
a concurrent TCP server and a concurrent UDP server work.

Several protocols on top of UDP do indeed change the server port number
to create the illusion of a connection - a property which UDP itself
does not have. Other UDP-based protocols exchange (dynamically bound)
port numbers in the payload. TFTP or SIP plus RTP/RTCP come to mind.

But there is more wrong with that particular web page. In general, a
socket in networking is a so-called "half association" (I leave it to
the reader to figure this one out :). In BSD Unix the term became
defined as being an endpoint for communication. BSD Unix also got a
system call of the same name. And BSD's networking API became the
de-facto standard for networking (IP and otherwise). So these days
people usually mean a data structure obtained via a socket() system call
(or a similar system call / method invocation in their environment)
which represents an endpoint for communication when they talk about a
socket.

None of these definitions is in any way TCP-specific or talks about
ports, as Sun's "definition" in the tutorial does.

/Thomas
 
Dale King

Esmond said:
The author is wrong: the port number is the same for listening socket
and accepted socket. There is no reason for it to be different and there
is nowhere in the TCP/IP protocol where the server can tell the client
about a new port number for the connection.

I first reported this about five years ago but nothing has changed.

Many of us I'm sure have complained to Sun and they still haven't
corrected it.

I do find however that this is in fact a common misconception and gets
repeated a lot (e.g. Murat's in-depth explanation that was completely
wrong). I don't think Sun is the originator of the error.

I think the misconception comes from assuming that an instance of a
socket has a one-to-one correspondence with a port number. When you
listen on a port you create some socket instance to listen on that
port (note that I am not limiting myself to Java here as the same
concepts apply to most networking APIs). When someone connects to that
socket your server code will get a new socket instance. The problem is
that people think that because you get a new socket instance, it means
you are using a different port. I too once made that false association.
 
