network programming: how does s.accept() work?

Steve Holden

7stud said:
If two sockets are bound to the same host and port on the server, how
does data sent by the client get routed? Can both sockets recv() the
data?

"Routing" traditionally means passing hop-by-hop from one IP address to
another, but I'll assume that you are actually concerned about delivery
of packets from two separate clients - let's call them (addr1, p1) and
(addr2, p2) - to two server processes both using the same endpoint
address, which we will call (addrS, pS). In all cases the first member of
the tuple is an IP address and the second is a port number.

Note that the condition I mentioned earlier (with the caveat added by
Roy) ensures that while addr1 and addr2 might be the same, or p1 and p2
might be the same, they can *never* be the same together: if the TCP
layer at addr1 allocates port p1 to one client process, when another
client process asks for an ephemeral port TCP guarantees that it won't
be given p1, because that is already logged as in use by another process.

So, in Python terms that represents a guarantee that

(addr1, p1) != (addr2, p2)

and consequently (addr1, p1, addrS, pS) != (addr2, p2, addrS, pS)

Now, when a packet arrives at the server system addressed to the server
endpoint, the TCP layer (which maintains a record of *both* endpoints
for each connection) simply looks at the packet's source address and port
number to determine which connection, and hence which process of the
potentially many using (addrS, pS), it needs to be delivered to.
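
As a rough illustration of that lookup, here is a toy model in Python. It is only a sketch of the idea, not how any real TCP stack is written; the addresses, ports, process names and the register()/deliver() helpers are all made up for the example.

```python
# Toy model of TCP demultiplexing: each connection is keyed by the full
# (remote addr, remote port, local addr, local port) 4-tuple.
# Every address, port and owner name here is hypothetical.

connections = {}

def register(remote, local, owner):
    """Record which process owns the connection between remote and local."""
    connections[remote + local] = owner

def deliver(src, dst):
    """Return the owner of the connection that a packet src -> dst belongs to."""
    return connections.get(src + dst)

server = ("192.0.2.10", 80)          # (addrS, pS)
client1 = ("198.51.100.1", 40001)    # (addr1, p1)
client2 = ("198.51.100.1", 40002)    # (addr2, p2): same host, different port

register(client1, server, "server process A")
register(client2, server, "server process B")

print(deliver(client1, server))
print(deliver(client2, server))
```

Both connections share the same server endpoint, yet the two keys differ because the client halves differ, so each packet maps to exactly one owner.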

If this isn't enough then you should really take this problem to a
TCP/IP group. It's pretty basic, and until you understand it you will
never make sense of TCP/IP client/server communications.

http://holdenweb.com/linuxworld/NetProg.pdf

might help, but I don't guarantee it.

regards
Steve
 
Micah Cowan

Hrvoje said:
Actually the client is the one that allocates a new port. All
connections to a server remain on the same port, the one it listens
on:

(Hey, I know you! ;) )

Right.

7stud, what you seem to be missing, and what I'm not sure if anyone has
clarified for you (I have only skimmed the thread), is that in TCP,
connections are uniquely identified by a /pair/ of sockets (where
"socket" here means an address/port tuple, not a file descriptor). It is
fine for many, many connections to use the same local port and IP
address, so long as the other end has either a different IP address _or_
a different port. There is no issue with lots of processes sharing the
same socket for various separate connections, because the /pair/ of
sockets is what identifies them. See the "Multiplexing" portion of
section 1.5 of the TCP spec (http://www.ietf.org/rfc/rfc0793.txt).

Reading some of what you've written elsewhere on this thread, you seem
to be confusing this address/port stuff with what accept() returns. This
is hardly surprising, as unfortunately, both things are called
"sockets": the former is called a socket in the various RFCs, the latter
is called a socket in documentation for the Berkeley sockets and similar
APIs. What accept() returns is a new file descriptor, but the local
address-and-port associated with this new thing is still the very same
ones that were used for listen(). All the incoming packets are still
directed at port 80 (say) of the local server by the remote client.
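
This can be checked on a loopback connection. The following is a minimal sketch, assuming a machine where binding to 127.0.0.1 is permitted; the accept_once() helper and the dictionary keys are names invented for the example.

```python
import socket

def accept_once():
    """Open a loopback listener, connect to it, and report the addresses involved."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    conn, peer = listener.accept()    # conn is a new socket object / descriptor

    result = {
        "listener_local": listener.getsockname(),
        "accepted_local": conn.getsockname(),
        "accepted_peer": peer,
        "client_local": client.getsockname(),
        "different_descriptor": conn.fileno() != listener.fileno(),
    }
    for s in (conn, client, listener):
        s.close()
    return result

info = accept_once()
print(info)
```

The accepted socket has a different file descriptor from the listener, but its local address and port are the listener's; only the remote half (the client's ephemeral port) is new.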

It's probably worth mentioning at this point that, while what I said
about many different processes all using the same local address/port
combination is true, in implementations of the standard Berkeley sockets
API the only way you'd _arrive_ at that situation is for all of those
connections with the same local address/port combination to have come
from the same listen() call (ignoring mild exceptions that involve one
server finishing up connections while another accepts new ones). That is
because while one process has a socket descriptor bound to a particular
address/port, no other process is allowed to bind to that combination.
However, for listening sockets,
that one process is allowed to accept many connections on the same
address/port. It can handle all those connections itself, or it can fork
new processes, or it can pass these connected sockets down to
already-forked processes. But all of them continue to be bound to the
same local address-and-port.
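
A small sketch of the "no second bind" rule, assuming default socket options (no SO_REUSEADDR or SO_REUSEPORT set). The second_bind_errno() helper is a name made up here, and the exact behaviour is up to the operating system, so treat the result as what commonly happens rather than a guarantee.

```python
import errno
import socket

def second_bind_errno():
    """Bind two TCP sockets to the same address/port; return the errno, or None."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        first.listen(1)
        try:
            second.bind(first.getsockname())   # same (address, port) pair
        except OSError as exc:
            return exc.errno
        return None                    # the OS allowed the second bind
    finally:
        first.close()
        second.close()

err = second_bind_errno()
print(errno.errorcode.get(err, err))
```

On the usual desktop and server systems the second bind() is refused with EADDRINUSE, while any number of accepted connections can still share that local address and port.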

Note that, if the server's port were to change arbitrarily for every
successful call to accept(), it would make it much more difficult to
filter and/or analyze TCP traffic. If you're running, say, tcpdump, the
knowledge that all the packets on a connection that was originally
directed at port 80 of google.com, will continue to go to port 80 at
google.com (even though there are many, many, many other connections out
there on the web from other machines that are all also directed at port
80 of google.com), is crucial to knowing which packets to watch for
while you're looking at the traffic.
 
Grant Edwards

7stud, what you seem to be missing, and what I'm not sure if anyone has
clarified for you (I have only skimmed the thread), is that in TCP,
connections are uniquely identified by a /pair/ of sockets (where
"socket" here means an address/port tuple, not a file descriptor).

Using the word "socket" as a name for an address/port tuple is
precisely what's causing all the confusion. An address/port
tuple is simply not a socket from a python/Unix/C point of
view, and a socket is not an address/port tuple.

It is fine for many, many connections to use the same local
port and IP address, so long as the other end has either a
different IP address _or_ a different port. There is no issue
with lots of processes sharing the same socket for various
separate connections, because the /pair/ of sockets is what
identifies them. See the "Multiplexing" portion of section 1.5
of the TCP spec (http://www.ietf.org/rfc/rfc0793.txt).

Exactly.

Reading some of what you've written elsewhere on this thread,
you seem to be confusing this address/port stuff with what
accept() returns. This is hardly surprising, as unfortunately,
both things are called "sockets": the former is called a
socket in the various RFCs,

I must admit I wasn't familiar with that usage (or had forgotten
it).
 
Roy Smith

7stud said:
If two sockets are bound to the same host and port on the server, how
does data sent by the client get routed? Can both sockets recv() the
data?

Undefined.

You certainly won't find the answer in the RFCs which define the protocol
because sockets aren't part of the protocol.

Unfortunately, you won't find the answer in the Socket API documentation
either because the socket API documentation is pretty vague about most
stuff.

One possible answer is that the operating system won't let you bind two
sockets to the same (address, port) pair. But, another possibility is that
it will. And even if it won't, consider the case of a process which forks;
the child inherits the already bound socket from the parent.

So, either way, you're left with the question, what happens with two
sockets both bound to the same (address, port) pair? For the sake of
simplicity, I'm assuming UDP, so there's no connection 4-tuple to worry
about. The answer is, again, undefined. One reasonable answer is that
packets received by the operating system are doled out round-robin to all
the sockets bound to that port. Another is that they're duplicated and
delivered to all sockets. Anything is possible.
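
The inherited-descriptor case can be approximated in one process with socket.dup(), which gives a second descriptor for the same bound socket, much as a forked child would have. This is only a sketch, assuming loopback UDP; read_from_shared_socket() is a made-up helper name, and it says nothing about two independently bound sockets, where the behaviour is up to the OS.

```python
import socket

def read_from_shared_socket():
    """Send one datagram to a socket that has two descriptors; try to read it twice."""
    a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    a.bind(("127.0.0.1", 0))    # port 0: let the OS pick a free port
    b = a.dup()                 # second descriptor for the same bound socket
    a.settimeout(2.0)
    b.settimeout(0.2)

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(b"hello", a.getsockname())

    first = a.recv(1024)        # the datagram is consumed here
    try:
        second = b.recv(1024)   # nothing left for the other descriptor
    except socket.timeout:
        second = None
    for s in (a, b, sender):
        s.close()
    return first, second

first, second = read_from_shared_socket()
print(first, second)
```

In this particular case both descriptors refer to one socket with one receive queue, so the datagram goes to whichever descriptor reads first and is neither duplicated nor shared out.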

But, as other posters have said, this really isn't a Python question. This
is a networking API question. Python just gives you a very thin layer on
top of whatever the operating system gives you, and lets all the details of
the OS implementation quirks shine through.
 
Gabriel Genellina

---
When you surf the Web, say to http://www.google.com, your Web browser
is a client. The program you contact at Google is a server. When a
server is run, it sets up business at a certain port, say 80 in the
Web case. It then waits for clients to contact it. When a client does
so, the server will usually assign a new port, say 56399, specifically
for communication with that client, and then resume watching port 80
for new requests.

You should *not* trust all you find on the Net...
 
Micah Cowan

Grant said:
Using the word "socket" as a name for an address/port tuple is
precisely what's causing all the confusion. An address/port
tuple is simply not a socket from a python/Unix/C point of
view, and a socket is not an address/port tuple.

FWIW, the word was used to mean the address/port tuple (RFC 793) before
there was ever a python/Unix/C concept of "socket".

And I totally agree that it's confusing; but I submit that IETF has a
stronger claim over the term than Unix/C/Python, which could have just
stuck with "network descriptor" or some such. ;)
 
Micah Cowan

Gabriel said:
You should *not* trust all you find on the Net...

Didn't give it a thorough read, but I did see a section about the server
setting up a new socket, called a "connection socket".

Which isn't incorrect, but proves Grant's point rather well, that the
confusion is due to the overloaded term "socket". In that context, it's
speaking quite clearly of the "Python/C/Unix" concept of a "socket", and
not (as some other texts do) of the address/port combination.

To reiterate for 7stud, accepting a new "connection socket" does _not_
change the address or port from the originally bound "for-listening" socket.
 
Grant Edwards

FWIW, the word was used to mean the address/port tuple (RFC
793) before there was ever a python/Unix/C concept of
"socket".

I could claim I was innocently unaware of that usage, though I
have read the RFCs, so I'll go with Steve Martin's classic
excuse: "I forgot."

And I totally agree that it's confusing; but I submit that
IETF has a stronger claim over the term than Unix/C/Python,
which could have just stuck with "network descriptor" or some
such. ;)

They probably had to come up with a system call name that was
uniquely identified by six characters or something like that.
 
