July 16, 2008

A Quick Note On Ports and Sockets

At this point, most people are familiar with the use of an IP address. Basically, it is a number that uniquely identifies a computer on a network. When we talk about programs on different computers communicating with each other, though, an IP address alone is not enough. A single computer may have many programs running that want to communicate over the network, so a single address point (the IP address) cannot say which program a remote computer would like to talk to. This is where port numbers come in. A program can bind itself to one or more open ports (meaning ports that are not currently in use by another program). Once the program is bound to a port, a remote computer can reach it by addressing the computer (IP) along with the program's address on that computer (Port).
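To make the idea of binding concrete, here is a minimal Python sketch using the standard socket module. The helper names bind_to_free_port and port_is_taken are made up for this illustration, and the loopback address 127.0.0.1 is an assumption so the example stays on one machine.

```python
import socket

def bind_to_free_port(host="127.0.0.1"):
    """Bind a TCP socket to a port picked by the OS; return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS for any port not currently in use
    return sock, sock.getsockname()[1]

def port_is_taken(host, port):
    """Return True if a second socket cannot bind to (host, port)."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
        return False
    except OSError:
        # Typically "address already in use": another socket holds the port.
        return True
    finally:
        probe.close()
```

Calling bind_to_free_port() and then port_is_taken("127.0.0.1", port) with the returned port should report that the port is taken for as long as the first socket stays open.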

A common misconception is that a socket represents a bound port on a computer (a single IP:Port pair). In reality, this is only half of what a connected socket represents. A connected socket represents two IP:Port pairs: one for the host and one for the client. This means that multiple remote computers can connect to a program via the same port while each connection is represented by a different socket. Each socket is unique because it has a different remote IP:Port pair, even though all of them share the same host IP:Port pair. To take this a little further, one remote computer can have multiple connections to the same port on a host. This is possible because each connection is bound to a different port on the remote machine, again giving us a unique combination of IP:Port pairs for each socket connection.
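The two-pair idea can be observed directly. The sketch below, again in Python and limited to the loopback address as an assumption, connects twice from the same machine to one listening port and records the host and remote pairs of each accepted connection. The function name open_two_connections is hypothetical.

```python
import socket

def open_two_connections():
    """Connect twice to one listening port; return [(host_pair, remote_pair), ...]."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the OS choose the host port
    listener.listen(5)
    server_addr = listener.getsockname()

    opened = [listener]
    pairs = []
    try:
        for _ in range(2):
            client = socket.create_connection(server_addr)
            opened.append(client)
            conn, _peer = listener.accept()
            opened.append(conn)
            # getsockname() is the host side, getpeername() the remote side.
            pairs.append((conn.getsockname(), conn.getpeername()))
    finally:
        for s in opened:
            s.close()
    return pairs
```

Both entries should share the same host pair, while the remote pairs should differ only in their port number, since each outgoing connection is given its own port on the connecting side.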

The way this generally works behind the scenes goes something like this. The host computer opens a socket and binds it to some port to listen for incoming connections. When a remote computer connects, the listening socket creates a new socket that is associated with both the host IP:Port pair and the connecting remote IP:Port pair. The original listening socket then continues listening on its bound port for more incoming connections. Sockets are typically implemented this way because a socket can represent only a single connection at a time. The original socket could have handled the communications of the incoming connection itself, but there would then be no socket left open to listen for other incoming connections.
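The listen-and-accept pattern described above might look roughly like this in Python. This is only a sketch under simple assumptions: everything runs on the loopback address, each message fits in a single recv call, and the names serve and run_demo are invented for the example.

```python
import socket
import threading

def serve(listener, max_clients):
    """Accept max_clients connections, replying to one message on each."""
    for _ in range(max_clients):
        # accept() returns a NEW socket for this one connection.
        conn, _peer = listener.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(data.upper())
        # The listener itself is untouched, so it can accept again.

def run_demo():
    """Start a listener, connect to it twice, and return both replies."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    addr = listener.getsockname()

    worker = threading.Thread(target=serve, args=(listener, 2))
    worker.start()

    replies = []
    for msg in (b"first", b"second"):
        with socket.create_connection(addr) as client:
            client.sendall(msg)
            replies.append(client.recv(1024))

    worker.join()
    listener.close()
    return replies
```

Here the connections are served one after another to keep the sketch short. A real server would more likely hand each accepted socket to its own thread or an event loop so that the listening socket can go straight back to accepting.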
