Network, Sockets, Etc
Halliburton Systems Inc - THEOS Networking Series

Since I mentioned that we have been doing this stuff since before THEOS had networking, let me describe the way we used to do it first. Moving to sockets was really a natural extension of this.

Back in the days BTN (before THEOS networking) we had a pair of machines in our office connected by 16 serial cables, with a 16-port Maxpeed on each machine dedicated to talking to the other machine. We also had some smaller numbers of ports that talked to other machines, both in our office and around the country, but let's keep to the one pair of machines.

Suppose we needed to send a file from machine A to machine B. Machine A would check its table of lines to machine B and find one that was not currently being used. It would mark that line as in use in the table, then send a command over it. Machine B would be listening on the line, waiting for a command to come in, and would execute it when it arrived, sending the answer back over the same line on which the command arrived. The concept is pretty simple, but there are some obvious problems.
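The line table described above can be sketched in a few lines of Python. This is only an illustration of the idea, not our original code: the class name `LineTable`, its methods, and the line names `multi1` through `multi16` are all assumptions made for the example.

```python
class LineTable:
    """Tracks which serial lines to the other machine are currently in use."""

    def __init__(self, line_names):
        # False means the line is free, True means it is in use.
        self.in_use = {name: False for name in line_names}

    def acquire(self):
        """Return the first free line and mark it busy, or None if all are busy."""
        for name, busy in self.in_use.items():
            if not busy:
                self.in_use[name] = True
                return name
        return None

    def release(self, name):
        """Mark a line as free again once the answer has come back."""
        self.in_use[name] = False


# Sixteen lines to machine B (hypothetical names).
table = LineTable([f"multi{n}" for n in range(1, 17)])
first = table.acquire()     # first free line
second = table.acquire()    # next free line
table.release(first)        # the first transfer has finished
```

Note that `acquire()` has to return `None` once all sixteen lines are busy, so a seventeenth request simply has to wait.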

About the only good news is that security was good because even we couldn't figure out how to "sneak in" through this system.

So now we have an Ethernet connection and these "socket" things; how does that help? Let's look at the picture using sockets.

The concept of the socket is similar to a single serial port, except that the socket is only a logical "device". There is no relationship between the number of sockets in use and any physical device. Sockets are identified by a giant number that identifies the machine on which they reside as well as an arbitrary "socket number". This is a little like extensions on your phone system, but it is hard to continue that comparison very far. A closely related concept is the "port". A port is a logical address on a computer; it too consists of a computer address and a port number. There are certain "well-known" port numbers in common use; this is how one system knows how to start talking to another one. As an example, requests to a web server are normally sent to port 80 on the web host. The mail receiving program on most systems is on port 25. In general, port numbers below 1024 are reserved for "well-known" services that may require strong permissions on the host system.
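As a small illustration of the "machine address plus port number" idea, here is a sketch using Python's standard socket module (our choice for the example; the article itself is not tied to any language). The loopback address 127.0.0.1 and the use of port 0 to let the operating system pick a free port are assumptions made so the example runs anywhere.

```python
import socket

# The two well-known ports mentioned above.
WELL_KNOWN_PORTS = {"web": 80, "mail": 25}

# Create a TCP socket and bind it to the local machine.
# Port 0 asks the operating system to choose any free port for us.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))

# The resulting address is a (machine address, port number) pair.
host, port = server.getsockname()
print(host, port)

server.close()
```

A real server would bind to a fixed, agreed-upon port instead of port 0, so that clients know where to find it.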

Remember that we had to manage the multiple serial lines in the original system? There is a similar problem with sockets, but it is more easily solved, mostly because the operating system does it for us. The idea is that a server binds to a particular port number and issues a listen() call to tell the operating system that it will handle any incoming request for a connection to that port. It then waits on an accept() call; this call returns when a connection has been received. Except for a few very simple cases, the server now does a fork() to create a new copy of itself to actually handle the request, and the original one goes back to the accept() to wait for another request.
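The listen/accept/fork pattern can be sketched as follows. This is a minimal Python illustration for a Unix-like system where `os.fork()` is available, not the code of any particular server; the function names `handle` and `serve`, the reply text, and the loopback address are assumptions made for the example. To keep it self-contained, the same script also plays the client: the connection waits in the listen queue until `accept()` picks it up.

```python
import os
import socket


def handle(conn):
    """Child's job: read one request and answer on the same connection."""
    request = conn.recv(1024)
    conn.sendall(b"answer to " + request)
    conn.close()


def serve(listener, max_connections):
    """Accept connections, forking one child per connection."""
    children = []
    for _ in range(max_connections):
        conn, _peer = listener.accept()   # returns once a connection has arrived
        pid = os.fork()
        if pid == 0:
            # Child: it owns the new socket and does not need the listener.
            try:
                listener.close()
                handle(conn)
            finally:
                os._exit(0)
        # Parent: the child has its own copy of the socket, so drop ours
        # and go back to accept() for the next caller.
        conn.close()
        children.append(pid)
    # Reap the children so they do not linger as zombies.
    for pid in children:
        os.waitpid(pid, 0)


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))           # port 0: let the OS pick a free port
listener.listen(5)

# Play the client: connect and send a request before the server accepts.
client = socket.create_connection(listener.getsockname())
client.sendall(b"hello")

serve(listener, max_connections=1)

# Read until the child closes its end of the connection.
reply = b""
while True:
    chunk = client.recv(1024)
    if not chunk:
        break
    reply += chunk
client.close()
listener.close()
print(reply)
```

A long-running server would loop on `accept()` forever and reap finished children as it goes rather than waiting for them at the end; the fixed `max_connections` here only exists so the sketch terminates.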

To continue the phone system analogy dangerously further, this is like a voice mail system that says "dial 2 for sales" and then connects each caller who dials a "2" to a free telephone extension in the sales department. Caller after caller can dial "2" and get an answer as long as there are enough salespeople ready to pick up phones. In the computer case, the fork() calls create new salespeople as they are needed so we never run out.

So where is the socket? A socket gets created each time the accept() call comes home. This is like the extension number of our newly created salesperson (actually, of his newly created phone). It is the socket that gets passed to the child process; this socket is conceptually similar to a device name like "multi14" in the serial line case because it now uniquely identifies all that is necessary to talk to the other end of the connection. Assuming we are using TCP (very few adult programs use UDP these days), data poured into a socket at one end of the connection is guaranteed to come out the other end correctly, with flow control, error correction, and routing handled by the TCP/IP system. If one end goes away, an error is returned to the other end. In short, we have the moral equivalent of a pair of wires, no matter how complicated the routing between the ends.
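Here is a small sketch of that "pair of wires" behavior, again in Python on the loopback address (both assumptions for the sake of a runnable example). It shows that accept() hands back a new socket for the connection, that bytes written at one end arrive at the other, and what the reader sees when the other end goes away. With an orderly close, the notification typically arrives as an empty read (end of file) rather than an exception; an abrupt failure may instead raise an error such as `ConnectionResetError`.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))           # port 0: let the OS pick a free port
listener.listen(1)

client = socket.create_connection(listener.getsockname())
conn, peer = listener.accept()            # conn is the new per-connection socket

# Pour data in at one end...
message = b"line one\nline two\n"
client.sendall(message)

# ...and collect it at the other. TCP is a byte stream, so loop until
# everything we expect has arrived.
received = b""
while len(received) < len(message):
    chunk = conn.recv(1024)
    if not chunk:
        break
    received += chunk

# Now one end goes away; the other end is told about it.
client.close()
after_close = conn.recv(1024)             # empty bytes means "the other end is gone"

conn.close()
listener.close()
print(received, after_close)
```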

This approach solves lots of our problems. First, there is no real limit on the number of servers that can be active on a given port at the same time; each just has its own socket and therefore its own connection. Second, because the connections are only logical, the total bandwidth is available to however many connections are active. On a 10Mbps Ethernet with only one connection active, we in theory have a 10Mbps connection. With two, each gets 5Mbps if both are pumping data at the same time. Actually this is much like the multi-user nature of THEOS. Do 10 users ever *really* need the CPU at the same time?

More good news? We only need one wire to connect two machines. We can get from any machine to any other machine on the network, even if it is not on the local network. TCP tells us if a connection goes away, which happens automatically if the program at either end dies. Wow! Networking.


Have any questions? E-mail us at info@hsix.com

