Routers, Tunnels, Magic
Halliburton Systems Inc - THEOS Networking Series

Over the past few months you have seen plenty of comments on the THEOS forum about the need for routing, gateway logic (my 3rd grade English teacher would spin in her grave if I said "gatewaying"), and the mysterious-sounding tunneling. Let's take a look at these concepts and why each is important in a networking system.

First, let's look at the network in use here at HSI. There are certain details I'm going to leave out on purpose, some because they are just too complex, and some for security reasons, but here is the basic idea.

From the standpoint of the outer world, our network begins where our T1 comes in from our Internet provider. The Internet itself, of course, is really just the connection of lots of networks like ours, and has all the same issues on a larger scale that we must face on ours. The T1 is connected to a "router" which looks like an appliance box such as a multiplexer or big modem; actually, it is a computer running Unix and will respond to all the usual Unix commands and networking functions. It has a special program to interface with the operator to make setting up the various tables easier.

In addition to the T1 connection (a T1, by the way, is a 1.544 Mbps data line, also called a DS1 in some places), the router has an Ethernet port and several serial ports. The Ethernet port is connected to our main office LAN, which has a half-dozen or so machines of mixed operating systems on it (THEOS, Linux, W95, WfW, NT, JNOS/DOS, etc).

The office LAN is connected to several other LANs by WAN connections running over high-speed serial lines; these connections use PPP (Point-to-Point Protocol). At the other end of each of these PPP links there is another Ethernet. Two of these remote Ethernets are connected to other networks as well: one at a remote customer site connects to their parent company LAN and also to another Internet provider; another remote LAN connects to the AMPRNET amateur radio network via a dedicated router that supports four different radios. An attempt at drawing this with ASCII characters follows. The "===" are Ethernet and the vertical links marked "WAN link" are PPP links. The radio links are just too complicated to draw because each leads to a different remote LAN.

                              E1 === backup Internet link --->
                              ^
                              | (WAN link D to E)
                              v
                       D1 === D2 === rest of LAN "D"
                       ^
                       | (WAN link A to D)
                       v
--T1 router === A1 === A2 === A3 === A4 ===..rest of office net
                ^
                | (WAN link A to B)
                v
                B1 === B2 === B3 === rest of LAN "B"
                       ^
                       | (WAN link B to C)
                       v
                       C1 === C2  (C1/C2 handle the radios, not shown)

Whew! That wasn't easy to draw. There are lots more pieces to this, but this is plenty for the discussion to follow. The easiest way to visualize what is going on is to remember that the Ethernet connections have physical limitations, mainly the maximum connection distance, while the PPP WAN links do not. The various Ethernets are located far enough apart from each other that they can't be combined, while the machines on a given Ethernet are within a hundred feet or so of each other.

Now, each machine in the diagram has a unique IP address; they are arranged so that the machines on a given LAN all have similar addresses. As is common practice, we express the addresses in binary and then use masks to identify the individual LAN address groups. As an example, LAN A uses addresses 0x01 - 0x7F and LAN B uses addresses 0x81 - 0xBF. Why is it important to use these groups? The answer lies in the complexity of handling routing between machines in these LANs.
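
To make the mask idea concrete, here is a minimal sketch in Python. It assumes the hex values above are the low byte of each address, and the mask/value pairs are my own inference from the quoted ranges (0x00 - 0x7F share a clear top bit; 0x80 - 0xBF share the top two bits "10"). They are not taken from our real configuration.

```python
# Hypothetical mask/value pairs inferred from the ranges quoted in the text.
# Only the low byte of each address is modelled here.
LAN_GROUPS = {
    "A": (0x80, 0x00),  # top bit clear            -> 0x00-0x7F
    "B": (0xC0, 0x80),  # top two bits are "10"    -> 0x80-0xBF
}


def lan_for(addr):
    """Return the name of the LAN whose mask/value pair matches addr."""
    for name, (mask, value) in LAN_GROUPS.items():
        if addr & mask == value:
            return name
    return None  # not in any group known to this table


print(lan_for(0x05))  # A
print(lan_for(0x9A))  # B
print(lan_for(0xF0))  # None
```

One AND and one compare per group is all it takes to decide which LAN an address belongs to, which is why grouping addresses this way keeps routing cheap.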

Routing is the process of figuring out how to get a packet of information from one machine to another. Without some organization it would be necessary for each machine to know complete details of the network and be updated every time anything changed in the network. That would be hard enough for us, but imagine having to deal with the entire Internet this way! With the addressing scheme outlined above, each machine need only know how to route to the machine that *really* knows how to route to a given place.

The terminology of "gateway" and "router" is often confusing; this is because most machines that perform one of these functions also perform the other. There is little utility in a gateway-only machine, and there are few real-life examples of a router-only machine. OK, so what are these functions?

A gateway is a machine that connects a LAN to some other network; in the diagram above, the gateways are E1, D2, D1, A2, A1, B1, B2, C1, and (although not obvious from the diagram) C2. Generally speaking, any machine that has more than one network connection is probably a gateway, although there are exceptions. The machine identified as "router" is also a gateway of course, since it is connected both to our network and to the Internet.

The purpose of a gateway is to carry packets from one interface to the other to connect the two networks. A "firewall" is a gateway that provides filtering to prevent certain packets from passing through or to prohibit packets to/from certain addresses. On our network, all gateways are also firewalls, with different filtering rules for each. In most cases these rules prohibit packets originating outside our network from passing through to the inside. To prevent "spoofing" (an outsider pretending to be inside our network) the main router prohibits packets that appear to come from our network from coming in on the T1.
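
The anti-spoofing rule is simple enough to sketch in a few lines of Python. The address blocks and the interface name "t1" below are hypothetical stand-ins (documentation-only address ranges), not our real ones, and a real firewall applies many more rules than this one.

```python
import ipaddress

# Hypothetical address blocks standing in for "our network".
INSIDE = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def allow_inbound(src, arrival_interface):
    """Reject packets that claim an inside source but arrive on the T1."""
    addr = ipaddress.ip_address(src)
    claims_inside = any(addr in net for net in INSIDE)
    return not (arrival_interface == "t1" and claims_inside)


print(allow_inbound("192.0.2.10", "t1"))    # False: spoofed inside address
print(allow_inbound("203.0.113.5", "t1"))   # True: genuine outside source
print(allow_inbound("192.0.2.10", "eth0"))  # True: inside address, inside port
```

The point of the sketch is that the decision depends on both the claimed source address and the interface the packet arrived on.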

On any LAN, all machines "hear" all packets and just ignore any packet that is not of interest. Within the same LAN, there is no issue of routing, because everybody can hear everybody else. But how do we get packets from one LAN to another? The answer is that each machine has a simple routing table that describes how packets originating on that machine or coming into that machine from any other source should be sent. The table on D2 need only have three entries: one to handle the addresses on LAN D via the Ethernet port, one to handle LAN E via the PPP port, and one for everything else (called the "default route") to be sent to D1, which acts as a gateway. Of course, actually sending the packet to D1 requires the use of the routing table entry for LAN D. The reason the tables are simple is that we route groups of addresses; if the machine addresses were not organized into LAN groups, we would need a routing table entry for each machine on each machine: instant chaos.
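
As a rough illustration, D2's three-entry table and the lookup against it might look like the Python sketch below. Every address, prefix, and interface name here is hypothetical (I am using documentation-only ranges), and the "most specific match wins" rule is the usual convention rather than a description of any particular system.

```python
import ipaddress

# All addresses and interface names are hypothetical.
D1 = ipaddress.ip_address("198.51.100.1")  # D1 sits on LAN D

# (destination network, outgoing interface, next-hop gateway or None)
D2_ROUTES = [
    (ipaddress.ip_network("198.51.100.0/25"), "eth0", None),    # LAN D, direct
    (ipaddress.ip_network("198.51.100.128/25"), "ppp0", None),  # LAN E via PPP
    (ipaddress.ip_network("0.0.0.0/0"), "eth0", D1),            # default route
]


def lookup(dest, routes):
    """Return (interface, next_hop) for the most specific matching route."""
    addr = ipaddress.ip_address(dest)
    matches = [r for r in routes if addr in r[0]]
    _net, iface, gateway = max(matches, key=lambda r: r[0].prefixlen)
    # With no gateway, the destination itself is reachable on that link.
    next_hop = gateway if gateway is not None else addr
    return iface, next_hop


iface, hop = lookup("203.0.113.7", D2_ROUTES)  # an address outside our network
print(iface, hop)                              # eth0 198.51.100.1
print(lookup(str(hop), D2_ROUTES)[0])          # eth0 (D1 is found via LAN D)
```

The last two lines show the two-step process described above: the default route names D1 as the next hop, and reaching D1 itself uses the LAN D entry.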

Machine D1 likewise needs only a few entries. The first entry handles LAN D via Ethernet. The second handles LAN E by forwarding packets to D2, which acts as a gateway in this direction. The third is the default which sends everything else to A2.

The result of all this is that each machine needs only a very simple routing table, with most having three entries. If the LAN address masks are designed cleverly, the tables are even simpler. If LAN C and LAN B addresses can be described as a single range of addresses, A1 needs only one entry for both, since all such addresses can be sent to B1 for handling. Even the "router" machine only needs four entries: one for LAN D/E, one for LAN B/C, one for LAN A, and one for everything else.
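
Here is a small Python sketch of that "single range" trick, using the standard ipaddress module. The 192.0.2.x prefix is a documentation-only placeholder, and the LAN C range is an assumption chosen so that it sits right next to LAN B; only LAN B's low-byte range (0x80 - 0xBF) comes from the text.

```python
import ipaddress

lan_b = ipaddress.ip_network("192.0.2.128/26")  # low byte 0x80-0xBF
lan_c = ipaddress.ip_network("192.0.2.192/26")  # hypothetical: 0xC0-0xFF

# Adjacent, equally sized blocks collapse into one larger block.
merged = list(ipaddress.collapse_addresses([lan_b, lan_c]))
print(merged)  # [IPv4Network('192.0.2.128/25')]
```

With addresses laid out like this, A1 can carry one entry for 192.0.2.128/25 pointing at B1 instead of separate entries for LAN B and LAN C.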

Complicated enough by now? We still have to discuss tunnels. Suppose the T1 line dies and we want to switch to the backup link to route packets that are arriving via radio on LAN C. Do we now need to change the routing everywhere along the network, compromising our firewalls at the same time? The easiest thing to do is to establish an IP/IP "tunnel" between machines C1 and E1. This becomes a logical "interface" that makes it appear that these two machines are directly connected to each other. Then, only the routing tables in C1 and E1 need be changed, now routing these packets via the new "interface" we just "hooked up".
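
Under the hood, an IP/IP tunnel just wraps each original packet inside a second IP header addressed to the far end of the tunnel. The Python sketch below builds that outer header by hand so you can see the pieces. It is an illustration only: the endpoint addresses are hypothetical, and options, fragmentation, and the actual sending of the packet are left out.

```python
import socket
import struct

IPPROTO_IPIP = 4  # IP protocol number for IP-in-IP encapsulation
HEADER_FORMAT = "!BBHHHBBH4s4s"  # a 20-byte IPv4 header without options


def checksum(header):
    """Internet checksum: one's-complement sum of the 16-bit words."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate(inner_packet, tunnel_src, tunnel_dst, ttl=64):
    """Prefix inner_packet with an outer IPv4 header for the tunnel."""
    fields = [
        0x45,                     # version 4, header length 5 words
        0,                        # type of service
        20 + len(inner_packet),   # total length of outer packet
        0,                        # identification
        0,                        # flags and fragment offset
        ttl,
        IPPROTO_IPIP,             # payload is itself an IP packet
        0,                        # checksum placeholder
        socket.inet_aton(tunnel_src),
        socket.inet_aton(tunnel_dst),
    ]
    fields[7] = checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields) + inner_packet


# Hypothetical addresses for C1 and E1; the payload stands in for a real packet.
inner = b"\x00" * 28
outer = encapsulate(inner, "192.0.2.193", "198.51.100.129")
print(len(outer))  # 48
```

Machines between the two endpoints see only the outer header, which is why their routing tables and firewall rules can stay as they are.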

For another example of tunneling, suppose we wanted to split our address space and put part of our network in another location thousands of miles away. Would we need a separate dedicated line to connect these two sites? No, we just use the Internet connection that each location already has and establish a tunnel between them over the regular Internet. Now the two networks appear to be directly connected and are both "inside" our firewall, even though one of them is thousands of miles away. This function would be particularly useful to THEOS developers who want to support a group of users at one site who will be connected to an application server at another site, connected via the Internet.

The amateur radio network uses this extensively. There are tunnels established from LAN C to sites all over the world, including Florida, Australia, and California. This allows users with hand-held radios and laptop computers to "directly" connect to their counterparts thousands of miles away.

Oh, did I forget the "magic" part? The magic is that all of this works cleanly and reliably. The routing is handled by several different mechanisms, including "ARP", "proxy ARP", "gated", "routed", and several others. These are all programs and functions that keep the routing tables up to date. If we were to add a new LAN F that was connected to the network at point A4, we would only need to update machine A4's tables and let it tell the rest of the network about its new role. If the A1 <-> B1 link were not reliable, we could implement a second link between A2 and B2 and let it take over automatically if the first failed; automatic routing updates would be broadcast over the network and everything would still work.
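
To take a little of the mystery out of those automatic updates, here is a much-simplified Python sketch of the distance-vector idea behind RIP, the protocol that "routed" speaks. It is my own illustration and not how any of these programs is actually written: timers, split horizon, and the packet format are all left out, and the table layout is invented for the example.

```python
INFINITY = 16  # RIP treats a metric of 16 as "unreachable"


def apply_update(table, neighbor, advertised):
    """Merge a neighbor's {destination: metric} advertisement into table.

    table maps destination -> (next_hop, metric). Returns True if changed.
    """
    changed = False
    for dest, metric in advertised.items():
        new_metric = min(metric + 1, INFINITY)  # one more hop to reach us
        next_hop, old_metric = table.get(dest, (None, INFINITY))
        better = new_metric < old_metric
        # News from the hop we already use always replaces what we had.
        same_hop_changed = next_hop == neighbor and new_metric != old_metric
        if better or same_hop_changed:
            table[dest] = (neighbor, new_metric)
            changed = True
    return changed


table = {}
apply_update(table, "A4", {"LAN F": 1})   # A4 announces the new LAN F
print(table["LAN F"])                     # ('A4', 2)
apply_update(table, "A4", {"LAN F": 16})  # A4 reports LAN F unreachable
apply_update(table, "A2", {"LAN F": 3})   # a longer path via A2 takes over
print(table["LAN F"])                     # ('A2', 4)
```

The same mechanism covers both cases in the paragraph above: a new LAN spreads through the network one announcement at a time, and a backup path takes over once the primary is reported unreachable.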


Have any questions? E-mail us at info@hsix.com

