
Over the past few months you have seen lots of comment on the THEOS
forum about the need for routing, gateway logic (my 3rd grade English
teacher would spin in her grave if I said "gatewaying"), and the
mysterious-sounding tunneling. Let's take a look at these concepts
and why each is important in a networking system.
First, let's look at the network in use here at HSI. There are certain
details I'm going to leave out on purpose, some because they are just
too complex, and some for security reasons, but here is the basic
idea.
From the standpoint of the outer world, our network begins where our
T1 comes in from our Internet provider. The Internet itself, of course,
is really just the connection of lots of networks like ours, and has
all the same issues on a larger scale that we must face on ours.
The T1 is connected to a "router" which looks like an appliance box,
such as a multiplexor or big modem; actually, it is a computer running
Unix and will respond to all the usual Unix commands and networking
functions. It has a special program to interface with the operator
to make setting up the various tables easier.
In addition to the T1 (a T1, by the way, is a 1.544Mbps data line, also
called a DS1 in some places) connection, the router has an Ethernet
port and several serial ports. The Ethernet port is connected onto
our main office LAN, which has a half-dozen or so machines of mixed
operating systems on it (THEOS, Linux, W95, WfW, NT, JNOS/DOS, etc).
The office LAN is connected to several other LANs by WAN connections
running over high-speed serial lines; these connections use PPP
(point to point protocol). At the other end of each of these PPP
links there is another Ethernet. Two of these remote Ethernets
are connected to other networks as well: one at a remote customer
site connects to their parent company LAN and also to another
Internet provider; another remote LAN connects to the AMPRNET
amateur radio network via a dedicated router that supports four
different radios. An attempt at drawing this with ASCII
characters appears at the end of this article. The "===" are
Ethernet and the "---" are PPP links. The radio links are just
too complicated to draw because each leads to a different remote LAN.
Whew! That wasn't easy to draw. There are lots more pieces to this,
but this is plenty for the discussion to follow. The easiest way to
visualize what is going on is to remember that the Ethernet connections
have physical limitations, mainly the maximum connection distance,
while the PPP WAN links do not. The various Ethernets are located
far enough apart from each other that they can't be combined, while
the machines on a given Ethernet are within a hundred feet or so of
each other.
Now, each machine in the diagram has a unique IP address; they are
arranged so that the machines on a given LAN all have similar
addresses. As is common practice, we express the addresses in
binary and then use masks to identify the individual LAN address
groups. As an example (written in hexadecimal), LAN A uses addresses
0x01 - 0x7F and LAN B uses addresses 0x81 - 0xBF, so a mask on the
top one or two bits is enough to tell them apart. Why is it
important to use these
groups? The answer lies in the complexity of handling routing
between machines in these LANs.
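To make the mask idea concrete, here is a minimal Python sketch. It assumes the hex values quoted above are the last byte of each address; the (mask, value) pairs and the lan_of() helper are made up for illustration and are not our real configuration.

```python
# Hypothetical (mask, value) pairs derived from the ranges quoted in the text.
# LAN A: 0x01-0x7F -> top bit is 0.  LAN B: 0x81-0xBF -> top two bits are 10.
LAN_GROUPS = {
    "A": (0x80, 0x00),
    "B": (0xC0, 0x80),
}

def lan_of(addr):
    """Return the name of the LAN whose mask/value pair matches addr, or None."""
    for name, (mask, value) in LAN_GROUPS.items():
        if (addr & mask) == value:
            return name
    return None

for addr in (0x01, 0x7F, 0x81, 0xBF, 0xC5):
    # Print each address in hex and binary next to the group it falls in.
    print(f"0x{addr:02X} = {addr:08b} -> LAN {lan_of(addr)}")
```

One AND and one compare per group is all it takes to decide where an address belongs, which is why the groups are laid out on bit boundaries.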
Routing is the process of figuring out how to get a packet of
information from one machine to another. Without some organization
it would be necessary for each machine to know complete details of
the network and be updated every time anything changed in the
network. That would be hard enough for us, but imagine having to
deal with the entire Internet this way! With the addressing scheme
outlined above, each machine need only know how to route to the
machine that *really* knows how to route to a given place.
The terminology of "gateway" and "router" is often confusing; this is
because most machines that perform one of these functions also
perform the other. There is little utility in a gateway-only
machine, and there are few real-life examples of a router-only
machine. OK, so what are these functions?
A gateway is a machine that connects a LAN to some other network; in
the diagram, the gateways are E1, D2, D1, A2, A1, B1, B2, C1,
and (although not obvious from the diagram) C2. Generally speaking,
any machine that has more than one network connection is probably
a gateway, although there are exceptions. The machine identified
as "router" is also a gateway of course, since it is connected
both to our network and to the Internet.
The purpose of a gateway is to carry packets from one interface to
the other to connect the two networks. A "firewall" is a gateway
that provides filtering to prevent certain packets from passing
through or to prohibit packets to/from certain addresses. On
our network, all gateways are also firewalls, with different
filtering rules for each. In most cases these rules prohibit
packets originating outside our network from passing through
to the inside. To prevent "spoofing" (an outsider pretending
to be inside our network) the main router prohibits packets that
appear to come from our network from coming in on the T1.
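The anti-spoofing rule can be sketched in a few lines of Python. This is only an illustration: the 192.0.2.0/24 block stands in for our real address space, and accept_on_t1() is a hypothetical name, not a function of any real firewall.

```python
import ipaddress

# Hypothetical inside address block (a documentation range, not our real one).
OUR_NETWORK = ipaddress.ip_network("192.0.2.0/24")

def accept_on_t1(source):
    """Anti-spoofing check for a packet arriving on the outside (T1) port.

    A packet that claims an inside source address cannot legitimately
    arrive from the outside, so it is refused.
    """
    return ipaddress.ip_address(source) not in OUR_NETWORK

print(accept_on_t1("198.51.100.7"))  # outside source arriving from outside
print(accept_on_t1("192.0.2.33"))    # inside source arriving from outside
```

A real firewall applies many such rules per interface and direction; this shows only the single rule described above.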
On any LAN, all machines "hear" all packets and just ignore any packet
that is not of interest. Within the same LAN, there is no issue of
routing, because everybody can hear everybody else. But how do we
get packets from one LAN to another? The answer is that each machine
has a simple routing table that describes how packets originating
on that machine or coming into that machine from any other source
should be sent. The table on D2 need only have three entries:
one to handle the addresses on LAN D via the Ethernet port, one
to handle LAN E via the PPP port, and one for everything else (called
the "default route") to be sent to D1, which acts as a gateway.
Of course, actually sending the packet to D1 requires the use of the
routing table entry for LAN D. The reason the tables are simple
is that we route groups of addresses; if the machine addresses
were not organized into LAN groups, we would need a routing
table entry for each machine on each machine--instant chaos.
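Here is a rough Python sketch of D2's three-entry table and the lookup that goes with it. The address blocks, D1's address, and the interface names are all assumptions chosen for illustration; only the shape of the table comes from the description above.

```python
import ipaddress

# Hypothetical address blocks; the real ones are not given in this article.
LAN_D = ipaddress.ip_network("192.0.2.0/26")
LAN_E = ipaddress.ip_network("192.0.2.64/26")
EVERYTHING = ipaddress.ip_network("0.0.0.0/0")
D1_ADDR = "192.0.2.1"  # assumed address of D1 on LAN D

# D2's table: (destination group, interface, gateway or None if direct).
D2_ROUTES = [
    (LAN_D, "ethernet", None),
    (LAN_E, "ppp", None),
    (EVERYTHING, "ethernet", D1_ADDR),  # the default route
]

def lookup(routes, dest):
    """Return the most specific entry whose group contains dest, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [route for route in routes if addr in route[0]]
    return max(matches, key=lambda route: route[0].prefixlen, default=None)

for dest in ("192.0.2.20", "192.0.2.70", "198.51.100.9"):
    net, iface, gateway = lookup(D2_ROUTES, dest)
    print(dest, "->", iface, "via", gateway or "direct delivery")

# Reaching the default gateway itself uses the LAN D entry.
print(lookup(D2_ROUTES, D1_ADDR)[1])
```

The default entry matches every address, so it only wins when no more specific group does; that is the whole trick behind "everything else".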
Machine D1 likewise needs only a few entries. The first entry
handles LAN D via Ethernet. The second handles LAN E by forwarding
packets to D2, which acts as a gateway in this direction. The
third is the default which sends everything else to A2.
The result of all this is that each machine need have only a very
simple routing table, with most having three entries. If the LAN
address masks are designed cleverly, the tables are even simpler.
If LAN C and LAN B addresses can be described as a single range
of addresses, A1 needs only one entry for both, since all such
addresses can be sent to B1 for handling. Even the "router"
machine only needs four entries: one for LAN D/E, one for LAN B/C,
one for LAN A, and one for everything else.
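The "single range" trick can be checked with Python's ipaddress module. In this sketch LAN B is taken as last byte 0x80 - 0xBF (consistent with the range given earlier), while the LAN C block is purely an assumption, since its real range is not stated here.

```python
import ipaddress

# LAN B as last byte 0x80-0xBF; the LAN C block is hypothetical.
LAN_B = ipaddress.ip_network("192.0.2.128/26")
LAN_C = ipaddress.ip_network("192.0.2.192/26")

# Two adjacent, equal-sized blocks collapse into one larger block,
# so A1 can cover both LANs with a single entry pointing at B1.
combined = list(ipaddress.collapse_addresses([LAN_B, LAN_C]))
print(combined)
```

The merge only works because the two blocks sit side by side on a bit boundary; that is what "designed cleverly" amounts to.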
Complicated enough by now? We still have to discuss tunnels.
Suppose the T1 line dies and we want to switch to the backup
link to route packets that are arriving via radio on LAN C. Do
we now need to change the routing everywhere along the network,
compromising our firewalls at the same time? The easiest thing to
do is to establish an IP/IP "tunnel" between machines C1 and E1.
This becomes a logical "interface" that makes it appear that
these two machines are directly connected to each other. Then,
only the routing tables in C1 and E1 need be changed, now routing
these packets via the new "interface" we just "hooked up".
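In rough terms, an IP/IP tunnel wraps each original packet inside a new packet addressed from one tunnel end to the other. The Python sketch below models that with simple objects rather than real packet headers; the Packet class and the addresses for C1 and E1 are hypothetical.

```python
from dataclasses import dataclass

IPPROTO_IPIP = 4  # IP protocol number used for IP-in-IP encapsulation

@dataclass
class Packet:
    src: str
    dst: str
    proto: int
    payload: object

def encapsulate(inner, tunnel_src, tunnel_dst):
    """Wrap a packet in an outer packet addressed between the tunnel ends."""
    return Packet(tunnel_src, tunnel_dst, IPPROTO_IPIP, inner)

def decapsulate(outer):
    """Recover the original packet at the far end of the tunnel."""
    if outer.proto != IPPROTO_IPIP:
        raise ValueError("not a tunneled packet")
    return outer.payload

# Hypothetical addresses for C1 and E1, and for the two real endpoints.
C1, E1 = "192.0.2.193", "192.0.2.65"
original = Packet("192.0.2.200", "198.51.100.9", 6, b"data")

outer = encapsulate(original, C1, E1)
print(outer.src, "->", outer.dst)   # what the machines in between see
print(decapsulate(outer) == original)
```

Machines between C1 and E1 route on the outer addresses only, which is why their tables and filter rules do not have to change.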
For another example of tunneling, suppose we wanted to split our
address space and put part of our network in another location
thousands of miles away. Would we need a separate dedicated line
to connect these two sites? No, we just use the Internet connection
that each location already has and establish a tunnel between
them over the regular Internet. Now the two networks appear to
be directly connected and are both "inside" our firewall, even
though one of them is thousands of miles away. This function
would be particularly useful to THEOS developers who want to support
a group of users at one site who will be connected to an application
server at another site, connected via the Internet.
The amateur radio network uses this extensively. There are tunnels
established from LAN C to sites all over the world, including Florida,
Australia, and California. This allows users with hand-held radios
and laptop computers to "directly" connect to their counterparts
thousands of miles away.
Oh, did I forget the "magic" part? The magic is that all of this works
cleanly and reliably. The routing is handled by several different
mechanisms, including "ARP", "proxy ARP", "gated", "routed", and
several others. These are all programs and functions that keep the
routing tables up to date. If we were to add a new LAN F that
was connected to the network at point A4, we would only need to update
machine A4's tables and let it tell the rest of the network about
its new role. If the A1 <-> B1 link were not reliable, we could
implement a second link between A2 and B2 and let it take over
automatically if the first failed; automatic routing updates
would be broadcast over the network and everything would still work.
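As a rough idea of how such updates can work, here is a simplified Python sketch of the distance-vector style of update used by RIP, the protocol that "routed" speaks. It is not the actual code of any of these programs; the table layout and merge_update() are assumptions for illustration, and real implementations add timers and other safeguards.

```python
INFINITY = 16  # RIP treats a metric of 16 as "unreachable"

def merge_update(table, neighbor, advertised):
    """Fold a neighbor's advertisement into our table.

    table:      destination -> (metric, next_hop)
    advertised: destination -> metric as seen by the neighbor
    Returns True if anything changed (and so should be passed along).
    """
    changed = False
    for dest, metric in advertised.items():
        cost = min(metric + 1, INFINITY)  # one more hop to go through neighbor
        known = table.get(dest)
        if known is None:
            better = cost < INFINITY
        else:
            # Take a cheaper path, or any news from the hop we already use.
            better = cost < known[0] or (known[1] == neighbor and cost != known[0])
        if better:
            table[dest] = (cost, neighbor)
            changed = True
    return changed

# A3's view of the network: LAN B is reached through A1.
a3_table = {"LAN B": (2, "A1")}

merge_update(a3_table, "A4", {"LAN F": 1})         # A4 announces the new LAN F
merge_update(a3_table, "A1", {"LAN B": INFINITY})  # the A1 <-> B1 link fails
merge_update(a3_table, "A2", {"LAN B": 1})         # the A2 <-> B2 link takes over
print(a3_table)
```

Each machine only listens to its neighbors and keeps the cheapest path it has heard of, yet the whole network ends up agreeing on the new routes.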
Network diagram ("===" is Ethernet; the vertical links are the PPP WAN links):
                              E1 === backup Internet link --->
                              ^
                              | (WAN link D to E)
                              v
                       D1 === D2 === rest of LAN "D"
                       ^
                       | (WAN link A to D)
                       v
--T1 router === A1 === A2 === A3 === A4 ===..rest of office net
                ^
                | (WAN link A to B)
                v
                B1 === B2 === B3 === rest of LAN "B"
                       ^
                       | (WAN link B to C)
                       v
                       C1 === C2 (C1/C2 handle the radios, not shown)
Have any questions?
E-mail us at info@hsix.com