Table of Contents NSFNET
Routing Architecture
TCP/IP Tutorial and Technical Overview
Interior routing protocols or interior
gateway protocols (IGPs) are used to exchange routing information between
routers within a single autonomous system. They are also used by routers which
run exterior routing protocols to collect network-reachability
information for the autonomous system.
Note: The term interior routing protocol has no abbreviation in
common use, so we shall use the abbreviation IGP as is usual in TCP/IP
literature.
The most widely used IGPs are:
Before discussing these three protocols in detail, we shall look at two
important groups of routing algorithm used in IGPs.
In this section, we discuss the Vector-Distance and Link-State, Shortest
Path First routing algorithms.
The term Vector-Distance refers to a class of
algorithms that gateways use to update routing information. Each router begins
with a set of routes for those networks or subnets to which it is directly
attached, and possibly some additional routes to other networks or hosts if the
network topology is such that the routing protocol will be unable to produce
the desired routing correctly. This list is kept in a
routing table, where each entry identifies a destination network or host
and gives the ``distance'' to that network. The distance is called a
metric and is typically measured in ``hops''.
Periodically, each router sends a copy of its routing
table to any other router it can reach directly. When a report arrives at
router B from router A, B examines the set of destinations it receives and the
distance to each. B will update its routing table if:
- A knows a shorter way to reach a destination.
- A lists a destination that B does not have in its table.
- A's distance to a destination, already routed through A from B, has
changed.
This kind of algorithm is easy to implement, but it has a number of
disadvantages:
- When routes change rapidly, that is, a new connection appears or an old one
fails, the routing topology may not stabilize to match the changed network
topology because information propagates slowly from one router to another and
while it is propagating, some routers will have incorrect routing information.
Another disadvantage is that each router has to send a copy of its entire
routing table to every neighbor at regular intervals. Of course, one can use
longer intervals to reduce the network load but that introduces problems
related to how well the network responds to changes in topology.
Vector-distance algorithms using hop counts as a metric do not take account
of the link speed or reliability. Such an algorithm will use a path with hop
count 2 that crosses two slow-speed lines, instead of using a path with hop
count 3 that crosses three token-rings and may be substantially faster.
The most difficult task in a vector-distance algorithm is to prevent
instability. Different solutions are available:
- Counting to infinity
Let us choose a value of 16 to represent infinity.
Suppose a network becomes inaccessible; all the immediately neighboring routers
time out and set the metric to that network to 16. We can consider that all the
neighboring routers have a piece of hardware that connects them to the vanished
network, with a cost of 16. Since that is the only connection to the vanished
network, all the other routers in the system will converge to new routes that
go through one of those routers with a direct but unavailable connection. Once
convergence has happened, all the routers will have metrics of 16 for the
vanished network. Since 16 indicates infinity, all routers then regard the
network as unreachable.
The question with vector distance algorithms is not will convergence
occur but how long will it take? Let us consider the configuration shown
in Figure - The Counting to Infinity
Problem.
Figure: The Counting to Infinity Problem - All links have a metric
of 1 except for the indirect route from C to D which has a metric of
10.
Let us consider only the routes from each gateway to the target network.
Now, consider that the link from B to D fails. The routes should now adjust
to use the link from C to D. The routing changes start when B notices that the
route to D is no longer usable. For RIP this occurs when B does not receive a
routing update on its link to D for 180 seconds.
The following picture shows the metric to the target network, as it appears
in the routing table of each gateway.
Figure: The Counting to Infinity Problem
The problem is that B can get rid of its route to D (using a timeout
mechanism), but vestiges of that route persist in the system for a long time
(time between iterations is 30 seconds using RIP). Initially, A and C still
think they can reach D via B, so they keep sending updates listing metrics of
3. B will receive these updates and, in the next iteration, will claim that it
can get to D via either A or C. Of course, it can't because the routes claimed
by A and C (D reachable via B with a metric of 3) are now gone, but they have
no way of knowing that yet. Even when they discover that their routes via B
have gone away, they each think there is a route available via the other.
Eventually the system will converge, when the direct link from C to D has a
lower cost than the one received (by C) from B and A. The worst case is when a
network becomes completely inaccessible from some part of the system: in that
case, the metrics may increase slowly in a pattern like the one above until
they finally reach ``infinity''. For this reason, the problem is called
counting to infinity. Thus the choice of infinity is a trade off between
network size and speed of convergence in case counting to infinity happens.
This explains why we chose as low a value as 16 to represent infinity. 16 is
the value used by RIP.
- The other solutions will be discussed within the RIP protocol (see
Routing Information Protocol (RIP)).
The growth in networking over the past few years has
pushed the currently available Interior Gateway Protocols, which use
vector-distance algorithms, past their limits. The primary alternative to
vector-distance schemes is a class of protocols known as Link State,
Shortest Path First.
The important features of these routing protocols are:
- A set of physical networks is divided into a number of areas.
- All routers within an area have an identical database.
- Each router's database describes the complete topology (which routers are
connected to which networks) of the routing domain. The topology of an area is
represented with a database called a Link State Database describing all
of the links that each of the routers in the area has.
- Each router uses its database to derive the set of optimum paths to all
destinations from which it builds its routing table. The
algorithm used to determine the optimum paths is called a Shortest Path
First (SPF) algorithm.
In general, a link state protocol works as follows. Each router periodically
sends out a description of its connections (the state of its links) to its
neighbors (routers are neighbors if they are connected to the same network).
This description, called a Link State Advertisement
(LSA), includes the configured cost of the connection. The LSA is flooded
throughout the router's domain. Each router in the domain maintains an
identical synchronized copy of a database composed of this link state
information. This database describes both the topology of the router's domain
and routes to networks outside of the domain such as routes to networks in
other autonomous systems. Each router runs an algorithm on its topological
database resulting in a shortest-path tree. This shortest-path tree contains
the shortest path to every router and network the gateway can reach. From the
shortest-path tree, the cost to the destination and the next hop to forward a
datagram to is used to build the router's routing table.
Link-state protocols, in comparison with vector-distance protocols, send out
updates when there is news, and may send out regular updates as a way of
ensuring neighbor routers that a connection is still active. More importantly,
the information exchanged is the state of a router's links, not the contents of
the routing table. This means that link-state algorithms use fewer network
resources than their vector-distance counterparts, particularly when the
routing is complex or the autonomous system is large. They are, however,
compute-intensive. In return, users get faster response to network events,
faster route convergence, and access to more advanced network services.
This was used in the ``Fuzzball'' software for LSI/11
minicomputers, which were widely used in Internet experimentation. The Hello
protocol is described in RFC 891 - DCN Local-Network Protocols.
It is not an Internet standard.
Note: OSPF (see Open Shortest Path First
Protocol (OSPF) Version 2) includes a quite separate protocol for
negotiation between routers which is also called the Hello protocol.
The communication in the Hello protocol is via Hello messages which are
carried via IP datagrams. Hello uses protocol number 63 (reserved for ``any
local network'').
The Hello protocol is significant partly because of its wide deployment
during the early expansion of the Internet and partly because it provides an
example of a vector-distance algorithm that does not use
hop counts like RIP (see Routing Information
Protocol Version 1 (RIP, RIP-1)) but, instead, network delays as a metric
for the distance.
A Distributed Computer Network (DCN) physical
host is a PDP11-compatible processor which supports a number of cooperating
sequential processses, each of which is given a unique 8-bit identifier called
its port ID. Every DCN host contains one or more internet processes, each of
which supports a virtual host given a unique 8-bit identifier called its host
ID. There is a one-to-one correspondence between internet addresses and host
IDs. Each DCN physical host is identified by a host ID for the purpose of
detecting loops in routing updates, which establish the minimum-delay paths
between the virtual hosts.
Each physical host contains two tables:
-
Host Table
- This contains estimates of round-trip delay and
logical-clock offset (that is, the difference between the logical clock of this
host and the logical clock of the sender's host). It is indexed by the host
number. The host table is maintained dynamically using updates generated by
periodic (from 1 to 30 seconds) Hello messages.
-
Net Table
- This contains an entry for every neighbor network
that may be connected to the local network and certain other networks that are
not neighbors. Each entry contains the network number, as well as the host
number of the router (located on the local network) to that network. The Net
table is fixed at configuration time for all hosts except those that support
the GGP or EGP routing protocols. In these cases the Net table is updated as
part of the routing operation.
In addition, entries in either table can be changed by operator commands.
The delay and offset estimates are updated by Hello messages exchanged on
the links connecting physical neighbors.
Here is the format of a Hello message:
Figure: Hello Message Format
Where:
-
Checksum
- contains a checksum covering the fields indicated
-
Date
- is the local host's date
-
Time
- is the local host's time
-
Timestamp
- used in round-trip calculation (see below)
-
L Offset
- contains the offset of the block of entries of internet addresses used on
the local network
-
#hosts
- contains the number of entries from the host table that follows
-
Delay n
- delay to reach host n
-
Offset n
- offset from host n (difference between clocks)
Let us consider the two main steps of the Hello protocol.
Periodically each host sends a Hello message to its
neighbor on each of the communication links common to both of them. For each of
these links the sender keeps a set of state variables, including a copy of the
source-address field of the last Hello message received. When constructing a
Hello message the sender sets the destination-address field to this state
variable and the source-address field to its own address. It then fills in the
date and time fields from its clock and the time stamp from another state
variable. It finally copies the delay and offset values from its host table
into the message.
Round-trip delay calculations are performed on the host receiving the Hello
message. Each link has an internal state variable assigned, which is updated as
each Hello message is received; this variable takes the value of the time
field, minus the current time-of-day. When the next Hello message is
transmitted, the value assigned to the time stamp field is computed as the
low-order 16-bits of this variable minus the current time-of-day. The round
trip delay is computed as the low-order 16-bits of the current time-of-day
minus the value of the timestamp field.
When a Hello message arrives which results in a valid
round trip-delay calculation, a host update process is performed. This consists
of adding the round trip delay to each of the ``Delay n'' entries in the Hello
message in turn and comparing each of these calculated delays to the delay
field of the corresponding host table. Each entry is then updated according to
the following rules:
- If the link connects to another host on the same network and the port ID of
the link output process matches the port ID field of the entry, then update the
entry.
- If the link connects to another host on the same network and the port ID of
the link output process does not match the port ID field of the entry
and the calculated delay is less than the host delay field of the host
table by at least a specified switching threshold (currently 100 milliseconds),
then update the entry. For example, if host A sends host B a Hello message, and
if B's current delay to reach a given destination, D, is greater than the delay
from A to D plus the delay from B to A, B changes its route and sends traffic
to D via A.
The purpose of the switching threshold is to avoid (together with minimum
delay specification) unnecessary switching between links and transient loops
which can occur due to normal variations in propagation delays.
Please refer to RFC 891 for more details.
There are two versions of RIP. Version 1 (RIP-1) is a
widely deployed protocol with a number of known limitations. Version 2 (RIP-2)
is an enhanced version designed to alleviate the limitations of RIP while being
highly compatible with it. The term RIP is used to refer to Version 1, while
RIP-2 refers to Version 2. Whenever the reader encounters the term RIP in
TCP/IP literature, it is safe to assume that it is referring to Version 1
unless explicitly stated otherwise. We shall use this nomenclature in this
section except when the two versions are being compared, when we shall use the
term RIP-1 to avoid possible confusion.
RIP is a standard protocol (STD 34). Its status
is elective. It is described in RFC 1058, although many RIP
implementations pre-date this RFC by a number of years. RIP is generally
implemented with a daemon named routed. RIP is
also supported by gated daemons.
RIP was based on the Xerox PUP and XNS routing protocols. It is widely used,
as the code is incorporated in the routing code of Berkeley BSD UNIX which
provides the basis for many UNIX implementations.
RIP is a straightforward implementation of
vector-distance routing for local networks. RIP
communication uses UDP as a transport protocol, with port number 520 as the
destination port (see User Datagram Protocol
(UDP) for a description of UDP and ports). RIP operates in one of two
modes: active (normally used by routers) and
passive (normally used by hosts). The difference between the two is
explained below. RIP messages are sent in UDP datagrams and each contains up to
25 pairs of numbers as shown in Figure - RIP
Message.
Figure: RIP Message - Between 1 and 25 routes may be listed in a RIP
message. With 25 routes the message is 504 bytes long (25x20+4) which is the
maximum size message that can be transmitted in a 512-byte UDP
datagram.
-
Command
- is 1 for a RIP request or 2 for a RIP reply.
-
Version
- is 1.
-
Address Family
- is 2 for IP addresses.
-
IP address
- is the IP address for this routing entry: either a host or a subnet (in
which case the host number is zero).
-
Hop count metric
- is the number of hops to the destination. The hop count for a directly
connected interface is 1, and each intermediate router increments it by 1 to a
maximum of 15, with 16 indicating that no route exists to the destination.
Both active and passive RIP participants listen to all broadcast messages
and update their routing table according to the vector-distance algorithm
described earlier.
- When RIP is started it sends a message to each of its neighbors (on
well-known UDP port 520) asking for a copy of the neighbor's routing table.
This message is a query (command set to 1) with an address family of 0 and a
metric of 16. The neighboring routers return a copy of their routing tables.
- When RIP is in active mode it sends all or part of its routing table to all
of its neighbor routers (by broadcasting and/or by sending it on any
point-to-point links to its neighbors). This is done every 30 seconds. The
routing table is sent as a reply (command is 2, even though it is unsolicited).
- When RIP discovers a metric has changed, it broadcasts the change to other
routers.
- When RIP receives a reply, the message is validated and the local routing
table is updated if necessary.
To improve performance and reliability, RIP specifies that once a router (or
host) learns a route from another router, it must keep that route until it
learns of a better one (with a strictly lower cost). This prevents routes from
oscillating between two or more equal cost paths.
- When RIP receives a request, other than one for the entire table, it is
returned as the response with the metric for each entry set to the value from
the local routing table. If no route exists in the local table, the metric is
set to 16.
- RIP routes learned from other routers time out unless they are
re-advertised within 180 seconds (6 broadcast cycles). When a route times out,
its metric is set to infinity, the invalidation of the route is broadcast to
the router's neighbors, and 60 seconds later, the route is deleted from the
local routing table.
RIP is not designed to solve every possible routing problem.
RFC 1720 (STD 1) describes these technical limitations of
RIP as ``serious'' and the IETF is evaluating candidates
for a new standard ``open'' protocol to replace RIP. Possible candidates
include OSPF (see Open Shortest Path First Protocol
(OSPF) Version 2) and OSI IS-IS (see OSI
Intermediate System to Intermediate System (IS-IS)). However, RIP is widely
deployed and therefore is unlikely to be completely replaced for some time. RIP
has the following specific limitations:
- The maximum cost allowed in RIP is 16 which means that the network is
unreachable. Thus RIP is inadequate for large networks (that is, those in which
legitimate hop counts approach 16).
- RIP does not support variable length subnet masks (variable
subnetting). There is no facility in a RIP message to specify a subnet mask
associated with the IP address.
- RIP has no facilities to ensure that routing table updates come from
authorized routers. It is an unsecure protocol.
- RIP only uses fixed metrics to compare alternative routes. It is not
appropriate for situations where routes need to be chosen based on real-time
parameters such as measured delay, reliability, or load.
- The protocol depends upon counting to infinity
to resolve certain unusual situations. As described earlier
(Vector-Distance), the resolution of a loop
would require either much time (if the frequency of updates was limited) or
much bandwidth (if updates were sent whenever changes were detected). As the
size of the routing domain grows, the instability of the vector-distance
algorithm in the face of changing topology becomes apparent. RIP specifies
mechanisms to minimize the problems with counting to infinity (these are
described below) which allows RIP to be used for larger routing domains, but
eventually RIP will be unable to cope. There is no fixed upper limit, but the
practical maximum depends upon the frequency of changes to the topology, the
details of the network topology itself, and what is deemed as an acceptable
maximum time for the routing topology to stabilize.
Solving the counting to infinity problem is done by using the
split horizon, poisoned reverse and triggered updates
techniques.
Let's consider our example network (shown in
Figure - The Counting to Infinity
Problem) again.
Figure: The Counting to Infinity Problem - All links have a metric
of 1 except for the indirect route from C to D which has a metric of
10.
As described in Vector-Distance the problem
was caused by the fact that A and C are engaged in a pattern of mutual
deception. Each claims to be able to reach D via the other. This can be
prevented by being more careful about where information is sent. In particular,
it is never useful to claim reachability for a destination network to the
neighbor from which the route was learned (reverse routes).
The split horizon with poisoned reverse scheme
includes routes in updates sent to the router from which they were learned, but
sets their metrics to infinity. If two routers have routes pointing at each
other, advertising reverse routes with a metric of 16 will break the loop
immediately. If the reverse routes are simply not advertised (this scheme is
called simple split horizon), the erroneous routes
will have to be eliminated by waiting for a timeout. Poisoned reverse does have
a disadvantage: it increases the size of the routing messages.
Split horizon with poisoned reverse will prevent any
routing loop that involves only two gateways. However, it is still possible to
end up with patterns in which three routers are engaged in mutual deception.
For example, A may believe it has a route through B, B through C, and C through
A. This cannot be solved using split horizon. This loop will only be resolved
when the metric reaches infinity and the network or host involved is then
declared unreachable. Triggered updates are an attempt to speed up this
convergence. Whenever a router changes the metric for a route, it is required
to send update messages almost immediately, even if it is not yet time for one
of the regular update messages (RIP specifies a small time delay, between 1 and
5 seconds, in order to avoid having triggered updates generate excessive
network traffic).
RIP-2 is a draft standard protocol. Its status
is elective. It is described in RFC 1723.
RIP-2 extends RIP-1. It is less powerful than other recent IGPs such as OSPF
(see Open Shortest Path First Protocol (OSPF)
Version 2) and IS-IS (see OSI Intermediate
System to Intermediate System (IS-IS)), but it has the advantages of easy
implementation and lower overheads. The intention of RIP-2 is to provide a
straightforward replacement for RIP which can be used on small to medium-sized
networks, can be employed in the presence of variable subnetting (see
Subnets) or supernetting (see
Classless Inter-Domain Routing (CIDR)) and
importantly, can interoperate with RIP-1.
RIP-2 takes advantage of the fact that half of the bytes in a RIP-1 message
are reserved (must be zero) and that the original RIP-1 specification was well
designed with enhancements in mind, particularly in the use of the version
field. One notable area where this is not the case is in the interpretation of
the metric field. RIP-1 specifies it as being a value between 0 and 16 stored
in a four-byte field. For compatibility, RIP-2 preserves this
definition, meaning that it agrees with RIP-1 that 16 is to be interpreted as
infinity, and wastes most of this field.
Note: Neither RIP-1 nor RIP-2 are properly suited for use as an IGP
in an AS where a value of 16 is too low to be regarded as infinity, because
high values of infinity exacerbate the counting to infinity problem. The more
sophisticated Link-State protocol used in OSPF and IS-IS provides a much better
routing solution when the AS is large enough to have a legitimate hop count
close to 16.
Provided that a RIP-1 implementation obeys the specification in RFC 1058,
RIP-2 can interoperate with RIP-1. The RIP message format is extended as shown
in Figure - RIP-2 Message.
Figure: RIP-2 Message - The first entry in the message may be an
authentication entry, as shown here, or it may be a route as in a RIP-1
message. If the first entry is an authentication entry, only 24 routes may be
included in a message; otherwise the maximum is 25 as in RIP-1.
The fields in a RIP-2 message are the same as for a
RIP-1 message except as follows:
-
Version
- Is 2. This tells RIP-1 routers to ignore the fields designated as ``must
be zero'' (if the value is 1, RIP-1 routers are required to discard messages
with non-zero values in these fields since the messages originate with a router
claiming to be RIP-1-compliant but sending non-RIP-1 messages).
-
Address Family
- May be X'FFFF' in the first entry only, indicating
that this entry is an authentication entry.
-
Authentication Type
- Defines how the remaining 16 bytes are to be used.
The only defined types are 0 indicating no authentication and 2 indicating that
the field contains password data.
-
Authentication Data
- The password is 16 bytes, plain text ASCII, left
adjusted and padded with ASCII NULLs (X'00').
-
Route Tag
- Is a field intended for communicating information
about the origin of the route information. It is intended for interoperation
between RIP and other routing protocols. RIP-2 implementations must preserve
this tag, but RIP-2 does not further specify how it is to be used.
-
Subnet Mask
- The subnet mask associated with the subnet referred
to by this entry.
-
Next Hop
- A recommendation about the next hop that the router
should use to send datagrams to the subnet or host given in this entry.
To ensure safe interoperation with RIP, RFC 1723 specifies the following
restrictions for RIP-2 routers sending over a network interface where a RIP-1
router may hear and operate on the RIP messages.
- Information internal to one network must never be advertised into another
network.
- Information about a more specific subnet may not be advertised where RIP-1
routers would consider it a host route.
- Supernet routes (routes with a subnet mask shorter than the natural
or ``unsubnetted'' network mask) must not be advertised where they could be
misinterpreted by RIP-1 routers.
RIP-2 also supports the use of multicasting rather than simple broadcasting.
This can reduce the load on hosts which are not listening for RIP-2 messages.
This option is configurable for each interface to ensure optimum use of RIP-2
facilities when a router connects mixed RIP-1/RIP-2 subnets to RIP-2-only
subnets. Similarly, the use of authentication in mixed environments can be
configured to suit local requirements.
RIP-2 is implemented in recent versions of the gated
daemon, often termed gated Version 3. Since the
draft standard is new at the time of writing, many implementations will comply
with the earlier version described in RFC 1388. Such
implementations will interoperate with those adhering to RFC 1723.
For more information on RIP-2, see:
- RFC 1721 - RIP Version 2 Protocol Analysis
- RFC 1722 - RIP Version 2 Protocol Applicability Statement
- RFC 1723 - RIP Version 2 - Carrying Additional Information
- RFC 1724 - RIP Version 2 MIB Extension
Note: The term OSPF is invariably used to refer
to OSPF Version 2 (OSPF-2). OSPF Version 1, which is described in RFC 1131, is
obsolete.
OSPF is a draft standard protocol. Its status
is elective, but RFC 1370 contains an applicability statement for
OSPF which says that any router implementing a protocol other than simple
IP-based routing must implement OSPF (this does not preclude a router
implementing other protocols as well, of course). OSPF is
described in RFC 1583, which obsoletes RFC 1247. OSPF
implementations based on RFC 1583 are backward-compatible with implementations
based on RFC 1247 and will interoperate with them. Readers interested in the
development of OSPF Version 2 from Version 1 should refer to Appendix F of RFC
1247 and Appendix E of RFC 1583.
OSPF is an interior routing protocol, but it is designed to operate with a
suitable exterior protocol, such as BGP. See BGP
OSPF Interaction.
OSPF is a complex standard when compared to RIP: RFC 1583 runs to 216 pages,
whereas RIP, specified in RFC 1058 has 33 pages and RIP-2 (RFC 1723) adds only
another 9. Much of the complexity of OSPF is directed towards a single purpose:
ensuring that the topological databases are the same for all of the routers
within an area. Because the database is the basis for all routing choices, if
routers were to have independent databases, they could make mutually
conflicting decisions.
OSPF communicates using IP (it is protocol number 89). It is a
Link-State, Shortest Path First protocol as described in
Link-State, Shortest Path First. OSPF
supports different kinds of networks such as point-to-point networks,
broadcast networks, such as Ethernet and token-ring, and non-broadcast
networks, such as X.25.
The OSPF specification makes use of state machines to define the
behavior of routers complying with the protocol. Aspects
of a router's operation which are important to OSPF, such as its network
interfaces and its neighboring routers, are described as being in one of a
finite number of states (for example, a neighbor may be in the down
state). There is a separate state machine for each separate component (for
example, two network interfaces have separate state machines) and the state of
one is independent of the state of another. The possible states are sufficient
to describe all possible conditions relevant to the protocol, so a state
machine is always in one, and only one, of its possible states. State changes
occur only as a result of events. There is a finite set of events for
each type of state machine which is sufficient to describe all possible
occurrences relevant to the protocol. The behavior of the state machine in
response to an event is defined for all possible combinations of state and
event. For example, if the state machine for a network interface experiences an
InterfaceDown event, the state machine changes to the down state
unconditionally. The InterfaceDown event is generated by the OSPF
implementation whenever it receives an indication from a lower-level protocol
that the interface is not functioning. See RFC 1583 for a complete description
of each of the state machines, their possible states and events and the changes
associated with them.
Here are some definitions which are necessary to understand the sequence of
operations described later in this section:
-
Area
- A set of networks within a single autonomous system
that have been grouped together. The topology of an area is hidden from the
rest of the autonomous system, and each area has a separate topological
database (see below). Routing within the autonomous system takes place on two
levels, depending on whether the source and destination of a packet reside in
the same area (intra-area routing) or different areas (inter-area
routing).
The division of an autonomous system into areas enables a significant
reduction in the volume of routing traffic required to manage the routing
database for a large autonomous system.
-
Backbone
- The backbone consists of those networks not contained
in any area, their attached routers, and those routers that belong to multiple
areas. The backbone must be logically contiguous. If it is not
physically contiguous, the separate components must be connected using
virtual links (see below). The backbone is responsible for the
distribution of routing information between areas. The backbone itself has all
the properties of an area; its topology is separate from that of the areas.
-
Area Border Router
- A router connected to multiple areas. An area border
router has a copy of the topological database for each area that it is
connected to. An area border router is always part of the backbone. Area border
routers are responsible for the propagation of inter-area routing information
into the areas to which they are connected.
-
Internal Router
- A router which is not an area border router.
-
AS Border Router (ASBR)
- A router which exchanges routing information with
routers belonging to other autonomous systems. All routers in the AS know the
path to all AS boundary routers. An ASBR may be an area border router or
an internal router. It need not be part of the backbone.
Note: The nomenclature for this type of router is somewhat varied.
RFC 1583, which describes OSPF uses the term AS Boundary Router.
RFC 1267 and 1268 which describe BGP use the terms
Border Router and Border Gateway. RFC 1340 which describes the
interaction between OSPF and BGP uses the term AS Border Router. We
shall use the last term consistently when describing both OSPF and BGP.
-
Virtual Link
- A virtual link is part of the backbone. Its
endpoints are two area border routers which share a common non-backbone area.
The link is treated like a point-to-point link with metrics cost equal to the
intra-area metrics between the endpoints of the links. The routing through the
virtual link is done using normal intra-area routing.
-
Transit Area
- An area through which a virtual route is physically
connected.
-
Stub Area
- An area configured to use default routing for
inter-AS routing. A stub area can be configured where there is only a single
exit from the area, or where any exit may be used without preference for
routing to destinations outside the autonomous system. By default inter-AS
routes are copied to all areas, so the use of stub areas can reduce the storage
requirements of routers within those areas for autonomous systems where a lot
of inter-AS routes are defined.
-
Multiaccess Network
- A physical network that supports the attachment of
multiple routers. Each pair of routers on such a network is assumed to be able
to communicate directly.
-
Hello Protocol
- The part of the OSPF protocol used to establish and
maintain neighbor relationships. This is not the Hello protocol
described in The Hello Protocol.
-
Neighboring routers
- Two routers that have interfaces to a common network.
On multiaccess networks, neighbors are dynamically discovered by the Hello
protocol.
Each neighbor is described by a state machine which describes the
conversation between this router and its neighbor. A brief outline of the
meaning of the states follows. See the section immediately following for a
definition of the terms adjacency and designated router.
-
Down
- Initial state of a neighbor conversation. It indicates that there has been
no recent information received from the neighbor.
-
Attempt
- A neighbor on a non-broadcast network appears down and an attempt should be
made to contact it by sending regular Hello packets.
-
Init
- A Hello packet has recently been received from the neighbor. However,
bidirectional communication has not yet been established with the neighbor
(that is, the router itself did not appear in the neighbor's Hello packet).
-
2-way
- In this state, communication between the two routers is bidirectional.
Adjacencies can be established, and neighbors in this state or higher are
eligible to be elected as (backup) designated routers.
-
ExStart
- The two neighbors are about to create an adjacency.
-
Exchange
- The two neighbors are telling each other what they have in their
topological databases.
-
Loading
- The two neighbors are synchronizing their topological databases.
-
Full
- The two neighbors are now fully adjacent; their databases are synchronized.
Various events cause a change of state. For example, if a router receives a
Hello packet from a neighbor that is down, the neighbor's state changes to
init, and an inactivity timer is started. If the timer fires (that is, no
further OSPF packets are received before it expires) the neighbor will return
to the down state. Refer to RFC 1583 for a complete description of the states
and information on the events which cause state changes.
-
Adjacency
- A relationship formed between selected neighboring
routers for the purpose of exchanging routing information. Not every
pair of neighboring routers become adjacent. In particular, not every pair of
routers will stay synchronized. If all neighbors were to be synchronized, the
number of synchronized pairs on a multiaccess network such as a LAN would be
n(n-1)/2 where n is the number of routers on the LAN. In
large networks, the synchronization traffic would swamp the network, rendering
it unusable. The concept of adjacencies is used to limit the number of
synchronized pairs to 2n-1, ensuring that the amount of synchronization
traffic is manageable.
-
Link State Advertisement
- Refers to the local state of a router or network.
This includes the state of the router's interfaces and adjacencies. Each link
state advertisement is flooded throughout the routing domain. The collected
link state advertisements of all routers and networks form the area's
topological database.
-
Flooding
- The process of ensuring that each link state
advertisement is passed between adjacent routers to reach every router in the
area. The flooding procedure is reliable.
-
Designated Router
- Each multiaccess network that has at least two
attached routers, has a Designated Router. The Designated Router generates a
link state advertisement for the multiaccess network. It is elected by the
Hello protocol. It becomes adjacent to all other routers
on the network. Since the topological databases of all routers are synchronized
through adjacencies, the Designated Router plays a central part in the
synchronization process.
-
Backup Designated Router
- In order to make the transition to a new Designated
Router smoother, there is a Backup Designated Router for each multiaccess
network. The Backup Designated Router is also adjacent to all routers on the
network, and becomes Designated Router when the previous Designated Router
fails. Because adjacencies already exist between the Backup Designated Router
and all other routers attached to the network, new adjacencies do not have to
be formed when the Backup Designated Router takes over from the Designated
Router, shortening the time required for the takeover considerably. The Backup
designated router is elected using the Hello protocol.
-
Interface
- The connection between a router and one of its
attached networks. Each interface has state information associated with it
which is obtained from the underlying lower-level protocols and the OSPF
protocol itself. A brief description of each state is given here. Please refer
to RFC 1583 for more details, and for information on the events that will cause
an interface to change its state.
-
Down
- The interface is unavailable. This is the initial state of an interface.
-
Loopback
- The interface is looped back to the router. It cannot be used for regular
data traffic.
-
Waiting
- The router is trying to determine the identity of the Designated Router or
its backup.
-
Point-to-Point
- The interface is to a point-to-point network or is a virtual link. The
router forms an adjacency with the router at the other end.
Note: The interfaces do not need IP addresses. Since the remainder
of the internet has no practical need to see the routers' interfaces to the
point-to-point link, just the interfaces to other networks, any IP addresses
for the link would be needed only for communication between the two routers. To
conserve the IP address space, the routers can dispense with IP addresses on
the link. This has the effect of making the two routers appear to be one to IP
but this has no ill effects. Such a link is called an unnumbered link.
-
DR Other
- The interface is on a multiaccess network but this router is neither the
Designated Router nor its backup. The router forms adjacencies with the
Designated Router and its backup.
-
Backup
- The router is the Backup Designated Router. It will be promoted to
Designated Router if the present Designated Router fails. The router forms
adjacencies with every other router on the network.
-
DR
- The router itself is the Designated Router. The router forms adjacencies
with every other router on the network. The router must also originate a
network links advertisement for the network node.
-
Type of Service (TOS) metrics
- In each type of link state advertisement, different
metrics can be advertised for each IP Type of Service. A metric for TOS 0 (used
for OSPF routing protocol packets) must always be specified. Metrics for other
TOS values can be specified; if they are not, these metrics are assumed equal
to the metric specified for TOS 0.
-
Link State Database
- Also called the directed graph or the
topological database. It is created from the link state advertisements
generated by the routers in the area.
Note: RFC 1583 uses the term Link State Database in preference to
topological database. The former term has the advantage that it describes the
contents of the database, the latter is more descriptive of the purpose of the
database -- to describe the topology of the area. We have previously used the
term topological database for this reason, but for the remainder of this
section where we discuss the operation of OSPF in more detail, we will refer to
it as the Link State Database.
-
Shortest-Path Tree
- Each router runs the Shortest Path First (SPF)
algorithm on the Link State Database to obtain its shortest-path tree. The tree
gives the route to any destination network or host as far as the area boundary.
It is used to build the routing table.
Note: Because each router occupies a different place in the area's
topology, application of the SPF algorithm gives a different tree for each
router, even though the database is identical.
Area border routers run multiple copies of the algorithm but build a single
routing table.
-
Routing table
- The routing table contains entries for each
destination: network, subnet or host. For each destination, there is
information for one or more types of service (TOS). For each combination of
destination and type of service, there are entries for one or more optimum
paths to be used.
-
Area ID
- A 32-bit number identifying a particular area. The
backbone has an Area ID of zero.
-
Router ID
- A 32-bit number identifying a particular router.
Each router within the AS has a single router ID. One possible implementation
is to use the lowest numbered IP address belonging to a router as its router
ID.
-
Router Priority
- An 8-bit unsigned integer, configurable on a per-interface basis indicating
this router's priority in the selection of the (backup) Designated Router. A
Router Priority of zero indicates that this router is ineligible to be the
Designated Router.
The basic sequence of operations performed by OSPF routers is:
- Discovering OSPF neighbors
- Electing the Designated Router
- Forming adjacencies
- Synchronizing databases
- Calculating the routing table
- Advertising Link States
Routers will go through these steps when they first come up, and will repeat
these steps in response to events which occur in the network. Each router must
perform each of these steps for each network it is connected to, except for the
calculation of the routing table. Each router generates and maintains a single
routing table for all networks.
Each of these steps is described in the following sections.
When OSPF routers start, they initiate and sustain
relationships with their neighbors using the Hello protocol. The Hello protocol
also ensures that communication between neighbors is bidirectional. Hello
packets are sent periodically out to all router interfaces. Bidirectional
communication is indicated when the router sees itself in the neighbor's Hello
packet. On a broadcast network, Hello packets are sent using multicast;
neighbors are then discovered dynamically. On non-broadcast networks each
router that may potentially become a Designated Router has a list of all
routers attached to the network and will send Hello packets to all other
potential Designated Routers when its interface to the non-broadcast network
first becomes operational.
The OSPF header is described in
Figure - OSPF Packet Header.
Figure: OSPF Packet Header
-
Version #
- The OSPF version number (2).
-
Type
- Hello (1), database description (2), Link-State Request (3), Link-State
Update (4), or Link-State Acknowledgment (5).
-
Packet length
- Length of the protocol packet in bytes including the OSPF header.
-
Router ID
- The ID of the router originating the packet.
-
Area ID
- The area that the packet is being sent into.
-
Checksum
- The standard IP checksum of the entire contents of the packet, excluding
the 64-bit authentication field.
-
AuType
- Identifies the authentication scheme to be used for the packet. The
authentication type is configurable on a per-area basis. Additional
authentication data is configurable on a per-interface basis. The
authentication types currently defined are 0 (No authentication) and 1 (plain
text 64-bit password).
-
Authentication
- A 64-bit field for use by the authentication scheme.
The format of the OSPF Hello packet is given in
Figure - OSPF Hello Packet.
Figure: OSPF Hello Packet
-
Network Mask
- The network mask associated with this interface. This is the subnet mask
if subnetting is implemented, or the equivalent mask for a non-subnetted
network (for example, 255.255.255.0 for a non-subnetted Class C network).
-
Dead Interval
- The number of seconds that must elapse before returning a silent neighbor
to the down state.
-
Hello Int (Hello Interval)
- The number of seconds between this router's Hello packets.
-
Priority
- This router's Router Priority (for this interface).
-
Designated Router
- The IP address of the Designated Router for this network, according to the
sending router. This is set to 0 if no Designated Router is known.
-
Backup Designated Router
- The IP address of the Backup Designated Router for this network, according
to the sending router. This is set to 0 if no Backup Designated Router is
known.
-
Neighbor
- The Router IDs of each router from whom valid Hello packets have been
received recently from the network. Recently means within the last Dead
Interval.
This is done using the Hello protocol. A brief
description of the process is given here. See RFC 1583 for a full description.
The router examines the list of its neighbors, discards any with which it does
not have bidirectional communication or which have a Router Priority of zero,
and records the Designated Router, Backup Designated Router and Router Priority
declared by each one. The router adds itself to the list, using the Router
Priority configured for the interface and zero (unknown) for the Designated
Router and Backup Designated Router values, if the calculating router has just
come up.
The following rules are used to determine the Backup Designated Router:
- If one or more routers declare themselves to be the Backup Designated
Router and not to be the Designated Router, then the one with the higher Router
Priority wins.
- In the event of a tie, the router with the highest Router ID wins.
- If no router has declared itself to be the Backup Designated Router, then
the Router with the highest Router Priority is elected unless it has declared
itself to be the Designated Router.
- Again, in the event of a tie, the router with the highest Router ID wins.
Because the calculating router is in the list, it may determine that it is to
become the Backup Designated Router. A similar process is followed for the
Designated Router:
- If one or more routers declare themselves to be the Designated Router, then
the one with the higher Router Priority wins.
- In the event of a tie, the router with the highest Router ID wins.
- If no router has declared itself to be the Backup then the Backup
Designated Router becomes the Designated Router.
The actual process is considerably more complex than this, because the Hello
messages transmitted include changes to the fields recorded on other routers,
and these changes cause events in those routers which in turn will trigger
state changes or other actions. The intent behind the mechanism is twofold:
- That when a router comes up, it should not usurp the position of (Backup)
Designated Router from the current holder even if it has a higher Router
priority.
- That the promotion of a Backup Designated Router to Designated Router
should be orderly and require the Backup to accept its responsibilities.
The algorithm does not always result in the router with the highest priority
being the Designated Router, nor in the one with the second highest priority
being the Backup.
The Designated Router has the following responsibilities:
- The Designated Router generates the network links advertisement on behalf
of the network. This advertisement is flooded throughout the area and
describes this network to all routers in all networks in this area.
- The Designated Router becomes adjacent to other routers on the network.
These adjacencies are central to the flooding procedure used to ensure that
link state advertisements reach all routers in the area and therefore that the
topological database used by all routers remains the same.
The Backup Designated Router has the following responsibility
- The Designated Router becomes adjacent to all other routers on the network.
This ensures that it can take over from the Designated Router rapidly.
After a neighbor has been discovered, bidirectional
communication ensured, and (on a multiaccess network) a Designated Router
elected, a decision is made regarding whether or not an adjacency should be
formed with the neighbor:
- On multiaccess networks, all routers become adjacent to both the Designated
Router and the Backup Designated Router.
- On point-to-point links (or virtual links), the router always forms an
adjacency with the router at the other end.
If the decision is made to not form an adjacency, the state of the neighbor
communication remains in the 2-way state.
Adjacencies are established using Database Description
packets. These contain a summary of the sender's link
state database. Multiple packets may be used to describe the database: for this
purpose a poll-response procedure is used. The router with the higher router ID
will become the master, the other will become the slave. Database Description
packets sent by the master (polls) are acknowledged by Database
Description packets sent by the slave (responses). The packets contain
sequence numbers to ensure a match between polls and responses. This is called
the Database Exchange Process.
The format of the OSPF Database Description packet is shown in
Figure - OSPF Database Description
Packet.
Figure: OSPF Database Description Packet
-
0
- Reserved, must be 0.
-
I bit
- Init bit. Set to 1 when the packet is the first in the sequence of database
description.
-
M bit
- More bit. Indicates that more data descriptions are to follow.
-
MS bit
- Master/slave bit. Set to 1 when the router is the master, 0 when it is the
slave.
-
DD sequence number
- Used to sequence the collection of database description packets.
The rest of the packet contains a list of some or all of the contents of the
topological database. Each item in the database is a link state advertisement.
The database description packets contain the headers from these advertisements.
The headers are sufficient to uniquely identify each advertisement. This
information is used in the subsequent database synchronization. The format of a
Link State Header is shown in Figure - OSPF
Link State Advertisement Header.
Figure: OSPF Link State Advertisement Header
The fields in the link state advertisement header are:
-
LS age
- A 16-bit number indicating the time in seconds since the origin of the
advertisement. It is increased as the link state advertisement resides in a
router's database and with each hop it travels as part of the flooding
procedure. When it reaches a maximum value, it ceases to be used for
determining routing tables and is discarded unless it is still needed for
database synchronization. The age is also to determine which of two otherwise
identical copies of an advertisement a router should use.
-
Options
- Two bits which describe optional OSPF capabilities. The E-bit indicates an
external routing capability: it is set unless the advertisement is for a
router, network links or summary link in a stub area. The E-bit is used for
information only and does not affect the routing table. The T-bit indicates
that the advertisement describes paths for types of service in addition to TOS
0.
-
LS type
- The types of the link state advertisement are:
-
1
- Router links. These describe the states of a router's interfaces.
-
2
- Network links. These describe the routers attached to a network.
-
3
- Summary links. These describe inter-area, intra-AS routes. They are created
by area border routers and allow routes to networks within the AS but outside
the area to be described concisely.
-
4
- Summary links. These describe routes to the boundary of the AS (that is, to
AS boundary routers). They are created by area border routers. They are very
similar to type 3.
-
5
- AS external links. These describe routes to networks outside the AS. They
are created by AS boundary routers. A default route for the AS can be
described this way.
-
Link State ID
- A unique ID for the advertisement which is dependent on the Link State
Type. For types 1 and 4 it is the Router ID, for types 3 and 5 it is an IP
network number, and for type 2 it is the IP address of the Designated Router.
-
Advertising Router
- The Router ID of the router that originated the link state advertisement.
For type 1 advertisements, this field is identical to the Link State ID. For
type 2, it is the Router ID of the network's Designated Router. For types 3 and
4, it is the Router ID of an area border router. For type 5, it is the Router
ID of an AS boundary router.
-
LS sequence number
- Used to allow detection of old or duplicate link state advertisements.
-
LS checksum
- Checksum of the complete link state advertisement excluding the LS age
field.
After the Database Exchange Process is over, each
router has a list of those link advertisements for which the neighbor has more
up-to-date instances. These are then requested in Link State Request
packets. The response to a Link State Request packet is a
Link State Update packet which contains some or all of the link state
advertisements requested. At most one Link State Request can be outstanding: if
no response is received, the requester must retry the request.
Link state advertisements come in five formats. The format of a Router Links
Advertisement (Type 1) is shown in
Figure - OSPF Router Links Advertisement.
Figure: OSPF Router Links Advertisement - This advertisement is
itself encapsulated in an OSPF packet.
-
V Bit
- When set, this router is the endpoint of a virtual link which is using this
area as a transit area.
-
E Bit
- When set, the router is an AS boundary router.
-
B Bit
- When set, the router is an area border router.
-
# links
- The number of links described by this advertisement.
-
Link ID
- Identifies the object that this link connects to. The value depends upon
the type field (see below).
-
1
- Neighboring router's Router ID
-
2
- IP Address of the Designated Router
-
3
- This value depends on what the inter area route is to:
-
4
- Neighboring router's Router ID
-
Link Data
- This value also depends upon the type field (see RFC 1583 for details).
-
Type
- What this link connects to.
-
1
- Point-to-point connection to another router
-
2
- Connection to a transit network
-
3
- Connection to a stub network or to a host
-
4
- Virtual link
-
# metric
- The number of different TOS metrics given for this link in addition to the
metric for TOS 0.
-
TOS 0 metric
- The cost of using this outbound link for TOS 0. All OSPF routing protocol
packets are sent with the IP TOS field set to 0.
-
TOS
- IP type of service that this metric refers to. RFC
1349 defines the possible TOS values in an IP header (see also
Internet Protocol (IP)) using a 4-bit sequence.
OSPF encodes these by treating the sequence as a number and doubling it (there
is a reserved bit of 0 immediately following the TOS value field in the IP
datagram, so OSPF allows for its future inclusion in the TOS value). There are
five defined values:
Table: Type of Service Values
-
metric
- The cost of using this outbound router link for traffic of the specified
Type of Service.
As an example, suppose the point-to-point link between routers RT1 (IP
address: 192.1.2.3) and RT6 (IP address: 6.5.4.3) is a satellite link. To
encourage the use of this line for high bandwidth traffic, the AS administrator
may set an artificially low metric for that TOS. Router RT1 would then
originate the following router links advertisement (assuming RT1 is an area
border router and is not an AS boundary router):
; RT1's router links advertisement
LS age = 0 ; always true on origination
LS type = 1 ; indicates router links
Link State ID = 192.1.2.3 ; RT1's Router ID
Advertising Router = 192.1.2.3 ; RT1
bit E = 0 ; not an AS boundary router
bit B = 1 ; area border router
#links = 1
Link ID = 6.5.4.3 ; neighbor router's Router ID
Link Data = 0.0.0.0 ; interface to unnumbered SL
Type = 1 ; connects to router
# other metrics = 1
TOS 0 metric = 8
TOS = 2 ; high bandwidth
metric = 1 ; traffic preferred
The format of a Network Links Advertisement (Type 2) is shown in
Figure - OSPF Network Links
Advertisement.
Figure: OSPF Network Links Advertisement - This advertisement is
itself encapsulated in an OSPF packet.
-
Network Mask
- The IP address mask for the network. For example a Class A network would
have the mask 255.0.0.0.
-
Router ID 1-n
- The IP addresses of all routers on the network which are adjacent to the
Designated Router (including the sending router). The number of routers in the
list is deduced from the length field in the header.
The format of a Summary Links Advertisement (Type 3 or 4) is shown in
Figure - OSPF Summary Links
Advertisement.
Figure: OSPF Summary Links Advertisement - This advertisement is
itself encapsulated in an OSPF packet.
-
Network Mask
- For a Type 3 link state advertisement, this is the IP address mask for the
network. For a Type 4 link state advertisement, this is not meaningful and must
be zero.
-
TOS 0
- zero
-
metric
- The cost of this route for this type of service in the same units used for
TOS metrics in Type 1 advertisements.
-
TOS x
- Zero or more entries for additional types of service. The number of entries
can be determined from the length field in the header.
The format of an External Links Advertisement is shown in
Figure - OSPF External Links
Advertisement.
Figure: OSPF External Links Advertisement - This advertisement is
itself encapsulated in an OSPF packet.
-
Network Mask
- The IP address mask for the network.
-
Bit E
- The type of external metric. If set, the type is 2, otherwise it is 1.
-
1
- The metric is directly comparable to OSPF Link State metrics
-
2
- The metric is considered larger than all OSPF Link State metrics
-
TOS 0
- zero
-
metric
- The cost of this route. Interpretation depends on the E-bit.
-
Forwarding Address
- The IP address that data traffic for this type of service intended for the
advertised destination is to be forwarded to. The value zero indicates that
traffic should be forwarded to the AS boundary router that originated the
advertisement.
-
External Route Tag
- A 32-bit value attached to the external route by an AS boundary router.
OSPF does not define or use this value, but see
BGP OSPF Interaction for its use when BGP is
being used as an external routing protocol.
-
TOS x
- Zero or more entries for additional types of service. The number of entries
can be determined from the length field in the header.
When all Link State Request packets have been answered, the databases are
synchronized and the routers are described as fully adjacent. This adjacency is
now added to the two routers' link state advertisements.
Using its attached areas' link state databases as
input, a router runs the SPF algorithm to build its routing table. The routing
table is always built from scratch: updates are never made to an existing
routing table. An old routing table is not discarded until changes between the
two tables have been identified. Briefly, the calculation consists of the steps
listed below. See RFC 1583 for more details about how the algorithm is
implemented.
- The intra-area routes are calculated building the shortest path tree for
each attached area using the router itself as the root of the tree. The router
also calculates whether the area can act as a transit area for virtual links.
- The inter-area routes are calculated through examination of summary link
advertisements. For area border routers (which are part of the backbone) only
link advertisements corresponding to the backbone are used (that is, an area
border router will always route inter-area traffic through the backbone).
- If the router is connected to one or more transit areas, the router
replaces any routes it has calculated by routes through transit areas if these
are superior.
- AS external routes are calculated through examination of AS external link
advertisements. The locations of the AS boundary routers are already known
because these are determined like any other intra-area or inter-area routes.
When the algorithm produces multiple equal cost routes, OSPF can distribute
the load across them evenly. The maximum supported number of equal cost routes
is implementation dependent.
A router periodically advertises its link state, so the absence of a recent
advertisement indicates to a router's neighbors that the router is down. All
routers which have established bidirectional communication with a neighbor run
an inactivity timer to detect such an occurrence. If the timer is not reset, it
will eventually pop, and the associated event places the state machine
corresponding to that neighbor in the down state. This means that
communication must be re-established from the beginning, including
re-synchronization of databases. A router also re-issues its advertisements
when its state changes.
A router can issue several link state advertisements into each area. These
are propagated throughout the area by the flooding procedure. Each router
issues a Router Links Advertisement. If the router is also the Designated
Router for one or more of the networks in the area, then it will originate
Network Links Advertisements for those networks. Area border routers issue one
Summary Link Advertisement for each known inter-area destination. AS boundary
routers originate one AS External Link Advertisement for each known external
destination. Destinations are advertised one at a time so that the change in
any single route can be flooded without reflooding the entire collection of
routes. During the flooding procedure, many link state advertisements can be
carried by a single Link State Update packet.
OSPF is a complex routing protocol, as will be clear
from the preceding sections. The benefits of this complexity (over RIP) are as
follows:
- Because of the synchronized Link State databases, OSPF routers will
converge much faster than RIP routers after topology changes. This effect
becomes more pronounced as autonomous systems get larger.
- It includes Type of Service (TOS) routing that is designed to
compute separate routes for each type of service. For any destination, multiple
routes can exist, each route supporting one or more different Types of Service.
- It uses weighted metrics for different speed links. For example, a T1 1.544
Mbps link might be assigned a metric of 1 and a 9600 bps SLIP link might be
assigned a metric of 10.
- It provides load balancing since an OSPF gateway can use more than one
equal minimum cost path to a destination.
- A subnet mask is associated with each route, allowing variable-length
subnetting (see Subnets) and supernetting
(see Classless Inter-Domain Routing (CIDR)).
- All exchanges between routers may be authenticated by the use of passwords.
- OSPF supports host-specific routes, network-specific routes as well as
subnet routes.
- OSPF allows contiguous networks and hosts to be grouped together into
areas within an AS, simplifying the topology and reducing the amount of
routing information which must be exchanged. Knowledge of an area's topology
remains hidden from other areas.
- It minimizes broadcasts by allowing a more complex graph topology in which
multiaccess networks have a Designated Router which is responsible for
describing that network to the other networks in the area.
- It allows external routing information exchange, that is, routing
information learned from another autonomous system.
- It allows routing within the AS to be configured according to a virtual
topology rather than just the physical interconnections. Areas can be joined
using virtual links which cross other areas without requiring complicated
routing.
- It allows the use of point-to-point links without IP addresses which can
save scarce resources in the IP address space.
A detailed description of OSPF can be found in the following RFCs:
- RFC 1245 - OSPF Protocol Analysis
- RFC 1246 - Experience with the OSPF Protocol
- RFC 1253 - OSPF Version 2: Management Information Base
- RFC 1370 - Applicability Statement for OSPF
- RFC 1583 - OSPF Version 2
Intermediate System to Intermediate System (IS-IS) is
a similar protocol to OSPF: it also uses a Link State,
Shortest Path First algorithm (see Link-State,
Shortest Path First for more details). However, IS-IS is an OSI protocol
used for routing Connectionless Network Protocol (CLNP)
packets within a routing domain. CLNP is the OSI protocol
most comparable to IP.
Integrated IS-IS extends IS-IS to encompass TCP/IP. Integrated IS-IS
is described in RFC 1195. Its goal is to provide a
single (and efficient) routing protocol for TCP/IP and for OSI.
Its design makes use of the OSI IS-IS routing protocol, augmented with
IP-specific information, and provides explicit support for IP subnetting,
variable subnet masks, TOS-based (type of service) routing, and external
routing. It provides a provision for authentication information. Integrated
IS-IS is based on the same SPF routing algorithm as OSPF.
Integrated IS-IS does not employ mutual encapsulation of IP and CLNP
packets: both types are forwarded ``as-is'', nor does it change the behavior of
the router as expected by either protocol suite. Integrated IS-IS behaves like
an IGP in a TCP/IP network and in an OSI network. The only change is the
addition of additional IP-related information.
IS-IS uses the term Intermediate System (IS) to
refer to an IS-IS router, but we shall use the term router, since this is
freely used in the Integrated IS-IS standard.
IS-IS groups networks into domains in a fashion that is analogous to OSPF.
A routing domain is analogous to an Autonomous
system, and it is subdivided into areas just like OSPF. Here is an
overview of some of the more important aspects of IS-IS routing. Where
possible, comparisons are made with equivalent concepts used in OSPF but it is
dangerous to draw too close a parallel, since there are fundamental differences
between the two protocols.
- Routers are divided into Level 1 routers, which know
nothing of the topology outside their areas, and Level 2 routers, which do know
about the higher-level topology, but know nothing about the topology inside the
areas unless they are also Level 1 routers.
- A Level 1 router may belong to more than one area, but unlike OSPF this is
not for routing purposes but for ease of management of the domain, and would
normally be short term. A Level 1 router recognizes another as a neighbor if
they are both in the same area.
- A Level 2 router recognizes all other Level 2 routers as neighbors. A Level
2 router may also be a Level 1 router in one area, but not more.
- A Level 1 router in IS-IS cannot have a link to an external router (in OSPF
an internal router can be an AS border router).
- There is a Level 2 backbone containing all Level 2 routers, but unlike in
OSPF, it must be physically connected.
- The OSI addressing scheme explicitly identifies the target area for a
packet, allowing simple selection of routing choices as follows:
- Level 2 routers route towards the area without regard to its internal
structure.
- Level 1 routers route towards the destination if it is within their area,
or to the nearest Level 2 router if it is not.
- Multiaccess networks use a Designated Router concept. To avoid the
``n(n-1)/2'' problem described under OSPF, IS-IS implements a
pseudonode for the LAN. Each LAN-attached router
is regarded as having a link to the pseudonode but no link to any of the other
routers on the LAN. The Designated Router then acts on behalf of the
pseudonode.
Integrated IS-IS permits considerable mixing of the two protocol suites,
subject to certain restrictions on the topology. Three types of router are
defined:
-
IP-only
- A router which uses IS-IS as the routing protocol for IP and does not
otherwise support OSI protocols (for example, such routers would not be able to
forward OSI CLNP packets).
-
OSI-only
- A router which uses IS-IS as the routing protocol for OSI but not IP.
-
Dual
- A router which uses IS-IS as a single integrated routing protocol for both
IP and OSI.
It is possible to have a mixed domain containing IS-IS routers, some of which
are IP-only, some OSI-only and some are dual. Each area within a mixed domain
is configured to be OSI, IP or dual. Areas which are to carry mixed traffic
must have dual routers for all of the Level 1 routers. Similarly, the Level 2
routers in a mixed domain must all be dual routers if mixed traffic is to be
routed between areas.
As its name suggests, Integrated IS-IS offers an integrated routing solution
for multi-protocol networks. OSPF, like other TCP/IP routing protocols, uses an
approach termed Ships In the Night (SIN) to handle
coexistence issues. In the SIN approach, each multiprotocol router runs a
separate process for each network layer (IP and OSI). A SIN router allows
network managers to insert new SIN-based routing protocols, such as OSPF, one
by one in the network, but the protocols exist independently of one another,
and their frames pass each other like ships in the night.
Since the customer base of independent router vendors remains largely
TCP/IP-focused, most of these vendors are choosing, for now, to stick with SIN
even if it means their routers will not be able to work in OSI networks. A few
of them have announced that they will support Integrated IS-IS in the future.
Table of Contents Exterior
Routing Protocols