Table of Contents NSFNET Routing Architecture
TCP/IP Tutorial and Technical Overview

3.3 Interior Routing Protocols

Interior routing protocols or interior gateway protocols (IGPs) are used to exchange routing information between routers within a single autonomous system. They are also used by routers which run exterior routing protocols to collect network-reachability information for the autonomous system.

Note: The term interior routing protocol has no abbreviation in common use, so we shall use the abbreviation IGP as is usual in TCP/IP literature.

The most widely used IGPs are:

Before discussing these three protocols in detail, we shall look at two important groups of routing algorithm used in IGPs.

3.3.1 Routing Algorithms

In this section, we discuss the Vector-Distance and Link-State, Shortest Path First routing algorithms.

3.3.1.1 Vector-Distance

The term Vector-Distance refers to a class of algorithms that gateways use to update routing information. Each router begins with a set of routes for those networks or subnets to which it is directly attached, and possibly some additional routes to other networks or hosts if the network topology is such that the routing protocol will be unable to produce the desired routing correctly. This list is kept in a routing table, where each entry identifies a destination network or host and gives the ``distance'' to that network. The distance is called a metric and is typically measured in ``hops''.

Periodically, each router sends a copy of its routing table to any other router it can reach directly. When a report arrives at router B from router A, B examines the set of destinations it receives and the distance to each. B will update its routing table if:

This kind of algorithm is easy to implement, but it has a number of disadvantages:

The most difficult task in a vector-distance algorithm is to prevent instability. Different solutions are available:

3.3.1.2 Link-State, Shortest Path First

The growth in networking over the past few years has pushed the currently available Interior Gateway Protocols, which use vector-distance algorithms, past their limits. The primary alternative to vector-distance schemes is a class of protocols known as Link State, Shortest Path First.

The important features of these routing protocols are:

In general, a link state protocol works as follows. Each router periodically sends out a description of its connections (the state of its links) to its neighbors (routers are neighbors if they are connected to the same network). This description, called a Link State Advertisement (LSA), includes the configured cost of the connection. The LSA is flooded throughout the router's domain. Each router in the domain maintains an identical synchronized copy of a database composed of this link state information. This database describes both the topology of the router's domain and routes to networks outside of the domain such as routes to networks in other autonomous systems. Each router runs an algorithm on its topological database resulting in a shortest-path tree. This shortest-path tree contains the shortest path to every router and network the gateway can reach. From the shortest-path tree, the cost to the destination and the next hop to forward a datagram to is used to build the router's routing table.

Link-state protocols, in comparison with vector-distance protocols, send out updates when there is news, and may send out regular updates as a way of ensuring neighbor routers that a connection is still active. More importantly, the information exchanged is the state of a router's links, not the contents of the routing table. This means that link-state algorithms use fewer network resources than their vector-distance counterparts, particularly when the routing is complex or the autonomous system is large. They are, however, compute-intensive. In return, users get faster response to network events, faster route convergence, and access to more advanced network services.

3.3.2 The Hello Protocol

This was used in the ``Fuzzball'' software for LSI/11 minicomputers, which were widely used in Internet experimentation. The Hello protocol is described in RFC 891 - DCN Local-Network Protocols. It is not an Internet standard.

Note: OSPF (see Open Shortest Path First Protocol (OSPF) Version 2) includes a quite separate protocol for negotiation between routers which is also called the Hello protocol.

The communication in the Hello protocol is via Hello messages which are carried via IP datagrams. Hello uses protocol number 63 (reserved for ``any local network'').

The Hello protocol is significant partly because of its wide deployment during the early expansion of the Internet and partly because it provides an example of a vector-distance algorithm that does not use hop counts like RIP (see Routing Information Protocol Version 1 (RIP, RIP-1)) but, instead, network delays as a metric for the distance.

A Distributed Computer Network (DCN) physical host is a PDP11-compatible processor which supports a number of cooperating sequential processses, each of which is given a unique 8-bit identifier called its port ID. Every DCN host contains one or more internet processes, each of which supports a virtual host given a unique 8-bit identifier called its host ID. There is a one-to-one correspondence between internet addresses and host IDs. Each DCN physical host is identified by a host ID for the purpose of detecting loops in routing updates, which establish the minimum-delay paths between the virtual hosts.

Each physical host contains two tables:

Host Table
This contains estimates of round-trip delay and logical-clock offset (that is, the difference between the logical clock of this host and the logical clock of the sender's host). It is indexed by the host number. The host table is maintained dynamically using updates generated by periodic (from 1 to 30 seconds) Hello messages.
Net Table
This contains an entry for every neighbor network that may be connected to the local network and certain other networks that are not neighbors. Each entry contains the network number, as well as the host number of the router (located on the local network) to that network. The Net table is fixed at configuration time for all hosts except those that support the GGP or EGP routing protocols. In these cases the Net table is updated as part of the routing operation.
In addition, entries in either table can be changed by operator commands.

The delay and offset estimates are updated by Hello messages exchanged on the links connecting physical neighbors.

Here is the format of a Hello message:


Figure: Hello Message Format

Where:

Checksum
contains a checksum covering the fields indicated
Date
is the local host's date
Time
is the local host's time
Timestamp
used in round-trip calculation (see below)
L Offset
contains the offset of the block of entries of internet addresses used on the local network
#hosts
contains the number of entries from the host table that follows
Delay n
delay to reach host n
Offset n
offset from host n (difference between clocks)

Let us consider the two main steps of the Hello protocol.

3.3.2.1 Round-Trip Delay Calculation

Periodically each host sends a Hello message to its neighbor on each of the communication links common to both of them. For each of these links the sender keeps a set of state variables, including a copy of the source-address field of the last Hello message received. When constructing a Hello message the sender sets the destination-address field to this state variable and the source-address field to its own address. It then fills in the date and time fields from its clock and the time stamp from another state variable. It finally copies the delay and offset values from its host table into the message.

Round-trip delay calculations are performed on the host receiving the Hello message. Each link has an internal state variable assigned, which is updated as each Hello message is received; this variable takes the value of the time field, minus the current time-of-day. When the next Hello message is transmitted, the value assigned to the time stamp field is computed as the low-order 16-bits of this variable minus the current time-of-day. The round trip delay is computed as the low-order 16-bits of the current time-of-day minus the value of the timestamp field.

3.3.2.2 Host Updates

When a Hello message arrives which results in a valid round trip-delay calculation, a host update process is performed. This consists of adding the round trip delay to each of the ``Delay n'' entries in the Hello message in turn and comparing each of these calculated delays to the delay field of the corresponding host table. Each entry is then updated according to the following rules:

The purpose of the switching threshold is to avoid (together with minimum delay specification) unnecessary switching between links and transient loops which can occur due to normal variations in propagation delays.

Please refer to RFC 891 for more details.

3.3.3 Routing Information Protocol (RIP)

There are two versions of RIP. Version 1 (RIP-1) is a widely deployed protocol with a number of known limitations. Version 2 (RIP-2) is an enhanced version designed to alleviate the limitations of RIP while being highly compatible with it. The term RIP is used to refer to Version 1, while RIP-2 refers to Version 2. Whenever the reader encounters the term RIP in TCP/IP literature, it is safe to assume that it is referring to Version 1 unless explicitly stated otherwise. We shall use this nomenclature in this section except when the two versions are being compared, when we shall use the term RIP-1 to avoid possible confusion.

3.3.3.1 Routing Information Protocol Version 1 (RIP, RIP-1)

RIP is a standard protocol (STD 34). Its status is elective. It is described in RFC 1058, although many RIP implementations pre-date this RFC by a number of years. RIP is generally implemented with a daemon named routed. RIP is also supported by gated daemons.

RIP was based on the Xerox PUP and XNS routing protocols. It is widely used, as the code is incorporated in the routing code of Berkeley BSD UNIX which provides the basis for many UNIX implementations.

RIP is a straightforward implementation of vector-distance routing for local networks. RIP communication uses UDP as a transport protocol, with port number 520 as the destination port (see User Datagram Protocol (UDP) for a description of UDP and ports). RIP operates in one of two modes: active (normally used by routers) and passive (normally used by hosts). The difference between the two is explained below. RIP messages are sent in UDP datagrams and each contains up to 25 pairs of numbers as shown in Figure - RIP Message.


Figure: RIP Message - Between 1 and 25 routes may be listed in a RIP message. With 25 routes the message is 504 bytes long (25x20+4) which is the maximum size message that can be transmitted in a 512-byte UDP datagram.

Command
is 1 for a RIP request or 2 for a RIP reply.
Version
is 1.
Address Family
is 2 for IP addresses.
IP address
is the IP address for this routing entry: either a host or a subnet (in which case the host number is zero).
Hop count metric
is the number of hops to the destination. The hop count for a directly connected interface is 1, and each intermediate router increments it by 1 to a maximum of 15, with 16 indicating that no route exists to the destination.

Both active and passive RIP participants listen to all broadcast messages and update their routing table according to the vector-distance algorithm described earlier.

Basic Operation
Limitations

RIP is not designed to solve every possible routing problem. RFC 1720 (STD 1) describes these technical limitations of RIP as ``serious'' and the IETF is evaluating candidates for a new standard ``open'' protocol to replace RIP. Possible candidates include OSPF (see Open Shortest Path First Protocol (OSPF) Version 2) and OSI IS-IS (see OSI Intermediate System to Intermediate System (IS-IS)). However, RIP is widely deployed and therefore is unlikely to be completely replaced for some time. RIP has the following specific limitations:

Solving the counting to infinity problem is done by using the split horizon, poisoned reverse and triggered updates techniques.

Split horizon with poisoned reverse

Let's consider our example network (shown in Figure - The Counting to Infinity Problem) again.


Figure: The Counting to Infinity Problem - All links have a metric of 1 except for the indirect route from C to D which has a metric of 10.

As described in Vector-Distance the problem was caused by the fact that A and C are engaged in a pattern of mutual deception. Each claims to be able to reach D via the other. This can be prevented by being more careful about where information is sent. In particular, it is never useful to claim reachability for a destination network to the neighbor from which the route was learned (reverse routes). The split horizon with poisoned reverse scheme includes routes in updates sent to the router from which they were learned, but sets their metrics to infinity. If two routers have routes pointing at each other, advertising reverse routes with a metric of 16 will break the loop immediately. If the reverse routes are simply not advertised (this scheme is called simple split horizon), the erroneous routes will have to be eliminated by waiting for a timeout. Poisoned reverse does have a disadvantage: it increases the size of the routing messages.

Triggered updates

Split horizon with poisoned reverse will prevent any routing loop that involves only two gateways. However, it is still possible to end up with patterns in which three routers are engaged in mutual deception. For example, A may believe it has a route through B, B through C, and C through A. This cannot be solved using split horizon. This loop will only be resolved when the metric reaches infinity and the network or host involved is then declared unreachable. Triggered updates are an attempt to speed up this convergence. Whenever a router changes the metric for a route, it is required to send update messages almost immediately, even if it is not yet time for one of the regular update messages (RIP specifies a small time delay, between 1 and 5 seconds, in order to avoid having triggered updates generate excessive network traffic).

3.3.3.2 Routing Information Protocol Version 2 (RIP-2)

RIP-2 is a draft standard protocol. Its status is elective. It is described in RFC 1723.

RIP-2 extends RIP-1. It is less powerful than other recent IGPs such as OSPF (see Open Shortest Path First Protocol (OSPF) Version 2) and IS-IS (see OSI Intermediate System to Intermediate System (IS-IS)), but it has the advantages of easy implementation and lower overheads. The intention of RIP-2 is to provide a straightforward replacement for RIP which can be used on small to medium-sized networks, can be employed in the presence of variable subnetting (see Subnets) or supernetting (see Classless Inter-Domain Routing (CIDR)) and importantly, can interoperate with RIP-1.

RIP-2 takes advantage of the fact that half of the bytes in a RIP-1 message are reserved (must be zero) and that the original RIP-1 specification was well designed with enhancements in mind, particularly in the use of the version field. One notable area where this is not the case is in the interpretation of the metric field. RIP-1 specifies it as being a value between 0 and 16 stored in a four-byte field. For compatibility, RIP-2 preserves this definition, meaning that it agrees with RIP-1 that 16 is to be interpreted as infinity, and wastes most of this field.

Note: Neither RIP-1 nor RIP-2 are properly suited for use as an IGP in an AS where a value of 16 is too low to be regarded as infinity, because high values of infinity exacerbate the counting to infinity problem. The more sophisticated Link-State protocol used in OSPF and IS-IS provides a much better routing solution when the AS is large enough to have a legitimate hop count close to 16.

Provided that a RIP-1 implementation obeys the specification in RFC 1058, RIP-2 can interoperate with RIP-1. The RIP message format is extended as shown in Figure - RIP-2 Message.


Figure: RIP-2 Message - The first entry in the message may be an authentication entry, as shown here, or it may be a route as in a RIP-1 message. If the first entry is an authentication entry, only 24 routes may be included in a message; otherwise the maximum is 25 as in RIP-1.

The fields in a RIP-2 message are the same as for a RIP-1 message except as follows:

Version
Is 2. This tells RIP-1 routers to ignore the fields designated as ``must be zero'' (if the value is 1, RIP-1 routers are required to discard messages with non-zero values in these fields since the messages originate with a router claiming to be RIP-1-compliant but sending non-RIP-1 messages).
Address Family
May be X'FFFF' in the first entry only, indicating that this entry is an authentication entry.
Authentication Type
Defines how the remaining 16 bytes are to be used. The only defined types are 0 indicating no authentication and 2 indicating that the field contains password data.
Authentication Data
The password is 16 bytes, plain text ASCII, left adjusted and padded with ASCII NULLs (X'00').
Route Tag
Is a field intended for communicating information about the origin of the route information. It is intended for interoperation between RIP and other routing protocols. RIP-2 implementations must preserve this tag, but RIP-2 does not further specify how it is to be used.
Subnet Mask
The subnet mask associated with the subnet referred to by this entry.
Next Hop
A recommendation about the next hop that the router should use to send datagrams to the subnet or host given in this entry.

To ensure safe interoperation with RIP, RFC 1723 specifies the following restrictions for RIP-2 routers sending over a network interface where a RIP-1 router may hear and operate on the RIP messages.

  1. Information internal to one network must never be advertised into another network.
  2. Information about a more specific subnet may not be advertised where RIP-1 routers would consider it a host route.
  3. Supernet routes (routes with a subnet mask shorter than the natural or ``unsubnetted'' network mask) must not be advertised where they could be misinterpreted by RIP-1 routers.

RIP-2 also supports the use of multicasting rather than simple broadcasting. This can reduce the load on hosts which are not listening for RIP-2 messages. This option is configurable for each interface to ensure optimum use of RIP-2 facilities when a router connects mixed RIP-1/RIP-2 subnets to RIP-2-only subnets. Similarly, the use of authentication in mixed environments can be configured to suit local requirements.

RIP-2 is implemented in recent versions of the gated daemon, often termed gated Version 3. Since the draft standard is new at the time of writing, many implementations will comply with the earlier version described in RFC 1388. Such implementations will interoperate with those adhering to RFC 1723.

For more information on RIP-2, see:

3.3.4 Open Shortest Path First Protocol (OSPF) Version 2

Note: The term OSPF is invariably used to refer to OSPF Version 2 (OSPF-2). OSPF Version 1, which is described in RFC 1131, is obsolete.

OSPF is a draft standard protocol. Its status is elective, but RFC 1370 contains an applicability statement for OSPF which says that any router implementing a protocol other than simple IP-based routing must implement OSPF (this does not preclude a router implementing other protocols as well, of course). OSPF is described in RFC 1583, which obsoletes RFC 1247. OSPF implementations based on RFC 1583 are backward-compatible with implementations based on RFC 1247 and will interoperate with them. Readers interested in the development of OSPF Version 2 from Version 1 should refer to Appendix F of RFC 1247 and Appendix E of RFC 1583.

OSPF is an interior routing protocol, but it is designed to operate with a suitable exterior protocol, such as BGP. See BGP OSPF Interaction.

OSPF is a complex standard when compared to RIP: RFC 1583 runs to 216 pages, whereas RIP, specified in RFC 1058 has 33 pages and RIP-2 (RFC 1723) adds only another 9. Much of the complexity of OSPF is directed towards a single purpose: ensuring that the topological databases are the same for all of the routers within an area. Because the database is the basis for all routing choices, if routers were to have independent databases, they could make mutually conflicting decisions.

OSPF communicates using IP (it is protocol number 89). It is a Link-State, Shortest Path First protocol as described in Link-State, Shortest Path First. OSPF supports different kinds of networks such as point-to-point networks, broadcast networks, such as Ethernet and token-ring, and non-broadcast networks, such as X.25.

The OSPF specification makes use of state machines to define the behavior of routers complying with the protocol. Aspects of a router's operation which are important to OSPF, such as its network interfaces and its neighboring routers, are described as being in one of a finite number of states (for example, a neighbor may be in the down state). There is a separate state machine for each separate component (for example, two network interfaces have separate state machines) and the state of one is independent of the state of another. The possible states are sufficient to describe all possible conditions relevant to the protocol, so a state machine is always in one, and only one, of its possible states. State changes occur only as a result of events. There is a finite set of events for each type of state machine which is sufficient to describe all possible occurrences relevant to the protocol. The behavior of the state machine in response to an event is defined for all possible combinations of state and event. For example, if the state machine for a network interface experiences an InterfaceDown event, the state machine changes to the down state unconditionally. The InterfaceDown event is generated by the OSPF implementation whenever it receives an indication from a lower-level protocol that the interface is not functioning. See RFC 1583 for a complete description of each of the state machines, their possible states and events and the changes associated with them.

Here are some definitions which are necessary to understand the sequence of operations described later in this section:

Area
A set of networks within a single autonomous system that have been grouped together. The topology of an area is hidden from the rest of the autonomous system, and each area has a separate topological database (see below). Routing within the autonomous system takes place on two levels, depending on whether the source and destination of a packet reside in the same area (intra-area routing) or different areas (inter-area routing).
  • Intra-area routing is determined only by the area's own topology. That is, the packet is routed solely on information obtained within the area; no routing information obtained outside the area can be used.
  • Inter-area routing is always done via the backbone.
  • The division of an autonomous system into areas enables a significant reduction in the volume of routing traffic required to manage the routing database for a large autonomous system.

    Backbone
    The backbone consists of those networks not contained in any area, their attached routers, and those routers that belong to multiple areas. The backbone must be logically contiguous. If it is not physically contiguous, the separate components must be connected using virtual links (see below). The backbone is responsible for the distribution of routing information between areas. The backbone itself has all the properties of an area; its topology is separate from that of the areas.
    Area Border Router
    A router connected to multiple areas. An area border router has a copy of the topological database for each area that it is connected to. An area border router is always part of the backbone. Area border routers are responsible for the propagation of inter-area routing information into the areas to which they are connected.
    Internal Router
    A router which is not an area border router.
    AS Border Router (ASBR)
    A router which exchanges routing information with routers belonging to other autonomous systems. All routers in the AS know the path to all AS boundary routers. An ASBR may be an area border router or an internal router. It need not be part of the backbone.

    Note: The nomenclature for this type of router is somewhat varied. RFC 1583, which describes OSPF uses the term AS Boundary Router. RFC 1267 and 1268 which describe BGP use the terms Border Router and Border Gateway. RFC 1340 which describes the interaction between OSPF and BGP uses the term AS Border Router. We shall use the last term consistently when describing both OSPF and BGP.

    Virtual Link
    A virtual link is part of the backbone. Its endpoints are two area border routers which share a common non-backbone area. The link is treated like a point-to-point link with metrics cost equal to the intra-area metrics between the endpoints of the links. The routing through the virtual link is done using normal intra-area routing.
    Transit Area
    An area through which a virtual route is physically connected.
    Stub Area
    An area configured to use default routing for inter-AS routing. A stub area can be configured where there is only a single exit from the area, or where any exit may be used without preference for routing to destinations outside the autonomous system. By default inter-AS routes are copied to all areas, so the use of stub areas can reduce the storage requirements of routers within those areas for autonomous systems where a lot of inter-AS routes are defined.
    Multiaccess Network
    A physical network that supports the attachment of multiple routers. Each pair of routers on such a network is assumed to be able to communicate directly.
    Hello Protocol
    The part of the OSPF protocol used to establish and maintain neighbor relationships. This is not the Hello protocol described in The Hello Protocol.
    Neighboring routers
    Two routers that have interfaces to a common network. On multiaccess networks, neighbors are dynamically discovered by the Hello protocol.

    Each neighbor is described by a state machine which describes the conversation between this router and its neighbor. A brief outline of the meaning of the states follows. See the section immediately following for a definition of the terms adjacency and designated router.

    Down
    Initial state of a neighbor conversation. It indicates that there has been no recent information received from the neighbor.
    Attempt
    A neighbor on a non-broadcast network appears down and an attempt should be made to contact it by sending regular Hello packets.
    Init
    A Hello packet has recently been received from the neighbor. However, bidirectional communication has not yet been established with the neighbor (that is, the router itself did not appear in the neighbor's Hello packet).
    2-way
    In this state, communication between the two routers is bidirectional. Adjacencies can be established, and neighbors in this state or higher are eligible to be elected as (backup) designated routers.
    ExStart
    The two neighbors are about to create an adjacency.
    Exchange
    The two neighbors are telling each other what they have in their topological databases.
    Loading
    The two neighbors are synchronizing their topological databases.
    Full
    The two neighbors are now fully adjacent; their databases are synchronized.
    Various events cause a change of state. For example, if a router receives a Hello packet from a neighbor that is down, the neighbor's state changes to init, and an inactivity timer is started. If the timer fires (that is, no further OSPF packets are received before it expires) the neighbor will return to the down state. Refer to RFC 1583 for a complete description of the states and information on the events which cause state changes.
    Adjacency
    A relationship formed between selected neighboring routers for the purpose of exchanging routing information. Not every pair of neighboring routers become adjacent. In particular, not every pair of routers will stay synchronized. If all neighbors were to be synchronized, the number of synchronized pairs on a multiaccess network such as a LAN would be n(n-1)/2 where n is the number of routers on the LAN. In large networks, the synchronization traffic would swamp the network, rendering it unusable. The concept of adjacencies is used to limit the number of synchronized pairs to 2n-1, ensuring that the amount of synchronization traffic is manageable.
    Link State Advertisement
    Refers to the local state of a router or network. This includes the state of the router's interfaces and adjacencies. Each link state advertisement is flooded throughout the routing domain. The collected link state advertisements of all routers and networks form the area's topological database.
    Flooding
    The process of ensuring that each link state advertisement is passed between adjacent routers to reach every router in the area. The flooding procedure is reliable.
    Designated Router
    Each multiaccess network that has at least two attached routers, has a Designated Router. The Designated Router generates a link state advertisement for the multiaccess network. It is elected by the Hello protocol. It becomes adjacent to all other routers on the network. Since the topological databases of all routers are synchronized through adjacencies, the Designated Router plays a central part in the synchronization process.
    Backup Designated Router
    In order to make the transition to a new Designated Router smoother, there is a Backup Designated Router for each multiaccess network. The Backup Designated Router is also adjacent to all routers on the network, and becomes Designated Router when the previous Designated Router fails. Because adjacencies already exist between the Backup Designated Router and all other routers attached to the network, new adjacencies do not have to be formed when the Backup Designated Router takes over from the Designated Router, shortening the time required for the takeover considerably. The Backup designated router is elected using the Hello protocol.
    Interface
    The connection between a router and one of its attached networks. Each interface has state information associated with it which is obtained from the underlying lower-level protocols and the OSPF protocol itself. A brief description of each state is given here. Please refer to RFC 1583 for more details, and for information on the events that will cause an interface to change its state.
    Down
    The interface is unavailable. This is the initial state of an interface.
    Loopback
    The interface is looped back to the router. It cannot be used for regular data traffic.
    Waiting
    The router is trying to determine the identity of the Designated Router or its backup.
    Point-to-Point
    The interface is to a point-to-point network or is a virtual link. The router forms an adjacency with the router at the other end.

    Note: The interfaces do not need IP addresses. Since the remainder of the internet has no practical need to see the routers' interfaces to the point-to-point link, just the interfaces to other networks, any IP addresses for the link would be needed only for communication between the two routers. To conserve the IP address space, the routers can dispense with IP addresses on the link. This has the effect of making the two routers appear to be one to IP but this has no ill effects. Such a link is called an unnumbered link.

    DR Other
    The interface is on a multiaccess network but this router is neither the Designated Router nor its backup. The router forms adjacencies with the Designated Router and its backup.
    Backup
    The router is the Backup Designated Router. It will be promoted to Designated Router if the present Designated Router fails. The router forms adjacencies with every other router on the network.
    DR
    The router itself is the Designated Router. The router forms adjacencies with every other router on the network. The router must also originate a network links advertisement for the network node.
    Type of Service (TOS) metrics
    In each type of link state advertisement, different metrics can be advertised for each IP Type of Service. A metric for TOS 0 (used for OSPF routing protocol packets) must always be specified. Metrics for other TOS values can be specified; if they are not, these metrics are assumed equal to the metric specified for TOS 0.
    Link State Database
    Also called the directed graph or the topological database. It is created from the link state advertisements generated by the routers in the area.

    Note: RFC 1583 uses the term Link State Database in preference to topological database. The former term has the advantage that it describes the contents of the database, the latter is more descriptive of the purpose of the database -- to describe the topology of the area. We have previously used the term topological database for this reason, but for the remainder of this section where we discuss the operation of OSPF in more detail, we will refer to it as the Link State Database.

    Shortest-Path Tree
    Each router runs the Shortest Path First (SPF) algorithm on the Link State Database to obtain its shortest-path tree. The tree gives the route to any destination network or host as far as the area boundary. It is used to build the routing table.

    Note: Because each router occupies a different place in the area's topology, application of the SPF algorithm gives a different tree for each router, even though the database is identical.

    Area border routers run multiple copies of the algorithm but build a single routing table.

    Routing table
    The routing table contains entries for each destination: network, subnet or host. For each destination, there is information for one or more types of service (TOS). For each combination of destination and type of service, there are entries for one or more optimum paths to be used.
    Area ID
    A 32-bit number identifying a particular area. The backbone has an Area ID of zero.
    Router ID
    A 32-bit number identifying a particular router. Each router within the AS has a single router ID. One possible implementation is to use the lowest numbered IP address belonging to a router as its router ID.
    Router Priority
    An 8-bit unsigned integer, configurable on a per-interface basis indicating this router's priority in the selection of the (backup) Designated Router. A Router Priority of zero indicates that this router is ineligible to be the Designated Router.

    3.3.4.1 Overview of OSPF Operation

    The basic sequence of operations performed by OSPF routers is:

    1. Discovering OSPF neighbors
    2. Electing the Designated Router
    3. Forming adjacencies
    4. Synchronizing databases
    5. Calculating the routing table
    6. Advertising Link States
    Routers will go through these steps when they first come up, and will repeat these steps in response to events which occur in the network. Each router must perform each of these steps for each network it is connected to, except for the calculation of the routing table. Each router generates and maintains a single routing table for all networks.

    Each of these steps is described in the following sections.

    Discovering OSPF Neighbors

    When OSPF routers start, they initiate and sustain relationships with their neighbors using the Hello protocol. The Hello protocol also ensures that communication between neighbors is bidirectional. Hello packets are sent periodically out to all router interfaces. Bidirectional communication is indicated when the router sees itself in the neighbor's Hello packet. On a broadcast network, Hello packets are sent using multicast; neighbors are then discovered dynamically. On non-broadcast networks each router that may potentially become a Designated Router has a list of all routers attached to the network and will send Hello packets to all other potential Designated Routers when its interface to the non-broadcast network first becomes operational.

    The OSPF header is described in Figure - OSPF Packet Header.


    Figure: OSPF Packet Header

    Version #
    The OSPF version number (2).
    Type
    Hello (1), database description (2), Link-State Request (3), Link-State Update (4), or Link-State Acknowledgment (5).
    Packet length
    Length of the protocol packet in bytes including the OSPF header.
    Router ID
    The ID of the router originating the packet.
    Area ID
    The area that the packet is being sent into.
    Checksum
    The standard IP checksum of the entire contents of the packet, excluding the 64-bit authentication field.
    AuType
    Identifies the authentication scheme to be used for the packet. The authentication type is configurable on a per-area basis. Additional authentication data is configurable on a per-interface basis. The authentication types currently defined are 0 (No authentication) and 1 (plain text 64-bit password).
    Authentication
    A 64-bit field for use by the authentication scheme.

    The format of the OSPF Hello packet is given in Figure - OSPF Hello Packet.


    Figure: OSPF Hello Packet

    Network Mask
    The network mask associated with this interface. This is the subnet mask if subnetting is implemented, or the equivalent mask for a non-subnetted network (for example, 255.255.255.0 for a non-subnetted Class C network).
    Dead Interval
    The number of seconds that must elapse before returning a silent neighbor to the down state.
    Hello Int (Hello Interval)
    The number of seconds between this router's Hello packets.
    Priority
    This router's Router Priority (for this interface).
    Designated Router
    The IP address of the Designated Router for this network, according to the sending router. This is set to 0 if no Designated Router is known.
    Backup Designated Router
    The IP address of the Backup Designated Router for this network, according to the sending router. This is set to 0 if no Backup Designated Router is known.
    Neighbor
    The Router IDs of each router from whom valid Hello packets have been received recently from the network. Recently means within the last Dead Interval.
    Determining the Designated Router

    This is done using the Hello protocol. A brief description of the process is given here. See RFC 1583 for a full description. The router examines the list of its neighbors, discards any with which it does not have bidirectional communication or which have a Router Priority of zero, and records the Designated Router, Backup Designated Router and Router Priority declared by each one. The router adds itself to the list, using the Router Priority configured for the interface and zero (unknown) for the Designated Router and Backup Designated Router values, if the calculating router has just come up.

    The following rules are used to determine the Backup Designated Router:

    Because the calculating router is in the list, it may determine that it is to become the Backup Designated Router. A similar process is followed for the Designated Router:

    The actual process is considerably more complex than this, because the Hello messages transmitted include changes to the fields recorded on other routers, and these changes cause events in those routers which in turn will trigger state changes or other actions. The intent behind the mechanism is twofold:

    The algorithm does not always result in the router with the highest priority being the Designated Router, nor in the one with the second highest priority being the Backup.

    The Designated Router has the following responsibilities:

    The Backup Designated Router has the following responsibility

    Forming adjacencies

    After a neighbor has been discovered, bidirectional communication ensured, and (on a multiaccess network) a Designated Router elected, a decision is made regarding whether or not an adjacency should be formed with the neighbor:

    If the decision is made to not form an adjacency, the state of the neighbor communication remains in the 2-way state.

    Adjacencies are established using Database Description packets. These contain a summary of the sender's link state database. Multiple packets may be used to describe the database: for this purpose a poll-response procedure is used. The router with the higher router ID will become the master, the other will become the slave. Database Description packets sent by the master (polls) are acknowledged by Database Description packets sent by the slave (responses). The packets contain sequence numbers to ensure a match between polls and responses. This is called the Database Exchange Process.

    The format of the OSPF Database Description packet is shown in Figure - OSPF Database Description Packet.


    Figure: OSPF Database Description Packet

    0
    Reserved, must be 0.
    I bit
    Init bit. Set to 1 when the packet is the first in the sequence of database description.
    M bit
    More bit. Indicates that more data descriptions are to follow.
    MS bit
    Master/slave bit. Set to 1 when the router is the master, 0 when it is the slave.
    DD sequence number
    Used to sequence the collection of database description packets.

    The rest of the packet contains a list of some or all of the contents of the topological database. Each item in the database is a link state advertisement. The database description packets contain the headers from these advertisements. The headers are sufficient to uniquely identify each advertisement. This information is used in the subsequent database synchronization. The format of a Link State Header is shown in Figure - OSPF Link State Advertisement Header.


    Figure: OSPF Link State Advertisement Header

    The fields in the link state advertisement header are:

    LS age
    A 16-bit number indicating the time in seconds since the origin of the advertisement. It is increased as the link state advertisement resides in a router's database and with each hop it travels as part of the flooding procedure. When it reaches a maximum value, it ceases to be used for determining routing tables and is discarded unless it is still needed for database synchronization. The age is also to determine which of two otherwise identical copies of an advertisement a router should use.
    Options
    Two bits which describe optional OSPF capabilities. The E-bit indicates an external routing capability: it is set unless the advertisement is for a router, network links or summary link in a stub area. The E-bit is used for information only and does not affect the routing table. The T-bit indicates that the advertisement describes paths for types of service in addition to TOS 0.
    LS type
    The types of the link state advertisement are:
    1
    Router links. These describe the states of a router's interfaces.
    2
    Network links. These describe the routers attached to a network.
    3
    Summary links. These describe inter-area, intra-AS routes. They are created by area border routers and allow routes to networks within the AS but outside the area to be described concisely.
    4
    Summary links. These describe routes to the boundary of the AS (that is, to AS boundary routers). They are created by area border routers. They are very similar to type 3.
    5
    AS external links. These describe routes to networks outside the AS. They are created by AS boundary routers. A default route for the AS can be described this way.
    Link State ID
    A unique ID for the advertisement which is dependent on the Link State Type. For types 1 and 4 it is the Router ID, for types 3 and 5 it is an IP network number, and for type 2 it is the IP address of the Designated Router.
    Advertising Router
    The Router ID of the router that originated the link state advertisement. For type 1 advertisements, this field is identical to the Link State ID. For type 2, it is the Router ID of the network's Designated Router. For types 3 and 4, it is the Router ID of an area border router. For type 5, it is the Router ID of an AS boundary router.
    LS sequence number
    Used to allow detection of old or duplicate link state advertisements.
    LS checksum
    Checksum of the complete link state advertisement excluding the LS age field.
    Synchronization of databases

    After the Database Exchange Process is over, each router has a list of those link advertisements for which the neighbor has more up-to-date instances. These are then requested in Link State Request packets. The response to a Link State Request packet is a Link State Update packet which contains some or all of the link state advertisements requested. At most one Link State Request can be outstanding: if no response is received, the requester must retry the request.

    Link state advertisements come in five formats. The format of a Router Links Advertisement (Type 1) is shown in Figure - OSPF Router Links Advertisement.


    Figure: OSPF Router Links Advertisement - This advertisement is itself encapsulated in an OSPF packet.

    V Bit
    When set, this router is the endpoint of a virtual link which is using this area as a transit area.
    E Bit
    When set, the router is an AS boundary router.
    B Bit
    When set, the router is an area border router.
    # links
    The number of links described by this advertisement.
    Link ID
    Identifies the object that this link connects to. The value depends upon the type field (see below).
    1
    Neighboring router's Router ID
    2
    IP Address of the Designated Router
    3
    This value depends on what the inter area route is to:
  • For a stub network it is the IP network/subnet number
  • For a host it is X'FFFFFFFF'
  • For the AS-external default route it is X'00000000'
  • 4
    Neighboring router's Router ID
    Link Data
    This value also depends upon the type field (see RFC 1583 for details).
    Type
    What this link connects to.
    1
    Point-to-point connection to another router
    2
    Connection to a transit network
    3
    Connection to a stub network or to a host
    4
    Virtual link
    # metric
    The number of different TOS metrics given for this link in addition to the metric for TOS 0.
    TOS 0 metric
    The cost of using this outbound link for TOS 0. All OSPF routing protocol packets are sent with the IP TOS field set to 0.
    TOS
    IP type of service that this metric refers to. RFC 1349 defines the possible TOS values in an IP header (see also Internet Protocol (IP)) using a 4-bit sequence. OSPF encodes these by treating the sequence as a number and doubling it (there is a reserved bit of 0 immediately following the TOS value field in the IP datagram, so OSPF allows for its future inclusion in the TOS value). There are five defined values:


    Table: Type of Service Values

    metric
    The cost of using this outbound router link for traffic of the specified Type of Service.

    As an example, suppose the point-to-point link between routers RT1 (IP address: 192.1.2.3) and RT6 (IP address: 6.5.4.3) is a satellite link. To encourage the use of this line for high bandwidth traffic, the AS administrator may set an artificially low metric for that TOS. Router RT1 would then originate the following router links advertisement (assuming RT1 is an area border router and is not an AS boundary router):

    ; RT1's router links advertisement
    
    LS age = 0                      ; always true on origination
    LS type = 1                     ; indicates router links
    Link State ID = 192.1.2.3       ; RT1's Router ID
    Advertising Router = 192.1.2.3  ; RT1
    bit E = 0                       ; not an AS boundary router
    bit B = 1                       ; area border router
    #links = 1
           Link ID = 6.5.4.3        ; neighbor router's Router ID
           Link Data = 0.0.0.0      ; interface to unnumbered SL
           Type = 1                 ; connects to router
           # other metrics = 1
           TOS 0 metric = 8
               TOS = 2              ; high bandwidth
               metric = 1           ; traffic preferred
    
    The format of a Network Links Advertisement (Type 2) is shown in Figure - OSPF Network Links Advertisement.


    Figure: OSPF Network Links Advertisement - This advertisement is itself encapsulated in an OSPF packet.

    Network Mask
    The IP address mask for the network. For example a Class A network would have the mask 255.0.0.0.
    Router ID 1-n
    The IP addresses of all routers on the network which are adjacent to the Designated Router (including the sending router). The number of routers in the list is deduced from the length field in the header.
    The format of a Summary Links Advertisement (Type 3 or 4) is shown in Figure - OSPF Summary Links Advertisement.


    Figure: OSPF Summary Links Advertisement - This advertisement is itself encapsulated in an OSPF packet.

    Network Mask
    For a Type 3 link state advertisement, this is the IP address mask for the network. For a Type 4 link state advertisement, this is not meaningful and must be zero.
    TOS 0
    zero
    metric
    The cost of this route for this type of service in the same units used for TOS metrics in Type 1 advertisements.
    TOS x
    Zero or more entries for additional types of service. The number of entries can be determined from the length field in the header.
    The format of an External Links Advertisement is shown in Figure - OSPF External Links Advertisement.


    Figure: OSPF External Links Advertisement - This advertisement is itself encapsulated in an OSPF packet.

    Network Mask
    The IP address mask for the network.
    Bit E
    The type of external metric. If set, the type is 2, otherwise it is 1.
    1
    The metric is directly comparable to OSPF Link State metrics
    2
    The metric is considered larger than all OSPF Link State metrics
    TOS 0
    zero
    metric
    The cost of this route. Interpretation depends on the E-bit.
    Forwarding Address
    The IP address that data traffic for this type of service intended for the advertised destination is to be forwarded to. The value zero indicates that traffic should be forwarded to the AS boundary router that originated the advertisement.
    External Route Tag
    A 32-bit value attached to the external route by an AS boundary router. OSPF does not define or use this value, but see BGP OSPF Interaction for its use when BGP is being used as an external routing protocol.
    TOS x
    Zero or more entries for additional types of service. The number of entries can be determined from the length field in the header.

    When all Link State Request packets have been answered, the databases are synchronized and the routers are described as fully adjacent. This adjacency is now added to the two routers' link state advertisements.

    Calculating the routing table

    Using its attached areas' link state databases as input, a router runs the SPF algorithm to build its routing table. The routing table is always built from scratch: updates are never made to an existing routing table. An old routing table is not discarded until changes between the two tables have been identified. Briefly, the calculation consists of the steps listed below. See RFC 1583 for more details about how the algorithm is implemented.

    1. The intra-area routes are calculated building the shortest path tree for each attached area using the router itself as the root of the tree. The router also calculates whether the area can act as a transit area for virtual links.
    2. The inter-area routes are calculated through examination of summary link advertisements. For area border routers (which are part of the backbone) only link advertisements corresponding to the backbone are used (that is, an area border router will always route inter-area traffic through the backbone).
    3. If the router is connected to one or more transit areas, the router replaces any routes it has calculated by routes through transit areas if these are superior.
    4. AS external routes are calculated through examination of AS external link advertisements. The locations of the AS boundary routers are already known because these are determined like any other intra-area or inter-area routes.

    When the algorithm produces multiple equal cost routes, OSPF can distribute the load across them evenly. The maximum supported number of equal cost routes is implementation dependent.

    Advertising Link States

    A router periodically advertises its link state, so the absence of a recent advertisement indicates to a router's neighbors that the router is down. All routers which have established bidirectional communication with a neighbor run an inactivity timer to detect such an occurrence. If the timer is not reset, it will eventually pop, and the associated event places the state machine corresponding to that neighbor in the down state. This means that communication must be re-established from the beginning, including re-synchronization of databases. A router also re-issues its advertisements when its state changes.

    A router can issue several link state advertisements into each area. These are propagated throughout the area by the flooding procedure. Each router issues a Router Links Advertisement. If the router is also the Designated Router for one or more of the networks in the area, then it will originate Network Links Advertisements for those networks. Area border routers issue one Summary Link Advertisement for each known inter-area destination. AS boundary routers originate one AS External Link Advertisement for each known external destination. Destinations are advertised one at a time so that the change in any single route can be flooded without reflooding the entire collection of routes. During the flooding procedure, many link state advertisements can be carried by a single Link State Update packet.

    3.3.4.2 Summary of Features for OSPF

    OSPF is a complex routing protocol, as will be clear from the preceding sections. The benefits of this complexity (over RIP) are as follows:

    References

    A detailed description of OSPF can be found in the following RFCs:

    3.3.4.3 OSI Intermediate System to Intermediate System (IS-IS)

    Intermediate System to Intermediate System (IS-IS) is a similar protocol to OSPF: it also uses a Link State, Shortest Path First algorithm (see Link-State, Shortest Path First for more details). However, IS-IS is an OSI protocol used for routing Connectionless Network Protocol (CLNP) packets within a routing domain. CLNP is the OSI protocol most comparable to IP.

    Integrated IS-IS extends IS-IS to encompass TCP/IP. Integrated IS-IS is described in RFC 1195. Its goal is to provide a single (and efficient) routing protocol for TCP/IP and for OSI. Its design makes use of the OSI IS-IS routing protocol, augmented with IP-specific information, and provides explicit support for IP subnetting, variable subnet masks, TOS-based (type of service) routing, and external routing. It provides a provision for authentication information. Integrated IS-IS is based on the same SPF routing algorithm as OSPF.

    Integrated IS-IS does not employ mutual encapsulation of IP and CLNP packets: both types are forwarded ``as-is'', nor does it change the behavior of the router as expected by either protocol suite. Integrated IS-IS behaves like an IGP in a TCP/IP network and in an OSI network. The only change is the addition of additional IP-related information.

    IS-IS uses the term Intermediate System (IS) to refer to an IS-IS router, but we shall use the term router, since this is freely used in the Integrated IS-IS standard.

    IS-IS groups networks into domains in a fashion that is analogous to OSPF. A routing domain is analogous to an Autonomous system, and it is subdivided into areas just like OSPF. Here is an overview of some of the more important aspects of IS-IS routing. Where possible, comparisons are made with equivalent concepts used in OSPF but it is dangerous to draw too close a parallel, since there are fundamental differences between the two protocols.

    Integrated IS-IS permits considerable mixing of the two protocol suites, subject to certain restrictions on the topology. Three types of router are defined:

    IP-only
    A router which uses IS-IS as the routing protocol for IP and does not otherwise support OSI protocols (for example, such routers would not be able to forward OSI CLNP packets).
    OSI-only
    A router which uses IS-IS as the routing protocol for OSI but not IP.
    Dual
    A router which uses IS-IS as a single integrated routing protocol for both IP and OSI.
    It is possible to have a mixed domain containing IS-IS routers, some of which are IP-only, some OSI-only and some are dual. Each area within a mixed domain is configured to be OSI, IP or dual. Areas which are to carry mixed traffic must have dual routers for all of the Level 1 routers. Similarly, the Level 2 routers in a mixed domain must all be dual routers if mixed traffic is to be routed between areas.

    3.3.4.4 Co-existence of TCP/IP and OSI Routing Protocols without IS-IS

    As its name suggests, Integrated IS-IS offers an integrated routing solution for multi-protocol networks. OSPF, like other TCP/IP routing protocols, uses an approach termed Ships In the Night (SIN) to handle coexistence issues. In the SIN approach, each multiprotocol router runs a separate process for each network layer (IP and OSI). A SIN router allows network managers to insert new SIN-based routing protocols, such as OSPF, one by one in the network, but the protocols exist independently of one another, and their frames pass each other like ships in the night.

    Since the customer base of independent router vendors remains largely TCP/IP-focused, most of these vendors are choosing, for now, to stick with SIN even if it means their routers will not be able to work in OSI networks. A few of them have announced that they will support Integrated IS-IS in the future.

    Table of Contents Exterior Routing Protocols