TCP/IP Tutorial and Technical Overview

2.8 Address Resolution Protocol (ARP)

Figure: Address Resolution Protocol (ARP)

The ARP protocol is a network-specific standard protocol. Its status is elective.

The address resolution protocol is responsible for converting the higher-level protocol addresses (IP addresses) to physical network addresses. First, let's consider some general topics on Ethernet.

2.8.1 Ethernet versus IEEE 802.3

Two frame formats can be used on the Ethernet coaxial cable:

- The standard issued in 1978 by Xerox Corporation, Intel Corporation and Digital Equipment Corporation, usually called Ethernet (or DIX Ethernet).
- The international IEEE 802.3 standard, a more recently defined standard.

The difference between the two standards is in the use of one of the header fields, which contains a protocol-type number for Ethernet and the length of the data in the frame for IEEE 802.3.

Figure: Frame Formats for Ethernet and IEEE 802.3

The type field in Ethernet is used to distinguish between different protocols running on the coaxial cable, and allows their coexistence on the same physical cable.
The maximum length of an Ethernet frame is 1526 bytes (counting the 8-byte preamble, the 14-byte header and the 4-byte frame check sequence), which leaves a data field of up to 1500 bytes. The length of the 802.3 data field is also limited to 1500 bytes for 10 Mbps networks, but is different for other transmission speeds.
In the 802.3 MAC frame, the length of the data field is indicated in the 802.3 header. The type of protocol it carries is then indicated in the 802.2 header (a higher protocol level); see Figure - Frame Formats for Ethernet and IEEE 802.3.
In practice, however, both frame formats can coexist on the same physical coax. This is done by using protocol type numbers (type field) greater than 1500 in the Ethernet frame, so that a type value can never be mistaken for a valid 802.3 length. However, different device drivers are needed to handle each of these formats.

Thus, for all practical purposes, the Ethernet physical layer and the IEEE 802.3 physical layer are compatible. However, the Ethernet data link layer and the IEEE 802.3/802.2 data link layer are incompatible.
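
The coexistence rule above can be sketched in a few lines of Python. This is only an illustration: the function name classify_frame and the constant MAX_8023_LENGTH are hypothetical, and the frame is assumed to start at the destination address (preamble already stripped by the adapter).

```python
import struct

# Largest value a valid IEEE 802.3 length field can hold (per the text above).
MAX_8023_LENGTH = 1500


def classify_frame(frame: bytes) -> str:
    """Return 'ethernet' or 'ieee802.3' for a frame that starts at the
    destination address (hypothetical helper, for illustration only)."""
    if len(frame) < 14:
        raise ValueError("frame shorter than the 14-byte MAC header")
    # Bytes 12-13 follow the two 6-byte addresses and are sent in network
    # (big-endian) byte order.
    (type_or_length,) = struct.unpack("!H", frame[12:14])
    if type_or_length > MAX_8023_LENGTH:
        return "ethernet"      # value is a protocol type number
    return "ieee802.3"         # value is the length of the data field
```

A receiving driver would still need separate code paths for the two formats; the check only tells it which path to take.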

The 802.2 Logical Link Control (LLC) layer above IEEE 802.3 uses a concept known as Link Service Access Point (LSAP) which uses a 3-byte header:

Figure: IEEE 802.2 LSAP Header

where DSAP and SSAP stand for Destination and Source Service Access Point respectively. Numbers for these fields are assigned by an IEEE committee.

Due to a growing number of applications using IEEE 802 as lower protocol layers, an extension was made to the IEEE 802.2 protocol in the form of the Sub-Network Access Protocol (SNAP). It is an extension to the LSAP header above, and its use is indicated by the value 170 in both the SSAP and DSAP fields of the LSAP frame above.

Figure: IEEE 802.2 SNAP Header

In the evolution of TCP/IP, three standards were established, which describe the encapsulation of IP and ARP frames on these networks:

1984: RFC 894 - Standard for the Transmission of IP Datagrams over Ethernet Networks specifies only the use of Ethernet type of networks. The values assigned to the type field are:

- 2048 (hex 0800) for IP datagrams
- 2054 (hex 0806) for ARP datagrams
1985: RFC 948 - Two Methods for the Transmission of IP Datagrams over IEEE 802.3 Networks specifies two possibilities:
1. The Ethernet-compatible method: the frames are sent on a real IEEE 802.3 network in the same fashion as on an Ethernet network, that is, using the IEEE 802.3 data-length field as the Ethernet type field. This violates the IEEE 802.3 rules, but remains compatible with an Ethernet network.
2. IEEE 802.2/802.3 LLC type 1 format: using the 802.2 LSAP header, with IP using the value 6 for the SSAP and DSAP fields.
The RFC indicates clearly that the IEEE 802.2/802.3 method is the preferred method, that is, all future IP implementations on IEEE 802.3 networks are supposed to use the second method.
1987: RFC 1010 - Assigned Numbers (now obsoleted by RFC 1700 dated 1994) notes that as a result of IEEE 802.2 evolution and the need for more internet protocol numbers, a new approach was developed based on practical experiences exchanged during the August 1986 TCP Vendors Workshop. It states in an almost completely overlooked part of this RFC that from now on all IEEE 802.3, 802.4 and 802.5 implementations should use the Sub-Network Access Protocol (SNAP) form of the IEEE 802.2 LLC: DSAP and SSAP fields set to 170 (indicating the use of SNAP) and then SNAP assigned as follows:
- 0 (zero) as organization code.
- EtherType field:
  - 2048 (hex 0800) for IP datagrams
  - 2054 (hex 0806) for ARP datagrams
  - 32821 (hex 8035) for RARP datagrams
  These are the same values as used in the Ethernet type field.
1988: RFC 1042 - Standard for the Transmission of IP Datagrams over IEEE 802 Networks.
As this new approach (very important for implementations) passed almost unnoticed in a little note of an unrelated RFC, it became quite confusing, and finally, in February 1988, it was repeated in an RFC on its own: RFC 1042, which obsoletes RFC 948.

The relevant IBM TCP/IP products implement RFC 894 for DIX Ethernet and RFC 1010 (now RFC 1700) for IEEE 802.3 networks.

However, in practical situations, there are still TCP/IP implementations that use the older LSAP method (RFC 948). Such implementations will not communicate with the more recent SNAP-based implementations (such as IBM's).

Note also that the last method covers not only the IEEE 802.3 networks, but also the IEEE 802.4 and 802.5 networks such as the IBM Token-Ring LAN.
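
As a rough illustration of the SNAP form described above, the following Python sketch builds the 8-byte LLC/SNAP header that precedes an IP or ARP datagram. The function and constant names are hypothetical. The control value 3 (unnumbered information) is not stated in this section; it is an assumption taken from RFC 1042.

```python
import struct

SNAP_SAP = 170                  # hex AA in both DSAP and SSAP signals SNAP
LLC_UI_CONTROL = 3              # assumption from RFC 1042: unnumbered information
ORG_CODE = b"\x00\x00\x00"      # organization code 0 (zero)

ETHERTYPE_IP = 0x0800           # 2048
ETHERTYPE_ARP = 0x0806          # 2054
ETHERTYPE_RARP = 0x8035         # 32821


def snap_header(ethertype: int) -> bytes:
    """Build the 3-byte LSAP header followed by the 5-byte SNAP extension."""
    lsap = struct.pack("!BBB", SNAP_SAP, SNAP_SAP, LLC_UI_CONTROL)
    snap = ORG_CODE + struct.pack("!H", ethertype)
    return lsap + snap


if __name__ == "__main__":
    print(snap_header(ETHERTYPE_ARP).hex())
```

Because the EtherType values are the same as those in the DIX Ethernet type field, the same protocol dispatch table can serve both encapsulations once the header has been stripped.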

2.8.2 ARP Overview

On a single physical network, individual hosts are known on the network by their physical hardware address. Higher-level protocols address destination hosts in the form of a symbolic address (IP address in this case). When such a protocol wants to send a datagram to destination IP address w.x.y.z, the device driver does not understand this address.

Therefore, a module (ARP) is provided that will translate the IP address to the physical address of the destination host. It uses a lookup table (sometimes referred to as the ARP cache) to perform this translation.

When the address is not found in the ARP cache, a broadcast is sent out on the network, with a special format called the ARP request. If one of the machines on the network recognizes its own IP address in the request, it will send an ARP reply back to the requesting host. The reply will contain the physical hardware address of the host and source route information (if the packet has crossed bridges on its path). Both this address and the source route information are stored in the ARP cache of the requesting host. All subsequent datagrams to this destination IP address can now be translated to a physical address, which is used by the device driver to send out the datagram on the network.

ARP was designed to be used on networks that support hardware broadcast. This means, for example, that ARP will not work on an X.25 network.
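
The lookup-then-broadcast flow described above can be modelled with a small Python sketch. It is a simplification under stated assumptions: the class names ArpCache and FakeLan are hypothetical, the broadcast is simulated by a synchronous callback instead of real network I/O, and source route information is left out.

```python
from typing import Callable, Dict, Optional


class ArpCache:
    """Toy ARP cache: IP address -> physical hardware address."""

    def __init__(self, broadcast_request: Callable[[str], Optional[str]]) -> None:
        self._table: Dict[str, str] = {}
        # Callback standing in for "send an ARP request and wait for a reply".
        self._broadcast_request = broadcast_request

    def resolve(self, ip: str) -> Optional[str]:
        hw = self._table.get(ip)
        if hw is not None:
            return hw                      # cache hit: no network traffic
        hw = self._broadcast_request(ip)   # cache miss: ask every host
        if hw is not None:
            self._table[ip] = hw           # remember the reply for next time
        return hw


class FakeLan:
    """Simulated broadcast network; only the owner of an IP address answers."""

    def __init__(self, hosts: Dict[str, str]) -> None:
        self.hosts = hosts
        self.broadcasts = 0

    def arp_request(self, target_ip: str) -> Optional[str]:
        self.broadcasts += 1
        return self.hosts.get(target_ip)


if __name__ == "__main__":
    lan = FakeLan({"192.0.2.7": "08:00:5a:00:00:07"})
    cache = ArpCache(lan.arp_request)
    cache.resolve("192.0.2.7")    # first call triggers one broadcast
    cache.resolve("192.0.2.7")    # second call is answered from the cache
    print(lan.broadcasts)
```

The sketch also shows why hardware broadcast matters: without a way to reach every host at once, the callback has nobody to ask.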

2.8.3 ARP Detailed Concept

ARP is used on IEEE 802 networks as well as on the older DIX Ethernet networks to map IP addresses to physical hardware addresses. To do this, it is closely related to the device driver for that network. In fact, the ARP specifications in RFC 826 only describe its functionality, not its implementation. The implementation depends to a large extent on the device driver for a network type and they are usually coded together in the adapter microcode.

2.8.3.1 ARP Packet Generation

If an application wishes to send data to a certain IP destination address, the IP routing mechanism first determines the IP address of the "next hop" of the packet (it can be the destination host itself, or a router) and the hardware device on which it should be sent. If it is an IEEE 802.3/4/5 network, the ARP module must be consulted to map the <protocol type, target protocol address> to a physical address.

The ARP module tries to find the address in its ARP cache. If it finds the matching pair, it gives the corresponding 48-bit physical address back to the caller (the device driver), which then transmits the packet. If it doesn't find the pair in its table, it discards the packet (the assumption is that a higher-level protocol will retransmit) and generates a network broadcast of an ARP request.

Figure: ARP Request/Reply Packet

Where:

Hardware address space: Specifies the type of hardware; examples are Ethernet or Packet Radio Net.
Protocol address space: Specifies the type of protocol, same as EtherType field in the IEEE 802 header (IP or ARP).
Hardware address length: Specifies the length (in bytes) of the hardware addresses in this packet. For IEEE 802.3 and IEEE 802.5 this will be 6.
Protocol address length: Specifies the length (in bytes) of the protocol addresses in this packet. For IP this will be 4.
Operation code: Specifies whether this is an ARP request (1) or reply (2).
Source/target hardware address: Contains the physical network hardware addresses. For IEEE 802.3 these are 48-bit addresses.
Source/target protocol address: Contains the protocol addresses. For TCP/IP these are the 32-bit IP addresses.

For the ARP request packet, the target hardware address is the only undefined field in the packet.
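
To make the field list concrete, here is a minimal Python sketch that packs an ARP request for IP over a network with 48-bit hardware addresses. The field order and the hardware address space value 1 (Ethernet) are taken from RFC 826 and the assigned numbers, not from the figure; the helper names are hypothetical, and the undefined target hardware address is filled with zeros by assumption.

```python
import socket
import struct

# hw space, proto space, hw len, proto len, opcode,
# source hw, source proto, target hw, target proto
ARP_FORMAT = "!HHBBH6s4s6s4s"

HW_ETHERNET = 1       # hardware address space (assumption: Ethernet)
PROTO_IP = 0x0800     # protocol address space, same value as the EtherType
OP_REQUEST = 1
OP_REPLY = 2


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    raw = bytes.fromhex(mac.replace(":", ""))
    if len(raw) != 6:
        raise ValueError("expected a 48-bit hardware address")
    return raw


def build_arp_request(src_mac: str, src_ip: str, target_ip: str) -> bytes:
    """Pack an ARP request; the target hardware address is left as zeros."""
    return struct.pack(
        ARP_FORMAT,
        HW_ETHERNET,
        PROTO_IP,
        6,                          # hardware address length in bytes
        4,                          # protocol address length in bytes
        OP_REQUEST,
        mac_to_bytes(src_mac),
        socket.inet_aton(src_ip),
        bytes(6),                   # target hardware address: not yet known
        socket.inet_aton(target_ip),
    )
```

The packet still has to be wrapped in an Ethernet or 802.2 SNAP frame with type 2054 (hex 0806) and sent to the broadcast address before it goes on the wire.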

2.8.3.2 ARP Packet Reception

When a host receives an ARP packet (either a broadcast request or a point-to-point reply), the receiving device driver passes the packet to the ARP module which treats it as shown in Figure - ARP Packet Reception.

Figure: ARP Packet Reception

The requesting host will receive this ARP reply, and will follow the same algorithm to treat it. As a result of this, the triplet <protocol type, protocol address, hardware address> for the desired host will be added to its lookup table (ARP cache). The next time a higher-level protocol wants to send a packet to that host, the ARP module will find the target hardware address and the packet will be sent to that host.

Note that because the original ARP request was a broadcast on the network, every host on that network that already had an entry for the sender will have updated the sender's hardware address in its table. Hosts without such an entry do not add one, unless they are the target of the request.
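
Since the reception figure is not reproduced here, the following Python sketch restates the reception algorithm as given in RFC 826, simplified to a single hardware type. The class and field names are hypothetical, and addresses are kept as strings for readability.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

OP_REQUEST, OP_REPLY = 1, 2


@dataclass(frozen=True)
class ArpPacket:
    proto_type: int
    op: int
    sender_hw: str
    sender_ip: str
    target_hw: str
    target_ip: str


class ArpModule:
    def __init__(self, my_hw: str, my_ip: str, proto_type: int = 0x0800) -> None:
        self.my_hw, self.my_ip, self.proto_type = my_hw, my_ip, proto_type
        # <protocol type, protocol address> -> hardware address
        self.cache: Dict[Tuple[int, str], str] = {}

    def receive(self, pkt: ArpPacket) -> Optional[ArpPacket]:
        """Process one ARP packet; return a reply to send, or None."""
        if pkt.proto_type != self.proto_type:
            return None                        # protocol not spoken here
        key = (pkt.proto_type, pkt.sender_ip)
        merged = key in self.cache
        if merged:
            self.cache[key] = pkt.sender_hw    # refresh an existing entry
        if pkt.target_ip != self.my_ip:
            return None                        # not addressed to this host
        if not merged:
            self.cache[key] = pkt.sender_hw    # target learns the sender
        if pkt.op != OP_REQUEST:
            return None                        # a reply needs no answer
        # Swap sender and target, fill in our own address, reply directly.
        return ArpPacket(self.proto_type, OP_REPLY, self.my_hw, self.my_ip,
                         pkt.sender_hw, pkt.sender_ip)
```

A requesting host feeds the returned reply through the same receive method, which is how the triplet for the desired host ends up in its cache.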

2.8.4 ARP and Subnets

The ARP protocol remains unchanged in the presence of subnets. Remember that each IP datagram first goes through the IP routing algorithm. This algorithm selects the hardware device driver which should send out the packet. Only then is the ARP module associated with that device driver consulted.

2.8.5 Proxy-ARP or Transparent Subnetting

Proxy-ARP is described in RFC 1027 - Using ARP to Implement Transparent Subnet Gateways, which is in fact a subset of the method proposed in RFC 925 - Multi-LAN Address Resolution. It is another method to construct local subnets, without the need for a modification to the IP routing algorithm on the hosts, but with modifications to the routers that interconnect the subnets.

2.8.5.1 Proxy-ARP Concept

Consider one IP network, which is divided into subnets, interconnected by routers. We use the "old" IP routing algorithm, which means that no host knows about the existence of multiple physical networks. Consider hosts A and B which are on different physical networks within the same IP network, and a router R between the two subnetworks:

Figure: Hosts Interconnected by a Router

When host A wants to send an IP datagram to host B, it first has to determine the physical network address of host B through the use of the ARP protocol.

As host A cannot differentiate between the physical networks, its IP routing algorithm concludes that host B is on the local physical network, and host A sends out a broadcast ARP request. Host B doesn't receive this broadcast, but router R does. Router R understands subnets, that is, it runs the "subnet" version of the IP routing algorithm, and it will be able to see that the destination of the ARP request (from the target protocol address field) is on another physical network. If router R's routing tables specify that the next hop to that other network is through a different physical device, it will reply to the ARP request as if it were host B, saying that the physical network address of host B is that of router R itself.

Host A receives this ARP reply, puts it in its cache and will send future IP packets for host B to router R. The router will forward such packets to the correct subnet.

The result is transparent subnetting:

Normal hosts (such as A and B) don't know about subnetting, so they use the "old" IP routing algorithm.
The routers between subnets have to:
1. Use the "subnet" IP algorithm.
2. Use a modified ARP module, which can reply on behalf of other hosts.

Figure: Proxy-ARP Router
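
The router's decision can be sketched as follows. This is an illustration under stated assumptions, not the method of RFC 1027 in full: the class ProxyArpRouter and its method names are hypothetical, the routing table is reduced to one directly attached subnet per interface, and the example addresses are arbitrary.

```python
import ipaddress
from typing import Dict, Optional, Tuple


class ProxyArpRouter:
    """Toy router R that answers ARP requests on behalf of remote hosts."""

    def __init__(self, interfaces: Dict[str, Tuple[str, str]]) -> None:
        # interface name -> (attached subnet, router's hardware address there)
        self.interfaces = {
            name: (ipaddress.ip_network(cidr), hw)
            for name, (cidr, hw) in interfaces.items()
        }

    def _interface_for(self, ip: str) -> Optional[str]:
        """The 'subnet' routing step: which interface leads to this address?"""
        addr = ipaddress.ip_address(ip)
        for name, (network, _hw) in self.interfaces.items():
            if addr in network:
                return name
        return None

    def proxy_reply_hw(self, arrival_if: str, target_ip: str) -> Optional[str]:
        """Return the hardware address to put in a proxy reply, or None
        if the router should stay silent."""
        out_if = self._interface_for(target_ip)
        if out_if is None or out_if == arrival_if:
            # Unknown destination, or the target is on the requester's own
            # physical network and can answer for itself.
            return None
        # Reply as if we were the target, using our own address on the
        # interface where the request arrived.
        return self.interfaces[arrival_if][1]
```

Staying silent when the target is reachable through the arrival interface is what keeps the router from competing with the real host's own ARP reply.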

The IBM TCP/IP products do not implement proxy-ARP routing.
