next up previous contents
Next: The MulTreeLDP protocol Up: Implementation Previous: Implementation   Contents



Multicast MPLS-Linux

MPLS-Linux is a recent implementation of MPLS for PCs running the Linux operating system [51]. MPLS-Linux is freely modifiable under the GNU license [31] and conforms to the MPLS specifications [61] [62]. Other MPLS implementations for PCs have been proposed in the past [54] [55], but are not maintained by their authors. Thus, we chose to implement our multicast rerouting mechanism on PCs running MPLS-Linux. Because MPLS-Linux does not support multicast forwarding, we augmented it with multicast capabilities. Before we explain how we extended MPLS-Linux, we provide background on the existing MPLS-Linux implementation for unicast.


Unicast MPLS-Linux implementation

MPLS-Linux is implemented as a layer between Ethernet and IP. Ethernet is a MAC layer protocol which encapsulates IP packets in frames. In Section 1.2.2, we gave an overview of the three operations that MPLS routers can perform on packets (push, swap and pop) and we described the Forwarding Information Base (FIB) which contains the rules according to which MPLS routers forward packets. We now describe how the MPLS operations and the FIB are implemented in MPLS-Linux.

Table 5.1: MPLS-Linux unicast instructions overview. MPLS-Linux unicast implements the three MPLS operations (push, swap, pop) with five different instructions.
Instruction  Input layer  Output layer  Description
PUSH         IP           MPLS          Adds a shim header to an IP packet.
SET          MPLS         Ethernet      Passes an MPLS unicast packet to an Ethernet interface.
POP          Ethernet     MPLS          Removes a shim header from an Ethernet frame.
FWD          MPLS         MPLS          Calls PUSH for a packet coming from POP.
DLV          MPLS         IP            Passes an MPLS packet to the IP layer.



Table 5.2: Implementation of the three MPLS operations with the five MPLS-Linux instructions.
MPLS Operation  Corresponding sequence of instructions in MPLS-Linux
push            PUSH, SET
swap            POP, FWD, PUSH, SET
pop             POP, DLV


MPLS-Linux defines five instructions to implement shim header pushing, swapping and popping. Each of these instructions can be applied to IP packets or Ethernet frames in the MPLS layer as they are being processed by the Linux kernel. We give an overview of these five instructions in Table 5.1 and we describe how they implement the three MPLS operations in Table 5.2. The PUSH instruction adds an MPLS shim header to a packet that comes from the IP layer. The SET instruction passes an IP packet with a shim header from the MPLS layer to the Ethernet layer and tells the Ethernet layer on which Ethernet interface the MPLS packet should be forwarded. Together, the PUSH and SET instructions implement the MPLS ``push'' operation. The POP instruction removes the shim header of a packet that comes from the Ethernet layer. Packets processed by POP must subsequently be processed by either FWD or DLV. The FWD instruction takes as input a packet processed by POP and calls the PUSH instruction. Together, the POP, FWD, PUSH and SET instructions implement the MPLS ``swap'' operation. We will see in the remainder of this section why the FWD instruction is needed to swap labels. Last, the DLV instruction takes as input a packet processed by POP and passes it to the IP layer. Together, the POP and DLV instructions implement the MPLS ``pop'' operation.

We now describe the implementation of the FIB in MPLS-Linux. In MPLS-Linux, the FIB is split into three tables: the MPLS input and output tables, and the IP routing table. MPLS-Linux defines a Forwarding Equivalence Class (FEC) with a prefix and a prefix length. A prefix is a 32-bit IP address and a prefix length is a number between 1 and 32. A packet with destination IP $IP_d$ matches the FEC $P/P_{len}$ constituted by the prefix $P$ and the prefix length $P_{len}$ if and only if the first $P_{len}$ bits of $IP_d$ and $P$ are the same. MPLS-Linux requires that the IP routing table contain a specific entry for each FEC that is defined at an MPLS ingress LER. It is not possible to define a FEC if no matching entry exists in the routing table, because MPLS-Linux relies on the IP routing table to determine the FEC of an IP packet. In MPLS-Linux, IP routing table entries are extended and contain FEC to Next Hop Label Forwarding Entry (FTN) mappings in addition to the IP routing information. Both the MPLS input and output tables contain Next Hop Label Forwarding Entries (NHLFEs), while the MPLS input table implements the Incoming Label Map (ILM).
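The FEC matching rule above can be sketched in a few lines of C. This is only an illustration of the prefix comparison, not code taken from MPLS-Linux; the function names prefix_mask(), fec_matches() and ipv4() are hypothetical, and addresses are assumed to be held in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the len most significant bits set; len must be in 1..32. */
uint32_t prefix_mask(unsigned len)
{
    assert(len >= 1 && len <= 32);
    return (len == 32) ? UINT32_MAX : (uint32_t)~(UINT32_MAX >> len);
}

/* Returns 1 if destination ip_d matches the FEC prefix/len, 0 otherwise.
 * The match holds iff the first len bits of ip_d and prefix are equal. */
int fec_matches(uint32_t ip_d, uint32_t prefix, unsigned len)
{
    uint32_t mask = prefix_mask(len);
    return (ip_d & mask) == (prefix & mask);
}

/* Hypothetical helper: builds a host-order address from a dotted quad. */
uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}
```

For instance, fec_matches(ipv4(10, 1, 2, 3), ipv4(10, 1, 0, 0), 16) holds, whereas the same test with destination 10.2.2.3 does not.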

Figure 5.1: Processing of a packet in the MPLS layer with MPLS-Linux unicast. a) At an ingress LER. b) At an LSR. c) At an egress LER.
Figure 5.2: Processing of a packet in the MPLS layer with MPLS-Linux multicast. a) At an ingress LER. b) At an LSR. c) At an egress LER.

Consider Figure 5.1(a) which shows how a shim header is pushed on an incoming Ethernet frame by an ingress LER. The Ethernet layer of the LER receives a frame with a protocol field in the Ethernet header set to 0x0800, which is the protocol code for IPv4. The Ethernet layer passes the incoming frame to the IP layer. The MPLS router searches for an entry in the IP routing table to make the routing decision, but since this entry matches a FEC it has been modified so that the packet is passed to the MPLS layer instead of being routed by the IP layer. The additional information contained in the IP routing table is an FTN, that is, a pointer to an MPLS output table entry. This output table entry is an NHLFE that contains two instructions. A PUSH instruction defines the label number of the packet, and a SET instruction defines on which interface the packet should be sent. The MPLS layer adds at the beginning of the packet an MPLS header which contains the label found in the NHLFE, and passes the packet to the Ethernet layer. The Ethernet layer generates a frame with the protocol field set to the code assigned to MPLS unicast packets (0x8847) and sends the frame over the wire.
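To make the content of the pushed header concrete, the following sketch packs and unpacks a 32-bit shim header word. It assumes the standard MPLS label stack encoding (20-bit label, 3-bit experimental field, 1-bit bottom-of-stack flag, 8-bit TTL), which this section does not restate. The struct and function names are hypothetical and the code is not taken from MPLS-Linux; conversion to network byte order is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of one shim header (assumed standard layout, host byte order). */
struct shim {
    uint32_t label;   /* 20 bits */
    uint8_t  exp;     /* 3 bits  */
    uint8_t  bottom;  /* 1 bit: set on the last header of the stack */
    uint8_t  ttl;     /* 8 bits  */
};

/* Packs the fields into a 32-bit word: label | exp | bottom | ttl. */
uint32_t shim_encode(struct shim s)
{
    assert(s.label < (1u << 20) && s.exp < 8 && s.bottom < 2);
    return (s.label << 12) | ((uint32_t)s.exp << 9) |
           ((uint32_t)s.bottom << 8) | (uint32_t)s.ttl;
}

/* Splits a 32-bit shim header word back into its fields. */
struct shim shim_decode(uint32_t word)
{
    struct shim s;
    s.label  = word >> 12;
    s.exp    = (uint8_t)((word >> 9) & 0x7);
    s.bottom = (uint8_t)((word >> 8) & 0x1);
    s.ttl    = (uint8_t)(word & 0xFF);
    return s;
}
```

A PUSH of label 10 on a packet with TTL 64 would then produce the word returned by shim_encode() for {10, 0, 1, 64}.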

Consider now Figure 5.1(b) which shows how a label is swapped by an LSR. The Ethernet layer of the LSR receives a frame with a protocol field in the Ethernet header set to 0x8847. Since 0x8847 is the code assigned to MPLS unicast packets encapsulated in Ethernet frames, the Ethernet layer passes the frame to the MPLS layer of the LSR. The MPLS layer searches in the MPLS input table for the entry that matches the label embedded in the shim header of the packet. The input table implements the ILM and tells the MPLS layer what to do with the packet. The input table entry contains two instructions. The POP instruction tells the LSR to remove the MPLS header, and the FWD instruction points to an entry of the MPLS output table. This entry in turn contains two instructions: the PUSH instruction contains the new label for the packet and tells the LSR to add a shim header on the packet with this new label, while the SET instruction tells the LSR on which Ethernet interface the packet should be sent. The Ethernet layer then builds a frame with a protocol field of 0x8847 and sends it over the wire. By definition, the NHLFE tells an MPLS router whether a header must be popped or swapped. In MPLS-Linux the swap operation is implemented by successively popping and pushing a shim header, and the instructions required to pop and push a label are split between the two MPLS tables: POP and FWD in the input table, PUSH and SET in the output table. In this case, the NHLFE is therefore spread over both the input table and the output table.
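The swap path described above can be modelled in user space as two small tables. The sketch below is a simplified illustration, not the kernel data structures of MPLS-Linux: the struct layouts, the linear lookup and the names ilm_lookup() and mpls_swap() are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Output table entry: holds the PUSH and SET instructions. */
struct out_entry { uint32_t push_label; int set_ifindex; };

/* Input table entry: POP is implied, fwd models the FWD instruction. */
struct in_entry { uint32_t label; const struct out_entry *fwd; };

/* Minimal packet model: only the fields the MPLS layer touches here. */
struct packet { int has_shim; uint32_t label; int out_ifindex; };

/* ILM lookup: finds the input table entry matching the incoming label. */
const struct in_entry *ilm_lookup(const struct in_entry *tab, size_t n,
                                  uint32_t label)
{
    assert(tab != NULL);
    for (size_t i = 0; i < n; i++)
        if (tab[i].label == label)
            return &tab[i];
    return NULL;
}

/* Swap as POP, FWD, PUSH, SET. Returns 0 on success, -1 if no rule applies. */
int mpls_swap(const struct in_entry *tab, size_t n, struct packet *p)
{
    if (!p->has_shim)
        return -1;
    const struct in_entry *in = ilm_lookup(tab, n, p->label);
    if (in == NULL || in->fwd == NULL)
        return -1;
    p->has_shim = 0;                        /* POP: remove the shim header  */
    const struct out_entry *out = in->fwd;  /* FWD: go to the output table  */
    p->label = out->push_label;             /* PUSH: add the new label      */
    p->has_shim = 1;
    p->out_ifindex = out->set_ifindex;      /* SET: choose the interface    */
    return 0;
}
```

The FWD pointer is what ties the two halves together: without it, the input table entry would have no way to name the output table entry that carries the new label.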

Last, consider Figure 5.1(c) which shows how a label is popped by an egress LER. The Ethernet layer of the LER receives a frame with a protocol field in the Ethernet header set to 0x8847 and therefore passes the frame to the MPLS layer. The MPLS input table entry that matches the label of the packet contains two instructions. The POP instruction tells the LER to remove the shim header from the packet, and the DLV instruction tells the LER to pass the packet to the IP layer where it will be processed like any other IP packet. In this case, the NHLFE is fully contained in the input table entry and tells the router to pop the shim header.

Labelspaces define the scope of forwarding rules. If two interfaces of the same MPLS router belong to the same labelspace, then they apply the same set of forwarding rules to MPLS packets. For example, if interfaces ``2'' and ``4'' are part of the same labelspace, then two packets with the same label arriving one on interface ``2'' and the other on interface ``4'' will follow the same forwarding rule. On the other hand, if multiple interfaces do not belong to the same labelspace then the incoming MPLS packets follow different forwarding rules. In our implementation, we do not share labelspaces among interfaces: for each Ethernet interface, we set the labelspace to be equal to the interface index assigned by the kernel.
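One way to picture labelspaces is to treat the pair (labelspace, label) as the key of the input table. The sketch below illustrates this reading under that assumption; the struct and the functions labelspace_of() and same_rule_key() are hypothetical and do not come from MPLS-Linux.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mapping between an interface index and its labelspace. */
struct iface { int ifindex; int labelspace; };

/* Returns the labelspace of ifindex, or -1 if the interface is unknown. */
int labelspace_of(const struct iface *ifs, size_t n, int ifindex)
{
    assert(ifs != NULL);
    for (size_t i = 0; i < n; i++)
        if (ifs[i].ifindex == ifindex)
            return ifs[i].labelspace;
    return -1;
}

/* Returns 1 if two (interface, label) arrivals resolve to the same
 * (labelspace, label) key, and hence to the same forwarding rule. */
int same_rule_key(const struct iface *ifs, size_t n,
                  int if_a, uint32_t label_a, int if_b, uint32_t label_b)
{
    int ls_a = labelspace_of(ifs, n, if_a);
    int ls_b = labelspace_of(ifs, n, if_b);
    return ls_a >= 0 && ls_a == ls_b && label_a == label_b;
}
```

With interfaces 2 and 4 both mapped to one labelspace, label 10 yields the same key on either interface; with each labelspace set to the interface index, the keys differ.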

Multicast MPLS-Linux implementation

Unicast MPLS-Linux provides five instructions to implement MPLS header operations. In order to support multicasting, we added two new instructions, MSET and MFWD, to the kernel implementation of MPLS-Linux. Table 5.3 gives an overview of these two new instructions.

Table 5.3: MPLS-Linux multicast instructions overview. MPLS-Linux multicast extensions require two additional instructions to forward multicast packets.
Instruction  Input layer  Output layer  Description
MSET         MPLS         Ethernet      Passes an MPLS multicast packet to an Ethernet interface.
MFWD         MPLS         MPLS          Calls PUSH for a multicast packet coming from POP.


A first difference between MPLS unicast and MPLS multicast lies in the protocol number in the Ethernet frames. The value of the protocol number is 0x8847 for MPLS unicast and 0x8848 for MPLS multicast. When a frame that contains an MPLS multicast packet is transmitted by the Ethernet layer, the protocol number should be set to the correct value in the Ethernet header. This is done with the new MSET instruction, which replaces the unicast SET instruction when an MPLS router forwards multicast packets. Conversely, frames received by the Ethernet layer with a multicast MPLS protocol number should be processed by the MPLS layer rather than the IP layer. The Linux kernel API defines the dev_add_pack() function to associate Ethernet protocol numbers with upper layer handlers. For instance, the protocol number 0x0800 is associated with the IP layer handler so that the Ethernet layer passes to the IP layer the frames that contain IP packets. We wrote the handler that redirects MPLS multicast packets to the MPLS layer.
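The role of the protocol number on both the receive and transmit sides can be summarized with a small user-space sketch. It only mirrors the dispatch logic described above; it is not the kernel handler registered through dev_add_pack(), and the constant and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Protocol numbers quoted in the text. */
#define PROTO_IPV4        0x0800u
#define PROTO_MPLS_UCAST  0x8847u
#define PROTO_MPLS_MCAST  0x8848u

/* Upper layer that should receive an incoming frame. */
enum upper_layer { TO_IP, TO_MPLS_UNICAST, TO_MPLS_MULTICAST, TO_NONE };

/* Receive side: picks the handler from the Ethernet protocol field. */
enum upper_layer dispatch_frame(uint16_t proto)
{
    switch (proto) {
    case PROTO_IPV4:       return TO_IP;
    case PROTO_MPLS_UCAST: return TO_MPLS_UNICAST;
    case PROTO_MPLS_MCAST: return TO_MPLS_MULTICAST;
    default:               return TO_NONE;
    }
}

/* Transmit side: SET stamps the unicast code, MSET the multicast code. */
uint16_t proto_for_instruction(int is_mset)
{
    assert(is_mset == 0 || is_mset == 1);
    return is_mset ? PROTO_MPLS_MCAST : PROTO_MPLS_UCAST;
}
```

A frame emitted through MSET is therefore classified as MPLS multicast by the next router, which is what lets the multicast handler see it.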

Second, unlike MPLS unicast, MPLS multicast requires that the MPLS layer be able to duplicate packets, because MPLS multicast forwards the same incoming packet to several interfaces. We define the new MPLS operation mswap (multicast swap) on MPLS headers. When an MPLS packet shim header is mswapped, the packet is duplicated and the shim header of each copy of the packet is swapped against a new one. Then, each copy of the packet is sent on a different interface. We implement the mswap operation with the existing POP and PUSH instructions and the new MFWD and MSET instructions, as described in Table 5.4.

Table 5.4: Implementation of the multicast MPLS operations. The new instructions MFWD and MSET replace FWD and SET.
MPLS Operation  Corresponding sequence of instructions in MPLS-Linux
push            PUSH, MSET
mswap           POP, MFWD, PUSH, MSET
                     MFWD, PUSH, MSET
                     ...
                     MFWD, PUSH, MSET
pop             POP, DLV


We designed the new MFWD (Multicast FWD) instruction as a replacement for FWD. While an input table entry can contain only one FWD instruction in unicast MPLS-Linux, several MFWD instructions can be placed in a single MPLS input table entry in our MPLS-Linux extensions. The MFWD instruction supports packet duplication. If an MPLS input table entry contains $n$ MFWD instructions, then incoming packets are duplicated $n-1$ times using a software mechanism provided by the kernel API. Each MFWD instruction points to a different MPLS output table entry. Each of the $n$ copies of the packet is processed according to the contents of the MPLS output table entry pointed to by one of the MFWD instructions. Therefore, the MPLS layer can push a different shim header on each copy of the packet and forward it on a different interface.
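The effect of several MFWD instructions in one input table entry can be sketched as follows. This is a user-space model only: a plain struct copy stands in for the kernel duplication mechanism, which the text does not name, and the types, the MAX_MFWD bound and the function mswap() are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MFWD 8  /* arbitrary bound chosen for this sketch */

/* Output table entry: holds the PUSH and MSET instructions. */
struct out_entry { uint32_t push_label; int mset_ifindex; };

/* Input table entry: POP is implied, followed by n_mfwd MFWD instructions. */
struct min_entry {
    uint32_t label;
    size_t n_mfwd;
    const struct out_entry *mfwd[MAX_MFWD];
};

/* Minimal packet model. */
struct packet { uint32_t label; int out_ifindex; };

/* mswap: produces one relabelled copy per MFWD instruction in out[].
 * Returns the number of copies written, or 0 if the entry does not apply. */
size_t mswap(const struct min_entry *e, const struct packet *in,
             struct packet *out, size_t cap)
{
    assert(e != NULL && in != NULL && out != NULL);
    if (in->label != e->label || e->n_mfwd > cap || e->n_mfwd > MAX_MFWD)
        return 0;
    for (size_t i = 0; i < e->n_mfwd; i++) {
        out[i] = *in;                                /* duplicate the packet */
        out[i].label = e->mfwd[i]->push_label;       /* PUSH a new label     */
        out[i].out_ifindex = e->mfwd[i]->mset_ifindex; /* MSET the interface */
    }
    return e->n_mfwd;
}
```

With two MFWD instructions pointing to output entries (label 21, interface 1) and (label 22, interface 2), a packet arriving with label 10 leaves as two copies carrying labels 21 and 22 on interfaces 1 and 2.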

Figures 5.2(a), 5.2(b) and 5.2(c) respectively show how shim headers are pushed, mswapped and popped for multicast MPLS packets. In Figure 5.2(a), the only difference between pushing a shim header on an MPLS unicast packet and pushing a shim header on an MPLS multicast packet lies in the protocol number in the Ethernet frame. The MPLS layer uses the MSET instruction instead of the SET instruction in the MPLS output table when pushing shim headers on MPLS multicast packets. In Figure 5.2(b), we illustrate how multicast packets are forwarded on several interfaces at the same time. The MPLS input table contains a POP instruction and two MFWD instructions for incoming packets. The MPLS layer first removes the MPLS shim header of each incoming packet. Then, it duplicates the packet in order to get two copies of the packet. The first MFWD instruction points to an entry in the MPLS output table which contains a PUSH and an MSET instruction. A shim header is pushed on the first copy of the packet and the Ethernet layer sends the corresponding frame over the wire via the interface specified by the MSET instruction. The second MFWD instruction points to a different entry in the MPLS output table which contains another PUSH and another MSET instruction. A shim header containing a different label is pushed on the second copy, and the corresponding Ethernet frame is sent using another interface. In Figure 5.2(c), we show how MPLS routers pop shim headers from MPLS multicast packets. There is no difference with popping the shim header of an MPLS unicast packet, except for the protocol number in the Ethernet header of incoming frames.

Last, our implementation supports mixed L2/L3 forwarding. The concept of mixed L2/L3 forwarding has been introduced in Section 1.3.4 and refers to the ability of a router to forward a multicast packet both with an IP and an MPLS mechanism. We perform mixed L2/L3 forwarding by using in the same MPLS input table entry one or several MFWD instructions to forward the packet with an MPLS mechanism, and a DLV instruction to forward the packet with an IP forwarding mechanism. Mixed L2/L3 forwarding support is illustrated in Figure 5.3. Incoming MPLS packets are duplicated by the MPLS layer. One copy remains in the MPLS layer and the other copy is passed to the IP layer.

Figure 5.3: Mixed L2/L3 forwarding implementation. The same incoming packet is passed to both the IP layer and the Ethernet layer by the MPLS layer. The shim header of the copy of the packet that remains in the MPLS layer is mswapped and the packet is passed to the Ethernet layer, while the copy of the packet that is sent to the IP layer is routed by the IP layer and can either be sent to the Ethernet layer, or be delivered to the transport layer of the MPLS router.

FIB management API

Our implementation provides an API to let user processes modify the FIB of MPLS routers. In MPLS-Linux, the FIB is located inside the Linux kernel. Therefore the implementation of MPLS-Linux requires that a user program communicates with the Linux kernel.

Table 5.5: The /proc files related to the MPLS FIB. All files are in text format.
File                       Contents
/proc/net/mpls_labelspace  Mapping between physical interfaces and labelspaces.
/proc/net/mpls_fec         FEC mappings.
/proc/net/mpls_in          Input table.
/proc/net/mpls_out         Output table.


The three communication channels between user programs and the kernel provided by Linux are ioctl system calls, netlink sockets, and the /proc file system. MPLS-Linux uses netlink sockets and the /proc file system to access the FIB. Netlink is a datagram-oriented, socket-based interface between the kernel and user programs. The assigned domain for netlink sockets is PF_NETLINK. The netlink API provides functions to encapsulate and decapsulate information in netlink datagrams, which have a specific format. Users can send information to the kernel and receive information from it, encapsulated in netlink datagrams, via the classic socket calls send and recv. The /proc file system is a virtual filesystem where files contain information used by the kernel. MPLS-Linux uses certain files of the /proc filesystem (see Table 5.5) to represent the FIB in human-readable text format. Our implementation uses solely the netlink communication interface to communicate with the kernel [53].


MPLS-Linux provides four functions to manipulate the FIB. These functions can either create or remove entries in the FIB tables. The function send_nhlfe() creates an entry in the MPLS output table if called with parameter RTM_NEWNHLFE, and deletes an entry in the MPLS output table when called with RTM_DELNHLFE. The function send_xc() binds or unbinds an entry in the input table to an entry in the output table. Therefore, since NHLFEs are implemented in both the input and output tables, NHLFEs are created or deleted using combinations of send_nhlfe() and send_xc(). An mcast field in the message sent by send_nhlfe() specifies whether the NHLFE refers to unicast or multicast packets. The function send_ilm() manages the ILM by creating or deleting entries in the MPLS input table. The function send_ftn() creates or deletes FTN mappings in the IP routing table. We added the function send_mc() to create and delete mappings between an entry in the MPLS input table and several entries in the MPLS output table. MPLS forwarding rules are created with the netlink functions described above. However, a single forwarding rule such as ``push unicast label 10'' requires sending two netlink messages: an NHLFE and an FTN must be created via send_nhlfe() and send_ftn(). To assist users in creating forwarding rules, we provide a C API, which is described in Tables 5.6 and 5.7. This API simplifies the creation of MPLS forwarding rules by hiding from the user the crafting of complex netlink messages and the calls to the netlink functions.

Table 5.6: Netlink functions and the corresponding C API used to set the MPLS forwarding rules. The C API simplifies the creation and removal of MPLS forwarding rules.
MPLS Operation    Instructions  Netlink functions called           C API
push (unicast)    PUSH, SET     send_ftn, send_nhlfe with mcast=0  push_label(), remove_push_label()
push (multicast)  PUSH, MSET    send_ftn, send_nhlfe with mcast=1
swap              POP           send_ilm                           swap_label(), remove_swap_label()
                  PUSH, SET     send_nhlfe
                  FWD           send_xc
mswap             POP           send_ilm                           mswap_label(), remove_mswap_label()
                  PUSH, MSET    send_nhlfe
                  MFWD          send_mc
pop               POP, DLV      send_ilm                           pop_label(), remove_pop_label()
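The mapping of Table 5.6 can also be written down as a small lookup, which makes explicit how many netlink messages each forwarding rule needs. The sketch below only encodes the table; it does not call the real netlink functions, the enum and function names are hypothetical, and the order in which the calls are listed is an assumption (entries that are pointed to are listed before the entries that point to them).

```c
#include <assert.h>
#include <stddef.h>

/* MPLS operations of Table 5.6. */
enum mpls_op { OP_PUSH, OP_SWAP, OP_MSWAP, OP_POP };

/* Identifiers standing for the netlink functions named in the text. */
enum nl_fn { NL_SEND_NHLFE, NL_SEND_XC, NL_SEND_ILM, NL_SEND_FTN, NL_SEND_MC };

/* Fills calls[] with the netlink functions needed to create one rule
 * of type op and returns how many there are (at most 3). */
size_t netlink_calls(enum mpls_op op, enum nl_fn calls[3])
{
    assert(calls != NULL);
    switch (op) {
    case OP_PUSH:   /* PUSH + SET or MSET, reached through an FTN */
        calls[0] = NL_SEND_NHLFE; calls[1] = NL_SEND_FTN;
        return 2;
    case OP_SWAP:   /* POP, then PUSH + SET, bound by FWD */
        calls[0] = NL_SEND_ILM; calls[1] = NL_SEND_NHLFE; calls[2] = NL_SEND_XC;
        return 3;
    case OP_MSWAP:  /* POP, then PUSH + MSET, bound by MFWD */
        calls[0] = NL_SEND_ILM; calls[1] = NL_SEND_NHLFE; calls[2] = NL_SEND_MC;
        return 3;
    case OP_POP:    /* POP + DLV live in the input table only */
        calls[0] = NL_SEND_ILM;
        return 1;
    }
    return 0;
}
```

For an mswap rule with several branches, the output table and binding steps would presumably be repeated once per branch; the sketch lists each function only once.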



Table 5.7: Details on the FIB manipulation API. The C API hides from the user the crafting of complex netlink messages and the calls to the netlink functions.
MPLS Operation / C API function (arguments) / Description

push
  push_label(int ifindex, struct in_addr next_hop, u_int label_id, struct in_addr fec_prefix, u_char fec_len)
    Push a header with label label_id on packets of the FEC fec_prefix/fec_len that are to be transmitted on interface ifindex towards next_hop.
  remove_push_label(int ifindex, u_int label_id)
    Remove a rule created with push_label().
swap
  swap_label(u_int in_label_id, u_int out_label_id, int out_if_index, struct in_addr out_next_hop)
    Swap a header with label in_label_id of packets arriving via an interface that belongs to labelspace in_labelspace against a header containing label out_label_id and forward the packets on the interface indexed by out_if_index towards out_next_hop.
  remove_swap_label(u_int in_label_id, int in_labelspace, u_int out_label_id, int out_if_index)
    Remove a rule created with swap_label().
pop
  pop_label(u_int label_id, int labelspace)
    Pop headers containing label label_id of packets arriving via an interface that belongs to labelspace labelspace.
  remove_pop_label(u_int label_id, int labelspace)
    Remove a rule created with pop_label().



Yvan Pointurier 2002-08-11