Next: Switchover and switchback Up: MPLS Multicast Fast Reroute Previous: Link failure and recovery   Contents


Failure and recovery notification


When a link failure or recovery is detected, the PSLs must be notified so that they can perform switchover or switchback. The nodes that detect the failure or recovery are responsible for notifying the PSLs. We consider link failure first, then link recovery, in a multicast routing tree.

As explained in Chapter 3, when a link fails, the multicast routing tree is split into two trees. For instance, if nodes $C$ and $D$ detect the failure of link $CD$ as illustrated in Figure 4.2(b), the original multicast routing tree is split into one tree rooted at $C$ and one tree rooted at $D$. These two trees are represented in Figure 4.7. Each of the two nodes that detect the failure sends a signaling protocol link failure notification message to each of its children. The payload of this message contains the IP address of the sender's interface that ends the failed link. Each node that receives a link failure notification in turn sends a link failure notification to its children, until all leaves of the two trees are reached. A node that forwards a link failure notification does not change the IP address contained in the payload, so that every node that receives the message knows which link has failed.

For instance, in Figure 4.7, $D$ sends a link failure notification to $F$, $G$ and $A$. The notification message sent by $D$ contains the IP address of the interface of node $D$ that ends link $CD$. Node $C$ sends a link failure notification to nodes $E$ and $S'$. The notification message sent by $C$ contains the IP address of the interface of node $C$ that ends link $CD$. Upon reception of the notification, $S'$ sends a link failure notification to $H$, which in turn sends link failure notifications to $J$ and $B$.

Figure 4.7: Failure notification mechanism. After failure of link $CD$, the original multicast routing tree is split into two trees, one rooted at $C$ and one rooted at $D$. Node $C$ (resp. $D$) sends a link failure notification message on the tree of which it is the root. The failure notification messages are propagated on both trees until all leaves are reached.
\includegraphics[width=0.9\textwidth]{figures/mc_fast_reroute_notif}
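The propagation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the signaling protocol implementation: the child lists contain only the edges named in the text, and the payload string (for example `C@CD`) is a hypothetical stand-in for the IP address of the interface of the detecting node that ends the failed link.

```python
from collections import deque

# Children of each node in the two trees of Figure 4.7. Only the edges named
# in the text are listed; nodes without an entry are treated as leaves.
CHILDREN = {
    "C": ["E", "S'"],
    "S'": ["H"],
    "H": ["J", "B"],
    "D": ["F", "G", "A"],
}


def propagate(root, payload, children=CHILDREN):
    """Flood a notification from root down to the leaves of its tree.

    Returns the (sender, receiver, payload) triples in sending order.
    The payload is forwarded unchanged at every hop.
    """
    sent = []
    pending = deque([root])
    while pending:
        node = pending.popleft()
        for child in children.get(node, []):
            sent.append((node, child, payload))
            pending.append(child)
    return sent


def notify_link_failure(end_a, end_b, children=CHILDREN):
    """Both ends of the failed link notify the tree of which they are the root."""
    link = end_a + end_b
    messages = []
    for root in (end_a, end_b):
        # Hypothetical placeholder for the IP address of root's interface
        # that ends the failed link.
        payload = f"{root}@{link}"
        messages.extend(propagate(root, payload, children))
    return messages


if __name__ == "__main__":
    for sender, receiver, payload in notify_link_failure("C", "D"):
        print(f"{sender} -> {receiver}: link failure, interface {payload}")
```

Because the payload is never rewritten, every node in the tree rooted at $C$ sees the interface of $C$, and every node in the tree rooted at $D$ sees the interface of $D$.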

All nodes of the original multicast routing tree are notified of the failure. In particular, if the failed link is on a path protected by a backup path, then both PSLs of the backup path are notified of the link failure. Each PSL maintains a list of the IP addresses of the interfaces that end links of the protected path. When a PSL receives a link failure notification message, it checks whether the IP address contained in the payload of the message matches an IP address in its internal list. If it does, the PSL must perform switchover. Otherwise, the failed link is not on the path protected by the PSL, and the PSL does not perform switchover.
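The check performed by a PSL amounts to a membership test, sketched below. The function name `must_switch_over` and the addresses (taken from the documentation ranges 192.0.2.0/24 and 198.51.100.0/24) are assumptions made for illustration; the source does not specify how the internal list is stored.

```python
import ipaddress


def must_switch_over(failed_interface, protected_interfaces):
    """Return True if the notified interface ends a link of the protected path."""
    failed = ipaddress.ip_address(failed_interface)
    protected = {ipaddress.ip_address(addr) for addr in protected_interfaces}
    return failed in protected


# Hypothetical list held by one PSL: the interfaces that end the links
# of the path it protects.
PROTECTED = ["192.0.2.1", "192.0.2.2", "192.0.2.5", "192.0.2.6"]

if __name__ == "__main__":
    print(must_switch_over("192.0.2.2", PROTECTED))     # link on the protected path
    print(must_switch_over("198.51.100.7", PROTECTED))  # unrelated link
```

Parsing the addresses with `ipaddress` rather than comparing strings avoids mismatches caused by differences in textual form.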

When a link repair is detected, we use exactly the same mechanism to propagate the repair information. Only the type of message changes: signaling protocol link recovery notification messages are used. Each message contains the IP address of the interface that ends the repaired link. In our example, when link $CD$ is repaired, node $D$ sends a link recovery notification message to $F$, $G$ and $A$, which contains the IP address of the interface of $D$ that ends the recovered link. Node $C$ sends a link recovery notification message to $E$ and $S'$, which contains the IP address of the interface of $C$ that ends the recovered link. Node $S'$ sends a link recovery notification message to $H$, which sends link recovery notification messages to $J$ and $B$. If a backup path has been established and the repaired link is part of the protected path, then the end nodes of the backup path perform switchback.
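Since failure and recovery notifications differ only in message type, the reaction of a PSL to both can be sketched as a single handler. The names `Kind`, `Psl` and `handle` are hypothetical, and the addresses are placeholders from the documentation ranges; this is a minimal sketch of the decision logic, not of the signaling protocol itself.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    LINK_FAILURE = "link failure"
    LINK_RECOVERY = "link recovery"


@dataclass
class Psl:
    # Interfaces that end the links of the protected path (assumed to be
    # stored as strings here for simplicity).
    protected_interfaces: frozenset
    backup_established: bool = True
    on_backup: bool = False

    def handle(self, kind, interface):
        """Decide what to do with a notification carrying this interface."""
        if not self.backup_established:
            return "ignore"
        if interface not in self.protected_interfaces:
            # The link is not on the path protected by this PSL.
            return "ignore"
        if kind is Kind.LINK_FAILURE:
            self.on_backup = True
            return "switchover"
        self.on_backup = False
        return "switchback"


if __name__ == "__main__":
    psl = Psl(frozenset({"192.0.2.1", "192.0.2.2"}))
    print(psl.handle(Kind.LINK_FAILURE, "192.0.2.1"))
    print(psl.handle(Kind.LINK_RECOVERY, "192.0.2.1"))
```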

The failure notification time is the time between the instant at which a failure is detected and the instant at which both PSLs are notified of the failure. Likewise, the recovery notification time is the time between the instant at which a link repair is detected and the instant at which both PSLs are notified of the recovery. Since the notification mechanism is the same for link failure and link recovery, the failure notification time and the recovery notification time are the same. We call this common value the notification time and denote it by $T_{notif}$. If we denote by $T_{nnotif}$ the time taken by a node to process and send a notification message (the node notification delay), and if the protected path of the multicast routing tree consists of $l$ links, then the notification time is bounded by:

\begin{displaymath}T_{notif} \leq l T_{nnotif}.\end{displaymath}

Using 100 Mbit/s Ethernet hardware, the node notification delay is on the order of 1 millisecond. Therefore, for a protected path of a few links, it takes a few milliseconds to notify the PSLs after a link failure has been detected.
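As a purely illustrative instance of the bound, assume a protected path of $l = 5$ links (a value chosen here for illustration, not taken from any measurement) and the node notification delay of about 1 millisecond quoted above:

\begin{displaymath}T_{notif} \leq l T_{nnotif} = 5 \times 1\,\mathrm{ms} = 5\,\mathrm{ms}.\end{displaymath}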


Yvan Pointurier 2002-08-11