Abstract
The Xpress Transport Protocol (XTP) has been designed to support a variety of applications ranging from real-time embedded systems to multimedia distribution to applications distributed over a wide area network. In a single protocol it provides all the classic functionality of TCP, UDP, and TP4, plus new services such as transport multicast, multicast group management, transport layer priorities, traffic descriptions for quality-of-service negotiation, rate and burst control, and selectable error and flow control mechanisms. XTP has the same interconnectivity as TCP/UDP/TP4 because it operates over any network layer (IP, CLNP), any datalink layer (LLC, MAC), or directly on top of the AAL of ATM. In general, XTP avoids coupling policy with mechanism; XTP offers services but the user's application defines what communications paradigm is most appropriate for its particular environment. XTP is a high performance protocol, and can sustain high throughput (90 Mbits/s over FDDI between a pair of 133 MHz Pentiums) and low latency (220 µs to move 16 bytes from user memory to user memory on two 133 MHz PCs connected by FDDI). Since XTP can run in parallel with all other transport protocols, and can run over whatever network layer (if any) is provided, it represents a low-risk way to exploit the increased functionality required for distributed applications without sacrificing connectedness or interoperability.
In the OSI Reference Model [ISO7498], each layer of the model is responsible for providing a specific set of services to the layers above and below it. For the transport layer, those responsibilities include:
client identification: multiplexing multiple users of the transport layer and keeping their data streams separate
segmentation: dividing an arbitrarily long message into packets which are of a convenient size for the underlying network, then reassembling them at the receiver into a duplicate of the original message
correctness: implementing a transport checksum that is used to verify the correctness of data received, and rejecting data received with data errors
reliability: assuring that the delivered data stream is sequential, with no lost data and no duplicate data
In the Internet world, the Transmission Control Protocol (TCP) [DARPA81a] provides these functions, and is typically used for data exchange that requires full reliability (like file transfer). In addition, the User Datagram Protocol (UDP) [DARPA80] is available for less demanding applications; it handles a single message of up to 64 KB on a "best-effort" (unacknowledged) basis and is used for data that does not require guaranteed delivery. Both TCP and UDP require the services of the Internet Protocol (IP) [DARPA81b] in the network layer below them to route messages to their destination. In the ISO world, the Transport Protocol class 4 (TP4) [ISO8073] and the Connectionless Network Protocol (CLNP) [ISO8473] provide similar services.
TCP, UDP, and IP are very important and very valuable protocols in the networking world; even though they are neither national nor international standards, they are de facto standards in the Internet environment. Having been designed many years ago, however, TCP/UDP and TP4/CLNP did not anticipate the many new types of applications that now require interconnection across LANs, MANs, and WANs. Thus, when TCP/TP4 are applied to applications more demanding than file transfer or electronic mail (for example, the interconnection of distributed applications, support for real-time embedded systems, or transport of synchronized multimedia data streams), they show some deficiencies. For example:
A New Approach
In search of a better match between application requirements and transport protocol functionalities, an international group of protocol designers, composed of representatives from industry, academia, military, and government, has defined the Xpress Transport Protocol [Strayer92, XTP95] which provides, all in one transport protocol, the traditional functionalities found in TCP, UDP, and TP4. XTP has been adopted as part of MIL-STD-2204, the U.S. military standard for the Survivable Adaptable Fiber Optic Embedded Network (SAFENET) [MIL2204]. SAFENET is the Department of Defense's specification for mission-critical networked communications systems.
It is important to note that XTP is not perceived as a replacement for TCP or UDP or TP4; rather it is simply another transport protocol with its own unique functionality which is available as an option for the system designer. Just as TCP and UDP operate side-by-side above IP, XTP operates simultaneously with any number of other transport protocols, including TCP and UDP and TP4, and utilizes the services of the network layer below it, if any. XTP operates over IP (which makes it usable in any Internet environment), over CLNP (which covers the ISO networks), directly over the LLC or MAC of any LAN (which covers Ethernet, token ring, and FDDI), and directly over the adaptation layer of an Asynchronous Transfer Mode (ATM) network. The possible interplay of transport protocols, network protocols, and networks is shown in Figure 1:
XTP Functionality
If XTP has value, it is only because it satisfies some need of the user's application. We examine the major XTP functionalities and how they might be gainfully employed.
Multicast
TCP and TP4 support only a unicast paradigm, that is, one transmitter talking to one receiver. As applications become more distributed, there is a need for network nodes to form groups that share certain types of data. For example, a ship's current location represents data that is needed by the navigation system, the autopilot, the collision-avoidance system, and the weapons systems. Rather than broadcasting this information to all nodes (which requires each node to inspect and interpret it in software, and then discard it if it is not needed), the modern approach is to form a multicast group consisting of one transmitter and arbitrarily many receivers. Then every transmission to the multicast group goes only to those applications that have registered themselves as receivers of the data. (Security algorithms can be employed here to monitor group membership.) Receivers can join, leave, and rejoin an on-going multicast group as permitted by the multicast group manager (see the next section on Multicast Group Management). The reliability of data delivery is independent of whether the transmission is unicast or multicast and is controlled by the user's selection of the error control mechanism (see section Selectable Error Control).
Figure 2(a) represents a traditional unicast, whereas Figure 2(b) shows the basic one-to-many approach of multicast. Figure 2(c) shows an N-to-M multicast. Groups that require more than a single transmitter can be synthesized by using N 1-to-N multicast groups to form an N-to-N multicast capability as shown in Figure 2(d):
Multicast Group Management (MGM)
An application may form, or may join, any number of multicast groups. Membership in the group is controlled by the multicast group manager, and it in turn is controlled by the transmitting application. Using MGM, the user can selectively allow or deny the admission of any node to the multicast group; therefore, the resulting dynamic group membership can be known at all times (if desired). The user controls the reaction of the protocol when group members fail or voluntarily leave the group. If the members of a multicast group are required to be fully reliable, then the failure of any group member destroys the integrity of the multicast group, resulting in a failure report to the transmitting application. The application determines the consequences of group membership failure; it may choose to delete the failed member from the group and continue, or abandon transmission entirely, or take any other action that the application deems appropriate. See [Dempsey90] and [Dempsey91].
Note how inappropriate it would be to have the protocol manage group membership via some predefined policy. Only the application can appreciate the side-effects of a change in group membership, so XTP reports group membership changes to the transmitting application and lets it decide what should be done.
Since group membership is under program control, one could define a multicast group in which, say, three particular members must receive all data reliably and all other members may eavesdrop as they wish. Given that definition, XTP MGM would report failures of any of the three essential nodes, but would ignore the joining and leaving of other non-essential listeners.
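The essential-member policy described above can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `MulticastGroup`, its methods, and the event label `essential_member_lost` are hypothetical and are not part of the XTP specification or of any XTP API.

```python
from dataclasses import dataclass, field


@dataclass
class MulticastGroup:
    """Hypothetical model of a transmitter-controlled multicast group."""
    essential: set                                 # members that must receive all data
    listeners: set = field(default_factory=set)    # members that may come and go
    reports: list = field(default_factory=list)    # events surfaced to the application

    def join(self, node, admit=lambda node: True):
        # The transmitting application decides admission; the default admits everyone.
        if admit(node):
            self.listeners.add(node)
            return True
        return False

    def leave(self, node):
        if node in self.essential:
            # Losing an essential member is reported; the application picks the response.
            self.essential.discard(node)
            self.reports.append(("essential_member_lost", node))
        else:
            # Non-essential listeners leave without any report.
            self.listeners.discard(node)
```

The group manager here only records the event; deciding whether to continue, drop the member, or abandon the transmission is left to the caller, which mirrors the separation of policy from mechanism argued for in this section.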
Also, we note that the reliability of group membership is orthogonal to the reliability of data transfer within a group. As discussed in the section on selectable error control, XTP supports multiple paradigms for error control, but these are independent of the group membership issues.
Priority
A rich priority subsystem allows applications to define the importance of their data according to a system-wide scheme; XTP then operates on its most important data first. When XTP is coupled with a real-time operating system and a network that supports priorities, the system designer can then bound end-to-end delivery latencies for high-priority messages (see Figure 3).
XTP puts a bit in the packet header that indicates whether or not each packet is participating in the overall priority scheme. If so, then an adjacent field indicates the relative priority of this packet compared to all other packets. XTP examines the priority bit and its associated numeric field and then processes that packet in accordance with its defined importance. One can say that, to a granularity of one packet's transmission time on the network, XTP is always operating on its most important packet. This is true not only in the end-systems, but also in all interior routers if they support the necessary mechanisms to detect and respond to XTP priority (see the section on the XTP-aware IP router for a discussion).
The XTP priority scheme is based on a static priority encoded in a 16-bit numeric field; the lower the numerical value, the higher the importance. By defining the priority system this way, putting a delivery timestamp in the priority field automatically implements an earliest-deadline-first delivery policy with no additional work on the part of the user.
[Figure 3: Users generate messages of varying importance. These messages are sorted using priority queues. XTP emits its most important message first.]
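The ordering rule described in this section (lower numeric value, higher importance) maps naturally onto a min-heap. The sketch below is a simplified model rather than XTP's actual queueing code; the class name `PriorityOutputQueue` and its methods are hypothetical, and the 16-bit range check simply follows the field width stated above.

```python
import heapq
import itertools


class PriorityOutputQueue:
    """Hypothetical output queue: the smallest priority value is emitted first."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps equal priorities in FIFO order

    def push(self, priority, packet):
        if not 0 <= priority < 2 ** 16:
            raise ValueError("priority must fit in a 16-bit field")
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def pop(self):
        # Remove and return the most important (lowest-valued) packet.
        _priority, _seq, packet = heapq.heappop(self._heap)
        return packet

    def __len__(self):
        return len(self._heap)
```

If the application stores a delivery deadline in the priority value, the same queue emits packets in earliest-deadline-first order without any further logic.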
Rate and Burst Control
In a modern fiber optic network, the usual source of errors is not bit errors on the medium but rather buffer overruns in the receiver or congestion in the routers. Recognizing this fundamental shift, XTP adds a mechanism for error prevention in addition to the classic mechanisms for error detection (e.g., CRCs) and correction (e.g., retransmission). When using XTP, a receiver can dynamically throttle a transmitter by using rate and burst control. The rate control parameter limits the amount of data that can be transmitted per unit time, while the burst parameter limits the size of data that can be sent (i.e., rate and burst control together force inter-packet gaps). Figure 4 shows the effect of rate and burst control on the total amount of data transmitted over time.
When using TCP, a slow receiver's protection from a fast transmitter has to be synthesized by flow control, that is, the opening and closing of credit windows. This causes dynamic throughput fluctuation as transmission starts and stops in accordance with the demands of the flow control window. In contrast, rate control allows a steady stream of data to be emitted at a rate known to be acceptable to the receiver. Additionally, conventional flow control is only active end-to-end; that is, routers do not participate. Thus a receiver with spare capacity can open a large flow control window to invite rapid transmission, but this might only result in further congestion of some intermediate router. XTP's rate and burst control algorithms allow the routers to participate; thus a congested router could temporarily reduce the rate control parameter of its incoming connections until the period of congestion has passed.
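To make the interaction of the two parameters concrete, the following sketch computes a paced transmission schedule. It assumes that the sender emits at most one burst per interval of burst/rate seconds, which keeps the long-run average at the rate limit; the function name `burst_schedule` and the units (bytes and bytes per second) are illustrative choices, not taken from the XTP specification.

```python
def burst_schedule(total_bytes, rate, burst):
    """Return (start_time_seconds, bytes_sent) pairs for a paced transfer.

    rate  -- maximum bytes per second acceptable to the receiver (assumed unit)
    burst -- maximum bytes the sender may emit back-to-back
    """
    if rate <= 0 or burst <= 0:
        raise ValueError("rate and burst must be positive")
    interval = burst / rate  # one burst per interval keeps the average at `rate`
    schedule = []
    sent = 0
    slot = 0
    while sent < total_bytes:
        chunk = min(burst, total_bytes - sent)
        schedule.append((slot * interval, chunk))
        sent += chunk
        slot += 1
    return schedule
```

With a rate of 1000 bytes/s and a burst of 250 bytes, a 600-byte message leaves in three bursts at 0 s, 0.25 s, and 0.5 s. A receiver or congested router that lowers the rate simply stretches the gaps between bursts.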
Connection Management
TCP and TP4 require the exchange of six data packets to transmit one data element reliably (two to set up and acknowledge the connection, two to send and acknowledge the data, and two to close the connection). XTP achieves the equivalent bidirectional reliability with only three packets because of XTP's powerful connection establishment mechanisms. In the XTP paradigm, host A sends its first packet which requests the connection with host B and sends data, all in the same packet. Host B may then respond with packets flowing in the reverse direction. When the data transfer ends, the last packet from A to B is marked as such, which closes one side of the connection. Host B acknowledges having received the last data element from A and then closes the connection in the A-to-B direction. Traffic now flows in the reverse B-to-A direction until it ends, at which point the last packet is so marked. Host A acknowledges the receipt of the last packet from B and closes the connection in the B-to-A direction.
In the special case of a transaction, XTP requires only two packets. Host A sends its first packet to B that opens the connection and transfers the data (the transaction request); host B replies by sending its data (the transaction response) and closing the connection.
Selectable Error Control
XTP supports not one but three types of error control:
(a) a fully reliable mode like TCP and TP4, which would normally be used for applications such as file transfer;
(b) a UDP-like service, in which the receiver does not acknowledge transmission and thus the transmitter never knows whether the data was successfully delivered (i.e., a datagram); and
(c) a special mode called fast negative acknowledgment that can improve error repair in certain situations.
If the networking environment is a LAN, then out-of-sequence data delivery is much more likely to mean that the missing data is truly lost, as opposed to delayed (as it might be by taking an alternate path in a WAN). Using the fastnak option, a receiver that identifies out-of-sequence delivery sends a control packet immediately (without waiting for any time-outs) to inform the transmitter about the missing data. The transmitter then immediately resends the missing data.
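The fast negative acknowledgment idea can be sketched as a receiver that reports a gap the moment it sees one. This is a simplified model, not the XTP control-packet format: the class `FastNakReceiver`, the use of byte offsets as sequence numbers, and the (start, end) span representation are all assumptions made for illustration.

```python
class FastNakReceiver:
    """Hypothetical receiver that reports missing data as soon as a gap appears."""

    def __init__(self):
        self.next_expected = 0  # next in-order byte offset
        self.naks = []          # (start, end) spans reported as missing, end exclusive

    def on_data(self, seq, length):
        """Process a segment starting at byte `seq`; return a NAK span or None."""
        nak = None
        if seq > self.next_expected:
            # Gap detected: report it at once instead of waiting for a timer.
            nak = (self.next_expected, seq)
            self.naks.append(nak)
        # Advance past the highest byte seen; retransmissions below it are absorbed quietly.
        self.next_expected = max(self.next_expected, seq + length)
        return nak
```

On a LAN, where reordering is rare, the returned span can be sent back to the transmitter right away; on a WAN the same signal would more often be a false alarm, which is why the mode is optional.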
Another independent option is noerror mode which suspends the normal retransmission scheme. Correctly received data is properly sequenced, but gaps, if any, are not retransmitted. This mode is expected to find utility when carrying digital voice and video.
Not only does XTP provide a range of data reliability options, it does so in a single protocol. Each connection can support an independent choice of error control strategy, and the choice is entirely under the control of the application.
Selectable Flow Control
As with error control, three orthogonal options are provided. Traditional flow control based on credit windows is available for normal data. Reservation mode practices a conservative flow control policy whereby the receiver may only issue a credit for buffers dedicated to a particular connection; this assures that data will not be lost due to buffer starvation at the destination. The third mechanism is to disable flow control entirely by using the noflow option. Note that such a "free flow" or "streaming" mode of operation is not available in other protocols; this mode may prove useful for multimedia applications.
Selective Retransmission
TCP and TP4 respond to errors by using a go-back-n algorithm in which the transmission window is reset and begins again with the first byte or packet which was lost or received in error. While this works well for local area networks with short delivery latencies, it is less efficient with networks that have either high capacity (high data rates) or long delivery latencies (satellite networks) or both. On these networks, go-back-n may retransmit data which has already been received correctly. XTP can use either go-back-n or selective retransmission; in the latter, the receiver acknowledges spans of correctly received data and the sender retransmits only the gaps. The user selects which scheme to use, if any (recall that retransmission does not occur when using noerror mode). Selective retransmission will be useful for WANs involving satellite links.
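The difference between the two schemes can be illustrated with a short sketch that derives what the sender must resend from the spans the receiver has acknowledged. The function names and the byte-offset span representation (end exclusive) are assumptions for illustration and do not reflect the XTP packet format.

```python
def gaps_to_retransmit(acked_spans, highest_sent):
    """Selective retransmission: return the missing (start, end) spans, end exclusive."""
    gaps = []
    cursor = 0
    for start, end in sorted(acked_spans):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < highest_sent:
        gaps.append((cursor, highest_sent))
    return gaps


def go_back_n_span(acked_spans, highest_sent):
    """Go-back-n: resend everything from the first missing byte onward, or None."""
    gaps = gaps_to_retransmit(acked_spans, highest_sent)
    return (gaps[0][0], highest_sent) if gaps else None
```

If 5000 bytes were sent and the receiver holds bytes 0-999, 2000-2999, and 4000-4999, selective retransmission resends 2000 bytes in two spans, whereas go-back-n resends the 4000 bytes from offset 1000 onward, half of which had already arrived.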
Historically, retransmission has been regarded as either inappropriate or ineffective for jitter control in multimedia systems, and this may well be true if TCP-style retransmission is used. However, we have shown in [Dempsey94a,b,c,d] that retransmission, layered on top of an efficient retransmission mechanism like that in XTP, can be very effective for controlling packet loss in delay-sensitive streams.
Selective Acknowledgment
Acknowledging every packet assumes that the network often loses packets, assumes that the transmitter wants acknowledgments, and embeds policy with mechanism. XTP allows the user to decide if and when acknowledgments are desirable. Acknowledgments are provided whenever the transmitter requests them, thereby allowing the user to select an acknowledgment frequency ranging from always to sometimes to never. Philosophically, acknowledgment generation from the receiver is decoupled from data arrivals or window sizes. Whenever the transmitter wants to know the status of the connection, it asks; when the receiver responds, it tells everything it knows about its current status.
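A transmitter-driven acknowledgment policy can be sketched as a simple rule for marking which outgoing packets carry a status request. The function name `status_request_flags` and its `every` parameter are hypothetical; the sketch only illustrates the "always, sometimes, never" range and says nothing about how XTP encodes the request.

```python
def status_request_flags(num_packets, every):
    """Return one boolean per packet: True where the sender asks for status.

    every = 1 asks on each packet, every = k asks on each k-th packet,
    and every = 0 never asks.
    """
    if every < 0:
        raise ValueError("every must be zero or positive")
    return [every > 0 and (i + 1) % every == 0 for i in range(num_packets)]
```

Because the receiver answers only when asked, changing `every` changes the acknowledgment traffic without any change on the receiving side.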
Maximum Transmission Unit (MTU) Detection
A common problem with the independence of layers introduced by the OSI model is that the transport layer may segment a message into packets for the network layer, only to have the network layer fragment each of those packets into still smaller packets that are appropriate for a particular data link. XTP avoids this double effort by negotiating the proper MTU for the route in use. Both endsystems and routers can declare their MTU for a particular link, and the transmitter will select the minimum of all MTUs encountered on any given path.
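The selection rule is simply a minimum over the declared values, after which the transport layer can segment once at the right size. In the sketch below the function names, the `header_size` parameter, and the example MTU values are illustrative assumptions.

```python
def path_mtu(declared_mtus):
    """Return the smallest MTU declared by any end-system or router on the path."""
    return min(declared_mtus)


def segment(message, mtu, header_size=0):
    """Split `message` (bytes) into payloads that fit in one packet of size `mtu`."""
    payload = mtu - header_size
    if payload <= 0:
        raise ValueError("MTU must exceed the header size")
    return [message[i:i + payload] for i in range(0, len(message), payload)]
```

For a path whose links declare 4352, 1500, and 9180 bytes, the transmitter would segment for 1500 bytes, so no link along the route needs to fragment further.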
Out-of-band Data
It is sometimes useful to send information about the data stream without embedding it within the data stream. As an option, XTP can carry with each packet up to eight bytes of tagged data. Tagged data is passed from the transmitter to the receiver, and its presence is indicated to the receiver, but it is never interpreted by XTP. Tagged data is expected to be useful to the application for providing semantic information about the data. Tagged data is an elegant way to associate a 64-bit timestamp with each packet transmitted.
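As an illustration of the timestamp use case, an application might pack a 64-bit value into the eight tag bytes as follows. The helper names, the nanosecond unit, and the choice of network byte order are assumptions of this sketch; XTP itself does not interpret the tag, so any encoding agreed between sender and receiver would do.

```python
import struct
import time


def make_tag(timestamp_ns=None):
    """Encode a 64-bit timestamp as 8 bytes (network byte order, an assumed convention)."""
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()
    return struct.pack("!Q", timestamp_ns)


def read_tag(tag):
    """Decode the 8-byte tag produced by make_tag back into an integer."""
    if len(tag) != 8:
        raise ValueError("tag must be exactly 8 bytes")
    (timestamp_ns,) = struct.unpack("!Q", tag)
    return timestamp_ns
```

A receiver could subtract the decoded value from its own clock to estimate one-way delay, provided the two clocks are synchronized.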
Alignment
Although it is a simple notion, data alignment pays big dividends by avoiding excess data copies. The major fields of the XTP header are aligned on 8-byte boundaries; this minimizes the number of memory accesses needed to retrieve a field. The data segment begins and ends on 8-byte boundaries, and a length field identifies the last byte of the user data.
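The padding arithmetic implied by this rule is small enough to show directly. The sketch assumes zero-valued fill bytes and uses hypothetical helper names; only the 8-byte boundary and the separate length field come from the text.

```python
def pad_to_8(data):
    """Return (padded_data, user_length) with padded_data a multiple of 8 bytes long."""
    pad = (-len(data)) % 8           # bytes needed to reach the next 8-byte boundary
    return data + b"\x00" * pad, len(data)


def unpad(padded, user_length):
    """Recover the user data using the length carried alongside the segment."""
    return padded[:user_length]
```

A 5-byte message is carried in an 8-byte segment with a length of 5, so the receiver can discard the three fill bytes without scanning the data.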
Traffic Descriptors
We anticipate that packet-switched networks like the Internet will eventually support quality-of-service requests from transmitters; the ATM Forum is already grappling with the problem of how to implement QoS in ATM networks. Since XTP is designed to work in both environments, it defines a traffic descriptor field that can be used to communicate QoS parameters among routers and end-systems. While XTP itself does not perform explicit resource reservation, it can supply the necessary information to other resource reservation protocols (e.g., RSVP, Q.93B). Since resource reservation work is in its infancy, we expect the form and content of the traffic descriptor field to evolve over time.
Performance
Protocol performance measurement is a tricky subject; it is difficult to achieve a true "apples-to-apples" measurement unless the same person or group designed and implemented both protocols in the same environment. Since XTP has more functionality than TCP/UDP/TP4, one might predict that it would run slower, but that has not been our experience.
Running under pSOS+ on a pair of 133 MHz Pentium-based PCs connected by Rockwell FDDI interfaces, we see the following results:
Throughput: using reliable transport unicast and large (64KB) messages, these machines sustain a throughput in excess of 90 Mbits/sec.
Throughput: using unacknowledged transport multicast (for which there is no counterpart in TCP), the machines can sustain a throughput of 96 Mbits/sec.
Latency: the end-to-end delay for transmitting a message is very short; the elapsed time for moving a 16-byte message from user memory in the transmitter to user memory in the receiver is 220 µs.
In general, our experience has been that a skillful XTP implementation delivers to the application approximately 80-96% of the raw network bandwidth available at the level of the MAC device driver. Nevertheless, because it is difficult to separate the skill of the implementer from the inherent power of the protocol, XTP does not make the claim that it is faster than any particular protocol; the measured performance simply speaks for itself. XTP is a tool, just like TCP and UDP are tools, and it should be used wherever appropriate. In my opinion, XTP's major contribution is not so much its performance but its functionality. The ability to utilize many different communications paradigms, all in the same protocol, and all under the user's control, is a major advantage for the system designer.
Applications
XTP has been utilized in a number of commercial and military applications. Its use is growing as application designers see its functional advantages (e.g., multicast, priority management) in distributed systems. Here are some examples of how commercial companies and government agencies have used XTP's capabilities to achieve specific goals.
Multicast. A government contractor needed to interconnect 22 CRAY supercomputers, all riding the belly of a C-130 aircraft, to form a flying signal intelligence center. Since this environment required sending identical data streams to multiple processors, they chose XTP so that they could utilize its multicast capabilities. Each multicast transmission consumes only a fraction of the network bandwidth that would otherwise have been required by N serial unicasts.
Gigabyte files. A commercial company needed to transfer gigabyte files routinely across a satellite-based network. They chose XTP to utilize its selective retransmission capability. When errors corrupt the satellite channel, XTP retransmits only the data that was lost; it does not restart the data stream from the point of loss.
High performance. A government agency needed a communications protocol that would support the real-time environment of weapons control. They chose XTP based on its low end-to-end latency characteristics. Being able to reliably send modest size messages with sub-millisecond latencies allowed them to write effective feedback control loops even in a packet-switched networking environment.
Image distribution. One company faced the problem of having to send very high resolution medical images over an FDDI-based hospital network. They needed to move multiple mammography images (40 MB each) to multiple work stations for simultaneous review by a staff of radiologists. They chose XTP for its combination of high throughput and multicast distribution.
Digital telephony. A government contractor wanted to operate a digital telephone/intercom system over an existing FDDI network. They had to support 120 simultaneous bidirectional conversations (64 Kbits/s each), and each conversation had to have the capability of being sent to any combination of destinations. They chose XTP because its multicast feature would allow any number of receivers to join a multicast group (under the supervision and control of the transmitter), thus making it simple to implement the cross-communications requirement.
Video file server. A company needed the capability to send and receive "video mail" (synchronized audio/video) throughout their corporate network, in addition to their traditional text-based electronic mail. They chose XTP because it handled the multimedia requirements with ease. Personal computers were able to mount a remote disk drive (the file server) and retrieve a compressed multimedia stream in real time.
Priority support. A government research lab required that its packet-switched network be able to support digital multimedia streams in addition to file transfer and electronic mail. They used XTP in combination with our XTP-aware IP router (see following section) to achieve this goal. The IP router identifies XTP packets and handles them in accordance with the XTP priority option.
Real-time systems. A government contractor needed to design a ship-board network, based on SAFENET, that would handle time-critical command and control messages, multimedia streams, background file transfer, and data distribution from one source to multiple destinations. They chose XTP because it provided latency control, message-level priorities, multimedia support, high overall performance, and multicast capabilities to handle the distribution of identical data streams.
Interoperability. Desiring to preserve the connectedness and interoperability associated with the classic protocols, one government contractor built a SAFENET network incorporating TCP, UDP, and XTP, all operating over IP. Another vendor provided TP4 and XTP, both over CLNP. Yet another vendor, working with real-time data, put XTP directly on top of the MAC of FDDI.
Protocol Engine
For high-performance applications that already use a significant fraction of the CPU, forcing the CPU to run the network protocols at the transport layer and below may be a bad choice of partitioning. For those intelligent network interfaces that offer an on-board microprocessor, another option is to move protocol processing off the host entirely. We have conducted some initial experiments [Michel93a, Michel93b] that place the transport protocol and lower layer processing directly on the network interface card to create a protocol engine; the user then sees the transport service through an API. These experiments suggest that off-host protocol processing can be an effective way to increase the fraction of host CPU available to the application. At the same time, there is at least the possibility of having to field fewer versions of the protocol; using an off-host processor one would need versions for each backplane bus and an API for each operating system, rather than the traditional approach of having as many versions as the cross-product of CPU, operating system, and network type.
XTP's Future
XTP has been standardized as part of MIL-STD-2204 (SAFENET). However, XTP's future does not depend upon standardization; we observe that TCP/UDP/IP have fared rather well without being international standards. XTP's acceptance depends upon having system designers recognize that:
References
[DARPA80] Postel, J., ed., "User Datagram Protocol," RFC 768, USC/Information Sciences Institute, August 1980.
[DARPA81a] Postel, J., ed., "Transmission Control Protocol - DARPA Internet Program Protocol Specification," RFC 793, USC/Information Sciences Institute, September 1981.
[DARPA81b] Postel, J., ed., "Internet Protocol - DARPA Internet Program Protocol Specification," RFC 791, USC/Information Sciences Institute, September 1981.
[Dempsey90] Dempsey, B.J., Fenton, J.C., and Weaver, A.C., "The Multidriver: A Reliable Multicast Service for the Xpress Transfer Protocol," 15th Local Computer Networks Conference, Minneapolis, Minnesota, October 1-3, 1990.
[Dempsey91] Dempsey, B.J., An Analysis of Multicast and Multicast Group Management, M.S. thesis, Department of Computer Science, University of Virginia, January 1991.
[Dempsey94a] Dempsey, B.J., Lucas, M.T., and Weaver, A.C., "An Empirical Study of Packet Voice Distribution over a Campus-Wide Network," 19th IEEE Local Computer Networks Conference, Minneapolis, Minnesota, October 1994.
[Dempsey94b] Dempsey, B.J., Lucas, M.T., and Weaver, A.C., "Design and Implementation of a High Quality Video Distribution System using XTP Reliable Multicast," Second International Workshop on Advanced Communications and Applications for High-Speed Networks, Heidelberg, Germany, September 1994.
[Dempsey94c] Dempsey, B.J., Retransmission-Based Error Control for Continuous Media Traffic in Packet-Switched Networks, Ph.D. dissertation, Computer Networks Laboratory, Department of Computer Science, University of Virginia, May 1994.
[Dempsey94d] Dempsey, B.J., Liebeherr, J., and Weaver, A.C., "A New Error Control Scheme for Packetized Voice over High-Speed Local Area Networks," 18th IEEE Local Computer Networks Conference, Minneapolis, Minnesota, September 1993.
[ISO7498] International Organization for Standardization, "Information Processing Systems - Open Systems Interconnection - Basic Reference Model," International Standard 7498, October 1984.
[ISO8073] International Organization for Standardization, "Information Processing Systems - Open Systems Interconnection - Transport Protocol Specifications," International Standard 8073, July 1986.
[ISO8473] International Organization for Standardization, "Information Processing Systems - Open Systems Interconnection - Data Communications Protocol for Providing the Connectionless-Mode Network Service," International Standard 8473, March 1986.
[Mentat94] "Mentat XTP for Streams," Mentat, 1145 Galey Avenue, Suite 315, Los Angeles, California 90024 USA.
[Michel93a] Michel, J., Waterman, A., and Weaver, A., "Performance Evaluation of an Off-Host Communications Architecture," High Performance Communication Subsystems, Williamsburg, VA, September 1-3, 1993.
[Michel93b] Michel, J., Waterman, A., and Weaver, A.C., "Performance Evaluation of an Off-Host Communications System," 18th Local Computer Networks Conference, Minneapolis, MN, September 19-22, 1993.
[MIL2204] Survivable Adaptable Fiber Optic Network, U.S. Department of Defense Military Standard MIL-STD-2204, October 1992.
[NetX95] "Xpress Transport Protocol version 4.0 Multiuser Application Programming Interface," Network Xpress Inc., 1111 Rose Hill Drive, Suite 9, Charlottesville, Virginia 22903 USA.
[Strayer92] Strayer, W.T., Dempsey, B.J., and Weaver, A.C., XTP: The Xpress Transfer Protocol, Addison-Wesley, 1992.
[Strayer94] Strayer, W.T., Gray, S., and Cline, R.E., Jr., "An Object-Oriented Implementation of the Xpress Transfer Protocol," Proceedings of the Second International Workshop on Advanced Communications and Applications for High-Speed Networks (IWACA'94), Heidelberg, Germany, September 26-28, 1994.
[XTP95] Strayer, W.T., ed., "Xpress Transport Protocol 4.0 Specification," XTP Forum Inc., 1394 Greenworth Place, Santa Barbara, CA 93108 USA.
|Network Xpress||Telephone: (804) 293-8066|
|firstname.lastname@example.org||FAX: (804) 293-8414|