Hypercast2.0 Design Document
HyperCast Team Email: hypercast@cs.virginia.edu January 2002 Acknowledgements Several undergraduate and graduate students, and Postdocs have contributed to the development of the HyperCast protocol. In alphabetical order, these are Jean Ablutz, Tyler Beam, Burton Filstrup, Konrad Lorincz, Huafeng Lu, Michael Nahas, Neelima Putrevu, Bhupinder Sethi, Weisheng Si, Scott Talbert, Dongwen Wang, Haiyong Wang, and Guimin Zhang. Principal investigator of the project is Jörg Liebeherr. The first version of HyperCast was designed by Tyler Beam, Jorg Liebeherr and Bhupinder Sethi. HyperCast 1.0 was implemented by Tyler Beam. The design of the HyperCast 2.0 overlay socket was done by Jorg Liebeherr, Konrad Lorincz, Mike Nahas, and Dongwen Wang. Most of the implementation and testing of HyperCast 2.0 was done by Dongwen Wang, Mike Nahas, and Guimin Zhang. The demo applications provided in the HyperCast distribution were written by Dongwen Wang (media streamer), Guimin Zhang (mftp), and Huafeng Lu (whiteboard). The design document was written by Jorg Liebeherr, Dongwen Wang, and Guimin Zhang, with material supplied by all members. The HyperCast project has been supported in part by the National Science Foundation under grants NCR-9624106, ANI-9870336, and ANI-0085955. The HyperCast project is part of the Denali project (http://denali.berkeley.edu) on scalable services for the Global Internet. Any opinions, findings, and
conclusions or recommendations expressed in this material are those of the
author(s) and do not necessarily reflect the views of the National Science
Foundation or any other sponsor.
HyperCast 2.0 Overview 1.1 Overlay Topologies in HyperCastThe HyperCast software builds and maintains logical
overlay networks between applications, and supports data transmission
between applications in the overlay.
Applications self-organize into a logical overlay network, and
transfer data along the edges of the overlay network using unicast
transport services. Each
application communicates only with its neighbors in the overlay network. Using the overlay, services for one-to-many transmissions
("multicast") and many-to-one transmissions ("incast")
can be easily implemented.
HyperCast builds topologies graphs with well-known properties. Currently, HyperCast can build two types of overlay network topologies, logical hypercubes and logical Delaunay triangulations. An important common property to both topologies is that once the topology is established, packet forwarding in these overlay network can be performed without the need for a routing protocol. 1. 1.1.1 Hypercubes
The logical hypercube overlay network topology organizes the applications
into a logical n-dimensional hypercube. Each node is identified by a label
(e.g., "010"), which indicates the position of the node in the
logical hypercube. In an overlay network with N nodes, the lowest N
position of a hypercube are occupied (according to a Gray ordering).
One advantage of using a hypercube is that each node has only
We refer to [LIEBE99] for a detailed description of the hypercube topology. The protocol has been tested with up to 10,000 nodes running on 100 computers. See [BEAM99] and [LORIN01] for the test results. 1. 1.1.2 Delaunay TriangulationA Delaunay triangulation is a special type triangulation. Its main characteristic is that for each circumscribing circle of a triangle formed by three nodes, no other node of the graph is in the interior of the circle. Each node in a Delaunay triangulation has (x,y) coordinates which depict a point in the plane. In Figure 1.3, we show a Delaunay triangulation of five nodes and the circumscribing circles of some of its triangles.
If the coordinates of a node in the Delaunay triangulation reflect the node’s geographical location, then nodes in the overlay network are likely to be neighbors if their geographical location is close. However, Delaunay triangulations are not aware of the underlying network infrastructure. The Delaunay Triangulation topology has been tested with up to 10,000 running on 100 computers. See [LIEBE01b] for the results. 1.2 Evaluation Criteria for Overlay TopologiesOverlay network topologies can be evaluated based on a number of criteria. We use several of them to evaluate logical hypercubes and logical Delaunay triangulations. · Need for global information: The scalability of an overlay network (in terms of the total number of nodes) is limited if the state information grows fast with the number of nodes in the overlay network. In both HyperCast topologies, the state information at each node is very small (except for the worst-case scenarios in Delaunay triangulations). · Effort to build and maintain the overlay topology: Ideally, the number of computations performed after a node joins or leaves the overlay network should be constant. For the hypercube, adding and removing a node generally requires a constant number of steps. However, the process of adding multiple nodes must be executed sequentially, and cannot be executed in parallel. The effort to add and delete a node in the Delaunay triangulation is dependent on the number of neighbors of a node. This number is constant in the average case, but can be considerable higher. On the other hand, since the Delaunay triangulation can add and remove nodes in parallel, changes to the overlay topology are completed quickly. · Need for a routing protocol in the overlay: For data forwarding in a network, nodes generally maintain a routing table, which determines the next-hop of the path towards the destination or destinations. Many existing overlay network approaches require the nodes to run a distance routing protocol vector or other shortest path algorithm to establish the routing tables. With hypercubes and Delaunay triangulations, nodes can use the logical addresses of other nodes to determine the next hop of data. This eliminates the need for a routing protocol. · Match of the overlay network to the Internet topology: Data forwarding in overlay networks is done at the application level. Therefore, data may traverse the same part of the IP network several times before it reaches its destination or destinations. This may result in inefficient use of network capacity and increased delays compared to transmission at the IP layer. These disadvantages, which are shared to some degree by all overlay network topologies, are least pronounced if the overlay network is constructed with consideration of the underlying Internet topology. Since the overlay topologies of HyperCast are constructed without awareness of the Internet topology, the overlay networks constructed by Hypercast may not be good matches for the underlying Internet topology. 1.3 Data transportAll data is transmitted along trees that are embedded in the overlay network topology. For each member of an overlay network, there is a spanning tree in the overlay network with the member as its root. Given the root of an embedded tree, each member of an overlay network locally determines its children and parent with respect to that tree. Each member forwards data to its children or parent in an embedded tree with respect to a specific node. Data is exchanged in the overlay network as formatted messages. Data transmission between two members of an overlay network can be done with either UDP or TCP. An application selects a specific overlay topology and a specific protocol for data transport when it instantiates an overlay socket. Multicast transmission in an overlay network uses the embedded tree that has the multicast sender as root. If an overlay node receives a multicast message, it passes the message to the application and forwards the message to all of its child nodes in the embedded tree (see Figure1. 4a). Unicast transmission is performed "upstream" an embedded tree. The sender of a unicast message forwards the message to its parent in the embedded tree that has the destination as the root (see Figure 1.4b).
(a) Multicast (b)Unicast Figure 1.4. Data forwarding in overlay networks. Figure 1.4 shows that data is forwarded along trees that are embedded in the overlay network. For each member of the overlay, there is an embedded tree with this member at the root. Hypercast presumes that given the root of an embedded tree, each member of an overlay network can locally determine its children and parent with respect to that tree. The node forwards data either to its children or to its parent.
1.4 Overlay SocketsThe HyperCast software is built around the notion of an overlay socket (Overlay Socket). An Overlay Socket is an endpoint for communication in an overlay network. An overlay socket provides application programs an interface for communications over a logical overlay topology. The application programming interface (API) of an Overlay Socket offers applications the ability to:
The HyperCast software is structured around components which are either part of an Overlay Socket or that interact with an Overlay Socket.
1.5 References[BEAM99] T. K. Beam. HyperCast:A Protocol for Maintaining a Logical Hypercube-Based Network Topolog. M.S. Thesis, University of Virginia, May 1999. [LIEBE99] J. Liebeherr and T. K. Beam. HyperCast: A protocol for maintaining multicast group members in a logical hypercube topology. In Proceedings First International Workshop on Networked Group Communication (NGC '99), Lecture Notes in Computer Science, Volume 1736, pages 72-89, 1999. [LIEBE01a] ] J. Liebeherr and M. Nahas. Application-layer Multicast with Delaunay Triangulations. Proceedings of IEEE Globecom 2001, Global Internet Symposium. [LIEBE01b] J. Liebeherr, M. Nahas, and W. Si. Large-scale Application-Layer Multicast with Delaunay Triangulations. University of Virginia, Computer Science Department, Technical Report, CS-01-26, November 2001. [LORIN01] K. Lorintz, HyperCast: A Super-Scalable Many-to-Many Multicast Protocol for Distributed Internet Applications, Undergraduate Thesis, School of Engineering and Applied Science, University of Virginia. May 2001. [1] If x is an integer then
|