Hypercast2.0

 

Design Document

 

 

HyperCast Team
Department of Computer Science
University of Virginia

Email: hypercast@cs.virginia.edu
Web: http://www.cs.virginia.edu/~hypercast

 

January 2002


Acknowledgements

Several undergraduate and graduate students, and Postdocs have contributed to the development of the HyperCast protocol. In alphabetical order, these are Jean Ablutz, Tyler Beam, Burton Filstrup, Konrad Lorincz, Huafeng Lu, Michael Nahas,  Neelima Putrevu, Bhupinder Sethi, Weisheng Si, Scott Talbert, Dongwen Wang, Haiyong Wang, and Guimin Zhang.

Principal investigator of the project is Jörg Liebeherr.

The first version of  HyperCast was designed  by Tyler Beam, Jorg Liebeherr and Bhupinder Sethi. HyperCast 1.0 was implemented by Tyler Beam.

The design of the HyperCast 2.0 overlay socket was done by Jorg Liebeherr, Konrad Lorincz, Mike Nahas, and Dongwen Wang. Most of the implementation and testing of HyperCast 2.0 was done by Dongwen Wang, Mike Nahas, and Guimin Zhang.

The demo applications provided in the HyperCast distribution were written by Dongwen Wang (media streamer), Guimin Zhang (mftp), and Huafeng Lu (whiteboard).

The design document was written by Jorg Liebeherr, Dongwen Wang, and Guimin Zhang, with material supplied by all members.

The HyperCast project has been supported in part by the National Science Foundation under grants NCR-9624106, ANI-9870336, and ANI-0085955.  The HyperCast project is part of the Denali project (http://denali.berkeley.edu) on scalable services for the Global Internet. 

 

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or any other sponsor.



CHAPTER 1

HyperCast 2.0 Overview

 

1.1   Overlay Topologies in HyperCast

The HyperCast software builds and maintains logical overlay networks between applications, and supports data transmission between applications in the overlay.  Applications self-organize into a logical overlay network, and transfer data along the edges of the overlay network using unicast transport services.  Each application communicates only with its neighbors in the overlay network.  Using the overlay, services for one-to-many transmissions ("multicast") and many-to-one transmissions ("incast") can be easily implemented.

HyperCast builds topologies graphs with well-known properties. Currently, HyperCast can build two types of overlay network topologies, logical hypercubes and logical Delaunay triangulations. An important common property to both topologies is that once the topology is established, packet forwarding in these overlay network can be performed without the need for a routing protocol.  

1.  1.1.1 Hypercubes  

The logical hypercube overlay network topology organizes the applications into a logical n-dimensional hypercube. Each node is identified by a label (e.g., "010"), which indicates the position of the node in the logical hypercube.  In an overlay network with N nodes, the lowest N position of a hypercube are occupied (according to a Gray ordering).

One advantage of using a hypercube is that each node has only   neighbors[1], where N is the total number of nodes.  Also, the longest route in the hypercube is .  A disadvantage of hypercubes is that the physical network infrastructure is completely ignored. Another disadvantage is that the hypercube construction must be done sequentially, i.e. one node at a time. Therefore, for large groups it can take a long time before the overlay network is built. Also, the departure of a single node may require substantial changes to the overlay topology.

We refer to [LIEBE99] for a detailed description of the hypercube topology. The protocol has been tested with up to 10,000 nodes running on 100 computers. See [BEAM99] and [LORIN01] for the test results.

1.  1.1.2  Delaunay Triangulation

A Delaunay triangulation is a special type triangulation. Its main characteristic is that for each circumscribing circle of a triangle formed by three nodes, no other node of the graph is in the interior of the circle. Each node in a Delaunay triangulation has (x,y) coordinates which depict a point in the plane. In Figure 1.3, we show a Delaunay triangulation of five nodes and the circumscribing circles of some of its triangles. 

An advantage of the Delaunay triangulation overlay network topology is that it can be constructed in a distributed fashion. Therefore, Delaunay triangulations can be built very quickly.  In a Delaunay triangulation, each node has an average of six neighbors. In the worst-case, however, the number of neighbors is N-1, where N is the total number of nodes.  

If the coordinates of a node in the Delaunay triangulation reflect the node’s geographical location, then nodes in the overlay network are likely to be neighbors if their geographical location is close. However, Delaunay triangulations are not aware of the underlying network infrastructure. 

The Delaunay Triangulation topology has been tested with up to 10,000 running on 100 computers. See [LIEBE01b] for the results.

1.2   Evaluation Criteria for Overlay Topologies

Overlay network topologies can be evaluated based on a number of criteria.  We use several of them to evaluate logical hypercubes and logical Delaunay triangulations.

·       Need for global information: The scalability of an overlay network (in terms of the total number of nodes) is limited if the state information grows fast with the number of nodes in the overlay network. In both HyperCast topologies, the state information at each node is very small (except for the worst-case scenarios in Delaunay triangulations).  

·       Effort to build and maintain the overlay topology: Ideally, the number of computations performed after a node joins or leaves the overlay network should be constant. For the hypercube, adding and removing a node generally requires a constant number of  steps. However, the process of adding multiple nodes must be executed sequentially, and cannot be executed  in parallel. The effort to add and delete a node in the Delaunay triangulation is dependent on the number of neighbors of a node. This number is constant in the average case, but can be considerable higher. On the other hand, since the Delaunay triangulation can add and remove nodes in parallel, changes to the overlay topology are completed quickly.

·       Need for a routing protocol in the overlay: For data forwarding in a network, nodes generally maintain a routing table, which determines the next-hop of the path towards the destination or destinations. Many existing overlay network approaches require the nodes to run a distance routing protocol vector or other shortest path algorithm to establish the routing tables. With hypercubes and Delaunay triangulations, nodes can use the logical addresses of other nodes to determine the next hop of data.  This eliminates the need for a routing protocol.

·       Match of the overlay network to the Internet topology: Data forwarding in overlay networks is done at the application level. Therefore, data may traverse the same part of the IP network several times before it reaches its destination or destinations.  This may result in inefficient use of network capacity and increased delays compared to transmission at the IP layer. These disadvantages, which are shared to some degree by all overlay network topologies, are least pronounced if the overlay network is constructed with consideration of the underlying Internet topology. Since the overlay topologies of HyperCast are constructed without awareness of the Internet topology, the overlay networks constructed by Hypercast may not be good matches for the underlying Internet topology.

1.3    Data transport  

All data is transmitted along trees that are embedded in the overlay network topology. For each member of an overlay network, there is a spanning tree in the overlay network with the member as its root.   Given the root of an embedded tree, each member of an overlay network locally determines its children and parent with respect to that tree. Each member forwards data to its children or parent in an embedded tree with respect to a specific node.

Data is exchanged in the overlay network as formatted messages. Data transmission between two members of an overlay network can be done with either UDP or TCP.  An application selects a specific overlay topology and a specific protocol for data transport when it instantiates an overlay socket.

Multicast transmission in an overlay network uses the embedded tree that has the multicast sender as root. If an overlay node receives a multicast message, it passes the message to the application and forwards the message to all of its child nodes in the embedded tree (see Figure1. 4a).

Unicast transmission is performed "upstream" an embedded tree. The sender of a unicast message forwards the message to its parent in the embedded tree that has the destination as the root (see Figure 1.4b).

                                (a) Multicast                          (b)Unicast

Figure 1.4. Data forwarding in overlay networks. 

Figure 1.4 shows that data is forwarded along trees that are embedded in the overlay network. For each member of the overlay, there is an embedded tree with this member at the root. Hypercast presumes that given the root of an embedded tree, each member of an overlay network can locally determine its children and parent with respect to that tree. The node forwards data either to its children or to its parent.

 

1.4  Overlay Sockets

The HyperCast software is built around the notion of an overlay socket (Overlay Socket). An Overlay Socket is an endpoint for communication in an overlay network. An overlay socket provides application programs an interface for communications over a logical overlay topology.  The application programming interface (API) of an Overlay Socket offers applications the ability to:

  • Join and leave existing overlays,
  • Send data to all or a subset of the members of the overlay network, and
  • Receive data from the overlay.

The HyperCast software is structured around components which are either part of an Overlay Socket or that interact with an Overlay Socket. 

 

1.5   References

[BEAM99] T. K. Beam. HyperCast:A Protocol for Maintaining a Logical Hypercube-Based Network Topolog. M.S. Thesis, University of Virginia, May 1999.

[LIEBE99] J. Liebeherr and T. K. Beam. HyperCast: A protocol for maintaining multicast group members in a logical hypercube topology. In Proceedings First International Workshop on Networked Group   Communication (NGC '99), Lecture Notes in Computer Science, Volume 1736, pages 72-89, 1999.

[LIEBE01a] ] J. Liebeherr and M. Nahas.  Application-layer Multicast with Delaunay Triangulations.  Proceedings of IEEE Globecom 2001, Global Internet Symposium.

[LIEBE01b] J. Liebeherr, M. Nahas, and W. Si.  Large-scale Application-Layer Multicast with Delaunay Triangulations.  University of Virginia, Computer Science Department, Technical Report, CS-01-26, November 2001.

[LORIN01] K. Lorintz, HyperCast: A Super-Scalable Many-to-Many Multicast Protocol for Distributed Internet Applications, Undergraduate Thesis, School of Engineering and Applied Science, University of Virginia. May 2001.


[1] If x is an integer then =x, otherwise it is the smallest integer larger than x.