NSF #1254006 A New Efficient and Cooperative Large-Scale Distributed Data Sharing System

This project investigates how information from social networks may be used to create efficient and cooperative large-scale distributed data sharing systems. Two of the challenges facing such systems are locating data quickly and cost-efficiently, and enforcing cooperative node behavior. Social network-based approaches build social networks into data sharing systems in order to leverage the mutual trust and common interests that characterize real-world friendships. Although these approaches can significantly reduce cost and complexity compared to purely technical approaches, current methods exploit social network properties only superficially and cannot support the broad goal of enabling all nodes to share data freely and efficiently. This project addresses these shortcomings by leveraging social network properties more fully and by coordinating social network-based approaches with technical approaches. Areas of investigation include infrastructures for data searching, cooperation enhancement mechanisms, and algorithms for data server selection.

Today, a multitude of large-scale distributed systems use the Internet to deliver a variety of data -- such as software and audio/video content -- to end users. These systems support many social, commercial, and cultural activities. Millions of dollars are spent on commercial servers to deliver this data. Peer-to-peer technology, in which user computers cooperate to share content among themselves, has the potential to do this more cheaply and quickly. However, wider use of peer-to-peer technology is held back by several limitations. One is that nodes may misbehave, either by selfishly receiving content without contributing in return, or by distributing corrupted or malicious content that unsuspecting users then spread further. Another limitation is that locating content is difficult, due to the wide distribution of nodes and the lack of a central index. This project seeks to address these limitations by making use of information obtained from social networks. Potential benefits of the work include savings in infrastructure costs and energy consumption for data distribution. The project also provides educational opportunities for graduate and undergraduate students, and collaborates with other educational programs on recruiting students from under-represented groups and on outreach to K-12 students.