NSF #1064230 Collaborative Research: A Peer-to-Peer based Storage System for High-End Computing (PI)

Clemson University and the University of Alabama at Birmingham (UAB) jointly propose transformative computer science research on file systems for High-End Computing (HEC). The project is well suited to the mission of NSF's EAGER program because its outcomes will, in a novel way, replace current parallel file systems with a substantially more scalable Peer-to-Peer (P2P)-based storage system that will enable forthcoming exascale computing systems.

Combining P2P-based file-sharing concepts with HEC requirements and concerns is a novel approach to large-scale, performance-critical computing. Studying exascale storage from first principles, rather than refactoring the file system approaches already deployed on terascale and petascale systems, is an important class of research to undertake.

Because the investigators seek to validate alternative designs that offer higher scalability, availability, integrity, and robustness than the logical evolution of existing HEC file systems and their instantiations on petascale architectures, this EAGER project will seek to produce a preliminary research prototype of a radically different file system. The investigators will jointly study, design, and build this prototype as a distributed software infrastructure, together with related techniques, that supports scalable and reliable file storage and retrieval for HEC on top of a structured P2P network.

If this project is successful, the design, architecture, and strategy for exascale storage systems will differ greatly from logical evolutionary extensions of existing file systems, and the key opportunities for and barriers to creating practical exascale storage systems will be more clearly understood. Exascale co-design approaches will be considered and compared with the outcomes of this work, thereby informing other researchers of the relative merits of co-design approaches for exascale when file systems are studied from first principles.