History
VCGR History
The Virginia Center for Grid Research (VCGR) was founded in early 2005 to promote the application of grids to real-world problems and to continue the leading-edge grid research carried out at the University of Virginia for more than a decade. The VCGR has its roots in the distributed and grid computing research efforts begun in 1993, which spawned the Legion research group and grid computing platform. Legion in turn had its roots in earlier object-oriented parallel computing projects, most notably the Mentat research project.
The VCGR is funded by the state of Virginia under a grant to encourage the advancement of key technologies in academia.
Mentat
Mentat was a research project dedicated to creating an easy-to-use, high-performance object-oriented language for parallel computing. The Mentat language was designed as a set of extensions to standard C++, in particular adding keywords that allowed programmers to mark which classes could be run in parallel. The research group developed a compiler and runtime system to analyze the data flow requirements of the program and to manage the instantiation and scheduling of each instance as well as inter-object communication.
The Mentat group deployed the resulting system both locally at UVa and remotely at other institutions to help outside researchers, primarily in the hard sciences, improve their efficiency in developing, testing, and running their applications. As Mentat began to grow beyond groups of local homogeneous workstations, so did the interesting problems encountered. In particular, it became apparent that much more was needed to make such a system work on a large scale, especially across organizational boundaries. The missing pieces included data management facilities, security and user management, meta-scheduling facilities, cross-firewall and VPN communication, binary management, state management, resource management, richer control of resources by their owners, and much more. From this realization came the Legion research project.
Legion
Legion was born in late 1993 with the observation that dramatic changes in wide-area network bandwidth were on the horizon. In addition to the expected vast increases in bandwidth, other changes such as faster processors, more available memory, and more disk space were expected to follow in the usual way, as predicted by Moore’s Law. Given the dramatic changes in bandwidth expected, the natural question was: how will this bandwidth be used? Since more than just bandwidth would change, we generalized the question to, “Given the expected changes in the physical infrastructure, what sorts of applications will people want, and given that, what is the system software infrastructure that will be needed to support those applications?” The Legion project was born with the determination to build, test, deploy, and ultimately transfer to industry a robust, scalable Grid computing software infrastructure. We followed the classic design paradigm of first determining requirements, then completely designing the system architecture on paper over numerous design meetings, and finally, after a year of design work, coding. We decided to write from scratch rather than extend and modify Mentat, the existing system we had been using as a prototype. We felt that only by starting from scratch could we ensure adherence to our architectural principles. First funding was obtained in early 1996, and the first line of Legion code was written in June 1996.
By November 1997 we were ready for our first deployment. We deployed Legion at UVa, SDSC, NCSA and UC Berkeley for our first large-scale test and demonstration at Supercomputing 1997. In the early months, keeping the mean time between failures (MTBF) over twenty hours under continuous use was a challenge. This is when we learned several valuable lessons. For example, we learned that the world is not “fail-stop”. While we knew this intellectually, it was really brought home by the unusual failure modes of the various hosts in the system.
By November 1998, we had solved the failure problems and our MTBF was in excess of one month and heading towards three months. We again demonstrated Legion, now on what we called NPACI-Net. NPACI-Net consisted of hosts at UVa, SDSC, Caltech, UC Berkeley, IU, NCSA, the University of Michigan, Georgia Tech, Tokyo Institute of Technology and Vrije Universiteit, Amsterdam. By that time dozens of applications had been ported to Legion from areas as diverse as materials science, ocean modeling, sequence comparison, molecular modeling and astronomy. NPACI-Net grew through 2003 with additional sites such as the University of Minnesota, the University of Texas at Austin, SUNY Binghamton and PSC. Supported platforms included Windows 2000, the Compaq iPAQ, the Cray T3E and T90, IBM SP-3, Solaris, IRIX, HP-UX, Linux, Tru64 UNIX and others.
From the beginning of the project, a “technology transfer” phase had been envisioned in which the technology would be moved from academia to industry. We felt strongly that Grid software would move into mainstream business computing only with commercially supported software, help lines, customer support, services and deployment teams. In 1999, Applied MetaComputing was founded to carry out the technology transition. In 2001, Applied MetaComputing raised $16M in venture capital and changed its name to AVAKI Corporation. The company acquired legal rights to Legion from the University of Virginia and renamed Legion “Avaki”. Avaki was released commercially in September 2001.
