University of Virginia Department of Computer Science

Thursday, May 01, 2008
Tibor Horvath

Chair: Marty Humphrey, Sudhanva Gurumurthi, John Lach, Tarek Abdelzaher
Advisor: Kevin Skadron
OLSSON 236D, 1:00 PM

A Ph.D. Defence

Energy Management in Real-Time Multi-Tier Internet Services

ABSTRACT

Energy management has only recently emerged as a major consideration in the design of high-performance Internet services. As a result of requirements for continual performance scaling, the energy required to provide these services has also increased massively, an unfortunate corollary of Moore's Law. At the same time, the high scalability of Internet services is commonly achieved by employing multi-tier (functionally distributed) clustered architectures. At high-demand sites, the number of server computers in such clusters can be very large (on the order of thousands or more). Due to the high overall power consumption and related heat dissipation that these server farms exhibit, severe operational challenges arise, such as high energy costs, high cooling costs (operation and maintenance), costly infrastructure requirements (such as sophisticated power-delivery and cooling systems), increased space demands (server unit density is limited by heat dissipation), and decreased system reliability (heat-related failures). These increased operational costs constitute a significant part of the total upkeep and maintenance expenses of large sites. Advances in networking technology have caused, and will continue to cause, demand for and reliance on Internet-based services to accelerate, exacerbating the above-mentioned issues.

By 2002, researchers had already identified significant potential for reducing energy use in Web servers. Because of observed daily and weekly demand fluctuations due to the natural cycle of human activity levels, Internet server workloads tend to show extended periods of low-load or even near-idle operation. Furthermore, newer server-class hardware started to adopt power-saving modes previously supported only on mobile systems, which allow judicious reduction of server capacity (and the corresponding energy use) during those off-peak load conditions. Since then, several techniques have appeared to manage the power dissipation of individual machines. However, since most techniques incur some performance penalty, minimizing the global energy expenditure of a cluster with minimal effect on its performance remains a challenge.

This dissertation presents the theoretical analysis of several aspects of this energy minimization problem, starting out with a basic multi-tier server model, which is then extended to deal with multiple priorities and reconfigurable clusters; it discusses the optimization of spare capacity in such clusters when multiple machine sleep states are available; and finally, it analyzes how external disturbances, such as those triggering dynamic thermal management actions, affect the performance of cluster power management algorithms.


