UVa CS Technical Reports

Some of the reports listed below are not available on-line. To obtain them, send mail to techrep-librarian@cs.virginia.edu.

[Computer Science Technical Reports] [Institute for Parallel Computation Technical Reports] [Research Memos]

Computer Science

CS-2007-07
Huang, Wei; Sankaranarayanan, Karthik; Ribando, Robert J.; Stan, Mircea R.; Skadron, Kevin. An Improved Block-Based Thermal Model in HotSpot 4.0 with Granularity Considerations, April 23, 2007.

Abstract: This technical report describes our most recent improvements to the HotSpot thermal model. First, in response to a previous paper in WDDD 2006 that cites some accuracy shortcomings, we improve treatment in HotSpot block model of high aspect-ratio blocks, high power densities, and convection boundary conditions. These improvements are included in the newest release of the HotSpot software, version 4.0. Second, to help guide users modeling efforts and choice of modeling resolution, we propose an analytical approach to find the relationship between the power modeling granularity and the associated peak temperature rise at that granularity, i.e. the relationship between the size and peak temperature of a uniform heat source. We theoretically confirm the existence of a strong spatial temperature low-pass filtering effect: a tiny heat source is much cooler than a large heat source with the same power density.

CS-2007-05
Walcott, Kristen R.; Humphreys, Greg; Gurumurthi, Sudhanva. Dynamic Prediction of Architectural Vulnerability from Microarchitectural State, November 17, 2006.

Abstract: Transient faults due to particle strikes are a key challenge in microprocessor design. Driven by exponentially increasing transis tor counts, per-chip faults are a growing burden. To protect against soft errors, redundancy techniques such as redundant multithrea ding (RMT) are often used. However, these techniques assume that the probability that a structural fault will result in a soft error (i.e., the Architectural Vulnerability Factor (AVF)) is 100 percent, unnecessarily draining processor resources. Due to the high co st of redundancy, there have been efforts to throttle RMT at runtime. To date, these methods have not incorporated an AVF model and therefore tend to be ad hoc. Unfortunately, computing the AVF of complex microprocessor structures (e.g., the ISQ) can be quite invo lved. To provide probabilistic guarantees about fault tolerance, we have created a rigorous characterization of AVF behavior that can be e asily implemented in hardware. We experimentally demonstrate AVF variability within and across the SPEC2000 benchmarks and identify strong correlations between structural AVF values and a small set of processor metrics. Using these simple indicators as predictors, we create a proof-of-concept RMT implementation that demonstrates that AVF prediction can be used to maintain a low fault tolerance level without significant performance impact.

CS-2005-14
Rajan, Hridesh; Sullivan, Kevin J.. On the Expressive Power of Classpects, July 18, 2005.

Abstract: The most prominent feature of AspectJ and derivative languages is the aspect. The aspect is a module-like construct that supports data abstraction but that is distinct from the class in several ways. First, aspects support new mechanisms: notably join points, pointcut descriptors and advice. At the same time, they lack some key features of classes, namely the ability to manage aspect instantiation using new. In earlier work, we showed that it was possible and valuable to simplify the design of such languages without loss of expressiveness by unifying aspects and classes in a new construct that we called the classpect. The contribution of this paper is the formal demonstration, using Felleisen's notion of macro-eliminable programming language extensions, that the classpect language model, realized in the Eos language and compiler, is strictly more expressive than AspectJ in the important case in which aspect-like modules must advise other aspect modules.

CS-2005-12
Jin, X.; French, J. C.; Michel, J.. Query Formulation for Information Retrieval by Intelligence Analysts, July 2005.

CS-2005-11
K. Dale, J. Sheaffer, D. Luebke, G. Humphreys, and K. Skadron.. Applications of Small Scale Reconfigurability to Graphics Processors, June 2005 (revised 8 July 2005).

Abstract: We explore the application of Small-Scale Reconfigurability (SSR) to graphics hardware. SSR is a relatively new architectural technique wherein functionality common to multiple subunits is reused rather than replicated, yielding high-performance reconfigurable hardware with reduced area requirements. We show that SSR can be used effectively in programmable graphics architectures to allow double-precision computation without affecting the performance of single-precision calculations and to increase fragment shader performance with a minimal impact on chip area.

CS-2005-10
Lu, Zhijian; Lach, John; Stan, Mircea; Skadron Kevin. Temperature-Aware Modeling and Banking of IC Lifetime Reliability, July 1, 2005.

Abstract: Most existing integrated circuit (IC) reliability models assume a uniform, typically worst-case, operating temperature, but temporal and spatial temperature variations affect expected device lifetime. As a result, design decisions and dynamic thermal management (DTM) techniques using worst-case models are pessimistic and result in excessive design margins and unnecessary runtime engagement of cooling mechanisms (and associated performance penalties). By leveraging a reliability model that accounts for temperature gradients (dramatically improving interconnect lifetime prediction accuracy) and modeling expected lifetime as a resource that is consumed over time at a temperature- and voltage-dependent rate, substantial design margin can be reclaimed and runtime penalties avoided while meeting expected lifetime requirements. In this paper, we evaluate the potential benefits and implementations of this technique by tracking the expected lifetime of a system under different workloads while accounting for the impact of dynamic voltage and temperature variations. Simulation results show that our dynamic reliability management (DRM) techniques provide a 40% performance penalty reduction over that incurred by pessimistic DTM in general-purpose computing and a 10% increase in quality of service (QoS) for servers, all while preserving the expected IC lifetime reliability.

CS-2005-08
Sankaranarayanan, Karthik; Velusamy, Sivakumar; Skadron, Kevin; Stan, Mircea. Microarchitectural Floorplanning for Thermal Management: A Technical Report, May 20, 2005.

Abstract: In current day microprocessors, exponentially increasing power densities, leakage, cooling costs,and reliability concerns have resulted in temperature becoming a first class design constraint like performance and power. Hence,virtually every high performance microprocessor uses a combination of an elaborate thermal package and some form of Dynamic Thermal Management (DTM) scheme that adaptively controls its temperature. While DTM schemes exploit the important variable of power density to control temperature,this paper attempts to show that there is a significant peak temperature reduction potential in managing lateral heat spreading through floorplanning. It argues that this potential warrants consideration of the temperature-performance trade-off early in the design stage at the microarchitectural level using floorplanning. As a demonstration,it uses previously proposed wire delay model and floorplanning algorithm based on simulated annealing to present a profile-driven, thermal-aware floorplanning scheme that significantly reduces peak processor temperature with minimal performance impact that is quite competitive with DTM.

CS-2005-02
Greenwell, William S.; Holloway, C. Michael; Knight, John C.. A Taxonomy of Fallacies in System Safety Arguments, December 10, 2004 (posted 19 July 2005).

Abstract: A system's safety argument is intended to show that the system is acceptably safe to operate in a given environment. If that argument is fallacious, the system may be prone to hazardous modes of operation that could contribute to accidents. We conducted a case study of three industrial safety cases to determine the frequency and nature of fallacious reasoning in system safety arguments. Our results suggest that the frequency of logical fallacies in these arguments is significant and that they follow common themes. To avoid these fallacies, developers must be aware of them when they create safety arguments, and regulators and investigators must know how to discover them when they review those arguments. We present a taxonomy of logical fallacies tailored to system safety cases to assist developers and regulators in these tasks and then demonstrate the taxonomy by applying it to the three safety cases from our case study.

CS-2005-01
Li, Y.; Skadron, K.; Brooks, D.; Hempstead, M.; Mauro, P.; Hu, Z.. Power and Thermal Effects of SRAM vs. Latch-Mux Design Styles and Clock Gating Choices, February 1, 2005.

Abstract: This paper studies the impact on energy efficiency and thermal behavior of design style and clock-gating style in queue and array structures. These structures are major sources of power dissipation, and both design styles and various clock gating schemes can be found in modern, high-performance processors. Although some work in the circuits domain has explored these issues from a power perspective, thermal treatments are less common, and we are not aware of any work in the architecture domain.

CS-2004-39
Velusamy, Siva; Huang, Wei; Lach, John; Skadron, Kevin.. Monitoring Temperature in FPGA based SoCs, December 20, 2004 (posted 4 March 2005).

Abstract: FPGA logic densities continue to increase at a tremendous rate. This has had the undesired consequence of increased power density, which manifests itself as higher on-die temperatures and local hotspots. Sophisticated packaging techniques have become essential to maintain the health of the chip. In addition to static techniques to reduce the temperature, dynamic thermal management techniques are essential. In this paper, we present the design of a system that monitors the temperatures at various locations on the FPGA. This system is composed of a controller interfacing to an array of temperature sensors that are implemented on the FPGA fabric. We discuss the issues involved in designing and interfacing to the sensors. We compare the sensor readings with values obtained from HotSpot, an architectural level thermal modeling tool. Finally we show how such a system can be used in two micro-architectural studies - determining the effect the splitting one high power density unit into components that can be spread over the chip, and analyzing the response of a system to thermal emergencies.

CS-2004-38
Tarjan, David; Skadron, Kevin. Merging Path and Gshare Indexing in Perceptron Branch Prediction, December 1, 2004.

Abstract: We introduce the hashed perceptron predictor, which merges the concepts behind the gshare, path-based and perceptron branch predictors. This predictor can achieve superior accuracy to a path-based and a global perceptron predictor, previously the most accurate dynamic branch predictors known in the literature. We also show how such a predictor can be ahead pipelined to yield one cycle effective latency. On 11 programs from the SPECint2000 set of benchmarks, the hashed perceptron predictor improves accuracy by up to 22 percent over a path-based perceptron and improves IPC by up to 6.5 percent.

CS-2004-36
Nguyen-Tuong, Anh ; Guarnieri, Salvatore ; Greene, Doug; Evans, David . Automatically Hardening Web Applications Using Precise Tainting, December 14, 2004.

Abstract: Most web applications contain security vulnerabilities. The simple and natural ways of creating a web application are prone to SQL injection attacks and cross-site scripting attacks (among other less common vulnerabilities). In response, many tools have been developed for detecting or mitigating common web application vulnerabilities. Existing techniques either require effort from the site developer or are prone to false positives. This paper presents a fully automated approach to securely hardening web applications. It is based on precisely tracking taintedness of data and checking specifically for dangerous content in only in parts of commands and output that came from untrustworthy sources. Unlike previous work in which everything that is derived from tainted input is tainted, our approach precisely tracks taintedness within data values. We describe our results and prototype implementation on the predominant LAMP (Linux, Apache, MySQL, PHP) platform.

CS-2004-35
Vicaire, Pascal A.; Stankovic, John A.. Elastic Localization, November 1, 2004.

Abstract: Numerous wireless sensor network algorithms assume that individual sensors possess location information. However, state of the art localization algorithms often achieve acceptable performance only under restrictive assumptions. For instance, some algorithms necessitate regular sensor deployment or centralized computations. Other algorithms require a high proportion of position aware nodes or the ability to accurately infer emission distance or emission direction of received radio signals. We propose the Elastic Localization Algorithm (ELA), a distributed, scalable, robust and efficient localization algorithm. ELA only presumes that a few percent of the sensors know their location and that an estimation of the maximum communication range is available. We provide extensive simulation data describing the precision, the convergence speed, and the communication load of ELA, using networks composed of thousands of sensors. In addition, we submit ELA to testing considering the influence of maximum range and beacon position misestimation, irregular radio patterns, asynchronous nodes, packet losses, particular topologies, and sensor mobility.

CS-2004-34
Horvath, Tibor; Abdelzaher, Tarek; Skadron, Kevin. Dynamic Voltage Scaling in Multi-tier Web Servers with End-to-end Delay Control, November 1, 2004.

Abstract: Energy and cooling costs of web server farms are among their main financial expenditures. This paper explores the benefits of dynamic voltage scaling (DVS) for power management in server farms. Unlike previous work, which addressed DVS on individual servers and on load-balanced server replicas, this paper addresses DVS in multi-stage service pipelines. Contemporary Web server installations typically adopt a three-tier architecture in which the first tier presents a Web interface, the second executes scripts that implement business logic, and the third serves database accesses. From a user's perspective, only the end-to-end response across the entire pipeline is relevant. This paper presents an algorithm for minimizing the total energy expenditure of the multi-stage pipeline subject to soft end-to-end response-time constraints. A distributed power management service is designed and evaluated on a real 3-tier server prototype for coordinating DVS settings in a way that minimizes global energy consumption while meeting end-to-end delay constraints. The service is shown to consume as much as 30 percent less energy compared to the default (Linux) energy saving policy.

CS-2004-33
Natrajan, Anand; Grimshaw, Andrew S.; Humphrey, Marty A.; Nguyen-Tuong, Anh. Dispelling Seven Myths about Grid Resource, August 30, 2004.

Abstract: Grid resource management is often viewed as scheduling for large, long-running, compute-intensive, parallel applications. This view is narrow, as we will argue in this article. Grids today encompass diverse resources including machines, users and applications. Moreover, grid users make different demands from different resources. Therefore, grid infrastructures must adopt a greater responsibility than before for managing resources. Grid resource management cannot mean just scheduling jobs on the fastest machines, but must also include scheduling special jobs on matched machines, preserving site autonomy, determining usage policies, respecting permissions for use and so on. In this article, we will present seven "myths" or common beliefs about grid resource management, and dispel each myth by presenting counter-examples or "observations". These observations have been culled from our experience as well as work done by several experts in major grid-related projects. In order to relate these observations to concrete implementation, we will also present grid resource management in Legion, a grid infrastructure project initiated at the University of Virginia.

CS-2004-31
Co, Michele; Skadron, Kevin. Evaluating Trace Cache Energy-Efficiency, October 10, 2004.

Abstract: Future fetch engines need to be energy-efficient. Therefore, a thorough evaluation and comparison of fetch engine design is necessary for futuristic processors. Our work compares the energy-efficiency of concurrent trace caches, sequential trace caches (STCs), block-based trace caches (BBTCs), and instruction caches (ICs). We compare: CTCs and STCs with path-based next trace predictor (NTP), ICs with branch predictor, and BBTCs with trace table. To separate out predictor organization and prediction effects we also evaluate ICs with NTP and BBTCs with NTP. In our experiments, we first evaluate the fetch engines with no area budget restrictions. Then, to consider higher clock rates we evaluate the fetch engines when restricting the area budget for each component. To consider future process technologies, we also evaluate the effect of increased leakage. We find that branch prediction (whether explicit or implicit) is a key component in the energy-efficiency of the fetch engine designs evaluated. Branch prediction effects are eliminated by artificially equalizing the effective branch prediction accuracy for the fetch engine designs and the results are evaluated. We find that access delay limits the theoretical performance of the fetch engines evaluated. We propose a novel ahead pipelined NTP that performs nearly as well as the single-cycle access NTP.

CS-2004-30
Rajan, Hridesh; Sullivan, Kevin. Generalizing AOP for Aspect-Oriented Testing, September 30, 2004 .

Abstract: In profiling program execution to assess the adequacy of a test set, a challenge is to select the code to be included in the coverage assessment. Current mechanisms for doing this are coarse-grained, assume traditional concepts of modularity, require tedious and error-prone manual selection, and leave the tester's intent implicit in the input provided to the testing tools. The aspect-oriented constructs of languages such as AspectJ promise to help ease and extend our ability to select the code to be included in a test adequacy criterion and to document the tester's intent within the source code itself. Our contribution is a language-centric approach to automated test adequacy analysis that we call concern coverage. We claim that our approach enables explicit, precise, abstract, and machine-readable representation of the tester's intent and that it can ease testing by eliminating the need for manual selection and explicit maintenance of test adequacy criteria.

CS-2004-29
Paul, Nathanael; Evans, David. .NET Security: Lessons Learned and Missed from Java, October 9, 2004.

Abstract: Many systems execute untrusted programs in virtual machines (VMs) to limit their access to system resources. Sun introduced the Java VM in 1995, primarily intended as a lightweight platform for execution of untrusted code inside web pages. More recently, Microsoft developed the .NET platform with similar goals. Both platforms share many design and implementation properties, but there are key differences between Java and .NET that have an impact on their security. This paper examines how .NET's design avoids vulnerabilities and limitations discovered in Java and discusses lessons learned (and missed) from Java's experience with security.

CS-2004-28
Tarjan, David; Skadron, Kevin. Revisiting the Perceptron Predictor Again, September 23, 2004.

Abstract: We introduce a new kind of branch predictor, the hashed perceptron predictor, which merges the concepts behind the gshare and perceptron branch predictors. This is done by fetching the perceptron weights using the exclusive-or of branch addresses and branch history. This predictor can achieve superior accuracy to a path-based and a global perceptron predictor, previously the most accurate fully dynamic branch predictors known in the literature, at the same storage budgets. Additionally, it reduces the number of adders by a factor of four compared to a path-based perceptron. We also show how such a predictor can be ahead pipelined to yield one cycle effective latency, making it the first standalone perceptron predictor. On the SPEC integer set of benchmarks, the hashed ahead-pipelined path-based perceptron predictor (hashed perceptron for short) improves accuracy by 20% over a path-based perceptron and improves IPC by 5.8%. We believe these improvements make the perceptron predictor a promising choice as a branch predictor for a future high-performance microprocessor.

CS-2004-21
Rajan, Hridesh; Sullivan, Kevin. Classpects: Unifying Aspect- and Object-Oriented Language Design, Sept 1, 2004.

Abstract: The contribution of this work is a compositional, modular building block for program design that unifies classes and aspects as they appear in today's most successful aspect-oriented languages. We call it the classpect. We make three basic claims for this idea: first, it can be realized cleanly in an industrial-strength language; second, it improves the conceptual integrity of the programming model; third, its compositionality significantly expands the useful program design space. To test these claims we designed and implemented a language supporting this model. Eos-U extends C#, has all of the aspect-oriented capabilities of AspectJ, but also unifies classes and aspects. We then exhibit a layered architectural style for modular design of integrated systems, and we show that, unlike earlier aspects, classpects cleanly support this design style.

CS-2004-19
Cai,Yuanfang; Sullivan, Kevin J.. Software Design Spaces: Logical Modeling and Formal Dependence Analysis, September 1, 2004 (Revised 24 September 2004).

Abstract: We lack a useful, formal theory of modularity in abstract software design. A missing key is a framework for the abstract representation of software design spaces that supports analysis of design decision coupling structures. We contribute such a framework. We represent design spaces as constraint networks and develop a concept of design decision coupling based on the minimal change sets of a variable. This work supports derivation, from logical models, of design structure matrices (DSM's), for which we have a promising but inadequate theory of modularity. We present complexity results and a brute force algorithm. To test for potential software engineering utility, we analyzed the design spaces of Parnas's 1972 information hiding paper, with positive results that were surprising in several ways.

CS-2004-18
John C. Giordano, Paul F. Reynolds Jr., David C. Brogan. Exploring the Constraints of Human Behavior Representation, March 12, 2004.

Abstract: Human behavior representation (HBR) is an elusive, yet critical goal for many in the simulation community. Requirement specifications related to HBR often exceed current capabilities. There exist a number of tools, techniques and frameworks to model and simulate HBR, but they are constrained and do not generalize well. Even with a vibrant research community, certain HBR characteristics remain beyond our grasp, unless some unforeseen disruptive technologies emerge. We survey the state of the practice for HBR, discuss ongoing research, and identify what appear to be insurmountable challenges. Along with exposing the essential characteristics of HBR and their current level of maturity, we propose a generational framework for considering HBR capabilities. While a number of HBR issues have been addressed in the literature, there is no published discussion explicitly detailing its constraints and limitations. (NOTE: This work will appear in the proceedings of the 2004 ACM/IEEE Winter Simulation Conference.)

CS-2004-17
Vieira, Sibelius Lellis; Liebeherr, Jorg. An Algorithmic Approach to Topological Design of Service Overlay Networks, May 2004.

Abstract: The Internet still lacks adequate support for QoS applications with real-time requirements. In great part, this is due to the fact that provisioning of end-to-end QoS to traffic that traverses multiple autonomous systems (ASs) requires a level of cooperation between ASs that is difficult to achieve in the current architecture. Recently, service overlay networks have been considered as an approach to QoS deployment that avoids these difficulties. In this study, we address the problem of the topological synthesis of a service overlay network, where endsystems and nodes of the overlay network (provider nodes) are connected through ISPs that supports bandwidth reservations. We express the topology design problem as an optimization problem. Even though the design problem is related to the (in general NP-hard) quadratic assignment problem, we are able to show that relatively simple heuristic algorithms can deliver results that are sometimes close to the optimal solution.

CS-2004-15
Huang, Chengdu; Abdelzaher, Tarek F.. Towards Content Distribution Networks with Latency Guarantees, April 15, 2004.

Abstract: This paper investigates the performance of a content distribution network designed to provide bounded content access latency. Content can be divided into multiple classes with different configurable per-class delay bounds. The network uses a simple distributed algorithm to dynamically select a subset of its proxy servers for different classes such that a global per-class delay bound is achieved on content access. The content distribution algorithm is implemented and tested on PlanetLab, a world-wide distributed Internet testbed. Evaluation results demonstrate that despite Internet delay variability, subsecond delay bounds (of 200-500ms) can be guaranteed with a very high probability at only a moderate content replication cost. The distribution algorithm achieves a 4 to 5 fold reduction in the number of response-time violations compared to prior content distribution approaches that attempt to minimize average latency. To the authors' knowledge, this paper presents the first wide-area performance evaluation of an algorithm designed to bound maximum content access latency, as opposed to optimizing an average performance metric.

CS-2004-14
Sankaranarayanan, Karthik; Skadron, Kevin. Profile-Based Adaptation for Cache Decay: A Technical Report, April 8, 2004.

Abstract: Cache decay is a set of leakage-reduction mechanisms that put cache lines that have not been accessed for a specific duration into a low-leakage standby mode. This duration is called the decay interval, and its optimal value varies across applications. This paper provides an extended discussion of the results previously presented in our journal paper. It describes an adaptation technique that analytically finds the optimal decay interval through profiling, and shows that the most important variables required for finding the optimal decay interval can be estimated with a reasonable degree of accuracy using profiling. This work explicitly trades off the leakage power saved in putting both the live and dead lines into standby mode, against its performance and energy costs. It achieves energy savings close to what can be obtained with an omniscient choice of per-benchmark optimal decay interval.

CS-2004-13
Huang, Wei; Stan, Mercia R.; Skadron, Kevin; Sankaranarayanan, Karthik; Ghosh, Shougata; Velusamy, Sivakumar. Compact Thermal Modeling for Temperature-Aware Design, April 2004.

Abstract: Thermal design in sub-100nm technologies has become one of the major challenges to the CAD community. Thermal effects on design aspects such as performance, power and reliability have to be thoroughly and seriously considered during the entire design flow in order to achieve faster design convergence and optimal design. This paper expands the discussion in our conference paper (DAC 2004). We first introduce the idea of temperature-aware design. We then propose a compact thermal model which can be integrated with modern CAD tools to achieve a temperature-aware design methodology. Finally, we use the compact thermal model in a case study of microprocessor design to show the importance of using temperature as a guideline for the design. The results from our thermal model show that a temperature-aware design approach can provide more accurate estimations, and therefore better decisions and faster design convergence.

CS-2004-11
Nguyen-Tuong, A.; Grimshaw, A. S.; Wasson, G.; Humphrey, M.; Knight, J. C.. Towards Dependable Grids, March 26, 2004.

Abstract: To date, grids (a form of distributed system) have been used to aggregate resources for performance-starved applications typically resulting from scientific enquiry. Grids should not just be facilitating advances in science and engineering; rather they should also be making an impact on our daily lives by enabling sophisticated applications such as new consumer services and support for homeland defense. For example, grids providing financial services should be seamlessly interconnected with grids that support telecommunications, and grids providing health care services should be seamlessly interconnected with both financial services and telecommunications. This is not possible today because the poor grid dependability — which is tolerated by scientific users — would be unacceptable in critical infrastructure applications. This project aims at correcting this problem by developing technology that will allow grids to be used to provide services upon which society can depend. Grids must be engineered both to achieve high dependability and to permit assurance that high dependability has been achieved.

CS-2004-10
Nagaraddi, Prashant; Yu, Z.; Stankovic, J.; He, Z.. VEST User's Manual, June 17, 2004.

Abstract: VEST (Virginia Embedded Systems Toolkit) provides an environment for constructing and analyzing component-based distributed real-time embedded systems. In this manual, we describe the process of installing and running VEST. It includes descriptions of all the features of the tool including the various analyses that can be performed on VEST designs. A comprehensive tutorial on learning the basics of using VEST is also provided.

CS-2004-09
Christin, Nicolas; Liebeherr, Jorg; Abdelzaher, Tarek F.. Enhancing Class-Based Service Architectures with Adaptive Rate Allocation and Dropping Mechanisms, March 25, 2004.

Abstract: Class-based service differentiation can be realized without resource reservation, admission control and traffic policing. However, the resulting service guarantees are only relative, in the sense that guarantees given to a flow class at any time are expressed with reference to the service given to other flow classes. While it is, in principle, not feasible to provision for absolute guarantees (i.e., to assure lower bounds on service metrics at all times) without admission control and/or traffic policing, we will show in this paper that such a service can be reasonably well emulated using adaptive rate allocation and dropping mechanisms at the link schedulers of routers. We name the resulting type of guarantees best-effort bounds. We propose mechanisms for link schedulers of routers that achieve these and other guarantees by adjusting the drop rates and the service rate allocations of traffic classes to current load conditions. The mechanisms are rooted in control theory and employ adaptive feedback loops. We demonstrate that these mechanisms can realize many recently proposed approaches to class-based service differentiation. The effectiveness of the proposed mechanisms are evaluated in simulation experiments as well as in measurement experiments of a kernel-level implementation on FreeBSD PC-routers with multiple 100Mbps Ethernet interfaces.

CS-2004-08
Lu, Zhijian; Huang, Wei; Ghosh, Shougata ; Lach, John; Stan, Mircea ; Skadron, Kevin. Analysis of Temporal and Spatial Temperature Gradients for IC Reliability, March 22, 2004 (withheld until March 4, 2005).

Abstract: One of the most common causes of IC failure is interconnect electromigration (EM), which exhibits a rate that is exponentially dependent on temperature. As a result, EM rate is one of the major determinants of the maximum tolerable operating temperature for an IC and of resulting cooling costs. Previous EM models have assumed a uniform, typically worst-case, temperature. This paper presents a model that accounts for temporal and spatial variations in temperature, and shows that accounting for these variations can dramatically improve interconnect lifetime prediction accuracy. We also show that the same modeling approach applies to temperature-related gate-oxide breakdown, another common cause of IC failure. We then propose that by modeling expected lifetime as a resource that is consumed over time at a rate dependent on temperature, substantial design margin can be reclaimed. For example, for a fixed target lifetime, intermittent higher operating temperatures and performance can be tolerated if compensated by lower temperatures at other times during the product's lifetime. This approach offers higher overall performance and/or lower cooling costs than a standard design methodology that uses a worst-case temperature criterion for reliability analysis. This report supersedes TR CS-2003-21 and TR CS-2004-07.

CS-2004-07
Huang, Wei; Lu, Zhijian; Ghosh, Shougata; Lach, John; Stan, Mircea ; Skadron, Kevin. The Importance of Temporal and Spatial Temperature Gradients in IC Reliability Analysis, January 6, 2004 (withheld until March 23, 2004).

Abstract: Existing IC reliability models assume a uniform, typically worst-case,operating temperature, but temporal and spatial temperature variations affect expected device lifetime. This paper presents a model that accounts for temperature gradients, dramatically improving interconnect and gate-oxide lifetime prediction accuracy. By modeling expected lifetime as a resource that is consumed over time at a temperature-dependent rate, substantial design margin can be reclaimed and/or less expensive cooling systems may be used.

CS-2004-04
Frost, Christopher; Peck, Michael; Evans, David. Pancakes, Puzzles, and Polynomials: Cracking the Cracker Barrel, March 2004.

Abstract: The Cracker Barrel peg game is a simple, one-player game commonly found on tables at pancake restaurants. In this paper, we consider the computational complexity of the Cracker Barrel problem. We show that a variation of a generalization of the problem is NP-complete.

CS-2004-03
Evans, David; Peck, Michael. Simulating Critical Software Engineering, February 2004.

Abstract: One goal of many introductory software engineering courses is to simulate realistic software engineering. Unfortunately, many of the practical constraints of typical courses are antithetical to that goal: instead of working in large teams on large projects, dealing with changing requirements and maintaining programs over many years, courses generally involve students working alone or in small teams with short projects than end the first time the program works correctly on some selected input. Of course, it is impossible (and undesirable) to carry out full industrial software development within the context of a typical university course. However, it is possible to simulate some aspects of safety critical software engineering in an introductory software engineering course. This paper describes an approach to teaching introductory software engineering that focuses on using lightweight analysis tools to aid in producing secure, robust and maintainable programs. We describe how assignments were designed that integrate analysis tools with typical software engineering material and reports on results from an experiment measuring students understanding of program invariants.

CS-2003-21
Lu, Zhijian; Stan, Mircea ; Lach, John; Skadron, Kevin. Interconnect Lifetime Prediction for Temperature-Aware Design, November 26, 2003.

Abstract: Thermal effects are becoming a limiting factor in high-performance circuit design due to the strong temperature-dependence of leakage power, circuit performance, IC package cost and reliability. Temperature-aware design tries to maximize performance under a given thermal envelope through various static and dynamic approaches. While existing interconnect reliability models assume a constant temperature, this paper presents a technique for probabilistically estimating interconnect lifetime for any time-varying temperature profile. With this formulation, interconnect lifetime can be modeled as a resource that is consumed over time, with the rate of consumption being a function of temperature. As a result, designers may be more aggressive in the temperature profiles that are allowed on a chip instead of using static worst-case assumptions. For example, performance (hence power and temperature) may be increased beyond what is allowed by worst-case restrictions for short periods as long as the increase is compensated for later by lower activity. With this model, temperature-aware designs will achieve higher overall performance while satisfying lifetime requirements.

CS-2003-20
Li, Chengzhi; Burchard, Almut; Liebeherr, Jorg. A Network Calculus with Effective Bandwidth, November 2003.

Abstract: This paper establishes a link between two principal tools for the analysis of network traffic, namely, effective bandwidth and network calculus. It is shown that a general formulation of effective bandwidth can be expressed within the framework of a probabilistic version of the network calculus, where both arrivals and service are specified in terms of probabilistic bounds. By formulating well-known effective bandwidth expressions in terms of probabilistic envelope functions, the developed network calculus can be applied to a wide range of traffic types, including traffic that has self-similar characteristics. As applications, probabilistic lower bounds are presented on the service given by three different scheduling algorithms: Static Priority (SP), Earliest Deadline First (EDF), and Generalized Processor Sharing (GPS). Numerical examples show the impact of the traffic models and the scheduling algorithm on the multiplexing gain in a network.

CS-2003-19 (Not available on-line)
Co, Michele; Skadron, K.. Evaluating the Energy Efficiency of Trace Caches, October 28, 2003.

Abstract: This paper presents the preliminary results of a sequential trace cache design space study as well as the results of a path-based next trace predictor design space study. Findings show that sequential trace caches are highly energy and power-efficient. Fetch engines which include a sequential trace cache provide higher performance for approximately equal area at a significant energy and power savings. We find that when examining performance and average fetch power, fetch engines with trace caches may not seem appealing, but when examining energy-delay and energy-delay-squared, the benefits of a trace cache become clear. Even if average fetch power is increased due to the increased fetch engine area, the energy-efficiency is still improves with a trace cache due to faster execution and more opportunities for clock gating, making the trace cache superior in terms of energy-delay and energy-delay-squared products. Our exploration of the Jacobson et al. path-based next trace predictor design space indicates that a hybrid next trace predictor consisting of a 1KB correlating table, 8KB secondary table, and 4-entry return history stack would be the best choice in terms of power, performance, and energy. [This report is temporarily unavilable. Please contact micheleco@cs.virginia.edu for further details.]

CS-2003-18
Hirst, Kevin R.; Haskins, John W. Jr.; Skadron, Kevin. dMT: Inexpensive Throughput Enhancement in Small-Scale Embedded Microprocessors with Differential Multithreading: Extended Results, Oct. 2003.

Abstract: This paper examines differential multithreading (dMT) as an attractive organization for increasing throughput in simple, small-scale, pipelined processors like those used in embedded environments. dMT copes with pipeline stalls due to hazards and data- and instruction-cache misses by using duplicated pipeline registers to run instructions from an alternate thread. Results show that dMT boosts throughput substantially and can in fact replace dynamic branch prediction or can be used to reduce the sizes of the instruction and data caches. This report expands upon [1] by presenting extended results for additional dMT configurations.

CS-2003-15
Li, Yingmin; Skadron, Kevin; Stan, Mircea, R.. New Findings on Using Queue Occupancy to Integrate Runtime Power-Saving Techniques Across the Pipeline, July 20, 2003.

Abstract: This paper provides new insights on how to integrate power-saving techniques by using queue occupancies to dynamically match the power-saving modes of various pipeline stages with the current instruction throughput. (This paper focuses on fetch, decode, integer execution, and data cache.) Architects have proposed many runtime power-saving techniques, most of which reduce power dissipation in a single microarchitectural unit. But very little work has been done to integrate these disparate techniques to ensure that they cooperate rather than interfering with each other. We use both queuing theory and experimental results to justify the use of superscalar decoupling queues to guide dynamic control of power settings. This permits integrated power control for multiple units across the pipeline, with minimal negative interaction, by matching the throughput of each stage and the application's current instruction-level parallelism. Our findings verify but also improve upon those in previous work by Semerraro et al. In particular, our approach is robust in jumping out of the bad power modes configuration incurring radical performance degradation, and our approach allows the fetch stage (a significant source of power dissipation) to realize power savings, something that prior integrated, queue-based techniques have not been able to accomplish.

CS-2003-14
Hill, Jonathan; Knight, John C.. Selective Notification: Combining Forms of Decoupled Addressing for Internet-Scale Command and Alert Dissemination, April 11, 2003.

Abstract: By using an information survivability control system, the survivability of critical networked information systems can be enhanced using a variety of fault-tolerance mechanisms. Essential to the effective implementation of such mechanisms is communication from the error detection component to the various application nodes in the network. In this paper, we introduce a technique called Selective Notification for the communication of commands and alerts in very large distributed systems. The technique combines intentional addressing, content addressing and sender qualification in a single decoupled event-delivery mechanism. We show that effective targeted command and alert dissemination is achievable, and that Selective Notification allows systems to apply a wide range of event connectivity policies. We present details of an implementation of Selective Notification and the results of performance assessment experiments. Based on our preliminary performance data, we conclude that Selective Notification can be used to support survivability architectures in Internet-sized systems.

CS-2003-13
Larochelle, David; Rosasco, Nicholas. Towards a Model of the Costs of Security, June 2003.

Abstract: We present a simple information security model to determine why, historically, the level of security has not increased despite numerous technical advances. In our model, the software design process involves trade-offs between security and functionality. Developers choose points in the design space corresponding to certain levels of security and functionality. If development resources, such as number of developers, time for completion, etc., are fixed, there is an implicit trade-off between security and functionality. We refer to the set of points that represent the maximum possible security given a certain level of functionality as the protection possibilities frontier (PPF). Technical advances push back the PPF expanding the set of accessible points in the design space potentially allowing both increased security and increased functionality. But historically this has not been sufficient to result in increased security. Instead almost all of the technical advancement is used to increase functionality. We examine how technical advances affect the marginal cost of security in terms of sacrificed functionality and classify technical advances into 3 categories: security neutral, security hostile, and security enhancing. In order for the level of security to increase, security enhancing technical advances must offset security hostile technical advances. We also observe that producing security enhancing technologies is surprisingly difficult. Even advances in information security technologies often result in the ability to take additional risks rather than increased security. Additionally we briefly examine user preferences, which to a large extant drive the actions of developers. We suggest that the lack of security cannot be explained purely by consumer apathy and that limited product availability and network externalities also contribute.

CS-2003-12
Greenwell, William S.; Knight, John C.. What Should Aviation Safety Incidents Teach Us?, March 20, 2003.

Abstract: Accidents and incidents involving safety-critical software systems often provide lessons to the systems users and designers, to industry, and to the software engineering community at large. Proper identification and documentation of these lessons is critical in order to prevent the recurrence of an untoward event. In this paper we examine two commercial aviation incidents involving failures of safety-critical software systems. Based on our analysis of the incidents and the official investigations that followed, we conclude that the aviation community is missing important lessons regarding safety-critical software systems, especially concerning the broad role these systems play in preserving the safety of commercial air travel. This is primarily because incidents involving such systems are not being investigated and documented with sufficient rigor to identify these lessons and disseminate them throughout the aviation community effectively.

CS-2003-08
Skadron, K.; Stan, M. R.; Huang, W.; Velusamy, S.; Sankaranarayanan, K.; Tarjan, D.. Temperature-Aware Microarchitecture: Extended Discussion and Results, April 2003.

Abstract: With power density and hence cooling costs rising exponentially, processor packaging can no longer be designed for the worst case, and there is an urgent need for runtime processor-level techniques that can regulate operating temperature when the package's capacity is exceeded. Evaluating such techniques, however, requires a thermal model that is practical for architectural studies. This paper expands upon the discussion and results that were presented in our conference paper. It describes "HotSpot", an accurate yet fast model based on an equivalent circuit of thermal resistances and capacitances that correspond to microarchitecture blocks and essential aspects of the thermal package. Validation was performed using finite-element simulation. The paper also introduces several effective methods for dynamic thermal management (DTM): "temperature-tracking" frequency scaling, localized toggling, and migrating computation to spare hardware units. Modeling temperature at the microarchitecture level also shows that power metrics are poor predictors of temperature, that sensor imprecision has a substantial impact on the performance of DTM, and that the inclusion of lateral resistances for thermal diffusion is important for accuracy.

CS-2003-05
Zhang, Y.; Parikh, D.; Sankaranarayanan, K.; Skadron, K.; Stan, M. R.. HotLeakage: A Temperature-Aware Model of Subthreshold and Gate Leakage for Architects, March 2003.

Abstract: This report introduces HotLeakage, an architectural model for subthreshold and gate leakage that we have developed here at the University of Virginia. The most important features of HotLeakage are the explicit inclusion of temperature, voltage, gate leakage, and parameter variations, and the ability to recalculate leakage currents dynamically as temperature and voltage change due to operating conditions, DVS techniques, etc. HotLeakage provides default settings for 180nm through 70nm technologies for modeling cache and register files, and provides a simple interface for selecting alternate parameter values and for modeling alternative microarchitecture structures. It also provides models for several extant cache leakage control techniques, with an interface for adding further techniques. HotLeakage is currently a semi-independent module for use with SimpleScalar, but is sufficiently modular that it should be fairly easy to port to other simulators. Because sub-threshold leakage currents are exponentially dependent on temperature and voltage, because gate leakage is growing so rapidly, and because parameter variations can have a profound effect on simulation accuracy, we hope that HotLeakage will serve as a useful tool for microarchitects to more accurately evaluate issues related leakage power. HotLeakage is available for download at http://lava.cs.virginia.edu/HotLeakage.

CS-2003-11
Blum, Brian; He, Tian; Son, Sang; Stankovic, John. IGF: A State-Free Robust Communication Protocol for Wireless Sensor Networks, April 21, 2003.

Abstract: Wireless Sensor Networks (WSNs) are being designed to solve a gamut of interesting real-world problems. Limitations on available energy and bandwidth, message loss, high rates of node failure, and communication restrictions pose challenging requirements for these systems. Beyond these inherent limitations, both the possibility of node mobility and energy conserving sleep protocols that power down nodes introduce additional complexity to routing protocols that depend on up to date routing or neighborhood tables. Such state-based protocols suffer excessive delay or message loss, as system dynamics require expensive upkeep of these tables. Utilizing characteristics of high node density and location awareness, we introduce IGF, a location-aware routing protocol that is robust and works without knowledge of the existence of neighboring nodes (state-free). We compare our work against established routing protocols to demonstrate the efficacy of our solution when nodes are mobile or periodically sleep to conserve energy. We show that IGF far outperforms these protocols, in some cases delivering close to 100% of the packets transmitted while alternate solutions fail to even find a path between a source and destination. Specifically, we show that our protocol demonstrates a vast improvement over prior work using metrics of delivery ratio, control overhead, and end-to-end delay.

CS-2003-10
French, James C.; Watson, J. V. S.; Jin, X.; Martin, W. N.. Using Multiple Image Representations to Improve the Quality of Content-Based Image Retrieval, March 2003.

Abstract: Content-based image retrieval (CBIR) has been the object of considerable study since the early 90s. Much effort has gone into characterizing the "content" of an image for the purpose of subsequent retrieval. The present study seek to capitalize on this work and to extend it by employing content-analysis of multiple representations of an image which we term multiple viewpoints or channels. The idea is to place each image in multiple feature spaces and then effect retrieval by querying each of these spaces and merging the several responses. We show that a simple realization of this strategy can be used to boost the retrieval effectiveness of conventional CBIR.

CS-2003-06
He, Tian; Huang, Chengdu; Blum, Brian; Stankovic, John A.; Abdelzaher, Tarek. Range-Free Localization Schemes for Large Scale Sensor Networks, April 21, 2003.

Abstract: Wireless Sensor Networks have been proposed for a multitude of location-dependent applications. For such systems, the cost and limitations of the hardware on sensing nodes prevent the use of range-based localization schemes that depend on absolute point-to-point distance estimates. Because coarse accuracy is sufficient for most sensor network applications, solutions in range-free localization are being pursued as a cost-effective alternative to more expensive range-based approaches. In this paper, we present APIT, a novel localization algorithm that is range-free. We show that our APIT scheme performs best when an irregular radio pattern and random node placement are considered, and low communication overhead is desired. We compare our work via extensive simulation, with three state-of-the-art range-free localization schemes to identify the preferable system configurations of each. In addition, we study the effect of location error on routing and tracking performance. We show that routing performance and tracking accuracy are not significantly affected by localization error when the error is less than 0.4 times the communication radio radius.

CS-2003-04
Christin, N.; Liebeherr, J.. Marking Algorithms for Service Differentiation of TCP Traffic, February 21, 2003.

Abstract: Class-based service architectures for quality-of-service (QoS) differentiation typically provide loss, throughput, and delay differentiation. However, proposals for class-based QoS differentiation generally do not account for the needs of TCP traffic, which are characterized by a coupling of packet losses and achievable throughput.Ignoring this coupling may result in poor service differentiation at the microflow level. This paper shows how Explicit Congestion Notification (ECN) can be used to achieve service differentiation for TCP traffic classes at the microflow level. We present a traffic marking algorithm for routers, which, if used in conjunction with ECN, regulates thet ransmission rate of TCP sources in such a way that packet dropsd ue to buffer overflows are avoided. We demonstrate how the algorithm can be integrated in a service architecture with absolute and proportional QoS guarantees. Simulation results illustrate the effectiveness of the presented algorithms at avoiding packet losses and regulating traffic for meeting service guarantees, and provide a comparison with other algorithms proposed in the literature.

CS-2003-02
He, Tian ; Blum, Brian M. ; Stankovic, John A ; Abdelzaher, Tarek. AIDA: Application Independent Data Aggregation in Wireless Sensor Networks, Jan 12, 20003.

Abstract: Sensor networks, a novel paradigm in distributed wireless communication technology, have been proposed for use in various applications including military surveillance and environmental monitoring. These systems could deploy heterogeneous collections of sensors capable of observing and reporting on various dynamic properties of their surroundings in a time sensitive manner. Such systems suffer bandwidth, energy, and throughput constraints that limit the quantity of information transferred from end to end. These factors cou-pled with unpredictable traffic patterns and dynamic network topologies make the task of designing optimal protocols for such networks difficult. Mechanisms to perform data centric aggregation utilizing application specific knowledge provide a means to augmenting throughput, but have limitations due to their lack of adap-tation and reliance on application specific decisions. We therefore propose a novel aggregation scheme that adaptively performs application independent data aggregation in a time sensitive manner. Our work isolates aggregation decisions into a module that resides between the network and the data link layer and does not require any modifications to the currently existing MAC and network layer protocols. We take advantage of queuing delay and the broadcast nature of wireless communication to concatenate network units into an ag-gregate using a novel adaptive feedback scheme to schedule the delivery of this aggregate to the MAC layer for transmission. In our evaluation we show that end-to-end transmission delay is reduced by as much as 80% under heavy traffic loads. Additionally, we show as much as a 50% reduction in transmission energy con-sumption with the addition of only 2 bytes of header overhead per network unit. Theoretical analysis, simula-tion, and a test-bed implementation on Berkeley¡¯s MICA motes are provided to validate our claims.

CS-2003-01
Highley, Timothy; Reynolds, Paul. Marginal Cost-Benefit Analysis for Predictive File Prefetching, January, 2003.

Abstract: File prefetching can reduce file access latencies and improve overall performance. Prefetching can be especially important in on-line software on demand, where network latencies create unacceptable delays. Prefetching involves predicting future accesses and establishing when/whether to prefetch, based on future access predictions. Cost-benefit analysis (CBA) addresses when/whether to prefetch and it addresses the interaction between prefetching and caching. CBA weighs the expected benefits of file prefetching and the cost of expected buffer usage. We describe 1-Marginal CBA, an approach that employs probabilistic predictions, as opposed to deterministic hints. We present a probabilistically optimal, though intractable, algorithm, Opt, for a representative prediction model, and demonstrate that any other optimal algorithm under that model will also be intractable. We argue that in many circumstances 1-Marginal and Opt will make the same decisions. Finally, we present simulation results in which 1-Marginal reduced I/O time by an average of 19% and a maximum of 49% over other prefetching schemes in the literature, even when using the same predictor. Since the cost of 1-Marginal is comparable to that of other published algorithms the improvement is real.

CS-2002-28
Kang, K.D.; Son, S.H.; Stankovic, J.A.. Specifying and Managing Quality of Real-Time Data Services, September 18, 2002.

Abstract: The demand for real-time data services is increasing in many applications including e-commerce, agile manufacturing, and telecommunication network management. In these applications, it is desirable to execute transactions within their deadlines, i.e., before the real-world status changes, using fresh (temporally consistent) data. However, it is challenging to meet these fundamental requirements due to dynamic workloads and data access patterns in these applications and potential conflicts between transaction timeliness and data freshness requirements. In this paper, we define average/transient deadline miss ratio and new data freshness metrics to specify the required quality of real-time data services. We also present a novel QoS management architecture to support the required QoS even in the presence of unpredictable workloads and access patterns. To prevent overload and support the required QoS, the presented architecture applies feedback control, admission control, and flexible freshness management schemes. A simulation study shows that our QoS-aware approach can support the required QoS, whereas baseline approaches fail to support the required miss ratio and/or freshness. Further, our approach shows a comparable performance to the theoretical oracle that is privileged by a complete future knowledge of data accesses.

CS-2002-27
Liebeherr, Jorg; Patek, Stephen D.; Burchard, Almut. Statistical Per-Flow Service Bounds in a Network with Aggregate Provisioning, July 2002.

Abstract: Scalability concerns of QoS implementations have stipulated service architectures where QoS is not provisioned separately to each flow, but instead to aggregates of flows. This paper determines stochastic bounds for the service experienced by a single flow when resources are managed for aggregates of flows and when the scheduling algorithms used in the network are not known. Using a recently developed statistical network calculus, per-flow bounds can be calculated for backlog, delay, and the burstiness of output traffic.

CS-2002-23
Zhong, Weilin; Evans, David. When Ants Attack: Security Issues for Stigmergic Systems, April, 2002.

Abstract: Stigmergic systems solve global problems by using indirect communication mediated by an environment. Because they are localized and dynamic, stigmergic systems are self-organizing, robust and adaptive. These properties are useful for creating survivable systems, but stigmergic systems also raise new security concerns. Indirect communication makes systems more vulnerable in an open and hostile environment, and feedback mechanisms common to stigmergic algorithms can be exploited by attackers. In this paper we use AntNet, an adaptive routing algorithm inspired by biological ant foraging, to explore some of the security issues for stigmergic systems. We identify possible attacks and analyze their potency. We propose and evaluate mechanisms for defending against these attacks.

CS-2002-35
Larochelle, David; Scheidt, Karl; Sullivan, Kevin. Join Point Encapsulation, January 2003 (posted 16 June 2004).

Abstract: At the heart of aspect-oriented programming is the exposure of certain phenomena in the execution of one set of program elements to behavioral modifications specified by other elements. The phenomena are join points. The modifying elements are aspects. The problem that we address is that current aspect-oriented languages do not provide adequate means to control the exposure of join points for behavioral modification by aspects. Rather, these languages define certain classes of join points (e.g., method calls) and expose all instances thereof. In a nutshell, then, current languages lack join point encapsulation mechanisms. We map a solution space and describe one proof-of-concept design that we have implemented for AspectJ. A key feature of out design is that it is, itself, aspect-oriented: We use AspectJ's pointcut language to identify cross-curring sets of join points to be encapsulated. The potential benefits of such a modularity-supporting mechanism include improved ease of reasoning about program behavior and the ability to assure the absence of side-effects through enforced non-interference by aspect code, and improved ease of design evolution.

CS-2002-32
Hill, Jonathan C.; Knight, John C.; Crickenberger, Aaron M.; Honhard, Richard. Publish and Subscribe with Reply, October 23, 2002.

Abstract: Reply for Publish and Subscribe allows receivers of a publication to reply to the publisher. We demonstrate that Reply is a natural, efficient, and useful component of Publish/Subscribe. It is natural because it maintains the weakest possible coupling between senders and receivers. It is efficient because it stores computations discarded in publication forwarding, later applying them to channel replies. Most importantly, it is useful because it increases the domain of applications for which Publish/Subscribe is suited. This paper includes discussion of Reply’s utility, introduction of two algorithms with differing state storage and capabilities, their analysis for worst-case conditions, modeling of required resources, and presentation of a modular implementation of Reply for distributed Publish/Subscribe systems.

CS-2002-22
French, James C.; Martin, Worthy M.; Watson, James V. S.. A Qualitative Examination of Content-Based Image Retrieval Behavior using Systematically Modified Test Images, August, 2002.

Abstract: We describe the outcome of an effort to understand the behavior of content-based image retrieval (CBIR) technology by examining the behavior of a CBIR system in response to carefully constructed input query images. This work is preliminary and only considers a single CBIR system, SIMPLIcity. We chose this particular system for expediency. It is our intention to develop a methodology suitable for examining any CBIR system.

CS-2002-25
Kang, K. D.; Son, S. H.; Stankovic, J. A.. STAR: Secure Real-Time Transaction Processing with Timeliness Guarantees, August 19, 2002.

Abstract: Real-time databases are needed in security-critical applications, e.g., e-commerce, agile manufacturing, and military applications. In these applications, transactions and data items can be classified into several security levels according to their clearance and sensitivity levels. It is essential for real-time databases to prevent illegal direct/indirect transfer of sensitive data, e.g., secret trade, manufacturing, or operational data, between transactions belonging to different security levels. Further, transactions should be committed within their deadlines, i.e., before the market, manufacturing, or battle field status changes. In this paper, we present a novel real-time database architecture, in which illegal direct/indirect inter-level information flows are prevented while controlling the deadline miss ratio for admitted transactions to remain below a certain threshold. In our approach, mandatory access control mechanisms are applied for security purposes. QoS management, admission control, and feedback control schemes are applied to support certain guarantees on miss ratio against potential overload and data conflicts. A detailed simulation study shows that our approach can support the specified miss ratio preventing illegal information flows even in the presence of unpredictable workloads and varying degrees of data contention, whereas baseline approaches fail.

CS-2002-20
Barcella, M.; Huang, W.; Skadron, K.; Stan, M. R.. Architecture-Level Compact Thermal R-C Modeling, July 2002.

Abstract: Architectural thermal modeling of high performance VLSI systems is of special importance in order to achieve reliable and power-saving system designs. In this paper, we present a novel, computation-efficient methodology for architectural level compact thermal modeling. The RC thermal models derived from this methodology are based on the layout of micro-architectural level functional blocks such as caches, execution units, and register files, etc. We utilize the duality between thermal and electrical circuit to perform electro-thermal simulations of both steady state and transient thermal responses of different blocks that dissipate different amounts of power. Temperature profiles derived from this modeling technique are compared with those derived from computation-intensive FEM/FDM tools. The model is both extremely fast and accurate, less than 4% error. This modeling methodology can then be adopted for architectural level VLSI design tools to perform fast electro-thermal simulations.

CS-2002-21
Lu, Zhijian; Lach, John; Stan, Mircea R.; Skadron, Kevin. Alloyed Branch History: Combining Global and Local Branch History for Robust Performance, July 2002.

Abstract: This paper introduces "alloyed" prediction, a new two-level predictor organization that combines global and local history in the same structure, combining the advantages of two-level predictors and hybrid predictors. The alloyed organization is motivated by measurements showing that "wrong-history mispredictions" are even more important than conflict-induced mispredictions. Wrong-history mispredictions arise because current two-level, history-based predictors provide only global or only local history. The contribution of wrong-history to the overall misprediction rate is substantial because most programs have some branches that require global history and others that require local history. This paper explores several ways to implement alloyed prediction, including the previously proposed bi-mode organization. Simulations show that "mshare" is the best alloyed organization among those we examine, and that mshare gives reliably good prediction compared to bimodal ("two-bit"), two-level, and hybrid predictors. The robust performance of alloying across a range of predictor sizes stems from its ability to attack wrong-history mispredictions at even very small sizes without subdividing the branch predictor into smaller and less effective components.

CS-2002-19
Haskins, Jr., J. W.; Skadron, K.. Memory Reference Reuse Latency: Accelerated Sampled Microarchitecture Simulation, July 12, 2002.

Abstract: This paper explores techniques for speeding up sampled microprocessor simulations by exploiting the observation that of the memory references that precede a sample, references that occur nearest to the sample are more likely to be germane during the sample itself. This means that accurately warming up simulated cache and branch predictor state only requires that a subset of the memory references and control-flow instructions immediately preceding a simulation sample need to be modeled. Our technique takes measurements of the memory reference reuse latencies (MRRLs) and uses these data to choose a point prior to each sample to engage cache hierarchy and branch predictor modeling. By starting cache and branch predictor modeling late in the pre-sample instruction stream, rather than modeling cache and branch predictor interactions for all pre-sample instructions we are able to save the time cost of modeling them. This savings reduces overall simulation running times by an average of 25%, while generating an average error in IPC of less than 0.7%.

CS-2002-15
Drewry, Darren T.; Gu, Lin; Hocking, A. Benjamin; Kang, Kyoung-Don; Schutt, Robert C.; Taylor, Christopher M.; Pfaltz, John L.. Current State of Data Mining, May 9, 2002.

Abstract: This report is a compendium of results uncovered in CS 851, spring semester 2002. Our goal is to provide direction to other graduate students who want to explore the fascinating field of "data mining." Be aware that this represents the topics and the papers that we found most interesting. We made no attempt to be either `fair' or to be `comprehensive' as in a Computing Reviews article. Instead we tried to pick just one or two references in each of our interest areas that were both relatively recent and, in our opinion, worth reading. Their bibliographies can direct a reader to other sources.

CS-2002-17
Kang, K. D.; Son, S. H.; Stankovic, J. A.; Abdelzaher, T. F.. FLUTE: A Flexible Real-Time Data Management Architecture for Performance Guarantees, June 14, 2002.

Abstract: Efficient real-time data management has become increasingly important as real-time applications become more sophisticated and data-intensive. In data-intensive real-time applications, e.g., online stock trading, agile manufacturing, sensor data fusion, and telecommunication network management, it is essential to execute transactions within their deadlines using fresh (temporally consistent) sensor data, which reflect the current real-world states. However, it is very challenging to meet this fundamental requirement due to potentially time-varying workloads and data access patterns in these applications. Also, user service requests and sensor updates can compete for system resources, thereby making it difficult to achieve guaranteed real-time database services in terms of both deadline miss ratio and data freshness. In this paper, we present a novel real-time data management architecture, which includes feedback control, admission control, and flexible update schemes, to enforce the deadline miss ratio of admitted user transactions to be below a certain threshold, e.g., 1%. At the same time, we maintain data freshness even in the presence of unpredictable workloads and access patterns. In a simulation study, we illustrate the applicability of our approach by showing that stringent performance specifications can be satisfied over a wide range of workloads and access patterns. We also show that our approach can significantly improve performance compared to baselines.

CS-2002-16
Drewry, Darren T.; Reynolds, Paul F.; Emanuel, William R.. Optimization as a Tool for Consistency Maintenance in Multi-resolution Simulation, May 16, 2002.

Abstract: The need for new approaches to the consistent simulation of related phenomena at multiple levels of resolution is great. While many fields of application would benefit from a complete and approachable solution to this problem, such solutions have proven extremely difficult. We present a multi-resolution simulation methodology that uses numerical optimization as a tool for maintaining external consistency between models of the same phenomena operating at different levels of temporal and/or spatial resolution. Our approach follows from previous work in the disparate fields of inverse modeling and spacetime constraint-based animation. As a case study, our methodology is applied to two environmental models of forest canopy processes that make overlapping predictions under unique sets of operating assumptions, and which execute at different temporal resolutions. Experimental results are presented and future directions are addressed.

CS-2002-11
Highley, Timothy; Reynolds, Paul F.; Vellanki, Vivekanand. Absolute Cost-Benefit Analysis for Predictive File Prefetching, April 18, 2002.

Abstract: Cost-benefit analysis attempts to balance the benefits of file prefetching against the costs of giving up a buffer from the cache. This is a comprehensive approach to prefetching. It simultaneously answers the questions of what and when/whether to prefetch, and also addresses the interaction between prefetching and caching. Original research on cost-benefit analysis relied on hints about prefetching that were supplied by the program, though other research has considered the more general case of programs without hints. In this paper, we revisit previous research on cost-benefit analysis without hints, present new estimators for the costs and benefits of prefetching under a system model without hints, and generalize the system model to consider probability trees that may have the same block lie along multiple paths of the tree. We present a distinction between absolute and marginal cost-benefit analysis, two different approaches to file prefetching that previous researchers have used. The difference between the two is a subtlety that has not been previously noted or investigated. Our new estimators are based on absolute cost-benefit analysis, with modifications to address observed weaknesses in the absolute approach.

CS-2002-10
Lu, Chenyang; Blum, Brian M.; Abdelzaher, Tarek F.; Stankovic, John A.; He, Tian. RAP: A Real-Time Communication Architecture for Large-Scale Wireless Sensor Networks, March 25, 2002.

Abstract: Large-scale wireless sensor networks represent a new generation of real-time embedded systems with significantly different communication constraints from traditional networked systems. This paper presents RAP, a new real-time communication architecture for large-scale sensor networks. RAP provides convenient, high-level query and event services for distributed micro-sensing applications. Novel location-addressed communication models are supported by a scalable and light-weight network stack. We present and evaluate a new packet scheduling policy called velocity monotonic scheduling that inherently accounts for both time and distance constraints. We show that this policy is particularly suitable for communication scheduling in sensor networks in which a large number of wireless devices are seamlessly integrated into a physical space to perform real-time monitoring and control. Detailed simulations of representative sensor network environments demonstrate that RAP significantly reduces the end-to-end deadline miss ratio in the sensor network.

CS-2002-09
He, Tian; Stankovic, John A.; Lu, Chenyang; Abdelzaher, Tarek. SPEED:A Real-Time Protocol for Sensor Networks, March 1,2002.

Abstract: In this paper, we present a real-time communication protocol, called SPEED, for sensor networks. The protocol provides three types of real-time communication services, namely, real-time unicast, real-time area-multicast and real-time area-anycast. SPEED is specifically tailored to be a stateless, localized algorithm with minimal control overhead. End-to-end real-time communication guarantees are achieved using a novel combination of feedback control and non-deterministic QoS-aware geographic forwarding with a bounded hop count. SPEED is a highly efficient and scalable protocol for the sensor networks where node density is high while the resources of each node are scarce. Theoretical analysis and simulation experiments are provided to validate our claims.

CS-2002-08
Sang, S. H.; Kang, K. D.. QoS Management in Web-based Real-Time Data Services, April 4, 2002.

Abstract: The demand for real-time data services has been increasing recently. Many e-commerce applications and web-based information services are becoming very sophisticated in their data needs. They span the spectrum from low level status data, e.g., stock prices, to high level aggregated data, e.g., recommended selling/buying point. In these applications, it is desired to process user requests within their deadlines using fresh data, which reflect the current market status. Current web-based data services are poor at processing user requests in a timely manner using fresh data. To address this problem, we present a framework for guaranteed real-time data services in unpredictable environments. We also introduce a possible application of our approach in the distributed environment.

CS-2002-06
Hill, Jonathan C.. Site-Select Messaging for Distributed Systems, April 2, 2002.

Abstract: A technical description of the expressiveness and cost of Site-Select Messaging implemented on content-driven Publish/Subscribe.

CS-2002-04
French, J. C.; Martin, W. N.; Olsen, L. M.. Extending the Vocabulary Available for Cross-Disciplinary Searching of Earth Science Data, March, 2002.

Abstract: This paper discusses the development of a prototype search assistant designed to aid cross-disciplinary searching of Earth science data. The goal of the project is to provide aids to help searchers overcome the vocabulary problem: the vocabulary used by the searchers is not the same as that used by the indexers. The work was motivated by vocabulary issues existing in the EOS Data Gateway (EDG) at the time this work began. (The EDG is the interface for searching and ordering Earth science data products from NASA and affiliated centers.) We describe the status of the project and give examples of the techniques that we are using. We also discuss the way in which we are implementing a new search paradigm, multiple viewpoints, within our prototype.

CS-2002-02
Kang, K. D.; Son, Sang H.; Stankovic, John A.. Service Differentiation in Real-Time Main Memory Databases, January 29, 2002.

Abstract: The demand for real-time database services has been increasing recently. Examples include sensor data fusion, decision support applications, web information services, e-commerce, and data-intensive smart spaces. In these systems, it is essential to execute transactions in time using fresh (temporally consistent) data. Due to the high service demand and stringent timing/data temporal consistency constraints, real-time databases can be overloaded. As a result, users may suffer poor services. Many transaction deadlines can be missed or transactions may have to use stale data. To address these problems, we present a service differentiation architecture. Transactions are classified into several service classes based on their importance. Under overload, different degrees of deadline miss ratio guarantees are provided among the service classes according to their importance. A certain data freshness guarantee is also provided for the data accessed by timely transactions which finish within their deadlines. Feedback control is applied to support the miss ratio and freshness guarantees. In a simulation study, our service differentiation approach shows a significant performance improvement compared to the baseline approaches. The specified miss ratio and freshness are supported even in the presence of unpredictable workloads and data access patterns. Our approach also achieves a relatively low miss ratio for the less privileged service classes, thereby reducing potential starvation.

CS-2002-01
Haskins, Jr., John W.; Skadron, K.; KleinOsowski, AJ; Lilja, D. J.. Techniques for Accurate, Accelerated Processor Simulation: Analysis of Reduced Inputs and Sampling, January 22, 2002.

Abstract: Detailed execution-driven simulation is an important tool for computer architecture research. It is desirable to drive these simulations with standard benchmark programs that are commonly used to evaluate existing computer systems, such as the SPEC2000 suite. Unfortunately, simulating these benchmark programs to completion using full-detail, cycle-accurate simulation on the designated reference input sets results in intractably long simulation durations. This study evaluates and compares two techniques for combating long simulation times: reduced inputs and sampling. Our objective is to assess the ability of each to reduce simulation running times, while simultaneously minimizing the difference in the results generated by using these techniques relative to the results generated by simulating the benchmark programs to completion using the reference inputs. With the reduced input technique, new input sets are carefully generated by hand to produce run-time characteristics of the benchmark programs that are comparable to the overall characteristics produced when the programs are run with the standard inputs. Sampling, on the other hand, simulates only a small fraction of the program's overall execution in full, cycle-accurate detail using the reference inputs.

CS-2001-29
Scott, Kevin; Davidson, Jack. Software Security Using Software Dynamic Translation, November 2001.

Abstract: Software dynamic translation (SDT) is a technology that allows programs to be modified as they are running. Researchers have used SDT with good success to build a variety of useful software tools (e.g., binary translators, operating system simulators, low-overhead profilers, and dynamic optimizers). In this paper, we describe how SDT can be used to address the critical problem of providing software security. The paper shows how SDT can simply and effectively implement arbitrary user-specified software safety policies. Unlike static analysis techniques which typically process source code, SDT is applied to binary code. Consequently, SDT can handle untrusted binaries and unsecured libraries from any source. To demonstrate and validate that SDT provides additional security, we have implemented a software security API for Strata, our software dynamic translation infrastructure. The API, while simple, allows clients to implement powerful policies to prevent potential security violations. To illustrate the use of Strata and the security API, the paper provides implementations of several interesting and useful security policies.

CS-2001-18
Scott, Kevin; Davidson, Jack; Skadron, Kevin. Low-overhead Software Dynamic Translation, July 2001.

Abstract: Software dynamic translation (SDT) is a technology that allows programs to be modified as they are running. The overhead of monitoring and modifying a running program s instructions is often substantial in SDT. As a result SDT can be impractically slow, especially in SDT systems that do not or can not employ dynamic optimization to offset overhead. This is unfortunate since SDT has obvious advantages in modern computing environments and interesting applications of SDT con-tinue to emerge. In this paper we introduce two novel overhead reduction techniques that can improve SDT performance by a factor of three even when no dynamic optimization is performed. To demonstrate the effectiveness of our overhead reduction techniques, and to show the type of useful tasks to which low-overhead, non-optimizing SDT systems might be put, we implemented two dynamic safety checkers with SDT. These dynamic safety checkers perform useful tasks pre-venting buffer-overrun exploits and restricting system call usage in untrusted binaries. Further their performance is similar to, and in some cases better than, state-of-the-art tools that perform the same functions without SDT.

CS-2001-17
Scott, Kevin; Davidson, Jack. Strata: A Software Dynamic Translation Infrastructure, July 2001.

Abstract: Software dynamic translation is the alteration of a running program to achieve a specific objective. For example, a dynamic optimizer uses software dynamic translation to modify a running program with the objective of making the program run faster. In addition to its demonstrated utility in dynamic optimizers, software dynamic translation also shows promise for producing applications that are adaptable, secure, and robust. In this paper, we describe the design and implementation of an extensible, retargetable infrastructure to facilitate research in applications of software dynamic translation technology. The infrastructure, called Strata, provides the software dynamic translator implementor with a virtual machine model that can be extended to implement specific software dynamic translation functionality. To illustrate the use of Strata to build client applications, the paper describes the Strata implementation of a two dynamic safety checkers and a dynamic instruction scheduler.

CS-2001-27
Skadron, Kevin; Abdelzaher, Tarek; Stan, Mircea R.. Control-Theoretic Techniques and Thermal-RC Modeling for Accurate and Localized Dynamic Thermal Management, Nov. 2001.

Abstract: This paper proposes the use of formal feedback control theory as a way to implement adaptive techniques in the processor architecture. Dynamic thermal management (DTM) is used as a test vehicle, and variations of a PID controller (Proportional-Integral-Differential) are developed and tested for adaptive control of fetch "toggling." To accurately test the DTM mechanism being proposed, this paper also develops a thermal model based on lumped thermal resistances and thermal capacitances. This model is computationally efficient and tracks temperature at the granularity of individual functional blocks within the processor. Because localized heating occurs much faster than chip-wide heating, some parts of the processor are more likely to be "hot spots" than others. Experiments using Wattch and the SPEC2000 benchmarks show that the thermal trigger threshold can be set within 0.2 degrees of the maximum temperature and yet never enter thermal emergency. This cuts the performance loss of DTM by 65% compared to the previously described fetch toggling technique that uses a response of fixed magnitude.

CS-2001-26
Liebeherr, J.; Nahas, M.; Si, W.. Application-Layer Multicasting with Delaunay Triangulation Overlays, December 5, 2001.

Abstract: Application-layer multicast supports group applications without the need for a network-layer multicast protocol. Here, applications arrange themselves in a logical overlay network and transfer data within the overlay. In this paper, we present an application-layer multicast solutionthat uses a Delaunay triangulation as an overlay network topology. An advantage of using a Delaunay triangulation is that it allows each application to locally derive next-hop routing information without requiring a routing protocol in the overlay. A disadvantage of using a Delaunay triangulation is that the mapping of the overlay to the network topology at the network and data link layer may be suboptimal. We present a protocol, called "Delaunay Triangulation (DT)" protocol, which constructs Delaunay triangulation overlay networks. We present measurement experiments of the DT protocol for overlay networks with up to 10,000 members, that are running on a local PC cluster with 100 Linux PCs. The results show that the protocol stabilizes quickly, e.g., an overlay network with 10,000 nodes can be built in just over 30 seconds. The traffic measurements indicate that the average overhead of a node is only a few kilobits per second if the overlay network is in a steady state. Results of throughput experiments of multicast transmissions (using TCP unicast connections between neighbors in the overlay network) show an achievable throughput of approximately 15 Mbps in an overlay with 100 nodes and 2 Mbps in an overlay with 1,000 nodes.

CS-2000-13
Scott, Kevin; Ramsey, Norman. When Do Match-compilation Heuristics Matter?, February, 2000.

Abstract: Modern, statically typed, functional languages define functions by pattern matching. Although pattern matching is defined in terms of sequential checking of a value against one pattern after another, real implementations translate patterns into automata that can test a value against many patterns at once. Decision trees are popular automata. The cost of using a decision tree is related to its size and shape. The only method guaranteed to produce decision trees of minimum cost requires exponential match-compilation time, so a number of heuristics have been proposed in the literature or used in actual compilers. This paper presents an experimental evaluation of such heuristics, using the Standard ML of New Jersey compiler. The principal finding is that for most benchmark programs, all heuristics produce trees with identical sizes. For a few programs, choosing one heuristic over another may change the size of a decision tree, but seldom by more than a few percent. There are, however, machine-generated programs for which the right or wrong heuristic can make enormous differences: factors of 2-20.

CS-2001-12
Scott, Kevin. On Proebsting's Law, March 1, 2001.

Abstract: In 1965 Gordon Moore observed that the capacity of semiconductor ICs doubled every 18 to 24 months. This trend, now known as Moore's Law, has held for over 25 years and is responsible for the exponential increase in microprocessor performance over this period. In 1998 Todd Proebsting made a similar-in-spirit, but altogether less optimistic observation about optimizing compilers. His observation, henceforth known as Proebsting's Law, is that improvements to compiler technology double the performance of typical programs every 18 years. Proebsting has suggested an experiment to evaluate the veracity of his observation. This paper presents the results of this experiment and some comments on what Proebsting's Law portends for compiler research.

CS-2001-28
Christin, Nicolas; Liebeherr, Jorg. The QoSbox: A PC-Router for Quantitative Service Differentiation in IP Networks, November 20, 2001.

Abstract: We describe the design and implementation in UNIX-based PCs of the QoSbox, a configurable IP router that provides per-hop service guarantees on loss, delays and throughput to classes of traffic. There is no restriction on the number of classes or the specific service guarantees each class obtains. The novel aspects of the QoSbox are that (1) the QoSbox does not rely on any external component (e.g., no traffic shaping and no admission control) to enforce the desired service guarantees, but instead, (2) dynamically adapts packet forwarding and dropping decisions as a function of the instantaneous traffic arrivals; also, (3) the QoSbox can enforce both absolute bounds and proportional service guarantees on queueing delays, loss rates, and throughput at the same time. We evaluate the QoSbox in a testbed of PC-routers over a FastEthernet network, and show that the QoSbox is a possible solution allowing for incremental deployment to the problem of providing service differentiation in a scalable manner.

CS-2001-25
Parikh, Dharmesh; Skadron, Kevin; Zhang, Yan; Barcella, Marco; Stan, Mircea R.. Power Issues Related To Branch Prediction, November 2, 2001.

Abstract: This paper explores the role of branch predictor organization in power/energy/performance tradeoffs for processor design. We find that as a general rule, to reduce overall energy consumption in the processor it is worthwhile to spend more power in the branch predictor if this results in more accurate predictions that improve running time. Two techniques, however, provide substantial reductions in power dissipation without harming accuracy. Banking reduces the portion of the branch predictor that is active at any one time. And a new on-chip structure, the prediction probe detector (PPD), can use pre-decode bits to entirely eliminate unnecessary predictor and BTB accesses. Despite the extra power that must be spent accessing the PPD, it reduces local predictor power and energy dissipation by about 45% and overall processor power and energy dissipation by 5--6%.

CS-2001-24
Hu, Zhigang; Juang, Philo; Skadron, Kevin; Clark, Douglas W.; Martonosi, Margaret. Applying Decay Strategies to Branch Predictors for Leakage Energy Savings, Oct. 2001.

Abstract: This paper shows that substantial reductions in leakage energy can be obtained by deactivating groups of branch-predictor entries if they lie idle for a sufficiently long time. Decay techniques, first introduced by Kaxiras et al. for caches, work by tracking accesses to cache lines and turning off power to those that lie idle for a sufficiently long period of time (the decay interval). Once deactivated, these lines essentially draw no leakage current. The key trick is in identifying opportunities where an item can be turned off without incurring significant performance or power cost. Branch predictors are, like caches, large array structures with significant leakage; as such, it is natural to consider applying decay techniques to them as well. Applying decay techniques to branch predictors is, however, not straightforward. The overhead for applying decay to individual counters in the predictor is prohibitive, so decay must be applied to groups of predictor entries. The most natural grouping is a row in the square data array used to implement the branch predictor, but then decay will only be successful if entire rows lie idle for sufficiently long periods of time. This paper shows that branch predictors do exhibit sufficient spatial and temporal locality to make decay effective for bimodal, gshare, and hybrid predictors, as well as the branch target buffer. In particular, decay is quite effective when applied intelligently to hybrid predictors, which use two predictors in parallel and are among the most accurate predictor organizations. Hybrid predictors are also especially amenable to decay, because inactive entries in one component can be left inactive if the other component is able to provide a prediction. Overall, this paper demonstrates that decay techniques apply more broadly than just to caches, but that careful policy and implementation make the difference between success and failure in building decay-based branch predictors.

CS-2001-23
Parikh, Stavan M.; Elder, Matthew C.. Air Traffic Management Overview, October 12, 2001.

Abstract: The Air Traffic Control (ATC) system manages air traffic from various sources throughout the United States National Airspace System (NAS). The mission of air traffic control is to promote the safe, orderly, and expeditious movement of aircraft. The Federal Aviation Administration (FAA) is responsible for providing the NAS infrastructure to enable air operations in the United States, 24 hours a day, 365 days a year. Air traffic control, the primary component of the NAS architecture, is a complex system comprising of many types of facilities, a plethora of hardware and software systems, and complicated interactions between humans and machines. There are two goals that must be met by the Air Traffic Control system: safety and efficiency. To ensure safety, minimum separation must be maintained between aircraft as they pass through the national airspace. Efficient air traffic control, however, requires that separation not be so great as to adversely affect the flow of traffic through the airspace. This document provides a summary of the air traffic management system together with pointers to other sources of more detailed information.

CS-2001-22
Kang, K.-D.; Son, S. H.; Stankovic, J. A.; Abdelzaher, T. F.. A QoS-Sensitive Approach for Timeliness and Freshness Guarantees in Real-Time Databases, October 9, 2001.

Abstract: The demand for real-time database services has been increasing recently. Examples include sensor data fusion, decision support, web information services, e-commerce, online trading, and data-intensive smart spaces. A real-time database, a core component of many information systems for real-time applications, can be a main service bottleneck. To address this problem, we present a real-time database QoS management scheme that provides guarantees on deadline miss ratio and data freshness, which are considered two fundamental performance metrics for real-time database services. Using our approach, admitted user transactions can be processed in a timely manner and data freshness can be guaranteed even in the presence of unpredictable workloads and data access patterns. A simulation study shows that our QoS-sensitive approach can achieve a significant performance improvement, in terms of deadline miss ratio and data freshness, compared to several baseline approaches. Furthermore, our approach shows a comparable performance to the theoretical oracle that is privileged by the complete future knowledge of data accesses.

CS-2001-19
Liebeherr, Jorg; Patek, Stephen; Burchard, Almut. A Calculus for End-to-end Statistical Service Guarantees, August 6, 2001 [Revised December 2003].

Abstract: The deterministic network calculus offers an elegant framework for determining delays and backlog in a network with deterministic service guarantees to individual traffic flows. This paper addresses the problem of extending the network calculus to a probabilistic framework with statistical service guarantees. Here, the key difficulty relates to expressing, in a statistical setting, an end-to-end (network) service curve as a concatenation of per-node service curves. The notion of an effective service curve is developed as a probabilistic bound on the service received by an individual flow. It is shown that per-node effective service curves can be concatenated to yield a network effective service curve.

CS-2001-21
Christin, Nicolas; Liebeherr, Jorg; Abdelzaher, Tarek F.. A Quantitative Assured Forwarding Service, August 8, 2001.

Abstract: The Assured Forwarding (AF) service of the IETF DiffServ architecture provides a qualitative service differentiation between classes of traffic, in the sense that a low-priority class experiences higher loss rates and higher delays than a high-priority class. However, the AF service does not quantify the difference in the service given to classes. In an effort to strengthen the service guarantees of the AF service, we propose a Quantitative Assured Forwarding service with absolute and proportional differentiation of loss, service rates, and packet delays. We present a feedback-based algorithm which enforces the desired class-level differentiation on a per-hop basis, without the need for admission control or signaling. Measurement results from a testbed of FreeBSD PC-routers on a 100 Mbps Ethernet network show the effectiveness of the proposed service, and indicate that our implementation is suitable for networks with high data rates.

CS-2001-14
Sankaranarayanan, Karthik; Skadron, Kevin. A Scheme for Selective Squash and Re-issue for Single-Sided Branch Hammocks, July 10, 2001.

Abstract: This abstract describes work to minimize re-execution of control independent instructions. This technique differs from prior work in its emphasis on compiler scheduling in order to minimize changes to the hardware of an out-of-order processor. Work so far has focused on single-sided branch hammocks.

CS-2000-32
Srinivasa, Rashmi; Phan, Tram; Mohanraj,Nisanti; Powell, Allison; French, Jim. Database Selection Using Document and Collection Term Frequencies, May 16, 2000.

Abstract: We examine the impact of two types of information - document frequency (df) and collection term frequency (ctf) - on the effectiveness of database selection. We introduce a family of database selection algorithms based on this information, and compare their effectiveness to two existing database selection approaches, CORI and gGlOSS. We demonstrate that a simple selection algorithm that uses only document frequency information is more effective than gGlOSS, and achieves effectiveness that is very close to that of CORI.

CS-2001-13
Sullivan, Kevin; Griswold, William G.; Cai, Yuanfang; Hallen, Ben. The Structure and Value of Modularity in Software Design, March 18, 2001.

Abstract: The concept of information hiding modularity is a cornerstone of modern software design thought, but its formulation remains casual and its emphasis on changeability is imperfectly related to the goal of creating value in a given context. We need better models of the structure and value of information hiding, for both their explanatory power and prescriptive utility. We evaluate the potential of a new theory-developed to account for the influence of modularity on the evolution of the computer industry-to inform software design. The theory uses design structure matrices to model designs and real options techniques to value them. To test the potential utility of the theory for software we represent a model software system in its terms-Parnas's KWIC-and evaluate the results. We contribute an extension to design structure matrices and show that the options results are consistent with Parnas's conclusions. Our results suggest that such a theory does have potential to help inform software design.

CS-2001-08
Srinivasa, Rashmi; Williams, Craig; Reynolds, Paul F.. Distributed Transaction Processing on an Ordering Network, February 28, 2001.

Abstract: The increasing demand for high throughputs in transaction processing systems leads to high degrees of transaction concurrency and hence high data contention. The conventional dynamic two-phase locking (2PL) concurrency control (CC) technique causes system thrashing at high data contention levels, restricting transaction throughput. Optimistic concurrency control (OCC) is an alternative strategy, but OCC techniques suffer from wasted resources caused by repeated transaction restarts. We propose a new technique, ORDER, that enlists the aid of the interconnection network in a distributed database system in order to coordinate transactions. The network in an ORDER system provides total ordering of messages at a low cost, enabling efficient CC. We compare the performance of dynamic 2PL and ORDER, using both an analytical model and a simulation. Unlike previously-proposed models for 2PL, our analytical model predicts performance accurately even under high data contention. We study the effects of various parameters on performance, and demonstrate that ORDER outperforms dynamic 2PL for a wide range of workloads.

CS-2001-07
Prey, Kevin; French, James C.; Powell, Allison L.; Viles, Charles L.. Inverse Document Frequency and Web Search Engines, February 5, 2001.

Abstract: Full text searching over a database of moderate size often uses the inverse document frequency, idf = log(N/df), as a component in term weighting functions used for document indexing and retrieval. However, in very large databases (e.g. internet search engines), there is the potential that the collection size (N) could dominate the idf value, decreasing the usefulness of idf as a term weighting component. In this short paper we examine the properties of idf in the context of internet search engines. The observed idf values may also shed light upon the indexed content of the WWW. For example, if the internet search engines we survey index random samples of the WWW, we would expect similar idf values for the same term across the different search engines.

CS-2000-31
O'Neil, Edward K.; French, James C.. A Description of the LAMB Web-Derived Language Model Builder, May 16, 2000.

Abstract: This paper describes the language modeling script constructed for the Spring 2000 Information Retrieval seminar. The script builds on Callan's work of automatically generating language models for text databases by creating language models for internet search engines. An overview of the work, principles and observed properties of language models, the language modeling script (LAMB) and its shortcomings are described.

CS-2000-30
Monroe, Gary A.; Mikesell, David R.; French, James C.. Determining Stopping Criteria in the Generation of Web-Derived Langua ge Models, May 10, 2000.

Abstract: In this work, we present a small-scale evaluation of two query-based sampling techniques for building language models, using a database comprised of world-wide web documents. We propose a metric by which it is possible to determine when to cease sampling a given web database, and we compare this new metric to other metrics that have been used in previous work to determine the fidelity of sampled language models.

CS-2001-06
Lu, Chenyang; Abdelzaher, Tarek F.; Stankovic, John A.; Son, Sang H.. A Feedback Control Architecture and Design Methodology for Service Delay Guarantees in Web Servers, January 25, 2001.

Abstract: This paper presents the design and implementation of an adaptive architecture to provide relative, absolute and hybrid service delay guarantees for different service classes on web servers under HTTP 1.1. The first contribution of this paper is the architecture based on feedback control loops that enforce delay guarantees for classes via dynamic connection scheduling and process reallocation. The second contribution is our use of feedback control theory to design the feedback loop with proven performance guarantees. In contrast with ad hoc approaches that often rely on laborious tuning and design iterations, our control theory approach enables us to systematically design an adaptive web server with established analytical methods. The design methodology includes using system identification to establish dynamic models for a web server, and using the Root Locus method to design feedback controllers to satisfy performance specifications. The adaptive architecture has been implemented by modifying an Apache web server. Experimental results demonstrate that our adaptive server provides robust delay guarantees even when workload varies significantly. Properties of our adaptive web server also include guaranteed stability, and satisfactory efficiency and accuracy in achieving desired delay or delay differentiation.

CS-2001-05
Lu, Chenyang; Stankovic, John A.; Tao, Gang; Son, Sang H.. Feedback Control Real-Time Scheduling: Framework, Modeling, and Algorithms, January 15, 2001.

Abstract: This paper presents a feedback control real-time scheduling (FCS) framework for adaptive real-time systems. An advantage of the FCS framework is its use of feedback control theory (rather than ad hoc solutions) as a scientific underpinning. We apply a control theory based design methodology to systematically design FCS algorithms to satisfy their transient and steady state performance specifications. In particular, we establish a dynamic model and performance analysis of several feedback control scheduling algorithms, which is a major challenge and key step for the control design of adaptive real-time systems. We also generalize a FCS architecture that allows plug-ins of different real-time scheduling policies and QoS optimization algorithms. Based on our model, we identify different types of real-time applications where each FCS algorithm can be applied. Performance evaluation results demonstrate that our analytically tuned FCS algorithms provide robust steady and transient state performance guarantees for periodic and aperiodic tasks even when the task execution time varied considerably from the estimation.

CS-2000-25
Patek, Stephen D.; Venkateswaran, Raja; Liebeherr, Jorg. Simple Alternate Routing for Differentiated Services Networks, September 29, 2000.

Abstract: Recent work on differentiated services in the Internet has defined new notions of Quality of Service (QoS) that apply to aggregates of traffic in networks with coarse spatial granularity. Most proposals for differentiated services involve traffic control algorithms for aggregate service levels, packet marking and policing, and preferential treatment of marked packets in the network core. The issue of routing for enhancing aggregate QoS has not received a lot of attention. This study investigates the potential benefit of using alternate routing strategies in support of differentiated services. We propose a traffic control scheme, called Simple Alternate Routing, wherein portions of unmarked packet flows can be assigned to alternate paths through a Service Provider Network (SPN) in response to congestion feedback information. The scheme is simple, requiring only minor changes to the SPN border routers so that alternately routed packets can be tunneled via conventional paths to an intermediate border node and then tunneled from there to the original egress border node. We present distributed algorithms for (1) discovering congestion within the SPN, and (2) allocating traffic to alternate paths that are uncongested. We have implemented the scheme in a packet-level simulation, and we have examined the transient response of the algorithm to perturbations in the nominal traffic levels experienced by the SPN. The experimental study of this paper provides some understanding of the scheme's ability to adapt in routing packets around congestion. Our results indicate that the alternate routing framework shows promise and warrants further consideration.

CS-2000-24
Liebeherr, Jorg; Christin, Nicolas. Buffer Management and Scheduling for Enhanced Differentiated Services, August 22, 2000.

Abstract: A novel framework, called JoBS (Joint Buffer Management and Scheduling), is presented for reasoning about relative and absolute per-class service differentiation in a packet network without information on traffic arrivals. JoBS has two unique capabilities: (1) JoBS makes scheduling and buffer management decisions in a single step, and (2) JoBS supports both relative and absolute QoS requirements of classes. JoBS is presented in terms of the solution to an optimization problem. Numerical simulation examples, including results for a heuristic approximation of JoBS, are presented to illustrate the effectiveness of the approach and to compare JoBS to existing methods for loss and delay differentiation.

CS-2000-23
Weikle, D. A. B.; Skadron, Kevin; McKee, Sally; Wulf, William A.. TSpec: A Notation for Describing Memory Reference Traces, August 14, 2000.

Abstract: Interpreting reference patterns in the output of a processor is complicated by the lack of a succinct notation for humans to use when communicating about them. Since an actual trace is simply an incredibly long list of numbers, it is difficult to see the underlying patterns inherent in it. The source code, while simpler to look at, does not include the effects of compiler optimizations such as loop unrolling, and so can be misleading as to the actual references and order seen by the memory system. To simplify communication of traces between researchers and to understand them more completely, we have developed a notation for representing them that is easy for humans to read, write, and analyze. We call this notation TSpec, for trace specification notation. It has been designed for use in cache design with four goals in mind. First, it is intended to assist in communication between people, especially with respect to understanding the patterns inherent in memory reference traces. Second, it is the object on which the cache filter model operates. Specifically, the trace and state of the cache are represented in TSpec, these are then the inputs for a function that models the cache, and the result of that function is a modified trace and state that are also represented in TSpec. Third, it supports the future creation of a machine readable version that could be used to generate traces to drive simulators, or for use in tools (such as translators from assembly language to TSpec). Finally, it can be used to represent different levels of abstraction in benchmark analysis.

CS-2000-22
Highley, Timothy; Reynolds, Paul; Williams, Craig. Testing the Functional Requirements of the Switch Interface Unit, August 4, 2000.

Abstract: Isotach networks provide special guarantees about message ordering and delivery times in a parallel or distributed computing environment. Although isotach networks can be implemented entirely in software running on commercial off-the-shelf hardware, an isotach network that uses custom hardware to maintain logical time may be particularly efficient. A hardware-based implementation of an isotach network has been designed and built using custom components, one of which is the switch interface unit (SIU). An SIU handles isotach functions for a host. These functions include the maintenance of logical time, the timestamping of outgoing isotach mesages, and the exchange of tokens with another custom hardware device,the token manager. We have designed and performed tests to analyze the switch interface units and increase confidence that the SIUs are functioning correctly. The analysis has indicated that the new units are working properly.

CS-2000-21
Abdelzaher, Tarek F.. A Schedulable Utilization Bound for Aperiodic Tasks, August 2, 2000.

Abstract: In this report, we derive a utilization bound on schedulability of apriodic tasks with arbitrary arrival times, execution times, and deadlines. To the author's knowledge, this is the first time a utilization bound is derived for the aperiodic task model. It allows constructing an O(1) admission test for aperiodic tasks. Earlier admission tests are at best O(n). We show that deadline-monotonic scheduling is the optimal fixed-priority scheduling policy for aperiodic tasks in the sense of maximizing the schedulable utilization bound. We prove that the optimal bound is 5/8. Our result is an extension of the well-known Liu and Layland's bound of ln 2 (derived for periodic tasks). The new bound is shown to be tight. We briefly generalize our results to tasks with multiple resource requirements and multiple processors. Dynamic priority scheduling (EDF) of aperiodic tasks is shown to have the same schedulability bound as for periodic tasks.

Our findings are especially useful for an emerging category of soft real-time applications, such as online trading and e-commerce, where task (request) arrival times are arbitrary, task service times are unknown, and service has to be performed within a given deadline. Our result provides theoretical grounds for guaranteeing deadlines of individual aperiodic requests by observing only the aggregate utilization conditions which simplifies achieving real-time assurances in such applications.

CS-2000-08
Williams, C.; Reynolds, P.F., Jr.; de Supinski, B.R.. Data coherence protocols: the home update protocol, March, 2000.

Abstract: We describe a class of directory coherence protocols called delta coherence protocols that use network guarantees to support a new and highly concurrent approach to maintain a consistent shared memory. Delta coherence protocols are more concurrent than other coherence protocols in that they allow processes to pipeline memory accesses without violating sequential consistency; support multiple concurrent readers and writers to the same cache block; and allow processes to access multiple shared variables atomically without invalidating copies held by other processes or otherwise obtaining exclusive access to the referenced variables. Delta protocols include both update and invalidate protocols. In this paper we describe the simplest, most basic delta protocol, an update protocol called the home update protocol. Delta protocols are based on isotach network guarantees. An isotach network maintains a logical time system that allows each process to predict and control the logical time at which its messages are received. We prove the home update protocol is correct using logical time to reason about the order in which requests are executed.