- Dissertation: Persistent Storage for Program Metadata (May 2012).
- Abstract
Building a modern software system has increasingly become a complex and difficult task to manage. Typical software development systems involve dozens of tools that assist the programmer in building secure and robust programs. Many of these tools perform detailed analysis, collecting large amounts of program metadata to better understand and improve the program. Unfortunately, this metadata is often discarded as soon as the tool finishes running. By saving, organizing, and making such metadata available across the software development toolchain, software developers can build new tools and improve existing ones with ease. To achieve this goal, this work presents Metaman, a system for metadata storage and retrieval. Metaman allows any tool in the toolchain to submit and query metadata about the program, avoiding duplicated analysis and preserving data that was previously discarded.
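The submit-and-query model described above can be sketched in a few lines. This is a hypothetical illustration only, not Metaman's actual API: the class name, method signatures, and the example metadata kind are all invented for this sketch.

```python
# Hypothetical sketch of a Metaman-style metadata store: tools submit
# records keyed by program and metadata kind, and later tools query them
# instead of re-running the analysis. All names here are illustrative.
from collections import defaultdict

class MetadataStore:
    def __init__(self):
        # (program, kind) -> list of records contributed by any tool
        self._records = defaultdict(list)

    def submit(self, program, tool, kind, data):
        """A tool in the toolchain saves its analysis results."""
        self._records[(program, kind)].append({"tool": tool, "data": data})

    def query(self, program, kind):
        """Any later tool retrieves metadata without redoing the analysis."""
        return self._records.get((program, kind), [])

store = MetadataStore()
# A static analyzer records indirect-branch targets it discovered offline:
store.submit("app.exe", "static-analyzer", "indirect-branch-targets",
             {0x401000: [0x401200, 0x401340]})
# An SDT system can later look the targets up instead of re-analyzing:
targets = store.query("app.exe", "indirect-branch-targets")
```

The point of the sketch is the shape of the interface: once every tool writes through one store, results paid for once are available to the whole toolchain.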
An important class of tools with specific metadata needs is the set of run-time tools. Many high-level programming environments offer robust built-in run-time systems that assist the programmer with features such as security models and run-time introspection. For applications written in languages without such features, Software Dynamic Translation (SDT) systems have been introduced to apply these features to arbitrary programs. However, because these SDT-based tools operate at run time, lengthy analysis phases can degrade application performance. To validate the value and utility of maintaining program metadata across the software development toolchain, this research demonstrates how ubiquitous program metadata enables SDT systems to improve performance, security, and program understanding without costly run-time analysis. These improvements show the benefits of making metadata ubiquitously available to SDT systems as well as to other tools in the development toolchain.
- Dan is working for NVIDIA.
- Dissertation: Data Placement Optimizations for Multilevel Cache Hierarchies (May 2004).
- Abstract
As compiler optimizations have increasingly focused on the memory hierarchy, a variety of efforts have attempted to reduce cache misses in first-level instruction and data caches. Placement of code to reduce instruction cache misses, and placement of data to reduce data cache misses, have been demonstrated to be beneficial for a variety of application programs. However, most of this work has been limited to reduction of first-level cache misses. Careful examination of various characteristics of modern computer architectures reveals opportunities for a data placement optimization framework that targets several means of performance improvement at once. Cache hierarchies have recently extended as deep as three levels, each with different cache miss penalties. Cache misses need to be reduced at all cache levels to maximize performance. Reducing TLB (translation lookaside buffer) misses and virtual memory page use is also desirable. Addressing of global and local variables can use addressing modes of differing costs, and the less expensive addressing modes can be used more frequently if the data placement optimization considers this goal.
A multi-goal data placement framework has been developed to enable all of these optimizations. Through a novel method of static data affinity profiling, followed by a data placement optimization that uses hierarchical graph partitioning and local refinement, it is possible to achieve reductions in cache misses throughout the cache hierarchy, while also increasing page and TLB locality and enabling the addressing mode and bus cycle optimizations. An original method of characterizing the parameters of the cache and TLB hierarchy that are needed for the profiling and optimizations, using hardware performance counters, helps make the entire data placement framework practical and portable. The static data affinity profiling avoids the practical difficulties inherent in past research that relied on expensive dynamic profiling runs. The hierarchical graph partitioning approach to data placement is able to make use of Chaco, a well-tested, off-the-shelf graph partitioning code library. Extensive measurements using timings and cache simulations for Sun UltraSparc-II machines demonstrate the effectiveness of the data placement optimizations.
- Clark is a Research Scientist at Zephyr Software LLC.
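The core idea of affinity-based placement can be illustrated with a toy example. The dissertation's framework uses hierarchical graph partitioning (via Chaco) with local refinement over several cache levels; the greedy pairing below is only a stand-in for that machinery, with invented variable names and a single block size.

```python
# Toy illustration of affinity-driven data placement: variables that are
# accessed close together get a high edge weight in an affinity graph,
# and the placement tries to put high-affinity pairs into the same
# cache-line-sized block. The real framework uses hierarchical graph
# partitioning (Chaco) plus local refinement; this greedy pass is not it.

def place(affinity, sizes, block_size):
    """affinity: {(var_a, var_b): weight}; sizes: {var: bytes}."""
    blocks, assigned = [], set()
    # Consider the heaviest edges first so the strongest-affinity
    # pairs end up sharing a block (and thus a cache line).
    for (a, b), _w in sorted(affinity.items(), key=lambda e: -e[1]):
        if a in assigned or b in assigned:
            continue
        if sizes[a] + sizes[b] <= block_size:
            blocks.append([a, b])
            assigned.update((a, b))
    for v in sizes:                      # leftover variables get own blocks
        if v not in assigned:
            blocks.append([v])
    return blocks

# x and y are almost always accessed together, z rarely with either:
affinity = {("x", "y"): 90, ("x", "z"): 10, ("y", "z"): 5}
sizes = {"x": 8, "y": 8, "z": 8}
layout = place(affinity, sizes, block_size=16)  # x and y share a block
```

A real placement must balance this co-location goal against TLB locality, page use, and addressing-mode costs across the whole hierarchy, which is why a multi-goal partitioning formulation is used rather than a single greedy pass.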
|
- Dissertation: Effective Algorithms for Partitioned Memory Hierarchies in Embedded Systems (May 2004).
- Jason Hiser is a Research Scientist in the Department of Computer Science at the University of Virginia.
- Chris Milner
- Mark Bailey
- Dissertation: Reusable Computing System Descriptions for Retargetable Systems Software.
- Mark is an Associate Professor in the Department of Computer Science at Hamilton College
- Ricky Benitez
- Dissertation: Register Allocation and Phase Interactions in Retargetable Optimizing Compiler.
- Ricky is now working at Google.
- Bruce Childers
- Dissertation: Custom Embedded Counterflow Pipelines.
- Bruce is an Associate Professor in the Department of Computer Science at the University of Pittsburgh.
- Anne Holler was at Transmeta, but the last I heard she was working with Ricky Benitez on a start-up.
- Sanjay Jinturkar is at Lucent working on DSP compilers.
- David Whalley is a Professor in the Department of Computer Science at Florida State University.