Research
I currently direct the
LAVA lab (Laboratory for
Computer Architecture at Virginia).
My research focuses on how to design multicore architectures in
the presence of severe physical constraints, especially thermal, power
delivery, process variations, and wear-out. We are chiefly focusing on
these issues in the context of asymmetric and heterogeneous designs,
which provide the best balance between high single-thread performance
and high throughput for parallel tasks. Support for asymmetry is
also becoming essential as process variations (chiefly "process tilt")
create performance and power asymmetry even in organizations that were
originally designed to be symmetric (see our
DATE'07 paper). To address these challenges, we are taking a
variety of approaches:
- Novel temperature-aware design techniques (e.g. our
HPCA'06 and
DAC'08 papers)
- New temperature modeling capabilities in
HotSpot (e.g.
our
IEEE Trans. Computers'08 and
ISPASS'09 papers)
- New temperature sensing capabilities (e.g. our
ITHERM'06 and upcoming IEEE Trans. Computers papers)
- New reliability management techniques to balance performance and wear-out (e.g. our
IEEE Micro'05 paper), cope with transient faults (e.g. our
GH'06
and
GH'07 papers for GPUs), and take advantage of reconfigurable
resources
- New reliability modeling capabilities (e.g. our
IEEE TVLSI'07 paper)
- Dynamic combination or "federation" of scalar cores
to support runtime
variations in ILP and DLP (e.g. our
DAC'08 paper)
- New, lightweight out-of-order execution techniques with much better
performance/mm² and performance/watt (see our
DAC'08 paper; this was an enabling technique for federation)
- New cache organizations for many cores (see our
ICCD'09 paper).
- New design-space exploration capabilities that reduce simulation
requirements, such as genetically programmed response surfaces (e.g. our
DAC'08 paper)
- New power management techniques, especially in the context of
real-time constraints, spanning a variety of application types from
multimedia (e.g. our
Asilomar'06 paper) to multi-tier e-commerce workloads (e.g. our
PACT'08 paper)
We are also one of the first groups to explore the use of graphics
processors (GPUs) for general-purpose computing (are
GPUs for you?) and the first to develop
an architectural simulation infrastructure, Qsilver, for
performance, power, and thermal studies. In addition to exploring
the implications of heterogeneous organizations combining CPUs, GPUs,
and other processor types, we are exploring how the massive parallelism
of the GPU and its novel SIMD and memory organization can most
effectively be used. To address these questions, we are pursuing a
variety of investigations, such as:
- Developing the
Rodinia
benchmark suite, which provides optimized GPU and
multicore-CPU implementations of a diverse set of applications
(see our
IISWC'09 and
JPDC'08 papers and upcoming
ASPLOS 2010 tutorial)
- Exploring how to most effectively use texture, constant,
per-block shared memory, and other features that GPUs and GPU
languages such as CUDA provide (e.g., see our
ACM Queue'08,
JPDC'08,
IPDPS'09, and
ICS'09 papers)
- Developing new techniques to make SIMD architectures more
effective in the presence of irregular data structures or irregular
parallelism (see our upcoming
SC'09 paper)
- Comparing GPU and FPGA efficiency for diverse application
characteristics (e.g. our
SASP'08 paper)
- Developing new programming abstractions to simplify GPU
programming (e.g. our
ICS'09 paper)
- Developing new dynamic analysis techniques for GPU programs
(e.g. our
STMCS'08 and
PMEA'09 papers)
- Understanding how to best interface MATLAB to the GPU (e.g. our
BiC'09 paper)
Our work uses a variety of tools, from native GPU implementations
using CUDA, Brook, OpenCL and MATLAB to simulation tools, chiefly
M5 and our VF2 extensions for
multicore, multi-threading, SIMD, and asymmetric organizations; our
Genetically
Programmed Response Surfaces Toolkit; and of course
HotLeakage and HotSpot.
In
prior work, my group has:
These research projects have stimulated several innovations in our computer architecture courses, including the development of a Microprocessor Survey Course (also described in a paper at SIGCSE) and the use of
CUDA to teach
both concurrency and parallel architecture.
This work is currently supported by the National Science Foundation under grant nos.
CCF-0903471, CNS-0916908 (ARRA), CNS-0551630 (CRI), IIS-0612049, and CNS-0615277 and the Semiconductor Research
Corporation under task nos. 1607, 1972, and 2042; research grants from Intel MTL,
NVIDIA Research,
and NEC Labs; equipment donations and
extended loans from
NVIDIA and
Hewlett Packard; a GRC/AMD Ph.D. fellowship for Michael Boyer; and an NVIDIA Ph.D. fellowship for Jiayuan Meng.
Prior support has come from
the National Science Foundation under grant
nos. ITR-0082671, CCR-0133634 (CAREER), CCR-0105626, EIA-0224434, DOS-0306404,
CCF-0429765, and CNS-0509245; the Army Research Office
under grant no. W911NF-04-1-0288; IBM
Research; and an Excellence Award from the University of Virginia Fund for Excellence in Science and Technology
(FEST).
Additional support has been provided by
William A. Ballard Fellowships for John W. Haskins and David Tarjan, a
University of Virginia Award for Excellence in Scholarship in the Sciences &
Engineering for David Tarjan, and an ATI graduate fellowship for Jeremy Sheaffer.
Please note that any opinions, findings,
and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the
funding agencies.
Current Postdoctoral Research Associates:
- Wei Huang (Ph.D. UVA, 2007, joining IBM Research-Austin in Jan. 2010)
- Brett Meyer (Ph.D. CMU, 2009)
- Jeremy Sheaffer (Ph.D. UVA, 2007)
Current Graduate Students:
- Michael Boyer
- Shuai Che
- Marisabel Guevera
- Paul Lee (co-advised with Mircea Stan; MS expected Dec. 2009!)
- Mario Donato Marino
- Jiayuan Meng (Ph.D. expected May 2010!)
- Zhenyu Qi (co-advised with Mircea Stan)
- Prateeksha Satyamoorthy
- Lukasz Szafaryn
- Liang Wang
- Runjie Zhang (co-advised with Mircea Stan)
LavaLab Graduate, Postdoctoral, and Visiting Scientist Alumni
- Marco Barcella (MSEE, 2002), U.Va. ECE (advisor: Mircea Stan) - now with IBM
- Sung Woo Chung (Ph.D. SNU 2003, now Asst. Professor at Korea U.)
- Michele Co (Ph.D., 2006) - now a Research Associate with U.Va. CS
- John Haskins (Ph.D., 2003) - now with Qualcomm
- Kevin Hirst (MCS, 2002) - now with Inova
- Tibor Horvath (Ph.D., 2008) - now with Google
- Wei Huang (Ph.D., 2006), U.Va. ECE (advisor: Mircea Stan) - now a Research Associate with U.Va. CS
- Eric Humenay (ME, 2007) - now with RLW, Inc.
- Yingmin Li (Ph.D., Aug. 2006) - now with NVIDIA
- Zhijian Lu (Ph.D., Jan. 2007), U.Va. ECE (advisor: John Lach) - now with Marvell Technology
- Dharmesh Parikh (MCS, 2003) - now with AMD
- Karthik Sankaranarayanan (Ph.D. May 2009) - now with Intel Research
- Jeremy Sheaffer (Ph.D., 2007) - now a Research Associate with U.Va. CS
- David Tarjan (Ph.D. June 2009) - now with NVIDIA Research
- Sivakumar Velusamy (MCS, 2005) - now with Xilinx
- Dee A.B. Weikle (Ph.D., 2001)
- Chris White (MCS, 2007) - now with Mythic Entertainment/Electronic Arts
Undergraduate Researchers:
- Jean Ablutz '01
- Sean Arietta '08, now a Ph.D. student at UVA (CS)
- Peter Brownlee Bakkum '10
- Jeff Barbieri '10
- Adam Banda '07
- Clay Carter '07
- Sui Chan '01
- David Chang '10
- David C. Chu '04, now a Ph.D. student at UC-Berkeley
- Henry Cook '07, now a Ph.D. student at UC-Berkeley
- Steve Cook '07, now at Lockheed Martin
- Puyan Dadvar '05, now with the Washington Metro Transit Authority
- Jonathan Erdman '02
- David Faulkner '06, now at Amentra
- Jesse Foster '05, now working for Verizon Business
- Shougata Ghosh '05, MS *08 Princeton EE
- Matt Goodrum '10
- Douglas Grosvenor '09, now at High Performance Technologies
- Jovian Ho '10
- Philo Juang '00, Ph.D. *05 Princeton EE, now with Google
- Steve Kelley '01
- Sue Kim '04
- Michael King '02, now at Halfaker and Associates
- Paul Lammana '02, now with Solers
- Adrian Lanning '00, now at NTELOS Wireless
- Kyeong-Jae Lee '05, now a Ph.D. student at MIT
- Sang-Ha "Shawn" Lee '10
- Drew Maier '07, now at Electronic Arts
- Ami Malaviya '05, now with McKinsey
- Daniel Marcus '07
- David McWhorter '05, now with Commonwealth Computer Research
- John Miranda '01, MS *04 George Washington University
- Anindo Mukherjee '06, now at SAIC
- Eugene Otto '06, founded FooMojo
- Chris Palmer '07, now with Bloomberg Financial
- Pitchaya Sitthi-Amorn '07, now a Ph.D. student at UVA (CS)
- Adam Spanberger '02, now at ITT NexGen
- Kevin Stammetti '07, now with Booz Allen Hamilton
- Arun Thomas '03, MS *08 UVA, now with Vrije U.
- Michael Trotter '10
- Lora Vaughn '04, now with US-DoD
- Eric Wirth '04
- Yuriy Zilbergleyt '03
Other links:
- Tutorial on NVIDIA GPU programming (CUDA), with David Luebke (NVIDIA Research), Michael Garland (NVIDIA Research), and John Owens (UC Davis), at ASPLOS-XIII, Seattle, WA, Mar. 2008.
- Tutorial on Thermal Issues for Temperature-Aware Computer Systems, with David Brooks (Harvard), Antonio Gonzalez (UPC Barcelona and Intel Barcelona), Lev Finkelstein (Intel Haifa), and Mircea R. Stan (Univ. of Virginia), at ISCA-31, Munich, Germany, June 2004.
- Tutorial on Power-Aware Design for High-Performance Processors, with José González (Intel Barcelona), at HPCA-10, Madrid, Spain, Feb. 2004.
- Notes for how to be a successful conference publications chair; also a template (courtesy of Martin Schulz) for a receipt to send for extra-page charges
- A brief summary of my advising philosophy (also see Good advice on how to succeed as a graduate student from Christos Kozyrakis's website)
- Position papers & final report, 2001 NSF Workshop on Computer Performance Evaluation, co-organized with Margaret Martonosi. The panel's recommendations also appear in an article in the Aug. 2003 issue of IEEE Computer.
Selected Publications
Please note that papers linked here represent author preprints.
The official, published version must be obtained from the publisher's
website or the published print copy. Nevertheless, all publications listed and/or posted
here are
still copyrighted by the publisher or author. Permission is given to make digital or hard copies
of all or part of this material without fee for personal or classroom
use, provided that the copies are not made or distributed for profit or
commercial advantage, and that copies bear the appropriate copyright
notice and the full bibliographic citation. To copy otherwise, to
republish, etc. requires specific permission and/or a fee.
Please note further that any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the sponsoring agencies or employers.
Recent Highlights
-
(SIMD, cache)
D. Tarjan, J. Meng, and K. Skadron. "Increasing Memory
Miss Tolerance for SIMD Cores." In
Proceedings of the
ACM/IEEE International Conference for High Performance Computing, Networking,
Storage and Analysis (SC), Nov. 2009, to appear. (pdf)
-
(gpgpu, accelerators,
manycore, heterogeneous architecture, benchmarks)
S. Che, M. Boyer, J. Meng, D. Tarjan, J. W.
Sheaffer, S.-H. Lee, and K. Skadron. “Rodinia: A Benchmark Suite for
Heterogeneous Computing.” In Proceedings of the IEEE International Symposium
on Workload Characterization (IISWC), pp. 44-54, Oct. 2009. (pdf)

-
(manycore, cache, coherence)
J. Meng and K.
Skadron. “Avoiding Cache Thrashing due to Private Data Placement in Last-Level
Cache for Manycore Scaling.” In Proceedings of the IEEE International
Conference on Computer Design (ICCD), pp. 282-88, Oct. 2009. (pdf)
-
(gpgpu, accelerators,
manycore, heterogeneous architecture, MATLAB)
L. G. Szafaryn, K. Skadron, and J. J. Saucerman.
"Experiences Accelerating MATLAB Systems Biology Applications." In
Proceedings of the Workshop on Biomedicine in Computing: Systems, Architectures,
and Circuits (BiC) 2009, in conjunction with the 36th IEEE/ACM International
Symposium on Computer Architecture (ISCA), June 2009, to appear. (pdf)
-
(gpgpu, accelerators,
manycore, heterogeneous architecture, stencil operations) J. Meng and K.
Skadron. "Performance Modeling and Automatic Ghost Zone Optimization for
Iterative Stencil Loops on GPUs." In Proceedings of the 23rd Annual ACM
International Conference on Supercomputing (ICS),
pp. 256-65, June 2009.
(pdf)
-
(gpgpu, accelerators,
manycore, heterogeneous architecture) M. Boyer, D. Tarjan, S. T. Acton,
and K. Skadron. "Accelerating Leukocyte Tracking using CUDA: A Case Study
in Leveraging Manycore Coprocessors." In Proceedings of the 23rd IEEE
International Parallel and Distributed Processing Symposium (IPDPS), May
2009. (pdf)
-
(thermal)
W. Huang, K.
Skadron, S. Gurumurthi, R. J. Ribando, and M. R. Stan.
"Differentiating the Roles of IR Measurement and Simulation for Power and
Temperature-Aware Design." In
Proceedings of the 2009 IEEE International Symposium on Performance Analysis of
Systems and Software (ISPASS), pp. 1-10, Apr. 2009. (pdf)
-
(thermal)
W. Huang, K. Sankaranarayanan, K. Skadron, R. J. Ribando, and M. R. Stan.
"Accurate, Pre-RTL Temperature-Aware Processor Design Using a Parameterized,
Geometric Thermal Model." IEEE Transactions on Computers,
57(9):1277-88, Sept. 2008, DOI 10.1109/TC.2008.64. (pdf)
-
(power, real-time, control theory)
T. Horvath and K. Skadron. "Multi-mode Energy Management for Multi-tier
Server Clusters." In Proceedings of the ACM/IEEE/IFIP International
Conference on Parallel Architectures and Compilation Techniques (PACT), pp.
270-79, Oct.
2008. (preprint
pdf)
-
(gpgpu, fpga, accelerators,
heterogeneous architecture)
S. Che, J. Li, J. W. Sheaffer, K. Skadron, and J. Lach. “Accelerating Compute
Intensive Applications with GPUs and
FPGAs.” In Proceedings of the IEEE Symposium on Application Specific
Processors (SASP),
pp. 101-07,
June 2008. (pdf)
-
(manycore, thermal)
W. Huang, M. R. Stan, K. Sankaranarayanan, R. J. Ribando, and K. Skadron.
“Many-Core Design from a Thermal Perspective.” In Proceedings of the
45th ACM/IEEE Conference on Design Automation (DAC), June 2008.
(pdf)
-
(manycore)
D. Tarjan, M. Boyer, and K. Skadron. “Federation: Repurposing Scalar Cores for
Out-of-Order Instruction Issue.” In Proceedings of the 45th ACM/IEEE
Conference on Design Automation (DAC), June 2008.
(pdf)
-
(simulation methodology)
H. Cook and K. Skadron. “Predictive Design Space Exploration Using Genetically
Programmed Response Surfaces.” In Proceedings of the 45th ACM/IEEE Conference
on Design Automation (DAC), June 2008.
(pdf)
-
(gpgpu)
J. Nickolls, I. Buck, M. Garland, and K.
Skadron. “Scalable Parallel Programming with CUDA.” ACM Queue,
6(2):40-53, Mar.-Apr. 2008.
DOI 10.1145/1365490.1365500
(pdf)
Highlights from Prior Work
-
(power, branch prediction)
S. W. Chung and K.
Skadron. “On-Demand Solution to Minimize I-Cache Leakage Energy with
Maintaining Performance.” IEEE Transactions on Computers,
57(1):7-24, Jan. 2008, DOI 10.1109/TC.2007.70770.
(pdf)
-
(graphics architecture, reliability)
J. Sheaffer, D. Luebke, and K. Skadron. “A Hardware Redundancy and
Recovery Mechanism for Reliable Scientific Computation on Graphics Processors.”
In Proceedings of Eurographics/ACM Graphics Hardware 2007 (GH), pp.
55-64, Aug.
2007.
(pdf)
-
(parameter variations, multicore,
thermal, power, leakage) E. Humenay, D. Tarjan,
and K. Skadron. "Impact of Process Variations on Multicore Performance
Symmetry." In Proceedings of
the 2007 Conference on Design, Automation and Test in Europe (DATE), pp.
1653-58, Apr. 2007. (pdf)
-
(reliability, thermal) Z. Lu, W. Huang, M. Stan, K.
Skadron, and J. Lach. “Interconnect
Lifetime Prediction for Reliability-Aware Systems.” IEEE Transactions on
VLSI Systems, 15(2):159-72, Feb. 2007. (pdf)
-
(branch prediction, trace cache,
power)
M. Co, D. A.B. Weikle, and K. Skadron. "Evaluating Trace Cache Energy
Efficiency." ACM Transactions on Architecture and Code Optimization (TACO),
3(4):450-76,
Dec. 2006. (Abstract
| pdf)
-
(power, multimedia, real-time)
Z. Lu, J. Lach, K. Skadron, and M. R. Stan. “Design and Implementation of
an Energy Efficient Multimedia Playback System.” In Proceedings of the 40th
Asilomar Conference on Signals, Systems and Computers, Oct. 2006. (pdf)
-
(graphics architecture, reliability)
J. W. Sheaffer, D. P. Luebke, and K. Skadron. “The Visual Vulnerability Spectrum: Characterizing Architectural Vulnerability for Graphics Hardware.” In
Proceedings of
Eurographics/ACM Graphics Hardware 2006 (GH),
pp. 9-16, Sept. 2006. (pdf)
-
(thermal) S. W. Chung and K. Skadron. “Using on-Chip Event Counters for High-Resolution,
Real-Time Temperature Measurements.” In Proceedings of the IEEE/ASME Tenth
Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic
Systems (ITHERM), June 2006. (pdf)
-
(power) Z. Lu, Y. Zhang,
M. R. Stan, J. Lach, and K. Skadron. “Procrastinating Voltage Scheduling with
Discrete Frequency Sets.” In Proceedings of the 2006
Design, Automation and Test in Europe Conference (DATE), pp. 456-61, Mar.
2006. (pdf)
-
(multi-core architecture, power,
thermal) Y. Li, B. C. Lee, D. Brooks, Z. Hu, and K. Skadron. "CMP Design Space
Exploration Subject to Physical Constraints." In Proceedings of the
Twelfth IEEE International Symposium on High Performance Computer Architecture (HPCA),
pp. 15-26, Feb. 2006. (pdf)
-
(power) V. Narayanan and K.
Skadron. "Architectural/System Design and Optimization," in "CAD
Algorithms, Methods and Tools For Low-Power Circuits and Systems,"
E. Macii ed. IEEE Council on Electronic Design Automation (C-EDA) Technology
Survey, Jan. 2006. (IEEE
Xplore link)
-
(thermal) K. Sankaranarayanan, S.
Velusamy, M.R. Stan, and K. Skadron. "A Case for Thermal-Aware Floorplanning at
the Microarchitectural Level." The Journal of Instruction-Level Parallelism, vol. 7, Oct. 2005, http://www.jilp.org/vol7/. (pdf)
-
(branch prediction) D. Tarjan and K. Skadron.
“Merging Path and Gshare Indexing in Perceptron Branch Prediction.”
ACM
Transactions on Architecture and Code Optimization, 2(3):280-300, Sept. 2005.
(pdf)
-
(power, thermal) Y. Li, M. Hempstead, P. Mauro,
D. Brooks, Z. Hu, and K. Skadron. “Power and Thermal Effects of SRAM vs.
LatchMux Design.” In Proceedings of the ACM/IEEE 2005 International
Symposium on Low-Power Electronics Design (ISLPED), pp. 173-78, Aug. 2005.
(pdf)
-
(thermal, security)
P.
Dadvar and K. Skadron. “Potential Thermal Security Risks.” In
Proceedings of the IEEE Semiconductor Thermal Measurement, Modeling, and
Management Symposium (Semi-Therm 21), pp. 229-34, Mar. 2005. (pdf)
-
(thermal, graphics architecture)
J. W. Sheaffer, K. Skadron, and D. P. Luebke. “Studying Thermal
Management for Graphics-Processor Architectures.” In Proceedings of the
2005 IEEE International Symposium on Performance Analysis of Systems and
Software (ISPASS), Mar. 2005. (pdf
|
Qsilver software home page)
-
(thermal)
K. Skadron, K. Sankaranarayanan,
S. Velusamy, D. Tarjan, M.R. Stan, and W. Huang. “Temperature-Aware
Microarchitecture: Modeling and Implementation.” ACM Transactions on
Architecture and Code Optimization, 1(1):94-125, Mar. 2004.
(pdf)
-
(leakage power)
Y. Li, D. Parikh, Y. Zhang, K. Sankaranarayanan, M. R. Stan, and K. Skadron.
“State-Preserving vs. Non-State-Preserving Leakage Control in Caches.”
In Proceedings of the 2004 Design, Automation and Test in Europe (DATE)
Conference, pp. 22-27, Feb. 2004. (pdf)
[HotLeakage software home page]
- (power, real-time)
V. Sharma, A. Thomas, T. Abdelzaher, Z. Lu, and K. Skadron. “Power-Aware
QoS Management on Web Servers.” In Proceedings of the 24th International
Real-Time Systems Symposium, pp. 63-72, Dec. 2003. (pdf)
(Best student paper!)
-
(branch prediction)
Z. Lu, J. Lach, M. Stan, and K. Skadron. “Alloyed Branch History:
Combining Global and Local Branch History for Robust Performance,” International
Journal of Parallel Programming, Kluwer, 31(2):137-77, Apr. 2003. (pdf
|
Abstract)
-
(write buffers)
K. Skadron and D.W. Clark. "Design Issues and Tradeoffs for Write Buffers." In
Proceedings of the Third International Symposium on High-Performance Computer Architecture, pp. 144-55, February 1997. (postscript |
pdf |
abstract)
Complete list of Skadron's publications