CS 8501: Special Topics in Computer Science - Scalable Processor
Architectures
Instructor: Kevin Skadron
Class meetings: MW 3:30-4:45 in MSB 125
Instructor office hours: Mon-Thurs 1-2 and by appt.
Prerequisite: CS 6354 or equivalent
This course will count toward breadth under the "Computer Systems" area for
CS students and under Area I, "Hardware" for CPE students.
This course will survey classic and recent research in parallel computer
architecture with a focus on the design of future, scalable, parallel
microprocessor architectures and limiting physical constraints such as power,
thermal, and variability phenomena. The course will consist of paper readings and
presentations, surveys of classic and contemporary systems, and a term project.
Sample topics:
- Classic parallel architecture research (scalable coherence, NUMA vs.
COMA, etc.)
- Interconnect architecture
- Memory architecture
- Technology limits
- Existing and proposed scalable multicore/manycore architectures (GPUs, Cell, TRIPS, RAW,
Cray, Blue Gene, etc.)
The term project will be a research project. Group projects are encouraged.
The project topic and plan should offer the potential for publication.
The course grade will be based approximately on:
- Presentation or lecture (one per student): 20%
- In-class participation: 20%
(note that participation will include short comments about the paper of the
day from each student, to be posted on the class wiki)
- Scribe duties: 10% (summary of discussion, to be posted on class wiki
within 24 hrs)
- Final project: 50% (this in turn will be divided between a proposal,
status report, and final report and presentation)
Honor code terms: All work must correctly attribute sources. All group
work must represent equal effort from all partners--deviations from equal effort
must be documented and should be discussed with me first. Projects must
represent original work.
Readings/topics (pointers to these papers are available on Collab under
"Resources," and additional relevant readings are sometimes listed on the
course wiki):
- Mon, 1/25: CMP design space: "Maximizing CMP Throughput with Mediocre
Cores," Davis, Laudon, and Olukotun. PACT'05. Presented by
Marisabel Guevara (slides).
- Wed, 1/27: CMP design space: "CMP Design Space Exploration Subject to
Physical Constraints." Li, Lee, Brooks, Hu, and Skadron. HPCA'06.
Presented by Shuai Che (slides).
- Mon, 2/1: GPU architecture: "GPU Computing." Owens et al. Proceedings
of the IEEE, 96(5):879-99, May 2008. DOI 10.1109/JPROC.2008.917757.
Presented by Michael Boyer (slides on Collab).
- Wed, 2/3: Multicore and power simulation: "McPAT: An Integrated Power,
Area, and Timing Modeling Framework for Multicore and Manycore
Architectures." Li et al. MICRO 2009. Presented by Runjie Zhang
(slides).
- Mon, 2/8: Memory architecture: "Fully-Buffered DIMM Memory
Architectures: Understanding Mechanisms, Overheads and Scaling." Ganesh,
Jaleel, Wang, and Jacob. HPCA 2007. Presented by Mario Marino (slides).
- Wed, 2/10: Memory architecture: "Scaling the Bandwidth Wall: Challenges
in and Avenues for CMP Scaling." Rogers et al. ISCA 2009.
Presented by Jason Mars (slides on
Collab).
- You may also wish to refer to "Memory Bandwidth Limitations of Future
Microprocessors." Burger, Goodman, and Kagi. ISCA 1996.
- and "Understanding How Off-Chip Memory Bandwidth Partitioning in Chip
Multiprocessors Affects System Performance." Liu, Jiang, and Solihin.
HPCA 2010.
- Mon, 2/15: NoC architecture: "Performance Evaluation and Design
Trade-Offs for Network-on-Chip Interconnect Architectures." Pande et
al. IEEE Transactions on Computers, 54(8):1025-40, Aug. 2005.
DOI 10.1109/TC.2005.134. Presented by Lukasz Szafaryn (slides).
- You may also wish to refer to "Route Packets, Not Wires: On-Chip
Interconnection Networks." Dally and Towles. DAC 2001.
- Wed, 2/17: NoC architecture: "Synthesis of Networks on Chips for 3D
Systems on Chips." Murali, Seiculescu, Benini, and De Micheli.
ASP-DAC'09. Presented by Puqing Wu (slides).
- Mon, 2/22: Lifetime reliability: "Exploiting Structural Duplication for
Lifetime Reliability Enhancement." Srinivasan et al. ISCA'05.
Presented by Brett Meyer.
- Wed, 2/24: Soft errors: "A Systematic Methodology to Compute the
Architectural Vulnerability Factors for a High-Performance Microprocessor."
Mukherjee et al. MICRO'03. Presented by Saad Arrabi (slides).
- Mon, 3/1: Process variations: "VARIUS: A Model of Process Variation and
Resulting Timing Errors for Microarchitects." Sarangi et al. IEEE Trans.
Semiconductor Manufacturing, 21(1), Feb. 2008. Presented by
Prateeksha Satyamoorthy (slides).
- Wed, 3/3: Process variations: "Scheduling Algorithms for Unpredictably
Heterogeneous CMP Architectures." Winter and Albonesi. DSN'08.
Presented by Chris Gregg (slides).
- Week of 3/8: spring break
- Mon, 3/15: Distributed caches: "Reactive NUCA: near-optimal block
placement and replication in distributed caches." Hardavellas et al.
ISCA'09. Presented by Abhishek Rawat (slides on
Collab).
- Wed, 3/17: project discussions
- Mon, 3/22: Coherence: "A Framework for Coarse-Grain Optimizations in the
On-Chip Memory Hierarchy." Zebchuk et al. MICRO'07.
Presented by Marko Miklo (slides).
- Wed, 3/24: Consistency: "BulkSC: Bulk Enforcement of Sequential
Consistency." Ceze et al. ISCA'07. Presented by
Jing Yang (slides on
Collab).
- Mon, 3/29: Vector: "Overcoming the Limitations of Conventional Vector
Processors." Kozyrakis and Patterson. ISCA'03. Presented by
Liang Wang (slides on
Collab).
- Wed, 3/31: SIMD divergence: "Dynamic
Warp Subdivision for Integrated Branch and Memory Divergence Tolerance."
Meng et al. ISCA'10. Presented by Greg Faust (slides on
Collab).
- Mon, 4/5: Novel/data-parallel architectures: "The Vector-Thread
Architecture." Krashinsky et al. ISCA'04.
Presented by Dan Upton (authors'
slides).
- Wed, 4/7: Novel/data-parallel architectures: TRIPS. Presented
by Chris Gregg (authors'
slides).
1. Primary paper: "Exploiting ILP, TLP, and DLP Using Polymorphism in
the TRIPS Architecture." Sankaralingam et al. ISCA'03.
2. Supplementary paper: "An Evaluation of the TRIPS Computer System."
Gebhart et al. ASPLOS'09.
3. Another paper of interest: "Universal Mechanisms for Data-Parallel
Architectures." Sankaralingam et al. MICRO'03.
- Mon, 4/12: Novel/data-parallel architectures: Imagine.
Presented by Lukasz Szafaryn (slides).
1. Primary paper: "Evaluating the Imagine Stream Architecture."
Ahn et al. ISCA'04.
2. Helpful background reading: "Compiling for Stream Processing."
Das et al. PACT'06.
- Wed, 4/14: Novel architectures: RAW. Presented by Abhishek
Rawat (slides on
Collab).
1. Primary paper: "Evaluation of the Raw Microprocessor: An
Exposed-Wire-Delay Architecture for ILP and Streams." Taylor et al.
ISCA'04.
2. Helpful background reading: "Exploiting coarse-grained task, data,
and pipeline parallelism in stream programs." Gordon et al. ASPLOS'06.
- Mon, 4/19: Novel/dataflow architectures: "WaveScalar."
Swanson et al. MICRO'03. Presented by Marisabel Guevara (slides on
Collab).
- Wed, 4/21: Historic: "The M-Machine Multicomputer." Fillo et al.
MICRO'95. Presented by Marko Miklo; scribe Abhishek Rawat.
- Mon, 4/26: Dynamic cores: "Forwardflow: A Scalable Core for
Power-Constrained CMPs." Gibson and Wood. ISCA'10. Presented
by Prateeksha Satyamoorthy; scribe Liang Wang.
- Wed, 4/28: Application-specific processors: "Understanding Sources of
Inefficiency in General-Purpose Chips." Hameed et al. ISCA'10.
Presented by Saad Arrabi; scribe Puqing Wu.
- Mon, 5/3 (last class!): Application-specific processing: "Anton, a Special-Purpose
Machine for Molecular Dynamics Simulation." Shaw et al. ISCA'07.
Presented by Jason Mars; scribe Greg Faust.
- Finals period: Project presentations
Last updated: 26 Apr. 2010