CS 6501: Special Topics in Computer Architecture: Heterogeneous and Scalable Computing
Instructor: Kevin Skadron
Class meetings: T/Th 11-12:15, OLS 228E
Office hours: M/F 9-11
Prerequisite: undergraduate-level computer architecture
This course will count toward breadth under the "Computer Systems" area for
CS students and under Area I, "Hardware," for CPE students.
This year's computer architecture special topics
course will explore how physical constraints--specifically power, thermal, and bandwidth limits--are
changing processor design, and how heterogeneous multiprocessor
organizations can address these limits. Heterogeneity may include
mixtures of CPUs, GPUs, DSPs, FPGAs, and other specialized
processors, as well as heterogeneous combinations of memory elements. GPUs are already becoming mainstream and are even being integrated on the same chip as the CPU.
We will survey some of these processors and the workloads driving them, and then consider how
heterogeneous architectures affect software. Finally, we will
study how to optimize the architecture of a heterogeneous processor subject to
physical constraints.
The course will primarily consist of lectures and student-led
discussions. Assignments will include paper and processor
presentations; some brief pencil-and-paper exercises; programming assignments; and a small research project.
Brief Outline
Some topics may span multiple days
- Overview of design challenges that motivate heterogeneity
- Power-aware design [with brief exercise]
- Temperature-aware design [with brief exercise]
- I/O constraints, DRAM overview
- "Dark silicon" projections [with brief exercise]
- GPU tutorial [with brief programming exercise]
- FPGA tutorial [with brief programming exercise]
- Other reconfigurable architectures (e.g., Tartan)
- [if time permits] Student-led presentations describing accelerator cores
- [if time permits] Student-led presentations describing MP-SoCs
- [if time permits] Application case studies
Final projects will be selected later in the semester, with the goal of exploring one of the above topic areas in greater depth. It is hoped that most projects can serve as the foundation for a subsequent publication.
Detailed Schedule
- 8/23: Overview/problem statement (Skadron)
- 8/25 - 9/8: Power-aware design tutorial (Skadron)
- 9/13 - 9/15: Thermal tutorial (Skadron)
- 9/20: Discussion re scaling-projection homework
- 9/22: Reading: Sampson, Complex Operators, HPCA'11 (Wadden)
- 9/27: Reading: Govindaraju, Dynamically Specialized Datapaths, HPCA'11 (Dorn)
- 9/29: Small Scale Reconfigurability (Lach); Reading: Lach, Application-Specific Product Generics, IEEE Computer'09
- 10/4: Reading: Esmaeilzadeh, Dark Silicon, ISCA'11 (Hall)
- 10/6: GPU overview (Boyer)
- 10/13: Reading: Lee, Debunking the 100X GPU vs. CPU myth, ISCA'10 (Zhang)
- 10/18: Reading: Gebhart, An Evaluation of the TRIPS Computer System, ASPLOS'09 (Klinefelter)
- 10/20: Reading: Gordon, Exploiting coarse-grained task, data, and pipeline parallelism in stream programs, ASPLOS'06 (Gregg)
- 10/25: Reading: Lee, "Exploring the Tradeoffs between Programmability and Efficiency in Data-Parallel Accelerators," ISCA'11 (Craig)
- 10/27: Reading: Choudhary, "FabScalar: composing synthesizable RTL designs of arbitrary cores within a canonical superscalar template", ISCA'11 (Shu)
- 11/1: Reading: Stitt, "Thread Warping: Dynamic and Transparent Synthesis of Thread Accelerators," TODAES'11 (Arrabi)
- 11/3: Reading: Che, "Accelerating Compute Intensive Applications with GPUs and FPGAs," SASP'08 (Wadden)
- 11/8: Reading: Solomatnikov, "Using a configurable processor generator for computer architecture prototyping", MICRO'09 (Boley)
- 11/10: Reading: Njoroge, "ATLAS: a chip-multiprocessor with transactional memory support," DATE'07 (Newell)
- 11/15: FPGA overview (Lach)
- 11/17: Reading: Wee, "A practical FPGA-based framework for novel CMP research," FPGA'07 (Arrabi)
- 11/22: Reading: Mishra, "Tartan: evaluating spatial computation for whole program execution," ASPLOS'06 ()
- 11/29: Reading: Ahn, "Future scaling of processor-memory interfaces," SC'09 (Gregg)
- 12/1: Reading: Daga, "On the Efficacy of a Fused CPU+GPU Processor (or APU) for Parallel Computing," SAAHPC'11 (Newell, Hall, Craig)
- 12/6: Reading: Swanson, WaveScalar ISCA'06 and TOCS'07 papers (Boley, Dorn, Shu, Zhang)
Assignments
Last updated: 30 Nov. 2011