Reading List
CS 851: Aggressive Speculative Architectures

It is up to you to print out each paper (or not, if you prefer to read it
online).
Note: "background" papers provide necessary background for the in-class
discussion, but we will not discuss such papers directly.
1. Week of Jan. 19 (Intro)
(I'll be talking about these papers, so they're not required
reading)
-
(reference) T.-Y. Yeh and Y. N. Patt. "A comparison of dynamic
branch predictors that use two levels of branch history." In Proc. ISCA-20,
pages 257-66, May 1993. [Abstract | PDF]
-
(reference) S. McFarling. "Combining branch predictors." Tech. Note
TN-36, DEC WRL, June 1993. [Abstract | PDF]
2. Week of Jan. 24 (Aggressive branch prediction)
Session 1 (it's not as bad as it looks!):
-
(background) S. Sechrest, C.-C. Lee, and T. Mudge. "Correlation and
aliasing in dynamic branch predictors." In Proc. ISCA-23, pages 22-32,
May 1995. [PDF]
-
K. Skadron, M. Martonosi, and D.W. Clark. "Alloying Global and Local
Branch History: A Robust Solution to Wrong-History Mispredictions."
Tech Report TR-606-99, Princeton Dept. of Computer Science, Oct. 1999.
Submitted for publication. [Abstract | PDF]
-
M. Evers, S. J. Patel, R. S. Chappell, and Y. N. Patt. "An analysis
of correlation and predictability: What makes two-level branch predictors
work." In Proc. ISCA-25, pages 52-61, June 1998. [PDF]
Session 2:
-
C.-C. Lee, I.-C. K. Chen, and T. N. Mudge. "The bi-mode branch predictor."
In Proc. Micro-30, pages 4-13, Dec. 1997. [PS]
-
A.N. Eden and T. Mudge. "The YAGS branch prediction scheme."
In Proc. MICRO-31, pages 69-77, Dec. 1998. [PDF]
3. Week of Jan. 31 (More branch prediction)
Session 1:
-
D. I. August, D. A. Connors, J. C. Gyllenhaal, and W. W. Hwu. "Architectural
Support for Compiler-Synthesized Dynamic Branch Prediction Strategies:
Rationale and Initial Results." In Proc. HPCA-3, Feb. 1997.
[Abstract | PS]
-
S. Mahlke and B. Natarajan. "Compiler Synthesized Dynamic Branch
Prediction." In Proc. Micro-29, pages 153-164, Dec. 1996. [PDF]
Session 2:
-
G. Reinman, T. Austin, and B. Calder. "A scalable front-end architecture
for fast instruction delivery." In Proc. ISCA-26, May 1999.
[PDF]
-
(catch-up)
4. Week of Feb. 7 (Trace cache, trace processors, and instruction
reuse)
Session 1:
-
(background) E. Rotenberg, S. Bennett, and J. Smith. "Trace
cache: A low latency approach to high bandwidth instruction fetching."
In Proc. Micro-29, Dec. 1996. [PDF]
-
E. Rotenberg, Q. Jacobsen, Y. Sazeides, and J. Smith. "Trace processors."
In Proc. Micro-30, Dec. 1997. [PS]
Session 2:
-
E. Rotenberg and J. Smith. "Control Independence in Trace Processors."
In Proc. MICRO-32, pages 4-15, Nov. 1999. [PDF]
-
(background) A. Sodani and G. Sohi. "Dynamic Instruction Reuse."
In Proc. ISCA-24, pages 194-205, June 1997. [PDF]
-
J. Huang and D. Lilja. "Exploiting Basic Block Value Locality with
Block Reuse." In Proc. HPCA-5, pages 106-14, Jan. 1999. [Gzip'd PS]
5. Week of Feb. 14 (SMT, SSMT, and Multipath)
Session 1:
-
D.M. Tullsen, S.J. Eggers, J.S. Emer, H.M. Levy, J.L. Lo, and R.L. Stamm.
"Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous
Multithreading Processor." In Proc. ISCA-23, May 1996. [PDF]
-
R. S. Chappell, J. Stark, S. P. Kim, S. K. Reinhardt and Y. N. Patt.
"Simultaneous subordinate microthreading (SSMT)." In Proc. ISCA-26,
pages 186-195, May 1999. [PDF]
Session 2:
-
P.S. Ahuja, K. Skadron, M. Martonosi, and D.W. Clark. "Multipath Execution:
Opportunities and Limits." In Proc. ICS '98, pp. 101-08, July 1998.
[Abstract | PDF]
-
N. Mitchell, L. Carter, J. Ferrante, and D. Tullsen. "ILP versus
TLP on SMT." In Proc. of Supercomputing '99, Nov. 1999. [PS]
6. Week of Feb. 21 (Multiscalar)
Session 1:
-
G.S. Sohi, S. Breach, and T.N. Vijaykumar. "Multiscalar Processors."
In Proc. ISCA-22, 1995. [Gzip'd PS]
-
M. Franklin and G.S. Sohi. "A Hardware Mechanism for Dynamic Reordering
of Memory References." IEEE Transactions on Computers, May 1996.
[Gzip'd PS]
Session 2:
-
T.N. Vijaykumar and G.S. Sohi. "Task Selection for a Multiscalar
Processor." In Proc. Micro-31, pp. 81-92, Nov-Dec 1998. [Gzip'd PS]
-
A.I. Moshovos, S.E. Breach, T.N. Vijaykumar, G.S. Sohi. "Dynamic
Speculation and Synchronization of Data Dependences." In Proc. ISCA-24,
pp. 181-93, June 1997. [Gzip'd PS]
7. Week of Feb. 28 (Chip Multiprocessing and Thread-Level Data
Speculation)
Session 1:
-
K. Olukotun, B.A. Nayfeh, L. Hammond, K. Wilson, and K.-Y. Chang.
"The Case for a Single-Chip Multiprocessor." In Proc. ASPLOS-VII,
pp. 2-11, Oct. 1996. [PDF]
-
K. Olukotun, L. Hammond, and M. Willey. "Improving the Performance
of Speculatively Parallel Applications on the Hydra CMP." In Proc.
ICS '99, June 1999. [PDF]
Session 2:
-
J.G. Steffan and T.C. Mowry. "The Potential for Using Thread-Level
Data Speculation to Facilitate Automatic Parallelization." In Proc.
HPCA-4, pp. 2-13, 1998. [PS]
-
J.G. Steffan, C. B. Colohan, and T. C. Mowry. "Extending Cache Coherence
to Support Thread-Level Data Speculation on a Single Chip and Beyond."
Tech. Report CMU-CS-98-171, Carnegie-Mellon School of Computer Science,
Dec. 1998. [PS]
8. Week of Mar. 6 (Value Speculation and Caching)
Session 1:
-
M. H. Lipasti, C. B. Wilkerson, and J. P. Shen. "Value Locality and
Load Value Prediction." In Proc. ASPLOS-VII, pp. 138-47, Oct. 1996.
[PDF]
- (we'll do this one quickly)
-
P. Marcuello, J. Tubella, and A. González. "Value Prediction
for Speculative Multithreaded Architectures." In Proc. MICRO-32,
pp. 230-36, Nov. 1999. [PDF]
-
A. Sodani and G.S. Sohi. "Understanding the differences between value
prediction and instruction reuse." In Proc. MICRO-31, pp. 205-15,
Nov. 1998. [PDF]
Session 2:
-
N. Jouppi. "Improving Direct-Mapped Cache Performance by the Addition
of a Small Fully-Associative Cache and Prefetch Buffers." In Proc.
ISCA-17, May 1990. [PDF]
-
K. Farkas and N. Jouppi. "Complexity/Performance Tradeoffs with Non-Blocking
Loads." In Proc. ISCA-21, pp. 211-22, Apr. 1994. [PDF]
9. Week of Mar. 13: Spring Break
10. Week of Mar. 20 (More Caching)
Session 1:
-
N. Jouppi. "Cache Write Policies and Performance." In Proc.
ISCA-20, pp. 191-201, May 1993. [PDF]
-
T. Juan, T. Lang, and J.J. Navarro. "The difference-bit cache."
In Proc. ISCA-23, pp. 114-20, May 1996. [PDF]
-
D. Albonesi. "Selective cache ways: on-demand cache resource allocation."
In Proc. MICRO-32, pp. 248-59, Nov. 1999. [PDF]
Session 2:
-
S.G. Abraham, R.A. Sugumar, D. Windheiser, B.R. Rau, and R. Gupta.
"Predictability of Load/Store Instruction Latencies." In Proc. MICRO-26,
pp. 139-52, Nov. 1993. [PDF]
-
S.T. Srinivasan and A.R. Lebeck. "Load latency tolerance in dynamically
scheduled processors." In Proc. MICRO-31, pp. 148-59, Nov. 1998.
[PDF]
11. Week of Mar. 27 (Even More Caching, and IRAM)
Session 1:
-
A. Roth, A. Moshovos, and G.S. Sohi. "Dependence based prefetching
for linked data structures." In Proc. ASPLOS-VIII, pp. 115-26, Oct.
1998. [PDF]
-
B. Calder, C. Krintz, S. John and T. Austin. "Cache-conscious data placement."
In Proc. ASPLOS-VIII, pp. 139-49, Oct. 1998. [PDF]
Session 2:
-
D. Patterson et al. "A Case for Intelligent DRAM: IRAM."
IEEE Micro, Apr. 1997. [PDF | PS]
-
C.E. Kozyrakis and D.A. Patterson. "A New Direction in Computer Architecture
Research." IEEE Computer, Nov. 1998. [PDF]
12. Week of Apr. 3 (Wacky Stuff: Diva and DataScalar)
Session 1:
-
T.M. Austin. "DIVA: a reliable substrate for deep submicron microarchitecture
design." In Proc. MICRO-32, pp. 196-207, Nov. 1999. [PDF]
Session 2:
-
D. Burger, S. Kaxiras, and J.R. Goodman. "DataScalar architectures."
In Proc. ISCA-24, pp. 338-49, June 1997. [PDF]
13. Week of Apr. 10 (VLIW)
Session 1:
-
R.P. Colwell et al. "A VLIW architecture for a trace scheduling
compiler." In Proc. ASPLOS-II, pp. 180-92. [Online text not available]
-
B. R. Rau, D. W. L. Yen, W. Yen, and R. A. Towle. "The Cydra 5 Departmental
Supercomputer - Design Philosophies, Decisions, and Trade-offs." IEEE
Computer, 22(1):12-35, January 1989. [Online text not available]
Session 2:
-
S. A. Mahlke et al. "Sentinel Scheduling for VLIW and Superscalar
Processors." In Proc. ASPLOS-V, pp. 238-247, Oct. 1992. [PDF]
-
W.W. Hwu et al. "The Superblock: An Effective Technique for
VLIW and Superscalar Compilation." In The Journal of Supercomputing,
Kluwer Academic Publishers, 1993, pp. 229-248. [PS]
14. Week of Apr. 17 (Predication)
Session 1:
-
G.S. Tyson. "The effects of predicated execution on branch prediction."
In Proc. MICRO-27, pp. 196-206, Nov. 1994. [PDF]
-
Scott A. Mahlke et al. "A comparison of full and partial predicated
execution support for ILP processors." In Proc. ISCA-22, June 1995.
[PDF]
Session 2:
-
David I. August et al. "The program decision logic approach to predicated
execution." In Proc. ISCA-26, pp. 208-19, May 1999. [PDF]
15. Week of Apr. 24 (Potpourri)
Session 1:
-
S.C. Goldstein et al. "PipeRench: a reconfigurable architecture and compiler." In IEEE Computer, pp. 70-77, Apr. 2000.
[PDF]
-
R. Barua, W. Lee, S. Amarasinghe and A. Agarwal. "Maps: a compiler-managed
memory system for RAW machines." In Proc. ISCA-26, pp. 4-15, May
1999. [PDF]
Session 2:
-
David Brooks and Margaret Martonosi. "Dynamically Exploiting Narrow Width
Operands to Improve Processor Power and Performance." In Proc. HPCA-5, Jan. 1999.
[PDF]
And that's all, folks!
Last updated June 6, 2000