Differential Multithreading: Recapturing Pipeline Stall Cycles and Enhancing Throughput in Small-Scale Embedded Microprocessors

J. W. Haskins, Jr. and K. Skadron.
In Proc. of the Workshop on Complexity-Effective Design, June 10, 2000. Held in conjunction with the 27th International Symposium on Computer Architecture, Vancouver, BC.

Abstract
This paper presents Differential Multithreading (dMT) as an inexpensive way to achieve high throughput from a single-issue architecture. dMT switches among multiple instruction streams in response to pipeline stall conditions but saves in-flight instructions: auxiliary pipeline registers hold the state of instructions that are in flight but stalled. This squashes the pipeline bubbles that would otherwise arise from data hazards, branch delays, and cache misses, and keeps the single pipeline as fully utilized as possible.

This paper describes the pipeline organization necessary to support dMT, explains the advantage of shared-pipeline multithreading, and presents preliminary results suggesting that dMT can substantially increase processor utilization.
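
The switch-on-stall idea summarized in the abstract can be illustrated with a small toy model. The sketch below is not the pipeline organization from the paper; it is a simplified abstraction written for this listing. It assumes each instruction is described only by the number of stall cycles it causes for its own thread, and the names `run_single`, `run_switching`, and `ready_at` are hypothetical. The `ready_at` array stands in, very loosely, for the auxiliary pipeline registers that park a stalled thread's state.

```python
def run_single(threads):
    """Baseline: threads run back to back on one pipeline.

    Each instruction occupies 1 issue cycle plus its stall cycles,
    and every stall cycle is a wasted bubble.
    """
    cycles = 0
    done = 0
    for thread in threads:
        for stall in thread:
            cycles += 1 + stall
            done += 1
    return done, cycles


def run_switching(threads):
    """Toy switch-on-stall model (an assumption, not the paper's design).

    When a thread stalls, it is parked until its stall resolves and
    another ready thread may issue in the meantime.
    """
    pc = [0] * len(threads)        # next instruction index per thread
    ready_at = [0] * len(threads)  # first cycle each thread may issue again
    total = sum(len(t) for t in threads)
    cycle = 0
    done = 0
    while done < total:
        # Issue from the first thread that is unfinished and not stalled.
        for i, thread in enumerate(threads):
            if pc[i] < len(thread) and ready_at[i] <= cycle:
                stall = thread[pc[i]]
                pc[i] += 1
                done += 1
                ready_at[i] = cycle + 1 + stall
                break
        # If no thread was ready, this cycle is a bubble.
        cycle += 1
    # Count any stall still pending after the final issue.
    return done, max(ready_at)


# Two hypothetical instruction streams; each number is a stall length.
stream_a = [0, 2, 0, 2]
stream_b = [2, 0, 2, 0]

for name, run in (("single", run_single), ("switching", run_switching)):
    instructions, cycles = run([stream_a, stream_b])
    print(f"{name}: {instructions} instructions in {cycles} cycles, "
          f"utilization {instructions / cycles:.2f}")
```

In this model, the switching variant finishes the same instructions in fewer cycles because stall cycles of one stream are overlapped with useful work from the other. Real gains depend on the stall behavior of the workload and on hardware details that this sketch ignores.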

