D. Tarjan, K. Skadron and M. Stan.
In Workshop on Complexity Effective Design (WCED), June 2004
Abstract
The increasing pipeline depth, aggressive clock rates and
execution width of modern processors require ever more accurate dynamic branch
predictors to fully exploit their potential. Recent research on ahead-pipelined
branch predictors [11, 19] and on perceptron-based branch predictors [10, 11]
has offered either increased accuracy or effective single-cycle access times,
at the cost of large hardware budgets and additional complexity in the branch
predictor recovery mechanism. Here we show that a pipelined perceptron predictor
can be constructed so that it has an effective latency of one cycle with a
minimal loss of accuracy. We then introduce the concept of a precomputed local
perceptron, which allows the use of both local and global history in an
ahead-pipelined perceptron. Together, these two techniques allow this new
perceptron predictor to match or exceed the accuracy of previous designs except
at very small hardware budgets, and eliminate most of the complexity that
overriding predictors impose on the rest of the pipeline.
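
For readers unfamiliar with the baseline, the following is a minimal sketch of a conventional, non-pipelined global-history perceptron predictor. It is not the pipelined or precomputed local design this abstract describes. The class name, table size, history length, and training threshold are all illustrative assumptions, and weight saturation is omitted for brevity.

```python
class PerceptronPredictor:
    """Illustrative global-history perceptron predictor (hypothetical names)."""

    def __init__(self, num_entries=64, history_len=8):
        self.num_entries = num_entries
        self.history_len = history_len
        # Assumed training threshold; a commonly used heuristic, not taken
        # from this abstract.
        self.threshold = int(1.93 * history_len + 14)
        # One weight vector per table entry; index 0 is the bias weight.
        self.weights = [[0] * (history_len + 1) for _ in range(num_entries)]
        # Global history as +1 (taken) / -1 (not taken), most recent first.
        self.history = [-1] * history_len

    def _output(self, pc):
        # Dot product of the selected weights with the global history.
        w = self.weights[pc % self.num_entries]
        return w[0] + sum(wi * hi for wi, hi in zip(w[1:], self.history))

    def predict(self, pc):
        # Predict taken when the perceptron output is non-negative.
        return self._output(pc) >= 0

    def update(self, pc, taken):
        y = self._output(pc)
        t = 1 if taken else -1
        # Train on a misprediction or when the output is not yet confident.
        if (y >= 0) != taken or abs(y) <= self.threshold:
            w = self.weights[pc % self.num_entries]
            w[0] += t
            for i, h in enumerate(self.history, start=1):
                w[i] += t * h
        # Shift the actual outcome into the global history.
        self.history = [t] + self.history[:-1]
```

In this sketch the full dot product is computed at prediction time, which is the latency that a pipelined design aims to hide. A caller would invoke `predict(pc)` at fetch and `update(pc, taken)` once the branch resolves.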