Control-Theoretic Techniques and Thermal-RC Modeling for Accurate and Localized Dynamic Thermal Management

K. Skadron, T. Abdelzaher, and M. R. Stan
In Proc. of the 2002 International Symposium on High-Performance Computer Architecture, February, 2002, Cambridge, MA.

This paper proposes the use of formal feedback control theory as a way to implement adaptive techniques in the processor architecture. Dynamic thermal management (DTM) is used as a test vehicle, and variations of a PID controller (Proportional-Integral-Derivative) are developed and tested for adaptive control of fetch "toggling." To accurately test the DTM mechanism being proposed, this paper also develops a thermal model based on lumped thermal resistances and thermal capacitances. This model is computationally efficient and tracks temperature at the granularity of individual functional blocks within the processor. Because localized heating occurs much faster than chip-wide heating, some parts of the processor are more likely to be "hot spots" than others.

Experiments using Wattch and the SPEC2000 benchmarks show that the thermal trigger threshold can be set within 0.2 degrees of the maximum temperature without the processor ever entering thermal emergency. This cuts the performance loss of DTM by 65% compared to the previously described fetch toggling technique that uses a response of fixed magnitude.
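
As an illustration of the scheme summarized above, the following minimal Python sketch couples a single lumped thermal-RC node to a textbook PID controller that sets the fetch duty cycle. It is not the paper's implementation: the single-node model, the names PID, simulate, and EMERGENCY, the gains, and every numeric value (thermal resistance, capacitance, power levels, trigger and emergency temperatures) are assumptions chosen only so that the example runs.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller; the gains used below are assumptions."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, error: float, dt: float) -> float:
        # Keep the integral non-negative so it does not wind up while the
        # block is still below the trigger temperature.
        self.integral = max(0.0, self.integral + error * dt)
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(controller=None, steps=20000, dt=1e-5, r=2.0, c=1e-3,
             t_base=100.0, p_idle=1.0, p_max=5.0, trigger=107.8):
    """Euler-integrate one lumped node: C * dT/dt = P - (T - t_base) / R.

    All default values are hypothetical. Returns (peak temperature,
    final fetch duty cycle).
    """
    temp, peak, duty = t_base, t_base, 1.0
    for _ in range(steps):
        if controller is not None:
            # Positive error (block hotter than the trigger) requests throttling.
            throttle = controller.update(temp - trigger, dt)
            duty = 1.0 - min(1.0, max(0.0, throttle))
        # Assumed power model: idle power plus a part proportional to fetch duty.
        power = p_idle + duty * (p_max - p_idle)
        temp += dt / c * (power - (temp - t_base) / r)
        peak = max(peak, temp)
    return peak, duty


EMERGENCY = 108.0  # hypothetical limit, 0.2 degrees above the assumed trigger

if __name__ == "__main__":
    peak_free, _ = simulate()
    peak_pid, duty = simulate(PID(kp=5.0, ki=50.0, kd=1e-5))
    print(f"no DTM: peak {peak_free:.2f}")
    print(f"PID:    peak {peak_pid:.2f}, final duty {duty:.2f}")
```

With these assumed values the uncontrolled run settles above the hypothetical emergency limit, while the controlled run stays below it by lowering the fetch duty cycle in proportion to how far the block temperature is above the trigger. A fixed-magnitude response would instead apply the same throttle regardless of the size of the error.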

This paper contains four errors that require clarification.

  1. Equation (4) has a typo: to be correct, there should be a minus sign between the two terms.  Otherwise temperature can only increase.
  2. The thermal resistance of the package is neglected, but subsequent preliminary simulations show that this resistance is significant.
  3. The areas in Table 3 are too small by a factor of 10.  Note that correcting these will lead to smaller thermal resistances and correspondingly smaller rises in temperature, on the order of only 1 degree.  But early measurements suggest that the effect of the package (item #2 above) raises this temperature by about a factor of 10.  This means that the temperature differentials reported in this paper are approximately correct, and certainly adequate to evaluate the PID-fetch-throttling scheme that we propose.
  4. We calculate Cblock and Rnorm values using 100 * area for Cblock and 1e-6/area for Rnorm. However, in Table 3, the R values for the LSQ, instruction window, and branch predictor don't match the equation. At some point we revised our area estimates but updated only part of the table, not the area column itself.
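
The relations in item 4 can be checked mechanically. The short Python sketch below derives Cblock and Rnorm from an area and tests whether a listed R value is consistent with it. The function names, the tolerance, and the sample area are hypothetical; no Table 3 values are reproduced here, and units are left as in the paper.

```python
import math

C_PER_AREA = 100.0    # Cblock = 100 * area (item 4)
R_TIMES_AREA = 1e-6   # Rnorm = 1e-6 / area (item 4)


def derive_rc(area):
    """Return (Cblock, Rnorm) for a block of the given area."""
    if area <= 0:
        raise ValueError("area must be positive")
    return C_PER_AREA * area, R_TIMES_AREA / area


def r_matches_area(area, r_listed, rel_tol=0.05):
    """True if a listed R agrees with 1e-6/area within rel_tol (assumed tolerance)."""
    _, r_expected = derive_rc(area)
    return math.isclose(r_listed, r_expected, rel_tol=rel_tol)


if __name__ == "__main__":
    area = 2.0e-6  # hypothetical area, not taken from Table 3
    c_block, r_norm = derive_rc(area)
    c_fixed, r_fixed = derive_rc(10 * area)  # area corrected by a factor of 10 (item 3)
    print(f"original area:  C = {c_block:.3g}, R = {r_norm:.3g}")
    print(f"corrected area: C = {c_fixed:.3g}, R = {r_fixed:.3g}")
```

Because Cblock scales with area and Rnorm with its inverse, the factor-of-10 area correction in item 3 makes R ten times smaller and C ten times larger, while the product Rnorm * Cblock stays at 1e-4 for every block.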

Last updated 9 Aug. 2002