# Physically-Based Compact Thermal Modeling — Achieving Parametrization and Boundary Condition Independence

Wei Huang<sup>†</sup>, Mircea R. Stan<sup>†</sup>, and Kevin Skadron<sup>‡</sup> †Dept. of Electrical and Computer Engineering, ‡Dept. of Computer Science University of Virginia Charlottesville, VA 22904, USA

# Abstract

This paper presents HotSpot, a compact thermal modeling approach that is parameterized according to the geometrical dimensions of the design and the physical properties of its materials. While most existing compact thermal modeling methods facilitate thermal analysis of existing package designs, the HotSpot modeling method is better suited to the exploration of new designs at both the die level and the package level because of its physically-based parametrization. Although it may not be as "compact" as other modeling approaches, HotSpot provides much more thermal information about the design, especially at the die level, with negligible computational overhead. We also show that HotSpot achieves reasonable boundary condition independence (BCI) by comparing it with a DELPHI compact thermal model for a benchmark BGA chip under the same set of boundary conditions.

## 1. Introduction

With the continued scaling of VLSI systems, ever-increasing power density and the resulting difficulty of managing temperature have become major challenges for system designers at all design levels. It is well known that operating temperature affects the performance, power consumption and reliability of a microelectronic system. It is almost impossible to model temperature and analyze the thermal effects of a system together with its environment in full detail. Numerical analysis methods such as FEM are time-consuming and costly, and hence are not a practical way to model temperature either. The solution is to build compact thermal models (CTMs) that give reasonably accurate temperature predictions at different levels, for example the transistor level, die level, package level and board level [1].

A top-down hierarchy of compact thermal models would be helpful for designers at different design levels [1]. There are several requirements for a CTM to be useful at a particular design level. The first requirement is to provide enough thermal information at that level. For example, for package-level CTMs, previous studies [2] [3] have shown that a single junction-to-case thermal resistance is not adequate for package design, because the information about the temperature distribution across the package is lost. This leads to a package design that is not thermally optimized. Instead, multiple nodes are needed on the package surfaces. Similarly, a CTM at the die level should consist of more nodes than a single junction node in order to give temperature distribution information across the die. The second requirement for a CTM at a particular design level is to model just at the granularity that is needed and hide the details of the lower levels, so that the CTM itself is adequate for thermal analysis at that level. For example, package-level CTMs, such as the DELPHI models, hide the lower-level details of the package, including the die, the thermal attach, the solder balls/lead frames, etc., because these details are intellectual property that vendors need to protect. Similarly, a CTM at the die level should also hide the lower-level details of the die, such as circuit structures, if they need to be protected. On the other hand, knowing these details may help to make the CTM physically based and fully parameterizable, as will be seen in Section 3. The third requirement for a CTM is to be reasonably boundary condition independent (BCI). When BCI is achieved, variations in the environment do not affect the compact thermal model. The DELPHI package-level compact thermal models achieve BCI by finding a thermal resistance network with minimum overall error when applied to different boundary conditions. Again, we will show in later sections that if the structural details are known and the CTM is physically based, the CTM is intrinsically BCI with reasonable accuracy.

In this paper, we present *HotSpot*, a compact thermal modeling approach that is physically based on the design geometry and material properties. Compared to our previous version of the HotSpot modeling approach, presented at THERMINIC'02 [4], the new HotSpot presented in this paper has the following improvements. First, the difference in heat conducting area among different layers is taken into account. Second, the fitting factors for lateral thermal resistance calculations and isothermal surface estimation are replaced by physics-based formulas and a more detailed layer division. These improvements, which are shown in Section 2, make the HotSpot models resemble real designs more closely and make them fully physically based and parameterized. The compact thermal model examples of HotSpot shown in this paper are mainly at the die and package levels, but the modeling method can also be easily extended to other design levels.

Figure 1. The stacked layers of different materials in the HotSpot modeling approach. The heat-generating surface and major heat transfer paths are also shown.

The paper is organized as follows. Section 2 shows the modeling details of HotSpot at the die level and package level. Section 3 discusses the issues of compact thermal model parametrization and BCI by comparing both the modeling method differences and the results of HotSpot and DELPHI models. Section 4 discusses the limitations and advantages of HotSpot. Finally, Section 5 concludes the paper and points out possible future work.

## 2. HotSpot Modeling Details

When constructing a compact thermal model using the HotSpot approach, one needs to first identify different layers of the design that are made of different materials. This requires that the designer have some prior detailed knowledge of the designed structure. These layers are then stacked on top of each other as shown in Fig. 1. The layers can, for instance, represent heat sink, heat spreader, silicon substrate, on-chip interconnect layer, C4 pads, ceramic packaging substrate, solder balls, etc. Usually, the surface that generates heat is the surface of the silicon substrate layer.

Figure 2. (a) Area division of larger layers (top view). (b) Side view of one block with its lateral and vertical thermal resistances. (c) A layer, for example the silicon die, can be divided into an arbitrary number of blocks if detailed thermal information is needed (top view).

Each layer is then divided into a number of blocks. For example, in Fig. 2(c), the silicon substrate layer can be divided according to architecture-level blocks (only three blocks are shown for simplicity) or at a finer granularity, depending on what the die-level design requires. For other layers that require less detailed thermal information, one can simply divide the layer as illustrated in Fig. 2(a). The shaded center part of the layer in Fig. 2(a) is the area covered by an adjacent layer such as the one shown in Fig. 2(c). This center part can have the same number of nodes as its smaller neighbor layer, or those nodes can be collapsed into a single node, depending on the accuracy and computation overhead requirements. The remaining peripheral part in Fig. 2(a) is then divided into four trapezoidal blocks, each of which is assigned one node. Each block in each layer has a vertical thermal resistance and several lateral resistances, which model vertical heat transfer to its neighbor layers and lateral heat spreading/constriction within the layer itself, respectively. Fig. 2(b) shows a side view of one block with both the lateral and the vertical resistances. The vertical resistance can be calculated as $R_{vertical} = t/(k \cdot A)$, where $t$ is the thickness of the layer, $k$ is the thermal conductivity of the layer material, and $A$ is the cross-sectional area of the block. Calculating the lateral thermal resistance is not as straightforward, because of the complex nature of heat spreading and constriction. One can consider the lateral thermal resistance of a block to be the spreading/constriction thermal resistance from the other parts of the layer to that specific block. Details of the lateral thermal resistance derivation and formulas can be found in [5] and [6].
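As a minimal sketch of the vertical resistance formula $R_{vertical} = t/(k \cdot A)$, the following Python fragment evaluates it for one block and checks that dividing a layer into blocks leaves the overall vertical resistance of the layer unchanged. The function name and all numerical values are illustrative assumptions and are not taken from the HotSpot implementation or from the benchmark designs.

```python
def vertical_resistance(thickness_m, conductivity_w_per_mk, area_m2):
    """Vertical thermal resistance R = t / (k * A), in K/W."""
    if thickness_m < 0 or conductivity_w_per_mk <= 0 or area_m2 <= 0:
        raise ValueError("thickness must be >= 0; conductivity and area must be > 0")
    return thickness_m / (conductivity_w_per_mk * area_m2)


# Assumed example values: a 0.5 mm thick block, 4 mm x 4 mm, k = 100 W/(m K).
THICKNESS = 0.5e-3
CONDUCTIVITY = 100.0
r_block = vertical_resistance(THICKNESS, CONDUCTIVITY, 4e-3 * 4e-3)
print(f"single block: R_vertical = {r_block:.4f} K/W")

# Dividing the same layer into three blocks (assumed areas, in m^2).
# The blocks conduct in parallel, so their combined vertical resistance
# equals that of the undivided layer.
block_areas = [8e-6, 5e-6, 3e-6]
r_parallel = 1.0 / sum(
    1.0 / vertical_resistance(THICKNESS, CONDUCTIVITY, a) for a in block_areas
)
r_layer = vertical_resistance(THICKNESS, CONDUCTIVITY, sum(block_areas))
print(f"three blocks in parallel: {r_parallel:.4f} K/W, whole layer: {r_layer:.4f} K/W")
```

The lateral resistances are deliberately left out of this sketch, since their formulas are given in [5] and [6] rather than here.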

For layers that have surfaces interfacing with the ambient, i.e. the boundaries, we assume that each surface has a constant heat transfer coefficient $h$. The corresponding thermal resistance can then be calculated as $R_{convection} = 1/(h \cdot A)$, where $A$ is the surface area. Strictly speaking, these convection thermal resistances are not part of the compact thermal model, because they include information about the environment. If the environment changes, i.e. the boundary condition changes, the values of these convection resistances also change. On the other hand, for a particular design, the values of all the other thermal resistances shown in Fig. 2(a)–(c) should not change if the compact thermal model is BCI.

Figure 3. Steady-state validation of the HotSpot compact thermal model: (a) testing chip measurements; (b) results from the HotSpot model, with errors less than 5%.
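The separation between the internal resistances and the convection resistances can be illustrated with a deliberately simplified single-path example. The sketch below keeps one assumed internal resistance fixed and changes only the heat transfer coefficient $h$, so that only $R_{convection} = 1/(h \cdot A)$ varies with the boundary condition. All names and numbers are hypothetical and chosen for illustration only.

```python
def convection_resistance(h_w_per_m2k, area_m2):
    """Boundary (convection) resistance R = 1 / (h * A), in K/W."""
    if h_w_per_m2k <= 0 or area_m2 <= 0:
        raise ValueError("h and area must be positive")
    return 1.0 / (h_w_per_m2k * area_m2)


def temperature_rise(power_w, r_internal, h_w_per_m2k, area_m2):
    """Temperature rise above ambient for a single one-dimensional heat path."""
    return power_w * (r_internal + convection_resistance(h_w_per_m2k, area_m2))


R_INTERNAL = 0.5      # K/W, assumed; set by geometry and materials only
SURFACE_AREA = 0.01   # m^2, assumed exposed surface area
POWER = 10.0          # W, assumed dissipated power

# Only the boundary condition (h) changes between runs; R_INTERNAL does not.
for h in (10.0, 100.0, 1000.0):
    r_conv = convection_resistance(h, SURFACE_AREA)
    rise = temperature_rise(POWER, R_INTERNAL, h, SURFACE_AREA)
    print(f"h = {h:7.1f} W/(m^2 K): R_convection = {r_conv:6.2f} K/W, rise = {rise:6.1f} K")
```

A full model would contain many internal resistances rather than one, but the same principle applies: a change of environment is expressed only through the boundary resistances.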

We have validated the HotSpot modeling method against a commercial thermal testing chip [7]. The testing chip has a 9×9 grid of heat dissipators, which can be turned on or off individually, with an embedded thermal sensor in each grid cell. The testing chip can measure both steady-state and transient temperatures for each of the grid cells. We built the same 9×9 grid-like chip structure in our thermal model. In the validation, we neglected the secondary heat flow path from the die to the PCB, because the testing chip is wire bonded and plugged into a plastic socket that has very low thermal conductivity. We then turned on sets of power dissipators in the testing chip and assigned the same power values at the same locations in our thermal model. Fig. 3 shows one example of the validation experiments we have performed: steady-state thermal plots from the testing chip measurements and from our thermal model. The percentage error values are calculated as $(T_{model} - T_{chip})/(T_{chip} - T_{ambient})$. The power density in this experiment is $50\,\mathrm{W/cm^2}$ in the heat dissipating area (the 3×3 lower-right corner). As can be seen, the HotSpot thermal model is reasonably accurate, with worst-case errors for steady-state temperatures of less than 5%.
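The error metric used in the validation normalizes the model error by the measured temperature rise above ambient rather than by the absolute temperature. A minimal sketch of this calculation is given below; the function name and the temperature readings are hypothetical and are not measurements from the testing chip.

```python
def percentage_error(t_model, t_chip, t_ambient):
    """Model error as a percentage of the measured rise above ambient."""
    measured_rise = t_chip - t_ambient
    if measured_rise == 0:
        raise ValueError("measured temperature equals ambient; error is undefined")
    return 100.0 * (t_model - t_chip) / measured_rise


# Hypothetical readings in degrees Celsius for one grid cell.
err = percentage_error(t_model=71.0, t_chip=70.0, t_ambient=30.0)
print(f"percentage error = {err:.2f}%")
```

Normalizing by the rise above ambient makes the metric independent of the temperature scale offset, so a 1 degree error counts the same whether the ambient is 25 or 45 degrees Celsius.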

The HotSpot modeling approach has been successfully used to build compact thermal models in research areas such as dynamic thermal management (DTM) techniques for microprocessors [8] and die-level thermal-aware computer-aided design [9]. Examples of HotSpot compact thermal models can be found in [8] and [9].

From the above, we can see that the HotSpot modeling approach differs from existing compact thermal modeling methods, such as DELPHI [2] [10]. The differences are as follows. First, DELPHI models are at the package level, and hence have only one node for the die. This is adequate for package vendors and board-level designers. For die-level designers, on the other hand, HotSpot models have more nodes for the die structures. Second, DELPHI models hide the packaging details because of the requirements of package vendors, while HotSpot models contain detailed package-level and die-level information. Third, the thermal resistances in the DELPHI models are extracted from detailed simulations, while the thermal resistances of HotSpot models are calculated from the dimensions and physical properties of the materials. These differences result in different applications for DELPHI and HotSpot models. DELPHI models are ideal for thermal analysis of existing package designs without revealing details of the package, while HotSpot models are more suitable for exploring new die-level and package-level designs. Also, the HotSpot models are intrinsically parameterizable and BCI, as will be discussed in Section 3.

## 3. Parametrization and BCI in HotSpot

#### 3.1. Parametrization of Compact Thermal Models

Parametrization of compact thermal models is desirable and has drawn attention from researchers. In [11], the author points out that achieving a sensible parametrization of compact thermal models is next to impossible given the very simple structure chosen for the DELPHI models. This is true for a modeling approach such as DELPHI, because a model structure consisting of only a few thermal resistances cannot take into account the actual, very complex package structure, together with the variations of thermal conductivities and the heat spreading/constriction effects within the die and the package. On the other hand, the HotSpot modeling approach can be better parameterized due to its physically-based nature. The cost of parametrization is that HotSpot models are usually more complex than the DELPHI models.

In HotSpot models, the variations of thermal conductivities over temperature still cannot be fully parameterized. But one can work around this problem by performing several rounds of thermal analysis and updating thermal conductivities based on temperature readings from the compact thermal model. Eventually, the temperature and thermal conductivity will converge to fixed values.
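The iterative workaround described above can be sketched as a simple fixed-point loop. The example assumes a single silicon block in series with a fixed remaining resistance, and it assumes a power-law fit $k(T) = 150 \cdot (300/T)^{4/3}$ W/(m·K) for silicon. Both the fit and every numerical value are illustrative assumptions and are not part of the HotSpot formulation in this paper.

```python
def silicon_conductivity(temp_k, k_ref=150.0, t_ref=300.0, exponent=4.0 / 3.0):
    """Assumed power-law fit for the thermal conductivity of silicon, W/(m K)."""
    return k_ref * (t_ref / temp_k) ** exponent


def solve_die_temperature(power_w, thickness_m, area_m2, r_rest, t_ambient_k,
                          tol=1e-6, max_rounds=50):
    """Repeat the thermal analysis, updating k from the latest temperature.

    Returns (temperature in K, conductivity used in the last round, rounds).
    """
    temp = t_ambient_k  # initial guess: die at ambient temperature
    for rounds in range(1, max_rounds + 1):
        k = silicon_conductivity(temp)            # k from the previous estimate
        r_die = thickness_m / (k * area_m2)       # R_vertical = t / (k * A)
        new_temp = t_ambient_k + power_w * (r_die + r_rest)
        if abs(new_temp - temp) < tol:
            return new_temp, k, rounds
        temp = new_temp
    raise RuntimeError("temperature did not converge")


# Assumed inputs: 50 W through a 0.5 mm thick, 1 cm^2 die, with 0.5 K/W for
# the rest of the heat path and an ambient temperature of 318 K.
temp, k_final, rounds = solve_die_temperature(50.0, 0.5e-3, 1e-4, 0.5, 318.0)
print(f"converged to {temp:.2f} K with k = {k_final:.1f} W/(m K) after {rounds} rounds")
```

In a multi-node model the same loop would update the conductivity of each block from its own node temperature before the network is solved again.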

#### 3.2. Boundary Condition Independence (BCI)

Achieving BCI is essential to compact thermal models. If the model changed whenever the boundary conditions changed, it would be useless. Traditionally, researchers in the package compact thermal modeling community have adopted the DELPHI approach to achieve BCI, that is, finding a thermal resistance network with minimum overall error when applied to different boundary conditions. The resistance values are extracted from detailed thermal simulations of the same package structure. Such simulations can be performed with numerical analysis tools.

Figure 4. Thermal resistance networks for (a) the DELPHI model and (b) the HotSpot model of a DELPHI BGA benchmark chip, extracted from [10].

When using the HotSpot modeling approach, because there is no data extraction procedure and all the resistance values are calculated from the physical dimensions and properties of the materials, the model itself is intrinsically BCI. In order to validate that the HotSpot models can achieve reasonable BCI, we compare a HotSpot model with a DELPHI model for the BGA benchmark chip in [10]. The dimensions of the BGA chip and the set of boundary conditions are both taken from the specifications in [10]. The model structures of both DELPHI and HotSpot are shown in Fig. 4. In this comparison, quarter symmetry can be applied because there is only one node for the die. Therefore, only a quarter of the package is sketched for the HotSpot and the DELPHI models in Fig. 4.

The temperature readings from both models are listed in Table 1. A heat dissipation of 2.5 W at the die surface is used as the input to both models. As can be seen from Table 1, the HotSpot model achieves reasonable BCI. For the five listed boundary conditions, it yields almost the same temperature readings as the DELPHI model. The worst-case percentage error is 5.8%. One possible reason for the error is that the surface division ratio is fixed according to the area of the smaller neighbor layer, which in this case is the die. This division ratio might not be exactly the optimal ratio, but it is near optimal. In a previous work [12], the author argues that the surface division ratio should be determined by the heat flux distribution on a particular surface. He also shows that the heat flux distribution function $f(\cdot)$ of the top surface develops a peak just above the die area. Therefore, using the die area to divide the top surface is feasible.

| # | Boundary condition | HotSpot (°C) | DELPHI (°C) | Error  |
|---|--------------------|--------------|-------------|--------|
| 1 | DCP-1              | 16.79        | 16.68       | 0.66%  |
| 2 | DCP-2              | 19.94        | 20.00       | -0.30% |
| 3 | DCP-3              | 66.42        | 62.78       | 5.80%  |
| 4 | DCP-4              | 2960.00      | 3070.00     | -3.58% |
| 5 | infinite           | 10.20        | 10.56       | -3.41% |

Table 1. Comparison of the HotSpot model and the DELPHI model for the DELPHI BGA benchmark chip under the same set of boundary conditions. Temperatures are in degrees Celsius, relative to the ambient temperature.
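The error column of Table 1 can be reproduced from the two temperature columns. The short script below assumes that each error is the HotSpot reading minus the DELPHI reading, divided by the DELPHI reading. This normalization is inferred from the tabulated values rather than stated in the text, and the variable and function names are illustrative.

```python
# (boundary condition, HotSpot reading, DELPHI reading), copied from Table 1.
TABLE_1 = [
    ("DCP-1", 16.79, 16.68),
    ("DCP-2", 19.94, 20.00),
    ("DCP-3", 66.42, 62.78),
    ("DCP-4", 2960.00, 3070.00),
    ("infinite", 10.20, 10.56),
]


def relative_error_percent(hotspot, delphi):
    """Error of the HotSpot reading relative to the DELPHI reading, in percent."""
    return 100.0 * (hotspot - delphi) / delphi


errors = {name: relative_error_percent(h, d) for name, h, d in TABLE_1}
worst_case = max(errors, key=lambda name: abs(errors[name]))

for name, value in errors.items():
    print(f"{name:>8}: {value:+.2f}%")
print(f"worst case: {worst_case} ({errors[worst_case]:+.2f}%)")
```

Under this assumption the recomputed values agree with the tabulated errors to two decimal places, and the worst case is the DCP-3 boundary condition.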

## 4. Limitations and Advantages

So far, the modeling details and major characteristics of the HotSpot compact thermal modeling approach have been presented. The approach has its limitations. First, it is not as "compact" as other existing compact thermal models such as the DELPHI models. However, the number of nodes is still manageable, and the computational overhead is negligible compared to detailed numerical models. Second, when it comes to analyzing or releasing a fixed compact thermal model for an existing design or a final product, HotSpot models are not as user-friendly as the DELPHI models, because of the complexity of the HotSpot model and the fact that it reveals design details. Third, at the same level of complexity, a HotSpot model is not as accurate as a DELPHI model. The DELPHI model is extracted from detailed simulations, which are still the most accurate way to model thermal effects, whereas the HotSpot model is essentially a simplified version of the detailed model and therefore cannot achieve the same level of accuracy as the detailed simulations, and hence the DELPHI models. Fourth, the HotSpot model is not as BCI as the DELPHI model, because the surface area division method used by HotSpot is not exactly optimal, although it has been shown to be feasible in Section 3 and [12].

However, the HotSpot modeling approach also has significant advantages, which are mainly due to the fact that it is parameterizable and BCI. Parametrization is useful because a variety of design explorations can be carried out by changing only the dimension and material parameters, without reconstructing the whole compact thermal model through detailed simulations. For example, using HotSpot models, one can easily find the optimum die thickness by sweeping the die thickness parameter while keeping all the other parameters constant, given that the package and the maximum power density are known. Another example would be investigating the effect of different types of heat spreaders or heat sinks. One can easily add or change the heat spreader or heat sink layers by following the HotSpot modeling method in Section 2. It is also important to note that HotSpot can be used to study hypothetical systems for which physical implementations and thermal measurements cannot yet be obtained. From these examples, we can see that the HotSpot modeling approach is very suitable for die-level and package-level design exploration.
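The kind of parameter sweep mentioned above can be sketched with a parameterized layer stack. The example below is a strongly simplified one-dimensional stand-in: it uses only the vertical resistance $t/(k \cdot A)$ of each layer and a single convection resistance, and omits the lateral spreading resistances of [5] and [6]. For that reason it shows only how a sweep re-evaluates the model and cannot locate an optimum die thickness. The layer names, dimensions and material values are all assumptions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Layer:
    name: str
    thickness_m: float
    conductivity: float  # W/(m K)
    area_m2: float

    def vertical_resistance(self):
        """R_vertical = t / (k * A), in K/W."""
        return self.thickness_m / (self.conductivity * self.area_m2)


def stack_rise(layers, power_w, h, ambient_area_m2):
    """Temperature rise above ambient for layers in series plus convection."""
    r_total = sum(layer.vertical_resistance() for layer in layers)
    r_total += 1.0 / (h * ambient_area_m2)
    return power_w * r_total


def sweep_die_thickness(stack, thicknesses_m, power_w, h, ambient_area_m2):
    """Re-evaluate the stack for each die thickness; other parameters stay fixed."""
    results = []
    for t in thicknesses_m:
        swept = [replace(layer, thickness_m=t) if layer.name == "die" else layer
                 for layer in stack]
        results.append((t, stack_rise(swept, power_w, h, ambient_area_m2)))
    return results


# Assumed stack; none of these values describe a real package.
BASE_STACK = [
    Layer("die", 0.5e-3, 100.0, 1e-4),
    Layer("interface", 0.05e-3, 4.0, 1e-4),
    Layer("spreader", 1e-3, 400.0, 9e-4),
]
THICKNESSES = [0.1e-3, 0.2e-3, 0.3e-3, 0.4e-3, 0.5e-3]

sweep = sweep_die_thickness(BASE_STACK, THICKNESSES, power_w=10.0, h=500.0,
                            ambient_area_m2=9e-4)
for thickness, rise in sweep:
    print(f"die thickness {thickness * 1e3:.1f} mm -> rise {rise:.2f} K")
```

Changing or adding a heat spreader layer amounts to editing `BASE_STACK`; no resistance has to be re-extracted from a detailed simulation.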

## 5. Conclusions and Future Work

In this paper, we have presented HotSpot, a physically-based compact thermal modeling approach that is also parameterized and BCI. The HotSpot modeling approach is more suitable for exploring new designs, while existing modeling approaches, such as DELPHI, are more suitable for accurately analyzing existing designs. In addition, we believe that achieving the parametrization of compact thermal models is a significant contribution of HotSpot to the research area of microelectronic thermal design and analysis. Future work consists of developing a dynamic version of the HotSpot modeling approach. Modeling the effects of active cooling on different types of interfacing surfaces is also an interesting topic.

## References

- [1] M-N. Sabry. Compact thermal models for electronic systems. *Components and Packaging Technologies, IEEE Transactions on*, 26(1):179–185, March 2003.

- [2] H. Rosten and C. Lasance. Delphi: The development of libraries of physical models of electronic components for an integrated design environment. In *Proc. Conf. Int. Elec. Pack. Soc.*, 1994.
- [3] A. Bar-Cohen, T. Elperin, and R. Eliasi. θ<sub>jc</sub> characterization of chip packages – justification, limitations and future. *Components, Hybrids, Manufacturing Technology, IEEE Transactions on*, 12:724–731, December 1989.
- [4] K. Skadron, M. R. Stan, *et al.* Hotspot: Techniques for modeling thermal effects at the processor-architecture level. In *Proc. 8th THERMINIC*, pages 169–172, October 2002.
- [5] S. Lee, S. Song, V. Au, and K. Moran. Constricting/spreading resistance model for electronics packaging. In *Proc. AJTEC*, pages 199–206, March 1995.
- [6] K. Skadron, M. R. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, and D. Tarjan. Temperature-aware microarchitecture: Extended discussion and results. Technical Report CS-2003-08, University of Virginia, Computer Science Department, 2003.
- [7] V. Székely, C. Márta, M. Renze, G. Végh, Z. Benedek, and S. Török. A thermal benchmark chip: Design and applications. *Components, Packaging, and Manufacturing Technology–Part A, IEEE Transactions on*, 21(3):399–405, September 1998.
- [8] K. Skadron, M. R. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, and D. Tarjan. Temperature-aware microarchitecture. In *Proc. ISCA-30*, pages 2–13, June 2003.
- [9] W. Huang, M. R. Stan, K. Skadron, K. Sankaranarayanan, S. Ghosh, and S. Velusamy. Compact thermal modeling for temperature-aware design. In *Proc. 41st DAC*, pages –, June 2004.
- [10] C. J. M. Lasance. Two benchmarks to facilitate the study of compact thermal modeling phenomena. *Components and Packaging Technologies, IEEE Transactions on*, 24(4):559–565, December 2001.
- [11] C. J. M. Lasance. Recent progress in compact thermal models. In *Proc. 19th IEEE SEMI-THERM Symp.*, pages 290–299, March 2003.
- [12] E. G. T. Bosch. Thermal compact models: An alternative approach. *Components and Packaging Technologies, IEEE Transactions on*, 26(1):173–178, March 2003.