New Performance-Driven FPGA Routing Algorithms

6. Experimental Results

We have implemented the IGSMT, PFA and IDOM algorithms using C++ in the SUN Unix environment. Our code is available upon request, and all of our benchmarks and routing solutions are available on the World Wide Web at URL http://www.cs.virginia.edu/~robins/. We have also implemented KMB and ZEL, and used each of these as H inside the inner loop of IGSMT, yielding the IKMB and IZEL constructions, respectively. For comparison, we have implemented DOM, as well as the following adaptation of Dijkstra's shortest-paths tree algorithm [6] to the GSA problem:

DJKA -- compute Dijkstra's shortest-paths tree rooted at the source, and then delete edges that are not contained in any source-sink path.

We compared all of these methods (KMB, ZEL, IKMB, IZEL, DJKA, DOM, PFA, IDOM) on the same inputs, both in terms of total wirelength as well as maximum source-sink pathlength. The inputs consisted of uniformly distributed random nets in 20 by 20 weighted grid graphs, where the edge weights modeled congestion induced by previously-routed nets. Congestion was created as follows: starting with a grid graph having unit weights (w=1.00) on all edges, k uniformly-distributed nets (2 - 5 pins each) were routed using KMB. As each net was routed, the weights of the corresponding graph edges were incremented, resulting in a higher average routing-graph edge weight w_avg > 1.00. Three different levels of congestion were thus modeled: (a) none (k=0, w_avg = 1.00), (b) low (k=10, w_avg = 1.28), and (c) medium (k=20, w_avg = 1.55).

For each of these three congestion levels and net size (5 and 8 pins), 50 uniformly-distributed nets were routed on a congested graph (newly-generated for each net), using all eight algorithms. For each net, we normalized the wirelength produced by each heuristic with respect to the wirelength used by KMB; similarly, the maximum source-sink pathlength of each heuristic was normalized to optimal. Table 1 gives the average percent improvement for each congestion level, where a positive value represents an increase (i.e., disimprovement) in the total wirelength (resp. maximum pathlength) with respect to KMB (resp. optimal), while a negative number represents a decrease (i.e., improvement).

Among the four Steiner heuristics (KMB, ZEL, IKMB, IZEL), IZEL has superior performance. The ranking IZEL < IKMB < ZEL < KMB is highly consistent across all net sizes in terms of both wirelength and maximum pathlength, indicating that our iterated constructions outperform the stand-alone, non-iterated versions. Among the four arborescence constructions (DJKA, DOM, PFA, IDOM), PFA and IDOM consistently use the least wirelength (these all yield optimal maximum pathlength). Here too, the ranking is quite consistent in terms of wirelength across all net sizes, namely IDOM < PFA < DOM < DJKA.

On uncongested graphs, both PFA and IDOM outperform KMB in term of wirelength by up to 5.6%. This is interesting since KMB minimizes wirelength only, yet it uses more wirelength than either PFA and IDOM, which only optimize wirelength as a secondary criterion. For uncongested graphs, both PFA and IDOM yield optimal maximum pathlength at almost no wirelength penalty over IZEL; thus, these seem to afford favorable tradeoffs between wirelength and maximum pathlength. Note that IKMB and Iterated 1-Steiner [10] yield identical solutions for geometric instances (when using the Hanan grid as the underlying graph).

We built an actual FPGA router based on these algorithms, and completely routed 14 pre-placed industrial benchmark circuits, containing up to 608 nets each. Our constructions easily adapt to a variety of architectures; in particular, we modeled two distinct FPGA architectures, the first corresponding to Xilinx 3000-series parts [19] (Table 2) and the second corresponding to 4000-series parts [19] (Table 3) - these architectures are identical to those used by the CGE router [3], and the SEGA [13] and GPB [18] routers, respectively. The 3000-series FPGAs used to route the circuits in Table 2 have switch-block flexibility of 6 and 60% channel-edge connectivity, while the 4000-series FPGAs in Table 3 have switch-block flexibility of 3 and 100% channel-edge connectivity. We did not alter the fixed benchmark placements. CPU times to completely route the circuits on a Sun SparcServer 10/514 workstation varied from several minutes for the smallest circuit to several hours for the largest.

We route the nets one at a time, updating the routing-graph edge weights as we proceed to reflect congestion. We employ a net-ordering scheme with a move-to-front heuristic: when infeasibility is encountered in routing a particular net, that net will be routed earlier in subsequent phases, thereby increasing the probability of a successful routing of all nets. Only a few such passes are required to completely route each benchmark.

For each of the circuits, we compared the maximum channel width required by our router using the IKMB algorithm to the best reported results from CGE [3] using the 3000-series architecture (Table 2), as well as to the best reported values for SEGA [13] and GPB [18] using the 4000-series architecture (Table 3). For both types of architectures we are able to route all of the benchmark circuits using significantly smaller channel width than CGE, SEGA and GPB (with these other routers requiring an average of 22%, 26%, and 17% more channel width, respectively, than our router).

To illustrate how minimizing maximum pathlength affects wirelength (and thus channel width), Table 4 shows the maximum channel width required for a successful routing using the IKMB, PFA and IDOM algorithms for each of the circuits. As expected, both PFA and IDOM require larger channel width than IKMB. However, neither PFA nor IDOM require larger channel width than SEGA or GPB (which do not directly minimize maximum source-sink pathlength). Thus, PFA and IDOM simultaneously minimize wirelength and maximum pathlength quite effectively.

Table 5 shows the average increase in wirelength vs. the decrease in maximum pathlength for IKMB, PFA and IDOM on the benchmark circuits. Here the algorithms operate on FPGAs with the same channel width (i.e., the smallest channel width that results in a successful routing for all algorithms). The increase in wirelength for PFA and IDOM (18.2% and 12.8%, respectively) corresponds to the increase in channel width observed in Table 4. Both PFA and IDOM effectively reduce the maximum pathlength (by 9.5% and 10.2% on average, respectively). Figure 7 illustrates our router's solution for the smallest 4000-series benchmark circuit.

Footnote 1: Incomplete routes or global routing only are not useful in practice, since there can be an arbitrarily large increase in channel width in obtaining complete detailed routes from these [17].