Re: Attainable memory bandwidth

From: Peter Klausler (pmk@craycos.com)
Date: Fri Oct 04 1991 - 14:22:14 CDT


> Thanks for the clarification. I was deliberately ignoring the I/O port
> since it is not used in the floating-point kernels that I was discussing
> and since I don't understand the details well enough to talk about it!

Has to be there to sell the machine, though. I brought up the point because
your analysis was not showing the full memory bandwidth potential of CRI's
machine.

A further point to consider is that the "memory bandwidth" of a Cray-class
machine must really be specified as three distinct values:
        * Maximum processor bandwidth (CPUS * PORTS/CPU * BANDWIDTH/PORT)
        * Maximum bank bandwidth (BANKS * BANDWIDTH/BANK)
        * Maximum arbitration mechanism bandwidth (SECTIONS * BANDWIDTH/SECTION)

In practice, the minimum of these three values constrains the performance
you'll see on a real code. As a rule of thumb, total system bandwidth in
references/second should not be less than the peak processing rate in
floating-point results/second; falling short of that was the embarrassing
imbalance of the early CRAY-2 machines.
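
To make the "minimum of three" point concrete, here is a rough sketch in
Python. The function and parameter names are mine, chosen for illustration,
and the numbers in the usage note are made up rather than taken from any
real machine.

```python
def attainable_bandwidth(cpus, ports_per_cpu, bw_per_port,
                         banks, bw_per_bank,
                         sections, bw_per_section):
    """Return the three bandwidth ceilings and the name of the binding one.

    All bandwidths are in references/second. The names are illustrative
    only; they do not come from any vendor documentation.
    """
    limits = {
        "processor": cpus * ports_per_cpu * bw_per_port,
        "bank": banks * bw_per_bank,
        "arbitration": sections * bw_per_section,
    }
    # The smallest ceiling is the one a real code will run into.
    bottleneck = min(limits, key=limits.get)
    return limits, bottleneck


def is_balanced(system_refs_per_sec, results_per_sec):
    """Rule of thumb: memory references/sec should not be less than
    floating-point results/sec."""
    return system_refs_per_sec >= results_per_sec
```

With hypothetical inputs such as attainable_bandwidth(4, 3, 100, 64, 10, 4, 200),
the bank ceiling (640) comes out lowest, so adding ports or sections would
not help until the banks were made faster or more numerous.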

This view of memory system bandwidth shows why codes can get into trouble
with memory strides divisible by powers of two as small as 4 and 8. Such
strides cut down the number of sections/quadrants/octants being used, limiting
the arbitration mechanism bandwidth.
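
A small sketch of the stride effect, again in Python. It assumes the
simplest possible mapping, where the section is just the word address
modulo the number of sections (low-order interleaving). Real machines may
map addresses differently, so treat this as an illustration of the
mechanism, not a model of any particular system.

```python
from math import gcd


def sections_touched(stride, sections):
    """Number of distinct sections hit by a constant-stride access stream.

    Assumption (not from the original post): section = address % sections.
    Under that mapping a stride visits sections / gcd(stride, sections)
    distinct sections.
    """
    return sections // gcd(stride, sections)


def arbitration_limit(stride, sections, bw_per_section):
    """Arbitration bandwidth ceiling for a given stride, references/second."""
    return sections_touched(stride, sections) * bw_per_section
```

Under that assumed mapping, a 4-section machine sees all four sections at
stride 1 or 3, two at stride 2, and only one at stride 4 or 8, so the
arbitration ceiling drops to a quarter of its nominal value.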


