In looking through your latest posting of the memory bandwidth table,
I was struck by the numbers for the CM2. In particular, the numbers
you report for the SAXPY and SUM show speeds greater than 80 Gbytes/sec.
The numbers I have indicate that the limiting memory bandwidth on
a fully configured CM2 will be
4 bytes/cycle/processor * 2000 processors * 10 MHz = 80 Gbytes/sec
This number should be a "guaranteed not to exceed" performance indicator.
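
For anyone who wants to check the arithmetic, here is a minimal sketch of the bound. It assumes the 4 bytes are moved per clock cycle per processor, which is the only reading under which the units come out to bytes/sec; the function names are mine, chosen for illustration, and the 85 Gbytes/sec figure in the usage lines is a made-up value, not one taken from the table.

```python
def peak_bandwidth(bytes_per_cycle, processors, clock_hz):
    """Upper bound on aggregate memory bandwidth, in bytes/sec."""
    return bytes_per_cycle * processors * clock_hz


def exceeds_peak(reported_bytes_per_sec, peak_bytes_per_sec):
    """True if a reported rate is above the computed bound."""
    return reported_bytes_per_sec > peak_bytes_per_sec


# Figures from the message: 4 bytes, 2000 processors, 10 MHz.
# Per-cycle units are an assumption (see the note above).
peak = peak_bandwidth(4, 2000, 10_000_000)
print(peak / 1e9, "Gbytes/sec")

# Hypothetical reported rate, for illustration only.
print(exceeds_peak(85e9, peak))
```

Any reported rate for which exceeds_peak returns True would need one of the explanations asked about below.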
How, then, did TMC produce a larger number? Did they use arrays small
enough to fit inside the registers on the Weiteks? Did they "tweak" the
machine to improve the clock speed? Is there a cache on the Weiteks?
Do you have any other thoughts about how TMC exceeded their speed-of-light