Re: CM2 bandwidth

From: Daan Sandee (sandee@Think.COM)
Date: Tue Dec 24 1991 - 13:10:39 CST


Your program was using cm_timer_cm_read_busy, presumably inserted by Alex.
Unfortunately, this function and the other functions that return a CM time to
the caller are broken in CMSS 6.1, so I had to revert to the usual
CM_timer_print, which prints to standard error.
Also, the optimizer optimized away all of your code, so I went back to my own
program and did basically the same thing, using primitives that were *not*
optimized away.
I can't give you a spread with an RMS error on the times, but judging from
consecutive runs, the number of digits I quote reflects the accuracy.
The following are for N=1,000,000 with repeat count NITER=125, which makes
exactly 1 GB (decimal, that is) per stream in double precision.
Times are in seconds.

                     CM-2 8K 8MHz  CM-2 4K 8MHz  CM-2 4K 7MHz  bytes/tick
                      (256 PEs)     (128 PEs)     (128 PEs)     (approx)
 *optimized*
  a = a + 1.d0          0.523         1.045         1.201        2
  a = a + b             0.713         1.425         1.633        2
  a = a + 2.d0*b        0.717         1.434         1.642        2
 *unoptimized*
  a = 2.d0              0.364         0.727         0.833        1.333
  a = b                 0.543         1.085         1.242        2
  a = 2.d0*a            0.523         1.046         1.201        2
  a = b + c             0.716         1.432         1.642        2
  a = c + 2.d0*b        0.750         1.500         1.719        2

This looks like a speed of 2 bytes per PE per clock tick, which is less than
the nominal rate of 1 slice per PE per tick.
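
As a rough check of the bytes/tick column, the figures can be recomputed from
the 8K 8MHz timings. The sketch below is in Python and assumes that each array
read or written by a statement counts as one 1 GB stream (so a = a + b is
three streams and a = 2.d0 is one). That stream count is an assumption, not
something stated above, and the names in the code are illustrative only.

```python
# Recompute bytes per PE per clock tick from the timing table.
N = 1_000_000          # array length
NITER = 125            # repeat count
BYTES_PER_ELEMENT = 8  # double precision
BYTES_PER_STREAM = N * NITER * BYTES_PER_ELEMENT  # 1 GB (decimal) per stream


def bytes_per_pe_per_tick(streams, seconds, pes, clock_hz):
    """Memory traffic per PE per clock tick for one timed statement."""
    total_bytes = streams * BYTES_PER_STREAM
    ticks = seconds * clock_hz
    return total_bytes / (pes * ticks)


# (statement, assumed stream count, time in seconds on CM-2 8K 8MHz)
ROWS = [
    ("a = a + 1.d0",   2, 0.523),
    ("a = a + b",      3, 0.713),
    ("a = a + 2.d0*b", 3, 0.717),
    ("a = 2.d0",       1, 0.364),
    ("a = b",          2, 0.543),
    ("a = 2.d0*a",     2, 0.523),
    ("a = b + c",      3, 0.716),
    ("a = c + 2.d0*b", 3, 0.750),
]

if __name__ == "__main__":
    for stmt, streams, seconds in ROWS:
        rate = bytes_per_pe_per_tick(streams, seconds, pes=256, clock_hz=8e6)
        print(f"{stmt:16s} {rate:.3f} bytes/PE/tick")
```

Under that assumption the constant store a = 2.d0 comes out near 1.34 and the
other statements fall between about 1.8 and 2.1, in line with the approximate
column in the table.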

Daan Sandee sandee@think.com
Thinking Machines Corporation
Cambridge, Mass 02142


