**Previous message:**Daan Sandee: "Re: Fortran 90 in the US DoD"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ]

Your program was using cm_timer_cm_read_busy, presumably inserted by Alex. Unfortunately, this function and the other functions that return a CM time to the caller are BROKEN on CMSS 6.1, so I had to revert to the usual CM_timer_print, which prints to standard error.

Also, the optimizer optimized away all of your code, so I returned to my own program and did basically the same thing, using primitives that were *not* optimized away.

I can't give you a spread with RMS error on the times, but judging by consecutive runs, the number of digits I supply indicates the accuracy.

The following are for N=1,000,000 with repeat count NITER=125, making exactly 1 GB (decimal, that is) per stream (double precision).
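As a quick check of the 1 GB figure, the arithmetic can be sketched as follows. Python is used purely for illustration; the 8-byte size of a double-precision element is the only assumption beyond the numbers quoted above.

```python
N = 1_000_000          # array length, as stated
NITER = 125            # repeat count, as stated
BYTES_PER_ELEMENT = 8  # assumption: one double-precision value is 8 bytes

# Bytes moved for a single array (one "stream") over all repetitions.
bytes_per_stream = N * NITER * BYTES_PER_ELEMENT
print(bytes_per_stream)  # 1000000000, i.e. exactly 1 GB in decimal units
```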

Times in seconds.

| Statement | CM-2 8K 8MHz (256 PEs) | CM-2 4K 8MHz (128 PEs) | CM-2 4K 7MHz (128 PEs) | bytes/tick (approx) |
|---|---|---|---|---|
| *optimized* | | | | |
| a = a + 1.d0 | 0.523 | 1.045 | 1.201 | 2 |
| a = a + b | 0.713 | 1.425 | 1.633 | 2 |
| a = a + 2.d0*b | 0.717 | 1.434 | 1.642 | 2 |
| *unoptimized* | | | | |
| a = 2.d0 | 0.364 | 0.727 | 0.833 | 1.333 |
| a = b | 0.543 | 1.085 | 1.242 | 2 |
| a = 2.d0*a | 0.523 | 1.046 | 1.201 | 2 |
| a = b + c | 0.716 | 1.432 | 1.642 | 2 |
| a = c + 2.d0*b | 0.750 | 1.500 | 1.719 | 2 |

This looks like a speed of 2 bytes per PE per clock tick, which is less than the nominal rate of 1 slice per PE per tick.
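The bytes-per-tick figures can be roughly reproduced from the 8K 8MHz timings with a short sketch like the one below. The stream counts are an assumption, not something stated in the message: each array read or written by a statement is counted as one 1 GB stream, so `a = a + b` counts as three. The function name is likewise illustrative only.

```python
CLOCK_HZ = 8_000_000      # 8 MHz clock, from the column heading
NUM_PES = 256             # PEs in the 8K machine, from the column heading
BYTES_PER_STREAM = 10**9  # 1 GB (decimal) per stream, as stated

# (statement, assumed stream count, measured seconds on the 8K 8MHz machine)
RUNS = [
    ("a = a + 1.d0",   2, 0.523),
    ("a = a + b",      3, 0.713),
    ("a = a + 2.d0*b", 3, 0.717),
    ("a = 2.d0",       1, 0.364),
    ("a = b",          2, 0.543),
    ("a = 2.d0*a",     2, 0.523),
    ("a = b + c",      3, 0.716),
    ("a = c + 2.d0*b", 3, 0.750),
]

def bytes_per_pe_per_tick(streams, seconds):
    """Bytes moved per PE per clock tick for one timed statement."""
    ticks = seconds * CLOCK_HZ
    return streams * BYTES_PER_STREAM / (ticks * NUM_PES)

rates = [bytes_per_pe_per_tick(s, t) for _, s, t in RUNS]
for (stmt, _, _), rate in zip(RUNS, rates):
    print(f"{stmt:<16} {rate:.2f}")
```

Under that stream-count assumption, every statement comes out between about 1.3 and 2.1 bytes per PE per tick, in line with the approximate column in the table.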

Daan Sandee sandee@think.com

Thinking Machines Corporation

Cambridge, Mass 02142

*This archive was generated by hypermail 2b29 : Tue Apr 18 2000 - 05:23:02 CDT*