Stream results

From: Steve Turner (turner@csrd.uiuc.edu)
Date: Sat Jan 23 1993 - 15:13:35 CST


Your posting on comp.sys.super piqued my curiosity, so I snarfed the
benchmark and ported it to our machines. I work for CSRD, and we have
a bunch of Alliant machines here, as well as our own home-brewed
agglomeration of four FX/80s called Cedar. You have probably heard of
us, so I'll just tell you what I did to get the results and then give
the results.

I made two changes to the source code. First, I replaced the calls to
"second" with calls to the High Resolution Clock timer facility
(hrcget and hrcdelta). This is a microsecond-resolution timer used for
performance evaluation, so I think the results should be accurate.
Second, I made slight changes to the result FORMAT statements, since
Alliant's fortran compiler assumes carriage control info is used.
I will send you a copy of the altered source, if you want, but since
the changes were so trivial it doesn't seem necessary.
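
For readers who have not run into carriage control: under the traditional
Fortran printing convention, the first character of each formatted output
record is consumed as a control character (blank = single spacing, "0" =
double spacing, "1" = new page) and is not printed. The Python sketch below
only simulates that convention to show why a FORMAT statement may need a
leading blank. It is not Alliant's implementation, the function name is made
up for illustration, and the "+" (overprint) code is left out for brevity.

```python
def render_with_carriage_control(records):
    """Simulate a printer that treats column 1 of each record as a control code.

    Illustrative only: handles blank, "0" and "1"; any other first character
    is simply consumed and the rest of the record is single-spaced.
    """
    lines = []
    for record in records:
        control, text = record[:1], record[1:]
        if control == "0":
            lines.append("")           # double spacing: one blank line first
            lines.append(text)
        elif control == "1":
            lines.append("\f" + text)  # form feed: start a new page
        else:
            lines.append(text)         # blank (or anything else): single spacing
    return lines


# Without a leading blank the first letter is eaten; with one it survives.
for line in render_with_carriage_control(["Summing : 76.2793",
                                          " Summing : 76.2793"]):
    print(repr(line))
```

The first record comes out as 'umming : 76.2793' and the second as
'Summing : 76.2793', which is the kind of damage a small FORMAT change avoids.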

The results for an FX/80 with 8 processors (~11.75 MHz clock rate)
compiled with Alliant's fortran compiler using only the "-O" option:

--------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLEPRECISION word
--------------------------------------
 Timing calibration ; time = 630.571000000000 hundredths of a second
 Increase the size of the arrays if this is <30 and your clock precision is =<1/100 second
 ---------------------------------------------------
 Function     Rate (MB/s)   RMS time   Min time   Max time
 Assignment:      72.8155     0.0692     0.0659     0.0739
 Scaling   :      71.5990     0.0700     0.0670     0.0746
 Summing   :      76.2793     0.0971     0.0944     0.1028
 SAXPYing  :      76.5143     0.0998     0.0941     0.1119

----------------
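
As a rough cross-check of the table above, the rates appear to be bytes
moved divided by the minimum time, with 1 MB taken as 10^6 bytes. The sketch
below assumes an array length of 300,000 elements; that figure is inferred
from the numbers and is not stated anywhere in this post. The per-kernel
array counts (2 for assignment and scaling, 3 for summing and SAXPY) follow
from the loops themselves: one read plus one write, or two reads plus one
write.

```python
BYTES_PER_WORD = 8    # from the "8 bytes per DOUBLEPRECISION word" line
N_ASSUMED = 300_000   # ASSUMPTION: inferred from the table, not stated in the post

# Arrays read or written per loop iteration for each kernel.
ARRAYS_TOUCHED = {"Assignment": 2, "Scaling": 2, "Summing": 3, "SAXPYing": 3}


def rate_mb_per_s(kernel, seconds, n=N_ASSUMED):
    """Memory traffic in MB/s (1 MB = 1e6 bytes) for one pass over the arrays."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * n
    return bytes_moved / seconds / 1.0e6


# Minimum times copied from the FX/80 table above.
fx80_min_times = [("Assignment", 0.0659), ("Scaling", 0.0670),
                  ("Summing", 0.0944), ("SAXPYing", 0.0941)]
for kernel, min_time in fx80_min_times:
    print(f"{kernel:<11} {rate_mb_per_s(kernel, min_time):9.4f} MB/s")
```

Because the printed times are rounded to four decimals, the recomputed rates
agree with the reported ones only to within about 0.05 MB/s.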

The results for an FX/2800 using a 14-processor "cluster" (sorry, I
don't know the clock rate of the i860s), compiled with Alliant's
fortran compiler using just the "-O" option, are:
--------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLEPRECISION word
--------------------------------------
 Timing calibration ; time = 15.7310000000000 hundredths of a second
 Increase the size of the arrays if this is <30 and your clock precision is =<1/100 second
 ---------------------------------------------------
 Function     Rate (MB/s)   RMS time   Min time   Max time
 Assignment:     144.6655     0.0342     0.0332     0.0362
 Scaling   :     150.1877     0.0328     0.0320     0.0347
 Summing   :     135.1859     0.0549     0.0533     0.0584
 SAXPYing  :     125.3264     0.0590     0.0575     0.0618
----------------
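
The calibration line in this run (15.731 hundredths of a second) falls under
the threshold of 30 that the benchmark's own message mentions. A small
sketch of that rule, assuming run time grows roughly linearly with array
size (an assumption, not something the benchmark guarantees); the function
name is hypothetical:

```python
import math

CALIBRATION_THRESHOLD = 30.0  # hundredths of a second, from the benchmark's message


def min_growth_factor(calibration_hundredths, threshold=CALIBRATION_THRESHOLD):
    """Smallest integer factor by which to grow the arrays to reach the threshold.

    ASSUMPTION: calibration time scales about linearly with array size.
    """
    if calibration_hundredths >= threshold:
        return 1
    return math.ceil(threshold / calibration_hundredths)


print(min_growth_factor(15.731))   # FX/2800 run above
print(min_growth_factor(630.571))  # FX/80 run above
```

Under that assumption a factor of 2 would already clear the threshold for
the FX/2800, and the FX/80 run needs no change.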

Since this seemed to run too fast, I bumped up the array size by one
order of magnitude and ran it again. The "long stream" results are:
--------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLEPRECISION word
--------------------------------------
 Timing calibration ; time = 185.139000000000 hundredths of a second
 Increase the size of the arrays if this is <30 and your clock precision is =<1/100 second
 ---------------------------------------------------
 Function     Rate (MB/s)   RMS time   Min time   Max time
 Assignment:     309.0394     0.1650     0.1553     0.1882
 Scaling   :     305.9273     0.1658     0.1569     0.1834
 Summing   :     297.8160     0.2528     0.2418     0.2889
 SAXPYing  :     291.9708     0.2620     0.2466     0.3141
----------------

I plan on porting it to Cedar, too, but this will require modification
of the array declarations in order to distribute the arrays to the
global memory. I'll send details along with the results once I get them.

st


