STREAM result

From: Bob Shair (bshair@csci.csc.com)
Date: Fri Apr 04 1997 - 16:21:25 CST


Hi, John,

I hope you remember me from my days at IBM, supporting RS/6000s
(when the comments you made on memory were very useful).

I've downloaded a copy of STREAM, and will try running it on various
systems here. The systems I have access to are mostly designed for
commercial transaction processing, so the results may look somewhat
different from those of number crunchers, but I trust they'll be of interest.

The first result I have is from an RS/6000 7012-G30 (2x604 @ 112.5MHz).
I don't know how to make STREAM use both processors, so this run uses only one.

The system was basically idle, though in full operational state, with 38
users logged in, CA-Unicenter and all daemons running.

Compiled with xlc -O3 -qhsflt -qtune=604 -qarch=ppc (xlC.C 3.1.1.0) and run as:
   ./stream_604
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 1000000, Offset = 0
Total memory required = 22.9 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 9999 microseconds.
Each test below will take on the order of 269999 microseconds.
   (= 27 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:           48.4849       0.3350       0.3300       0.3400
Scale:          45.7144       0.3530       0.3500       0.3600
Add:            53.3334       0.4602       0.4500       0.5000
Triad:          53.3335       0.4601       0.4500       0.4800
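
For reference, the Rate column can be reproduced from the Min time column.
The sketch below is a reconstruction rather than STREAM's own code. It assumes
the usual STREAM accounting (Copy and Scale touch two arrays per iteration,
Add and Triad touch three) and 1 MB = 10^6 bytes. The names ARRAYS_TOUCHED
and rate_mb_per_s are made up for illustration.

```python
# Reconstruct STREAM "Rate (MB/s)" values from the reported minimum times.
N = 1_000_000          # "Array size" from the run header
BYTES_PER_WORD = 8     # "8 bytes per DOUBLE PRECISION word"

# Assumed accounting: arrays read plus arrays written per loop iteration.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

def rate_mb_per_s(kernel, min_time_s, n=N):
    """Bytes moved divided by the best time, in units of 10**6 bytes/s."""
    return ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * n / min_time_s / 1e6

# Min times (seconds) as printed for the 7012-G30 run.
min_times = {"Copy": 0.33, "Scale": 0.35, "Add": 0.45, "Triad": 0.45}
for kernel, t in min_times.items():
    print(f"{kernel}: {rate_mb_per_s(kernel, t):.4f}")
```

The values agree with the table to within the rounding of the printed times,
which are shown to only four decimal places.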

Running the same binary on an IBM PowerSeries 850 (1x604 @ 133MHz),
which should be the same machine as a 43P:
   ./stream_604
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 1000000, Offset = 0
Total memory required = 22.9 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 9999 microseconds.
Each test below will take on the order of 300000 microseconds.
   (= 30 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:           45.7142       0.3590       0.3500       0.3600
Scale:          44.4445       0.3640       0.3600       0.3700
Add:            53.3334       0.4570       0.4500       0.4600
Triad:          53.3335       0.4530       0.4500       0.4600
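
As a rough sanity check on the header figures and on how much the 10 ms timer
can blur these rates, here is a small sketch. It assumes three arrays of 8-byte
words, a memory figure in units of 2^20 bytes, a truncated tick count, and a
timing error of up to one tick either way. That last assumption is mine and is
not something STREAM reports. The helper names are hypothetical.

```python
N = 1_000_000          # "Array size" from the run header
BYTES_PER_WORD = 8     # "8 bytes per DOUBLE PRECISION word"
TICK_US = 9999         # reported clock granularity, in microseconds

def footprint_mib(n=N, arrays=3):
    """Memory for the three work arrays, in units of 2**20 bytes."""
    return arrays * BYTES_PER_WORD * n / 2**20

def ticks(test_time_us, tick_us=TICK_US):
    """Whole clock ticks spanned by one test (truncated, by assumption)."""
    return int(test_time_us / tick_us)

def rate_bounds(bytes_moved, min_time_s, tick_s=0.01):
    """Rate range in MB/s if the timing is off by up to one tick either way."""
    return (bytes_moved / (min_time_s + tick_s) / 1e6,
            bytes_moved / (min_time_s - tick_s) / 1e6)

print(f"{footprint_mib():.1f} MB")            # header says 22.9 MB
print(ticks(269999), ticks(300000))           # header says 27 and 30 ticks
lo, hi = rate_bounds(2 * BYTES_PER_WORD * N, 0.35)   # Copy on the 850
print(f"Copy on the 850: {lo:.1f} .. {hi:.1f} MB/s")
```

Under that one-tick assumption the Copy figures for the two machines are only
about two ticks apart, so small differences between them should be read with
some caution.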

--
Bob Shair                                         Voice.(217)351-8250 Ext:2421
Systems Consultant                                Fax.(217) 351-7346
CSC-CIS TRIS Division                             E-mail. bshair@csci.csc.com
At the source of the Embarras                     P.O. Box 770
2109 Fox Drive                                    Champaign, IL 61824-0770


