John,
[For what it's worth...]
Here are some results from your stream program on a multiprocessor
Challenge @ 150 MHz running IRIX 5.2.
Included are three cases:
- sequential code
- parallel code -- compiled with -pfa, the POWER Fortran Accelerator.
  -- on a 2 CPU Challenge
  -- on a 6 CPU Challenge
The -pfa option does all sorts of things like loop unrolling, some data
dependency analysis, etc. (you probably know more about this than I do).
I have the impression that (some) improvement can be made by
finding better compiler options.
Cheers,
-- Robert van Liere
Included for each of the three cases:
- compiler options
- uname -a
- hinv | grep 150 (hardware inventory; parallel cases only)
- a.out output
1. --------------------------------------------------------------------
; f77 -mips2 -O3 -non_shared stream_d.f
; uname -a
IRIX artemis 5.2 02282015 IP19 mips
; a.out
--------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLEPRECISION word
--------------------------------------
Timing calibration ; time = 159.0000024065375 hundredths of a second
Increase the size of the arrays if this is <30
and your clock precision is =<1/100 second
---------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Assignment:      62.7451     0.5231     0.5100     0.5400
Scaling   :      58.1819     0.5520     0.5500     0.5600
Summing   :      63.1580     0.7700     0.7600     0.7800
SAXPYing  :      70.5882     0.6920     0.6800     0.7000
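
[Editorial aside: the Rate column appears to be bytes moved divided by
Min time. The sketch below, in Python, reproduces the four rates above
under two assumptions that are not stated in the message: an array
length of 2,000,000 words (inferred because it makes the numbers match)
and 1 MB = 10^6 bytes. The per-kernel array counts (2 for Assignment and
Scaling, 3 for Summing and SAXPYing) are the usual stream accounting and
are likewise an assumption here.]

```python
# Assumed array length; inferred from the reported rates, not given in the message.
N = 2_000_000
# From the program output: "Assuming 8 bytes per DOUBLEPRECISION word".
BYTES_PER_WORD = 8

# Assumed number of arrays read or written per loop iteration.
ARRAYS_TOUCHED = {
    "Assignment": 2,  # read one array, write one
    "Scaling": 2,     # read one array, write one
    "Summing": 3,     # read two arrays, write one
    "SAXPYing": 3,    # read two arrays, write one
}

# Min time in seconds, copied from the sequential run (case 1).
MIN_TIME = {
    "Assignment": 0.51,
    "Scaling": 0.55,
    "Summing": 0.76,
    "SAXPYing": 0.68,
}


def rate_mb_per_s(kernel, min_time):
    """Return bytes moved per second, in units of 10^6 bytes."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * N
    return bytes_moved / min_time / 1.0e6


for kernel, t in MIN_TIME.items():
    print(f"{kernel:<11s}{rate_mb_per_s(kernel, t):10.4f} MB/s")
```

[If the assumed N is wrong, the computed rates will simply not match the
table, so this doubles as a consistency check on the inferred size.]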
2. --------------------------------------------------------------------
; f77 -pfa -mips2 -O3 -non_shared stream_d.f
; hinv | grep 150
2 150 MHZ IP19 Processors
; uname -a
IRIX artemis 5.2 02282015 IP19 mips
; a.out
--------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLEPRECISION word
--------------------------------------
Timing calibration ; time = 178.0000081285834 hundredths of a second
Increase the size of the arrays if this is <30
and your clock precision is =<1/100 second
---------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Assignment:     133.3337     0.2461     0.2400     0.2600
Scaling   :     139.1304     0.2441     0.2300     0.2500
Summing   :     133.3338     0.3721     0.3600     0.3800
SAXPYing  :     141.1764     0.3551     0.3400     0.3600
3. --------------------------------------------------------------------
; f77 -pfa -mips2 -O3 -non_shared stream_d.f
; hinv | grep 150
6 150 MHZ IP19 Processors
; uname -a
IRIX zeus 5.2 02282015 IP19 mips
; a.out
--------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLEPRECISION word
--------------------------------------
Timing calibration ; time = 80.99999930709600 hundredths of a second
Increase the size of the arrays if this is <30
and your clock precision is =<1/100 second
---------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Assignment:     355.5568     0.1018     0.0900     0.1300
Scaling   :     355.5559     0.1015     0.0900     0.1200
Summing   :     369.2308     0.1444     0.1300     0.1600
SAXPYing  :     369.2304     0.1467     0.1300     0.1700
--------------------------------------------------
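
[Editorial aside: a short Python sketch for comparing the three runs. It
divides each parallel rate by the sequential rate to get a speedup, and
divides that by the CPU count to get a per-CPU efficiency. The function
name and the table layout are illustrative only. Two caveats from the
runs themselves: case 1 was built without -pfa, so the ratio mixes
compiler transformations with the effect of extra CPUs, and case 3 ran
on a different host (zeus) than cases 1 and 2 (artemis).]

```python
# Rate (MB/s) per kernel, copied from the three runs:
# (sequential, 2 CPUs with -pfa, 6 CPUs with -pfa).
RATES = {
    "Assignment": (62.7451, 133.3337, 355.5568),
    "Scaling": (58.1819, 139.1304, 355.5559),
    "Summing": (63.1580, 133.3338, 369.2308),
    "SAXPYing": (70.5882, 141.1764, 369.2304),
}
# CPU counts for the two parallel runs, in the same order as above.
CPUS = (2, 6)


def speedup_and_efficiency(seq_rate, par_rate, cpus):
    """Return (speedup over the sequential run, speedup per CPU)."""
    speedup = par_rate / seq_rate
    return speedup, speedup / cpus


for kernel, (seq, *parallel) in RATES.items():
    for cpus, par in zip(CPUS, parallel):
        s, e = speedup_and_efficiency(seq, par, cpus)
        print(f"{kernel:<11s}{cpus} CPUs: speedup {s:5.2f}, efficiency {e:4.2f}")
```

[For example, SAXPYing on 2 CPUs comes out at a speedup of 2.0. An
efficiency above 1.0 for some kernels would not by itself mean
superlinear scaling, given the -pfa caveat above.]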
Robert van Liere
robertl@cwi.nl
<http://www.cwi.nl/cwi/people/Robert.van.Liere.html>