Stream for S3600

From: M.J. Rutter (mjr19@cus.cam.ac.uk)
Date: Mon May 18 1998 - 04:30:26 CDT


Dear John,

          Here are some STREAM results for a Hitachi S3600. It may be
(almost) obsolete, but it is still faster than the average desktop. The
results are one run with an array size of 4,000,000, then four runs with
an array size of 20,000,000: offset 0 (twice), offset 8, and offset 10.
The triad is the only test which is markedly changed by the offset (about
7,000 MB/s at offset 0, against about 10,200 MB/s at offset 8 and 10,800
MB/s at offset 10), and I have not investigated other offsets.

          If nothing else, this should double the number of Hitachi
machines for which you have data!

          Michael

Compiled with "f77 -W0,'OPT(O(S)),HAP' stream_d.f t_second.f"

Array sizes increased as I don't trust the timing routine very far.

Run on an empty machine as an unprivileged user, except for the
offset != 0 runs, which were on a busy machine.
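
As a rough check on the timing concern, the number of clock ticks per test
can be estimated by dividing the expected test time by the timer
granularity, both of which the program reports in the output below. A
minimal Python sketch using those reported figures (the function name is
only illustrative, and the program itself may round differently):

```python
def ticks_per_test(test_time_us, granularity_us):
    """Estimate how many timer ticks one test spans."""
    return test_time_us / granularity_us

# Figures reported by the runs below, in microseconds.
small = ticks_per_test(7461, 28)    # array size 4,000,000
large = ticks_per_test(37276, 28)   # array size 20,000,000

for label, ticks in (("4,000,000", small), ("20,000,000", large)):
    print(f"array size {label}: about {ticks:.0f} ticks per test")
```

Both sizes sit well above the 20-tick guideline printed by the program; the
larger arrays simply give the timer more ticks to work with.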

----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 4000000
 Offset = 0
 The total memory requirement is 91 MB
 You are running each test 5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity/precision appears to be 28 microseconds
 The tests below will each take a time on the order
 of 7461 microseconds
    (= 266 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:          12407.9100     0.0052     0.0052     0.0052
Scale:          6500.1016     0.0100     0.0098     0.0102
Add:           11320.7547     0.0087     0.0085     0.0091
Triad:          7548.3567     0.0137     0.0127     0.0142
 Sum of a is = 6075000000000.00000
 Sum of b is = 1215000000000.00000
 Sum of c is = 1620000000000.00000
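
For reference, the rates in the table can be approximately recovered from
the array size and the minimum times. The sketch below assumes the usual
STREAM accounting (Copy and Scale move two words per element, Add and Triad
three, with 1 MB taken as 10**6 bytes); the names are illustrative and not
taken from stream_d.f.

```python
BYTES_PER_WORD = 8  # "Assuming 8 bytes per DOUBLE PRECISION word"

# Words moved per array element by each kernel (reads plus writes).
WORDS_PER_ELEMENT = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

def stream_rate_mb_s(kernel, n, min_time_s):
    """Rate in MB/s (1 MB = 10**6 bytes) from the best time for a kernel."""
    bytes_moved = WORDS_PER_ELEMENT[kernel] * BYTES_PER_WORD * n
    return 1.0e-6 * bytes_moved / min_time_s

n = 4_000_000
# Minimum times in seconds, copied from the table above.
min_times = {"Copy": 0.0052, "Scale": 0.0098, "Add": 0.0085, "Triad": 0.0127}

for kernel, t in min_times.items():
    print(f"{kernel}: {stream_rate_mb_s(kernel, n, t):.1f} MB/s")
```

The recomputed values differ from the printed rates by up to about one
percent, because the times in the table are rounded to four decimal places.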
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 20000000
 Offset = 0
 The total memory requirement is 457 MB
 You are running each test 5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity/precision appears to be 28 microseconds
 The tests below will each take a time on the order
 of 37276 microseconds
    (= 1331 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:          12782.6156     0.0258     0.0250     0.0261
Scale:          6346.3102     0.0507     0.0504     0.0514
Add:           11202.1284     0.0433     0.0428     0.0435
Triad:          7031.2157     0.0697     0.0683     0.0706
 Sum of a is = 30375000000000.0000
 Sum of b is = 6075000000000.00000
 Sum of c is = 8100000000000.00000
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 20000000
 Offset = 0
 The total memory requirement is 457 MB
 You are running each test 5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity/precision appears to be 27 microseconds
 The tests below will each take a time on the order
 of 37321 microseconds
    (= 1382 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:          12700.4286     0.0258     0.0252     0.0262
Scale:          6346.0585     0.0507     0.0504     0.0513
Add:           11183.3368     0.0434     0.0429     0.0440
Triad:          6980.5997     0.0696     0.0688     0.0705
 Sum of a is = 30375000000000.0000
 Sum of b is = 6075000000000.00000
 Sum of c is = 8100000000000.00000
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 20000000
 Offset = 8
 The total memory requirement is 457 MB
 You are running each test 5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity/precision appears to be 28 microseconds
 The tests below will each take a time on the order
 of 37263 microseconds
    (= 1331 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:          12235.6900     0.0269     0.0262     0.0274
Scale:          6338.0142     0.0507     0.0505     0.0508
Add:           11437.5581     0.0422     0.0420     0.0424
Triad:         10168.6298     0.0504     0.0472     0.0599
 Sum of a is = 30375000000000.0000
 Sum of b is = 6075000000000.00000
 Sum of c is = 8100000000000.00000
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 20000000
 Offset = 10
 The total memory requirement is 457 MB
 You are running each test 5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity/precision appears to be 28 microseconds
 The tests below will each take a time on the order
 of 37404 microseconds
    (= 1336 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:          12062.7262     0.0273     0.0265     0.0288
Scale:          6351.4747     0.0509     0.0504     0.0512
Add:           11291.9921     0.0426     0.0425     0.0428
Triad:         10819.8273     0.0445     0.0444     0.0449
 Sum of a is = 30375000000000.0000
 Sum of b is = 6075000000000.00000
 Sum of c is = 8100000000000.00000
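
To put the offset effect on the triad in numbers, the four runs at array
size 20,000,000 can be compared directly. This is only a small Python
sketch over the rates printed above; taking the better of the two offset 0
runs as the baseline is an arbitrary choice made here for illustration.

```python
# Triad rates (MB/s) for array size 20,000,000, copied from the runs above.
triad = {
    "offset 0 (run 1)": 7031.2157,
    "offset 0 (run 2)": 6980.5997,
    "offset 8": 10168.6298,
    "offset 10": 10819.8273,
}

# Use the better of the two offset 0 runs as the baseline.
baseline = max(triad["offset 0 (run 1)"], triad["offset 0 (run 2)"])

for label, rate in triad.items():
    print(f"{label}: {rate:10.1f} MB/s  ({rate / baseline:.2f}x offset 0)")
```

Since the offset 8 and offset 10 runs were made on a busy machine, these
ratios may understate the gain.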
