Results for IBM SP2/P2SC-thin node (120MHz)

From: ce107@cfm.brown.edu
Date: Sat May 03 1997 - 14:50:41 CDT


Fresh out of the Maui P2SC system. Unfortunately the memory is only
128MB per node, so I cannot use a large enough problem size to
get 20 clock ticks with the cpu timer. :-(
                                                Constantinos
C version - cpu timer - xlc -O3 -qarch=pwr2 -qtune=pwr2 -Q
offset 0 - repeated 4 times
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 4400000, Offset = 0
Total memory required = 100.7 MB.
Each test is run 200 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 9999 microseconds.
Each test below will take on the order of 80000 microseconds.
   (= 8 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 782.2222 0.1023 0.0900 0.1200
Scale: 782.2222 0.1049 0.0900 0.1500
Add: 880.0000 0.1351 0.1200 0.1500
Triad: 812.3077 0.1363 0.1300 0.1500
96.5u 0.1s 2:08 75% 8+102663k 0+0io 693pf+0w
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 4400000, Offset = 0
Total memory required = 100.7 MB.
Each test is run 200 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 9999 microseconds.
Each test below will take on the order of 69999 microseconds.
   (= 7 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 782.2222 0.1025 0.0900 0.1200
Scale: 782.2222 0.1052 0.0900 0.1200
Add: 960.0000 0.1355 0.1100 0.1400
Triad: 880.0000 0.1356 0.1200 0.1600
96.3u 0.1s 1:37 98% 7+102709k 0+0io 7pf+0w
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 4400000, Offset = 0
Total memory required = 100.7 MB.
Each test is run 200 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 9999 microseconds.
Each test below will take on the order of 69999 microseconds.
   (= 7 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 704.0000 0.1027 0.1000 0.1300
Scale: 782.2222 0.1059 0.0900 0.1100
Add: 880.0000 0.1358 0.1200 0.1500
Triad: 880.0000 0.1350 0.1200 0.1400
96.5u 0.1s 1:37 99% 7+102719k 0+0io 0pf+0w
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 4400000, Offset = 0
Total memory required = 100.7 MB.
Each test is run 200 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 9999 microseconds.
Each test below will take on the order of 69999 microseconds.
   (= 7 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 880.0000 0.1032 0.0800 0.1100
Scale: 782.2222 0.1048 0.0900 0.1100
Add: 880.0000 0.1360 0.1200 0.1400
Triad: 880.0000 0.1347 0.1200 0.1400
96.3u 0.1s 1:37 99% 15+102762k 0+0io 3pf+0w

F77 version - cpu timer - xlf -O3 -qarch=pwr2 -qtune=pwr2 -Q
offset 0 - repeated 4 times
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 4400000
 Offset = 0
 The total memory requirement is 100 MB
 You are running each test 200 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity/precision appears to be 10000 microseconds
 The tests below will each take a time on the order
 of 70000 microseconds
    (= 7 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 782.2222 .1022 .0900 .1100
Scale: 782.2222 .1045 .0900 .1100
Add: 880.0000 .1347 .1200 .1500
Triad: 960.0000 .1349 .1100 .1400
 Sum of a is = 0.145456952139311120E+243
 Sum of b is = 0.290913904286721377E+242
 Sum of c is = 0.387885205746326088E+242
96.7u 0.2s 1:38 98% 19+102759k 0+0io 53pf+0w
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 4400000
 Offset = 0
 The total memory requirement is 100 MB
 You are running each test 200 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity/precision appears to be 10000 microseconds
 The tests below will each take a time on the order
 of 70000 microseconds
    (= 7 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 782.2222 .1030 .0900 .1100
Scale: 782.2222 .1049 .0900 .1200
Add: 880.0000 .1349 .1200 .1500
Triad: 880.0000 .1348 .1200 .1500
 Sum of a is = 0.145456952139311120E+243
 Sum of b is = 0.290913904286721377E+242
 Sum of c is = 0.387885205746326088E+242
96.9u 0.2s 1:38 98% 19+102718k 0+0io 1pf+0w
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 4400000
 Offset = 0
 The total memory requirement is 100 MB
 You are running each test 200 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity/precision appears to be 10000 microseconds
 The tests below will each take a time on the order
 of 70000 microseconds
    (= 7 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 782.2222 .1021 .0900 .1100
Scale: 782.2222 .1051 .0900 .1100
Add: 880.0000 .1337 .1200 .1400
Triad: 880.0000 .1354 .1200 .1400
 Sum of a is = 0.145456952139311120E+243
 Sum of b is = 0.290913904286721377E+242
 Sum of c is = 0.387885205746326088E+242
96.7u 0.0s 1:37 98% 19+102875k 0+0io 7pf+0w
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 4400000
 Offset = 0
 The total memory requirement is 100 MB
 You are running each test 200 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity/precision appears to be 10000 microseconds
 The tests below will each take a time on the order
 of 70000 microseconds
    (= 7 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 782.2222 .1027 .0900 .1100
Scale: 782.2222 .1042 .0900 .1100
Add: 880.0000 .1353 .1200 .1400
Triad: 880.0000 .1347 .1200 .1400
 Sum of a is = 0.145456952139311120E+243
 Sum of b is = 0.290913904286721377E+242
 Sum of c is = 0.387885205746326088E+242
96.8u 0.2s 1:37 99% 19+102725k 0+0io 1pf+0w

C version - hires real timer - xlc -O3 -qarch=pwr2 -qtune=pwr2 -Q
offset 0 - repeated 4 times
offset 1-16
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 1000000, Offset = 0
Total memory required = 22.9 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 16131 microseconds.
   (= 16131 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.1302 0.0241 0.0233 0.0292
Scale: 670.7533 0.0250 0.0239 0.0330
Add: 780.1330 0.0325 0.0308 0.0438
Triad: 783.6525 0.0310 0.0306 0.0335
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 1000000, Offset = 0
Total memory required = 22.9 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 16138 microseconds.
   (= 16138 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 690.0832 0.0235 0.0232 0.0257
Scale: 672.0261 0.0240 0.0238 0.0243
Add: 782.7780 0.0309 0.0307 0.0313
Triad: 786.8039 0.0306 0.0305 0.0311
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 1000000, Offset = 0
Total memory required = 22.9 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 16102 microseconds.
   (= 16102 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 689.9590 0.0234 0.0232 0.0239
Scale: 671.9050 0.0238 0.0238 0.0239
Add: 782.2488 0.0314 0.0307 0.0347
Triad: 785.9469 0.0308 0.0305 0.0314
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 1000000, Offset = 0
Total memory required = 22.9 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 16132 microseconds.
   (= 16132 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.1161 0.0234 0.0233 0.0238
Scale: 670.6661 0.0240 0.0239 0.0245
Add: 780.3145 0.0308 0.0308 0.0310
Triad: 783.6677 0.0312 0.0306 0.0346
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 1
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32740 microseconds.
   (= 32740 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.9214 0.0469 0.0464 0.0476
Scale: 670.9645 0.0479 0.0477 0.0485
Add: 780.5353 0.0622 0.0615 0.0655
Triad: 775.2992 0.0624 0.0619 0.0642
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 2
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32449 microseconds.
   (= 32449 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 690.3103 0.0465 0.0464 0.0466
Scale: 673.4406 0.0479 0.0475 0.0485
Add: 787.2485 0.0613 0.0610 0.0618
Triad: 779.3886 0.0624 0.0616 0.0658
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 3
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32280 microseconds.
   (= 32280 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 674.5270 0.0488 0.0474 0.0521
Scale: 668.4882 0.0482 0.0479 0.0500
Add: 768.3165 0.0628 0.0625 0.0639
Triad: 774.2720 0.0629 0.0620 0.0663
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 4
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32671 microseconds.
   (= 32671 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 675.4606 0.0481 0.0474 0.0513
Scale: 672.4554 0.0478 0.0476 0.0481
Add: 772.9818 0.0626 0.0621 0.0632
Triad: 777.6695 0.0619 0.0617 0.0625
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 5
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32250 microseconds.
   (= 32250 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 648.7552 0.0496 0.0493 0.0501
Scale: 684.4962 0.0475 0.0467 0.0515
Add: 749.0442 0.0647 0.0641 0.0663
Triad: 764.8315 0.0631 0.0628 0.0640
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 6
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32305 microseconds.
   (= 32305 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 647.9644 0.0498 0.0494 0.0513
Scale: 682.2763 0.0475 0.0469 0.0513
Add: 740.8318 0.0652 0.0648 0.0657
Triad: 767.2507 0.0629 0.0626 0.0634
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 7
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32466 microseconds.
   (= 32466 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 660.4391 0.0492 0.0485 0.0530
Scale: 681.8898 0.0473 0.0469 0.0480
Add: 728.4755 0.0663 0.0659 0.0669
Triad: 748.0311 0.0646 0.0642 0.0665
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 8
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32281 microseconds.
   (= 32281 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 658.0962 0.0494 0.0486 0.0527
Scale: 682.1688 0.0472 0.0469 0.0479
Add: 721.5580 0.0668 0.0665 0.0672
Triad: 752.2923 0.0642 0.0638 0.0649
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 9
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32281 microseconds.
   (= 32281 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 646.8652 0.0508 0.0495 0.0540
Scale: 678.8204 0.0476 0.0471 0.0486
Add: 718.5475 0.0681 0.0668 0.0749
Triad: 728.6878 0.0669 0.0659 0.0705
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 10
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 33030 microseconds.
   (= 33030 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 653.6668 0.0493 0.0490 0.0500
Scale: 682.6163 0.0472 0.0469 0.0479
Add: 731.7787 0.0666 0.0656 0.0706
Triad: 720.4116 0.0669 0.0666 0.0675
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 11
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32217 microseconds.
   (= 32217 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 675.2023 0.0485 0.0474 0.0522
Scale: 680.7277 0.0478 0.0470 0.0511
Add: 728.2990 0.0662 0.0659 0.0670
Triad: 741.4948 0.0657 0.0647 0.0711
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 12
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32305 microseconds.
   (= 32305 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 674.6694 0.0481 0.0474 0.0498
Scale: 680.4931 0.0473 0.0470 0.0478
Add: 752.6031 0.0640 0.0638 0.0646
Triad: 729.6041 0.0662 0.0658 0.0671
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 13
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32740 microseconds.
   (= 32740 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 687.8745 0.0467 0.0465 0.0473
Scale: 682.1965 0.0485 0.0469 0.0597
Add: 752.5581 0.0654 0.0638 0.0718
Triad: 740.2148 0.0732 0.0648 0.1239
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 14
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32240 microseconds.
   (= 32240 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.6510 0.0469 0.0465 0.0488
Scale: 684.5573 0.0471 0.0467 0.0478
Add: 754.0152 0.0643 0.0637 0.0677
Triad: 751.6632 0.0642 0.0639 0.0651
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 15
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32222 microseconds.
   (= 32222 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 690.6194 0.0465 0.0463 0.0471
Scale: 668.8463 0.0481 0.0478 0.0486
Add: 729.8779 0.0662 0.0658 0.0679
Triad: 764.2508 0.0634 0.0628 0.0646
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2000000, Offset = 16
Total memory required = 45.8 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 32242 microseconds.
   (= 32242 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 689.3194 0.0467 0.0464 0.0472
Scale: 670.3261 0.0480 0.0477 0.0488
Add: 752.5651 0.0647 0.0638 0.0683
Triad: 753.1634 0.0641 0.0637 0.0655

F77 version - hires real timer - xlf -O3 -qarch=pwr2 -qtune=pwr2 -Q
offset 0 - repeated 4 times
offset 1-16
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 0
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32223 microseconds
    (= 32223 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 687.9697 .0444 .0465 .0473
Scale: 672.6290 .0456 .0476 .0495
Add: 785.4287 .0583 .0611 .0621
Triad: 784.7292 .0583 .0612 .0620
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 0
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32263 microseconds
    (= 32263 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.0050 .0453 .0465 .0507
Scale: 672.1187 .0454 .0476 .0483
Add: 785.2449 .0657 .0611 .1097
Triad: 784.7950 .0584 .0612 .0620
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 0
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32459 microseconds
    (= 32459 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.5997 .0443 .0465 .0473
Scale: 672.6795 .0458 .0476 .0516
Add: 785.1791 .0584 .0611 .0621
Triad: 784.8439 .0582 .0612 .0619
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 0
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32357 microseconds
    (= 32357 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.8047 .0444 .0465 .0478
Scale: 672.6475 .0453 .0476 .0482
Add: 785.4226 .0587 .0611 .0652
Triad: 784.7231 .0584 .0612 .0624
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 1
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32224 microseconds
    (= 32224 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.2925 .0445 .0465 .0492
Scale: 673.7821 .0454 .0475 .0483
Add: 779.9728 .0592 .0615 .0655
Triad: 783.5320 .0582 .0613 .0621
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 2
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32224 microseconds
    (= 32224 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.7782 .0443 .0465 .0472
Scale: 673.8836 .0453 .0475 .0485
Add: 779.8913 .0589 .0615 .0635
Triad: 783.4604 .0585 .0613 .0623
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 3
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32256 microseconds
    (= 32256 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 674.9714 .0453 .0474 .0497
Scale: 645.9887 .0476 .0495 .0533
Add: 778.6651 .0588 .0616 .0624
Triad: 772.5013 .0592 .0621 .0629
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 4
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32223 microseconds
    (= 32223 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 675.2448 .0455 .0474 .0495
Scale: 648.3087 .0471 .0494 .0501
Add: 778.8880 .0587 .0616 .0625
Triad: 772.6066 .0594 .0621 .0640
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 5
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32223 microseconds
    (= 32223 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 675.2923 .0452 .0474 .0482
Scale: 648.3478 .0479 .0494 .0538
Add: 780.4839 .0588 .0615 .0629
Triad: 741.1045 .0621 .0648 .0688
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 6
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32223 microseconds
    (= 32223 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 675.4062 .0451 .0474 .0482
Scale: 648.2257 .0473 .0494 .0515
Add: 780.6064 .0587 .0615 .0627
Triad: 740.8781 .0620 .0648 .0671
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 7
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32441 microseconds
    (= 32441 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 659.9860 .0462 .0485 .0491
Scale: 647.2894 .0471 .0494 .0502
Add: 761.4758 .0601 .0630 .0639
Triad: 740.4885 .0622 .0648 .0691
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 8
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32224 microseconds
    (= 32224 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 659.8254 .0462 .0485 .0492
Scale: 647.2332 .0474 .0494 .0516
Add: 761.4902 .0603 .0630 .0653
Triad: 740.4803 .0620 .0648 .0661
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 9
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32254 microseconds
    (= 32254 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 648.9560 .0469 .0493 .0496
Scale: 649.8798 .0471 .0492 .0506
Add: 742.4094 .0617 .0647 .0659
Triad: 730.2141 .0631 .0657 .0699
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 10
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32670 microseconds
    (= 32670 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 648.6392 .0477 .0493 .0564
Scale: 649.9113 .0473 .0492 .0518
Add: 741.9361 .0621 .0647 .0671
Triad: 730.2313 .0633 .0657 .0714
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 11
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32225 microseconds
    (= 32225 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 675.7598 .0453 .0474 .0493
Scale: 648.0176 .0471 .0494 .0502
Add: 740.9367 .0628 .0648 .0689
Triad: 742.2712 .0617 .0647 .0654
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 12
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32432 microseconds
    (= 32432 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 675.4368 .0454 .0474 .0495
Scale: 648.4449 .0470 .0493 .0503
Add: 740.8849 .0618 .0648 .0656
Triad: 742.6902 .0616 .0646 .0657
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 13
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32226 microseconds
    (= 32226 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 690.0246 .0443 .0464 .0472
Scale: 672.9089 .0453 .0476 .0483
Add: 751.8625 .0611 .0638 .0662
Triad: 752.7916 .0612 .0638 .0678
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 14
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32224 microseconds
    (= 32224 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 690.1488 .0442 .0464 .0469
Scale: 673.2835 .0454 .0475 .0484
Add: 751.9889 .0610 .0638 .0657
Triad: 752.7790 .0608 .0638 .0647
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 15
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32266 microseconds
    (= 32266 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.8648 .0448 .0465 .0507
Scale: 671.0299 .0454 .0477 .0485
Add: 752.4554 .0609 .0638 .0651
Triad: 752.3176 .0609 .0638 .0648
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 16
 The total memory requirement is 45 MB
 You are running each test 10 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be 1 microseconds
 The tests below will each take a time on the order
 of 32224 microseconds
    (= 32224 clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function Rate (MB/s) RMS time Min time Max time
Copy: 688.4461 .0445 .0465 .0487
Scale: 671.4731 .0454 .0477 .0485
Add: 752.1279 .0611 .0638 .0657
Triad: 752.2164 .0609 .0638 .0649
 Sum of a is = 0.230660156259187354E+19
 Sum of b is = 0.461320312485643840E+18
 Sum of c is = 0.615093750014125568E+18
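
For anyone checking the tables: the Rate column follows from Min time.
STREAM counts 2 words moved per array element for Copy and Scale and 3
for Add and Triad, and reports 10^6 bytes per second as MB/s. A rough
Python sketch of that relation is below. The names are hypothetical, and
since Min time is printed to only four decimals the recomputed rates
match the printed ones only approximately.

```python
# Words read plus written per array element for each STREAM kernel.
WORDS_PER_ELEMENT = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_rate_mb_s(kernel, array_size, min_time_s, word_bytes=8):
    """Bandwidth in MB/s (1 MB = 10**6 bytes) from the best time of a kernel."""
    bytes_moved = WORDS_PER_ELEMENT[kernel] * word_bytes * array_size
    return 1.0e-6 * bytes_moved / min_time_s


# Offset 16 run just above: Copy min time .0465 s, Add min time .0638 s
copy_rate = stream_rate_mb_s("Copy", 2000000, 0.0465)
add_rate = stream_rate_mb_s("Add", 2000000, 0.0638)
```

These land within about 0.3 MB/s of the printed 688.4461 and 752.1279;
the difference comes from the rounding of Min time.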


