Requesting Large Pages
Shared Segment Pointer = 504403158265495552
Segment Size (DW) = 268435456 (MB = 2048 )
Vector Size (DW) = 67108864 (MB = 512 )
Num_threads = 8
Num_threads = 8
Num_threads = 8
Num_threads = 8
Num_threads = 8
Num_threads = 8
Num_threads = 8
Num_threads = 8
rebind: num_parthds is 8
Starting Initialization
Done With Initialization
a(1) 1.00000000000000000
a(N) 0.000000000000000000E+00
Base Offset = 67108864
Incremental Offset = 2048
Number of Threads = 8
----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 67108864
Offset = 0
The total memory requirement is 1536 MB
You are running each test 5 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity appears to be less than one microsecond
Your clock granularity/precision appears to be 1 microseconds
The tests below will each take a time on the order
of 111654 microseconds
(= 111654 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:           9230.0307      .1178      .1163      .1222
Scale:          9100.1022      .1186      .1180      .1198
Add:            8079.9400      .2005      .1993      .2016
Triad:          8239.3571      .1963      .1955      .1970
Sum of a is = 101921587200000.000
Sum of b is = 20384317440000.0000
Sum of c is = 27179089920000.0000
Base Offset = 67108864
Incremental Offset = 2304
Number of Threads = 8
----------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:           9307.2943      .1928      .1154      .1222
Scale:          9100.1022      .1947      .1180      .1198
Add:            9615.0088      .2617      .1675      .2016
Triad:          9487.6141      .2617      .1698      .1970
Sum of a is = 101921587200000.000
Sum of b is = 20384317440000.0000
Sum of c is = 27179089920000.0000
Base Offset = 67108864
Incremental Offset = 2560
Number of Threads = 8
----------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:           9307.2943      .2299      .1154      .1246
Scale:          9488.6735      .2280      .1132      .1198
Add:           10139.4896      .2792      .1588      .2016
Triad:         10372.7592      .2771      .1553      .1970
Sum of a is = 101921587200000.000
Sum of b is = 20384317440000.0000
Sum of c is = 27179089920000.0000
Base Offset = 67108864
Incremental Offset = 2816
Number of Threads = 8
----------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:           9365.2418      .2436      .1147      .1246
Scale:          9488.6735      .2452      .1132      .1214
Add:           10139.4896      .2912      .1588      .2016
Triad:         10372.7592      .2894      .1553      .1970
Sum of a is = 101921587200000.000
Sum of b is = 20384317440000.0000
Sum of c is = 27179089920000.0000
Base Offset = 67108864
Incremental Offset = 3072
Number of Threads = 8
----------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:           9365.2418      .2510      .1147      .1246
Scale:          9488.6735      .2513      .1132      .1214
Add:           10139.4896      .3101      .1588      .2016
Triad:         10372.7592      .3087      .1553      .1970
Sum of a is = 101921587200000.000
Sum of b is = 20384317440000.0000
Sum of c is = 27179089920000.0000
bindprocessor successful: thread_self() 581777 cpu_id 2
bindprocessor successful: thread_self() 405589 cpu_id 3
bindprocessor successful: thread_self() 618541 cpu_id 7
bindprocessor successful: thread_self() 381029 cpu_id 6
bindprocessor successful: thread_self() 622695 cpu_id 1
bindprocessor successful: thread_self() 651441 cpu_id 0
bindprocessor successful: thread_self() 589853 cpu_id 5
bindprocessor successful: thread_self() 630995 cpu_id 4
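
The figures in the log above can be cross-checked with a little arithmetic. The sketch below is not part of the benchmark; it assumes the output follows the usual STREAM accounting, where the memory requirement counts three arrays in units of 2^20 bytes, Copy and Scale touch two arrays per element, Add and Triad touch three, and the rate is reported in units of 10^6 bytes per second based on the best (minimum) time. The names `total_memory_mib`, `rate_mb_per_s`, and `ARRAYS_TOUCHED` are hypothetical helpers introduced for illustration. Because the log prints times to only four decimals, recomputed rates agree with the printed ones only approximately.

```python
# Values copied from the log: "Array size" and "8 bytes per DOUBLE PRECISION word".
ARRAY_SIZE = 67_108_864
BYTES_PER_WORD = 8

# Assumption: standard STREAM byte counting (arrays read or written per element).
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def total_memory_mib(n=ARRAY_SIZE, word_bytes=BYTES_PER_WORD, arrays=3):
    """Memory for the three benchmark arrays, in units of 2**20 bytes."""
    return arrays * n * word_bytes / 2**20


def rate_mb_per_s(kernel, min_time_s, n=ARRAY_SIZE, word_bytes=BYTES_PER_WORD):
    """Bandwidth in units of 10**6 bytes per second, from the best time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * word_bytes * n
    return 1.0e-6 * bytes_moved / min_time_s


if __name__ == "__main__":
    print(f"Total memory: {total_memory_mib():.0f} MB")
    # Min times taken from the first results table (Incremental Offset = 2048).
    for kernel, min_time in [("Copy", 0.1163), ("Scale", 0.1180),
                             ("Add", 0.1993), ("Triad", 0.1955)]:
        print(f"{kernel:6s} {rate_mb_per_s(kernel, min_time):12.4f} MB/s")
```

Under these assumptions the memory figure matches the log's 1536 MB exactly, and each recomputed rate lands within about 0.1% of the printed value in the first table.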