-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 80000000 (elements), Offset = 0 (elements)
Memory per array = 610.4 MiB (= 0.596 GiB).
(Array is sized for systems with cache sizes up to 152.59 MiB)
(Results are acceptable for Cache sizes up to 160.00 MiB)
Total memory required = 1831.1 MiB (= 1.788 GiB).
-------------------------------------------------------------
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 20
Number of Threads counted = 20
DEBUG: The system has 32 logical processors
DEBUG: Manually setting cpu affinity for each thread
DEBUG: thread 3 setting affinity for core 12
DEBUG: thread 17 setting affinity for core 7
DEBUG: thread 2 setting affinity for core 8
DEBUG: thread 1 setting affinity for core 4
DEBUG: thread 16 setting affinity for core 3
DEBUG: thread 15 setting affinity for core 30
DEBUG: thread 9 setting affinity for core 6
DEBUG: thread 13 setting affinity for core 22
DEBUG: thread 14 setting affinity for core 26
DEBUG: thread 4 setting affinity for core 16
DEBUG: thread 10 setting affinity for core 10
DEBUG: thread 5 setting affinity for core 20
DEBUG: thread 12 setting affinity for core 18
DEBUG: thread 6 setting affinity for core 24
DEBUG: thread 11 setting affinity for core 14
DEBUG: thread 19 setting affinity for core 15
DEBUG: thread 18 setting affinity for core 11
DEBUG: thread 8 setting affinity for core 2
DEBUG: thread 0 setting affinity for core 0
DEBUG: thread 7 setting affinity for core 28
DEBUG: The system has 32 logical processors
Affinity Mask for Thread 1 includes Processor 4
Affinity Mask for Thread 2 includes Processor 8
Affinity Mask for Thread 3 includes Processor 12
Affinity Mask for Thread 4 includes Processor 16
Affinity Mask for Thread 8 includes Processor 2
Affinity Mask for Thread 7 includes Processor 28
Affinity Mask for Thread 5 includes Processor 20
Affinity Mask for Thread 9 includes Processor 6
Affinity Mask for Thread 6 includes Processor 24
Affinity Mask for Thread 11 includes Processor 14
Affinity Mask for Thread 10 includes Processor 10
Affinity Mask for Thread 12 includes Processor 18
Affinity Mask for Thread 17 includes Processor 7
Affinity Mask for Thread 16 includes Processor 3
Affinity Mask for Thread 13 includes Processor 22
Affinity Mask for Thread 15 includes Processor 30
Affinity Mask for Thread 19 includes Processor 15
Affinity Mask for Thread 0 includes Processor 0
Affinity Mask for Thread 14 includes Processor 26
Affinity Mask for Thread 18 includes Processor 11
-------------------------------------------------------------
Your timer granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 17422 microseconds.
   (= 17422 timer ticks)
-------------------------------------------------------------
----- Rates based on fastest execution of each kernel -----
(delta) is avg performance relative to best iteration for each kernel
Function    Best Rate MB/s   Min time   Avg time (delta)    Max time
Copy:            76573.327   0.016716   0.016837 (-0.72%)   0.016921
Scale:           77650.951   0.016484   0.016555 (-0.43%)   0.016687
Add:             73775.055   0.026025   0.026150 (-0.48%)   0.026333
Triad:           76232.641   0.025186   0.025320 (-0.53%)   0.025430
-------------------------------------------------------------
----- Alternate Selection of Results ------------
----- Rates based on fastest full iteration (sum of 4 kernel times) -----
Fastest (overall) iteration was iteration 4 of 10
(delta) is the performance in iteration 4 relative to the fastest individual kernel timing
Function    Best Rate MB/s   time (delta)
Copy:            76031.115   0.016835 (-0.71%)
Scale:           77225.390   0.016575 (-0.55%)
Add:             73775.055   0.026025 ( 0.00%)
Triad:           75979.467   0.025270 (-0.33%)
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
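
The log states that the best time for each kernel is used to compute the reported bandwidth. As a minimal sketch of that arithmetic, the Python below recomputes the "Best Rate MB/s" and "Memory per array" figures from the array size, element width, and minimum times printed above. It assumes the usual STREAM byte-counting convention (Copy and Scale touch two arrays, Add and Triad touch three) and that MB means 10^6 bytes; the function and constant names are illustrative and do not come from the benchmark source.

```python
# Values copied from the log above.
BYTES_PER_ELEMENT = 8
ARRAY_SIZE = 80_000_000  # elements per array

# Assumed convention: arrays read plus arrays written by each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# "Min time" column from the log, in seconds.
MIN_TIME_S = {
    "Copy": 0.016716,
    "Scale": 0.016484,
    "Add": 0.026025,
    "Triad": 0.025186,
}


def best_rate_mb_s(kernel: str) -> float:
    """Bandwidth in MB/s (1 MB = 10**6 bytes) from the kernel's best time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_ELEMENT * ARRAY_SIZE
    return 1.0e-6 * bytes_moved / MIN_TIME_S[kernel]


def memory_per_array_mib() -> float:
    """Size of one array in MiB (1 MiB = 2**20 bytes)."""
    return BYTES_PER_ELEMENT * ARRAY_SIZE / 2**20


if __name__ == "__main__":
    for name in MIN_TIME_S:
        print(f"{name:<6} {best_rate_mb_s(name):12.3f} MB/s")
    print(f"Memory per array: {memory_per_array_mib():.1f} MiB")
    print(f"Total (3 arrays): {3 * memory_per_array_mib():.1f} MiB")
```

The recomputed rates agree with the logged ones only to within a fraction of an MB/s, because the log prints the minimum times rounded to six decimal places.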