Submission of Standard Stream Result AMD Istanbul 8435 ( 2.6GHz )

From: Mark_Digicor <mark@digicor.com.au>
Date: Tue Oct 06 2009 - 21:21:55 CDT


Dear Dr. McCalpin,

I would like to submit the standard STREAM result of a Supermicro AMD Istanbul 4-way system.

Below are the system description and the STREAM test results.

I hope to see them posted soon.

 

Best Regards.

 

 Mark Han



7th Oct 2009 DiGiCOR Pty Ltd.







System Description / Configuration:

Motherboard: Supermicro BHQME ( blade )

CPU: AMD Opteron Istanbul 8435

CPU Speed: 2.6GHz

CPU(s): 4 processors, 6 cores/processor

L3 Cache: 4 x 5MB ( per processor )

Memory: 64GB DDR2-800MHz ( 4GB per DIMM, 4 sockets x 4 DIMMs x 4GB, 2DPC, dual channel ).

HT-Assist: enabled ( probe filter enabled )



Operating system: Windows Server 2008 R2 64-bit
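
As a rough cross-check on the configuration above, a theoretical peak memory bandwidth can be estimated from the DIMM layout. The sketch below assumes that DDR2-800 performs 800 million transfers per second over a 64-bit (8-byte) channel, i.e. 6.4 GB/s per channel, and that each of the 4 sockets drives 2 channels as listed. The function name is illustrative only, and the Triad figure is taken from the first run in the results below.

```python
def peak_bandwidth_mb_s(sockets, channels_per_socket, mega_transfers_per_s,
                        bytes_per_transfer=8):
    """Theoretical peak in MB/s (1 MB = 10**6 bytes, as STREAM reports it)."""
    return sockets * channels_per_socket * mega_transfers_per_s * bytes_per_transfer

# Assumed figures: 4 sockets, dual channel, DDR2-800 (800 MT/s, 8 bytes wide).
peak = peak_bandwidth_mb_s(sockets=4, channels_per_socket=2, mega_transfers_per_s=800)
print(peak)  # 51200

# Triad rate from the first run in the results, in MB/s.
triad_mb_s = 41198.5458
print(f"{triad_mb_s / peak:.1%} of the estimated peak")
```

Under these assumptions the measured rates sit at roughly four fifths of the estimated peak. The estimate ignores refresh and other overheads, so it is only an upper bound.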



STREAM program/code:

http://www.cs.virginia.edu/stream/FTP/Contrib/StreamWin-32-64_distro.zip

 

Result:

 
C:\Stream_64bit>rem

C:\Stream_64bit>date /T
Wed 10/07/2009

C:\Stream_64bit>time /T
11:33 AM

C:\Stream_64bit>REM : 4-Ways Ps - 24 cores : 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23

C:\Stream_64bit>REM 2.6 GHz x 4 Opteron 8435 / 4 x 4 x 4GB ( 64GB ) DDR2-800MHz

C:\Stream_64bit>set MP_BIND=yes

C:\Stream_64bit>rem Multithreaded version Stream5.8_omp-64.exe Windows 2008 64bit R2

C:\Stream_64bit>set OMP_NUM_THREADS=24

C:\Stream_64bit>set MP_BLIST=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23

C:\Stream_64bit>sleep.exe 20

C:\Stream_64bit>start /b /WAIT /HIGH stream5.8_omp-64.exe
-------------------------------------------------------------
STREAM version $Revision: 5.8 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 40000000, Offset = 0
Total memory required = 915.5 MB.
Each test is run 40 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Number of Threads requested = 24
-------------------------------------------------------------
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 202721 microseconds.
   (= 202721 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 42146.0543 0.0155 0.0152 0.0249
Scale: 42180.8821 0.0152 0.0152 0.0156
Add: 41048.7656 0.0235 0.0234 0.0245
Triad: 41198.5458 0.0241 0.0233 0.0428
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------

C:\Stream_64bit>set OMP_NUM_THREADS=24

C:\Stream_64bit>set MP_BLIST=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23

C:\Stream_64bit>sleep.exe 20

C:\Stream_64bit>start /b /WAIT /HIGH stream5.8_omp-64.exe
-------------------------------------------------------------
STREAM version $Revision: 5.8 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 40000000, Offset = 0
Total memory required = 915.5 MB.
Each test is run 40 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Number of Threads requested = 24
-------------------------------------------------------------
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 202720 microseconds.
   (= 202720 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 42144.9668 0.0155 0.0152 0.0241
Scale: 42134.0955 0.0152 0.0152 0.0161
Add: 41005.4848 0.0235 0.0234 0.0250
Triad: 41134.9113 0.0238 0.0233 0.0401
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------

C:\Stream_64bit>set MP_BLIST=18,19,20,21,22,23,12,13,14,15,16,17,6,7,8,9,10,11,0,1,2,3,4,5

C:\Stream_64bit>sleep.exe 20

C:\Stream_64bit>start /b /WAIT /HIGH stream5.8_omp-64.exe
-------------------------------------------------------------
STREAM version $Revision: 5.8 $
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 40000000, Offset = 0
Total memory required = 915.5 MB.
Each test is run 40 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Number of Threads requested = 24
-------------------------------------------------------------
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
Printing one line per active thread....
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 200188 microseconds.
   (= 200188 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Rate (MB/s) Avg time Min time Max time
Copy: 42144.9668 0.0152 0.0152 0.0155
Scale: 42141.7049 0.0152 0.0152 0.0164
Add: 40996.5652 0.0237 0.0234 0.0303
Triad: 41134.9113 0.0234 0.0233 0.0238
-------------------------------------------------------------
Solution Validates
-------------------------------------------------------------

C:\Stream_64bit>date /T
Wed 10/07/2009

C:\Stream_64bit>time /T
11:35 AM

C:\Stream_64bit>rem

C:\Stream_64bit>rem
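
For readers checking the figures, the "Total memory required" and "Rate (MB/s)" columns in the output above can be reproduced from the array size and the minimum times. The sketch below assumes the usual STREAM counting convention: three arrays of 8-byte words, Copy and Scale touching two arrays per iteration, Add and Triad touching three, and rates expressed in units of 10**6 bytes per second. The helper names are illustrative and are not taken from the STREAM source.

```python
N = 40_000_000   # "Array size" from the output
WORD = 8         # bytes per DOUBLE PRECISION word, from the output

# Arrays read or written per loop iteration (assumed STREAM convention).
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

def total_memory_mib(n=N, word=WORD):
    """Memory for the three STREAM arrays, in units of 2**20 bytes."""
    return 3 * word * n / 2**20

def rate_mb_s(kernel, min_time_s, n=N, word=WORD):
    """Bandwidth in MB/s (10**6 bytes) from the best time for one kernel."""
    return 1e-6 * ARRAYS_TOUCHED[kernel] * word * n / min_time_s

print(f"{total_memory_mib():.1f} MB")  # 915.5 MB

# The printed min times are rounded to four decimals, so these only
# approximate the reported rates.
for kernel, t in [("Copy", 0.0152), ("Triad", 0.0233)]:
    print(kernel, round(rate_mb_s(kernel, t), 1))
```

Because the printed times are rounded, the recomputed rates agree with the reported ones only to within about 0.1 percent; for example, Copy comes out near 42105 MB/s against the reported 42146 MB/s.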


Received on Wed Oct 07 06:59:34 2009