Fw: standard STREAM on IBM eServer p5 575 (1900 MHz, 8cpu)

From: John D Mccalpin (mccalpin@us.ibm.com)
Date: Thu Feb 24 2005 - 10:13:25 CST


    To: mccalpin@us.ibm.com
    cc:
    From: Ly Vu/Austin/IBM@IBMUS
    Subject: standard STREAM on IBM eServer p5 575 (1900 MHz, 8cpu)

    These are standard STREAM results on an IBM eServer p5 575
    with eight 1900 MHz cpus. This is a POWER5 SMP machine.
    Large pages were used in all cases.

    Function    Rate (MB/s)   Avg time   Min time   Max time
    Copy:        34966.6146      .1232      .1228      .1235
    Scale:       35035.0607      .1227      .1225      .1229
    Add:         41076.3886      .1569      .1568      .1570
    Triad:       41585.3956      .1553      .1548      .1556
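
For readers who want to check the table, the reported rates can be approximately reproduced from the array size in the full output (268303360 elements) and the minimum times. The sketch below is not part of the original run. It assumes the usual STREAM counting convention (Copy and Scale touch two 8-byte words per element, Add and Triad touch three) and 10^6 bytes per MB. Because the printed times are rounded to four digits, the results agree with the table only to within about 0.1%.

```python
# Assumption: usual STREAM counting, one 8-byte word per array read or written.
WORDS_PER_ELEMENT = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}
BYTES_PER_WORD = 8
ARRAY_SIZE = 268303360  # "Array size" line from the full output

# Minimum times in seconds, as printed (rounded to four digits).
MIN_TIME = {"Copy": 0.1228, "Scale": 0.1225, "Add": 0.1568, "Triad": 0.1548}

# Rates as reported in the table, used here only for comparison.
REPORTED = {"Copy": 34966.6146, "Scale": 35035.0607,
            "Add": 41076.3886, "Triad": 41585.3956}


def rate_mb_per_s(kernel):
    """Bytes moved divided by the best time, in units of 10^6 bytes/s."""
    bytes_moved = WORDS_PER_ELEMENT[kernel] * BYTES_PER_WORD * ARRAY_SIZE
    return bytes_moved / MIN_TIME[kernel] / 1.0e6


for kernel in MIN_TIME:
    computed = rate_mb_per_s(kernel)
    rel_diff = abs(computed - REPORTED[kernel]) / REPORTED[kernel]
    print(f"{kernel:6s} computed {computed:10.1f}  "
          f"reported {REPORTED[kernel]:10.1f}  rel. diff {rel_diff:.2e}")
```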

    Here is the full output file:
    --------------------------------------------------

     Requesting Large Pages
     Setting up for 2 CPUs per module
     Number of segments per array = 8
     CPU binding list : 0 2 4 6 8 10 12 14
     Shared Segment Pointer = 504403158265495552
     Shared Segment Pointer = 504403160412979200
     Shared Segment Pointer = 504403162560462848
     Segment Size (B) = 268435456 (MB = 256 )
     Array Size (B) = 2147483648 (MB = 2048 )
     Array Size (DW) = 268435456
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     Num_threads = 16
     rebind: num_parthds is 16
     Starting Initialization
     Done With Initialization
     a(1) 1.00000000000000000
     b(M) 1.00000000000000000
     c(M) 1.00000000000000000
     Incremental Offset = 512
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 268303360
     The total memory requirement is 6140 MB
     You are running each test 5 times
     --
     The *best* time for each test is used
     *EXCLUDING* the first and last iterations
     ----------------------------------------------------
     Your clock granularity appears to be less than one microsecond
     Your clock granularity/precision appears to be 1 microseconds

     ----------------------------------------------------
    Function    Rate (MB/s)   Avg time   Min time   Max time
    Copy:        34966.6146      .1232      .1228      .1235
    Scale:       35035.0607      .1227      .1225      .1229
    Add:         41076.3886      .1569      .1568      .1570
    Triad:       41585.3956      .1553      .1548      .1556
     ----------------------------------------------------
     Solution Validates!
     ----------------------------------------------------
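
The size figures in the output above are mutually consistent, which the following sketch checks. It is an illustration added for the reader, not part of the benchmark. It assumes that the "MB" in the segment, array, and total-memory lines means 2^20 bytes, that the total covers three arrays of 8-byte words, and that the program truncates the total to a whole number; the variable names are made up for this example.

```python
BYTES_PER_WORD = 8            # "Assuming 8 bytes per DOUBLE PRECISION word"
MIB = 2 ** 20                 # assumption: "MB" here means 2^20 bytes

segment_bytes = 268435456     # "Segment Size (B)"
segments_per_array = 8        # "Number of segments per array"
array_elements = 268303360    # "Array size" used by the timed kernels

# Allocated size of one array, from the shared-segment lines.
allocated_bytes = segment_bytes * segments_per_array
allocated_words = allocated_bytes // BYTES_PER_WORD

# Total memory touched by the three arrays a, b, c, truncated to whole MB.
total_mb = 3 * BYTES_PER_WORD * array_elements // MIB

print("segment size (MB):  ", segment_bytes // MIB)
print("array size (B):     ", allocated_bytes)
print("array size (MB):    ", allocated_bytes // MIB)
print("array size (DW):    ", allocated_words)
print("total memory (MB):  ", total_mb)
```

Note that the array size used by the kernels (268303360 elements) is slightly smaller than the allocated 268435456 double words per array.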

    ______________________________________________
    Ly Vu
    IBM Corp. - Austin, Texas.
    RS/6000 Performance Analysis.



    This archive was generated by hypermail 2.1.4 : Tue Mar 08 2005 - 07:42:51 CST