Tuned 64-bit STREAM results on an IBM eServer OpenPower 710

From: Duc Vianney (dvianney@us.ibm.com)
Date: Fri Feb 11 2005 - 19:26:20 UTC

    John:

    Please post the following results late on Monday 14 Feb 2005, but before
    9 A.M. on Tuesday 15 Feb 2005.

    These are tuned 64-bit STREAM results on an IBM eServer OpenPower 710
    with two 1650 MHz CPUs running Red Hat Enterprise Linux AS 4.

    IBM Corporation IBM eServer OpenPower 710 (1650MHz, 2 CPU, Linux)
    CPU(s) enabled: 2 cores, 1 chip, 2 cores/chip (SMT ON)
    CPU(s) orderable: 1,2
    Primary Cache: 64KB I + 32KB D (on chip)/core
    Secondary Cache: 1920KB unified (on chip)/chip
    L3 Cache: 36MB unified (off chip)/DCM, 1 DCM/SUT

    SMT: Acronym for "Simultaneous Multi-Threading". A processor technology
         that allows the simultaneous execution of multiple thread contexts
         within a single processor core. (Enabled by default)
    DCM: Acronym for "Dual-Chip Module" (one dual-core processor chip + one
         L3-cache chip)
    SUT: Acronym for "System Under Test"

     Number of Threads = 4
    ----------------------------------------------
     Double precision appears to have 16 digits of accuracy
     Assuming 8 bytes per DOUBLE PRECISION word
    ----------------------------------------------
     Array size = 66060288
     Offset = 96
     The total memory requirement is 1512 MB
     You are running each test 100 times
     The *best* time for each test is used
     ----------------------------------------------------
     Your clock granularity/precision appears to be 1 microseconds
     The tests below will each take a time on the order
     of 373160 microseconds
        (= 373160 clock ticks)
     Increase the size of the arrays if this shows that
     you are not getting at least 20 clock ticks per test.
     ----------------------------------------------------
     WARNING -- The above is only a rough guideline.
     For best results, please be sure you know the
     precision of your system timer.
     ----------------------------------------------------
    Function   Rate (MB/s)  RMS time  Min time  Max time
    Copy:        3039.9033     .3507     .3477     .5598
    Scale:       3010.0351     .3518     .3511     .3522
    Add:         4378.1162     .3625     .3621     .3631
    Triad:       4427.0507     .3585     .3581     .3645
     Sum of a is = 0.537150969556339130E+126
     Sum of b is = 0.107430193907022657E+126
     Sum of c is = 0.143240258538696412E+126
    locking to cpu 0
    locking to cpu 1
    locking to cpu 2
    locking to cpu 3
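
The figures in the output above can be cross-checked with a few lines of
arithmetic. The sketch below is not part of the submission; it assumes the
usual STREAM accounting (Copy and Scale move 2 words per element, Add and
Triad move 3), that rates use 1 MB = 10^6 bytes, and that the memory
requirement is reported in units of 2^20 bytes. The function and variable
names are illustrative only.

```python
# Cross-check of the reported STREAM figures (illustrative sketch).
# Assumptions: 8-byte words, 3 arrays, rate = bytes moved / best time,
# rates in units of 10^6 bytes/s, memory in units of 2^20 bytes.

ARRAY_SIZE = 66060288      # "Array size" from the output
BYTES_PER_WORD = 8         # "Assuming 8 bytes per DOUBLE PRECISION word"

# Words read plus written per array element for each kernel.
WORDS_MOVED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}

# "Min time" and "Rate (MB/s)" columns copied from the results table.
MIN_TIME = {"Copy": 0.3477, "Scale": 0.3511, "Add": 0.3621, "Triad": 0.3581}
REPORTED = {"Copy": 3039.9033, "Scale": 3010.0351,
            "Add": 4378.1162, "Triad": 4427.0507}


def memory_mib(n, word_bytes=BYTES_PER_WORD, arrays=3):
    """Total footprint of the three arrays, in units of 2^20 bytes."""
    return arrays * n * word_bytes / 2**20


def rate_mb_s(kernel, n, seconds):
    """Bandwidth for one kernel, in units of 10^6 bytes per second."""
    return WORDS_MOVED[kernel] * BYTES_PER_WORD * n / seconds / 1e6


if __name__ == "__main__":
    print(f"memory: {memory_mib(ARRAY_SIZE):.0f} MB")
    for kernel, reported in REPORTED.items():
        derived = rate_mb_s(kernel, ARRAY_SIZE, MIN_TIME[kernel])
        print(f"{kernel:<6} derived {derived:10.2f}  reported {reported:10.4f}")
```

Because the times in the table are rounded to four decimal places, the
derived rates should agree with the reported ones only to within a small
fraction of a percent, not digit for digit.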

    Duc J. Vianney, Ph.D., IBM Linux Technology Center Performance Team
    dvianney@us.ibm.com, Phone: (512) 838-9919 Fax: (512) 838-0070
