(no subject)

From: honhuang@einstein.phys.uwm.edu
Date: Sun Apr 21 1996 - 13:27:48 CDT

Next message: <: "some STREAM results"
Previous message: Partha Tirumalai: "COPY (stream) numbers with bld/bst."
Next in thread: honhuang@einstein.phys.uwm.edu: "(no subject)"
Maybe reply: honhuang@einstein.phys.uwm.edu: "(no subject)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi,
   This is a data point for an IBM RS/6000 591, running AIX 4.1.4.0
with xlf 3, compiled with
xlf -O3 -qarch=pwr2 -qtune=pwr2.
   I only got 16 clock ticks with a memory usage of 228 MB. Although
the machine has 512 MB of memory, whenever the memory requirement
exceeds 256 MB the program will not run, and I get a message saying
there is not enough memory. I checked the limits, and they were set
properly. Until I figure out how to use the full 512 MB, 16 clock
ticks is the best I can get (or maybe 17).
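The memory figures above can be checked with a little arithmetic. The sketch below assumes the benchmark holds three arrays of 8-byte words and reports megabytes as multiples of 2**20 bytes, which is consistent with the 228 MB printed in the output. The function names are illustrative only and are not part of STREAM.

```python
# Rough footprint estimate for a STREAM-style run.
# Assumption: three arrays (a, b, c) of 8-byte DOUBLE PRECISION words.
BYTES_PER_WORD = 8
NUM_ARRAYS = 3
MIB = 2 ** 20


def stream_memory_mib(n, offset=0):
    """Approximate memory requirement in MiB for array size n."""
    return NUM_ARRAYS * BYTES_PER_WORD * (n + offset) / MIB


def max_array_size(limit_mib, offset=0):
    """Largest array size whose three arrays fit under limit_mib."""
    return (limit_mib * MIB) // (NUM_ARRAYS * BYTES_PER_WORD) - offset


if __name__ == "__main__":
    print(f"{stream_memory_mib(10_000_000):.2f} MiB")  # about 228.88 MiB
    print(max_array_size(256))  # largest n that stays under 256 MiB
```

Under these assumptions an array size of 10000000 already sits close to the 256 MB ceiling, so there is only about 12% headroom for growing the arrays before the run fails.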
   Why do you use the best (shortest) time? Why not use the
average time over more iterations? Using a larger array means
averaging over space, and using more iterations means averaging
over time; are they equivalent? I think the average time is closer
to the way the machine is actually used than the best time.
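The difference between the two summaries can be made concrete. The sketch below computes the best, mean, and RMS times for one kernel and the bandwidth each would imply. The list of timings is hypothetical (chosen to lie between the reported minimum and maximum for the first kernel) and is not data from this run; `summarize` is an illustrative helper, not part of STREAM.

```python
import math


def summarize(times, bytes_moved):
    """Return best/mean/RMS times and the rates (MB/s) they imply."""
    best = min(times)
    mean = sum(times) / len(times)
    rms = math.sqrt(sum(t * t for t in times) / len(times))
    return {
        "best": best,
        "mean": mean,
        "rms": rms,
        "rate_from_best": bytes_moved / best / 1e6,
        "rate_from_mean": bytes_moved / mean / 1e6,
    }


if __name__ == "__main__":
    # Hypothetical per-iteration timings in seconds, NOT measured values.
    times = [0.25, 0.24, 0.23, 0.25, 0.24]
    # Assumption: the kernel reads one array and writes one array of
    # 10,000,000 8-byte words.
    stats = summarize(times, bytes_moved=2 * 8 * 10_000_000)
    for key, value in stats.items():
        print(f"{key:>15}: {value:.4f}")
```

One common argument for the minimum is that interference from other activity can only lengthen a run, so the shortest time is the closest estimate of what the hardware can sustain, while the mean describes a typical run on a machine in its current state. With a 10 ms timer, both summaries are quantized to whole ticks, so the gap between them is only a few ticks here.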
   The system load is nearly 0, excluding stream itself.
   Best regards.

                               h.huang

----------------------------------------------
Double precision appears to have 16 digits of accuracy
Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
Array size = 10000000
Offset = 0
The total memory requirement is 228 MB
You are running each test 10 times
The *best* time for each test is used
----------------------------------------------------
Your clock granularity/precision appears to be 10000 microseconds
The tests below will each take a time on the order
of 160000 microseconds
    (= 16 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
----------------------------------------------------
WARNING: The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
----------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Assignment:      695.6522      .2434      .2300      .2500
Scaling   :      695.6522      .2434      .2300      .2500
Summing   :      750.0000      .3345      .3200      .3400
SAXPYing  :      750.0000      .3334      .3200      .3400
Sum of a is : 0.115330078110398751E+20
Sum of b is : 0.230660156211832781E+19
Sum of c is : 0.307546874927796787E+19
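The reported rates can be reproduced from the minimum times. The sketch below assumes the usual STREAM accounting: Assignment and Scaling touch two arrays per iteration, Summing and SAXPYing touch three, and MB/s means 10**6 bytes per second. The dictionary and function names are illustrative and not taken from the benchmark source.

```python
# Reproduce the "Rate (MB/s)" column from the "Min time" column.
N = 10_000_000             # array size from the output above
BYTES_PER_WORD = 8         # DOUBLE PRECISION
GRANULARITY_S = 0.01       # 10000 microseconds, as reported

# Assumption: arrays touched per loop iteration for each kernel.
ARRAYS_TOUCHED = {"Assignment": 2, "Scaling": 2, "Summing": 3, "SAXPYing": 3}

# Minimum times in seconds, copied from the table above.
MIN_TIME = {"Assignment": 0.23, "Scaling": 0.23, "Summing": 0.32, "SAXPYing": 0.32}


def rate_mb_per_s(kernel):
    """Bandwidth implied by the best time, in units of 10**6 bytes/s."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * N
    return bytes_moved / MIN_TIME[kernel] / 1e6


def clock_ticks(kernel):
    """Number of timer ticks spanned by the best time."""
    return round(MIN_TIME[kernel] / GRANULARITY_S)


if __name__ == "__main__":
    for kernel in ARRAYS_TOUCHED:
        print(f"{kernel:<11} {rate_mb_per_s(kernel):10.4f} MB/s  "
              f"{clock_ticks(kernel)} ticks")
```

Under these assumptions the computed rates match the table, and the measured minimum times span 23 and 32 ticks. Both are above the 20-tick guideline, even though the up-front estimate printed by the program was 16 ticks.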


This archive was generated by hypermail 2b29 : Tue Apr 18 2000 - 05:23:05 CDT