memory bandwidth results

From: David Daniel (David.Daniel@meiko.com)
Date: Mon Jun 06 1994 - 14:56:35 CDT


John,

Attached below are the results of your stream memory bandwidth tests
for the CS-2 vector node. Let me know if you need more info on this,
and I'll be in touch again with the qgbench results shortly.

David Daniel

----------------------------------------------------------------------
Meiko                              Email: David.Daniel@meiko.com
130C Baker Avenue Ext.             Tel:   508-371-0088
Concord, MA 01742, USA             Fax:   508-371-7516
----------------------------------------------------------------------

Hardware
********
MK403 compute board incorporating:
    66 MHz Pinnacle SPARC.
    45 MHz MK534 vector unit (2 VPUs + cache coherency hardware).
    16 bank, 3 port, 128MB memory system.

The VPUs can request 1 double precision word per cycle, so peak
theoretical memory bandwidth in vector code is

   2 * 45 MHz * 8 bytes = 720 MB/s.

We achieve about 85% of this for your benchmark (610 to 621 MB/s in
the log below). This is typical of the code generated by the
vectorizing compilers -- about 15% goes to stripmine overhead.
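
As a check on the arithmetic above, here is a minimal C sketch that
recomputes the peak figure and the fraction of peak reached by a
measured rate. The function names are illustrative only and are not
part of the benchmark; the measured rates would be taken from the
session log further down.

```c
#include <stdio.h>

/* Peak bandwidth in MB/s: each VPU requests one word per cycle,
 * so MHz * bytes per word gives MB/s per VPU. */
double peak_mb_per_s(int vpus, double clock_mhz, int bytes_per_word)
{
    return vpus * clock_mhz * bytes_per_word;
}

/* Fraction of peak reached by a measured rate (both in MB/s). */
double fraction_of_peak(double measured_mb_per_s, double peak)
{
    return measured_mb_per_s / peak;
}

/* Print one measured rate as a percentage of the CS-2 vector node
 * peak (2 VPUs, 45 MHz, 8-byte words). */
void report(const char *name, double measured_mb_per_s)
{
    double peak = peak_mb_per_s(2, 45.0, 8);
    printf("%-11s %9.4f MB/s = %.1f%% of %.0f MB/s peak\n",
           name, measured_mb_per_s,
           100.0 * fraction_of_peak(measured_mb_per_s, peak), peak);
}
```

Calling report("Assignment:", 621.2748) and report("Summing:", 610.2757)
from a small main gives figures a little above and a little below 85%
respectively.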

Software
********
Solaris 2.1 + Meiko support for VPUs.
Portland Group vectorizing f77 and cc compilers.

Changes to code
***************
I replaced the second() timing function with a call to the Solaris
gettimeofday, which I believe gives sufficient resolution. No other
changes were made.

========================================================================
$ cat second.c
#include <sys/time.h>
 
double second_ (void)
{
    static struct timeval tp;
    gettimeofday (&tp);
    return (double) tp.tv_sec + (double) tp.tv_usec * 1.0e-6;
}
========================================================================
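
The listing above is reproduced as it was run. For anyone rebuilding
the test on a current POSIX system, where gettimeofday is declared
with a second (timezone) argument that is normally passed as NULL, a
minimal sketch of the equivalent timer might look like this. It is an
adaptation under that assumption, not the code used for the results
in this message.

```c
#include <stddef.h>
#include <sys/time.h>

/* Wall-clock time in seconds since the epoch.  The trailing
 * underscore keeps the external name the f77 code calls, as in
 * the original second.c. */
double second_(void)
{
    struct timeval tp;

    /* Second argument is the obsolete timezone pointer; NULL is
     * the usual value on POSIX systems. */
    gettimeofday(&tp, NULL);
    return (double) tp.tv_sec + (double) tp.tv_usec * 1.0e-6;
}
```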

Log of session
**************
========================================================================

$ date
Mon Jun 6 13:03:55 EDT 1994
$ uname -a
SunOS abe1 5.1 Callisto_Development dino1 sparc
$ cd ~/Delaware/stream
$ make
pgcc -I /opt/MEIKOcs2/include -c second.c
pgf77 -O4 -r8 -Mvect -o stream_d stream_d.f second.o
Linking:
$ size stream_d
149439 + 67580 + 48005400 = 48222419
$ stream_d
--------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLEPRECISION word
--------------------------------------
 Timing calibration ; time = 20.64710855484009 hundredths of a second
 Increase the size of the arrays if this is <30
  and your clock precision is =<1/100 second
 ---------------------------------------------------
Function      Rate (MB/s)   RMS time   Min time   Max time
Assignment:      621.2748     0.0516     0.0515     0.0516
Scaling   :      621.3108     0.0517     0.0515     0.0535
Summing   :      610.2757     0.0787     0.0787     0.0788
SAXPYing  :      610.3460     0.0787     0.0786     0.0787

========================================================================
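
The rates in the table can be related to the minimum times with a
short C sketch. It assumes STREAM's usual accounting (Assignment and
Scaling touch two arrays per iteration, Summing and SAXPYing three)
and 1 MB = 10^6 bytes. The array length of 2,000,000 elements is not
stated anywhere in the log; it is an inference from the roughly 48 MB
bss figure printed by size (three 8-byte arrays) and should be treated
as an assumption. The function names are illustrative only.

```c
#include <stdio.h>

/* Bytes moved by one pass of a kernel that reads or writes
 * `arrays` arrays of n words each. */
double bytes_moved(int arrays, long n, int bytes_per_word)
{
    return (double) arrays * (double) n * (double) bytes_per_word;
}

/* Rate in MB/s (1 MB = 1.0e6 bytes) for a given elapsed time. */
double rate_mb_per_s(double bytes, double seconds)
{
    return bytes / seconds / 1.0e6;
}

/* Print the rate implied by a minimum time.  N is the assumed
 * (inferred, not logged) array length. */
void show_rate(const char *name, int arrays, double min_time)
{
    const long N = 2000000L;   /* assumption, see text above */
    printf("%-11s %9.4f MB/s\n", name,
           rate_mb_per_s(bytes_moved(arrays, N, 8), min_time));
}
```

With the rounded minimum times from the table, show_rate("Assignment:",
2, 0.0515) and show_rate("Summing:", 3, 0.0787) land within about
1 MB/s of the reported rates; the small difference is what one would
expect from the times being printed to only four decimal places.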


