Re: stream

From: Charles Grassl (cmg@magnet.cray.com)
Date: Sun Nov 17 1991 - 19:02:42 CST

Next message: dik@cwi.nl: "SX3 benchmark"
Previous message: Charles Grassl: "stream"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hello John;

I'm still trying to figure out the CM-2 stream code. I now strongly
suspect that the compiler has done some dead code elimination.

The original (from TMC) source code has additional "do i=1,100" loops
around each kernel and the TIMES array is adjusted accordingly. I ran
the KAP output (from TMC) on a CRAY Y-MP (this code is attached
below). The results are -low- by a factor of 100. The KAP output does
not have the correction for the factor of 100.
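
To make the missing correction concrete, here is a minimal sketch of the
arithmetic. It is not taken from the TMC or KAP sources: the program name
FIXCAL, the names NREP, TLOOP and TKERN, the 8-byte word size, and the
1.0 second sample timing are all assumptions chosen for illustration.

```fortran
      PROGRAM fixcal
C     Hypothetical sketch: NREP, TLOOP, and the sample timing are
C     illustrative values, not taken from the TMC or KAP sources.
      INTEGER n,nrep,nbytes,nbpw
      PARAMETER (n=4000*256,nrep=100)
      REAL tloop,tkern,rate0,rate1
C     Assignment kernel: 2 words moved per element (as in BYTES(1));
C     8 bytes per word is an assumption.
      nbytes = 2
      nbpw = 8
C     Pretend the whole 100-pass loop took 1.0 second.
      tloop = 1.0
C     Uncorrected: the whole-loop time is treated as one kernel pass.
      rate0 = real(n)*nbytes*nbpw/tloop/1.0e6
C     Corrected: divide by the repeat count to get the time per pass.
      tkern = tloop/real(nrep)
      rate1 = real(n)*nbytes*nbpw/tkern/1.0e6
      PRINT *,'Uncorrected rate (MB/s): ',rate0
      PRINT *,'Corrected rate   (MB/s): ',rate1
      PRINT *,'Ratio                  : ',rate1/rate0
      END
```

The ratio of the two rates comes out at about the repeat count, which is
the factor by which the uncorrected listing under-reports.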

I'm disappointed that the CF77 compiler did not see the dead
code in the "do i=1,100" loops. Just the same, the calibration
in this program is incorrect.
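
As an illustration of why those loops count as dead code, here is a small
sketch with the array assignment written out as an explicit inner loop.
The program name DEADLP and the array length of 1000 are assumptions made
only for this example. Every pass of the outer loop stores the same values
into C, so an optimizer that notices this could keep a single pass and
discard the other 99.

```fortran
      PROGRAM deadlp
C     Hypothetical sketch; the array length is an arbitrary choice.
      INTEGER n,i,j
      PARAMETER (n=1000)
      REAL a(n),c(n)
      DO 10 j = 1,n
         a(j) = 1.0
 10   CONTINUE
C     A is never modified inside the outer loop, so passes 2..100
C     recompute exactly what pass 1 already stored in C.
      DO 30 i = 1,100
         DO 20 j = 1,n
            c(j) = a(j)
 20      CONTINUE
 30   CONTINUE
C     Using C afterwards keeps at least one pass alive.
      PRINT *,c(1),c(n)
      END
```

If a compiler does remove the redundant passes, a time divided by 100
would then overstate the rate by the same factor.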

I suspect that the CM-5 compiler's code optimizer eventually kicked in
and deleted something. Otherwise, if it really ran the program correctly,
the bandwidth is 8,000,000 Mbyte/sec.!

Regards,

-- 
Charles M. Grassl
Cray Research, Inc.
(612) 683-3531 cmg@cray.com
C Source code output from KAP:
      PROGRAM stream
C     .. Parameters ..
      INTEGER n,ntimes
      PARAMETER (p=256,n=4000*p,ntimes=10)
C     ..
C     .. Local Scalars ..
      INTEGER j,k,nbpw
C     ..
C     .. Local Arrays ..
      real a(n),b(n),c(n),maxtime(4),mintime(4),rmstime(4),
     $     times(4,ntimes)
      INTEGER bytes(4)
      CHARACTER label(4)*11
C     ..
C     .. External Functions ..
      INTEGER realsize
C     ..
C     .. Intrinsic Functions ..
      INTRINSIC dble,max,min,sqrt
C     ..
C     .. Data statements ..
      DATA rmstime/4*0.0/,mintime/4*1.0E+36/,maxtime/4*0.0/
      DATA label/' Assignment:',' Scaling   :',' Summing   :',
     $     ' SAXPYing  :'/
      DATA bytes/2,2,3,3/
      etime()=second()
C     ..
 
*       --- SETUP --- determine precision and check timing ---
 
      nbpw = realsize()
 
      t = etime()
          A = 1.0e0
          B = 2.0e0
          C = 0.0e0
      t=etime()-t
      PRINT *,'Timing calibration ; time = ',t*100,' hundredths',
     $  ' of a second'
      PRINT *,'Increase the size of the arrays if this is <30 ',
     $  ' and your clock precision is =<1/100 second'
      PRINT *,'---------------------------------------------------'
 
*       --- MAIN LOOP --- repeat test cases NTIMES times ---
      DO 60 k = 1,ntimes
 
         t=etime()
         DO 20 I=1,100
            C = A
 20      CONTINUE
         t=etime()-t
         times(1,k) = t
 
         t=etime()
         DO 30 I=1,100
            C = 3.0 * A
 30      CONTINUE
         t=etime()-t
         times(2,k) = t
 
         t=etime()
         DO 40 I=1,100
            C = A + B
 40      CONTINUE
         t=etime()-t
         times(3,k) = t
         t=etime()
         DO 50 I=1,100
            C = A + 3.0 * B
 50      CONTINUE
         t=etime()-t
         times(4,k) = t
	 call dummysub(a,b,c,n)
 60   CONTINUE
 
*       --- SUMMARY ---
C*$*NOVECTORIZE
      DO 80 k = 1,ntimes
         DO 70 j = 1,4
            rmstime(j) = rmstime(j) + times(j,k)**2
            mintime(j) = min(mintime(j),times(j,k))
            maxtime(j) = max(maxtime(j),times(j,k))
 70      CONTINUE
 80   CONTINUE
      WRITE (*,FMT=9000)
      DO 90 j = 1,4
         rmstime(j) = sqrt(rmstime(j)/float(ntimes))
         WRITE (*,FMT=9010) label(j),n*bytes(j)*nbpw/mintime(j)/1.0e6,
     $        rmstime(j),mintime(j),maxtime(j)
 90   CONTINUE
 
 9000 FORMAT (' Function',5x,'Rate (MB/s)  RMS time  Min time  Max time'
     $        )
 9010 FORMAT (a,4 (f10.4,2x))
      END
*-------------------------------------
* INTEGER FUNCTION realsize()
*
* A semi-portable way to determine the precision of default REAL
* in Fortran.
* Here used to guess how many bytes of storage a real number occupies.
*
	integer function realsize()
	double precision ref(30)
	real test
	double precision pi
C	Test #1 - compare double precision pi to acos(-1.0e0)
	pi = 3.14159 26535 89793 23846 26433 83279 50288 d0
	picalc = acos(-1.0e0)
	diff = abs(picalc-pi)
	if (diff.eq.0.0) then
	    print *,'Test #1 Failed = picalc=piexact'
	    print *,'Apparently Single=Double Precision'
	    print *,'Proceeding to Test #2'
	    print *,' '
	    goto 200
	else
	    ndigits = -log10(abs(diff))+0.5
	    goto 1000
	endif
C	Test #2 - compare single(1.0d0+delta) to 1.0e0
  200	do 10 j=1,30
	    ref(j) = 1.0d0+10.0d0**(-j)
   10	continue
	do 20 j=1,30
	    test = ref(j)
	    ndigits = j
	    call dummy(test,result)
	    if (test.eq.1.0e0) then
		goto 1000
	    endif
   20	continue
	print *,'Test #2 failed - Precision appears to exceed 30 digits'
	print *,'Proceeding to Test #3'
	goto 300
C	Test #3 - abs(sqrt(1.0d0)-sqrt(1.0e0))
  300	diff = abs(sqrt(1.0d0)-sqrt(1.0e0))
	if (diff.eq.0.0) then
	    print *,'Test Failed - sqrt(1.0e0)=sqrt(1.0d0)'
	    print *,'Apparently Single=Double Precision'
	    print *,'Giving up'
	    goto 400
	else
	    ndigits = -log10(abs(diff))+0.5
	    goto 1000
	endif
 1000	write (*,'(a)') '--------------------------------------'
	write (*,'(1x,a,i2,a)') 'Single precision appears to have ',
     $		ndigits,' digits of accuracy'
	if (ndigits.le.8) then
	    realsize = 4
	else 
	    realsize = 8
	endif
	write (*,'(1x,a,i1,a)') 'Assuming ',realsize,
     $                       ' bytes per default REAL word'
	write (*,'(a)') '--------------------------------------'
	return
  400	print *,'Hmmmm.  I am unable to determine the size of a REAL'
  	print *,'Please enter the number of Bytes per REAL number : '
  	read (*,*) realsize
	if (realsize.ne.4.and.realsize.ne.8) then
	    print *,'Your answer ',realsize,' does not make sense!'
	    print *,'Try again!'
  	    print *,'Please enter the number of Bytes per ',
     $              'REAL number : '
  	    read (*,*) realsize
	endif
	print *,'You have manually entered a size of ',realsize,
     $          ' bytes per REAL number'
	write (*,'(a)') '--------------------------------------'
	end
	subroutine dummy(q,r)
	r = cos(q)
	return
	end
        subroutine dummysub(a,b,c,n)
	return
	end


This archive was generated by hypermail 2b29 : Tue Apr 18 2000 - 05:23:02 CDT