Re: stream benchmark results

From: Mike Kenney (mike@wavelet.apl.washington.edu)
Date: Fri Aug 26 1994 - 15:14:02 CDT


John D. McCalpin writes:
>
> Thanks for the data. The rate for a plain copy is surprisingly slow,
> but the other numbers are quite good for a machine in that price range.
> --
> John D. McCalpin mccalpin@perelandra.cms.udel.edu
> Assistant Professor mccalpin@brahms.udel.edu
> College of Marine Studies, U. Del. John.McCalpin@mvs.udel.edu
>
>

Looking at the assembler output from gcc, there are two 4-byte moves
issued on each iteration of the plain-copy loop. The other loops
use the floating-point instructions, which can move 8 bytes at a
time. I would expect an 8-byte move instruction to approximately
double the speed of the first loop ... maybe a Pentium-optimized gcc
would do this.
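
To make the difference concrete, here is a rough sketch in C of the two
ways the copy can be carried out. The function names are made up for
illustration, the code assumes an 8-byte double, and it is only a model
of what the generated instructions do, not the actual compiler output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Plain copy as written in the source: one double per iteration. */
void copy_double(double *c, const double *a, size_t n)
{
    for (size_t j = 0; j < n; j++)
        c[j] = a[j];
}

/* Model of the observed code: each double is moved as two 4-byte
 * integer words.  Assumes sizeof(double) == 8. */
void copy_two_words(double *c, const double *a, size_t n)
{
    assert(sizeof(double) == 8);
    for (size_t j = 0; j < n; j++) {
        const unsigned char *src = (const unsigned char *)&a[j];
        unsigned char *dst = (unsigned char *)&c[j];
        uint32_t first, second;

        memcpy(&first, src, 4);       /* first 4-byte load   */
        memcpy(&second, src + 4, 4);  /* second 4-byte load  */
        memcpy(dst, &first, 4);       /* first 4-byte store  */
        memcpy(dst + 4, &second, 4);  /* second 4-byte store */
    }
}
```

Both routines leave identical bytes in c[]; the second simply needs
twice as many load and store operations per element.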

BTW, I tried substituting c[j] = 1.0*a[j] in the first loop but
the compiler didn't fall for it ... it still issued the 4-byte
move instructions rather than an fp multiply.
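
For reference, a sketch of that substitution is below, together with
one variant that might keep the multiply in place. The variant is only
an assumption on my part and was not part of the test above: reading
the factor through a volatile object means the compiler cannot treat it
as the constant 1.0. The function names are hypothetical.

```c
#include <stddef.h>

/* The substituted loop.  Multiplying by the literal 1.0 leaves every
 * ordinary value unchanged, so the compiler is free to reduce this
 * to a plain copy. */
void copy_times_one(double *c, const double *a, size_t n)
{
    for (size_t j = 0; j < n; j++)
        c[j] = 1.0 * a[j];
}

/* Hypothetical variant: the factor is re-read from a volatile object
 * on every iteration, so it cannot be folded away at compile time. */
void copy_times_volatile_one(double *c, const double *a, size_t n)
{
    volatile double one = 1.0;
    for (size_t j = 0; j < n; j++)
        c[j] = one * a[j];
}
```

Both loops store the same values as the plain copy; whether the second
one actually runs faster would have to be measured.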

-- 
Mike Kenney
UW Applied Physics Lab
mikek@apl.washington.edu


