Re: stream benchmark results

From: Mike Kenney
Date: Fri Aug 26 1994 - 15:14:02 CDT

John D. McCalpin writes:
> Thanks for the data. The rate for a plain copy is surprisingly slow,
> but the other numbers are quite good for a machine in that price range.
> --
> John D. McCalpin
> Assistant Professor
> College of Marine Studies, U. Del.

Looking at the assembler output from gcc, two 4-byte moves are
issued on each iteration of the plain-copy loop. The other loops
use the floating-point instructions, which can move 8 bytes at a
time. I would expect an 8-byte move instruction to approximately
double the speed of the first loop ... maybe a Pentium-optimized gcc
would do this.

BTW, I tried substituting c[j] = 1.0*a[j] in the first loop but
the compiler didn't fall for it ... it still issued the 4-byte
move instructions rather than an fp multiply.
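
For reference, here is a minimal sketch of the two loop variants being
compared. The function names and signatures are illustrative
assumptions, not taken from the benchmark source; only the loop bodies
correspond to what is described above.

```c
#include <stddef.h>

/* Plain copy, as in the first loop: each 8-byte double may be
 * moved as two 4-byte integer loads/stores by the compiler. */
void copy_plain(double *c, const double *a, size_t n)
{
    size_t j;
    for (j = 0; j < n; j++)
        c[j] = a[j];
}

/* Attempted workaround: multiply by 1.0 in the hope of forcing a
 * floating-point load/store. A compiler may simplify 1.0 * x back
 * to x, in which case the generated code matches copy_plain. */
void copy_times_one(double *c, const double *a, size_t n)
{
    size_t j;
    for (j = 0; j < n; j++)
        c[j] = 1.0 * a[j];
}
```

Both functions produce the same result for ordinary values, which is
why the compiler can treat them alike; comparing the assembler output
(for example with gcc -S) shows whether it actually did.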

Mike Kenney
UW Applied Physics Lab
