John D. McCalpin writes:
> Thanks for the data. The rate for a plain copy is surprisingly slow,
> but the other numbers are quite good for a machine in that price range.
> John D. McCalpin email@example.com
> Assistant Professor firstname.lastname@example.org
> College of Marine Studies, U. Del. John.McCalpin@mvs.udel.edu
Looking at the assembler output from gcc, there are two 4-byte moves
issued on each iteration of the plain-copy loop. The other loops
use the floating-point instructions, which can move 8 bytes at a
time. I would expect an 8-byte move instruction to approximately
double the speed of the first loop ... maybe a Pentium-optimized gcc would
BTW, I tried substituting c[j] = 1.0*a[j] in the first loop but
the compiler didn't fall for it ... it still issued the 4-byte
move instructions rather than an fp multiply.
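
For anyone who wants to inspect the assembler output themselves, the two
variants of the first loop described above can be sketched roughly as
follows. This is only an illustration: the array length N, the file name
copy.c, and the function names are assumptions, not taken from the
benchmark source. Compiling with something like "gcc -O2 -S copy.c" should
produce a copy.s file in which the moves can be examined; what appears
there will depend on the gcc version and target.

```c
/* copy.c: sketch of the two plain-copy variants discussed above.
 * N and the function names are hypothetical, chosen for illustration. */
#include <stddef.h>

#define N 1000              /* assumed array length */

double a[N], c[N];

/* Plain copy. No arithmetic is involved, so the compiler is free to
 * move each 8-byte double with integer moves (for example two 4-byte
 * moves on a 32-bit x86 target) instead of fp load/store. */
void copy_plain(void)
{
    size_t j;
    for (j = 0; j < N; j++)
        c[j] = a[j];
}

/* The same copy written as a multiply by 1.0. An optimizing compiler
 * may simplify this back to a plain copy, so it is not guaranteed to
 * generate an fp multiply. */
void copy_times_one(void)
{
    size_t j;
    for (j = 0; j < N; j++)
        c[j] = 1.0 * a[j];
}
```

Both functions leave the same values in c, so any difference in speed
comes only from the instructions the compiler selects for the move.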
-- Mike Kenney UW Applied Physics Lab email@example.com