These graphs illustrate the measured performance of our prototype SMC system on several benchmark kernels, for vectors of 16 to 8192 elements. Performance is reported as the average number of CPU cycles per stream access (cpa) for each benchmark, so lower is better. The green lines labeled "performance limit" indicate the minimum cpa attainable for the computation; these limits are due to SMC startup costs, unavoidable page misses, or the cost of moving data between the SMC and CPU chips. (The absolute limit on the i860 is two cycles for each memory access.) The black lines indicate the performance of our access ordering hardware. The red lines indicate the performance measured when using "normal" caching load instructions to access the stream data in the i860's own cache-optimized memory, and the blue lines indicate the performance measured when using the i860's non-caching pipelined floating-point load (pfld) instruction.
[Graphs: cycles per access versus vector length (16 to 8192 elements), one plot per benchmark kernel]
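To make the cycles-per-access metric concrete, here is a minimal sketch of how a cpa figure could be derived from a raw cycle count and compared against the two-cycle i860 limit quoted above. The function names, the three-stream kernel, and the cycle count are hypothetical illustrations, not measurements or code from the prototype.

```python
# Assumption: one "stream access" is one element read or written on one stream,
# so a kernel touches vector_length * streams elements in total.

I860_MIN_CPA = 2.0  # from the text: two cycles for each memory access


def cycles_per_access(total_cycles: int, vector_length: int, streams: int) -> float:
    """Average CPU cycles per stream access for a single kernel run."""
    accesses = vector_length * streams
    if accesses <= 0:
        raise ValueError("need at least one stream access")
    return total_cycles / accesses


def percent_of_limit(cpa: float, min_cpa: float = I860_MIN_CPA) -> float:
    """How close a measured cpa comes to the limit (100 means at the limit)."""
    if cpa <= 0:
        raise ValueError("cpa must be positive")
    return 100.0 * min_cpa / cpa


# Hypothetical run: a kernel with three streams over 1024-element vectors
# that took 7680 cycles in total (an invented number for illustration).
example_cpa = cycles_per_access(total_cycles=7680, vector_length=1024, streams=3)
example_pct = percent_of_limit(example_cpa)
```

In this invented example the kernel averages 2.5 cycles per access, which is 80% of the two-cycle limit. The per-kernel "performance limit" lines in the graphs would generally sit above two cycles, since they also account for startup costs and page misses.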