Re: What about Apple PowerPC ?

From: Robert Fridman (rfridman@ucalgary.ca)
Date: Mon Apr 12 1999 - 14:36:57 CDT


Denis COURTIER wrote:
>
> Hi !
>
> Is anybody knows something about benchmarks on Apple PowerPC ?
> I would like some information about perfs on I-Mac or Macintosh G3,
> but I can't find any on www.specbench.org , Wintune, and so on...
>
> My goal is to compare I-Mac and PC with Celeron or Pentium...

I have an Apple laptop (running Linux, 233 MHz PPC750 with .5 MB L2 cache and
a 66MHz bus).

I find that benchmarks don't reflect the intended use of the computer. For
example, the STREAM benchmark <http://www.cs.virginia.edu/stream/> reported a
modest 97MB/sec of memory bandwidth. I then tried STREAM with smaller data
sizes and was amazed that if the data could fit in the L2 cache, the bandwidth
tripled. So this architecture rewards good, tight programming. I included my
STREAM runs at the end of this email.
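
As a rough check on the cache effect, the sketch below recomputes the working
set for each run and compares the Copy rate with the largest run. It assumes
STREAM's usual layout of three arrays of 8-byte doubles and 1 MB = 2**20 bytes
(which reproduces the "Total memory required" figures in the runs), and it
takes the .5 MB L2 size quoted above. The function names are made up for this
illustration.

```python
# Assumption: STREAM allocates three arrays (a, b, c) of 8-byte doubles.
L2_CACHE_BYTES = 512 * 1024  # the .5 MB L2 cache mentioned above

# (array size, Copy rate in MB/s), copied from the runs at the end of this email
COPY_RUNS = [
    (10000, 338.9337),
    (15000, 238.8216),
    (20000, 182.1260),
    (40000, 121.2802),
    (80000, 102.7780),
    (100000, 100.9976),
    (250000, 98.7996),
    (500000, 98.2149),
    (1000000, 97.9756),
]


def working_set_bytes(n, arrays=3, elem_bytes=8):
    """Bytes touched by one run with arrays of n elements."""
    return n * arrays * elem_bytes


def summarize(runs=COPY_RUNS, cache_bytes=L2_CACHE_BYTES):
    """Return (n, size in MB, nominally fits in L2, rate relative to largest run)."""
    baseline = runs[-1][1]  # rate of the largest run, which is bound by main memory
    rows = []
    for n, rate in runs:
        size = working_set_bytes(n)
        rows.append((n, round(size / 2**20, 1), size <= cache_bytes,
                     round(rate / baseline, 2)))
    return rows


if __name__ == "__main__":
    for n, mb, fits, ratio in summarize():
        print(f"{n:>8}  {mb:>5.1f} MB  fits_in_L2={fits!s:<5}  {ratio:.2f}x")
```

The "fits" column is only a nominal comparison of sizes; it says nothing about
what else is occupying the cache during a run.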

Another 'benchmark' I tried was running the gforth (GNU Forth) benchmark
programs. My laptop performed well against a 450MHz Celeron. The results show
that the laptop is a lot slower on one test (siev, .5 the speed), 18% faster
on another (matrix), and 50% and 25% slower on the last two (bubble and fib).
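
The comparison can be redone from the user times in the gforth table at the
end of this email. This is a small sketch, not part of gforth; the helper name
is made up, and a ratio above 1 means the first machine took longer than the
reference.

```python
BENCHMARKS = ("siev", "bubble", "matrix", "fib")

# User times in seconds, copied from the gforth table at the end of this email
CELERON_450 = (2.39, 3.38, 4.13, 4.18)
POWERBOOK_G3 = (4.63, 5.06, 3.36, 5.25)


def relative_times(machine, reference):
    """Time on `machine` divided by time on `reference`, per benchmark."""
    return {name: round(m / r, 2)
            for name, m, r in zip(BENCHMARKS, machine, reference)}


if __name__ == "__main__":
    for name, ratio in relative_times(POWERBOOK_G3, CELERON_450).items():
        print(f"{name:<7} {ratio:.2f}x the Celeron's time")
```

Note that "faster" can be quoted either as less time or as more speed, and the
two percentages differ, so the ratio of times is the less ambiguous number.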

The architecture of my laptop (due to a slow system bus) rewards using the
cache. This seems to suit my applications. Benchmarks don't seem to reflect
this.

If anyone has any PII or PIII laptops running Linux, I'd be interested in their
performance numbers for STREAM and gforth.

-- 

Robert.

----------------------------------------------------------------------
Robert Fridman                  rfridman@ucalgary.ca
WurcNet Inc.                    University of Calgary
Calgary, Alberta                phone (403) 220-6779
Canada                          fax   (403) 284-4707

Array size = 10000, Offset = 0
Total memory required = 0.2 MB.
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:            338.9337     0.0005     0.0005     0.0005
Scale:           333.9580     0.0034     0.0005     0.0105
Add:             332.4415     0.0035     0.0007     0.0108
Triad:           338.5347     0.0007     0.0007     0.0007

Array size = 15000, Offset = 0
Total memory required = 0.3 MB.
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:            238.8216     0.0036     0.0010     0.0110
Scale:           219.7889     0.0011     0.0011     0.0012
Add:             204.4340     0.0067     0.0018     0.0122
Triad:           212.6390     0.0055     0.0017     0.0117

Array size = 20000, Offset = 0
Total memory required = 0.5 MB.
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:            182.1260     0.0071     0.0018     0.0219
Scale:           194.2932     0.0055     0.0016     0.0118
Add:             183.7676     0.0061     0.0026     0.0127
Triad:           176.2776     0.0067     0.0027     0.0129

Array size = 40000, Offset = 0
Total memory required = 0.9 MB.
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:            121.2802     0.0098     0.0053     0.0164
Scale:           113.4362     0.0057     0.0056     0.0059
Add:             117.1304     0.0120     0.0082     0.0185
Triad:           117.8348     0.0110     0.0081     0.0183

Array size = 80000, Offset = 0
Total memory required = 1.8 MB.
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:            102.7780     0.0175     0.0125     0.0335
Scale:            99.4795     0.0160     0.0129     0.0233
Add:             104.2630     0.0247     0.0184     0.0386
Triad:           105.4943     0.0222     0.0182     0.0384

Array size = 100000, Offset = 0
Total memory required = 2.3 MB.
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:            100.9976     0.0216     0.0158     0.0367
Scale:            99.6945     0.0178     0.0160     0.0260
Add:             104.8172     0.0275     0.0229     0.0531
Triad:           106.2185     0.0253     0.0226     0.0425

Array size = 250000, Offset = 0
Total memory required = 5.7 MB.
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:             98.7996     0.0463     0.0405     0.0815
Scale:            97.5706     0.0466     0.0410     0.0806
Add:             104.0945     0.0597     0.0576     0.0745
Triad:           104.3550     0.0575     0.0575     0.0576

Array size = 500000, Offset = 0
Total memory required = 11.4 MB.
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:             98.2149     0.0946     0.0815     0.1623
Scale:            97.6838     0.0860     0.0819     0.0992
Add:             104.3832     0.1187     0.1150     0.1313
Triad:           104.5187     0.1190     0.1148     0.1313

Array size = 1000000, Offset = 0
Total memory required = 22.9 MB.
Function      Rate (MB/s)   RMS time   Min time   Max time
Copy:             97.9756     0.1768     0.1633     0.2679
Scale:            97.7368     0.1641     0.1637     0.1655
Add:             104.4727     0.2302     0.2297     0.2320
Triad:           104.5920     0.2298     0.2295     0.2304

Some benchmark results for various combinations of hardware, Gforth version, and Gforth configuration. Unless specified otherwise, the default configurations were used. You can measure your combination with `make bench'. You can find a table comparing Gforth with six interpretive Forth systems in the manual (Section Performance), and a comparison with more systems in http://www.complang.tuwien.ac.at/papers/ertl&maierhofer95.ps.gz.

All times are given in seconds of user time.

   siev  bubble  matrix     fib  machine and configuration
   2.39    3.38    4.13    4.18  Celeron 450 (Mendocino); gcc-2.7.2.3 -DFORCE-REG -DDIRECT-THREADED; gforth-0.3.0; ELF
   4.63    5.06    3.36    5.25  Apple PowerBook G3 (233MHz); egcs-1.0.2 prerelease; gforth-0.4.0
   6.88    7.43    5.80    8.15  AMD K6-2 300MHz 1M PB cache 100MHz; gcc-2.7.2.3 -DFORCE-REG; gforth-0.4.0; ELF
   7.36    8.16    7.73    9.04  Pentium-MMX 200MHz 512K PB cache; egcs-1.1b -DFORCE-REG; gforth-0.4.0; ELF
  10.91   11.94   11.24   13.13  Pentium 133MHz 256K PB cache; gcc-2.6.3 -DFORCE_REG; gforth-0.1beta; a.out
  11.16   11.86   10.64   12.53  Pentium 133MHz 512K PB cache; gcc-2.7.2p -DFORCE_REG, gforth-0.3.0; ELF
  12.62   13.56   11.04   14.97  AMD K6 166MHz 512K PB cache; gcc-2.7.2p -DFORCE_REG, gforth-0.3.0; ELF
  11.81   14.39   13.61   15.07  IBM/Cyrix-6x86 133MHz (P166+) 512K PB cache; gcc-2.7.2.1 -DFORCE_REG -DDIRECT_THREADED; gforth-0.3.0; ELF
  12.08   11.90   11.06   12.09  Cyrix-6x86MX 166MHz (PR200) 512K PB Cache; gcc-2.7.2.1 -DFORCE_REG -DDIRECT_THREADED; gforth-0.3.0; ELF
  29.89   35.42   26.96   34.59  i486 66MHz 256K cache; gcc-2.6.3 -DFORCE_REG -DDIRECT_THREADED; gforth-0.1beta; a.out
  39.50   45.91   36.73   44.90  i486 50MHz 256K cache; gcc-2.7.0 -DFORCE_REG -DDIRECT_THREADED; gforth-0.1beta
  42.82   46.74   38.69   48.30  i486 50MHz 256K cache; gcc-2.7.0 -DFORCE_REG; gforth-0.1beta

   3.09    3.24    2.39    3.42  21164A (Alpha,164LX) 600MHz 2M cache; gcc-2.7.2.1+gas; gforth-0.3.0
   3.7     3.8     2.8     4.1   21164A (Alpha,PC164) 500MHz 2M cache; gcc-2.7.2.1+as (Digital Unix); gforth-0.3.0
   7.0     7.6     6.2     7.7   21064A (Alpha,Cabriolet) 300MHz 2M cache; gcc-2.7.2; gforth-0.2.0

   7.49    7.85    6.21    8.07  R4400 250 Mhz 2Mb cache; gcc-2.7.2.2
   8.17    9.01    6.24    9.35  R10000 (SGI PowerChallenge XL) 195MHz 2M cache; gcc-2.7.2 -DFORCE_REG; gforth-0.3.0
  17.3    19.0    14.1    18.3   R4000 (DecStation 5000/150) 100MHz 1M cache; gcc-2.4.5; gforth-0.1beta
  50.9    56.8    42.4    52.0   R3000 (DecStation 5000/200) 25MHz 64K+64K cache; gcc-2.5.8 -DFORCE_REG; gforth-0.1beta

   7.8     8.6     7.0    10.3   UltraSparc-II 248MHz; Solaris.5.5.1; gcc-2.7.1; gforth-0.3.0
  28.5    31.1    26.3    33.3   SuperSparc (Sparcstation 10) 40MHz; Solaris.5.5.1; gcc-2.7.1; gforth-0.3.0
  59.5    65.8    69.5    61.9   FJMB86903 (SPARC ELC) 33MHz; gcc-2.5.8; gforth-0.1beta
  84.34   91.49   76.16   88.83  L64801 25MHz (SPARC IPC) 64K WT cache; gcc-2.4.5; gforth-0.1beta

  11.6    12.1    10.8    15.6   PA8000 (HP C160) 160MHz 64M RAM; gcc-2.7.2; gforth-0.3.0
  30.0    34.1    20.5    33.0   PA-RISC 1.1 (HP 720) 50MHz 64K cache; gcc-2.6.3 -DDIRECT_THREADED; gforth-0.1beta

   6.81    7.53    5.10    8.12  PPC604e (PowerMac) 200MHz; Linux; gcc-2.7.2.1; gforth-0.4.0
   8.25   10.09    6.45   10.34  PPC604e (PowerMac) 200MHz; Linux; gcc-2.7.2.1; gforth-0.3.0 (indirect threaded)
  14.05   16.96   11.14   17.51  PPC604 (PowerMac) 132MHz 256K L2 cache; MkLinux 2.1; gcc-2.7.2.1; gforth-0.3.0


