Long awaited Max memory bandwidth results

From: Patrick F. McGehearty (patrick@mozart.convex.com)
Date: Tue Oct 29 1991 - 13:51:02 CST


Here is the output from a series of runs: 1- and 2-processor tests
with real*4 and real*8 data on the C3200 series and C3400 series
(8 sets of results in total).
I could not get access to a fully configured C3800 to run this benchmark,
as the few we have are dedicated to paying customer benchmarks and final
product development. Next quarter I expect to have some impressive
results (the clock speed difference suggests a 2.4x increase in speed).

As we have discussed before, these benchmarks don't properly represent
our compiler's ability to reuse vector data in many common loops, such
as linpack and matrix multiply, and so obtain a MUL-ADD for each memory
reference. However, they do measure one important aspect of total
system performance.

Each set of results is preceded by a line of +'s and a one-line
description of the machine configuration it ran on.

- Patrick McGehearty (patrick@convex.com)
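
The transfer rates in the tables below appear to follow from the timings
as bytes moved divided by elapsed time. The sketch below illustrates that
relationship under assumptions; it is not the benchmark's own code. The
array length of 500,000 elements is inferred from the reported numbers
rather than stated in the post, as are the per-kernel array counts (2 for
Assignment and Scaling, 3 for Summing and SAXPYing) and the convention
1 MB = 10^6 bytes. The names ARRAY_LEN, ARRAYS_TOUCHED, and
transfer_rate_mb_s are hypothetical.

```python
# Assumption: array length inferred from the reported figures, not stated in the post.
ARRAY_LEN = 500_000

# Assumption: number of arrays read or written per loop iteration for each kernel.
ARRAYS_TOUCHED = {
    "Assignment": 2,  # a(i) = b(i)
    "Scaling": 2,     # a(i) = q * b(i)
    "Summing": 3,     # a(i) = b(i) + c(i)
    "SAXPYing": 3,    # a(i) = b(i) + q * c(i)
}


def transfer_rate_mb_s(kernel, seconds, bytes_per_real, n=ARRAY_LEN):
    """Return the memory transfer rate in MB/s (1 MB = 10**6 bytes)."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * bytes_per_real * n
    return bytes_moved / seconds / 1.0e6


# C3210, real*4, Assignment, RMS time 0.0435 s
print(transfer_rate_mb_s("Assignment", 0.0435, 4))
# C3210, real*8, Summing, RMS time 0.0666 s
print(transfer_rate_mb_s("Summing", 0.0666, 8))
```

Under these assumptions the C3210 real*4 Assignment RMS time of 0.0435 s
gives roughly 92 MB/s, in line with the 91.9782 MB/s reported. Small
differences are expected because the times are printed to only four
decimal places.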

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Test run on C3210 with 32 way interleave, real = 4 bytes
--------------------------------------
 Single precision appears to have 7 digits of accuracy
 Assuming 4 bytes per default REAL word
--------------------------------------
Timing calibration ; time = 4.399200 hundredths of a second
Increase the size of the arrays if this is <30
 and your clock precision is =<1/100 second
Or ignore variations in timing and just look at
 totals
---------------------------------------------------
Function      RMS time   Min time   Max time
Assignment:     0.0435     0.0428     0.0442
Scaling   :     0.0437     0.0434     0.0440
Summing   :     0.0554     0.0551     0.0557
SAXPYing  :     0.0554     0.0551     0.0558

Memory transfer rates in MB/s

Function         Total        RMS       Best      Worst
Assignment:    91.9481    91.9782    93.4448    90.4466
Scaling   :    91.5923    91.6300    92.1766    90.8945
Summing   :   108.3314   108.3740   108.8673   107.8128
SAXPYing  :   108.2118   108.2583   108.9701   107.4498

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Test run on C3210 with 32 way interleave, real = 8 bytes
Test #1 Failed = picalc=piexact
Apparently Single=Double Precision
Proceeding to Test #2
 
--------------------------------------
 Single precision appears to have 16 digits of accuracy
 Assuming 8 bytes per default REAL word
--------------------------------------
Timing calibration ; time = 4.45790000000000 hundredths of a second
Increase the size of the arrays if this is <30
 and your clock precision is =<1/100 second
Or ignore variations in timing and just look at
 totals
---------------------------------------------------
Function      RMS time   Min time   Max time
Assignment:     0.0447     0.0444     0.0451
Scaling   :     0.0449     0.0446     0.0453
Summing   :     0.0666     0.0662     0.0671
SAXPYing  :     0.0669     0.0664     0.0677

Memory transfer rates in MB/s

Function         Total        RMS       Best      Worst
Assignment:   178.8493   178.9316   180.0950   177.2971
Scaling   :   178.0718   178.1687   179.4245   176.5576
Summing   :   180.0483   180.1261   181.3565   178.7603
SAXPYing  :   179.3529   179.4383   180.8209   177.2473

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Test run on C3220 with 32 way interleave, real = 4 bytes
--------------------------------------
 Single precision appears to have 7 digits of accuracy
 Assuming 4 bytes per default REAL word
--------------------------------------
Timing calibration ; time = 2.309900 hundredths of a second
Increase the size of the arrays if this is <30
 and your clock precision is =<1/100 second
Or ignore variations in timing and just look at
 totals
---------------------------------------------------
Function      RMS time   Min time   Max time
Assignment:     0.0226     0.0224     0.0230
Scaling   :     0.0227     0.0225     0.0236
Summing   :     0.0292     0.0284     0.0324
SAXPYing  :     0.0291     0.0287     0.0294

Memory transfer rates in MB/s

Function         Total        RMS       Best      Worst
Assignment:   177.0076   177.1239   178.6831   173.7544
Scaling   :   175.8365   175.9640   177.5331   169.1691
Summing   :   205.5435   205.5453   210.9705   184.9742
SAXPYing  :   206.1679   206.3336   208.7541   203.7487

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Test run on C3220 with 32 way interleave, real = 8 bytes
Test #1 Failed = picalc=piexact
Apparently Single=Double Precision
Proceeding to Test #2
 
--------------------------------------
 Single precision appears to have 16 digits of accuracy
 Assuming 8 bytes per default REAL word
--------------------------------------
Timing calibration ; time = 2.39510000000000 hundredths of a second
Increase the size of the arrays if this is <30
 and your clock precision is =<1/100 second
Or ignore variations in timing and just look at
 totals
---------------------------------------------------
Function      RMS time   Min time   Max time
Assignment:     0.0243     0.0239     0.0252
Scaling   :     0.0242     0.0240     0.0248
Summing   :     0.0363     0.0356     0.0396
SAXPYing  :     0.0360     0.0357     0.0369

Memory transfer rates in MB/s

Function         Total        RMS       Best      Worst
Assignment:   329.3360   329.6101   334.2665   318.0662
Scaling   :   329.6278   329.9916   333.0281   322.5416
Summing   :   330.3847   330.5466   337.2776   302.8085
SAXPYing  :   332.7483   333.0980   336.4832   325.0007

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Test run on C3410 with 32 way interleave, real = 4 bytes
--------------------------------------
 Single precision appears to have 7 digits of accuracy
 Assuming 4 bytes per default REAL word
--------------------------------------
Timing calibration ; time = 2.812000 hundredths of a second
Increase the size of the arrays if this is <30
 and your clock precision is =<1/100 second
Or ignore variations in timing and just look at
 totals
---------------------------------------------------
Function      RMS time   Min time   Max time
Assignment:     0.0262     0.0258     0.0270
Scaling   :     0.0269     0.0265     0.0276
Summing   :     0.0406     0.0382     0.0491
SAXPYing  :     0.0395     0.0389     0.0414

Memory transfer rates in MB/s

Function         Total        RMS       Best      Worst
Assignment:   152.7073   152.8022   155.1109   148.3459
Scaling   :   148.4775   148.5742   150.9206   144.7491
Summing   :   148.1353   147.6949   157.0846   122.0827
SAXPYing  :   151.6438   151.7295   154.4086   144.8397

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Test run on C3410 with 32 way interleave, real = 8 bytes
Test #1 Failed = picalc=piexact
Apparently Single=Double Precision
Proceeding to Test #2
 
--------------------------------------
 Single precision appears to have 16 digits of accuracy
 Assuming 8 bytes per default REAL word
--------------------------------------
Timing calibration ; time = 5.60450000000000 hundredths of a second
Increase the size of the arrays if this is <30
 and your clock precision is =<1/100 second
Or ignore variations in timing and just look at
 totals
---------------------------------------------------
Function      RMS time   Min time   Max time
Assignment:     0.0466     0.0450     0.0482
Scaling   :     0.0469     0.0464     0.0475
Summing   :     0.0687     0.0670     0.0761
SAXPYing  :     0.0689     0.0672     0.0765

Memory transfer rates in MB/s

Function         Total        RMS       Best      Worst
Assignment:   171.6270   171.7024   177.5923   165.9476
Scaling   :   170.5793   170.6833   172.5216   168.4388
Summing   :   174.5774   174.5629   179.0243   157.6086
SAXPYing  :   174.2757   174.2588   178.4493   156.7972

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Test run on C3420 with 32 way interleave, real = 4 bytes
--------------------------------------
 Single precision appears to have 7 digits of accuracy
 Assuming 4 bytes per default REAL word
--------------------------------------
Timing calibration ; time = 1.751800 hundredths of a second
Increase the size of the arrays if this is <30
 and your clock precision is =<1/100 second
Or ignore variations in timing and just look at
 totals
---------------------------------------------------
Function      RMS time   Min time   Max time
Assignment:     0.0170     0.0165     0.0179
Scaling   :     0.0172     0.0167     0.0180
Summing   :     0.0232     0.0228     0.0254
SAXPYing  :     0.0237     0.0231     0.0270

Memory transfer rates in MB/s

Function         Total        RMS       Best      Worst
Assignment:   235.0922   235.2967   242.6743   224.0268
Scaling   :   232.0791   232.3044   238.8061   222.5933
Summing   :   258.4247   258.6258   263.7243   236.6210
SAXPYing  :   252.9180   252.9793   259.9990   221.8773

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Test run on C3420 with 32 way interleave, real = 8 bytes
Test #1 Failed = picalc=piexact
Apparently Single=Double Precision
Proceeding to Test #2
 
--------------------------------------
 Single precision appears to have 16 digits of accuracy
 Assuming 8 bytes per default REAL word
--------------------------------------
Timing calibration ; time = 2.90190000000000 hundredths of a second
Increase the size of the arrays if this is <30
 and your clock precision is =<1/100 second
Or ignore variations in timing and just look at
 totals
---------------------------------------------------
Function      RMS time   Min time   Max time
Assignment:     0.0298     0.0277     0.0338
Scaling   :     0.0298     0.0293     0.0316
Summing   :     0.0414     0.0396     0.0492
SAXPYing  :     0.0406     0.0400     0.0419

Memory transfer rates in MB/s

Function         Total        RMS       Best      Worst
Assignment:   268.6195   268.5465   289.1531   236.6094
Scaling   :   267.9403   268.1232   273.1121   253.1405
Summing   :   290.0807   289.7490   302.7016   243.8826
SAXPYing  :   295.0962   295.3977   300.3003   286.6767
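
Reading the model numbers as one-processor (C3210, C3410) and
two-processor (C3220, C3420) configurations, which the post implies but
does not spell out, the RMS Assignment rates can be compared to see how
bandwidth scales with the second processor. This is a minimal sketch:
the names ASSIGNMENT_RMS and speedup are hypothetical, and the figures
are copied from the RMS column of the tables above.

```python
# RMS Assignment rates in MB/s, copied from the tables above.
# Key: (series, bytes per REAL); value: (one-processor rate, two-processor rate).
# Assumption: C3x10 is the one-processor and C3x20 the two-processor configuration.
ASSIGNMENT_RMS = {
    ("C3200", 4): (91.9782, 177.1239),
    ("C3200", 8): (178.9316, 329.6101),
    ("C3400", 4): (152.8022, 235.2967),
    ("C3400", 8): (171.7024, 268.5465),
}


def speedup(one_cpu_rate, two_cpu_rate):
    """Return the ratio of two-processor to one-processor transfer rate."""
    return two_cpu_rate / one_cpu_rate


for (series, real_bytes), (one, two) in ASSIGNMENT_RMS.items():
    print(f"{series} real*{real_bytes}: {speedup(one, two):.2f}x")
```

On these figures the second processor adds proportionally more bandwidth
on the C3200 series than on the C3400 series.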


