primordium 207% hinv -mv
Location: /hw/node
    PM10270MHZ Board: barcode JKA977 part 030-1432-001 rev D
Location: /hw/node/xtalk/15
    IP30 Board: barcode JZD561 part 030-1467-001 rev D
Location: /hw/node/xtalk/15/pci/2
    FP1 Board: barcode KCD184 part 030-0891-003 rev F
    PWR.SPPLY.ER Board: barcode AAE9380623 part 060-0035-002 rev A
Location: /hw/node/xtalk/12
    MOT10 Board: barcode KCN026 part 030-1241-002 rev J
1 270 MHZ IP30 Processor
Heart ASIC: Revision F
CPU: MIPS R12000 Processor Chip Revision: 2.3
FPU: MIPS R12010 Floating Point Chip Revision: 0.0
Main memory size: 640 Mbytes
Xbow ASIC: Revision 1.4
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 2 Mbytes
Integral SCSI controller 0: Version QL1040B (rev. 2), single ended
  Disk drive: unit 1 on SCSI controller 0 (unit 1)
Integral SCSI controller 1: Version QL1040B (rev. 2), single ended
IOC3/IOC4 serial port: tty1
IOC3/IOC4 serial port: tty2
IOC3 parallel port: plp1
Graphics board: ESI
Integral Fast Ethernet: ef0, version 1, pci 2
Iris Audio Processor: version RAD revision 12.0, number 1
PCI Adapter ID (vendor 4265, device 3) pci slot 2
PCI Adapter ID (vendor 4215, device 4128) pci slot 0
PCI Adapter ID (vendor 4215, device 4128) pci slot 1
PCI Adapter ID (vendor 4265, device 5) pci slot 3
primordium 208% ls -l
total 184
-rw-r--r--    1 nasko    user         652 Sep  3 23:15 FILES
-rw-r--r--    1 nasko    user         639 Sep  3 23:15 Parallel_jobs
-rw-r--r--    1 nasko    user        3342 Sep  3 23:15 README
-rw-r--r--    1 nasko    user         287 Sep  3 23:15 second_cpu.c
-rw-r--r--    1 nasko    user         483 Sep  3 23:15 second_cpu.f
-rw-r--r--    1 nasko    user         680 Sep  3 23:15 second_wall.c
-rw-r--r--    1 nasko    user        5555 Sep  3 23:15 stream_d.c
-rw-r--r--    1 nasko    user       14398 Sep  3 23:15 stream_d.f
-rw-r--r--    1 nasko    user        5705 Sep  3 23:15 stream_d_omp.c
-rw-r--r--    1 nasko    user       18341 Sep  3 23:15 stream_mpi.f
-rw-r--r--    1 nasko    user       15233 Sep  3 23:15 stream_tuned.f
primordium 209% cc -version
MIPSpro Compilers: Version 7.30
primordium 210% cc -O3 -mips4 stream_d.c second_cpu.c -lm -o stream_d
stream_d.c:
second_cpu.c:
cc-1174 cc: WARNING File = second_cpu.c, Line = 8
  The variable "sec" was declared but never referenced.

          long sec;
               ^

ld32: WARNING 84 : /usr/lib32/mips4/libm.so is not used for resolving any symbol.
primordium 211% cc -O3 -mips4 -64 stream_d.c second_cpu.c -lm -o stream_d_64
stream_d.c:
second_cpu.c:
cc-1174 cc: WARNING File = second_cpu.c, Line = 8
  The variable "sec" was declared but never referenced.

          long sec;
               ^

ld64: WARNING 84 : /usr/lib64/mips4/libm.so is not used for resolving any symbol.
primordium 212% f90 -O3 -mips4 stream_d.f second_cpu.f -o stream_d_f
stream_d.f:
second_cpu.f:
primordium 213% f90 -O3 -mips4 -64 stream_d.f second_cpu.f -o stream_d_f_64
stream_d.f:
second_cpu.f:
primordium 214% f90 -O3 -mips4 stream_tuned.f second_cpu.f -o stream_tuned
stream_tuned.f:
second_cpu.f:
primordium 215% f90 -O3 -mips4 -64 stream_tuned.f second_cpu.f -o stream_tuned_64
stream_tuned.f:
second_cpu.f:
primordium 216% ./stream_d
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 8000000, Offset = 0
Total memory required = 183.1 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 9999 microseconds.
Each test below will take on the order of 240000 microseconds.
   (= 24 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:          355.5559       0.3651       0.3600       0.3800
Scale:         355.5568       0.3671       0.3600       0.3800
Add:           408.5106       0.4840       0.4700       0.4900
Triad:         408.5112       0.4760       0.4700       0.4900
primordium 217% ./stream_d_64
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 8000000, Offset = 0
Total memory required = 183.1 MB.
Each test is run 10 times, but only
the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 9999 microseconds.
Each test below will take on the order of 279999 microseconds.
   (= 28 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function      Rate (MB/s)   RMS time     Min time     Max time
Copy:          376.4706       0.3481       0.3400       0.3600
Scale:         345.9461       0.3730       0.3700       0.3800
Add:           408.5104       0.4840       0.4700       0.4900
Triad:         417.3921       0.4700       0.4600       0.4800
primordium 218% ./stream_d_f
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 0
 The total memory requirement is 45 MB
 You are running each test 10 times
 --
 The *best* time for each test is used
 *EXCLUDING* the first and last iterations
----------------------------------------------------
 Your clock granularity/precision appears to be 10000 microseconds
----------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:          400.0000       0.0850       0.0800       0.0900
Scale:         355.5556       0.0900       0.0900       0.0900
Add:           400.0000       0.1200       0.1200       0.1200
Triad:         400.0000       0.1200       0.1200       0.1200
----------------------------------------------------
 Solution Validates!
----------------------------------------------------
primordium 219% ./stream_d_f_64
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size = 2000000
 Offset = 0
 The total memory requirement is 45 MB
 You are running each test 10 times
 --
 The *best* time for each test is used
 *EXCLUDING* the first and last iterations
----------------------------------------------------
 Your clock granularity/precision appears to be 10000 microseconds
----------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:          400.0000       0.0838       0.0800       0.0900
Scale:         355.5556       0.0950       0.0900       0.1000
Add:           436.3636       0.1163       0.1100       0.1200
Triad:         436.3636       0.1187       0.1100       0.1200
----------------------------------------------------
 Solution Validates!
----------------------------------------------------
primordium 220%
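
Note on reading the rates above: the Rate (MB/s) column appears to follow directly from the array size and the Min time column. The sketch below is not part of the STREAM distribution; it assumes the usual STREAM accounting (Copy and Scale touch two arrays per element, Add and Triad touch three) and 1 MB = 10^6 bytes. The helper name stream_rate_mb_s and the table ARRAYS_TOUCHED are hypothetical, introduced only for illustration.

```python
# Hypothetical helper for checking the reported rates; not part of STREAM.
BYTES_PER_WORD = 8  # "8 bytes per DOUBLE PRECISION word", as reported by the runs

# Arrays read plus arrays written per element, assuming the standard
# STREAM kernel definitions (an assumption, not shown in the transcript).
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_rate_mb_s(kernel, array_size, min_time_s):
    """Return bandwidth in MB/s (1 MB = 10**6 bytes) for one kernel."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_WORD * array_size
    return bytes_moved / min_time_s / 1.0e6


if __name__ == "__main__":
    # Min times taken from the 32-bit C run (array size 8000000).
    c_run = [("Copy", 0.36), ("Scale", 0.36), ("Add", 0.47), ("Triad", 0.47)]
    for kernel, min_time in c_run:
        rate = stream_rate_mb_s(kernel, 8_000_000, min_time)
        print(f"{kernel + ':':8s}{rate:10.4f}")
```

The recomputed values agree with the transcript to within the rounding of the printed Min time. Because the timer resolves only about 10 ms, the rates are quantized: the Fortran runs (array size 2000000) finish a kernel in 8 to 12 clock ticks, below the 20 ticks the C version recommends, so a one-tick difference moves Add from 400.0 to 436.4 MB/s.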