-------------------------------------------------------------------------
Revisions as of Thu, Jan 17, 2013 3:50:01 PM

Version 5.10 of stream.c has been released.
This version includes improved validation code and will automatically
use 64-bit array indices on 64-bit systems to allow very large arrays.

-------------------------------------------------------------------------
Revisions as of Thu Feb 19 08:16:57 CST 2009

Note that the codes in the "Versions" subdirectory should be
considered obsolete -- the versions of stream.c and stream.f
in this main directory include the OpenMP directives and structure
for creating "TUNED" versions.

Only the MPI version in the "Versions" subdirectory should be
of any interest, and I have not recently checked that version for
errors or compliance with the current versions of stream.c and
stream.f.

I added a simple Makefile to this directory. It works under Cygwin
on my Windows XP box (using gcc and g77).

A user suggested a sneaky trick for "mysecond.c" -- instead of using
the #ifdef UNDERSCORE to generate the function name that the Fortran
compiler expects, the new version simply defines both "mysecond()"
and "mysecond_()", so it should automagically link with most Fortran
compilers.

-------------------------------------------------------------------------
Revisions as of Wed Nov 17 09:15:37 CST 2004

The most recent "official" versions have been renamed "stream.f" and
"stream.c" -- all other versions have been moved to the "Versions"
subdirectory.

The "official" timer (was "second_wall.c") has been renamed "mysecond.c".
This is embedded in the C version ("stream.c"), but still needs to be
externally linked to the FORTRAN version ("stream.f").
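
For readers linking the timer to the Fortran code, the two-symbol
arrangement described in the 2009 note might look roughly like the
sketch below. This is an illustration, not a copy of the distributed
"mysecond.c": it assumes a POSIX system where gettimeofday() is
available, and the function body is a guess at a reasonable wall-clock
timer. Only the names "mysecond" and "mysecond_" come from the notes
above.

```c
/* Sketch of a timer file exporting both symbol spellings.
 * Assumption: POSIX gettimeofday() is available. */
#include <stddef.h>     /* NULL */
#include <sys/time.h>   /* gettimeofday, struct timeval */

/* Wall-clock time in seconds, as called from C. */
double mysecond(void)
{
    struct timeval tp;
    gettimeofday(&tp, NULL);
    return (double)tp.tv_sec + (double)tp.tv_usec * 1.0e-6;
}

/* Same routine under the name that many Fortran compilers
 * (g77 and gfortran by default) generate for a call to MYSECOND:
 * lower case with one trailing underscore. */
double mysecond_(void)
{
    return mysecond();
}
```

Because both symbols are always defined, the same object file should
link whether or not the Fortran compiler appends an underscore, which
is why no #ifdef is needed. Compilers that upper-case external names
would still need separate handling.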
-------------------------------------------------------------------------
Revisions as of Tue May 27 11:51:23 CDT 2003

Copyright and License info added to stream_d.f, stream_mpi.f, and
stream_tuned.f

-------------------------------------------------------------------------
Revisions as of Tue Apr 8 10:26:48 CDT 2003

I changed the name of the timer interface from "second" to "mysecond"
and removed the dummy argument in all versions of the source code (but
not the "Contrib" versions).

-------------------------------------------------------------------------
Revisions as of Mon Feb 25 06:48:14 CST 2002

Added an OpenMP version of stream_d.c, called stream_d_omp.c. This is
still not up to date with the Fortran version, which includes error
checking and advanced data flow to prevent overoptimization, but it is
a good start....

-------------------------------------------------------------------------
Revisions as of Tue Jun 4 16:31:31 EDT 1996

I have fixed an "off-by-one" error in the RMS time calculation in
stream_d.f. This was already corrected in stream_d.c. No results are
invalidated, since I use minimum time instead of RMS time anyway....

-------------------------------------------------------------------------
Revisions as of Fri Dec 8 14:49:56 EST 1995

I have renamed the timer routines to:
	second_cpu.c
	second_wall.c
	second_cpu.f
All have a function interface named 'second' which returns a double
precision floating point number. It should be possible to link
second_wall.c with stream_d.f without too much trouble, though the
details will depend on your environment.

If anyone builds versions of these timers for machines running the
Macintosh O/S or DOS/Windows, I would appreciate getting a copy.

To clarify:
  * For single-user machines, the wallclock timer is preferred.
  * For parallel machines, the wallclock timer is required.
  * For time-shared systems, the cpu timer is more reliable,
    though less accurate.
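
The difference between the two kinds of timer in the list above can be
sketched in C as follows. This is a minimal illustration, not the
contents of second_wall.c or second_cpu.c: the function names
wall_seconds() and cpu_seconds() are hypothetical, and the sketch
assumes a POSIX system with gettimeofday() and the standard C clock().

```c
/* Two hypothetical timers: elapsed (wall-clock) time vs. CPU time.
 * Assumption: POSIX gettimeofday() and standard C clock(). */
#include <stddef.h>     /* NULL */
#include <sys/time.h>   /* gettimeofday, struct timeval */
#include <time.h>       /* clock, CLOCKS_PER_SEC */

/* Elapsed real time in seconds. Includes any time the process
 * spent waiting while other jobs ran. */
double wall_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1.0e-6;
}

/* Processor time charged to this process, in seconds. Excludes
 * time spent waiting, but typically has coarser resolution. */
double cpu_seconds(void)
{
    return (double)clock() / (double)CLOCKS_PER_SEC;
}
```

Either routine would be called immediately before and after a kernel,
with the difference taken as the kernel time. On a multi-threaded run,
clock() on many systems reports CPU time summed over all threads, so
it no longer reflects how long the kernel took; that is one reason the
wallclock timer is required for parallel machines.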
-------------------------------------------------------------------------
Revisions as of Wed Oct 25 09:40:32 EDT 1995

(1) NOTICE to C users: stream_d.c has been updated to version 4.0 (beta),
    and should be functionally identical to stream_d.f

    Two timers are provided --- second_cpu.c and second_wall.c
    second_cpu.c measures cpu time, while second_wall.c measures
    elapsed (real) time.

    For single-user machines, the wallclock timer is preferred.
    For parallel machines, the wallclock timer is required.
    For time-shared systems, the cpu timer is more reliable,
    though less accurate.

(2) cstream.c has been removed -- use stream_d.c

(3) stream_wall.f has been removed --- to do parallel aggregate
    bandwidth runs, comment out the definition of FUNCTION SECOND
    in stream_d.f and compile/link with second_wall.c

(4) stream_offset has been deprecated. It is still here
    and usable, but stream_d.f is the "standard" version.
    There are easy hooks in stream_d.f to change the
    array offsets if you want to.

(5) The rules of the game are clarified as follows:

    The reference case uses array sizes of 2,000,000 elements
    and no additional offsets. I would like to see results
    for this case.

    But, you are free to use any array size and any offset
    you want, provided that the arrays are each bigger than
    the last-level of cache. The output will show me what
    parameters you chose.

    I expect that I will report just the best number, but
    if there is a serious discrepancy between the reference
    case and the "best" case, I reserve the right to report
    both.

    Of course, I also reserve the right to reject any results
    that I do not trust....
--
John D. McCalpin, Ph.D.
john@mccalpin.com