TESTFCPU

Introduction, Short description, How-To, Download, Feedback, Links, Disclaimer

Introduction

So, the main question remains: How fast is that computer?

A long time ago I wrote this small and simple program to estimate computer performance on double-precision floating-point operations.

I was surprised to find that under Unix/Linux my code ran 50-80% faster than under M$ Window$ on the same hardware! As an example, just compare the different operating systems on my Athlon-XP 1.3 GHz laptop:

Benchmark measurement results

Complete list:
"__MTOPS________CPU_@_freq.__(GHz)___________OS__________comments/flags_______",
"  0.20|        Intel Pentium 0.10|             Win95|                   Win32",
"  0.40| *      Intel Pentium 0.18|             Win95|                   Win32",
"  0.68|     Motorola MPC8260 0.20|  MontaVista Linux|                  no FPU",
"  0.73|        Intel Pentium 0.16|       FreeBSD 3.3|                        ",
"  0.96|        Intel Celeron 0.40|             Win95|                   Win32",
"  1.31|    Intel Pentium Pro 0.20|       OpenBSD 3.8|gcc335 -s pentiumpro -O3",
"  1.39|            Intel PII 0.40|             Win95|                   Win32",
"  1.64|        UltraSPARC 10 0.44|   Sun Solaris 2.7|                        ",
"  1.75|        Intel Celeron 0.40|          RH Linux|                        ",
"  1.92|            Intel PII 0.40|       FreeBSD 4.8|                        ",
"  3.25|      Dual Intel PIII 0.75|      RH Linux 7.3|                        ",
"  3.87|         AMD AthlonXP 1.30|           Win2000|                   Win32",
"  4.11|           Intel PIII 0.66|       FreeBSD 5.4|                     -O3",
"  4.27|        Intel Celeron 1.70|           Win2000|                   Win32",
"  4.33|           Intel PIII 0.66|       FreeBSD 5.4|     -mtune=pentium3 -O3",
"  6.06|         AMD AthlonXP 1.30|  RH Linux 7.3/2.4|                        ",
"  6.57|           Intel PIII 1.00|    Knoppix/2.6.11|                     -O3",
"  6.82|         AMD AthlonXP 1.30|  RH Linux 7.3/2.4|                     -O3",
"  7.78|         AMD AthlonXP 1.30|       FreeBSD 5.4|                     -O0",
"  8.33|         AMD AthlonXP 1.30|       FreeBSD 5.4|                     -O3",
"  9.61|         AMD AthlonXP 2.10|  Fedora FC2 Linux|                       ?",
"  9.85|         AMD AthlonXP 1.30|       FreeBSD 5.4|    -mtune=athlon-xp -O3",
" 10.31|           PowerPC G4 1.33|     Mac OS 10.4.6|      gcc400 (Apple) -O3",
" 10.55|        AMD Athlon XP 1.40|       FreeBSD 6.1|gcc344 -s -athlon-xp -O3",
" 10.67|         AMD AthlonXP 2.10|  Fedora FC2 Linux|                     -O3",
" 10.68|             Intel P4 2.20|            Win XP|         optimized Win32",
" 10.95|     Intel P4 Xeon HT 2.80|       FreeBSD 5.4|                     -O3",
" 11.15|     Intel P4 Xeon HT 2.80|       FreeBSD 5.4|         -ffast-math -O3",
" 11.98|   AMD AthlonMP 2400+ 2.00|    FreeBSD 6.0 b5|              gcc344 -O3",
" 12.67|   Intel Celeron M380 1.60|            Win XP|         optimized Win32",
" 12.85|   Intel Xeon HT/EM64 2.80|#      FreeBSD 5.4|         -ffast-math -O3",
" 13.23|           Intel Xeon 3.00|          SuSe 9.1|    gcc333 -s nocona -O3",
" 13.44|   Intel Xeon HT/EM64 3.00|       FreeBSD 6.0|       -mtune=nocona -O3",
" 14.08|    AMD Sempron 2600+ 1.68|           Win2000|         optimized Win32",
" 14.30|         AMD Athlon64 1.80|#      FreeBSD 5.4|         -ffast-math -O3",
" 14.94|     Intel Core 2 Duo 2.16|   Mac OS X 10.4.8|              gcc401 -O3",
" 14.94|  Dual AMD Opteron242 1.60|#     FreeBSD 5.4?|                     -O3",
" 15.47|         AMD Athlon64 1.80|            Win XP|         optimized Win32",
" 15.47|          AMD Sempron 1.80|            Win XP|         optimized Win32",
" 15.83|  Dual AMD Opteron242 1.60|#     FreeBSD 5.4?|                       ?",
" 16.02|  AMD Sempron64 3400+ 2.00|       FreeBSD 6.1|    gcc344 -mtune=k8 -O3",
" 16.84|         AMD Athlon64 1.80|#      FreeBSD 5.4|                     -O0",
" 17.05|         AMD Athlon64 1.80|#      FreeBSD 5.4|         -ffast-math -O0",
" 17.73|  AMD Sempron64 3400+ 2.00|        Win XP SP2|         optimized Win32",
" 17.97|  AMD Sempron64 3400+ 2.00|       FreeBSD 6.1|ffast-math -mtune=k8 -O3",
" 18.22|         AMD Athlon64 1.80|#      FreeBSD 5.4|                     -O3",
" 18.87|   Intel Xeon HT/EM64 2.80|#      FreeBSD 5.4|                     -O0",
" 19.00|  AMD Sempron64 3400+ 2.00|#      FreeBSD 6.1|    gcc344 -mtune=k8 -O2",
" 20.15|   Intel Xeon HT/EM64 2.80|#      FreeBSD 5.4|                     -O3",
" 20.78|   Intel Xeon HT/EM64 2.80|#      FreeBSD 5.4|       -mtune=nocona -O3",
" 21.11|        AMD Turion X2 1.60|#        SuSe 10.2|              gcc412 -O3",
" 22.17|  Intel Pentium D 945 3.40|#       Linux FC 5|              gcc410 -O3",
" 22.17| *  AMD Sempron 2600+ 1.76|#         SuSe 9.3|              gcc335 -O3",
" 22.51|  Intel Pentium D 945 3.40|#       Linux FC 5|  gcc410 -ffast-math -O3",
" 22.54| AMD Athlon64 X2 4400 2.25|#      FreeBSD 5.4|              gcc342 -O3",
" 22.67|   Intel Xeon HT/EM64 3.00|#      FreeBSD 6.0|       -mtune=nocona -O3",
" 22.93| AMD Athlon64 X2 4400 2.25|#      FreeBSD 5.4|              gcc402 -O3",
" 23.20| AMD Athlon64 X2 4400 2.25|#      FreeBSD 5.4|       gcc402 nocona -O3",
" 23.47| AMD Athlon64 X2 6000 3.00|       FreeBSD 6.2|     gcc346 athlon64 -O3",
" 24.94|  Intel Pentium D 945 3.40|#      FreeBSD 6.2|              gcc346 -O3",
" 24.94|           Intel Xeon 3.00|#          SuSe 10|    gcc402 nocona -s -O3",
" 25.25|  Intel Pentium D 840 2.80|#          SuSe 10|                     -O3",
" 25.58|        Intel P4 EM64 2.80|#          SuSe 10|       gcc402 nocona -O3",
" 25.58| *  AMD Sempron 2600+ 2.00|#         SuSe 9.3|              gcc335 -O3",
" 26.60|  Intel Pentium D 945 3.40|#      FreeBSD 6.2|gcc346 -mtune=nocona -O3",
" 27.33|   AMD Phenom X4 9600 2.30|#      FreeBSD 6.3|              gcc346 -O3",
" 29.34|      Intel P4 640 HT 3.20|#          SuSe 10|           -mtune=nocona",
" 31.67|  AMD Athlon X2 4600+ 2.40|#  Linux Gentoo-r7|              gcc441 -O3",
" 35.21|   AMD Phenom X4 9600 2.30|#      FreeBSD 7.0|              gcc421 -O3",
" 36.50|   AMD Phenom X4 9600 2.30|#      FreeBSD 7.0| gcc421 -ffast-math  -O3",
" 39.90|     Intel Xeon E5420 2.50|# GNU/Linux 2.6.24|    Intel 64bit compiler",
" 43.37|Intel Core2 Duo E8400 3.00|#   Ubuntu 4.3.3-5| gcc433 -ffast-math  -O3",
"                                     Linux 2.6.28-16|                        ",
" 48.79|Intel Core2 Duo P8600 2.40|McBkPro McOSX 10.5|                  gcc401",
"103.59| *  Intel Core i7 975 3.60|#    OpenSUSE 11.2|   gcc -march=native -O3",
"                                        Linux 2.6.31|-fopenmp -ffast-math ...",
"_____________________________________________________________________________",
"      ( *) Either CPU or bus is overclocked",
"      ( #) 64 bit OS",

The moral: do not throw away your old Pentium III 666 MHz box, because as a Unix calculator it may be as fast as a Pentium 4 1.2 GHz or a Celeron 1.7 GHz running Window$! There are no fools; this is why true scientists do not hate Micro$oft, they love Unix instead.
That said, the result of a performance measurement depends very much on what we measure and how: OS/kernel/scheduler response or pure hardware power (see the short description below).

Another interesting thing is how GCC v. 3.4.2 optimization flags affect performance on different platforms. In general, the difference in performance between the -O0 and -O3 optimization levels is less than 10%. It seems that -ffast-math -O3 is much slower than even -O0 on AMD64 and Intel Xeon EM64 platforms. GCC v. 4.0.2 produces faster code and does not show this behavior. Also, -mtune=CPU is very important on x86 hardware and unnecessary on x86-64 CPUs.

Yet another useful characteristic of a platform would be its performance/price ratio.
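
To make the flag comparison concrete, here is a small sketch that turns a few rows of the table into percentages relative to -O0. The function names are made up for this illustration and are not part of testfcpu.cpp; the MTOPS numbers are copied from the 64-bit FreeBSD 5.4 rows for the Athlon64 1.80 and the Xeon HT/EM64 2.80.

```cpp
#include <cassert>
#include <cstdio>

// Percentage by which `value` exceeds `baseline` (negative means slower).
// Hypothetical helper, not taken from testfcpu.cpp.
double gain_percent(double baseline, double value) {
    assert(baseline > 0.0);
    return (value / baseline - 1.0) * 100.0;
}

// Prints the -O3 and -ffast-math -O3 results relative to -O0 for one platform.
void report(const char* cpu, double o0, double o3, double fast_o3) {
    std::printf("%-26s -O3: %+6.1f%%   -ffast-math -O3: %+6.1f%%\n",
                cpu, gain_percent(o0, o3), gain_percent(o0, fast_o3));
}

// MTOPS values copied from the table above (64-bit FreeBSD 5.4 rows).
void report_table_rows() {
    report("AMD Athlon64 1.80", 16.84, 18.22, 14.30);
    report("Intel Xeon HT/EM64 2.80", 18.87, 20.15, 12.85);
}
```

Calling report_table_rows() from main() should show -O3 gaining roughly 7-8% over -O0 on both machines, while -ffast-math -O3 comes out about 15% below -O0 on the Athlon64 and about 32% below it on the Xeon.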

Short description

This program is written in C++, though there are no objects and no OO programming; I tried to follow the ANSI standard. The software consists of one file, testfcpu.cpp, which you can freely download, modify and redistribute under the terms of the GNU General Public License. The code is "platform independent", meaning that it can be compiled on Unix, Linux and Window$. I have successfully compiled testfcpu.cpp with the Borland C++ and VC++ compilers as a console Win32 application.

The program is not a sophisticated benchmark meter; it does not measure MIPS or xFLOPS. The main idea is to create a big array of random numbers (double precision, 8 bytes!) and then perform tens or hundreds of millions (or even more ...) of trigonometric operations such as sine and cosine, the kind used in FFTs and often found in scientific computations. The performance is measured in MTOPS, or millions of trigonometric operations per second. The array is big enough not to fit in the CPU cache, though for modern CPUs with huge caches this might not be true.
Basically, this program measures the overall performance of the CPU, the FPU and the data transfer rate between them and RAM. There is no benefit in running it on a multiprocessor/cluster platform because the code will be executed on one CPU/node only.
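
The idea described above can be sketched roughly as follows. This is not the code from testfcpu.cpp, only a minimal illustration under assumptions: the helper names, the use of rand() for the random numbers, and counting one sine plus one cosine per array element as two trigonometric operations are all my guesses at a reasonable shape.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <ctime>
#include <vector>

// Hypothetical helper: fill an array with pseudo-random doubles in [0, 1].
std::vector<double> make_random_array(std::size_t n, unsigned seed) {
    assert(n > 0);
    std::srand(seed);
    std::vector<double> data(n);
    for (std::size_t i = 0; i < n; ++i)
        data[i] = static_cast<double>(std::rand()) / RAND_MAX;
    return data;
}

// Hypothetical helper: sweep the array `passes` times, doing one sin and one
// cos per element, and return millions of trig operations per CPU second.
// The running sum is written to *checksum so the compiler cannot discard
// the loop as dead code.
double measure_mtops(const std::vector<double>& data, int passes,
                     double* checksum) {
    double sum = 0.0;
    std::clock_t start = std::clock();
    for (int p = 0; p < passes; ++p)
        for (std::size_t i = 0; i < data.size(); ++i)
            sum += std::sin(data[i]) * std::cos(data[i]);
    std::clock_t stop = std::clock();
    *checksum = sum;

    double seconds = static_cast<double>(stop - start) / CLOCKS_PER_SEC;
    double ops = 2.0 * passes * static_cast<double>(data.size());
    // Too short an interval to measure: report 0 instead of dividing by zero.
    return seconds > 0.0 ? ops / seconds / 1e6 : 0.0;
}
```

To exercise RAM as well as the FPU, the array passed in would have to be larger than the CPU cache, for example several million doubles; with a small array the sketch mostly measures the sin/cos speed alone.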

Time is measured with the C library function clock(), which returns the amount of processor time used since the invocation of the calling process (man 3 clock). The way time is measured is very important in benchmarking. An early version of the program measured real time, so other tasks and CPU interrupts affected the results very much. That method was not very good for estimating hardware performance, but it characterized the OS well, the kernel/scheduler in particular. For example, Window$-based platforms showed really bad results with their poor task scheduler. In the latest version the measured MTOPS value should be more or less OS independent, and even a heavy CPU load should not affect the result, because only the CPU time spent on testfcpu is measured. For comparison, the number of operations per real second is also computed.
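
The difference between the two kinds of timing can be illustrated with a short sketch. It is an assumption-laden example, not an excerpt from testfcpu.cpp: the struct and function names are invented here, and the real-time side uses std::chrono::steady_clock, which needs a C++11 compiler and so goes beyond the ANSI-era code of the original program.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <ctime>

struct Timing {
    double cpu_seconds;   // processor time charged to this process (clock())
    double real_seconds;  // wall-clock time that elapsed meanwhile
};

// Hypothetical helper: evaluate `ops` sines and time the loop both ways.
// The sum goes to *sink so the loop is not optimized away.
Timing time_sines(long ops, double* sink) {
    assert(ops >= 0);
    std::clock_t c0 = std::clock();
    std::chrono::steady_clock::time_point r0 = std::chrono::steady_clock::now();

    double sum = 0.0;
    for (long i = 0; i < ops; ++i)
        sum += std::sin(static_cast<double>(i) * 1e-3);

    std::chrono::steady_clock::time_point r1 = std::chrono::steady_clock::now();
    std::clock_t c1 = std::clock();
    *sink = sum;

    Timing t;
    t.cpu_seconds = static_cast<double>(c1 - c0) / CLOCKS_PER_SEC;
    t.real_seconds = std::chrono::duration<double>(r1 - r0).count();
    return t;
}

// Millions of operations per second; 0 if the interval was too short.
double mops(long ops, double seconds) {
    return seconds > 0.0 ? ops / seconds / 1e6 : 0.0;
}
```

On an idle machine mops(ops, t.cpu_seconds) and mops(ops, t.real_seconds) should be close; under heavy load the real-time figure drops while the CPU-time figure should stay about the same. One caveat worth checking for your platform: as far as I know, Microsoft's C runtime implements clock() as elapsed wall-clock time, so there the two figures would coincide.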

How-To

Compilation on a Unix/Linux platform with the GCC compiler is straightforward:
g++ -mtune=athlon-xp -O3 testfcpu.cpp -o testfcpu

Note: this is for my AMD Athlon-XP laptop; for your CPU the -mtune flag has to take a different value, so read the manual for your compiler! If in doubt, just skip this flag. On AMD64/Opteron with a 64-bit OS and the GCC compiler you do not need this flag at all.

You may wish to make the binary code smaller:
strip testfcpu

And then run it:
./testfcpu

... and wait for about 60 seconds!

Suggestion: quit all other applications before you run testfcpu. By doing so, you will get the maximum benchmark value for your CPU.

Download

WARNING, free code! Please read Disclaimer and GNU General Public License before you download, use, modify or redistribute this code!

Source code: testfcpu.cpp (5Kb)
Window$ executable (Win32, not optimized, old version with OS performance measurements rather than hardware): testfcpu.exe (138Kb)
Window$ executable (Win32, optimized): testfcpu_o.exe (96Kb)

Feedback

Well, I am not going to support or further develop this code; it is just a tool I am using. You can use it as well.
I would appreciate some additional statistics, since I'd like to collect more data for different platforms (especially for supercomputers!), so feel free to send me your results (CPU name, its frequency, OS name and compiler used) at my electronic address: kono@kth.se. Thanks!
PS: Please do not send me results for old hardware. Of course I could collect more detailed hardware information like motherboard name or memory type, but then the nice and simple table shown above would get messy, so please skip that info as well.

Links

There are plenty of CPU/RAM/... benchmarking software packages: advanced, simple, sophisticated, contradictory etc. Some are proprietary and cost money, but many are free. I would recommend free open source software, because how do you know what the hell that proprietary code really does or does not do?
ubench
flops
SciMark2 (Java and C)
stream
... see others in the FreeBSD benchmarks category

Disclaimer

There is no warranty for this software. There is no warranty that this code is virus-free. I am not responsible for any damage caused to you or to your system by viruses or bugs in this code.

Alexander Konovalenko



Last updated: 08/Dec/2009