As a follow-up to my previous post, I decided to use my project Ovt'sa, a simple CPU raytracer, to compare Visual C++, GCC and also LLVM, the latter using the GCC 4.2 front end.
Ovt'sa is a pure C++ program, though not an especially efficient one for what it does. It uses GLM and GLI and has no other dependencies. Despite using SSE optimizations, the program isn't specifically designed to take advantage of them, and it only runs on a single thread. No disk access is included in the measurements. The tests were run on Windows 7 64-bit with a Phenom II X6 1055T. With Visual C++, I used /Ox, /fp:fast and link-time optimizations. With GCC and LLVM, I used -O3 and -ffast-math but not -flto, which is only available in GCC 4.5 and only provides a performance benefit of 0.0% to 2.5%. More information on link-time optimizations in a future post.
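To make the measurement setup concrete, here is a minimal sketch of a timing harness that keeps disk access out of the timed region. It is not Ovt'sa's actual code: `render_stub` and `measure_seconds` are hypothetical names, the workload is a placeholder for the real render loop, and `std::chrono` requires a C++11 compiler, which is newer than some of the toolchains tested here. The command lines in the comments are assumed spellings of the options described above, not the exact invocations used.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Assumed build lines (illustrative only):
//   cl  /Ox /fp:fast /GL main.cpp /link /LTCG
//   g++ -O3 -ffast-math main.cpp

// Hypothetical stand-in for the render loop. It works purely in memory,
// so no disk access ends up inside the timed region.
inline double render_stub(std::vector<float>& framebuffer)
{
    assert(!framebuffer.empty());
    double checksum = 0.0;
    for (std::size_t i = 0; i < framebuffer.size(); ++i)
    {
        float const x = static_cast<float>(i % 256) / 255.0f;
        framebuffer[i] = x * x * 0.5f + 0.25f;
        checksum += framebuffer[i];
    }
    return checksum;
}

// Runs `work` once and returns the elapsed wall-clock time in seconds.
// steady_clock is used because it never jumps backwards.
template <typename Work>
double measure_seconds(Work&& work)
{
    auto const start = std::chrono::steady_clock::now();
    work();
    auto const stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Scene loading would happen before calling `measure_seconds`, and writing the image would happen after it, so only the rendering itself is compared between compilers. Using the returned checksum afterwards also keeps an aggressive optimizer from discarding the work.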
LLVM is becoming more and more mature and has built itself a reputation, so I was curious to see how it actually behaves. LLVM turns out to be slower than GCC at building the code, but it has reached GCC's level of performance. However, both remain slower than Visual Studio, which made good performance improvements in its 2010 release. Since LLVM 2.3, performance has been somewhat unstable from one version to the next, and in the end it has only slightly progressed in LLVM 2.8. LLVM 2.3 is the first version able to build Ovt'sa successfully.
It's really interesting to compare the behaviour of Intel CPUs against AMD CPUs. The Phenom II X6 seems less sensitive to the compiler version and optimization flags than Intel CPUs. On the AMD CPU, using the x87 instruction set is just slightly slower. On Intel CPUs, that same choice can change the performance by a factor of two!
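When comparing x87 and SSE builds, it helps to confirm which floating-point path a given binary was actually compiled for. The sketch below is an assumption about how one might check this, not something taken from Ovt'sa: the function names are hypothetical, `FLT_EVAL_METHOD` comes from `<cfloat>` (C++11), and the predefined macros are the ones GCC and Visual C++ expose. With GCC, the switch between the two paths is typically made with `-mfpmath=387` or `-mfpmath=sse`, which is assumed here rather than stated in the measurements.

```cpp
#include <cfloat>
#include <string>

// Describes how the compiler evaluates floating-point expressions
// in this particular build.
inline std::string float_evaluation_mode()
{
    switch (FLT_EVAL_METHOD)
    {
    case 0:  return "operands evaluated in their own type (typical of SSE code)";
    case 1:  return "float evaluated as double";
    case 2:  return "everything evaluated as long double (typical of x87 code)";
    default: return "indeterminate";
    }
}

// Reports whether the compiler was allowed to emit SSE2 instructions.
// __SSE2__ is set by GCC and Clang; _M_X64 and _M_IX86_FP by Visual C++.
inline bool built_with_sse2()
{
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return true;
#else
    return false;
#endif
}
```

Printing both values at program start makes it harder to mislabel a benchmark run as x87 when the build was really SSE, or the other way around.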
Despite huge performance gains, GCC remains a step behind Visual Studio, and its progress has only prevented LLVM from being a relevant replacement.