Optimizing my Haskell Raytracer::02.25.2012+23:35
Way back in 2010 I wrote a really crappy, proof-of-concept raytracer as a way of familiarizing myself more with Haskell. I picked it back up this morning because I wanted to see how much I could improve its performance with some pretty simple optimizations. I could probably improve performance further by making actual structural changes to the program, but I’d rather fall back on algorithmic improvements before fighting with the code generator. At any rate, the final results are satisfying: a 70% decrease in overall running time.
Strictness Annotations
The first optimization I did was to put strictness annotations on the Doubles in the vector type.
What started off as:
data Vec3f = Vec3f Double Double Double
    deriving (Eq, Show)
became:
data Vec3f = Vec3f !Double !Double !Double
    deriving (Eq, Show)
This resulted in a 19% decrease in overall run time. Nothing drastic, but still a very significant difference.
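To see what the bang actually changes, here is a minimal sketch that is not part of the raytracer. The names LazyVec, StrictVec and throwsWhenForced are made up for illustration. It builds both kinds of vector with an undefined component and checks whether merely evaluating the constructor blows up.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Lazy fields: each component may be stored as an unevaluated thunk.
-- (Hypothetical type, for illustration only.)
data LazyVec = LazyVec Double Double Double

-- Strict fields: each component is forced when the constructor is evaluated.
-- (Hypothetical type, for illustration only.)
data StrictVec = StrictVec !Double !Double !Double

-- True if the evaluation attempt ended in an exception.
isFailure :: Either SomeException a -> Bool
isFailure (Left _)  = True
isFailure (Right _) = False

-- Force a value to weak head normal form and report whether that threw.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = isFailure <$> try (evaluate x)

main :: IO ()
main = do
  lazyThrows   <- throwsWhenForced (LazyVec undefined 0 0)
  strictThrows <- throwsWhenForced (StrictVec undefined 0 0)
  -- The lazy constructor happily holds the bad thunk; the strict one forces it.
  print (lazyThrows, strictThrows)  -- prints (False,True)
```

The strict version never stores a thunk in a field, so arithmetic on vectors does not pile up deferred computations that have to be chased later.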
Unboxed Strict Fields
The next optimization was to add the compiler flag “-funbox-strict-fields”. This tells GHC to automatically add the UNPACK pragma to strict fields in data constructors. The end result is that the Vec3f constructor no longer stores heap pointers to Doubles and instead stores the Doubles themselves. Unboxing the strict fields brought the total run time decrease to 35%.
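For comparison, this is roughly what the flag saves you from writing by hand. It is a sketch rather than the raytracer's actual source: the UNPACK pragmas are spelled out on each field, and the add and dot helpers are hypothetical stand-ins for whatever vector operations the real program defines.

```haskell
-- Explicit per-field form. With -funbox-strict-fields, GHC treats every
-- strict field this way, so the pragmas can be left off.
data Vec3f = Vec3f {-# UNPACK #-} !Double
                   {-# UNPACK #-} !Double
                   {-# UNPACK #-} !Double
  deriving (Eq, Show)

-- Hypothetical helper: component-wise addition.
add :: Vec3f -> Vec3f -> Vec3f
add (Vec3f a b c) (Vec3f x y z) = Vec3f (a + x) (b + y) (c + z)

-- Hypothetical helper: dot product.
dot :: Vec3f -> Vec3f -> Double
dot (Vec3f a b c) (Vec3f x y z) = a * x + b * y + c * z

main :: IO ()
main = do
  let u = Vec3f 1 2 3
      v = Vec3f 4 5 6
  print (add u v)  -- prints Vec3f 5.0 7.0 9.0
  print (dot u v)  -- prints 32.0
```

The flag can also be set for a single module with an {-# OPTIONS_GHC -funbox-strict-fields #-} line at the top of the file instead of on the command line.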
Float versus Double
As most people know, Doubles are not very fast on many systems. In the past, though, I had seen a speed increase by using Doubles instead of Floats in this raytracer. I believe that it had something to do with GHC using SSE only for Double and not for Float. Regardless, this time switching from Double to Float doubled the total savings. The result was a 70% decrease in run time versus the original.
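One way to make this kind of experiment cheap is to hide the component type behind an alias, so that switching precision is a one-line change. This is only a sketch: the Scalar alias and the norm helper are assumptions for illustration, not names from the raytracer.

```haskell
-- Hypothetical alias: change Float to Double here to switch precision everywhere.
type Scalar = Float

data Vec3f = Vec3f !Scalar !Scalar !Scalar
  deriving (Eq, Show)

-- Hypothetical helper: Euclidean length of a vector.
norm :: Vec3f -> Scalar
norm (Vec3f x y z) = sqrt (x * x + y * y + z * z)

main :: IO ()
main = print (norm (Vec3f 3 4 0))  -- prints 5.0
```

Keep in mind that Float carries only about 7 significant decimal digits against roughly 15 to 16 for Double, so the speedup is traded against precision.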
Code / Full Disclosure
You can download the code here: raytracer.zip. When run, it will generate a file called “output.ppm” containing the rendered image. It should look like this:
The above tests were done on my Acer Aspire One, which has a 1.6 GHz Intel Atom N270 with Hyper-Threading and 1 GB of RAM. I’m not sure what the performance differences will be on a 64-bit machine with a better processor.