IMPORTANT: Two months later, it appears that this presentation is not as good as I intended it to be. For better references and detailed explanations, see Agner Fog's optimization manuals.

I gave a short presentation to the CMLA Image Processing Group on how to write code that runs faster, based on my recent work on a denoising algorithm (neural networks and deep learning).

This presentation is mostly a loosely structured list of recommendations, tools, hints, rules and examples. I tried to back my words with actual speed measurements, but these obviously depend on the software and hardware environment. YMMV.

- slides: HTML (reveal.js), PDF dump
- source code (timing macros, measurements and examples): C/C++

Updates:

- 2015/01/20: added a reference to Agner Fog's manuals.

Follow-ups:

- Disabling Hyper-threading and Frequency Scaling
- Integer and Floating-Point Arithmetic Speed vs Precision
- Floating-Point Math Speed vs Precision