Got myself a few months ago into the optimization rabbit hole as I had a slow quant finance library to take care of, and for now my most successful optimizations are using local memory allocators (see my C++ post, I also played with mimalloc which helped but custom local memory allocators are even better) and rethinking class layouts in a more “data-oriented” way (mostly going from array-of-structs to struct-of-arrays layouts whenever it’s more advantageous to do so, see for example this talk).
What are some of your preferred optimizations that yielded sizeable gains in speed and/or memory usage? I realize that many optimizations aren’t necessarily specific to any given language so I’m asking in [email protected].
I recently spent some time optimizing a small Julia program I wrote that generates a lookup table of brainfuck constants. Because it only needs to run once, I originally didn’t care about performance when I originally wrote it (and the optimization was mostly for fun).
I achieved an ~100x improvement by adding types, using static arrays and memoization. In the end, the performance was mostly limited by primitive math operations, I tried using multiple threads, but any synchronization destroyed the performance.
However, the most impressive thing was the ability of Julia to scale from dynamically typed scripting language to almost a compiled language with minimal changes to the code.