During this last week, some of the work I did was to optimize my particle system, since it was showing up consistently on my profiles and I was adding more and more particles to my environments.
There are a few basic approaches you can take when trying to optimize code like my particle system.
- The fact that you have a large number of particles all behaving in the same way means that you can easily distribute the work of updating/rendering particles across multiple cores, as long as your data structures and libraries are set up correctly to handle it – so one option is to multithread your particle system.
- The parallel-friendly nature of a particle system also means that it’s possible to offload much of the work involved in rendering particles directly to the GPU, and do it in a shader instead of on the CPU. This is almost always faster.
- In fact, in many cases, you can even update your particles on the GPU, by storing their state in a texture or vertex buffer and having a shader run over all the particles and write their new state to another texture/buffer. You can then take the new state and feed it into another shader as input to render your particles.
- And of course, you can always take the standard approach of brute-force optimization, by making your particle system as efficient as possible with the same basic algorithm.


