Tuesday, January 25, 2011

Optimizing Sequential Programs via CUDA Parallelizations

You may not think of massive parallelization through CUDA in the same way as other program optimization techniques. However, the same general strategies still apply.

That is, you should never prematurely optimize everything (i.e. convert all of your sequential code into CUDA kernels). Porting a program to CUDA takes considerable effort, and you may waste a great deal of valuable time on changes that have negligible impact.

Benchmark everything first, and then strategically optimize the portions where the program spends the most time. Depending on the algorithms involved, you may find that the program spends 90%+ of its time in a few select routines. If those routines are well suited to CUDA's SPMD (Single Program Multiple Data) model, you might reap huge benefits.
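As a rough sketch of what such a conversion looks like, suppose profiling revealed that a SAXPY-style loop dominated the runtime (a hypothetical example; the names here are my own). Each iteration is independent of the others, which makes it a natural fit for CUDA: the loop body becomes the kernel body, and each thread handles one element.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical hot routine identified by profiling: a sequential SAXPY loop.
void saxpy_cpu(int n, float a, const float *x, float *y) {
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

// The same routine as a CUDA kernel: one thread per element.
__global__ void saxpy_gpu(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                       // guard: the last block may overshoot n
        y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Host data.
    float *hx = (float *)malloc(bytes);
    float *hy = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    // Device data: allocate and copy inputs over.
    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    // Launch enough blocks of 256 threads to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    saxpy_gpu<<<blocks, threads>>>(n, 2.0f, dx, dy);

    // Copy the result back (cudaMemcpy implicitly synchronizes here).
    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", hy[0]);

    cudaFree(dx); cudaFree(dy);
    free(hx); free(hy);
    return 0;
}
```

Compile with `nvcc saxpy.cu` on a machine with a CUDA-capable GPU. Note that the transfers between host and device memory are themselves a cost worth benchmarking: for a routine this cheap, the copies can easily outweigh the kernel itself, which is another reason to measure before converting.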
