Friday, January 28, 2011

Programmer Productivity vs Programming Language

There is a good article comparing the productivity of different programming languages: http://www.codinghorror.com/blog/2005/08/are-all-programming-languages-the-same.html. The analysis is based on the number of lines of code needed to express the same idea in different languages.


Basically, on average, a scripting language like Python can do in one line what a C program needs six lines to do. Although this simple analysis doesn't take into account other aspects like library availability, debuggability, compilation vs. interpretation, and performance, it is a strong reminder that you can save a significant amount of time by being smart about your choice of language.
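As a toy illustration of the gap (my example, not one from the linked article): counting word frequencies is essentially one line in Python, whereas a C version would need manual string handling and a hash table.

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog the fox"

# One expressive line: tokenize on whitespace and count occurrences.
# A comparable C program would need manual tokenization plus a hash table.
counts = Counter(text.split())
```

Afterwards, `counts["the"]` is 3 and `counts["fox"]` is 2.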


In addition, when you do a lot of numeric computation, a single line of Matlab or R can perform tasks that would require tens or even hundreds of lines of C/C++ (again, depending on the libraries available in C/C++).
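In the same spirit (a toy sketch, not from the post), even Python's standard library gives you one-liners for numeric summaries that would be hand-written loops in C:

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = statistics.mean(data)     # one line vs. an accumulation loop in C
stdev = statistics.pstdev(data)  # one line vs. a manual two-pass variance in C
```

Dedicated numeric environments like Matlab and R push this much further, with one-line matrix operations, solvers, and plotting.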


This is also a reminder that setting up your project to use multiple languages can be beneficial in the long run. Often, you can use C/C++ for the most time-consuming parts of the program and leave the rest to easier-to-use scripting languages like Python.
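One common way to wire this up is Python's `ctypes` module. A minimal sketch: here it calls `sqrt` from the system C math library, but loading your own compiled C/C++ shared library containing the hot routines works exactly the same way.

```python
import ctypes
import ctypes.util

# Load the system C math library. Loading your own compiled
# .so/.dylib/.dll with the time-consuming routines is identical:
# ctypes.CDLL("path/to/libhotspots.so")
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature so arguments/results convert correctly.
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

root = libm.sqrt(2.0)
```

The Python side stays as readable glue code, while the C side keeps the performance-critical inner loops.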

Wednesday, January 26, 2011

RBM Training Tip

A sparse stacked RBM (a type of deep belief network) is supposed to capture robust patterns that can serve as very useful features for various classification/machine-learning tasks.


However, the model is very sensitive to the strength of the visible units. I'm finding that even obvious patterns that humans can easily detect are not captured by RBM learning if the visible units are scaled down by some reasonable factor. This may be because the energies of many desirable patterns become too small, so the model only learns the few bases that still give strong signals.

It helps to scale up the lower-layer weights in these cases.
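A minimal pure-Python sketch of that fix (the function name and the factor are illustrative, not from any particular library): multiply a layer's weight matrix by a constant so that pattern energies stay large enough to be learned and propagated upward.

```python
def scale_weights(weights, factor):
    """Scale an RBM layer's weight matrix by a constant factor.

    Compensates for visible units that were scaled down, so the
    energies of real patterns remain strong enough to matter.
    """
    return [[w * factor for w in row] for row in weights]

W = [[0.1, -0.2], [0.3, 0.05]]
W_scaled = scale_weights(W, 4.0)
```

The right factor depends on how much the visible units were scaled down, so it is worth treating it as a hyperparameter and checking the learned features after the change.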

Tuesday, January 25, 2011

Optimizing Sequential Programs via CUDA Parallelizations

You may not necessarily think of massive parallelization through CUDA the same way as other program-optimization techniques. However, the same general strategies apply.

That is, you should never optimize prematurely (i.e., don't convert everything in your sequential code into CUDA kernels). Porting code to CUDA takes a lot of effort, and you may waste a considerable amount of valuable time on something that will have negligible impact.

Benchmark everything first and then strategically optimize the portions where the program spends the most time. Depending on the algorithms, you may find that the program spends 90%+ of its time in a few select routines. If those routines are well suited to CUDA's SPMD (Single Program, Multiple Data) architecture, you might reap huge benefits.
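A sketch of that "benchmark first" step in Python (both routines are stand-ins I made up): profile the sequential program, see which routine dominates, and only then consider porting that routine to a CUDA kernel.

```python
import cProfile
import io
import pstats

def hot_routine(n):
    # Stand-in for the numeric kernel that dominates runtime --
    # the candidate for a CUDA port if it fits the SPMD model.
    return sum(i * i for i in range(n))

def cold_routine():
    # Stand-in for setup/IO-style work that is not worth porting.
    return None

def main():
    cold_routine()
    return hot_routine(200_000)

profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# The report shows cumulative time per function; port only the top entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
```

If the profile shows one routine eating most of the runtime, that single kernel is where the CUDA effort pays off.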