Who's afraid of the big bad wolf? Why do developers find parallel programming so hard?
Guest post by Colin Fitzpatick - Software Development Director at Waratek
Parallel programming seems to scare people; books and articles warn of deadlocks and non-deterministic problems that are hard to find. As a result, people steer away from parallel programming, avoiding it at all costs. When programmers have no option but to deal with it, they often end up using so many locks to remove problems that they may as well not be parallel programming at all!
In my current and previous roles, I have regularly had to interview developers for projects with lots of parallel components. One question I frequently ask candidates is to what extent they understand - or have worked with - the happens-before relationship. This is a key concept in writing and debugging parallel code. However, over the years, I have found that few people understand what it means.
Consider the following problem: you have two pieces of code running on two different CPUs in an SMP system. They operate on two global variables called A and B; at the start, A and B are zero. The first piece of code sets A to 1, and then sets B to 2. The second piece of code (running in parallel on a different CPU) checks if B is equal to 2, and if so, displays the value of A on the screen. What gets displayed on the screen? Most people would presume that either nothing is displayed, or 1 is displayed, but in a real parallel system it's possible that A=0 will be displayed. Nothing in this code establishes a happens-before relationship between the two writes and the two reads, so the compiler or the hardware is free to reorder them, and the second CPU may see the new value of B before it sees the new value of A. Understanding why this is the case is very important when developing parallel code.
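To make the A and B puzzle concrete, here is a minimal Java sketch of one way to rule out the A=0 result. The class and method names (HappensBeforeDemo, writer, reader, runOnce) are made up for illustration. The only change from the problem as described is that B is declared volatile, which under the Java memory model means a read that sees B equal to 2 is also guaranteed to see the earlier write to A.

```java
public class HappensBeforeDemo {
    // Plain field: its write is ordered before the write to b by program order.
    static int a = 0;
    // volatile: a write to b happens-before every later read that sees that write.
    static volatile int b = 0;

    // First piece of code: set A, then B.
    static void writer() {
        a = 1;
        b = 2;
    }

    // Second piece of code: returns the value of A it would display,
    // or -1 if B was not yet 2 (nothing displayed).
    static int reader() {
        if (b == 2) {
            return a;
        }
        return -1;
    }

    // Runs writer and reader on two threads and returns what the reader saw.
    static int runOnce() throws InterruptedException {
        a = 0;
        b = 0;
        int[] result = new int[1];
        Thread w = new Thread(HappensBeforeDemo::writer);
        Thread r = new Thread(() -> result[0] = reader());
        w.start();
        r.start();
        w.join();
        r.join(); // join makes the reader's result visible to this thread
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 1000; i++) {
            int seen = runOnce();
            if (seen == 0) {
                throw new AssertionError("observed A=0 after seeing B=2");
            }
        }
        System.out.println("A=0 was never observed");
    }
}
```

If the volatile keyword is removed, the Java memory model no longer forbids the reader from seeing B equal to 2 and A equal to 0, although on any given machine and JVM you may run this many times without ever catching it.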
In general, there are several reasons why there are so few developers who are competent in the field of parallel programming:
1. SMP and multi-core chips are relatively new. Like any new technology, it takes time before it's generally understood.
2. There aren't many good examples of existing code using happens-before concepts on which to base new code.
3. There is a widespread lack of parallel programming languages and/or libraries that are used every day. While many parallel languages do exist, they tend to be obscure and academic.
4. Parallel code involves a different way of thinking.
As with any skill, practice is the only way to get good at it. Writing real multi-threaded applications is the only way to learn. If you only write web applications that run on good web servers, you will never hit or understand the core problems of concurrent code. When writing applications, test them on both multi-core chips and real SMPs (if you can get your hands on one!). Remember that SMPs can behave very differently from multi-core chips.
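As one possible starting exercise, the sketch below has several threads increment a shared counter, once as a plain int and once as an AtomicInteger. The names (LostUpdateDemo, run) and the thread and iteration counts are arbitrary choices for illustration. The plain counter can lose updates because the increment is a separate read and write; the atomic one cannot.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LostUpdateDemo {
    // Unsynchronised counter: concurrent ++ operations can overwrite each other.
    static int plainCounter = 0;
    // Atomic counter: each increment is a single indivisible operation.
    static final AtomicInteger atomicCounter = new AtomicInteger();

    // Returns { final plain count, final atomic count }.
    static int[] run(int threads, int incrementsPerThread) throws InterruptedException {
        plainCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    plainCounter++;                  // racy read-modify-write
                    atomicCounter.incrementAndGet(); // safe increment
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join(); // wait for every worker before reading the totals
        }
        return new int[] { plainCounter, atomicCounter.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] totals = run(4, 100_000);
        System.out.println("expected: " + (4 * 100_000));
        System.out.println("plain:    " + totals[0]);
        System.out.println("atomic:   " + totals[1]);
    }
}
```

The plain total may or may not fall short on a given run, and how often it does can vary from one machine to another, which is a good reason to try it on more than one.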
In the medium to long term, the 'barriers to entry' mentioned above will fade away as a result of technical and coding advances, opening up this world to a wider developer community. Cheap multi-core chips are everywhere and new developers will have no choice but to use these cores effectively. This will in turn produce better paradigms. Advances in auto-vectorisation are also appearing; this technique allows compilers to detect areas of code that can run in parallel automatically, without any programmer involvement. These advances will remove some of the technical problems that developers have to craft solutions for by hand today.
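As a rough illustration of what such a compiler looks for, the sketch below contrasts two loops. The class and method names are made up, and whether a particular compiler or JIT actually vectorises the first loop depends on the compiler and the hardware, so treat that as an assumption rather than a guarantee. In the first loop every iteration is independent of the others; in the second, each iteration needs the result of the previous one, which makes it much harder to run in parallel automatically.

```java
public class VectorisableLoops {
    // Independent iterations: out[i] depends only on x[i] and y[i],
    // so several elements could be computed at once.
    static void addArrays(float[] x, float[] y, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = x[i] + y[i];
        }
    }

    // Loop-carried dependency: x[i] needs the updated x[i - 1],
    // so the iterations cannot simply be run side by side.
    static void prefixSum(float[] x) {
        for (int i = 1; i < x.length; i++) {
            x[i] += x[i - 1];
        }
    }

    public static void main(String[] args) {
        float[] x = { 1f, 2f, 3f, 4f };
        float[] y = { 10f, 20f, 30f, 40f };
        float[] out = new float[4];
        addArrays(x, y, out);
        prefixSum(x);
        System.out.println(java.util.Arrays.toString(out));
        System.out.println(java.util.Arrays.toString(x));
    }
}
```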
In the short term, if you are a developer interested in parallel programming, I would advise you to check out the Java Language Specification, in particular the chapter on threads and locks at http://java.sun.com/docs/books/jls/third_edition/html/memory.html.
