One thing I heard a lot at the Mostly Functional conference last week that concurrency is required for performance on multicore processors. Since Moore’s Law ended it is certainly true that the old trick of not writing performant code but letting hardware advances pick up the slack has been flagging (although things like SSD have still had their impact).
However equating concurrent code with performance is subtly wrong. If there was a direct relationship then we would have seen concurrent programming adopted swiftly by the games programmers. And yet there we still see an emphasis on ordered, predictable execution, cache structure and algorithmic efficiency.
Performance is one of those vague computing terms, like scale, that has many dimensions. Concurrency has no direct relation to performance as anyone who has managed to write a concurrent program with global resource contention can attest.
There are two relevant axes to considering performance and concurrency: throughput and capacity. Concurrency, through parallelism, allows you to greatly increase your utilisation of the available resources to provide a greater capacity for work.
However that work is not inherently performed faster and may actually result in lowered throughput due to the need to read data that is not in memory and the inability to predict the order of execution.
For things like webservices that are inherently stateless then often concurrency does massively increase performance because the capacity to serve request goes up and there is no need coordinate work. If the webservice is accessing a shared store where essentially all of the key data is in memory and what we need to do is read rather than mutate that data then concurrency becomes even more desirable.
On the other hand, if what we want to do is process work as quickly as possible, i.e. maximise throughput, then concurrency can be a very poor choice.
If we cannot predict the order that work will need to be executed in, due to things like having to distribute work across threads and retry work due to temporary errors then we may have to create the entire context for the work repeatedly and load it into local memory.
In circumstances like these concurrency hurts performance. After all the fastest processing is probably still pointer manipulation of a memory-mapped file, if your want to really fast.
So concurrency means performance and beating Moore’s Law if you can be stateless and value volume of processing over unit throughput.