I have been toying around with Erlang since quiet some time and finally made something that could be considered the de facto *Hello World* of Erlang. Unlike most of the other programming languages where concurrency is an added feature, Erlang was built around the concept of concurrent programming. If you love functional programming and can get over the weird Erlang syntax, learning Erlang will not be very difficult and is definitely worth the effort. I have been reading Joe Armstrong’s Programming Erlang: Software for a Concurrent World and will definitely recommend it to anyone starting out with Erlang. I am yet to dive into OTP, but so far, so good. :)

I have taken the naive recursive factorial algorithm and parallelized it to take advantage of multiple cores available on average system these days. Lets assume that we need to distribute the task of calculating *N* factorial over *K* processes. To do that, I have divided the set of integer {1…N} to *K* subsets, say S₁, S₂, …., S_{k}, such that

S₁ ∪ S₂ ∪ … ∪ S_{k} = {1…N} and S₁ ∩ S₂ ∩ … ∩ S_{k} = ∅

Where,

S₁ = [ 1, K+1, 2K+1, … , (x₁)*K+1 ];

x₁ : (x₁)*K + 1 <= N and (x₁+1)*K + 1 > N and x₁ ∈ ℤ

S₂ = [ 2, K+2, 2K+2, … , (x₂)*K+2 ];

x₂ : (x₂)*K + 2 <= N and (x₂+1)*K + 1 > N and x₂ ∈ ℤ

…

…

…

S_{k} = [ K, K+K, 2K+K, … , (x_{k})*K+2 ];

x_{k} : (x_{k})*K + 2 <= N and (x_{k}+1)*K + 1 > N and x_{k} ∈ ℤ

Product of all the elements in each of the sets in calculated in a separate thread and all these calculations, theoretically, can be done in parallel. These sub-products are then multiplied together to get the final factorial.

This is how it looks like:

*Update*: The algorithm described above works but is not the best way to find the factorial of a number and after some time on Google, I am convinced that Luschny’s is *the* resource on the internet if you are looking for something faster and has a factorial somewhere in it.

Also, this may not be the best language to implement this and this may not the best problem to parallelize. Nonetheless, if you run the code given below you’ll realize that this one makes a pretty good example of the limitations of parallelism as given by Amdahl’s Law:

The potential speedup gained by parallel execution of a program is limited by the portion that can be parallelized.

I am an Erlang beginner, and this could be very well a sub-optimal implementation. Do let me know, what you think of it and if you enjoyed reading this, you should follow me on twitter.

Start the Erlang shell

$ erlCompile the code

1> c(factorial).Call the calculate function

2> factorial:calculate(10000).

```
```