Concurrent Programming

How DataFlow avoids race conditions by default

In my previous articles, I tried to present Akka.Net. Let’s remember that the good old Task Parallel Library (TPL) also ships an actor-style programming model in a dedicated assembly: DataFlow. The actor model is a way to simplify concurrent programming. How does DataFlow solve the problem of race conditions that comes with concurrency?

We just want easy multithreading, please!

More precisely, we would like to send messages to an actor asynchronously without having to explicitly use locks or other synchronization techniques. In DataFlow, actors are represented by blocks. In this example we use an ActionBlock<int>, which accepts int messages and produces no output. We want to compute the sum of the numbers from 1 to n, a value we can check easily because it equals n * (n + 1) / 2.
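The description above can be sketched as follows. This is a reconstruction under assumptions, not necessarily the original listing: the class and method names are hypothetical, and n = 100000 is inferred because it is the value for which 32-bit int arithmetic wraps around to the 705082704 shown in the output. On .NET Framework the System.Threading.Tasks.Dataflow NuGet package is needed.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class SumProgram
{
    // Sums 1..n by posting each number to an ActionBlock<int>.
    public static async Task<int> SumAsync(int n)
    {
        int sum = 0; // shared variable captured by the block's delegate

        // The delegate runs once per message; no lock is taken anywhere.
        // unchecked makes the int wrap-around explicit (the true sum exceeds int.MaxValue).
        var block = new ActionBlock<int>(i => sum = unchecked(sum + i));

        for (int i = 1; i <= n; i++)
        {
            block.Post(i); // returns immediately; the block processes the message later
        }

        block.Complete();       // signal that no more messages will be posted
        await block.Completion; // wait until every queued message has been processed
        return sum;
    }

    public static async Task Main()
    {
        int n = 100000; // assumed value, see the note above
        int expected = unchecked(n * (n + 1) / 2); // same wrap-around as the running sum
        int sum = await SumAsync(n);
        Console.WriteLine($"DataFlow: {sum} expected {expected}");
    }
}
```

Calling Complete and then awaiting Completion is what guarantees that sum is only read once the block has drained its queue.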

Running the program produces the correct output:

DataFlow: 705082704 expected 705082704

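But what happened? The messages were processed asynchronously by the ActionBlock, one at a time, off the thread that posted them. The block may borrow different thread-pool threads over time, but it never runs its delegate on two threads at once. This is because MaxDegreeOfParallelism is set to 1 by default. The creation of the block is equivalent to: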
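A minimal sketch of that explicit form, assuming the same sum variable and delegate as in the example; the ExplicitOptions class and its method are hypothetical names used only to make the snippet runnable:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ExplicitOptions
{
    public static async Task<int> SumAsync(int n)
    {
        int sum = 0;

        // Same block as before, with the default value spelled out.
        var block = new ActionBlock<int>(
            i => sum = unchecked(sum + i),
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 });

        for (int i = 1; i <= n; i++) block.Post(i);
        block.Complete();
        await block.Completion;
        return sum;
    }

    public static async Task Main()
    {
        Console.WriteLine(await SumAsync(100)); // prints 5050
    }
}
```

ExecutionDataflowBlockOptions is the options type accepted by the execution blocks (ActionBlock, TransformBlock and so on), so the same setting applies to them as well.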

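If I set MaxDegreeOfParallelism to 2, knowing that my CPU has more than one core, the output is incorrect and varies between executions because we now have a race condition. The variable sum is shared, and the ActionBlock updates it from two threads at the same time. Adding to sum is a read-modify-write sequence, not an atomic operation, so when two threads interleave, one of the updates is lost.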
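To observe the race yourself, a sketch along these lines can be used. The RaceDemo class, its method and n = 100000 are assumptions made for illustration. The printed sums are not predictable: on a multi-core machine they will usually differ from the expected value, but a given run may also happen to be correct.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class RaceDemo
{
    // Sums 1..n through an ActionBlock allowed to run its delegate
    // on up to maxDegreeOfParallelism threads at the same time.
    public static async Task<int> SumAsync(int n, int maxDegreeOfParallelism)
    {
        int sum = 0;
        var options = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = maxDegreeOfParallelism
        };

        // Unsynchronized read-modify-write: concurrent invocations can lose updates.
        var block = new ActionBlock<int>(i => sum = unchecked(sum + i), options);

        for (int i = 1; i <= n; i++) block.Post(i);
        block.Complete();
        await block.Completion;
        return sum;
    }

    public static async Task Main()
    {
        int n = 100000; // assumed value
        int expected = unchecked(n * (n + 1) / 2);

        for (int attempt = 1; attempt <= 3; attempt++)
        {
            int sum = await SumAsync(n, 2); // two messages may be processed at once
            Console.WriteLine($"DataFlow: {sum} expected {expected} (attempt {attempt})");
        }
    }
}
```

Passing 1 instead of 2 to SumAsync restores the deterministic result, since the delegate then never runs concurrently with itself.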

DataFlow: 509088149 expected 705082704 (first attempt)
DataFlow: 398544839 expected 705082704 (second attempt)

That’s it: set MaxDegreeOfParallelism to 1 if you are using shared variables in your blocks. That is what DataFlow does by default, but sometimes it is good to do things explicitly!