Batch processing with Akka part 4

This article is a follow-up to Batch processing with Akka part 3.

In the last part of this series of articles I look at data processing within a network of computers. To do that, the program was split into two logical parts: the master, which contains the actors that read and write the data from and to files, and the workers, which contain the actors that do the actual processing. The program configuration defines which parts of the program should be started and on which host the master is running. So it is possible to have one master and multiple workers running within the same network and to spread the work among them.

Implementing the changes for that was easy. I had to add the Akka remoting library, modify the configuration and change the lookup for two actors in the program code. Without further changes it is possible to start the program on a couple of hosts and to increase the performance. While the program is running, it is possible to add machines running workers and to remove them again without the need to stop the master program.
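To give an idea of what changing the lookup for an actor can look like, here is a minimal sketch in plain Java that only builds the path strings. It assumes Akka classic remoting over TCP, where a remote actor is addressed as akka.tcp://system@host:port/user/name. The class RemoteActorPaths, the system name "AkkaBatch", the host "master-host", the port 2552 and the actor names "reader" and "writer" are all made up for this illustration and are not taken from the actual program.

```java
import java.util.Objects;

/**
 * Illustrative helper (hypothetical, not part of the AkkaBatch code):
 * builds actor paths as used for local lookup and for Akka classic
 * remoting over TCP.
 */
final class RemoteActorPaths {

    private RemoteActorPaths() {
    }

    /** Path of a top-level actor inside the local actor system. */
    static String localPath(String actorName) {
        Objects.requireNonNull(actorName, "actorName");
        return "/user/" + actorName;
    }

    /**
     * Path of a top-level actor in an actor system on another host,
     * for example akka.tcp://AkkaBatch@master-host:2552/user/reader.
     */
    static String remotePath(String systemName, String host, int port, String actorName) {
        Objects.requireNonNull(systemName, "systemName");
        Objects.requireNonNull(host, "host");
        Objects.requireNonNull(actorName, "actorName");
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return "akka.tcp://" + systemName + "@" + host + ":" + port + "/user/" + actorName;
    }

    public static void main(String[] args) {
        // Assumed values: in a real setup these would come from the configuration.
        String system = "AkkaBatch";
        String masterHost = "master-host";
        int masterPort = 2552;

        // Before: both actors are looked up in the local actor system.
        System.out.println(localPath("reader"));
        System.out.println(localPath("writer"));

        // After: a worker addresses the same actors on the master host.
        System.out.println(remotePath(system, masterHost, masterPort, "reader"));
        System.out.println(remotePath(system, masterHost, masterPort, "writer"));
    }
}
```

In a worker, a string built this way would typically be handed to the actor context's actorSelection method in place of the local path, so the rest of the actor code can stay as it is.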

Conclusion of these tests: Akka is an enormous help when developing flexible and scalable programs. As a programmer you don’t have to care about the details of threads, thread pools, synchronisation, locking or network protocols; this is all handled by Akka.

The program code is available at https://github.com/sothawo/AkkaBatch .