The whole computer hardware industry embraced the multi-core. The extreme optimisation of sequential algorithms is then no longer sufficient to squeeze the real machine power, which can be only exploited via thread-level parallelism. Decision tree algorithms exhibit natural concurrency that makes them suitable to be parallelised. This paper presents an in-depth study of the parallelisation of an implementation of the C4.5 algorithm for multi-core architectures. We characterise elapsed time lower bounds for the forms of parallelisations adopted and achieve close to optimal performance. Our implementation is based on the FastFlow parallel programming environment, and it requires minimal changes to the original sequential code. Copyright © 2013 John Wiley & Sons, Ltd.