Cloud computing is an emerging technology in which information technology resources are provisioned to users as a set of unified computing resources on a pay-per-use basis. The resources are chosen dynamically to satisfy a user's service level agreement and a required level of performance. A cloud is seen as a computing platform for heavy-load applications. The conjugate gradient (CG) method is an iterative linear solver used by many scientific and engineering applications to solve a linear system of algebraic equations. CG generates a heavy computational load and therefore slows the applications that use it. Distributing CG is considered a way to increase its performance. However, running a distributed CG based on a standard API, such as Message Passing Interface, in a cloud faces many challenges, such as the cloud's processing and networking capabilities. In this work, we present an in-depth analysis of the CG algorithm and its complexity in order to develop adequate distributed algorithms. The implementation of these algorithms and their evaluation in our cloud environment reveal the gains and losses achieved by distributing the CG. The performance results show that, despite the complexity of the CG processing and communication, a speedup gain of at least 1157.7 is obtained using 128 cores compared with the National Aeronautics and Space Administration Advanced Supercomputing sequential execution. Given the emergence of clouds, the results in this paper analyze the performance issues that arise when a generic public cloud, along with a standard development library such as Message Passing Interface, is used for high-performance applications without the need for specialized hardware and software. Copyright © 2012 John Wiley & Sons, Ltd.
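
For readers unfamiliar with the solver, the following is a minimal sketch of the textbook sequential CG iteration for a symmetric positive-definite system, not the distributed algorithms developed in this work. The function names (`dot`, `matvec`, `conjugate_gradient`), the dense list-of-lists matrix, and the tolerance are illustrative assumptions. The matrix-vector product and the dot products are the steps that a distributed version would typically partition across processes.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Approximately solve A x = b, assuming A is symmetric positive definite."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:   # stop once the residual norm is small
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)                       # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                            # direction update factor
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    # Small illustrative system (hypothetical data, not from the paper).
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(conjugate_gradient(A, b))
```

Each iteration costs one matrix-vector product, two dot products, and three vector updates. In a message-passing setting the dot products require a global reduction, which is one reason network capability matters for a distributed CG.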