One of the principal goals of cloud computing is to outsource the hosting of data and applications, thus enabling a per-usage model of computation. Data and applications may be packaged in virtual machines (VMs), which are themselves hosted by nodes, that is, physical machines. Several frameworks have been designed to manage VMs on pools of physical machines; most of them, however, do not efficiently address a major objective of cloud providers: maximizing system utilization while ensuring quality of service (QoS). Several approaches rely on virtualization capabilities to improve this trade-off. However, dynamically scheduling a large number of VMs within a large distributed infrastructure raises hard scalability problems, which become even worse when VM image transfers have to be managed. Consequently, most current frameworks schedule VMs statically using a centralized control strategy. In this article, we present the Distributed VM Scheduler, a framework that enables VMs to be scheduled cooperatively and dynamically in large-scale distributed systems. We describe, in particular, how several VM reconfigurations can be dynamically calculated in parallel and applied simultaneously. Reconfigurations are enabled by partitioning the system (i.e., nodes and VMs) on the fly. Each partition is created with the minimum resources necessary to find a solution to the reconfiguration problem. Moreover, we propose an algorithm to handle deadlocks that may appear because of the partitioning policy. We have evaluated our prototype through simulations and compared our approach with a centralized one. The results show that our scheduler permits VMs to be reconfigured more efficiently: the time needed to manage thousands of VMs on hundreds of machines is typically reduced to a tenth or less. Copyright © 2012 John Wiley & Sons, Ltd.