
Detecting asynchrony and dephase change patterns by mining software repositories



Software maintenance accounts for the largest part of the costs of any program. During maintenance activities, developers implement changes (sometimes simultaneously) on artifacts in order to fix bugs and to implement new requirements. To reduce this part of the costs, previous work proposed approaches to identify the artifacts of programs that change together. These approaches analyze historical data, mined from version control systems, and report change patterns, which point to the causes, consequences, and actors of the changes to source code files. They also introduce so-called change patterns that describe some typical change dependencies among files. In this paper, we introduce two novel change patterns: the asynchrony change pattern, corresponding to macro co-changes (MC), that is, files that co-change within a large time interval (a change period), and the dephase change pattern, corresponding to dephase macro co-changes (DC), that is, MC that always happen with the same shift in time. We present our approach, named Macocha, to identify these two change patterns in large programs. We use the k-nearest neighbor algorithm to group changes into change periods. We also use the Hamming distance to detect approximate occurrences of MC and DC. We apply Macocha and compare its performance in terms of precision and recall with UMLDiff (file stability) and association rules (co-changing files) on seven systems: ArgoUML, FreeBSD, JFreeChart, Openser, SIP, XalanC, and XercesC, developed in three different languages (C, C++, and Java). These systems range in size from 532 to 1693 files and underwent 1555 to 23,944 change commits during the study period. We use external information and static analysis to validate the (approximate) MC and DC found by Macocha. Through our case study, we show the existence and usefulness of these novel change patterns to ease software maintenance and, potentially, reduce related costs.
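As an illustration only, the use of the Hamming distance to detect approximate MC and DC can be sketched as follows. The sketch assumes that each file's history has already been encoded as a bit vector with one bit per change period (1 if the file changed in that period). The function names, the `tolerance` and `max_shift` parameters, and the restriction to the second file lagging the first are hypothetical choices made for this example, not details taken from Macocha.

```python
from typing import List, Optional, Tuple

def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions at which two equal-length bit vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))

def classify(a: List[int], b: List[int],
             max_shift: int = 2,
             tolerance: int = 1) -> Tuple[Optional[str], Optional[int]]:
    """Label a pair of change profiles (assumed encoding: one bit per period).

    Returns ("MC", 0) if the profiles agree up to `tolerance` differing
    periods, ("DC", k) if b matches a delayed by k periods (1 <= k <= max_shift),
    and (None, None) otherwise. Only the case where b lags a is checked.
    """
    if hamming(a, b) <= tolerance:
        return ("MC", 0)
    for k in range(1, max_shift + 1):
        # Compare only the overlapping window: a[i] against b[i + k].
        if hamming(a[:-k], b[k:]) <= tolerance:
            return ("DC", k)
    return (None, None)

# Hypothetical profiles over eight change periods.
file_a = [1, 0, 1, 0, 0, 1, 0, 0]
file_b = [1, 0, 1, 0, 0, 0, 0, 0]  # differs from file_a in one period
file_c = [0, 1, 0, 1, 0, 0, 1, 0]  # file_a delayed by one period
file_d = [0, 0, 0, 1, 1, 0, 0, 1]  # unrelated profile

print(classify(file_a, file_b))
print(classify(file_a, file_c))
print(classify(file_a, file_d))
```

With a tolerance of zero, only exact occurrences are reported; raising the tolerance admits approximate ones at the risk of more false positives.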
Copyright © 2013 John Wiley & Sons, Ltd.