Understanding the quality and evolution of Android app build systems

Build systems are used to transform static source code into executable software. They play a crucial role in modern software development and maintenance. As such, much research effort has been invested in understanding the quality and evolution of build systems, including Apache ANT, Apache Maven, and Make‐based ones. However, the quality and evolution of build systems for mobile apps, such as those on the Android platform, have not yet been investigated in detail. Mobile app development, and the Android development context in particular, imposes unique constraints, such as different device conditions and capabilities. It also presents unique challenges, such as frequently upgraded Android frameworks, which those who implement and maintain build systems must tackle. In this paper, we present an exploratory empirical study of the build systems of 5222 Android projects to better understand their quality and evolution. We (a) study the build technology choices that Android developers make (Gradle being the recommended and most popular choice), (b) explore the sustainability of the official Gradle build system (some parts of build files are updated more frequently than others, and updating the special Gradle plugin can induce unrecommended configurations), and (c) analyze the quality of Gradle scripts for Android apps: more than half of the open‐source Android apps cannot be successfully built, due to five common root causes.


| INTRODUCTION
Build systems (the systems responsible for transforming source code into executable software artifacts) are one of the most important technical artifacts that support the development of modern software. A typical software application is composed of many modules, containing multiple source files, which rely upon a complex layer of third-party libraries. These artifacts must be assembled carefully in order to produce a valid deliverable. It is not uncommon for build systems to orchestrate the invocation of hundreds of order-dependent commands.
Since their introduction in the 1970s, 1 build systems have become a key part of the software development process. 2,3 Build systems are at the heart of modern development approaches, such as continuous integration. In these rapid integration processes, an internal or third-party service (e.g., Circle CI 4) verifies that the build process still applies cleanly to changes to the codebase as they are produced by the development team. Modern rapid release strategies extend the continuous integration process to delivery/deployment, (semi-)automatically producing new releases when changes to the codebase pass a set of quality gates (e.g., code compiles, tests pass). As argued by McIntosh et al, 5 a fast and correct build system is critical. 6-9 Without a correct build system, CI/CD processes would not be reliable enough to foster trust from the development team, leading to miscommunication and/or unacceptable releases.
Due to their importance, build systems have attracted the attention of the research community. Adams et al., 10 Zadok, 11 and Nadi and Holt 2 set out to understand how make-based build systems (co-)evolve with C and C++ codebases. McIntosh et al 5 studied similar phenomena in Apache ANT and Maven build systems concerning the automation of Java source code. These studies have highlighted that a better understanding of build systems will (1) allow project managers (especially for aging projects) to allocate appropriate personnel and resources to perform system maintenance tasks effectively and (2) reduce build maintenance overhead on regular development activities.
While much has been discovered about the build systems of traditional applications, 12,13 little is known about the evolution of build systems for mobile applications. Nowadays, Android is the most popular mobile platform, with over 4000 versions of different devices, over 2.5 billion monthly active users, and almost three million distinct Android apps published on the official Google Play Store. While Android applications can be written in a Java-like language, their event-driven development model is quite different from that of a typical Java application.
In this work, we set out to better understand how Android build systems evolve. To do so, we perform a comprehensive empirical study of 5222 Android applications from the AndroZooOpen corpus. 14 Our empirical study mainly answers three research questions: (1) What are the build technology choices that Android app developers make? (2) How do app developers work with the current Gradle build system during their development? (3) How many open-source Android projects provide valid Gradle scripts to successfully build deliverables?
The key contributions of this work include:
• Build technology choices: We carried out an extensive study of 5222 Android open-source projects and found four build technologies in common use: Apache ANT, Apache Maven, Eclipse ADT, and Gradle. Among these, Android developers predominantly adopt Gradle, the default build technology of the Android SDK (95.12% = 4967/5222). We found that developers who adopted other build systems earlier have often since migrated their legacy build systems to Gradle. Of the 130 repositories in which we detected a build system migration, 118 changed their build system to Gradle. Our manual investigation further reveals that, in many cases, updating the fundamental build technique is mandatory because historical build techniques may no longer be supported (building with them could yield library-not-found errors).
• Sustainability of Gradle build system: We analyzed the evolution of build systems and found that once the build files are well configured, some parts of the build scripts are changed more frequently than others, especially the Android-related ones. Furthermore, because the Gradle plugin itself may evolve (e.g., certain configuration keywords such as compile, apk, or provided may be deprecated), the underlying Gradle build scripts need to be aligned with such updates. Otherwise, the build scripts cannot be executed to generate installable APKs.
• Quality of Gradle scripts in open-source Android apps: We manually summarized the root causes of build failures of open-source Android projects. Among 4774 projects containing a Gradle Wrapper script, only 1495 (or 31.32%) can be built automatically. From 100 randomly sampled repositories that failed to build, we manually checked the build failure reasons and summarized five key root causes: (1) Source Code Error, (2) Configuration Error, (3) Resource File Missing, (4) Library Not Available, and (5) Native Development Kit (NDK) Error. These root causes conform to a set of categories previously established by Hassan et al 12 and will help developers avoid such unnecessary problems.
This is the first large-scale analysis of the evolution of build systems in Android app development so far. We detail the updates of Gradle build system scripts and investigate the rationale behind the updates in the automated build of Android APKs, which would benefit both Android app developers and build system maintainers.
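The alignment problem noted in the sustainability contribution can be illustrated with a minimal Python sketch that flags dependency declarations still using the configuration keywords compile, apk, or provided. The suggested replacements (implementation, runtimeOnly, compileOnly) are an assumption based on the Android Gradle plugin migration guidance, and the function name and line-based matching are hypothetical simplifications, not part of the tooling used in this study.

```python
import re

# Deprecated configuration keywords named in the text, mapped to assumed
# replacements (an assumption based on Android Gradle plugin guidance).
DEPRECATED_CONFIGURATIONS = {
    "compile": "implementation",
    "apk": "runtimeOnly",
    "provided": "compileOnly",
}

# Match a keyword at the start of a line, followed by whitespace or "(",
# so that identifiers such as compileSdkVersion are not flagged.
_DECLARATION = re.compile(r"^\s*(compile|apk|provided)(?=[\s(])")


def find_deprecated_configurations(script_text):
    """Return (line_number, keyword, suggested_replacement) tuples."""
    findings = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        match = _DECLARATION.match(line)
        if match:
            keyword = match.group(1)
            findings.append((number, keyword, DEPRECATED_CONFIGURATIONS[keyword]))
    return findings
```

A line-based scan of this kind would miss declarations split across lines or generated programmatically, so it should be read as a rough indicator only.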
The rest of this paper is organized as follows. In Section 2, we present the common build systems and our research questions. In Section 3, we introduce our dataset selection. Sections 4-6 address the three research questions described in Section 2, respectively. Section 7 discusses threats to validity. We then discuss the potential usefulness of our findings for practitioners and researchers in Section 8 and related work in Section 9. Finally, we conclude our work in Section 10.
Our source code and datasets are all made publicly available in our artifact package. 15

| ANDROID BUILD SYSTEMS
Android applications are complicated Java-based systems with many files, directories, and inter-relationships. Building these apps requires sophisticated build tool support for developers. We outline some of the most commonly used build tools for Android applications below:
• Apache ANT, an acronym for Another Neat Tool, was created by James Duncan Davidson in 1999. 5,16 It originated from the Apache Tomcat project in early 2000 and was designed for automating software build processes due to a number of limitations with Unix's make system, which was the de facto standard automatic build tool among system programming languages, such as C/C++. Apache ANT is written in Java and provides a number of built-in tasks to compile, assemble, test, and run Java applications. 17,18
• Apache Maven, primarily designed for Java projects, was created by Jason van Zyl in 2002 as a subproject of Apache Turbine. 19 It is also a build automation tool and can be used to manage the build process, reporting, and documentation of projects. Unlike Apache ANT, it uses conventions to provide default behavior for the build procedures and also offers the ability to automatically manage third-party libraries for projects. 20 For example, the project botbrew-gui (entry jyio/botbrew-gui) documents the Maven installation commands used to build the repository.
• Eclipse ADT was the initial officially supported IDE for Android app development. ADT stands for Android Development Tools. It extends Eclipse with the capacity to create Android projects, add and manage packages used by the project based on the Android Framework API, debug the applications using the Android SDK (Software Development Kit), and export signed or unsigned APKs to distribute the final artifacts. 21 The plugin was developed by Google and designed to provide developers with a powerful integrated environment to develop Android applications. For example, the GitHub project DroidShows (entry ltGuillaume/DroidShows) uses Eclipse as its build system, and the project is still under maintenance. However, at the end of 2015, Google announced that ADT was deprecated and the IDE would be replaced by Android Studio.
• Gradle is an open-source build automation system that builds upon the concepts of Apache ANT and Apache Maven and provides a Groovy-based 22 and Kotlin-based 23 Domain-Specific Language (DSL). Building on Apache ANT and Apache Maven, Gradle uses tasks to represent atomic build activities, such as compiling the source code, packing a JAR file, and generating Javadoc. The dependencies between these tasks form a directed acyclic graph (DAG), which the Gradle tool uses to safely execute tasks in parallel to speed up build execution. 24 For example, the entry of the project Maxr1998/home-assistant-Android instructs users to generate the installable by executing the Gradle command.

| Gradle
Apache ANT and Apache Maven have been extensively investigated by fellow researchers, 5 and Eclipse ADT has been deprecated. Thus, here, we focus on the more recently introduced Gradle. A typical Android project, created by Android Studio with the default Gradle build system, contains several different configuration files, as shown in Figure 1.* 25 The structure of a Gradle configuration contains project-level and module-level build files. The root directory of an Android app project contains two project-level configuration files: build.gradle and settings.gradle. The settings.gradle file is used to determine the required modules for building the app, while the build.gradle file defines build configurations. 26 A configuration represents a group of artifacts and their dependencies.
In the build.gradle file, the buildscript block configures the repositories and dependencies. The repositories can be pre-defined remote repositories, which are common places for finding/sharing popular packages used by build systems (e.g., JCenter, Maven Central, and Ivy), any local repositories, or self-defined remote repositories. The dependencies declared in this block are searched for and downloaded from the specified list of repositories.
Unlike other build systems, these dependencies are consumed by Gradle itself, such as the dependency on the Android Gradle plugin that provides additional instructions for Gradle to build applications. The allprojects block in the build.gradle file defines the third-party repositories and dependencies, which can be consumed by all of the modules of the project. Unlike the buildscript block, this block is for the modules and sub-modules of the project being built by Gradle.
FIGURE 1 Typical directory structure of an Android application adopting the Gradle build system.
In addition to the project-level build file, module-level build files configure build settings for the specific modules in which they reside. The build settings include the basic configurations, like dependencies and build tasks for the module-specific build.
The Gradle configuration files named build.gradle are plain text and are generally written in the Groovy language. Groovy is a dynamic language for the Java Virtual Machine. The corresponding build file fulfilling the same function is build.gradle.kts, written in the Kotlin language. 27 Kotlin is the preferred programming language for Android application development, as announced by Google on May 7, 2019. Thus, a Kotlin-based configuration makes the configuration language the same as the development language, which reduces the learning time for Android application development. 28
In addition, the Gradle Wrapper (i.e., MyAppp/gradlew) is also located in the root directory. The Gradle Wrapper is the recommended way to execute any Gradle build. It is a script that invokes a declared version of Gradle and downloads Gradle beforehand if need be. The wrapper script also relies on two files, gradle-wrapper.jar and gradle-wrapper.properties. The gradle-wrapper.jar file is responsible for downloading the Gradle distribution, while the gradle-wrapper.properties file configures the wrapper runtime behavior. Gradle provides different wrappers for different operating systems, such as a shell script for Unix-like systems and a batch script for Windows. The wrapper allows developers to invoke Gradle without first (manually) installing it. Moreover, the script ensures build consistency by ensuring that the same version of Gradle (the one specified in the properties file) is used for each build.
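As an illustration of how the properties file pins the Gradle version, the following Python sketch extracts the declared version from the text of a gradle-wrapper.properties file. It assumes the version is encoded in the distributionUrl entry in the usual gradle-VERSION-bin.zip or gradle-VERSION-all.zip form; the function name is hypothetical.

```python
import re


def declared_gradle_version(properties_text):
    """Return the Gradle version named in distributionUrl, or None."""
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip comments and lines that are not key=value pairs.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "distributionUrl":
            # Properties files escape ":" as "\:"; undo that before matching.
            url = value.strip().replace("\\:", ":")
            match = re.search(r"gradle-(.+?)-(?:bin|all)\.zip", url)
            return match.group(1) if match else None
    return None
```

Reading the version from this file, rather than from a locally installed Gradle, mirrors the consistency guarantee described above: every build of the project resolves to the same declared version.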
The core implementation of the application is located in the directory main. It typically contains the directories java and res and the Android configuration file AndroidManifest.xml. The file with the exact name AndroidManifest.xml is mandatory; it lists the components of the app, the permission information, and the hardware and software requirements. The directory java contains the implemented Java and/or Kotlin files, while the directory res holds the necessary resource files, such as pre-loaded pictures and localization language support. 29

| Research questions
In this work, we want to answer the following three key research questions:
RQ1: Which build technologies are most prevalent in the Android ecosystem [technology choice]? By answering this research question, we expect to present the community with a clear picture of the predominant build technology choices in the Android ecosystem. Novice Android developers can, therefore, adopt an advanced and popular build system to facilitate their development. Developers using legacy build systems can also benefit from the active development community and plentiful resources of the popular build system if they are willing to migrate their legacy build system.
RQ2: How do app developers use the current Gradle build system in their app development [suitability]? Our preliminary investigation reveals that Gradle, the current official build system recommended by Google, is the most frequently adopted build system in the Android ecosystem. By focusing on the evolution of the Gradle build system and answering this research question, we can understand the impact of Gradle on Android app development and inform app developers about the maintenance of the build system.
RQ3: Can open-source Android app projects be successfully built with their provided build scripts [quality]? Open-source Android developers publish the core app source code alongside the build system on a public hosting site. Other developers should be able to build and test the app with the help of the provided build scripts so as to collaborate on new features, bug fixes, and so on.
We built our dataset by considering the Android projects that have published their source code on popular code hosting sites and whose installable APKs (built from the published source code) have been released on Google Play Store or F-Droid. We believe that the apps deployed on Google Play or F-Droid are more likely to involve build automation techniques than other apps, and the configurations related to such build techniques are also more likely to be correct. However, the majority of apps in AndroZooOpen cannot be identified on Google Play or F-Droid. That is because some of them are toy projects, which were developed by developers just for practice, while some others have been removed from Google Play Store (Google Play Store removes apps roughly once a quarter † ) because Google believes that they are of low quality. We inspected every project from AndroZooOpen and identified whether the project has been built and published on Google Play Store or F-Droid. Eventually, 5222 apps from AndroZooOpen, which are not only hosted on public hosting sites but also published on Google Play Store or F-Droid, were selected as the dataset for this study.
For our identified 5222 open-source Android app repositories, we used the officially provided REST API version 3 30 to acquire their basic repository information on GitHub, such as the creation timestamp, the last update date, the number of contributors, stars given by fellow developers, forks, and the creation of issues and pull requests. The representativeness of our dataset is further demonstrated by the popularity of the selected projects. In Figure 3A, over half of the projects receive 3.5 stars, one fork, three issues, and eight pull requests. For example, repository 31 received 60 stars and was forked 12 times at the time of our visit.
With respect to the contributors to the open-source projects, we first count the total number of contributors and then compute the number of active contributors, who periodically contribute most to the projects. Figure 3B shows that the selected projects are almost all implemented by a single contributor. We also compute the time interval for developers to close the created issues and pull requests. Figure 3C presents the results, indicating that developers take more than 19 and 3 h to close issues and pull requests, respectively. Liu et al 32
We compute the duration, in years, during which developers maintained the selected repositories on hosting sites (Figure 3D). Figure 3D shows that the median maintenance period among the selected Android repositories is 3 years. Therefore, we must take projects that are no longer maintained into consideration, as they represent the utilization of the advanced build systems of their time.
We requested the metadata of the selected apps on Google Play Store with the help of Google Play Scraper. 33 Figure 4 presents the number of installs and the rating scores of the selected apps. Figure 4A shows the number of installations of the selected apps. It is worth mentioning that Google Play Store shows the number of installations as a range rather than a specific number. 34 For example, installs 10+ indicates that the corresponding apps have been installed between 11 and 50 times by smartphone users. In our dataset, there are 758 apps with installs 10+.
Figure 4B indicates the scores given by app users and shown on the Play Store. It shows that the median score is 0 and the maximum is 5. It is notable that not all apps on the Play Store have a rating. § Some of the selected apps were uploaded recently but have gained much popularity among app users. With the installations and scores acquired from Google Play Store, we obtain a high-quality dataset that includes both newly introduced apps and fully fledged ones to conduct our study.
The availability of the built APKs and the maintenance period of the repositories demonstrate the necessity of an extensive analysis on a large-scale dataset including projects out of maintenance in the last 3 years.
FIGURE 2 The number of projects introduced and updated each year.
FIGURE 3 The basic attributes of the selected Android projects on hosting sites.
FIGURE 4 The basic attributes of the selected Android projects on app store.
In this section, we answer research question RQ1: which build technologies are most prevalent in the Android ecosystem? The Android app build system plays a key role in compiling source code and packaging it into APKs for further testing, deploying, signing, and distribution, so as to enable CI/CD. We conduct a preliminary analysis aimed at understanding the usage of different build systems by Android developers in historical and current Android apps.
The following sub-research questions are answered:
• RQ1.1: What are the build techniques recurrently adopted by Android apps?
Android apps integrate their app code together with third-party libraries and resources and may need to be built into different variants for different types of devices. Building apps manually is arduous and prone to errors. With the help of the build system, it is easier to reuse code, resources, and configurations. In addition, extending and customizing the build process specified in a build system is a more traceable and manageable task than manual app assembly. To begin our investigation of build systems for Android apps, we first set out to understand what build techniques have been used by developers in the Android ecosystem.
• RQ1.2: How often do Android app developers update their build practices?
There are plenty of build systems that can be used to specify build processes. Newer build technologies often offer more features, but there is a cost to migrate. 35 In our second subquestion, we are interested in understanding whether and how often Android app developers migrate between technologies. This knowledge will not only help new developers choose between different build technologies that facilitate their app build processes but also encourage maintainers to fully consider the value of migrating away from legacy build systems to state-of-the-art ones.
By sampling and manually analyzing projects of the dataset, we identify the following build technologies: Apache ANT, Apache Maven, Eclipse ADT, and Gradle. We now briefly describe how each is detected.
Given an open-source Android app project, we need to automatically identify which build technology is adopted by the project. In this work, we take a straightforward approach to achieve that. As shown in Table 1, each build technology requires specific files, which can be leveraged to form the classification criteria. For example, if a given project contains a build.gradle or a build.gradle.kts file, it follows that the build system leveraged by this project is Gradle. Additionally, for projects that do not contain any of the recognized files for the four build techniques or that contain recognized files for more than one build technology, we label the project as "None" or "Multiple," respectively. It is worth noting that the label "None" does not mean that a project was built without any build system. Rather, the developers of these projects were not willing to upload the build scripts to source code hosting sites, which prevents us from determining the build systems used in these projects.
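The classification criteria described above can be sketched in a few lines of Python. Only the Gradle marker files (build.gradle and build.gradle.kts) are taken from the text; the marker files listed for Apache Maven, Apache ANT, and Eclipse ADT are assumptions standing in for the entries of Table 1, and the function name is hypothetical.

```python
# Marker files per build technology. The Gradle entries come from the text;
# the other entries are assumed stand-ins for the contents of Table 1.
MARKER_FILES = {
    "Gradle": {"build.gradle", "build.gradle.kts"},
    "Maven": {"pom.xml"},
    "Ant": {"build.xml"},
    "Eclipse": {".classpath", ".project"},
}


def classify_build_technology(file_names):
    """Label a project snapshot by the marker files it contains."""
    names = set(file_names)
    matched = sorted(
        technology
        for technology, markers in MARKER_FILES.items()
        if names & markers
    )
    if not matched:
        return "None"
    if len(matched) > 1:
        return "Multiple"
    return matched[0]
```

For instance, a snapshot containing only build.gradle and settings.gradle is labeled "Gradle", while one containing both pom.xml and build.gradle.kts is labeled "Multiple".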
Based on the above approach for determining build technologies, we classify the latest version of all of the projects in our dataset and summarize the build technology choices in Figure 5. Unsurprisingly, the most popular build technology is Gradle (95.12% of projects), the default build system generated by Android Studio (the official Android IDE).

RQ1.1 Finding
The Apache ANT, Apache Maven, Eclipse ADT, and Gradle build technologies all could be adopted for building Android apps.
Among the four techniques, by far the dominant technology is Gradle, the officially supported technology of the Android development tooling.
TABLE 1 Key build system resources.
The majority of our selected projects adopt Gradle for their build system. However, this is only determined from the last update of every project in our dataset, which reflects their latest status. What is still unclear is how quickly the Android community has converged on its current status.
To investigate this, we inspected every historical commit via Git to obtain information about the evolution of the build system. With the help of Git, we analyze the entire commit history via the reflog command, and we reuse the classification criteria to identify possible migrations of the build technology.
Figure 6A summarizes our experimental results concerning the changes made in migrating from one build technology to another. We tag these as PriorTechnology_NextTechnology. For example, Eclipse_Gradle represents a migration from Eclipse ADT to Gradle. During the lifecycle of an open-source project, migration among build technologies may happen more than once. For example, the developers of hgdev-ch/toposuite-android migrated its build system twice, first from Eclipse ADT to Apache ANT, and then from Apache ANT to Gradle (i.e., Eclipse_None_Ant_Gradle).
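A tag of the form PriorTechnology_NextTechnology can be derived from the per-commit classification labels by collapsing consecutive repeats. The Python sketch below illustrates this under the assumption that the labels are available in chronological order; both function names are hypothetical.

```python
from itertools import groupby


def migration_tag(labels_per_commit):
    """Collapse consecutive duplicate labels and join them with "_"."""
    collapsed = [label for label, _ in groupby(labels_per_commit)]
    return "_".join(collapsed)


def has_migration(labels_per_commit):
    """A history shows a migration if more than one distinct run of labels exists."""
    return len([label for label, _ in groupby(labels_per_commit)]) > 1
```

Applied to the label sequence Eclipse, Eclipse, None, Ant, Ant, Gradle, the sketch yields the tag Eclipse_None_Ant_Gradle, matching the example above. A project that returns to an earlier technology simply produces a tag in which that technology appears twice.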
The most popular migration pattern is from the former officially supported build system, Eclipse ADT, to the current officially supported system, Gradle. We suspect that this is common because Google announced that applications can be implemented using Kotlin and can also be implemented in native code, such as C/C++, with the help of the Android NDK. It is not easy to implement apps using a combination of Java, Kotlin, and native code, especially without support from the build technology. Because Apache Maven and Apache ANT lack first-class support for Kotlin and C/C++, while Gradle provides greater flexibility and better performance, adding support for building apps that combine Java, Kotlin, and native code was likely easier with Gradle. In addition, Gradle, as the latest build system, with build features including build caching, compile avoidance, and an improved incremental Java compiler, is now 2-10× faster than Maven 36 for most Java projects.
FIGURE 5 The build system of Android repositories.
FIGURE 6 The migration of build systems.
Migrating to Android Studio with the Gradle build system from Eclipse ADT is also an actively supported path. 37 Eclipse ADT projects can be imported directly into Android Studio, and developers can complete a migration by responding to a series of prompts. An import summary text file is generated once the migration is complete. By reading the generated summary file, developers can better understand what has been changed in the project and manually correct anything that was updated incorrectly.
Documentation 38 also provides guidelines for developers migrating from Apache Maven to Gradle. This documentation lists the mapping of Maven build phases to Gradle tasks and provides instructions to incorporate plugins, dependencies, and default library repositories. It is not difficult to migrate to Gradle from Apache Maven because they share conventions, such as project structure and dependency management.
Our previous analysis reveals two projects that migrated from Apache Maven to Gradle directly. We manually analyzed these migrations and found that both followed the instructions in the documentation.
Similar to migrating from Apache Maven to Gradle, some projects migrated to Gradle from Apache ANT. However, it is difficult to migrate from Apache ANT to Gradle, as there is no standard Apache ANT build. Developers need to migrate to Gradle according to their specific Apache ANT build conventions. Fortunately, Gradle provides integration features with Apache ANT that can ease such migration. In general, there are two approaches to complete the migration. The first approach reuses the existing Apache ANT build as much as possible via hook methods provided by Gradle. The other approach migrates to an idiomatic Gradle build as much as possible, even though the core and custom Apache ANT tasks can still be used directly with Gradle hook methods. In our analysis, we find that all of the migrations from Apache ANT to Gradle tend toward a purely idiomatic Gradle build. That is because these Android projects do not rely on any indispensable Apache ANT tasks that would need to be kept via Gradle hook methods.
During our investigation, we also observed that 13 (or 0.25%) Android projects appear to contain multiple build systems. For example, project MPieter/Notification-Analyser adopted Gradle first and later added Maven to the project. Our manual investigation reveals that the Maven build system is included because of an externally developed library that was recently added to the project. Nonetheless, it is non-trivial to automatically distinguish between cases of migration from one technology to another and cases of coexisting build technologies. Furthermore, our investigation also reveals that some Android app projects have adopted the same build system repeatedly. For example, as revealed in the commit history of project Ifsttar/NoiseCapture, it first adopted Gradle, then migrated to Maven, only to later migrate back to Gradle.

RQ1.2 Finding
The build system adopted by app developers to automate the build process is not fixed from the beginning of the project. However, it is uncommon to replace one build technology with another, except for the migration officially recommended for Android.

| EVOLUTION STUDY (RQ2)
In this section, we answer research question RQ2: how do app developers work with the current Gradle build system? The build system of Android projects changes over time. We shift our focus to Gradle, as many selected projects updated their build system to Gradle, the officially supported system. Based on the build technologies used (cf. Section 4) and the structure of the Gradle build system (cf. Section 2), we need to answer two research sub-questions to analyze the evolution of the Android Gradle build system:
• RQ2.1: What has been changed in Gradle build files during the evolution of open-source Android apps?
The Gradle build system offers the ability to perform custom build configurations without modifying the source code in the project. This flexibility empowers developers to customize and automate multiple build configurations. Build types (providing certain properties for Gradle to build and pack into the final artifact), product flavors (determining product versions such as free or paid), and dependencies (managing remote repositories needed by developers' own local project development), along with some other aspects, are all configurable. This research question investigates what has been changed in Gradle build files during project development.
It is important for developers to know the necessity behind these changes because it can help both the project owner and other developers to have a fuller understanding of the updates to app projects. Thus, this research question focuses on investigating the rationale behind observed changes to build files over time.
To answer these research questions, we used several Python scripts to inspect every Git commit in the commit history of each selected project.
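As a rough illustration of this kind of inspection, the Python sketch below selects the commits that modify both a Gradle build file and Java source code. It is a hypothetical example rather than the scripts used in this study: it assumes the list of changed paths per commit has already been obtained (e.g., from the output of git diff --name-only between two consecutive commits), and the set of recognized Gradle file names is an assumption.

```python
from pathlib import PurePosixPath

# File names treated as Gradle build files (an assumed, non-exhaustive set).
GRADLE_BUILD_FILES = {
    "build.gradle",
    "build.gradle.kts",
    "settings.gradle",
    "settings.gradle.kts",
}


def is_gradle_build_file(path):
    """True if the path's final component is a recognized Gradle build file."""
    return PurePosixPath(path).name in GRADLE_BUILD_FILES


def is_java_source(path):
    """True if the path points to a Java source file."""
    return PurePosixPath(path).suffix == ".java"


def co_changing_commits(commits):
    """Return ids of commits that touch both build files and Java sources.

    commits: iterable of (commit_id, changed_paths) in chronological order.
    """
    selected = []
    for commit_id, changed_paths in commits:
        touches_build = any(is_gradle_build_file(p) for p in changed_paths)
        touches_java = any(is_java_source(p) for p in changed_paths)
        if touches_build and touches_java:
            selected.append(commit_id)
    return selected
```

Keeping the selection logic independent of how the paths are collected makes it easy to check on small hand-written histories before running it over full repositories.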

| RQ2.1: What has been changed in Gradle build files during the evolution of open-source Android apps?
A build system is used to automate and speed up the whole Android build process. Once the build file is finished, developers can leverage the convenience of the build system to focus on and speed up the development of functionality. Any change to the functionality or unit tests can be readily tested by running a few simple commands provided by the build system.

| Gradle script changes
We investigate what kinds of build files are changed during the evolution of Android apps. To do this, two authors, following the Android build specification [25] (cf. Section 2.1), manually analyzed 100 randomly selected app projects and classified the main and frequent kinds of changes made to the typical build files. After cross-validation between the two authors, the changes in the typical build files were grouped into four categories: classpath, dependency, Android, and task changes.
Figure 7 [39] shows an example of updating the classpath in the declared dependencies of the buildscript block; the plugin in this example counts the number of methods in an Android APK or AAR file at build time. In general, the classpath configuration is located in the buildscript section of project-level build files and specifies the dependencies of Gradle itself, such as Gradle plugins. Figure 8 shows an example of modifying the dependency information in the build.gradle file, which declares the third-party libraries used by the source code to generate the final artifacts.
By convention, dependencies can be specified both in project-level build files, where they can be used by any sub-module of the project, and in module-level build files, where they can only be used by the corresponding sub-module. Figure 9 illustrates an example of revising the version code in the Android block, which is specific to Android projects and defines the basic build settings, such as build types and product flavors, in the sub-module build file. The Android category is in fact a configuration block in module-level build files.
In addition to the version code, the Android block (the Android category in our classification) contains a wide range of configurations, including compileSdkVersion, buildToolsVersion, applicationId, minSdkVersion, targetSdkVersion, versionName, build types (release, debug), product flavors (free, paid), signing (configurations of keyAlias, keyPassword, etc.), code and resource shrinking (configurations of minifyEnabled, shrinkResources, and proguardFiles), and multidex, which is enabled when the total number of methods exceeds 65,536. We intentionally excluded the dependencies sub-block from the Android block and classified it as a separate category because it is updated more frequently than the other categories. Figure 10 presents an example of removing a task that developers had added to implement their own build step. In general, developers do not need to write custom tasks to complete their Android development. However, project-specific requirements can be satisfied with such tasks.

To gain a deeper understanding of the relationship between changed code files and changes to build files, we again selected 100 projects for better representativeness. For each project, we sorted the commits chronologically and obtained the diff between each pair of consecutive commits. We then randomly selected, for each project, one diff in which both Java source code and Gradle build files were changed. Among the 100 diffs from the 100 Android projects, only 39 contain direct clues that adding new dependencies and/or removing old ones led to Java source code updates made to adapt to the changed dependencies. The remaining 61 diffs do update both build files and Java source code, but we cannot directly conclude that the two are related. Moreover, 28 of these 61 diffs do not contain any dependency updates at all: the build file updates are limited to Android build configurations, such as updates of the SDK version, version name, and version code.
We further investigate the updates made to Gradle script files with respect to the four categories of changes during the evolution of the projects. Figure 11A and B illustrate the distribution of the number of commits and of the number of updated statements for the four categories, respectively. In terms of commit counts, classpath-related updates are committed less frequently than updates in the other three categories. In terms of the statements updated per commit, updates to the classpath and to tasks typically involve only a few statements, while developers typically change multiple statements when adjusting dependencies and the Android block. Although tasks are updated as frequently as dependencies and the Android block, each task update usually revises only a few statements. Developers therefore need to pay special attention to updates of the dependencies and the Android block of build files during and after the development of an Android application in order to better satisfy their users.
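The four change categories were assigned manually in the study. As a hedged approximation only, the Python sketch below labels each statement of a Groovy build file by the blocks that enclose it. The function names, the `other` label, and the simple brace tracking (one block opener or one closing brace per line) are assumptions of this sketch, so build files with denser formatting would need a real parser.

```python
import re

# A line such as "android {" or "task clean(type: Delete) {" opens a block;
# the first identifier on the line is taken as the block name.
BLOCK_OPEN = re.compile(r"^([A-Za-z_][\w.]*)\b.*\{\s*$")


def category(stack, line):
    """Pick one of the four categories from the enclosing block names."""
    if line.startswith("classpath"):
        return "classpath"
    if "dependencies" in stack:
        return "dependency"      # kept separate from the Android block
    if "task" in stack:
        return "task"
    if "android" in stack:
        return "android"
    return "other"


def classify_statements(gradle_text):
    """Return (statement, category) pairs for every non-blank statement line."""
    stack = []                   # names of the currently open blocks
    labels = []
    for raw in gradle_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue
        if line == "}":
            if stack:
                stack.pop()
            continue
        opened = BLOCK_OPEN.match(line)
        if opened:
            stack.append(opened.group(1))
            continue
        labels.append((line, category(stack, line)))
    return labels
```

Applying the same function to the added and removed lines of a diff would give per-commit counts per category, which is the kind of data summarized in Figure 11.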

RQ2.1 Finding
The updates of build files can be grouped into four categories based on where the modification occurs. The Android block and dependencies are modified more frequently and with more updated statements than the classpath and tasks. Developers should pay special attention to the frequently updated blocks, especially the Android-specific Android block, so that the app build makes use of newly released features of the build system and satisfies the requirements of developers, app users, app stores, and so on.

| RQ2.2: Why are build files changed?
Android Studio provides Gradle as its automated build system, but Gradle differs from other build systems in that it provides little automation by itself. The useful, detailed automation is implemented by plugins. By design, plugins add the necessary tasks (e.g., JavaCompile), domain objects (e.g., SourceSet), and conventions (e.g., the directory structure of the source code), and extend objects contributed by other plugins.

| Android plugins
When it comes to building Android apps, the Android Gradle plugin provides the Android-specific build configurations and tasks, such as buildTypes, productFlavors, and flavorDimensions. In practice, Gradle and the Android Gradle plugin can run independently of Android Studio, which makes it possible to build Android apps from the command line without Android Studio installed. In each Android repository, the build files specify the versions of Gradle and of the Android Gradle plugin. The version of the Android Gradle plugin resides in the root-level build file, while the version of Gradle is specified in gradle-wrapper.properties.
Using Git, we examine every commit state by moving to each specific commit with the Git command "git reset," extract the versions of Gradle and of the Android Gradle plugin, and compute their average update intervals. To compute the average update interval, we measure the time between every two consecutive commits in which the version changed and average these intervals for each Android app project. The distribution of the average update interval for the collected Android projects is shown in Figure 12, which shows that version changes of the Android Gradle plugin are slightly more frequent than those of Gradle. This suggests that the Android Gradle plugin is more crucial for building Android apps than Gradle itself, because the plugin implements the core build steps for Android apps.
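The version extraction and interval computation described above can be sketched as follows. This is an illustrative reading rather than the authors' implementation: the regular expressions, the function names, the use of Unix timestamps, and the choice to count the first commit that declares a version as the first "change" are all assumptions.

```python
import re

# e.g. distributionUrl=https\://services.gradle.org/distributions/gradle-4.1-all.zip
GRADLE_VERSION = re.compile(r"distributionUrl=.*gradle-([\d.]+)-(?:bin|all)\.zip")
# e.g. classpath 'com.android.tools.build:gradle:3.0.0'
PLUGIN_VERSION = re.compile(r"com\.android\.tools\.build:gradle:([\w.\-]+)")


def gradle_version(wrapper_properties_text):
    """Gradle version declared in gradle-wrapper.properties, or None."""
    match = GRADLE_VERSION.search(wrapper_properties_text)
    return match.group(1) if match else None


def plugin_version(root_build_gradle_text):
    """Android Gradle plugin version declared in the root build file, or None."""
    match = PLUGIN_VERSION.search(root_build_gradle_text)
    return match.group(1) if match else None


def average_update_interval_days(history):
    """history: chronologically ordered (unix_timestamp, version) pairs, one
    per commit. Returns the mean number of days between consecutive
    version-changing commits, or None if there are fewer than two of them."""
    change_times = []
    previous = None
    for timestamp, version in history:
        if version is not None and version != previous:
            change_times.append(timestamp)
            previous = version
    if len(change_times) < 2:
        return None
    gaps = [(later - earlier) / 86400
            for earlier, later in zip(change_times, change_times[1:])]
    return sum(gaps) / len(gaps)
```

Running `average_update_interval_days` once with Gradle versions and once with plugin versions for the same project yields the two values that are compared per project.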

| Android configuration
Upgrading the Android Gradle plugin can impact the build files of Android projects. One of the major kinds of change to build.gradle files concerns configurations, such as the classpath update presented in Figure 7. In the context of Gradle, a configuration refers to a set of artifacts and their dependencies. While software artifacts are being built, their dependencies serve a specific purpose in each build phase. For example, some dependencies are used when compiling source code, while others are only used at runtime. Gradle uses the term "configuration" to express the scope of a dependency under a unique name. With the release of Android Gradle plugin 3.0.0 in October 2017, the configurations compile, apk, and provided were deprecated. Therefore, we investigate to what extent the deprecated configurations exist in Android app projects.
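A minimal sketch of how such deprecated configurations could be located is shown below. It is a line-based heuristic with assumed names (`DECLARATION`, `deprecated_configurations`), not the detection logic used in the study. The replacement names in `REPLACEMENTS` are the ones given in the Android Gradle plugin migration guidance.

```python
import re

# Recommended replacements since Android Gradle plugin 3.0.0.
REPLACEMENTS = {
    "compile": "implementation (or api)",
    "apk": "runtimeOnly",
    "provided": "compileOnly",
}

# A declaration starts with the configuration name followed by a quoted
# coordinate, an opening parenthesis, or a helper call such as project(...).
# The word boundary keeps compileSdkVersion and compileOnly from matching.
DECLARATION = re.compile(r"^\s*(compile|apk|provided)\b\s*[\(\'\"a-zA-Z]")


def deprecated_configurations(build_gradle_text):
    """Return (line number, configuration name) for each deprecated declaration."""
    found = []
    for number, line in enumerate(build_gradle_text.splitlines(), start=1):
        match = DECLARATION.match(line)
        if match:
            found.append((number, match.group(1)))
    return found
```

Variant-specific names such as testCompile are deliberately not covered, since the text above only discusses compile, apk, and provided.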
As presented in Figure 13A, 1008 projects contained deprecated configurations in 2017. We infer that this might be caused by projects that migrated to Gradle in 2017, because we observe that the deprecated configurations of these projects (cf. the first column in Figure 13B) were changed to the recommended alternatives. From the end of 2018 to the end of 2022, the number of projects containing deprecated configurations decreases sharply year by year, and the number of projects whose deprecated configurations are updated decreases year by year as well. These results show that developers have tended to fix deprecated configurations in Android app projects in recent years, but a number of projects still contain unaddressed deprecated configurations.
FIGURE 12 The interval (in days) of Gradle and Gradle plugin updates.
In addition to the version updates of Gradle and the Gradle plugin, we also examine detailed Android configurations, namely those related to multiDex, minify, and signing. First, we detect whether the Android projects contain such build-specific configurations. We set every project to its latest commit at the time we downloaded it and detect whether such configurations are specified in the files build.gradle or build.gradle.kts. More specifically, if any of the configurations multiDexEnabled, multiDexKeepFile, or multiDexKeepProguard is specified, we conclude that a multiDex-related configuration is present. Similarly, the configurations minifyEnabled and proguardFiles indicate a minify-related configuration, and the configurations keyAlias, keyPassword, storeFile, and storePassword indicate a signing-related configuration. Among the total of 5222 projects, we detect 708, 2766, and 1402 projects with multiDex-, minify-, and signing-related configurations, respectively.
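The keyword-based detection described above can be sketched in a few lines of Python. The grouping of keys follows the text; the function names and the regular expressions are assumptions of this sketch, and it neither skips comments nor handles the Kotlin DSL spelling `isMinifyEnabled`.

```python
import re

# Configuration keys per group, as listed in the text.
CONFIG_KEYS = {
    "multiDex": ("multiDexEnabled", "multiDexKeepFile", "multiDexKeepProguard"),
    "minify": ("minifyEnabled", "proguardFiles"),
    "signing": ("keyAlias", "keyPassword", "storeFile", "storePassword"),
}


def detect_configurations(build_file_text):
    """Return the set of configuration groups mentioned in one build file."""
    found = set()
    for group, keys in CONFIG_KEYS.items():
        pattern = r"\b(?:" + "|".join(keys) + r")\b"
        if re.search(pattern, build_file_text):
            found.add(group)
    return found


def minify_enabled_values(build_file_text):
    """Return the literal values given to minifyEnabled, in file order."""
    return re.findall(
        r"\bminifyEnabled\s*(?:=\s*)?\(?\s*(true|false)\b", build_file_text)
```

Counting the projects for which `minify_enabled_values` contains `"true"` is one way to arrive at a figure comparable to the one reported in the next paragraph.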
Unexpectedly, only 17 projects set the configuration option minifyEnabled to true, while the others set it to false. This contradicts our expectation that the option would be set to true in order to obtain smaller executables. However, enabling minify has been reported to induce bugs [42][43]. Therefore, Android developers appear to set minifyEnabled to false intentionally to prevent rare potential non-functional bugs.
The configuration multiDexEnabled is easier to explain. The maximum number of methods that can be referenced in a single DEX file of an Android app is 65,536. Once the number of methods exceeds this limit, a build error message indicating that the app has reached the limit of the Android build architecture reminds developers to set the configuration to true. Once development and build are finished, developers also need to sign the final APKs with a certificate before releasing them and uploading them to an app store. However, the key used for signing is confidential, and developers need to protect such sensitive information carefully. Therefore, they may intentionally upload these projects with the sensitive information and signing-related configurations excluded.

RQ2.2 Finding
For the maintenance of Android app projects, the Gradle plugin (built on Gradle and providing substantial build tasks) is updated more frequently than Gradle itself. Upgrading the Gradle plugin in a project urges developers to replace deprecated configurations in order to benefit from the newly recommended ones. Even though these configurations were deprecated several years ago, some developers still use them. It is necessary to remind developers to abandon the deprecated configurations when updating the Gradle plugin so as to make full use of the newly released versions.

| BUILD SCRIPT QUALITY STUDY (RQ3)
In this section, we answer research question RQ3: can open-source Android app projects be successfully built with their provided build scripts? Collaborators, who provide new features and fix issues for open-source projects, need to be able to build projects automatically with the provided scripts. Having to construct their own, often complicated, build scripts would make this very difficult. In addition, in the mobile software engineering community, many state-of-the-art approaches for analyzing Android apps operate directly on Android APKs, as most Android apps are only released to the community as bytecode (e.g., via the Google Play Store). To use these approaches, analysts often first need to build open-source Android apps into APKs, and this often needs to be done automatically. Keeping this requirement in mind, we answer our last research question by checking the quality of the build scripts provided by open-source Android apps. If an Android application cannot be built successfully via the build system, we manually analyze it to determine the root causes. To build every project in the dataset, we wrote a script that invokes the Gradle Wrapper with the following simple command.

gradlew assembleDebug

Ideally, with this command, the build process completes automatically and smoothly, provided that the build environment is properly set up. By default, there are two build variants for every Android project: one in debug mode and the other in release mode. The release version is intended for app users, while the debug version is better suited to app developers and researchers, because building in debug mode exposes more potential issues than building in release mode, which they can then fix to improve the quality of these projects. We use the Gradle Wrapper (instead of Gradle itself) to build open-source Android apps because the Gradle Wrapper prepares the correct version of Gradle for building the project. Otherwise, we would need to install and set up the correct Gradle version (as specified by the project) ourselves. This process is tedious, as many Gradle versions have been released over the lifetime of Gradle. For example, the Android projects in our dataset, which were developed between 2009 and 2022, require Gradle versions ranging from as early as v1.0 to the latest v7.6.
Many open-source Android projects provide a Gradle Wrapper to ease the build process for developers. Among the 4974 projects considered in our dataset, 4774 (95.98%) contain Gradle Wrapper scripts. In this research question, we focus on those 4774 open-source Android app projects to evaluate the quality of their provided build scripts. To bound the experiment time, we further set a 10 min threshold for each build. If a build does not finish within 10 min, we consider it a failed case.
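The build procedure with its 10 min threshold can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `run_build`, the returned labels, and the POSIX-style `./gradlew` invocation are choices made for this example, not details taken from the study.

```python
import subprocess


def run_build(project_dir, command=("./gradlew", "assembleDebug"), timeout_s=600):
    """Run one build in project_dir and return (status, log), where status is
    'success', 'failure', or 'timeout'."""
    try:
        completed = subprocess.run(
            list(command), cwd=project_dir,
            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
            text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The study counts builds exceeding the threshold as failed cases;
        # the separate label only keeps the reason visible.
        return "timeout", ""
    except OSError as error:
        # For example, the wrapper script is missing or not executable.
        return "failure", str(error)
    status = "success" if completed.returncode == 0 else "failure"
    return status, completed.stdout
```

Keeping the captured log is what makes the later manual root-cause analysis possible. Note that `subprocess.run` only terminates the process it started on timeout, so a long-lived Gradle daemon may survive; passing Gradle's `--no-daemon` flag is one way to avoid that.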

| RQ3: Can open-source Android app projects be successfully built with their provided build scripts?
Among the 4774 projects containing a Gradle Wrapper script, only 1495 can be built automatically, giving a success rate of 31.32%. The fact that more than half of the considered projects (68.68%) cannot be successfully built reveals a serious problem for the community. It calls into question the quality of either the app projects (poorly maintained, with problematic code that cannot be compiled) or the apps' build scripts, which are not properly configured and hence fail to produce the final APK.
Hassan et al. [12] conducted a similar study of the automatic build of Java projects with the build systems Apache ANT, Apache Maven, and Gradle on 200 Java projects downloaded from GitHub. They found that 91 of them could not be built successfully and summarized three types of root causes for these failures. Moreover, they attempted to automatically repair these three types of failures and successfully fixed 65 Java projects. As in the existing literature, Android projects also suffer from dependency problems. Unlike traditional Java projects with traditional build systems, however, building an Android project requires many additional steps, such as the NDK build, code shrinking, and signing, which can induce build problems that cannot be fixed easily.
To understand why so many of these Android projects cannot be built, we manually analyze and summarize the root causes behind the large number of failures. To do this, we manually inspect the build logs of the projects that cannot be successfully built. Because it would be very time-consuming to go through all the failed projects, we randomly selected 100 projects for this purpose, as done previously. Table 2 summarizes the five main categories of build failure causes we identified. Note that the total number of projects in Table 2 is 95, as the root causes of the remaining five projects could not be clearly identified. We discuss these separately.
The first column presents the error type, while the second column shows the number of projects categorized into that type. The last column provides a concrete example discovered during our manual observation process. In total, five well-defined types of root causes are observed: (1) Source Code Error, (2) Configuration Error, (3) Resource File Missing, (4) Library Not Available, and (5) NDK Error. We now detail these five types in turn.
TABLE 2 Root causes of the build failure.

status of projects hosted on GitHub. However, the results could still be valid to some extent for the Android community, as a large-scale dataset is adopted. In addition, we set the build threshold to 10 min to reduce time costs. Consequently, the number of successfully built projects is influenced by this artificial setting.

| Internal validity
The major threat to the internal validity of our study concerns the build system detection and the build failures of Android projects. App developers push their projects to GitHub with a .gitignore file. They may intentionally filter out build-system-specific files, such as .classpath and .project, or accidentally forget to delete such files. However, the number of Android projects classified as None and Multiple is quite small. The experiment can therefore still reflect the real status of open-source Android projects.

| Construct validity
The major threat to the construct validity of our study lies in possible errors in the implementation of our experimental scripts and tools. To determine the build techniques of Android projects, we implemented Python scripts to check whether the specific files in Table 1 exist or not. However, a few projects include the source code of other open-source libraries, which also contain their own build files. To mitigate this threat, we have carefully reviewed the toolchains and manually validated partial experimental results against selected benchmarks. However, we cannot exclude all the third-party libraries in the projects. We plan to mitigate this in the future.

| Conclusion validity
The primary threat to the conclusion validity of our work lies in the size of the sampled dataset, including the number of sampled projects used to analyze the relationship between updates of Java source code and build files and the number of sampled projects used to reveal the root causes of auto-build failures. To mitigate this, we resort to the well-known Sample Size Calculator with a confidence level of 95% and a margin of error of 10% to compute the sample size and select the projects randomly. In addition, a threat also lies in our manual work, such as manually summarizing the root causes of failed Android projects and manually determining the relationship between build files and Java source code. To mitigate this, we cross-validated the manually analyzed results drawn by two different authors.
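For orientation, the sample size implied by these settings can be reproduced with a short calculation. The sketch below assumes that the calculator applies Cochran's formula with a finite population correction and the conservative proportion p = 0.5; the paper does not state which formula the tool uses, and the function name is hypothetical.

```python
import math


def sample_size(population, z=1.96, margin=0.10, proportion=0.5):
    """Cochran's sample size with finite population correction.

    z = 1.96 corresponds to a 95% confidence level; proportion = 0.5 is the
    most conservative choice (assumption of this sketch)."""
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))
```

Under these assumptions, the uncorrected value is about 96, and the corrected value for a population of a few thousand projects (for example, the 5222 projects of the dataset) is slightly lower, so a sample of 100 projects satisfies the stated settings.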

| DISCUSSION
We discuss the key potential implications of this work for both practitioners and researchers.

| On the need of supporting automated migration of build techniques
As revealed in the findings of RQ1, build techniques are continuously evolving, and Android app developers continually update their projects to align with the latest build techniques (e.g., from Apache ANT to Apache Maven and from Eclipse ADT to Gradle). Nevertheless, to the best of our knowledge, our community has not proposed automated tools to help developers achieve this. Often, developers need to change the build technique manually. Such a process can be time-consuming, as developers may have no prior experience with the new build technique (and hence may need time to resolve the various issues raised in the process). Furthermore, there is a need not only to migrate between build techniques but also to update the same build technique to its latest versions. For example, when manually inspecting the failed builds, we found some legacy projects using Gradle versions below 1.2, which use HTTP to download Gradle distributions. However, Gradle services have only responded to HTTPS requests since January 2020; requests made over HTTP are denied. In order to build these legacy projects, we need to upgrade the Gradle version to at least 2.

| On the need of protecting sensitive information while achieving successful auto-build
As revealed in the findings of RQ3, missing resource files are the dominant root cause of auto-build failures. However, some of the missing files are intentionally filtered out by the app developers because they contain sensitive information, such as the keystore.properties file. To release these projects on GitHub while guaranteeing a successful auto-build for other developers and researchers, we argue that new techniques are needed that let developers protect such sensitive information when uploading to GitHub while still achieving a successful build anywhere.

| On the need of advanced code shrinking techniques
To shrink the final build of Android projects, app developers need to set minifyEnabled to true. However, doing so can induce many potential bugs, such as exceptions in Android minify [41] and minify-related bugs reported on GitHub [42]. Therefore, some developers intentionally set minifyEnabled to false to avoid these potential build issues and obtain a smooth build. To provide a better build experience and achieve smaller final installables, more advanced techniques are urgently needed by the Android community.

| On the need of supporting continuous delivery
The fact that many open-source Android projects cannot be successfully built impedes the research progress of Android researchers. Some of the projects are no longer maintained because their developers have shifted their focus to other projects. To remind developers to update their projects in time, the continuous delivery feature provided by the hosting site GitHub could be utilized to build the projects periodically and inform the author once an error occurs during the build process. With the help of this feature, developers could fix build errors in a timely manner and provide consistent deliveries in both binary and source form. We, therefore, argue that there is a strong need to adopt continuous integration and delivery in the development process of Android apps.

| Better tools to do error analysis
We developed our automatic build error repair tool based on the manual analysis of the 100 randomly selected open-source projects. In order to fix as many different build errors as possible, a manual summarization of the root causes of the different build errors is necessary. However, this is time-consuming and painstaking for researchers. To relieve the burden on researchers working with open-source Android projects, better static analysis tools are indispensable. Such tools could be used to detect potential build errors not only in the Java source code of the project per se but also in the build system, such as the removal of a library from the library repositories (e.g., Maven and Ivy). Thus, we believe that an automatic tool for detecting the root causes of build failures is necessary. By applying such root cause detection tools, different root causes can be considered when we refine our build error repair artifact.

| Different projects investigation
Build systems leveraged by Android projects are not unique to Android app development. They can also be used to build projects written in other languages, such as C++ and Java. Investigating build failures in projects other than Android applications can also help researchers gain a better understanding of build systems and achieve a more comprehensive analysis.

| RELATED WORK
Build systems provide the ability to transform source code into final executable artifacts, easing the whole development process. Because of the importance of build systems, a large body of research has investigated their effectiveness and validity. We now summarize the key works related to our research topic in the area of build systems.

| Java projects
Many studies focus on the build systems used for Java projects because of the popularity of the Java programming language. McIntosh et al. [5] discuss the details of Java build systems, especially Apache ANT and Maven, and argue that updating the build system always requires changes to the project's source code and that the build system evolves both statically and dynamically with regard to size and complexity. Hassan et al. [12] study how Java projects can be automatically built with three popular build systems: Apache ANT, Apache Maven, and Gradle. They downloaded the 200 Java projects with the highest number of stars on GitHub and found that 91 cannot be built automatically. They also summarized and categorized the root causes of these build failures. Macho et al. [44,45] focused only on build changes in Apache Maven build files. They summarized build change types and found that version changes and dependency changes occur more frequently than other types and that build changes are not equally distributed over the projects' timeline. Shridhar et al. [46] analyzed 13 Eclipse projects and five Apache projects. They summarized six different build change categories (adaptive, corrective, perfective, preventative, new functionality, and reflective) and concluded that corrective, adaptive, and (to some extent) new functionality changes are the most common and induce the largest churn and invasiveness in the build system. We, however, focus on the newer build system Gradle, which can also be used to build Java and Android projects but has different build semantics compared with Apache ANT and Apache Maven.

| Other projects
Suvorov et al. [47] analyze the build system migration of the K Desktop Environment (KDE) and the Linux kernel and find that build system migration follows a spiral model. They also summarize four challenges that software engineers must tackle while performing a build system migration. Gligoric et al. [48] focus on the migration of build systems and provide an automatic approach to migrate any build system to the Microsoft cloud-based build system. They implemented their approach in a tool named METAMORPHOSIS, which reduces the size of the synthesized scripts by up to 46%. Kumfert and Epperly [49] study how large the hidden labor overhead spent on the build infrastructure is. In their survey, developers spend on average 11.91% of their development time on build maintenance, with a maximum of 35.71%. Robles et al. [50] focus on the relationships and differences between the core source code of a project and other source artifacts, including interface specifications, internationalization, and localization modules, which developers use as input to produce the final deliverable. Zaidman et al. [51] study the co-evolution of the core source code of a project and its tests and introduce views of the change history, the growth history, and the test quality evolution, which they apply to two open-source systems to make this co-evolution visible to developers and managers alike. Gall et al. [52] concentrate on large systems and examine the system's building blocks, such as modules, to reveal obvious or hidden logical dependencies and change patterns among these modules. Our work, however, concentrates on upgrades during the evolution of Gradle-based Android projects.

| CONCLUSION
Build systems are significant in software construction due to the complexity of modern software. Focusing on the build systems of open-source Android applications, we have extensively studied the evolution of the build systems leveraged by Android applications, especially the relatively newer build system Gradle. We found that four different auto-build technologies are used in open-source Android projects: Apache ANT, Apache Maven, Eclipse ADT, and Gradle. By far the most popular one is Gradle, the default build system recommended by Google when

Figure 2B presents the distribution of the number of projects based on their latest update. The fact that the majority of apps have been updated in recent years (at least 90.54% [4728/5222] were updated within 3 years) shows that most of our selected projects are still under active development and hence should provide a reasonable lens through which we can understand the status quo and the evolution of Android build systems.

FIGURE 7 Example of updating the classpath taken from entry cgeo/cgeo.

FIGURE 8 Example of updating the dependency taken from entry liaoheng/BingWallpaper.

FIGURE 9 Example of updating the Android taken from entry HabitRPG/habitica-android.

FIGURE 10 Example of updating the task taken from entry mrf345/online-wallpapers.

FIGURE 11 Updates with respect to four update types.

5.2 | RQ2.2: Why are build files changed?
2.1. To free developers from handling such a time-consuming process, we argue that there is a need to invent promising approaches to automatically migrate build techniques in given open-source Android app projects.

8.1.2 | The necessity of automating version upgrade of the third-party libraries
RQ2 unveiled the necessity of upgrading the versions of third-party libraries. The evolution of third-party libraries, such as bug fixes and feature enhancements, pushes app developers to upgrade to the latest version in order to take full advantage of the new releases. For example, newly added APIs ease the development of app functionality, and bug fixes improve the robustness of apps. To speed up version upgrades, automatic approaches to version updating are necessary.

8.1.3 | Automated fixing of build errors
The fact that only 31.32% of open-source Android apps can be automatically built into APKs shows that many open-source apps cannot directly benefit from existing state-of-the-art app vetting tools to mitigate potential quality or security issues. To mitigate this, we argue that there is a need to propose automated repair tools to the community that help users directly fix build errors, so that projects that cannot be built in the first place can be built successfully.

8.1.4 | On the need of protecting sensitive information while achieving successful auto-build