High‐resolution national land use scenarios under a shrinking population in Japan

In sharp contrast with the global trend in population growth, certain developed countries are expected to experience rapid national population declines. Considering future land use scenarios that include depopulation is necessary to evaluate changes in ecosystem services that affect human well‐being and to facilitate comprehensive strategies for balancing rural and urban development. In this study, we applied a population‐projection‐assimilated predictive land use modeling (PPAP‐LM) approach, in which a spatially explicit population projection was incorporated as a predictor in a land use model. To analyze the effects of future population distributions on land use, we developed models for five land use types and generated projections for two scenarios (centralization and decentralization) under a shrinking population in Japan during 2015–2050. Our results suggested that population centralization promotes the compaction of built‐up areas and the expansion of forest and wastelands, while population decentralization contributes to the maintenance of a mixture of forest and cultivated land.


few centuries, and this trend is expected to continue (Pereira, Navarro, & Martins, 2012). In sharp contrast to the global trend in population growth, certain developed countries are expected to experience rapid population declines at the national level (United Nations, Department of Economic and Social Affairs, Population Division, 2015). Such historic changes in human population dynamics may affect land use patterns in both rural and urban areas, resulting in changes in the functioning of the ecosystem, which is the source of many goods and services that contribute to the economy and human welfare (Foley et al., 2005; Lambin & Geist, 2010).
In the last few decades, rural areas have suffered from severe population declines combined with rural-to-urban migration. These patterns were stimulated by an economic shift from agriculture to industry, increasing job opportunities in the cities. Additionally, the recent expansion of large-scale modern agriculture has caused a concentration of agricultural activities in areas with favorable conditions and the abandonment of unfavorable sites (Aide & Grau, 2004; Mottet, Ladet, Coque, & Gibon, 2006). Land abandonment is coupled with population declines in marginal agricultural landscapes, especially in mountain areas (Aide & Grau, 2004; Rey Benayas, Martins, Nicolau, & Schulz, 2007). Land abandonment may have both negative and positive consequences. It may decrease landscape heterogeneity, increase fire frequency, and result in soil erosion, a loss of biodiversity, and a loss of cultural and aesthetic value (MacDonald et al., 2000; Rey Benayas et al., 2007). However, abandoned land can provide a new opportunity to restore self-sustaining ecosystems, providing various kinds of ecosystem services, if appropriate management for rewilding is conducted.
In urban areas, population decline can lead to the hollowing of urban landscapes. Vacant landscapes may embody the unfavorable legacies of past uses and threaten the health, safety and quality of life of residents (Nassauer & Raskin, 2014). Unmanaged vacant lands in urban and peri-urban landscapes are a source of increasing mobility costs, energy expenditure, and emissions of CO₂ and air pollutants. These areas may potentially be reclaimed as green spaces, which supply ecosystem services. For example, green spaces in urban and peri-urban landscapes can reduce disaster risks (Liu, Chen, & Peng, 2014) and mitigate the heat island effect (Davis, Jung, Pijanowski, & Minor, 2016; Norton et al., 2015; Stone, Hess, & Frumkin, 2010). In other words, population decline may create opportunities for a transition toward more sustainable urban forms. There are two dominant and contradictory theories about sustainable urban forms: the "compact city" and the "dispersed city" (Geschke, James, Bennet, & Nimmo, 2018; Holden & Norland, 2005). The "compact city" is an urban planning concept that promotes high residential density and lowers the energy requirements for housing and daily travel (Ariga & Matsuhashi, 2012; Holden & Norland, 2005; OECD, 2012). However, such high-density development may have an inverse relationship with livability (Neuman, 2005). In contrast, a "dispersed city" suggests a relatively open urban structure, where buildings, fields, and other green areas form a mosaic-like pattern. In both cases, the restructuring of urban forms may affect human well-being and sustainability under future population declines by enhancing different ecosystem services. For example, in compact cities, large green spaces can be set aside, without human disturbance, thereby promoting the sequestration of carbon in the soil and contributing to the mitigation of global climate change (Collas, Green, Ross, Wastell, & Balmford, 2017).
Dispersed cities may enhance the recreational use of urban green spaces (Soga et al., 2015), contributing to human health (Shanahan et al., 2016).
Future scenarios of land use change coinciding with depopulation are valuable for developing strategies to mitigate the negative effects of land abandonment and to exploit the emerging opportunity for land use restructuring to improve human well-being in both rural and urban areas. Spatially explicit population projection has played an important role in studies of the environment and sustainability (Bengtsson, Shen, & Oki, 2006), and land use scenarios under a shrinking population should be consistent with a high-resolution population projection. A population-projection-assimilated predictive land use modeling (PPAP-LM) approach, in which a spatially explicit population projection is incorporated as a predictor in land use models, can offer a powerful framework for drawing future land use scenarios in the context of changing populations. However, previous studies have focused on a regional scale (Manson, 2006; Thorn et al., 2017), and no studies, to our knowledge, have developed and compared national-scale future land use scenarios associated with population projections (but see Alig, Kline, & Lichtenstein, 2004, which simulated change in U.S. urban land).
Diverse land use change models have been developed to analyze interactions between driving factors and land use changes and to predict future land use changes at various scales (Dietzel & Clarke, 2007;Lin, Chu, Wu, & Verburg, 2011;Verburg & Overmars, 2009;Verburg, Ritsema van Eck, de Nijs, Dijst, & Schot, 2004). Most existing land use models use process-based approaches (Dietzel & Clarke, 2007;Verburg et al., 2004) and have been developed for countries with population growth expressed by a simple Markov process (Dietzel & Clarke, 2007;Verburg et al., 2004). Some have explicitly incorporated the relationship between land use and demographic dynamics, especially in countries with population growth (Li et al., 2015;Luo, Xing, Wu, Zhang, & Chen, 2018). However, population "shrinking" involves more complex processes and is not always explained by a simple mechanism (but see Verburg & Overmars, 2009). A machine-learning approach (Castella, Kam, Quang, Verburg, & Hoanh, 2007;Faleiro, Machado, & Loyola, 2013;Li & Yeh, 2002;Verburg et al., 2002) is a powerful alternative tool that can be used to develop a national-scale predictive land use model under depopulation. This approach can accommodate complex and spatially heterogeneous relationships between population distributions and land use patterns, and models can be selected with high predictive ability (Lin et al., 2011;Verburg et al., 2002). Moreover, this approach can be used to identify complex processes from spatial patterns in land use changes under a "shrinking" population that cannot be expressed by simple mechanisms.
In the 21st century, Japan has become one of the most depopulating countries in the world. Even before the detection of these problems at a national scale, Japan experienced a rapid population decline and land abandonment regionally after the late 1950s to 1960s. In rural areas, traditional agricultural landscapes have been altered by land abandonment and agricultural intensification, which is a major threat to biodiversity (Fukamachi, Oku, & Nakashizuka, 2001). Urban areas in Japan have experienced both sprawl to the suburbs and the hollowing of old residential areas. To promote urban renovation to develop compact cities, the Japanese government es-  (Tanaka, Iwamoto, & Nishina, 2014). In such a situation, social consensus about the desirable population and land use gradients in rural and urban areas is not established.
In this study, we developed land use change scenarios for 2015-2050 under a shrinking population in Japan by applying the PPAP-LM and evaluated the effect of population distribution on future land use structure. We applied the model to two feasible spatially explicit population scenarios: centralization and decentralization (Ariga & Matsuhashi, 2012). We compared the consequences of different patterns of population distribution on future land use structure. We made the projected high-resolution land use available online (https://doi.org/10.17605/OSF.IO/A9QVY). It can be used to predict the effect of land use changes on biodiversity and ecosystem services.

| METHODS
PPAP-LM is a machine-learning approach used to project a high-resolution land use change trajectory corresponding to a population change scenario. It consists of the following three phases: pre-processing of data, construction of land use models, and projection of future scenarios (Figure 1).

| Study area
The study area was Japan, an archipelago consisting of 6,852 islands with a total land area of 377,972 km².

| Data description
Data were based on 30″ lat. × 40″ long. ca. 1-km grids (Tertiary Mesh Units, TMUs: 367,705 units) covering the whole Japanese archipelago because most statistical information prepared by Japanese administrative agencies is recorded at this scale. TMUs enable analyses of different statistical surveys without being constrained by differences in survey area. A summary of explanatory variables, variable names, and data sources used for the PPAP-LM approach is provided in Table 1.

| Longitudinal land use data at a national scale
The "Land Utilization Tertiary Mesh Data" in the National Land Numerical Information (National Land Information Division, National Spatial Planning and Regional Policy Bureau, Ministry of Land, Infrastructure, Transport and Tourism of Japan, 2015) for 1976, 1987, 1991, 1997, 2006, 2009, and 2014 were used as longitudinal land use data. Land cover classifications were based on satellite images compiled in TMUs. Each record included areas (m²) of 12 land cover types in a TMU calculated from a 100-m resolution land use map. Since the definition of some land use categories varied among survey years, these classifications were integrated into eight common categories comparable among years: paddy field, forest, wasteland, built-up area, trunk transportation land (including roads and railways), waterbody (including beaches), golf courses, other agricultural land (including fields and orchards), and other artificial land. Other artificial land was defined as artificial land use, except for agricultural land and built-up areas, such as green space in urban areas, athletic fields, open space on factory sites, schools, and port areas.
FIGURE 1 Schematic diagram of the analysis process using the PPAP-LM approach
To apply a PPAP-LM approach to the land use change model, a set of land use data and population data for the same time periods is required. Therefore, longitudinal land use maps with irregular intervals (1976, 1987, 1991, 1997, 2006, 2009, and 2014) were transformed to maps for the years with available population data (i.e. 1980, 1985, 1990, 1995, 2000, 2005, and 2010). Inverse time-weighted interpolation was applied to time-series data, which is analogous to inverse distance-weighted interpolation applied to spatial data (Figure 1). The calculation process was as follows. Let û(t, s, v) be an interpolated value for land use type v at TMU cell s in year t based on observed samples u(t_i, s, v) in years t_i (i = 1, 2, …, N). It was calculated as follows:

$$\hat{u}(t, s, v) = \frac{\sum_{i=1}^{N} w_i(t)\, u(t_i, s, v)}{\sum_{i=1}^{N} w_i(t)},$$

where

$$w_i(t) = \frac{1}{|t - t_i|^{p}},$$

as defined by Shepard (1968). A value of p = 2 was used as the power parameter to avoid overweighting the nearest sample and to generate a smooth curve of û(t, s, v) around t_i. Through this interpolation procedure, the weighted average of the areas of the eight land use categories for each grid and each year was obtained.
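The interpolation step can be illustrated with a minimal Python sketch. It assumes Shepard-type weights 1/|t − t_i|^p summed over all survey years; the function name `inverse_time_weighted` and the forest areas are hypothetical and are not taken from the study data.

```python
def inverse_time_weighted(target_year, survey_years, values, p=2):
    """Interpolate a value for target_year from observations at survey_years.

    Weights are 1 / |target_year - survey_year| ** p (Shepard-type weights).
    """
    weights = []
    for year, value in zip(survey_years, values):
        gap = abs(target_year - year)
        if gap == 0:
            # The target coincides with a survey year: return the observation.
            return value
        weights.append(1.0 / gap ** p)
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


# Survey years of the land use maps and census years of the population data.
survey_years = [1976, 1987, 1991, 1997, 2006, 2009, 2014]
census_years = [1980, 1985, 1990, 1995, 2000, 2005, 2010]

# Hypothetical forest area (m²) in one grid cell for each survey year.
forest_area = [620000.0, 600000.0, 590000.0, 585000.0,
               570000.0, 568000.0, 560000.0]

interpolated = {
    year: inverse_time_weighted(year, survey_years, forest_area)
    for year in census_years
}
```

Because the weights are positive and normalized, each interpolated value lies between the smallest and largest observed values for that cell.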

| Population data at a national scale
Grid-based national population census data for 1980-2010 were used (Statistics Bureau, Ministry of Internal Affairs and Communications, 1985, 1990; National Land Information Division, National Spatial Planning and Regional Policy Bureau, Ministry of Land, Infrastructure, Transport and Tourism of Japan, 2017).
The census was conducted every 5 years, and the number of residents was aggregated in the TMUs. Additionally, population data were aggregated at three distance scales, 1 (original data), 5 and 10 km, and used as explanatory variables. Although the total population in Japan increased during this period, at a local scale many rural and urban areas had already experienced a population decline. A mixture of both population trends in the training data captures the characteristics of land use changes under both a shrinking and a growing population. Moreover, population changes can have a lagged effect on land use (Veldkamp & Fresco, 1996). Therefore, populations at three distance scales in a previous time step were used as explanatory variables.
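How lagged, multi-scale population predictors might be assembled is sketched below. The text does not state how the 5- and 10-km aggregates were computed, so the square moving-window sum, the window radii, and the names `window_sum` and `lagged_population_features` are all assumptions made for illustration.

```python
def window_sum(grid, radius):
    """Sum each cell with its neighbours inside a square window.

    radius=0 returns the original values; radius=2 covers a 5 x 5 block of
    cells (roughly 5 km on a ca. 1-km grid). Windows are clipped at the edges.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += grid[rr][cc]
            out[r][c] = total
    return out


def lagged_population_features(pop_by_year, year, step=5, radii=(0, 2, 5)):
    """Build predictors for `year` from the population one census step earlier."""
    previous = pop_by_year[year - step]
    return {f"pop_r{radius}": window_sum(previous, radius) for radius in radii}


# Hypothetical 3 x 3 population grids for two census years.
pop_by_year = {
    1980: [[10.0, 20.0, 0.0], [5.0, 100.0, 15.0], [0.0, 30.0, 0.0]],
    1985: [[8.0, 22.0, 0.0], [4.0, 110.0, 14.0], [0.0, 28.0, 0.0]],
}
features_1985 = lagged_population_features(pop_by_year, 1985, radii=(0, 1))
```

Here the predictors for 1985 are derived only from the 1980 grid, which mirrors the use of the previous time step described above.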

| Topographical factors
The land surface of Japan is covered by mountains, and landforms are steep and rugged. Plains cover only 29% of the land surface and are located along the seacoast. The population is heavily concentrated in these limited flat areas. Therefore, topographical factors are important determinants of the spatial distribution of land use patterns. Information from the Japan Engineering Geomorphologic Classification Map (Wakamatsu, Kubo, Matsuoka, Hasegawa, & Sugiura, 2005) was used to determine geographic and topographical features of each tertiary mesh: average elevation (m), average slope increment (tangent θ), relative relief (m), and geomorphological class. The original 20 geomorphological classes were reclassified into four categories: lowland, diluvial upland, mountain, and volcano.

| Spatial factors
Unmeasured variables, such as climatic factors or historical events, can result in spatial autocorrelation in the prediction error (Legendre, 1993). Therefore, spatial variables were included in the models to reveal geographical trends in land use to mitigate spatial autocorrelation (Márquez, Real, Olivero, & Estrada, 2011;Nakao et al., 2014).
The longitude and latitude of each tertiary mesh were included as spatial variables in the models.

| Legacy effect of excessively artificialized land use
Excessively artificialized land use may leave legacies on future land uses, such as infrastructure and contaminants, when areas become vacant (Nassauer & Raskin, 2014). Therefore, to express the legacy effect of excessively artificialized land use, the sum of the built-up area and other artificial land in the previous time step was included as an explanatory variable.

| Construction of predictive land use models
Classification and Regression Trees [CART] (Breiman, Friedman, Olshen, & Stone, 1984) were used to construct predictive land use models for five major land use types: paddy field, forest and wasteland, built-up area, other agricultural land, and other artificial land (Figure 1). This method has been applied in various fields, such as medical diagnosis, meteorology, plant physiology, soil sciences, and wildlife management, and has recently been used to successfully model land use change (Du, Shin, Yuan, & Managi, 2018). CART models have various advantages: they are relatively immune to multicollinearity, robust to non-normal distributions of variables, and capable of determining complex interactions among explanatory variables, without specifying them a priori (Breiman et al., 1984).
Additionally, these analyses are easy to interpret because they provide a hierarchical view of relationships among variables, allowing the identification of close correlations (De'ath & Fabricius, 2000).
The CART model uses binary recursive partitioning, and the resulting models can be represented as binary trees. The tree is built by first including all observations together in the root node. Then, each explanatory variable is assessed in turn to determine the optimal split that divides observations into left and right descendant nodes.
The fundamental idea is to select each split so that the data in each of the descendant subsets are "purer" than the data in the parent subset. Purity was calculated based on the Gini index. The optimal split is the one that yields the maximum reduction in the average impurity of the two (left and right) nodes. Tree construction continues until the number of cases assigned to each leaf is small or the leaf is sufficiently homogeneous.
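The split criterion can be made concrete with a small Python sketch for a single explanatory variable and categorical labels. It assumes that the average impurity of the two descendant nodes is weighted by node size, which is the usual convention; the slope values and class labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(xs, labels):
    """Return (threshold, impurity_reduction) for the best split x <= threshold."""
    parent = gini(labels)
    n = len(labels)
    best = (None, 0.0)
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [lab for x, lab in zip(xs, labels) if x <= threshold]
        right = [lab for x, lab in zip(xs, labels) if x > threshold]
        # Size-weighted average impurity of the two descendant nodes.
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        reduction = parent - weighted
        if reduction > best[1]:
            best = (threshold, reduction)
    return best


# Hypothetical slopes (tangent) and dominant land use class of six cells.
slopes = [0.01, 0.02, 0.03, 0.20, 0.25, 0.30]
classes = ["paddy", "paddy", "paddy", "forest", "forest", "forest"]
threshold, reduction = best_split(slopes, classes)
```

In this toy example the two classes separate perfectly between slopes of 0.03 and 0.20, so the selected split removes all of the parent node's impurity.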
However, a maximally grown tree is usually over-fitted to training data (Therneau & Atkinson, 2018). Therefore, a computational step to constrain the tree to its best size is required to avoid the problem of over-fitting. A common approach in tree-based techniques is to freely allow the maximum growth process and then prune the overgrown branches of the tree.
In this study, 90% of the whole dataset was sampled as the training dataset to construct the model, and 10% was retained as test data to evaluate the model. Then, the initial tree was built using the training dataset, allowed to attain the maximum size, and pruned by 10-fold cross-validation. In the cross-validation process, the training dataset was further divided into 10 parts; each subset was removed in turn and used as a test sample for predictions based on the remaining 90% of the training dataset. The average error rate over the 10 folds was plotted against tree size for each value of the complexity parameter, and the optimal complexity parameter was taken as that of the smallest tree with an error rate within one standard error (SE) of the minimum. Using this complexity parameter, the tree with optimal complexity was obtained (Venables & Ripley, 2002).
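The one-standard-error rule used for pruning can be sketched as follows. The table layout (complexity parameter, number of leaves, cross-validated error, and its standard error) and all numbers are hypothetical; the sketch only illustrates the selection rule and is not the software used in the study.

```python
def one_se_rule(cp_table):
    """Pick the smallest tree whose cross-validated error is within 1 SE of the minimum.

    Each row is (cp, n_leaves, xerror, xstd).
    """
    best = min(cp_table, key=lambda row: row[2])
    threshold = best[2] + best[3]  # minimum error plus one standard error
    candidates = [row for row in cp_table if row[2] <= threshold]
    return min(candidates, key=lambda row: row[1])


# Hypothetical cross-validation results for a sequence of nested trees.
cp_table = [
    (0.1000, 2, 0.800, 0.03),
    (0.0500, 4, 0.550, 0.03),
    (0.0100, 9, 0.415, 0.02),
    (0.0050, 15, 0.410, 0.02),
    (0.0010, 40, 0.400, 0.02),
    (0.0001, 120, 0.430, 0.02),
]
chosen = one_se_rule(cp_table)
```

With these numbers the minimum error (0.400) belongs to the 40-leaf tree, but the 9-leaf tree lies within one SE of it (0.415 ≤ 0.420) and is therefore preferred as the simpler model.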
The final model with optimal complexity was applied to the test data. The following four frequently used measures of model accuracy were evaluated: Pearson's r, mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE). Further, to evaluate the effect of "error propagation", a simulation was conducted for 1985-2010 in which the predicted value for the previous time step, rather than the observed value, was used as the input for the current time step, and the predictions were compared with real data.
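For reference, the four accuracy measures can be computed as in the following Python sketch. The function name `accuracy_measures` and the example proportions are hypothetical and serve only to show the standard definitions.

```python
import math


def accuracy_measures(observed, predicted):
    """Return Pearson's r, MAE, MSE, and RMSE for paired observations."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)

    return {"r": r, "MAE": mae, "MSE": mse, "RMSE": rmse}


# Hypothetical observed and predicted proportions of one land use type.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.20, 0.30, 0.40, 0.50]
scores = accuracy_measures(observed, predicted)
```

The example shows why r alone is not sufficient: a constant bias of 0.10 leaves the correlation at 1 while MAE and RMSE both equal 0.10.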

| Projection to future scenarios
Two future population distribution scenarios were considered: a "centralization" scenario and a "decentraliza-  2008). Technical details of constructing population scenarios were presented in Ariga and Matsuhashi (2012). The datasets are available in the Kankyō Tenbōdai section of the National Institute for Environmental Studies website (http://tenbou.nies.go.jp/). Our land use-population model was projected using these two population scenarios, and the projections were compared among land use models.

| Example for two municipalities with different socioeconomic environments
Two municipalities were selected as examples: Toyota city and Joetsu city. Although both of these municipalities are aiming to become compact cities, they differ in socioeconomic conditions. Toyota city is located in the center of Aichi Prefecture in central Honshu. While this city is known as the "Motor City" of Japan, because it is a world leader in the automotive industry, it has rich forests that account for about 70% of the city's total area. The popu-

| Accuracy of the model
The accuracy of our model differed according to land use type. The model predicted the proportions of five land use types in test data with a high degree of accuracy (Table 2). Models for built-up areas, forests and wastelands, paddy fields, and other agricultural lands showed relatively high accuracies (Pearson's r > 0.80, Table 2). However, the accuracy values for other artificial lands were relatively low (Pearson's r = 0.793, Table 2). The result was robust to changes in the proportion of training data (Table S1). The simulation for 1985-2010 using predicted values from the previous time step as inputs for the current time step also showed high accuracy (Table S2), although the error was higher compared with that of the result using real data.

| Important variables for each land use type
The proportion of paddy field was most strongly determined by topographical factors. Among the topographical factors, paddy field was affected by geological features, occupying a large proportion of lowland and diluvial upland areas (Figure S1). The remaining variables were, in decreasing order of importance, population, spatial factors, and legacy effects (Figure 2). The proportion of paddy field increased slightly as the population in the surrounding area increased and decreased as the proportion of excessively artificialized land use in the previous time step increased (Figure S1).
Similarly, the proportion of other agricultural land was most strongly determined by topographical factors in total. Among topographical factors, other agricultural land was affected by slope, and occupied a larger proportion of land in areas with a lower slope. The proportion of other agricultural land increased slightly as the population in the surrounding area increased and decreased as the proportion of excessively artificialized land use in the previous 5 years increased (Figure S1).
Additionally, the proportion of forest was mostly determined by topographic factors, followed by population, historical, and spatial factors (Figure 2). The proportion of forest was high in mountain and volcano areas. The proportion of forest decreased as the population increased and as the proportion of excessively artificialized land use in the previous 5 years increased (Figure S1).
The proportion of wastelands was determined by both spatial and topographical factors, followed by population and historical factors (Figure S1). The proportion of wastelands was large at high latitudes, in low-lying areas, and in high-elevation areas.
The proportions of built-up area and other artificial land were most strongly determined by population, followed by legacy effects (Figure 2). Topographical and spatial factors had almost no effect on these two land use types (Figure 2). Built-up area increased as population increased and also increased as the proportion of excessively artificialized land use in the previous 5 years increased. The proportion of other artificial land decreased as the population increased but increased as the proportion of excessively artificialized land use in the previous 5 years increased (Figure S1).

| Future projection of land use change
We projected our model results using two different scenarios for the population distribution: a centralization scenario and a decentralization scenario. Paddy field and other agricultural land were predicted to decrease over time until 2050. Both land use types decreased to a greater degree in the centralization scenario than in the decentralization scenario (Figures 3a and d). In the centralization scenario, the total area of paddy field decreased When we evaluate the change in land use at a more local scale, we can see the difference in the spatial distribution of each land use type between the two population scenarios. In Toyota city, where the total population is predicted to remain constant, built-up area expanded, especially in the decentralization scenario ( Figure 4).
Simultaneously, the proportion of paddy fields surrounding the densely inhabited district declined. Forest and wasteland decreased in the grid cells where built-up area expanded. These trends were also obvious in the decentralization scenario. In Joetsu city, however, paddy fields decreased not only in the area surrounding the densely inhabited district, but also in the area dominated by forests, especially in the centralization scenario (Figure 5). In these areas, forest and wasteland expanded in the centralization scenario, while land use types related to human activity (such as paddy fields, other agricultural land, and built-up area) remained steady (or even increased) in the decentralization scenario.

| DISCUSSION
In this study, we applied a machine-learning method to construct a predictive land use model and applied it to two distinct scenarios for future spatially explicit population projections: centralization and decentralization scenarios.
Our model successfully predicted land use changes in 1985-2010 in Japan with high performance (Table 2), indicating the effectiveness of the machine-learning method for predicting land use change. Land use models usually combine remote sensing data, socioeconomic data, and other parameters to obtain high accuracy (e.g. Li et al., 2015). However, our PPAP-LM approach enabled us to obtain high accuracy with comparatively simple variables.
This might be attributable to the use of the CART model based on if-then rules, which can capture qualitative aspects of human knowledge and reasoning processes underlying decisions contributing to land use change (Sadok et al., 2009).
The total areas of paddy field and other agricultural land were predicted to decrease over time, especially in the centralization scenario (Figures 3a and d). In Japan, ownership transfer of agricultural land is restricted by the Agricultural Land Act, and efforts aimed at the relocation and accumulation of agricultural land are insufficient, although the area of agricultural land per farmer is small. Therefore, if a farmer leaves and becomes an absentee landowner, their lands tend to become abandoned (Kubo, 2011;Suginaka, 2005). These specific problems in the Japanese land system may promote land abandonment at a large scale, especially in rural areas. In these areas, promoting further migration from rural to urban areas (the centralization scenario) may cause a rapid decrease in land use related to agricultural activities in rural areas. Conversely, promoting backward migration from urban to rural areas (the decentralization scenario) may prevent the decline in land use related to agricultural activities in rural areas.  Figure S1). Generally, agricultural land and forest in urban fringes are exposed to severe development pressure and tend to be converted into non-agricultural land use (Nakahara & Hoshino, 2006).
Previously (during population growth periods), built-up area increased as the population increased in urban areas, and this increment was associated with a decrease in other land use types in surrounding areas. When the population starts to decrease, a "legacy" effect of previous artificial land use may remain; "rewilding" may not occur naturally in areas that experienced intense artificial modification.
Although the total population was the same in the two scenarios, the total built-up area became smaller in the centralization scenario than in the decentralization scenario (Figure 3c). This result may reflect the efficient use of built-up area by a concentrated population in the centralization scenario, which supports the longstanding compact city theory (Holden & Norland, 2005). Therefore, the centralization scenario may be more effective for the prevention of the loss of agricultural land and forest due to sprawl in the fringe regions of urban areas, such as Toyota city (Doygun, 2009; Yeh & Li, 1999). On the other hand, in agricultural areas, such as Joetsu city, centralization may also accelerate the abandonment of agricultural land (Figures 5a, b, and d). In this situation, promoting the expansion of agricultural land per farmer might be necessary to maintain agricultural productivity and ensure future food security.
Our model did not consider the effects of future economic and climatic changes. Incorporating our land use model with a top-down approach that accounts for economic change (e.g. Hasegawa, Fujimori, Ito, Takahashi, & Masui, 2017) may be useful for forecasting land use change under a shrinking population in the future.

| CONCLUSIONS
Previous studies related to population distribution have focused on social aspects, such as the cost of public transportation or the amount of CO₂ emissions. In this study, we examined an additional aspect of the effect of population distribution: effects on land use change. Our results illustrated that the spatial distribution of a population may affect various land use types, not limited to urban areas. During a population decline, the proportions of each land use type showed complex responses according to the "legacy" of excessively artificialized land. Population centralization may promote the expansion of forest areas and enhance carbon storage, while population decentralization may contribute to the maintenance of the cultural landscape with a mixture of forest and agricultural land use. Our results can be used to analyze the effect of future population distributions via land use changes, such as effects on wildlife or ecosystem services. Further studies are required to reveal the effects of changes in population distribution via land use change, which can provide a basis for comprehensive planning to balance rural and urban development.