Destination Choices of Chinese Rural–Urban Migrant Workers: Jobs, Amenities, and Local Spillovers

Using the 2014 China Migrants Dynamic Survey, we analyze rural–urban migrant workers' destination choices after the global financial crisis, with an emphasis on jobs, amenities, and local spillovers. By using an equilibrium-sorting model, this paper disentangles local spillovers from local attributes in the estimation process. We employ both an artificial instrumental variable and the provincial highway passenger flow in 1979 to tackle the endogeneity issue. After controlling for the network effects of migrants from the same origin, we find a separate and strong preference for colocating with a large population of migrants, regardless of origin. The results remain robust when we take into account labor supply-driven migration, spatial autocorrelation between provinces, different industry definitions, and regional differences within provinces. Our results imply that due to institutional barriers, the rural-migrant community will still be a very important factor in the foreseeable future. In addition, as the ongoing industrial upgrading and transfer policies in China may lead to a westward movement of rural–urban migrants, the movement will be expedited when the older, less educated, or lower income migrants relocate.

migrating to the eastern region shows a continuous decline of approximately 8%. In Figure 1b, we show that rural-urban migration is increasingly an intraprovince phenomenon, and the rate of growth is slightly higher in the middle and western regions. In short, workers are migrating more to nearby destinations after the global financial crisis (GFC). That is, the migration flow is changing from a predominantly inland-coast direction to both inland-inland and coast-coast.
What is causing the spatial redistribution of rural-urban migration flows? Studying the new migration pattern is of great importance, as migration is closely related to the spatial allocation of human capital and to local economic development. Understanding rural-urban migrant sorting patterns is also helpful to local migrant population service centers, enabling them to design more efficient policies. 4 This paper sets out to understand what factors are driving the phenomenon of westward and short-distance migration. We study the destination choice of rural-urban migrant workers in China, with an emphasis on comparing the roles of jobs, amenities, and, in particular, local spillover effects.
The local spillovers in this paper refer to the channels through which the presence of a large rural-migrant inflow might affect location choice separately from the presence of migrants from one's own province. We conjecture that the presence of a large rural-migrant inflow is a favorable local attribute for future potential migrants, because it facilitates labor market pooling and serves to limit the effects of various forms of discrimination faced by rural hukou holders in cities. Although a large migrant population is likely associated with a deterioration in housing quality, because rural migrants are competing for limited cheap housing units on the rental market, the costs might be smaller than the aforementioned benefits. 5
We analyze the destination choices of 58,595 rural-urban migrant workers, using the 2014 China Migrants Dynamic Survey. A retrospective question on the timing of migration allows us to study the sorting patterns of migrants between 2008 and 2014, the critical transition period after the GFC for China. Our main conclusions are as follows. For province characteristics, the pull effects of jobs and local spillovers are more important than those of amenities. Employment in secondary industries contributes substantially to a province's attractiveness for potential migrant workers, whereas the presence of a large rural-migrant inflow increases the province's attractiveness by a similar magnitude. Good medical resources turn out to be attractive for rural migrants, but we do not find such a pull effect for educational resources. Only higher income workers place greater weights on medical and educational resources when choosing destinations. For individual heterogeneous preferences, the preference for colocation with other migrants, regardless of origin, is especially strong among the older, less educated, and lower income groups. This might reflect the migrants' high demand for social support from peers.
Our results imply that perhaps, due to institutional barriers faced by rural hukou holders, the presence of a large migrant community will likely be a very important factor affecting location choice for the foreseeable future. In addition, the trend in China has been for manufacturing firms to relocate more labor-intensive plants to the middle and western regions in response to ongoing industrial upgrading and transfer policies. Our results indicate that the flow of rural migrants to these new employment opportunities will likely accelerate as spillover effects draw in still more rural migrants, in particular from the older, less educated, or lower income demographic groups.
We contribute to the literature on urbanization in China in two aspects. First, we examine the new westward and short-distance migration pattern after the GFC by using a large up-to-date micro data set of migrant workers. We study the relative importance of jobs, amenities, and local spillovers among rural migrants on location choice. This study helps to better understand the determinants of rural-urban migration, and to predict what future demographic patterns will be in response to changing urbanization or industrial policies in China. Second, we pay special attention to a critical but often neglected determinant of destination choice, local spillovers. In the previous literature, by simply observing people choosing to move to a specific place on a large scale, it is not clear whether the attractiveness of the destination can be attributed to certain location-specific features or to agglomeration effects and other externalities. Controlling only for location-specific fixed effects would mix these effects. Other papers examining migrant location choices in China have often used province or city-level fixed effects (see, e.g., Liu & Xu, 2017; Xia & Lu, 2015; Zhang & Zhao, 2013) in a discrete-choice framework. By using the equilibrium-sorting framework developed by Bayer and Timmins (2007), we are able to test for the impacts on location choice of various observed local attributes and recover estimates of local spillovers in the presence of unobserved characteristics.

4 In recent years, the central government began setting up many local branches specifically aimed at helping migrants in various aspects of life, such as settlement and social integration. See http://www.ldrk.org.cn/ for more details.
5 The situation is somewhat similar to international migration in developed countries. International migrants usually choose to live where there are others with the same ethnic background and a presence of other immigrant groups as well. Coethnic networks provide assistance in job information and initial settlement, whereas a high share of foreign-born population reflects that the area is possibly a more tolerant environment to foreigners (see, e.g., Åslund, 2005; Buckley, 1996; Zavodny, 1999).
The remainder of the paper is organized as follows. Section 2 develops the main factors examined in this paper. In Section 3, we explain the sorting model used for our empirical analyses and our identification strategy. Section 4 describes the data set and the variables, and Section 5 then presents and interprets the estimation results. Section 6 gives the results of different robustness checks, all of which are found to be in line with our main findings. The final section discusses potential policy implications and avenues for future research.

2 | JOBS, AMENITIES, AND LOCAL SPILLOVERS
This section reviews related studies, and presents our hypotheses with regard to jobs, amenities, and local spillovers. There are two types of studies on the migration decisions of rural-urban migrants in China. Earlier studies of rural-urban migration primarily focused on the individual decision to become a migrant (see, e.g., Liang & White, 1997; Wang & Zuo, 1999; Zhao, 1999; Zhao, 2005). The initiation of a migration decision is closely related to one's farmland size, rural enterprise development, local rural income, number of early migrants from the village, and so forth.
Several recent studies (Su, Tesfazion, & Zhao, 2018; Xing & Zhang, 2017; Zhang & Zhao, 2013) focus on a subsequent decision related to migration, that is, how rural migrants choose their destinations, and what their sorting patterns are. Economic improvement is usually the most important incentive, and it is found to be a positive factor in the rural migrant's destination choice in all previous studies. Migrants tend to move to areas with relatively high wages and low moving costs. According to Lin, Wang, and Zhao (2004), China's widening income disparity between its coast and inland regions, as well as urban and rural regions, is consistent with increased responsiveness of migration to regional income differentials. In more developed areas, there are more job opportunities as well as higher income levels. This pull effect has also been empirically tested for the entire population of migrant workers in China (Fu & Gabriel, 2012; Xia & Lu, 2015), but not specifically the rural-urban type. 6 Similarly, we conjecture that jobs and wages play an unambiguously positive role in attracting more rural migrants to cities.
In this paper, amenities refer specifically to public goods. It is worth mentioning that the central government in China has initiated reforms aiming to increase migrant participation in urban employee health insurance and pension programs in the 2008 Labor Contract Law and the 2011 Social Insurance Law (Giles, Meng, Xue, & Zhao, 2018). Although the percentage of rural-migrant workers enrolled in urban health insurance or pension programs is still low (30% in 2015), access to urban social insurance programs is open to migrant workers. In contrast, the situation is worse for educational resources. The right to attend urban public schools in most cities is still associated with hukou status. Public schools have limited quotas. Most migrant children in cities can only go to a special type of migrant school, where the teachers and managers usually also come from rural areas. These migrant schools began to emerge in the 1990s and quickly became the major venue for the education of migrant children (Chen & Feng, 2017; Lai et al., 2014). As medical resources and educational resources are two of the most important public goods, we conjecture that a typical migrant is likely to be concerned with them. 7

6 In the context of international migration, we refer to Wang, De Graaff, and Nijkamp (2018) for an exhaustive literature review of determinants for migrants' location choice.
Networks play significant roles in location choice. The previous literature has focused extensively on network effects of migrants from the same origin (see, e.g., Liang, Li, & Morooka, 2017; Rozelle, Guo, Shen, Hughart, & Giles, 1999; Zhao, 2003), but has ignored network effects of other migrants from different origins. The central feature here is that the payoffs from choosing a location partly depend on the number of other rural migrants choosing the same location in equilibrium, independent of their origins. Therefore, our key variable of interest, that is, local spillovers, is measured by the share of individuals choosing a specific location among all rural migrants. In the following three paragraphs, we explain possible channels through which the presence of a large rural-migrant inflow might affect location choice.
As the system of household registration in China causes the migrant population to be segregated from the urban host population, rural-migrant workers are treated differently from their urban counterparts in terms of occupational attainment and wages, and most of them, indeed, experience some form of discrimination (Démurger, Gurgand, Li, & Yue, 2009; Kuang & Liu, 2012; Meng & Zhang, 2001). In fact, the rural migrants' self-identities are at least two-dimensional: (a) the perception of the culture and language of one's place of origin and (b) the perception of being a rural migrant in cities under segregated economies of rural-urban division. On the one hand, social networks formed by migrants from the same place ("Lao Xiang" in Chinese) emerge as effective mechanisms in facilitating rural-urban migration in China (Zhao, 2003). On the other hand, migrants from other places of origin can also provide important networks to make the groups of rural migrants less marginalized. "Nong Min Gong" is what the whole population of rural-urban migrants is called. The nonlocal identity limits their free access to public services in cities and, to some extent, hurts their feelings (Chan, 2012; Chen, 2005). Hence, the presence of more rural migrants in the same destination would serve as a large community in which they might form social capital. 8 Social linkages and networks are more likely to be built among rural migrants themselves than between rural migrants and urban citizens, due to their similar backgrounds. These social interactions provide a pool of information about settlement, jobs, and many other aspects of life, specific to the rural hukou holders.
The way in which a destination chosen by more rural migrants becomes attractive to still more can also be attributed to labor market reasons. As rural migrants sort into a destination, the density of the labor market increases from the supply side. Simultaneously, firms locate in destinations with large migrant inflows and thus increase the province's attractiveness for future rural migrants. Marshallian externalities, referring to the concentration of same-industry production, contribute substantially to China's industrial agglomeration (Lu & Tao, 2009). For example, the Pearl River Delta economic zone developed its industries in clusters after the economic reform in China. It has now become a major global manufacturing base. The specialized agglomeration of firms and workers generates potential benefits through labor market pooling. Workers gain industry-specific skills via working, self-learning, and vocational training. There is a reliable demand for specialized workers. When they change jobs, they do not have to change their skill set or move to a new destination, so they have lower search costs and quicker matching processes.
Nevertheless, a negative effect of many rural migrants in the same destination could possibly arise due to the limited housing in cities. The suppliers of urban housing have overlooked the needs of the migrant population, despite its considerable size. As rural-migrant workers are excluded from the housing-distribution system, they rent low-cost dwelling units or live in enterprise dormitories (Song, 2014; Wu, 2004). Those affordable dwelling units are usually found in "villages in the city" ("Cheng Zhong Cun" in Chinese), located in both the outskirts and downtown segments of cities. Local farmers living in villages are allowed to construct housing units and rent them out. Many of these housing units are equipped with very poor facilities and situated in neighborhoods with inadequate infrastructure. The other source of housing, enterprise dormitories, is commonly found at construction and manufacturing sites. Dormitory-style housing provides smaller usable areas, making these housing options difficult for families to live in.

7 We also tried to include housing price as one of the important local attributes in an earlier version of this paper. We will discuss in detail in later paragraphs that migrant workers are only competing for low-cost dwelling units in cities. The average housing price publicly available cannot reflect the bottom portion of the rental market. As was shown in our earlier results, the marginal contribution to a province's attractiveness is insignificant.
8 The reference group for a given population is based on spatial proximity and other dimensions. For Chinese rural-urban migrant workers confronting different types of populations, they may refer to nonmigrants in the same village, a cohort from the same clan culture, natives in cities for comparison, or decision-making on other things (Akay, Bargain, & Zimmermann, 2012).
Based on the latest survey of Rural-Urban Migration in China 9, approximately 50% of rural migrants report that their employers provide accommodation, implying that the rest of the housing demand must be met in the rental market, most likely in these "villages in the city." Migrant workers are quite constrained in their housing budgets. According to Zheng, Long, Fan, and Gu (2009), on average, migrants are unwilling to spend more than 19% of their total income on housing. They prefer small dwelling units to save more money, even though the housing units are overcrowded and located in poorly served neighborhoods. In that case, more migrants sorting into one province means a continuing deterioration of living quality and thus a decrease in the destination's attractiveness for future rural migrants.
To sum up with regard to local spillovers, we conjecture that the presence of a large rural-migrant inflow is a favorable local attribute for future potential migrants, because it facilitates labor market pooling and serves as another large community to deal with different forms of discrimination in cities. Although it is likely to cause a deterioration of housing quality because rural migrants are competing for limited cheap housing units on the rental market, whether the cost might be smaller than the aforementioned benefits is an empirical question we seek to answer.

3 | EMPIRICAL MODEL
This section describes a sorting model for migration behaviors that can be used to assess the relative importance of jobs, amenities, and local spillovers in destination choice. It is a widely accepted assumption that one's locational preferences are expressed by physical migration, which is often called "voting with your feet" (Tiebout, 1956). The equilibrium-sorting model proposed by Bayer and Timmins (2007) (and a slightly different version in Bayer & Timmins, 2005; Bayer, McMillan, & Rueben, 2004) inherits this classic assumption and is the starting point for our methodological framework. We use it to examine migrant workers' heterogeneous preferences and the contribution of different province attributes to a location's attractiveness. Our empirical analyses rely on the differentiated-product discrete-choice approach, typically referred to as the Berry-Levinsohn-Pakes (BLP) method in the literature (Berry, Levinsohn, & Pakes, 1995). In addition to Bayer and Timmins (2007), the BLP method has been applied recently in a number of empirical studies of location choice (see, e.g., Levkovich & Rouwendal, 2014; Van Duijn & Rouwendal, 2013; Wang, De Graaff, & Nijkamp, 2016).
This model allows us to measure the size of local spillovers based on the location decisions of migrant workers in the presence of unobservable local attributes, employing the internal logic of the sorting model itself. The estimation process distinguishes the contribution of individual preferences from regional factors (including the unobserved attributes) in two steps. 10 Although the model has been developed to study the distribution of a whole population over a number of localities, it can also be applied to a specific group of individuals (Bayer & Timmins, 2007). In that case, local spillovers refer to self-segregating preferences among the individuals in that group. In this paper, we focus on the rural-urban migrant population in particular, and their sorting behaviors should be seen independently from those of the urban-urban migrant population. The latter group faces far fewer institutional barriers and differs substantially from the rural-urban migrant population in terms of educational level, income level, profession, and migration behavior.
The data set is a representative survey of migrant workers in China, with approximately 85% being rural-urban migrant workers. Our sample provides good spatial coverage with 58,595 observations and 26 provinces. The substantial variation across the locations that form the consideration set for each origin makes identification possible. One limitation of using this methodology in the context of China is that the individuals are a selected group of rural people who have decided to leave rural areas and move to cities. This means that this group might be more opportunity-seeking and may be more responsive to province attributes in making destination choices. Therefore, all coefficients should be interpreted as upper bounds relative to a less selected population. 11

9 The survey was initiated by the Australian National University. We calculate the percentage for new respondents who entered the survey in 2016.
10 In previous studies, only Xing and Zhang (2017) employ a similar idea in modeling the rural-urban migrants' location choice, although their focus is not on local spillovers among rural migrants themselves.

3.1 | Equilibrium-sorting model
We present the utility-maximizing location choice model in the context of rural to urban migration in China. A population of migrants, indexed by $m = 1, \ldots, M$, is modeled. The migrants' province origins are indexed by $i = 1, \ldots, I$, and they choose a destination province from all possible alternatives, indexed by $j = 1, \ldots, J$. Each migrant maximizes an indirect utility function in choosing destinations as follows:

$$V_{mij} = \beta_m' Z_j + \alpha_m \sigma_j + \gamma' P_{ij} + \eta_j + \varepsilon_{mij}, \tag{1}$$

where $V_{mij}$ describes the utility that migrant $m$ in origin province $i$ derives from living in province $j$, and $\varepsilon_{mij}$ is an idiosyncratic error term. Each province $j$ is described by the following:
• $Z_j$, an observable vector of the province's attributes;
• $\sigma_j$, the share of individuals who choose this location among all survey respondents;
• $P_{ij}$, province pair-specific information for each origin $i$ and destination $j$ combination;
• $\eta_j$, province-specific unobservable characteristics.
Estimation requires that migrants do not sort across locations based on the unobserved qualities of the migrants after taking into account the common component captured by the location fixed effect $\eta_j$.
The taste parameters $\beta_m$ and $\alpha_m$ in Equation (1) indicate that the individual preference for a particular province characteristic is not the same for every migrant, but is interacted with individual heterogeneity. It is a function of the migrants' individual characteristics, $X_m$, such as age and gender, and is written as follows:

$$\beta_m = \beta_0 + \beta_1 X_m, \tag{2}$$
$$\alpha_m = \alpha_0 + \alpha_1' X_m. \tag{3}$$

We can rewrite our indirect utility function by substituting Equations (2) and (3) into Equation (1), as follows:

$$V_{mij} = \lambda_j + (\beta_1 X_m)' Z_j + (\alpha_1' X_m)\,\sigma_j + \gamma' P_{ij} + \varepsilon_{mij}, \tag{4}$$
$$\lambda_j = \beta_0' Z_j + \alpha_0 \sigma_j + \eta_j. \tag{5}$$
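As an illustration only, the choice probabilities implied by this utility specification can be sketched in a few lines. The sketch assumes the idiosyncratic error is i.i.d. type I extreme value, so that probabilities take the multinomial logit form; the function name, array shapes, and toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def choice_probabilities(lam, Z, sigma, P, x, beta1, alpha1, gamma):
    """Logit probabilities over J provinces for one migrant.

    lam (J,): province constants; Z (J, K): province attributes;
    sigma (J,): migrant shares; P (J, R): pair variables for this origin;
    x (D,): individual characteristics; beta1 (K, D), alpha1 (D,), gamma (R,).
    """
    # Deterministic utility: constant + heterogeneous tastes + pair effects.
    v = lam + Z @ (beta1 @ x) + (alpha1 @ x) * sigma + P @ gamma
    v = v - v.max()                      # guard against overflow in exp
    expv = np.exp(v)
    return expv / expv.sum()

# Hypothetical toy inputs: 3 provinces, 2 attributes, 2 characteristics.
lam = np.array([0.0, 0.5, -0.2])
Z = np.array([[1.0, 0.2], [0.3, 1.1], [-0.5, 0.4]])
sigma = np.array([0.5, 0.3, 0.2])
P = np.array([[0.0], [1.2], [2.0]])      # e.g. a standardized distance
x = np.array([1.0, 0.3])
beta1 = np.array([[0.2, -0.1], [0.0, 0.4]])
alpha1 = np.array([1.0, -0.5])
gamma = np.array([-0.8])

probs = choice_probabilities(lam, Z, sigma, P, x, beta1, alpha1, gamma)
```

Subtracting the maximum utility before exponentiating leaves the probabilities unchanged and only avoids numerical overflow.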

3.2 | Identification
The model is estimated using a two-stage procedure due to Berry, Levinsohn, & Pakes (1995).
In the first stage, Equation (4) is estimated using multinomial logistic regression with alternative-specific constants. The coefficients $\lambda$, $\beta_1$, $\alpha_1$, and $\gamma$ are obtained:
• $\lambda$ measures province attractiveness;
• $\beta_1$ describes the individual heterogeneous preferences for various province attributes;
• $\alpha_1$ describes the individual heterogeneous preferences for other migrants choosing the same province;
• $\gamma$ describes the gravity force between origin $i$ and destination $j$, such as the migrants' network from the same place of origin.

11 This limitation applies to all discrete-choice models on destination choice. The selection problem can be weakened to a certain extent if data about the migrants' counterpart group in the rural areas are available. Unfortunately, we do not have this information to look further into the differences between movers and stayers in rural areas.
In the second stage, Equation (5) is estimated using ordinary least squares (OLS) or instrumental variables (IVs). The estimated $\lambda$ values are regressed on $Z$ and $\sigma$, to obtain the coefficients $\beta_0$ and $\alpha_0$:
• $\beta_0$ describes the contribution of observed province attributes to the province's attractiveness;
• $\alpha_0$ describes the contribution of local spillovers to the province's attractiveness.
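The two stages can be illustrated on simulated data. The sketch below is a toy version under strong assumptions, not the estimation code used in the paper: it has one province attribute, one individual characteristic, no spillover term and no pair variables, and it fits the first-stage logit with a hand-written Newton-Raphson routine. All names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, J = 5000, 6                                # migrants, provinces (toy sizes)
z = np.linspace(-1.5, 1.5, J)                 # one province attribute
eta = rng.normal(scale=0.05, size=J)          # unobserved province quality
lam_true = 0.8 * z + eta                      # second-stage relation
b_true = 0.5                                  # taste heterogeneity on x_m * z_j
x = rng.normal(size=N)                        # one individual characteristic

# Simulate choices: Gumbel errors give multinomial logit probabilities.
v = lam_true[None, :] + b_true * x[:, None] * z[None, :]
y = np.argmax(v + rng.gumbel(size=(N, J)), axis=1)

# Design tensor W[m, j, :]: J-1 province dummies (province 0 is the base),
# followed by the interaction x_m * z_j.
W = np.zeros((N, J, J))
W[:, np.arange(1, J), np.arange(J - 1)] = 1.0
W[:, :, J - 1] = x[:, None] * z[None, :]

def fit_logit(W, y, iters=25):
    """Newton-Raphson for a logit with alternative-specific constants."""
    n, n_alt, k = W.shape
    theta = np.zeros(k)
    Y = np.eye(n_alt)[y]                      # one-hot chosen alternative
    for _ in range(iters):
        u = W @ theta
        p = np.exp(u - u.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        grad = np.einsum("mj,mjk->k", Y - p, W)
        Wbar = np.einsum("mj,mjk->mk", p, W)
        info = np.einsum("mj,mjk,mjl->kl", p, W, W) - Wbar.T @ Wbar
        theta = theta + np.linalg.solve(info, grad)
    return theta

# First stage: province constants (relative to province 0) and heterogeneity.
theta_hat = fit_logit(W, y)
lam_hat = np.concatenate([[0.0], theta_hat[: J - 1]])
b_hat = theta_hat[J - 1]

# Second stage: OLS of the estimated constants on the province attribute.
X2 = np.column_stack([np.ones(J), z])
const_hat, slope_hat = np.linalg.lstsq(X2, lam_hat, rcond=None)[0]
```

Because the province constants are only identified relative to a base province, the normalization is absorbed by the second-stage intercept and leaves the slope unaffected.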
We have to address potential endogeneity in the second step, as the unobserved province characteristics ($\eta$) are very likely to be correlated with both the regressors and the province's attractiveness ($\lambda$), which may cause estimation bias. Therefore, we resort to IV estimation. The most likely endogenous variable is $\sigma$, which measures the share of migrant workers in each province in the sample of migrant workers. If, for whatever reason (prospects for future development, productivity shocks, etc.), many migrant workers choose province $j$, the coefficient for the variable of interest will be biased upwards. The unobserved characteristics $\eta$ then lead to an overestimation of the coefficient $\alpha_0$. The model is flexible and can admit additional endogenous variables. In this paper, however, we focus primarily on the estimation of local spillovers. 12 Note that estimation is hierarchical, in that IV is used in the second stage of the procedure. To avoid confusion in wording, we use the term "IV estimation" instead of "two-stage least squares." Whenever the phrase "two-stage" or "second stage" is used, it refers uniquely to the second stage of the estimation procedure. The procedure of the full estimation is shown in the following structure:

3.3 | IVs
The first IV is derived from the internal logic of the sorting model. Throughout the analysis, all agents play a static simultaneous-move game. Given the utility function in Equation (1), the probability, $\mathrm{Prob}_{mj}$, that migrant $m$ chooses alternative $j$ can be written as a function of all regional and individual characteristics (observed or unobserved).
Aggregating these probabilities over all migrants yields the share of individuals choosing province $j$ as follows:

$$\sigma_j = \frac{1}{M}\sum_{m=1}^{M}\mathrm{Prob}_{mj}.$$

The sorting equilibrium is the outcome in which every location decision is optimal given the location decisions of all others. We need to find a variable that predicts the share of individuals choosing a specific location and that is not correlated with the unobserved characteristics of the location. Assuming there are no local spillovers, the artificial instrument $\sigma^{*}$ arises naturally out of the following equilibrium condition, written here in the logit form implied by the first-stage estimator:

$$\sigma_j = \frac{1}{M}\sum_{m=1}^{M}\frac{\exp\left(\beta_m' Z_j + \alpha_m \sigma_j + \gamma' P_{ij} + \eta_j\right)}{\sum_{k=1}^{J}\exp\left(\beta_m' Z_k + \alpha_m \sigma_k + \gamma' P_{ik} + \eta_k\right)}. \tag{7}$$

By imposing $\eta_j = 0$ and $\sigma_j = 0$, we solve for the predicted $\sigma_j^{*}$ that would clear the market when only the observed province characteristics are considered by the migrants:

$$\sigma_j^{*} = \frac{1}{M}\sum_{m=1}^{M}\frac{\exp\left(\beta_m' Z_j + \gamma' P_{ij}\right)}{\sum_{k=1}^{J}\exp\left(\beta_m' Z_k + \gamma' P_{ik}\right)},$$

where $\beta_m = \beta_0 + \beta_1 X_m$ as in Equation (2). This approach requires an iterative procedure for both stages. The initial values of $\beta_0$ are obtained by estimating Equation (5) using OLS. The estimates of $\beta_0$, together with $\beta_1$ and $\gamma$ from Equation (4), are then used to calculate a new $\sigma^{*}$ under equilibrium conditions (7) after imposing $\eta_j = 0$ and $\sigma_j = 0$ for all $j$. The new vector we solve for is then used as an IV for $\sigma_j$ in Equation (5). The $\beta_0$ coefficients are then updated as IV estimation coefficients. The new values of $\beta_0$ are plugged back into the equilibrium conditions (7) in the same way as before, and this process is repeated until the instrument stabilizes.

12 The variables of local amenities could possibly be endogenous as well if the local government is adjusting the local provision in response to the demand of incoming migrants. It will still lead to overestimated coefficients; however, the local provision of medical and educational resources in China is largely supply-driven, not demand-driven. Even if it is demand-driven, it appeals to the local residents. The endogeneity of amenities might be less of a concern here.
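The iteration described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: probabilities are taken to have the logit form, the second stage is a just-identified IV regression with an intercept, and the synthetic data, function names, and parameter values are all hypothetical.

```python
import numpy as np

def predicted_shares(beta0, beta1, gamma, Z, X, P):
    """Shares implied by observed attributes only (eta = 0, no spillover term)."""
    beta_m = beta0[None, :] + X @ beta1.T          # (M, K) individual tastes
    v = beta_m @ Z.T + P @ gamma                   # (M, J) deterministic utility
    v -= v.max(axis=1, keepdims=True)
    p = np.exp(v)
    p /= p.sum(axis=1, keepdims=True)
    return p.mean(axis=0)

def iv_fit(lam, Z, sigma, inst):
    """Just-identified IV; returns [const, beta0..., alpha0]."""
    ones = np.ones(len(lam))
    X = np.column_stack([ones, Z, sigma])
    W = np.column_stack([ones, Z, inst])
    return np.linalg.solve(W.T @ X, W.T @ lam)

def artificial_instrument(lam, Z, sigma, X, P, beta1, gamma, tol=1e-10, max_iter=200):
    ones = np.ones(len(lam))
    # Start from OLS estimates of the second stage.
    coef = np.linalg.lstsq(np.column_stack([ones, Z, sigma]), lam, rcond=None)[0]
    inst = predicted_shares(coef[1:-1], beta1, gamma, Z, X, P)
    for _ in range(max_iter):
        coef = iv_fit(lam, Z, sigma, inst)         # update beta0 by IV
        new_inst = predicted_shares(coef[1:-1], beta1, gamma, Z, X, P)
        converged = np.max(np.abs(new_inst - inst)) < tol
        inst = new_inst
        if converged:                              # instrument has stabilized
            break
    return inst, coef

# Synthetic data (hypothetical values throughout).
rng = np.random.default_rng(1)
M, J, K, D = 3000, 26, 2, 1
Z = rng.normal(size=(J, K))
X = rng.normal(size=(M, D))
coords = rng.uniform(size=(J, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
origin = rng.integers(0, J, size=M)
P = dist[origin][:, :, None]                       # (M, J, 1) origin-destination distance
beta0 = np.array([1.0, -0.5])
beta1 = np.array([[0.3], [-0.2]])
gamma = np.array([-2.0])
eta = rng.normal(scale=0.3, size=J)

# Stand-in for observed shares: they depend on eta and are therefore endogenous.
v = (beta0 + X @ beta1.T) @ Z.T + P @ gamma + eta
p = np.exp(v - v.max(axis=1, keepdims=True))
p /= p.sum(axis=1, keepdims=True)
sigma = p.mean(axis=0)
lam = Z @ beta0 + 5.0 * sigma + eta

inst, coef = artificial_instrument(lam, Z, sigma, X, P, beta1, gamma)
```

The loop stops either when successive instruments differ by less than `tol` or when `max_iter` is reached; the sketch does not guarantee convergence for arbitrary inputs.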
The second IV for $\sigma_j$ is the provincial highway passenger flow in 1979, which relies on the unique economic history of China. Voluntary migration was not allowed before 1978 under the planned economy. After the economic reform, the market gradually became liberalized, as did migration flows. The provincial highway passenger flow in 1979 reflects the initial status of highway connections, which is not likely to be driven by past or current demand for labor in cities.
The relevance of the instrument lies in the fact that, when the restriction on mobility was loosened, people more often moved to provinces with a higher level of accessibility (mostly by intercity coaches). The multiplying effect of local spillovers attracts more migrant workers and hence, continues to affect the present attractiveness of provinces.
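The logic of instrumenting the migrant share with an external variable can be illustrated with simulated data. The sketch below is purely illustrative: it uses many synthetic locations (far more than 26 provinces) so that the upward bias of OLS is clearly visible, and every variable and coefficient is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000                                    # synthetic locations
z = rng.normal(size=n)                      # exogenous attribute
w = rng.normal(size=n)                      # instrument: shifts sigma, unrelated to eta
eta = rng.normal(size=n)                    # unobserved attractiveness
sigma = 0.5 * w + 0.8 * eta + 0.2 * rng.normal(size=n)
lam = 1.0 * z + 2.0 * sigma + eta           # true spillover coefficient is 2.0

X = np.column_stack([np.ones(n), z, sigma]) # regressors
W = np.column_stack([np.ones(n), z, w])     # instruments (w replaces sigma)

ols = np.linalg.lstsq(X, lam, rcond=None)[0]
iv = np.linalg.solve(W.T @ X, W.T @ lam)    # just-identified IV estimator

alpha_ols, alpha_iv = ols[2], iv[2]
```

Because `sigma` is positively correlated with `eta` by construction, `alpha_ols` overstates the spillover coefficient, whereas `alpha_iv` uses only the variation in `sigma` induced by `w`.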

4 | DATA AND VARIABLES
The full estimation of the model requires the following three types of data: individual sociodemographic characteristics, $X_m$, origin-destination pair information, $P_{ij}$, and province attributes, $Z_j$.
Our main data set is the China Migrants Dynamic Survey (CMDS), which was conducted by the National Health  Table 1 presents the descriptive statistics of these migrant workers' individual characteristics. The average age is approximately 31 years old, 61% of the sample are male, and almost 63% report being married. On average, they have 10 years of education, which is slightly above middle school level. The average age at migration is 29, with a 90% central range of 18-46. Of all migration decisions, 53% are interprovince. 16
The individual characteristics at migration in the $X_m$ vector used for estimating Equation (4) are as follows: male, age, years of education, number of school-aged children (under age 16), and potential income. Years of education is calculated as min(AgeAtMigration - 6, EduYearsAtSurvey). As six is the starting age for primary schooling, this calculation takes into account the possibility that some migrant workers had additional schooling after migration. Potential income is a predicted value. The income presented in Table 1 cannot be used directly for the estimation, as we need each migrant's potential income in each province. We run an origin-province-specific OLS regression of income on individual characteristics and use the predicted value as the potential income.
Our analysis is restricted to destination choices in mainland China at the provincial level. Hainan, Ningxia, Qinghai, Tibet, and Xinjiang are removed because the total floating population size reported by the Census is too small.

16 In the appendix, Table A1 shows how the variables of the individual characteristics are asked in the questionnaires, including the detailed definitions and coding.
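The two constructed variables can be sketched as follows. The education rule is the one stated in the text; the group-specific OLS helper is only an illustration of fitting separate income regressions by group and predicting from them. The function names, the choice of regressors, and the toy numbers are hypothetical.

```python
import numpy as np

def education_at_migration(age_at_migration, edu_years_at_survey):
    """Years of education at migration: min(AgeAtMigration - 6, EduYearsAtSurvey)."""
    return min(age_at_migration - 6, edu_years_at_survey)

def fit_ols_by_group(groups, X, y):
    """One OLS fit (with intercept) per group label; returns {label: coefficients}."""
    groups = np.asarray(groups)
    y = np.asarray(y, dtype=float)
    Xc = np.column_stack([np.ones(len(y)), X])
    return {g: np.linalg.lstsq(Xc[groups == g], y[groups == g], rcond=None)[0]
            for g in np.unique(groups)}

def predict_income(coefs, group, x):
    """Predicted (potential) income for characteristics x under one group's fit."""
    c = coefs[group]
    return c[0] + np.dot(c[1:], x)

# Toy data: regressors are (years of education, age); incomes are exact
# linear functions so the fits can be checked by hand.
groups = ["A"] * 4 + ["B"] * 4
X = np.array([[9, 20], [12, 25], [6, 30], [15, 40],
              [9, 22], [12, 35], [6, 28], [16, 45]], dtype=float)
y = np.where(np.array(groups) == "A",
             1000 + 100 * X[:, 0] + 10 * X[:, 1],
             500 + 150 * X[:, 0] + 5 * X[:, 1])

coefs = fit_ols_by_group(groups, X, y)
income_a = predict_income(coefs, "A", [10, 30])
income_b = predict_income(coefs, "B", [10, 30])
```

For example, a migrant who moved at age 18 and reports 15 years of education at the survey date is assigned 12 years at migration, since schooling beyond age 18 must have taken place after the move.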
The origin-destination pair variables, $P_{ij}$, are supplemented by China Geographic Information System Data 17, Population Census 2000, and the Language Atlas of China (Wurm, Li, Baumann, & Lee, 1987).
Distance: the geographic distance between the centroids of provinces $i$ and $j$.
Co-origin network: the number of migrants from origin $i$ divided by the total population of the destination $j$.
Linguistic distance: the linguistic distance measure between the origin and destination, constructed by counting the shared number of linguistic groups based on the Language Atlas of China. 18 We use the dialect spoken in the capital city of the province to represent the language of the province. The linguistic distance measure is equal to 3 if the two dialects do not share features with any common linguistic group; the measure is equal to 2 if the two dialects only share features at the most aggregated level of the linguistic group; the measure is equal to 1 if the two dialects share features in the first and second linguistic groups; finally, the measure is equal to 0 if the two dialects share features with all three levels of the linguistic group.
In Table 2, we show in particular the descriptive statistics for interprovincial migrants, that is, those migrants who cross provincial borders. On average, the co-origin network percentage in the destination province is 5%. For some origin provinces with large migrant outflows, the value can be as high as 19%.
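The coding of the linguistic distance and the co-origin network can be made concrete with a short sketch. It assumes each dialect is represented as a tuple of three nested group labels ordered from the most aggregated level to the finest; this representation, the placeholder labels, and the function names are hypothetical.

```python
def linguistic_distance(dialect_a, dialect_b):
    """3 minus the number of nested linguistic levels the two dialects share.

    Each dialect is a 3-tuple of labels ordered from the most aggregated
    level to the finest level (a hypothetical representation).
    """
    shared = 0
    for level_a, level_b in zip(dialect_a, dialect_b):
        if level_a != level_b:
            break                      # finer levels cannot match once a coarser one differs
        shared += 1
    return 3 - shared

def co_origin_network(migrants_from_origin, destination_population):
    """Migrants from origin i divided by the total population of destination j."""
    return migrants_from_origin / destination_population

# Placeholder labels, not actual groups from the Language Atlas of China.
same = linguistic_distance(("G1", "S1", "D1"), ("G1", "S1", "D1"))
two_levels = linguistic_distance(("G1", "S1", "D1"), ("G1", "S1", "D2"))
top_only = linguistic_distance(("G1", "S1", "D1"), ("G1", "S2", "D3"))
none = linguistic_distance(("G1", "S1", "D1"), ("G2", "S4", "D5"))
share = co_origin_network(50_000, 1_000_000)
```

Stopping at the first level that differs reflects the nesting of the groups: two dialects cannot share a finer group without also sharing the coarser one that contains it.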
We collect most of the province-specific attributes $Z_j$ for 2008 from the NBS of the People's Republic of China. 19
Average income of rural-urban migrants: calculated as the average income of rural migrants in the destination province, adjusted by the consumption price index.
Employment in secondary industries: calculated from the total number of workers in mining, manufacturing, electricity, and construction.
Employment in tertiary industries: calculated from the total number of workers in transportation, information technology sectors, retail and wholesale, hotels and restaurants, finance, rental and commercial services, research and technical services, public facility management, residential services, educational activities, health and social activities, recreational activities, and public administration. 20
Employment in healthcare: the number of health staff per 10,000 people, which proxies for medical services.
Private education spending: private funds in education spent per 10,000 people. 21
Share of migrants among all respondents $\sigma_j$: the share of migrants who choose a specific province among all respondents in the survey, which proxies for local spillovers.
The descriptive statistics for the province attributes are shown in

20 In addition to employment in levels, we also use employment shares. The results remain consistent.
21 We use private instead of public funds for education to proxy for local educational resources, because privately run schools are the primary venue for the education of migrant children, which has been discussed in Section 2.
WANG AND CHEN | 597

| ESTIMATION RESULTS
This section presents and discusses the results from estimating the sorting model for the 26 provinces in our sample from the 2014 CMDS. The model is implemented in two stages. In the first stage, individual characteristics of migrants are interacted with province characteristics to estimate heterogeneous preferences controlling for unobserved province heterogeneity. In the second stage, we can then estimate the relative contributions of province characteristics and local spillovers in determining a location's attractiveness to rural-urban migrants.
Lastly, we compare the magnitudes of all attributes using relative risk ratios (RRRs) and discuss some variables of particular interest. All province characteristics (except for the migrant share σ) and origin-destination pair variables are standardized.
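To make the first stage concrete, the following sketch computes conditional-logit choice probabilities for a single migrant. Equation (4) is not reproduced in this section, so the additive utility used here (a province constant plus an interaction term plus an origin-destination term) is an assumed, standard form, and all names and numbers are illustrative.

```python
import math
from typing import List, Sequence

def choice_probabilities(lam: Sequence[float],
                         interaction: Sequence[float],
                         pair: Sequence[float]) -> List[float]:
    """Conditional-logit probabilities over provinces for one migrant.

    lam[j]         : alternative-specific constant (mean indirect utility) of province j
    interaction[j] : individual-by-province interaction term, assumed precomputed
    pair[j]        : origin-destination term, assumed precomputed
    """
    utility = [l + a + p for l, a, p in zip(lam, interaction, pair)]
    shift = max(utility)  # subtract the maximum for numerical stability
    weights = [math.exp(u - shift) for u in utility]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative values only: three provinces, no interactions, and one province
# that is one standardized unit farther away (distance coefficient -0.992).
probs = choice_probabilities([0.0, 1.0, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, -0.992])
```

In this toy setting the second and third provinces have the same constant, so the ratio of their probabilities equals exp(-0.992), which is how a standardized coefficient maps into relative odds.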

| First stage
The upper panel of Table 4 reports estimates for the 36 coefficients ($\beta_1$, $\alpha_1$) on terms that interact individual characteristics with province characteristics. The lower panel reports the three coefficients (γ) for the origin-destination pair variables from Equation (4).
If we read the coefficients by row, the upper panel shows, for a fixed level of a given province attribute, how the odds of one province being chosen over another change with individual characteristics. The first row concerns the local average income of rural-urban migrants.
Migrants who migrate at an older age (0.091 for the squared term), who have at least one child of school age (0.06), and who have higher potential earnings are more likely to move to provinces with higher average income levels. The second and third rows concern employment in secondary and tertiary industries. The estimates in the two rows take almost opposite values, indicating a clear sorting pattern of different types of migrants into the two broad industry categories. The last row is of particular interest to us, as it indicates which types of migrants prefer to locate in a province with many other rural migrants. The sorting pattern across provinces with different shares of rural migrants is strong. Local spillovers are significant among male (7.336), older (3.884 for the squared term), less educated (-0.655), and lower income (-3.593) workers. In other words, these groups show a much stronger tendency to move to provinces with a larger existing share of rural-migrant workers.
If we read the coefficients by column, the panel shows, for a fixed value of a given individual characteristic, how the odds of one province being chosen over another change with province attributes, ceteris paribus. Take Columns (2) and (3) [...] the higher-educated derive higher utility from employment in tertiary industries (0.121). In the last column, only potentially high-income earners derive higher utility from local medical services (0.246) and better educational resources (0.188).
The estimates of γ for the origin-destination pair variables are shown in the lower panel of Table 4. Geographic distance reduces the probability of a potential migrant worker choosing a destination (-0.992). The network share from a migrant's province of origin makes it more attractive for migrant workers to move in (0.445). We see this variable as capturing the effect of social networks formed by migrants from the same place ("Lao Xiang").
Furthermore, larger linguistic distance seems to deter migrant workers from choosing a destination province (-0.19). Table 5 reports the mean indirect utility, λ, of each province, estimated in the first stage as alternative-specific constants. Beijing, Fujian, Guangdong, Jiangsu, Shanghai, and Zhejiang rank the highest and are the most attractive destinations for migrant workers. They are all located in the eastern region. Henan, Sichuan, Anhui, Guangxi, and Shanxi rank lowest by mean indirect utility. The ranking of the provinces is comparable to the ranking of the top 20 cities in Xing and Zhang (2017). Except for the two cities of Shenyang and Dalian, all other cities are included in our top six provinces.

Note (Table 4): Standard errors are in parentheses. Individual variables are listed at the top: male, age at migration, age at migration squared divided by 100, education years at migration, number of children at school age, and potential income divided by 1,000. Province variables are listed in the first column: average income of rural-urban migrants (RMB), employment in secondary industries (÷10,000), employment in tertiary industries (÷10,000), employment in healthcare (per 10,000 people), private education spending (per 10,000 people in RMB), and share of migrants among all respondents. The origin-destination pair variables are listed at the bottom of the table: geographic distance, percentage of migrants from one's origin province, and linguistic distance. *p < 0.1. **p < 0.05. ***p < 0.01.
In Column (1) of Table 6, we report OLS estimates for the average income of rural-urban migrants, employment in secondary industries, employment in tertiary industries, employment in healthcare, private education spending, and the migrant share σ. A higher local income level for rural migrants, larger employment in secondary industries, better medical services, and a larger population of rural migrants all increase a province's attractiveness. Column (2) reports the IV estimation using the artificial IV, σ*, which is the equilibrium value of σ and is, by definition, orthogonal to a province's attractiveness but relevant to the observed migrant share σ. In Column (3), we use the provincial highway passenger flow in 1979 as another instrument for σ. As expected, a higher volume of highway passenger inflow is positively correlated with σ due to the convenience of transportation. The F-statistic from the first-stage IV regression is 10.87, close to the conventional rule-of-thumb threshold of 10, implying a borderline weak instrument. However, both sets of IV estimation results are consistent with those of the OLS estimation.
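The logic of instrumenting the migrant share can be illustrated with a deliberately simplified sketch: a just-identified IV estimator with one endogenous regressor, one instrument, and no controls. This is not the paper's second-stage specification, which also includes the province attributes; the data below are made up so that the confounder u is exactly uncorrelated with the instrument z.

```python
from typing import Sequence

def _cov(a: Sequence[float], b: Sequence[float]) -> float:
    """Sample covariance with denominator n - 1."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)

def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """OLS slope of y on x (with an intercept)."""
    return _cov(x, y) / _cov(x, x)

def iv_slope(z: Sequence[float], x: Sequence[float], y: Sequence[float]) -> float:
    """Simple IV slope: cov(z, y) / cov(z, x)."""
    return _cov(z, y) / _cov(z, x)

# Made-up data: u moves both x and y, but is uncorrelated with z.
z = [1.0, 2.0, 3.0, 4.0]
u = [1.0, -1.0, -1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [2.0 * xi + ui for xi, ui in zip(x, u)]  # true slope on x is 2

b_ols = ols_slope(x, y)
b_iv = iv_slope(z, x, y)
```

In this toy example the IV slope recovers the true value of 2, whereas the OLS slope is biased upward because x and the unobserved term move together.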
The effects of local spillovers are shown in the last row of Table 6. As the value of the migrant share is used to calculate the artificial IV, we do not standardize this variable. Thus, the magnitude of the coefficient here should be interpreted with caution. Taking Column (3) as an example, a one-standard-deviation increase in the migrant share (0.02) raises the mean indirect utility λ by 1.466 (≈ 0.02 × 73.291), which is as strong as the impact of employment in secondary industries on λ. The more migrants that move to a specific province, the more likely it is that a potential migrant worker will choose that location.
The estimate for educational resources is not statistically significant in any of the specifications. This looks counterintuitive at first sight. However, it should be noted that the measure for educational resources is private funds rather than government funding for education. As the children of rural migrants are denied access to public schools in cities, they can only attend low-cost private schools, which are of inferior quality. Our results imply that local educational funding does not add to a province's attractiveness at all in the second stage. Returning to the results for the first stage, the interaction term between the number of school-aged children and the private funding of education is not significant either.

Note (Table 5): λ is the mean indirect utility for each province. Anhui is the reference location with mean indirect utility 0.

| Comparison by RRRs
In the simplest version of a multinomial logit model with only one predictor, z, and many alternatives, we call the probability of province j being chosen relative to the probability of baseline province k being chosen the relative risk (commonly known as the odds ratio). When the predictor of interest, z, changes to z + 1, ceteris paribus, the RRR is calculated as follows:

$$\mathrm{RRR} = \frac{P(y = j \mid z + 1) / P(y = k \mid z + 1)}{P(y = j \mid z) / P(y = k \mid z)} = e^{b},$$

where b is the coefficient for predictor z. A one-unit increase in z thus multiplies the relative risk by exp(b). b > 0 is equivalent to RRR > 1, which implies that the probability of j being chosen over k increases with a one-unit increase in predictor z. In this way, the main results in all tables can be easily summarized [...] The pull effect of jobs is mainly due to average income and employment in secondary industries. In Column (3) of Table 6, if we increase province j's employment in secondary industries by one standard deviation (127 million [...] The overall share of migrant workers increases the province's attractiveness with a similar magnitude. Here, we are mostly interested in how the group of rural migrants in general differs from the group of migrants coming from the same province in affecting one's destination choice. The relative risk is 1.6 (≈ exp(0.445)) times greater with a one-standard-deviation increase (5%) in the population share from the origin province at the destination province, whereas it is 4.3 (≈ exp(0.02 × 73.291)) times greater with a one-standard-deviation increase (2%) in the share of migrants moving to a destination among all respondents.
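The two RRR figures quoted in this comparison can be reproduced with a few lines of arithmetic. The helper name below is hypothetical; the coefficients (0.445 and 73.291) and the standard deviation of the migrant share (0.02) are the values reported in the text.

```python
import math

def relative_risk_ratio(coef: float, change: float = 1.0) -> float:
    """RRR implied by a logit coefficient for a given change in the predictor."""
    return math.exp(coef * change)

# Co-origin network share: the variable is standardized, so a one-SD change is 1 unit.
rrr_co_origin = relative_risk_ratio(0.445)

# Overall migrant share: not standardized, so scale the coefficient by one SD (0.02).
rrr_migrant_share = relative_risk_ratio(73.291, 0.02)
```

Rounded to one decimal place these evaluate to 1.6 and 4.3, matching the comparison between the co-origin network effect and the general spillover effect.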

| ROBUSTNESS CHECKS
We perform several robustness checks to validate our main results. First, we check whether the preference for amenities is confounded by possible job opportunities in medical- and education-related jobs. Second, we extend our model by considering the possible presence of spatial autocorrelation. Third, we replace employment in secondary industries with employment in manufacturing only. Fourth, we implement the analysis at a finer regional level, that is, cities. Fifth, we divide the sample into a more educated group and a less educated group and check how preferences differ.
First, we remove from the sample respondents who work in medical- and education-related jobs. Provinces with better public services are usually associated with a high demand for workers in these jobs. The estimate for medical services might therefore reflect a combined effect of amenity sorting and job sorting. Column (1) of Table 7 shows that the results are consistent with previous findings, implying that the effect of these province variables is not confounded by other channels. The demand for public services and amenities plays a major role here.
Second, we use a spatial autoregressive model (Anselin, 1988) to deal with the possible presence of spatial autocorrelation, namely, the possibility that the impact of province attributes extends across geographic boundaries. In that case, a province could generally be regarded as more attractive when it is closer to an attractive province. The specification includes a spatial weight matrix, W, measuring the proximity of neighboring provinces. Two common types of such matrices are (a) the inverse distance matrix, in which the weight is calculated as the inverse of the geographic distance between provinces, and (b) the contiguity matrix, in which the weight simply indicates whether spatial units share a geographic boundary. Columns (2) and (3) of Table 7 show the estimates when the attractiveness of neighboring provinces is controlled for. Whether we use the inverse distance matrix or the contiguity matrix, spatial autocorrelation appears to be trivial at the provincial level.
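The two types of spatial weight matrices can be built as in the sketch below. In a spatial autoregressive model, the spatial lag of the dependent variable (here, W applied to the provinces' mean indirect utilities) enters as an additional regressor. Row normalization is a common convention but is an assumption here, since the text does not say whether the authors normalize W; the distances and borders in the test data are made up.

```python
from typing import Iterable, List, Sequence, Tuple

Matrix = List[List[float]]

def inverse_distance_weights(dist: Sequence[Sequence[float]]) -> Matrix:
    """w[i][j] = 1 / distance between provinces i and j, with a zero diagonal."""
    n = len(dist)
    return [[0.0 if i == j else 1.0 / dist[i][j] for j in range(n)] for i in range(n)]

def contiguity_weights(n: int, borders: Iterable[Tuple[int, int]]) -> Matrix:
    """w[i][j] = 1 if provinces i and j share a boundary, else 0."""
    w = [[0.0] * n for _ in range(n)]
    for i, j in borders:
        w[i][j] = 1.0
        w[j][i] = 1.0
    return w

def row_normalize(w: Matrix) -> Matrix:
    """Scale each row to sum to one (rows of zeros are left unchanged)."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

def spatial_lag(w: Matrix, values: Sequence[float]) -> List[float]:
    """Weighted average of neighbors' values for each province."""
    return [sum(wij * v for wij, v in zip(row, values)) for row in w]
```

With a row-normalized contiguity matrix, `spatial_lag` returns the simple average attractiveness of each province's bordering neighbors, which is the quantity the robustness check controls for.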
Third, the provinces receiving the most rural-migrant workers are also more economically developed and focus on labor-intensive industries, especially manufacturing. We therefore look specifically at how employment in manufacturing alone affects migrants' spatial sorting. Column (4) of Table 7 shows that the estimate for employment in manufacturing becomes slightly smaller, while the other estimates remain similar in magnitude. The contribution of manufacturing alone makes up approximately two thirds of the contribution from secondary industries.
Fourth, we use a more disaggregated geographical unit and test for "sorting by city" instead of "sorting by province." Although we have considered the differences between provincial units, there could be considerable heterogeneity within provinces. To address this issue, we restrict the sample to rural-migrant workers originally from Sichuan province, which is one of the largest rural-migrant-sending provinces, and implement our analysis on the top 40 cities chosen as destinations. Eighty percent of migrants from Sichuan migrated to these 40 cities, which supports the validity of the choice set. We also adjust the number of cities to 30 or 50, and the results are similar.22 Column (5) of [...]

22 Our proxy for the educational resources at the city level is slightly different from the variables at the provincial level due to data limitations. The educational resources are now measured by the total number of students enrolled in primary schools.
23 The migrant share is not standardized, so the impact should be calculated by multiplying one standard deviation of the variable by the coefficient.
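A choice set of this kind can be checked with a short routine that picks the N most frequently chosen cities and reports the share of migrants they cover. The function name and the toy data in the usage note are hypothetical; this is only a sketch of the coverage calculation, not the authors' procedure.

```python
from collections import Counter
from typing import List, Sequence, Tuple

def top_n_choice_set(destinations: Sequence[str], n: int) -> Tuple[List[str], float]:
    """Return the n most chosen cities and the share of migrants choosing them."""
    counts = Counter(destinations)
    top = counts.most_common(n)
    cities = [city for city, _ in top]
    coverage = sum(count for _, count in top) / len(destinations)
    return cities, coverage
```

Calling `top_n_choice_set(["A", "A", "A", "B", "C"], 1)` on made-up data returns city "A" with a coverage of 0.6; re-running with different values of n mirrors the sensitivity check with 30 or 50 cities.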

Lastly, we divide the sample into a more educated group (years of schooling > 9) and a less educated group (years of schooling ≤ 9) to further investigate the role of heterogeneous preferences. Nine years of schooling marks the completion of compulsory education in China, and hence, there might be systematic differences between the two groups. The more educated group accounts for only one-third of the sample. Its members appear to be younger and show a higher willingness to settle in the destination in the future. The impact of the migrant share on the more educated group is lower than that on the less educated group, which substantiates our main results.

| CONCLUSION
Since the reform of China's economy in 1978, coastal provinces in the east have traditionally received large migrant inflows drawn from rural areas in the interior of the country. We show that, following the GFC, rural-urban migration in China is increasingly an intraprovince phenomenon, especially for provinces in the middle and western regions. In this paper, we examine what might be causing these changes in the spatial distribution of migration flows by analyzing the sorting behavior of 58,595 migrant workers between 2008 and 2014, using a large-scale government survey. Three sets of factors are studied: jobs, amenities, and local spillovers. We employ the equilibrium-sorting model developed by Bayer and Timmins (2007). This methodology follows the conventional discrete-choice framework, but allows us to estimate spillover effects using location decisions when local attributes are unobserved. Using this approach, and in line with previous studies, we find that rural-urban migrants prefer destinations close to their province of origin, locations in which many people from their own province reside, and destinations where the dialect is similar.
New in this paper, after controlling for the aforementioned factors, we find a separate and strong preference for colocating with a large population of rural migrants, regardless of their origins. What these migrants all have in common is that they lack the urban hukou that would give them access to the same education and employment opportunities as the local residents. The effect is particularly strong among older, less educated, and lower income rural-urban migrant workers. We take this as evidence in support of a channel where a network of rural migrants might help to reduce the effects of systemic discrimination on these more vulnerable groups. We cannot test labor market pooling and housing competition channels directly due to missing information about job mobility and housing quality. What we can claim is that the social and economic benefits brought by a large community of rural migrants appear to be much higher than the potential costs in various aspects. The results remain robust when we take into account labor supply-driven migration, spatial autocorrelation between provinces, different industry definitions, and regional differences within provinces.
Our study can inform urbanization policies in China and the future distribution of rural-urban migrants.
First, for the foreseeable future, the size of the rural-migrant community will still be an important factor in destination choices due to institutional barriers. Although there have been several small breakthroughs in hukou reform since the 1980s due to a decentralized process of hukou management, rural-urban migrant workers have not benefited much from these changes. Some provinces have removed the distinction between rural and urban hukou within their own localities (Song, Zenou, & Ding, 2008). These changes mostly benefit people who already have local hukou in those places. As the current hukou system places barriers on rural-urban migrants based on two classifications, that is, the hukou type (rural vs. urban) and the hukou location (local vs. nonlocal), the efforts to abolish the distinction between rural and urban hukou do not equate with efforts to address the distinction between local and nonlocal hukou. Migrant workers are still not entitled to the same benefits as local workers, and there is a long way to go.
In addition, our results suggest that ongoing industrial upgrading and transfer policies in China might exert a sizable effect in redirecting rural-urban migration flows westward in the future. Following the GFC, eastern provinces have developed more capital-intensive sectors and manufacturing firms have moved labor-intensive plants to the middle and western regions (Ang, 2018; Meng, 2014; Peng, 2015). As much rural-urban migration originates from provinces in these regions, we expect intraprovince migration to grow stronger as migrants are more able to find jobs in the nearby cities. This westward movement will be accelerated as spillover effects draw in still more migrants, in particular from older, less educated, and lower income demographic groups.

Lastly, our results provide guidance to local policy makers that the lack of schooling access for rural migrants is the most urgent problem to tackle. Migrant destination choice is almost inelastic to changes in local private educational funding, which currently provides the primary access to education for migrant children. We find that only more educated migrant workers are able to consider educational resources in location choices. The lack of schooling access for rural migrants will potentially be a crucial determinant for whether migration flows to cities will be temporary or permanent in nature.
This study can be extended in many directions in the future. First, we study how migrants choose destinations, and the results are relevant for people who have already decided to leave rural areas and work in cities. The limitation of discrete-choice models in tackling the selection problem of the target group can be mitigated to a certain extent if data about the migrants' pre-migration characteristics and about counterpart groups in rural areas are available. Second, the current estimates only roughly consider the impact of employment in secondary industries, and the variable is not determined solely by labor demand factors. An analysis of exogenous industrial policies at a more disaggregated level is important for policy-making. For example, the change in the content of new vacancies due to industrial upgrading could help researchers understand the sorting patterns of migrant workers to a greater extent.
Third, as most of the industrial transfer policies go against economies of scale, the economic efficiency of these policies needs to be evaluated apart from the impact on migration flows. Due to limitations in the data currently available, these details are not discussed in this paper.

ACKNOWLEDGMENTS
We are grateful to Daniel Broxterman and other guest editors of the Journal of Regional Science special issue on "endogenous amenities and cities" for editorial guidance, Chun Kuang for insightful discussant comments, and an anonymous referee for constructive suggestions. This paper has also benefited from comments and suggestions