Can the Internet Improve Agricultural Production? Evidence from Viet Nam

This paper aims to contribute to the growing literature on the potential benefits of the Internet on rural livelihoods. We estimate the relationship between Internet access and agricultural production in rural Viet Nam using a panel dataset from 2008–2012. This is a time span during which Internet access increased substantially and government-run and private online outlets providing information about agriculture started to operate. Our findings suggest that Internet access is associated with a 6.8% higher volume of total agricultural output. We find that this result is manifested through more efficient use of fertilizer. Our findings are stronger for younger households. The less developed northern provinces have benefited the most from the arrival of the Internet. The results are weaker in the case of rice, which is related to strong government involvement in rice production and prices.


INTRODUCTION
Information and communication technology (ICT) is spreading rapidly and is becoming available and affordable to an increasing share of the world's population. ICT has reached areas where industrialization is still in its infancy and livelihoods rely on subsistence farming. This study contributes to the literature which explores the question of how the new information economy can help rural societies. Understanding how ICT can be used for development is considered to be one of today's most important development challenges; the World Bank's World Development Report 2016 was devoted to this issue. Our results provide evidence of how the rural population in Viet Nam has been able to benefit from the ICT revolution.
Like many other countries in the developed and developing world, Viet Nam has experienced a significant increase in the number of Internet users since the year 2000. The share of the Vietnamese population using the Internet increased from 17% in 2006 to 40% in 2012 (International Telecommunication Union, 2013). In the rural provinces studied here, the share of households in communes with at least one Internet access point increased from 30.7% to 70.6% between 2008 and 2012. In 2012, the population depended heavily on agriculture. In the rural provinces in our dataset, 76% of all income earned came from agricultural activities. As poverty is more persistent in the rural areas (Markussen, Tarp, & Newman, 2013), new technologies may provide the means to improve the livelihoods of the rural population. These opportunities have not gone unnoticed by the Vietnamese officials, who began to provide agricultural information online in 2006 (Hoa, Dung, & Son, 2008). There are currently a number of websites run by the authorities and by private companies that provide farmers with information about agriculture: news, information on practices, inputs, prices, etc. Given the heavy reliance on agriculture, coupled with the fact that the most important online activity among Vietnamese Internet users is "information gathering" (Broadcasting Board of Governors [BBG], 2013; Cimigo, 2011), it is not surprising that there is demand for agricultural online platforms in rural Viet Nam.
The macroeconomic benefits of the introduction of information technology (mobile phones, computers, and the Internet) are well documented in the literature. 1 Lio and Liu (2006) present macroeconomic evidence on the positive relationship between ICT and agricultural productivity. However, evidence on the precise transmission channels and the microfoundations of how information technology, or any general-purpose technology, affects growth remains ambiguous (Foster & Rosenzweig, 2010). More microlevel evidence on technology adoption is required to understand the linkages between technology and growth.
In a developing country context, the literature on information technology and agriculture has focused on how mobile phones can increase information in agricultural markets and potentially lead to improved market efficiency (Aker, 2010; Aker & Fafchamps, 2015; Fafchamps & Minten, 2012; Jensen, 2007; Mitra, Mookherjee, Torero, & Visaria, 2018; Muto & Yamano, 2009; Shimamoto, Yamada, & Gummert, 2015; Tadesse & Bahiigwa, 2015). Aker (2010), Aker and Fafchamps (2015), and Jensen (2007) find that mobile phones reduce consumer and producer price dispersion spatially as well as over time. Muto and Yamano (2009) find that mobile phone coverage has increased market participation. As summarized in Nakasone, Torero, and Minten (2014) and Jensen (2010), the literature on ICT and agriculture is mostly concentrated on agricultural markets, and most of the interventions are based on mobile phone technology (Nakasone & Torero, 2016). While there are a number of findings related to increased market efficiency, heterogeneous effects, for instance between crops, dominate.
There are fewer studies related to the effect of ICT on agricultural practices. 2 In a randomized experiment in India, Fafchamps and Minten (2012) find that a commercial market and weather information system using mobile phone technology had little or no effect on prices or agricultural practices. Aker and Ksoll (2016) find that households which received a mobile phone and education on how to use it planted a more diverse basket of crops. To our knowledge, there is only one study that examines the effects of the Internet rather than mobile phone technology. Goyal (2010) finds that the area under soy cultivation increased as a result of Internet kiosks providing information about soy prices and marketing opportunities. Goyal (2010), to our knowledge, is also the only study to find impacts at the level of prices received, not just price dispersion, as the new information the farmers have access to allows them to avoid intermediaries. 3 The Internet is a new medium that allows its users to acquire information that was previously unavailable through, for example, video, text, and audio. Unlike simple mobile phones, the Internet is not just a communication technology; it is an information and communication technology. The Internet can therefore increase productivity by providing market information or information on other technologies and production processes. Simple mobile phone technology cannot be used to access all the information online, as it requires initiating a personal contact: that is, the potential benefits of the Internet are not tied to one's social network in the same way the benefits of having a mobile phone are.
1 See, for instance, Jalava and Pohjola (2008), and Choi and Hoon Yi (2009) on how information technology fosters economic growth.
2 For a review of more studies related to farm productivity with a focus on mobile phones, see Deichmann, Goyal, and Mishra (2016).
Our work presents new microeconomic evidence of the benefits of Internet access on agricultural production in Viet Nam using a large-scale panel dataset on rural households, the Viet Nam Access to Resources Household Survey (VARHS) covering the period from 2008 to 2012. We find that the arrival of the Internet in a commune is associated with a 6.8% increase in agricultural output. Using a household fixed effects approach, which exploits the variation of the timing of the arrival of the first Internet access point in a commune, we find that this result is likely due to the use of chemical fertilizer. However, our findings suggest that the Internet is neither related to an increase in the use of fertilizers, nor is it otherwise associated with a change in the input mix. This implies that the productivity gains are likely to be related to more efficient use of chemical fertilizer. Even though fertilizers have been widely used in Viet Nam since the 1970s, knowledge about their optimal use is still lacking (Thang, 2014). Our results are weaker for rice production, which is related to the high degree of government involvement in rice production practices (Markussen, Tarp, & van den Broeck, 2011) and price regulations in both sales and input markets (Thang & Linh, 2015).
We find that households with younger household heads benefit more from the arrival of the Internet. Earlier evidence suggests that education level is positively associated with Internet availability (Kaila, 2017). We find suggestive evidence of heterogeneous effects in terms of education level, such that highly educated households benefit slightly more, while there is no significant difference at the margin of literacy. Furthermore, our results suggest that northern provinces benefit most from the arrival of the Internet. This is encouraging given the high rates of poverty and low rates of agricultural productivity in these areas.
We contribute to the literature by shedding light on the benefits of introducing a new general-purpose technology instead of a specific technology intervention (e.g., Fafchamps & Minten, 2012;Goyal, 2010). We study whether merely having access to a new general-purpose technology translates into benefits to the farming household, as in Aker (2010). Therefore, instead of studying whether a predetermined way of using a technology renders some desired effect, we aim to show the relationship between the arrival of the Internet in a commune and agricultural output. Due to the observational nature of the data, the caveat of our analysis is that there are several possible mechanisms for how the online information reaches the farming household, and how exactly members of the household employ this information in their everyday lives at the farm. The benefit of the observational nature of the data is that we demonstrate a benefit resulting from the ICT revolution.
The arrival of the first Internet access point in a commune is not likely to be random across communes. Our main empirical strategy therefore relies on the parallel trends assumption: that is, in the absence of the Internet, the difference in agricultural output between the communes that receive the Internet and communes that do not is constant over time. We test this by running placebo tests in a household fixed effects framework. They confirm that we cannot reject the null of parallel trends, a result also supported by graphical inspection. As a robustness check, we conduct the coefficient stability test proposed by Oster (2019), which builds on Altonji, Elder, and Taber (2005). This test examines omitted variable bias in our results and shows that it is highly unlikely that unobservable characteristics drive our results. Finally, we find our results to be robust to alternative methods of production function estimation that correct for endogeneity in inputs (Ackerberg, Caves, & Frazer, 2015; Levinsohn & Petrin, 2003).
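The logic of a coefficient stability test can be illustrated with the commonly used approximation to the bias-adjusted coefficient associated with Oster (2019). The sketch below is not the computation carried out in this study: the function name, its arguments, and all numbers are hypothetical, and the full procedure in Oster (2019) solves a more involved equation than this approximation.

```python
def oster_beta_star(beta_short, r2_short, beta_long, r2_long, r2_max, delta=1.0):
    """Approximate bias-adjusted coefficient (illustrative sketch only).

    beta_short, r2_short: coefficient and R-squared without controls.
    beta_long, r2_long:   coefficient and R-squared with controls.
    r2_max: assumed R-squared if all unobservables could be included.
    delta:  assumed ratio of selection on unobservables to observables.
    """
    if r2_long <= r2_short:
        raise ValueError("adding controls must raise the R-squared")
    # Coefficient movement scaled by the remaining unexplained variation.
    movement = (beta_short - beta_long) * (r2_max - r2_long) / (r2_long - r2_short)
    return beta_long - delta * movement


# Hypothetical numbers, chosen only for illustration.
adjusted = oster_beta_star(beta_short=0.080, r2_short=0.30,
                           beta_long=0.070, r2_long=0.50, r2_max=0.65)
print(round(adjusted, 4))
```

If the adjusted coefficient stays close to the controlled estimate and keeps its sign, unobservables would need to be far more important than the observed controls to explain the result away.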
This study proceeds as follows. Section 2 presents information on the Internet in Viet Nam and the data used, while Section 3 summarizes the production function approach and the estimation method. Section 4 presents the results, and Section 5 concludes.

BACKGROUND AND DATA
In parallel to the vast expansion of ICT, a number of online platforms providing information about agriculture have emerged. The Vietnamese government has several such online outlets, one of the more prominent being AgroInfo (https://agro.gov.vn/vn/default.aspx), which was established in 2008 when it operated under the name PMARD (Hoa et al., 2008). AgroInfo provides farmers with news related to agriculture, information about production, and information about regional prices of various inputs and crops. Fertilizer is a prominent topic in each of these areas, and information about fertilizer has its own page. In addition to AgroInfo, the website of the Ministry of Agriculture and Rural Development (http://www.mard.gov.vn) contains information about crop prices and news related to agriculture. Some of the regional Departments of Agriculture and Rural Development also have their own websites. They are less educational in nature and focus more on regional agricultural news. 4 We are also aware of three privately run websites that provide information on agriculture in Vietnamese, which were operating during the period of our study. Altogether we are aware of six other online platforms. The list of all known online platforms is given in Online Appendix B.
The hypothesis that farmers learn through information provided online is consistent with the way the Vietnamese report using the Internet. The most important Internet activity in Viet Nam is "information gathering" (BBG, 2013; Cimigo, 2011), most importantly reading the news (93.6% of Internet users according to the nationally representative Gallup survey conducted by BBG). Some 78.3% of those surveyed went online to find information about a specific topic (BBG, 2013), and Google is the most visited website (Cimigo, 2011; Vietnam Internet Network Information Center, 2014).
To get closer to answering the question of whether Vietnamese farmers gather information about agricultural practices online, we collected information from Google Trends (https://www.google.com/trends/) on the Google searches, in Vietnamese, for the most important purchased inputs of production: fertilizers and pesticides. Figure B1 in Online Appendix B shows that there was an increase in searches for both these terms over the period covered in our analysis. 5 As many as 11 enterprises in Viet Nam have licenses to build network infrastructure. Of these, three have built telecommunications network infrastructure on a national scale (Viettel, VTN (VNPT), and EVN Telecom). The arrival rate of the Internet in the rural areas is therefore subject to decisions taken by a large number of companies (Tuan, 2011). Maps of the VARHS communes and Internet access in the proximity of Hanoi and Ho Chi Minh City are presented in Online Appendix A, Figure A2, Panel A and Panel B, respectively. Figure A2 shows that the Internet spread first to the rural areas close to the urban centers, Hanoi and Ho Chi Minh City, from where it has gradually expanded to more remote rural areas. In the provinces further away from urban areas the arrival of the Internet has been less systematic, possibly as a result of the fragmented nature of the Internet provider market.
The VARHS dataset we use for our analysis is a panel dataset of 12 rural provinces in Viet Nam, a subset of the current total of 58 provinces and five cities. 6 In this study, we use three waves of data: 2008, 2010, and 2012, collected between July and September 2008, June and August 2010, and June and August 2012. In addition to a large set of data on household characteristics as well as land and agriculture-related variables, VARHS contains a commune-level questionnaire answered by decision makers at the municipal level. The survey areas are scattered across the country, as displayed in Figure A1 in Online Appendix A.
Our variable of interest is Internet access in the communes, collected as a recall question in a commune questionnaire conducted in 2014. Our question asked whether the commune had at least one Internet access point in a specific year. Table 1, Panel A illustrates the extent to which the Internet has become available in the areas studied.
We restricted the sample to a balanced panel of households that report having agricultural output greater than zero in every survey round. The data used consist of 478 communes, with a total sample size of 2,477 households and a very low attrition rate of 2.2%. Our dataset also includes information on the output volume of rice, and the input and land use in rice production, which makes it possible to estimate a production function for rice. The large majority of households in the sample (82%) produce rice. In the rice production analysis, the sample is restricted to the 2,029 households that produce rice in every round. Table 1 shows the summary statistics for the balanced panel, as well as by year. Online Appendix A provides a description of the sample along with a detailed description of the variables used.
Panel B of Table 1 presents the characteristics of households engaged in agriculture during the entire four-year period. Panel C shows the descriptive statistics related to agricultural output and input, the key variables in the production function. For illustration purposes, we adjusted the volume of agricultural output and the costs of inputs according to the area of land cultivated by the household. In the analysis, we use the log values of the variables and include land size as an input in the production function. 7 Panel D presents summary statistics for rice production.
6 The VARHS is a collaboration between UNU-WIDER, the Development Economics Research Group at the Department of Economics at the University of Copenhagen, and the Central Institute of Economic Management, the Institute for Labour Studies and Social Affairs, and the Institute of Policy and Strategy for Agriculture and Rural Development in Hanoi, Viet Nam. The first round of the VARHS panel was representative of rural households at provincial level. Brandt and Tarp (2017) provide full details of the sample design.
The households in our sample have an average of five members and the household heads an average age of 49 years. Over half of the sample belong to the ethnic majority (Kinh), which we include as a dummy variable. The average number of years of schooling completed by household heads is 5.7. The literacy rate of household heads is 78%, while, when looking at the maximum education level in the household, 2.7% of the households are fully illiterate. 8 We keep this small subsample of fully illiterate households in our analysis as a third of the subsample reside in communes that have Internet access and we want to be able to capture the "intent-to-treat" (ITT) estimate of Internet availability, to allow for spillovers.
Almost all of the households engage in activities other than agriculture, mainly wage labor and household enterprises. Agriculture is the most important source of livelihood; the volume of agricultural output exceeds the total nonagricultural income in each period. Both agricultural and nonagricultural incomes have risen in real terms, and we control for real nonagricultural income in our regressions.
We also control for other information technology and the ownership of radio, television, and phones. The variables are dummies indicating whether a household has at least one of each of these assets. Radios are owned by 16% of the households, with a steady decline over the years, whereas the ownership of televisions and phones (both fixed line and mobile combined) has increased. We also control for the use of extension services, as a potential source of agricultural information.
Panels C and D in Table 1 describe the agriculture-related variables. The output volume and cost of inputs used in production are in monetary terms: 1,000 Dong adjusted by province-level consumer price indices to take account of differential regional inflation rates. The values are all for the previous 12-month period, and therefore encompass all agricultural seasons. Labor is measured by the number of days spent on agriculture. The input variables are for all the plots cultivated by the household (except those used in forestry) and include all crops. The inputs selected for the production function are those used by almost all farmers, and they jointly yield a production function that has close to constant returns to scale. 9 Capital consists of the value of machinery and tools used in farming.
7 The logs are taken by log(x + 1). Additionally, we present robustness checks of the main variables of interest using the inverse hyperbolic sine (IHS) transformation in Table F.6 in Online Appendix F (see, e.g., Burbidge, Magee, and Robb, 1988). The transformation is $\mathrm{IHS}(x) = \ln\left(x + \sqrt{x^2 + 1}\right)$.
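The two transformations mentioned in the footnote on log values, log(x + 1) and the inverse hyperbolic sine, can be compared in a few lines. This is only a sketch with hypothetical input values; the function names are illustrative and are not taken from the study.

```python
import math


def log_plus_one(x: float) -> float:
    """log(x + 1): defined at x = 0, where it equals 0."""
    return math.log1p(x)


def ihs(x: float) -> float:
    """Inverse hyperbolic sine: ln(x + sqrt(x^2 + 1)); also 0 at x = 0."""
    return math.asinh(x)


# Hypothetical input values (for example, input costs in 1,000 Dong).
for value in [0.0, 1.0, 250.0, 10_000.0]:
    print(value, round(log_plus_one(value), 4), round(ihs(value), 4))

# For large x the IHS behaves like ln(2x) = ln(x) + ln(2), so it differs from
# the plain log by a near-constant shift that an intercept would absorb.
gap_at_large_x = ihs(10_000.0) - math.log(2 * 10_000.0)
```

Both transformations keep zero-valued observations in the sample, which is the practical reason for preferring them to a plain logarithm when some households report no use of an input.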

Panel B: Household variables in Table 2
Number of HH members  4.9  1.9  5.1  2  4.9  1.9  4.8  1.9
School years  5.7  3.8  5.5  3.7  5.8  3.9  5.8  3.9
Female head HH  0.14  0.35  0.14  0.35  0.14  0.

By comparing Panels C and D of Table 1, we see that, on average, the value of output per hectare is higher for rice than for total agricultural production, and rice is also more intensive in the use of pesticides and fertilizers. 10 Other crops planted in these regions include maize, potatoes, sweet potatoes, cassava, peanuts, soy beans, and fruits and vegetables. Coffee farming is common in the Central Highlands, where three of the 12 VARHS provinces are situated.
We also include a control for the share of the land with a property right (a "red book"). Over half of the land is under a formal property right, and this figure changes little over time. We also control for self-reported measures of land quality as categorical variables, with the base category being "average quality," and the categories included in the regressions being higher and lower quality than the average. Most of the land is perceived to be of average quality. We also control for the number of plots.

PRODUCTION FUNCTION FRAMEWORK
In this section, we present the production function framework and its empirical counterpart.

Conceptual framework
The relationship between Internet access and agricultural output is studied using a Cobb-Douglas production function, 11 a standard benchmark specification for estimating agricultural production functions (Griliches, 1957, 1963). In our main specification, the Internet enters the production function through the "unexplained" total factor productivity (TFP) component. The production function used in our main analysis is
$$Y_{it} = e^{\alpha_0 + \alpha_1 I_{ct}} \, X_{it}^{\beta} \, L_{it}^{\gamma} \, e^{\varepsilon_{it}}, \tag{1}$$
where $Y_{it}$ is the volume of agricultural production of household $i$ at time $t$, $I_{ct}$ is a dummy for Internet access in commune $c$ at time $t$, $X_{it}$ is a vector of inputs in the household production on farm, $L_{it}$ is land, and $\varepsilon_{it}$ is an error term. Average TFP (in logs) is denoted by $\alpha_0 + \alpha_1 I_{ct}$.
Taking logs and rearranging, we get
$$y_{it} = \alpha_0 + \alpha_1 I_{ct} + \beta' x_{it} + \gamma l_{it} + \varepsilon_{it}. \tag{2}$$
This is our baseline formulation used in estimating the model, where lower case letters denote log variables.
10 Still only about a half of all fertilizer is used for rice (the mean real value of fertilizer used on all crops is 454,000 VND, while that used on rice is just 256,000 VND).
11 It is also the approach used by Lio and Liu (2006) in cross-country analysis.
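The step from the level form of a Cobb-Douglas production function to its log-linear form can be checked numerically. The sketch below assumes that the Internet dummy shifts log TFP; the function names, parameter values, and input levels are all hypothetical and chosen only for illustration.

```python
import math


def cobb_douglas_output(internet, inputs, land, alpha0, alpha1, betas, gamma):
    """Level form: Y = exp(alpha0 + alpha1 * I) * prod(X_j ** beta_j) * L ** gamma."""
    output = math.exp(alpha0 + alpha1 * internet) * land ** gamma
    for x, beta in zip(inputs, betas):
        output *= x ** beta
    return output


def log_output(internet, inputs, land, alpha0, alpha1, betas, gamma):
    """Log form: y = alpha0 + alpha1 * I + sum(beta_j * ln X_j) + gamma * ln L."""
    return (alpha0 + alpha1 * internet
            + sum(beta * math.log(x) for x, beta in zip(inputs, betas))
            + gamma * math.log(land))


# Hypothetical parameters; the input elasticities sum to one (constant returns).
params = dict(alpha0=1.0, alpha1=0.05, betas=[0.3, 0.1, 0.2, 0.1], gamma=0.3)
inputs = [120.0, 500.0, 450.0, 80.0]  # labor, capital, fertilizer, pesticides
land = 0.8

y_without = cobb_douglas_output(0, inputs, land, **params)
y_with = cobb_douglas_output(1, inputs, land, **params)

# With inputs held fixed, the Internet dummy shifts log output by alpha1.
print(math.log(y_with) - math.log(y_without))
```

Because the dummy enters through TFP, its coefficient measures a shift in log output at given input levels, which is why it can be read as a productivity effect rather than an input effect.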

Empirical specification
To ensure that the production function is well specified, we first estimate a production function with the input vector $X_{it}$, which includes labor, capital, pesticides, and fertilizers, and land $L_{it}$, so that we can see that the production function yields close to constant returns to scale. The ordinary least squares (OLS) model to be estimated is:
$$y_{it} = \alpha_0 + \alpha_1 I_{ct} + \gamma l_{it} + \beta_k k_{it} + \beta_n n_{it} + \beta_f f_{it} + \beta_p p_{it} + \theta' Z_{it} + \delta_t + \mu_i + \varepsilon_{it}. \tag{3}$$
Now, $y_{it}$ is the log volume of agricultural production for household $i$ at year $t$ and $I_{ct}$ is a dummy denoting Internet access in commune $c$ at year $t$. The log size of land cultivated by the household is denoted by $l_{it}$, $k_{it}$ is the log value of capital, $n_{it}$ is the log amount of household labor supplied on a farm, $f_{it}$ is the log value of fertilizers, and $p_{it}$ is the log value of pesticides used. Time dummies are denoted by $\delta_t$. Household fixed effects are denoted as $\mu_i$. 12 In another empirical specification, we use commune instead of household fixed effects, to account for commune-level time-invariant characteristics. These fixed effects absorb the information about characteristics, such as the distance to Hanoi or Ho Chi Minh City, or the distance to extension services. We also control for a large number of time-varying controls denoted as $Z_{it}$. These include controls for land, household characteristics, and other technology as described in Table 1. In all the specifications, we cluster standard errors at the commune level (Bertrand, Duflo, & Mullainathan, 2004). In another specification, we impose the constant returns to scale assumption, that is $\gamma + \beta_k + \beta_n + \beta_f + \beta_p = 1$, on our input vector to verify that the theoretical assumption is satisfied without causing major changes to the coefficient estimates of the unrestricted model.
As our variable of interest is a dummy denoting whether the commune has at least one Internet access point, the coefficient estimate captures the ITT estimate of the availability of the Internet on the volume of agricultural production. Hence, we do not have self-selection into treatment at the household level. The ITT estimate allows us to capture both the relationship between Internet use and agricultural output in the commune and the positive externalities of that use, if production-related information obtained online spreads in the commune to nonusers. The literature on technology adoption in developing countries (Ben Yishay & Mobarak, 2014; Conley & Udry, 2010; Foster & Rosenzweig, 2010; Munshi, 2004) suggests that farmers learn about new technologies through their social networks, such as neighbors. Hence it is not crucial to know how much Internet use is devoted to looking up production-related information, as long as someone acquires the information and the information is spread.
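A household and year fixed effects regression on a commune-level dummy can be sketched on a toy balanced panel. The example below is a minimal illustration and not the study's estimation: the panel, the arrival years, and the effect size are hypothetical, there is no noise and there are no inputs or controls, and standard errors (clustered or otherwise) are not computed. It relies on the fact that, in a balanced panel, subtracting household and year means and adding back the grand mean removes both sets of fixed effects.

```python
from statistics import mean

YEARS = [2008, 2010, 2012]
# Hypothetical panel: six households in three communes. ARRIVAL gives the first
# year with an Internet access point (None = no access during the panel).
ARRIVAL = {0: 2010, 1: 2012, 2: None}
TRUE_EFFECT = 0.07                      # hypothetical Internet coefficient
YEAR_EFFECT = {2008: 0.0, 2010: 0.2, 2012: 0.5}


def internet(commune, year):
    """Commune-level access dummy: 1 from the arrival year onward."""
    first = ARRIVAL[commune]
    return 1.0 if first is not None and year >= first else 0.0


rows = []
for hh in range(6):
    commune = hh // 2
    for year in YEARS:
        i_ct = internet(commune, year)
        # Log output = household effect + year effect + Internet effect.
        y = 0.1 * hh + YEAR_EFFECT[year] + TRUE_EFFECT * i_ct
        rows.append({"hh": hh, "year": year, "I": i_ct, "y": y})


def double_demean(rows, key):
    """Subtract household and year means and add back the grand mean."""
    grand = mean(r[key] for r in rows)
    hh_mean = {h: mean(r[key] for r in rows if r["hh"] == h) for h in range(6)}
    yr_mean = {t: mean(r[key] for r in rows if r["year"] == t) for t in YEARS}
    return [r[key] - hh_mean[r["hh"]] - yr_mean[r["year"]] + grand for r in rows]


y_dd = double_demean(rows, "y")
i_dd = double_demean(rows, "I")
alpha1_hat = sum(a * b for a, b in zip(i_dd, y_dd)) / sum(b * b for b in i_dd)
print(alpha1_hat)
```

Identification in this sketch comes only from communes whose access status changes between waves, which mirrors the reliance on the timing of the first access point.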

Parallel trends
Next, we investigate the assumption of parallel trends. If the parallel trend assumption holds, the volume of agricultural output would have evolved similarly in areas that received the Internet and in areas that did not, had the Internet not been introduced. Since we have two time periods when the Internet arrived (either between 2008 and 2010 or between 2010 and 2012), our fixed effects model of Equation (3) is essentially a generalization of the difference-in-difference approach, in the case of more than two time periods and more than two groups. This results from the fact that the variation exploited is the time variation for the communes which received the Internet during the time period of the study.
We test the validity of the parallel trends assumption by running a placebo test. We do so by regressing the volume of agricultural output $y_{it}$ on the lead of the Internet variable $I_{c,t+1}$. We would expect the coefficient estimate of this regression to be statistically significant (i) if the households in the commune can anticipate the information that the Internet brings, which does not seem plausible, or (ii) if the parallel trends assumption does not hold. Our results are robust to this placebo test: The coefficient estimates are very close to zero and not significant. 13
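The lead variable used in a placebo test of this kind pairs each survey wave with the Internet status recorded in the following wave. The sketch below shows only that construction, for one hypothetical commune; the function name and the data are illustrative and not taken from the study.

```python
YEARS = [2008, 2010, 2012]


def add_lead(internet_by_year):
    """Map each wave to the Internet dummy observed in the next wave.

    The last wave has no following wave, so it drops out of the placebo sample.
    """
    return {year: internet_by_year[next_year]
            for year, next_year in zip(YEARS, YEARS[1:])}


# Hypothetical commune that first reports an access point in 2012.
late_adopter = {2008: 0, 2010: 0, 2012: 1}
lead = add_lead(late_adopter)
print(lead)
```

In this example the lead equals one in 2010, a wave in which the commune does not yet have access. A significant coefficient on such a variable would indicate that output was already diverging before the Internet arrived.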

Production function results
Panel A of Table 2 presents the results of estimating Equation (3) for a production function of all crops. The results suggest that Internet access is strongly related to the volume of agricultural production. In column 1, the production function is estimated with the Internet variable, controlling only for year fixed effects. The second column is similar, except for the restriction of constant returns to scale for all the inputs, hence excluding the Internet. Though this slightly inflates the coefficient estimate of the Internet variable, overall we see that the coefficient estimates of the inputs barely change from column 1 to column 2, suggesting that our model indeed has close to constant returns to scale without the explicit restriction. We also display the sum of the coefficient estimates of the inputs in Table F2, Online Appendix F, where we see that the sum is slightly below, yet close to, one. Returning to Panel A of Table 2, in column 3 we add controls and commune fixed effects to the model. These results suggest that Internet access is related to a 7.2% higher volume of agricultural output. In columns 4 and 5, we include household fixed effects, together with controls in column 5. From these results, we infer that the arrival of the Internet is associated with a 6.8% increase in agricultural output. All results are significant at the 5% level.
13 We also considered other identification strategies, namely an instrumental variables strategy to correct for the endogeneity of inputs. However, we were unable to find instruments that would not have violated the exclusion restriction. We also considered using propensity score matching (PSM). This method would require a common support at the level of the commune, not at the level of the household, as the Internet variable is measured at the level of the commune. This would result in a commune-level analysis, which would considerably reduce our sample size, which is why we did not go forward with PSM.
As shown in Panel B of Table 2, the results do not carry over to restricting the sample to rice. 14 While we see that the coefficient estimates range between 1.1% and 3.6%, they are not statistically significant in any of the specifications. 15 This is not surprising, given the strong government regulations in rice production (Markussen et al., 2011) and prices (Thang & Linh, 2015).
It is also important to comment on the controls, especially other information sources such as television, radio, phones, and extension services. Although there is time variation in these variables, none of the coefficient estimates are significant at the 5% level in the specification with household fixed effects. This holds for the production of all crops as well as for rice. 16 However, in rice production extension services seem to be more strongly related to increased production than the Internet or the other information sources considered, although this relationship is significant only at the 10% level (Panel B of Table 2, column 5).
Also, years of schooling completed by the household head is not significant at the 5% level in either of the [...] Table 2) is robust to replacing the school year variable with different indicators of education, such as a dummy for literacy and the maximum years of schooling completed in a household; none are significant when other factors are held constant. These results are shown in Table F5 in Online Appendix F.
14 The inputs in the production function of all crops and rice are slightly different. Capital does not appear in the rice production function. Rice production is highly labor intensive, and hence capital is not an input that is available for rice production only. Seeds are included in the production function of rice as they are an important input for rice production. However, seeds are not an input in perennial crops, which are a component of the production function of all crops. For the production function of all crops, we have therefore used those inputs that are common to the cultivation of both perennial and annual crops.
15 The results remain unchanged when looking at the quantity of rice produced or the sales value of rice. Results are available on request.
16 In Table F3 in Online Appendix F, we show a correlation matrix of the information sources. We can see that none of the variables are correlated with Internet access with a coefficient larger than 0.25. The additional sensitivity check in Table F4 shows the contribution of each of the information source variables as well as school years, without controlling for the others. Each of the coefficient estimates remains stable when excluding the others, suggesting weak multicollinearity. The Internet variable also remains stable across the specifications. We thank an anonymous reviewer for suggesting this.
Note. Authors' calculations. Dependent variables are the volume of all crops produced (a) and volume of rice produced (b). Summary statistics of the control variables are presented in Table 1. The household-specific control variables are number of HH members, female head HH, age of head HH, average age of adult household members, Kinh ethnicity, real nonagricultural income, school years, radio, television, phone, and extension. Agriculture-related control variables are presented alongside the inputs in Panels C and D of Table 1, for all crops and rice, respectively. Description of all variables is provided in Table A1. CRS denotes that the constant returns to scale restriction is imposed on the coefficient estimates of the inputs. Heteroscedasticity-robust standard errors in parentheses. Standard errors clustered at the commune level. Significance: *** p < .01, ** p < .05, * p < .10.

Production function with interaction terms
To explore the mechanisms driving our results, we estimate a model where the Internet enters the production process through interactions with inputs. The results are presented in Panel A of Table 3 for the production function for all crops and Panel B of Table 3 for rice. For each interaction, we estimate a model that includes all controls and commune and year fixed effects (columns 1, 3, 5, and 7), and a model with household and year fixed effects and controls (columns 2, 4, 6, and 8). We see from Panel A of Table 3 that the Internet is associated with improved use of fertilizers. The interaction terms in Panel A of Table 3, columns 1 and 2 are significant at the 1% level. The interaction terms of other inputs are insignificant.
It is also possible that our results capture a situation where farmers who use more fertilizer benefit more from the Internet than those who do not. This would be possible if fertilizer use is related to some time-varying unobservable characteristic that we have not been able to capture. However, this seems unlikely given that the results from the Oster test show that our results are very robust to the test of omitted variable bias. From Panel B of Table 3, columns 1 and 2, we can see that the relationship with fertilizer is also present in rice production, but the relationship is significant only at the 10% level and the coefficient estimates are smaller. The analysis using the interaction terms reveals that households residing in communes with the Internet are able to use fertilizers more productively than households in areas where the Internet is not available.
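The interaction specification described above can be illustrated on simulated data. The sketch below is only an illustration and not the authors' estimation code: it assumes a log-linear (Cobb-Douglas) production function with three inputs, leaves out the controls, fixed effects, and clustered standard errors used in Table 3, and every variable name and parameter value in it is hypothetical.

```python
import numpy as np


def simulate_and_fit(n=5000, seed=0):
    """Simulate log output with an Internet x log-fertilizer interaction
    and recover the coefficients by ordinary least squares."""
    rng = np.random.default_rng(seed)
    log_land = rng.normal(0.0, 1.0, n)
    log_labor = rng.normal(0.0, 1.0, n)
    log_fert = rng.normal(0.0, 1.0, n)
    internet = rng.integers(0, 2, n).astype(float)  # 1 = commune has access

    # Hypothetical "true" parameters, chosen only for illustration.
    truth = {
        "const": 1.0,
        "land": 0.4,
        "labor": 0.3,
        "fert": 0.2,
        "internet": 0.05,
        "internet_x_fert": 0.10,
    }
    log_output = (
        truth["const"]
        + truth["land"] * log_land
        + truth["labor"] * log_labor
        + truth["fert"] * log_fert
        + truth["internet"] * internet
        + truth["internet_x_fert"] * internet * log_fert
        + rng.normal(0.0, 0.5, n)
    )

    # Column order matches the key order of `truth`.
    X = np.column_stack(
        [np.ones(n), log_land, log_labor, log_fert, internet, internet * log_fert]
    )
    coef, *_ = np.linalg.lstsq(X, log_output, rcond=None)
    return dict(zip(truth, coef)), truth


estimates, truth = simulate_and_fit()
print({name: round(float(value), 3) for name, value in estimates.items()})
```

In this setup the fertilizer elasticity is `fert` for households without access and `fert + internet_x_fert` for households with access, so a positive interaction coefficient corresponds to the "more productive use of fertilizer" reading in the text.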

The relationship between Internet and inputs
TABLE 3 Production function with interactions with inputs
Note. Authors' calculations. Dependent variables are the volume of all crops produced (a) and volume of rice produced (b). Summary statistics of the control variables are presented in Table 1. The household-specific control variables are number of HH members, female head HH, age of head HH, average age of adult household members, Kinh ethnicity, real nonagricultural income, school years, radio, television, phone, and extension. Agriculture-related control variables are presented alongside the inputs in Panels C and D of Table 1, for all crops and rice, respectively. Description of all variables is provided in Table A1. CRS denotes that the constant returns to scale restriction is imposed on the coefficient estimates of the inputs. Heteroscedasticity-robust standard errors in parentheses. Standard errors clustered at the commune level. Significance: *** p < .01, ** p < .05, * p < .10.
To assess whether the mechanisms studied in Panels A and B of Table 3 are due to changes in input use, we analyze whether Internet availability is related to the volume of inputs used. The results are presented in Tables D1.a and D1.b in Online Appendix D. The dependent variables are in logs. Looking at the production of all crops (Table D1.a), the Internet has little association with the inputs used, including fertilizer. We see a similar picture in rice production (Table D1.b), albeit with some increase in the use of seeds. The results in Tables 3 and D1 together suggest that households may be learning about the use of chemical fertilizers online. Even though fertilizers are already a widely and extensively used input among our sample farmers and among Vietnamese farmers in general, the government is concerned about suboptimal knowledge of fertilizer use (Thang, 2014). Given the number of websites that provide information about fertilizer use, it is plausible that information about farming practices has spread online. Information acquired could be related to the optimal timing of the application of fertilizer, the optimal amounts by crop, or the differences between types of fertilizers on the market.
The weaker results relating to rice are in line with the strong government involvement in rice production. As discussed in Markussen et al. (2011) and Vasavakul (2006), authorities require certain plots to be reserved for rice only, so there is little self-selection into rice production. They monitor this and the quantities of rice produced. In 2010, floor prices were introduced for rice purchased by enterprises from producers (Thang & Linh, 2015). Price regulations imply that there are fewer arbitrage opportunities via price information available online. The government also regulates rice input prices and has policies to support the input costs in rice farming to guarantee a certain level of food security. Strong government involvement in production might indicate that farmers have sufficient information about rice production practices through traditional information sources, such as extension services. Due to the strong government involvement, it could also be that information on the recommended rice production practices was widely available prior to 2008. Finally, as most farmers are rice producers, it is also possible that information on practices spreads easily through word of mouth.

Heterogeneity
We investigate demographic as well as geographic heterogeneity in total agricultural production. We are motivated to investigate heterogeneity with respect to age, since younger household heads may be more open to adopting information technology (Aker & Mbiti, 2010). We also look at heterogeneity in education, as the ability to use the Internet may be higher for more educated households, and so they may benefit more (Aker & Ksoll, 2016; Akerman, Gaarder, & Mogstad, 2015). Years of completed schooling and age of the household head are used as control variables in Tables 2 and 3. Both variables are statistically insignificant, so we do not find any evidence of a direct association between schooling or age and agricultural output.
First, we investigate heterogeneity with respect to age. Figure 1 displays the marginal effects from a model where age is interacted with the Internet, with the vertical axis denoting the predicted values of (log) agricultural output. We see that younger household heads benefit more from the Internet. The median household head age in 2008 was 46 years, and graphical inspection shows that households where the household head was below the median age benefit more. This finding is confirmed in Table 4, columns 1-3: the interaction term between age and Internet is negative (columns 1 and 2). We do not find the squared term in column 3 to be significant, which confirms that the linear model in Figure 1 fits our data well.
[...] Tables 2 and 3 with the following exceptions: in columns 1-3, we have excluded the average age in household from the controls. In columns 4-8, we do not control for the variable school years. Summary statistics are provided in Table 1. Description of all variables is provided in Table A1. Heteroscedasticity-robust standard errors in parentheses. Standard errors clustered at the commune level. Significance: *** p < .01, ** p < .05, * p < .10.
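The age interaction discussed above can be written compactly. The following is a sketch under assumed notation (the symbols $\beta_1$, $\beta_2$, $\beta_3$, the subscripts, and the omitted terms behind the dots are not taken from the paper): with a linear interaction, the predicted gap in log output between households with and without Internet access is a straight line in the age of the household head.

```latex
\begin{align*}
\ln Y_{it} &= \beta_1\,\mathrm{Internet}_{ct} + \beta_2\,\mathrm{Age}_{it}
  + \beta_3\,(\mathrm{Internet}_{ct} \times \mathrm{Age}_{it}) + \dots + \varepsilon_{it}, \\
\mathrm{E}[\ln Y \mid \mathrm{Internet}=1,\ \mathrm{Age}]
  - \mathrm{E}[\ln Y \mid \mathrm{Internet}=0,\ \mathrm{Age}]
  &= \beta_1 + \beta_3\,\mathrm{Age}.
\end{align*}
```

A negative $\beta_3$, as reported for the interaction term in Table 4, means that this gap shrinks as the household head gets older; if $\beta_1 > 0$, it would reach zero at $\mathrm{Age} = -\beta_1/\beta_3$.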
The results on heterogeneity with respect to education are presented in Figures 2a and 2b and in Table 4, columns 4-8. Altogether, we investigate various measures of education: the number of years of schooling completed by the head of the household, the literacy of the head, and a 6-point scale measure of the level of education. We also look at the maximum level of education in the household by any household member. 17 We investigate the interaction between Internet and these education indicators by using both time-varying variables and time-invariant measures (the level in 2008). We use both of these measures since they may differ as a result of the household head having changed (for instance, due to a shock such as death or divorce) rather than acquiring more education. The time-varying measures capture these changes, while the time-invariant measures do not. Figures 2a and 2b show the results of the model where six levels of education are interacted with the Internet. The variables are the highest education level of the household head in Figure 2a and the highest education level of anyone in the household in Figure 2b. The y-axis shows the size of the coefficient estimate of each interaction term. Although we find there is a trend, such that more educated households benefit more, the standard errors are large and the estimates are statistically indistinguishable from zero in both figures. Furthermore, the difference is most noticeable at high levels of education. We expand this analysis in Table 4, columns 4-8. The results show no evidence of a skill bias in Internet access. Moreover, columns 5 and 7, which look at heterogeneity in literacy, confirm the finding from Figure 2: if there is a skill bias, it is at the higher end of the education distribution, not at the margin of literacy.
17 Since we study skill bias by interacting education variables with the Internet variable, the test for skill bias is unrelated to working hours. We therefore assume that the skill level required for "raw" farm labor is unrelated to the education level of the person taking part in agricultural activities.
Note. The levels are as follows: 1, cannot read and write; 2, can read and write but did not finish primary school; 3, finished primary school; 4, finished lower secondary school; 5, finished upper secondary school; 6, third level. The omitted category is 1. The model includes all inputs and controls as well as commune fixed effects similar to Table 2, column 3. Standard errors are clustered at the commune level.
We study regional heterogeneity in Tables F1.a and F1.b in Online Appendix F. In Table F1.a, we split the sample between the less-developed northern Viet Nam and the more-developed southern Viet Nam. We see that the results are driven by northern Viet Nam (column 3). The results are strikingly similar when we split the sample across the commune mean income: the communes below the median drive the results (column 1). Additionally, in Table F1.b we split Viet Nam into five regions, such that the region denoted in the column title is removed from the sample. We find that the results are weakest if we drop the most northern provinces (column 2), which is the only specification where the results are no longer statistically significant. Given that the north is less developed than the south in terms of agricultural productivity and commercialization (Cazzuffi, McKay, & Perge, 2017), these results point to a higher marginal productivity of the Internet in the poorer northern areas.

Robustness checks
In Tables D2.a and D2.b in Online Appendix D, we run the placebo tests of the relationship between the Internet in period [...] Table 2 and Panel B of Table 2, respectively. We see that all the coefficient estimates are close to zero and not even borderline significant; that is, we cannot reject the assumption of parallel trends. Tables D3.a and D3.b present the placebo tests for the input regressions in Tables D1.a and D1.b, respectively. None of the coefficients is statistically significant at any conventional level. We also conduct a graphical inspection of the parallel trends assumption, which is presented in Figure 3. The mean agricultural output of households in communes that received the Internet in 2012 (between the 2010 and 2012 rounds) is plotted against the mean output of households in areas that never received the Internet. The graphical inspection is consistent with the assumption, and we conclude that the hypothesis of parallel trends cannot be rejected.
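The logic of a placebo test for parallel trends can be sketched on simulated data. The example below is a generic "lead" test and an assumption on our part, not a reproduction of Tables D2 or D3: it regresses the pre-period change in log output on a dummy for communes that only gain Internet access later, and all names and numbers are hypothetical. Under parallel trends the slope should be close to zero; when future adopters were already on a different trend, it is not.

```python
import numpy as np


def lead_coefficient(pre_change, future_internet):
    """OLS slope from regressing the pre-period change in log output on a
    dummy for communes that receive Internet access in a later round."""
    X = np.column_stack([np.ones_like(future_internet), future_internet])
    coef, *_ = np.linalg.lstsq(X, pre_change, rcond=None)
    return float(coef[1])


rng = np.random.default_rng(1)
n = 4000
future_internet = rng.integers(0, 2, n).astype(float)
common_trend = 0.03  # hypothetical growth shared by all households

# Case 1: parallel pre-trends (future adopters grow like everyone else).
parallel = common_trend + rng.normal(0.0, 0.2, n)
# Case 2: diverging pre-trends (future adopters were already growing faster).
diverging = common_trend + 0.05 * future_internet + rng.normal(0.0, 0.2, n)

placebo_parallel = lead_coefficient(parallel, future_internet)
placebo_diverging = lead_coefficient(diverging, future_internet)
print(round(placebo_parallel, 3), round(placebo_diverging, 3))
```

A real application would add the controls and fixed effects of the main specification and cluster the standard errors; the sketch only shows what a "coefficient close to zero" is meant to detect.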
As an additional check, we derive bounds for the OLS results by conducting a test of coefficient stability. 18 According to Oster (2019) and Altonji et al. (2005), the OLS estimate of the Internet coefficient on agricultural output should be considered an upper bound for the true effect. Our most conservative OLS estimate of the coefficient is 7.2% (Table 2, Panel A, column 3). Hence, we use the regression in column 3 of Panel A of Table 2 as the full model with controls, which yields the result $\beta^{*} = 0.061$. 19 That is, the coefficient estimate of the Internet on agricultural output likely lies between 6.1% and 7.2%, which, under the assumption $\delta = 1$, suggests that selection on unobservables is also low.
Next, we compute the value of $\delta$ that would be needed to produce a treatment effect of $\beta^{*} = 0$. We obtain $\delta = 7.48$, suggesting that the unobservables would need to be 7.48 times as important as the observables to produce a treatment effect of zero. We conclude it is unlikely that our results are driven by unobservables.
18 The method is explained in Online Appendix C.
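To make the coefficient-stability logic concrete, the sketch below implements the widely used linear approximation associated with Oster (2019), in which the bias-adjusted coefficient is the controlled estimate minus $\delta$ times the coefficient movement scaled by the movement in $R^2$. This is an illustration under stated assumptions, not the procedure of Online Appendix C: the function names and all input values are hypothetical, the choice of `r2_max` as 1.3 times the controlled $R^2$ is a common rule of thumb rather than something stated in the text, and the exact estimator is nonlinear, so this approximation should not be expected to reproduce the figures reported above.

```python
def oster_beta_star(beta_short, r2_short, beta_full, r2_full, r2_max, delta=1.0):
    """Approximate bias-adjusted coefficient.

    beta_short, r2_short: coefficient and R^2 from the regression without controls.
    beta_full, r2_full:   coefficient and R^2 from the regression with controls.
    r2_max:               assumed R^2 if unobservables could also be included.
    delta:                assumed importance of unobservables relative to observables.
    """
    movement = (beta_short - beta_full) * (r2_max - r2_full) / (r2_full - r2_short)
    return beta_full - delta * movement


def oster_delta_for_zero(beta_short, r2_short, beta_full, r2_full, r2_max):
    """Value of delta at which the approximate adjusted coefficient equals zero."""
    movement = (beta_short - beta_full) * (r2_max - r2_full) / (r2_full - r2_short)
    return beta_full / movement


# Hypothetical inputs, chosen only to show the mechanics.
r2_full = 0.50
r2_max = 1.3 * r2_full  # rule-of-thumb assumption
beta_star = oster_beta_star(0.09, 0.30, 0.07, r2_full, r2_max, delta=1.0)
delta_zero = oster_delta_for_zero(0.09, 0.30, 0.07, r2_full, r2_max)
print(round(beta_star, 4), round(delta_zero, 2))
```

The interval between the adjusted coefficient at $\delta = 1$ and the controlled estimate plays the role of the bounds described in the text, and a large $\delta$ at which the effect vanishes indicates that unobservables would have to matter much more than the observed controls.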
We also run an additional robustness check to correct for the endogeneity in the agricultural inputs in the production function by using the control function methods of Levinsohn and Petrin (2003) and Ackerberg et al. (2015). We present the results in Table E1 in Online Appendix E, which also provides a brief description of these methods. Columns 1 and 2 in Table E1 display the results for all crops and columns 3 and 4 for rice. Our results are robust to this approach: the coefficient estimates for all crops are slightly higher in magnitude than our estimates in Panel A of Table 2 and are significant at the 1% level. The coefficient estimates for rice are of similar magnitude and significance to those in Panel B of Table 2.
Additionally, we run another robustness check, where we subtract the volume of rice output and the volumes of the inputs used for rice from the corresponding variables in the production function of all crops. This check allows us to study whether the results are indeed driven by crops other than rice. 20 The results are presented in columns 5-8 of Table E1. 21 The coefficient estimates for other crops are similar to the specification for all crops using the control function approaches.
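The second specification mentioned in footnote 20 adds dummies for households with zero pesticide or fertilizer use, following Battese (1997). A minimal sketch of that transformation is given below, assuming the standard form in which the log input is replaced by ln(max(x, D)), with D equal to 1 when the input is zero; the function name is hypothetical, and the paper's exact implementation may differ.

```python
import numpy as np


def battese_transform(x):
    """Return (log_x, zero_dummy) for an input that is zero for some households.

    zero_dummy is 1 where x == 0 and 0 otherwise. log_x equals ln(x) for
    positive x and ln(1) = 0 where x == 0, so zero-use households stay in the
    sample and their intercept shift is captured by the dummy.
    """
    x = np.asarray(x, dtype=float)
    zero_dummy = (x == 0).astype(float)
    log_x = np.log(np.maximum(x, zero_dummy))
    return log_x, zero_dummy


fertilizer_kg = [0.0, 1.0, 25.0, 0.0, 140.0]  # hypothetical quantities
log_fert, no_fert = battese_transform(fertilizer_kg)
print(log_fert, no_fert)
```

Both `log_fert` and `no_fert` would then enter the production function as regressors, which avoids dropping zero-use households or adding an arbitrary constant before taking logs.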

CONCLUSION
The results of this study suggest that Internet access is associated with a 6% to 7% higher value of agricultural output. We find that while this relationship is not the result of changing the input mix, it is possibly related to more efficient use of chemical fertilizer. The weaker results for rice production are in line with the rice market being operated under restrictions on both production and prices. The results are strongest in the least-developed northern provinces of Viet Nam, implying that the marginal productivity of the Internet is higher in areas where agricultural productivity is initially lower. We highlight the existence of a number of government and privately run online outlets supplying information on agricultural production and the fact that information gathering is the most popular online activity in Viet Nam. The overall result therefore indicates that farmers have indeed been able to use this information to their benefit to learn about modern inputs. Since we look at the arrival of the first Internet access point (the intention-to-treat, or ITT, estimate), our results include possible spillovers: farmers who have benefited from the new information might have been exposed to it through their social connections, or otherwise. Our estimation strategy relies on the parallel trends assumption, supported by placebo tests.
20 We run two specifications. One includes inputs similar to the specification with all production and rice production. The other has two dummy variables added for those households that have zero pesticide and fertilizer use. This follows the approach by Battese (1997), Villano, Bravo-Ureta, Solis, and Fleming (2015), and Abdul-Rahaman and Abdulai (2018), noting the fact that a fraction of households do not use those inputs in other-than-rice production.
21 We present summary statistics of the variables used in estimating all crops minus rice in Table E2 in Online Appendix E.
We believe we have been able to shed light on whether the introduction of a general-purpose technology, that is, the Internet, can serve as a means of improving practices in the traditional sectors of the economy. Since Viet Nam has recently obtained lower-middle-income country status by World Bank standards, foreign aid is gradually being withdrawn from the country. It is therefore crucial that poor households are not left behind in their capacity to use new technologies in their everyday lives. Active support for them to benefit from Internet access is called for, noting that this would appear to be associated with relatively high marginal productivity.