Measuring rural access for SDG 9.1.1

The Rural Access Index is a measure of access developed by the World Bank and has been incorporated into the Sustainable Development Goals (SDGs) as indicator 9.1.1. It measures the proportion of the rural population living within 2 km of an all‐season road, using GIS layers and relying on three data sources: population, road network location and road condition. Open GIS data are used for population and road location, but defining the all‐season status of roads is challenging. Every country measures road condition differently and against different parameters, which makes it difficult to assess all‐season status consistently between countries. This article reports on research to refine the GIS methodology for assessing SDG 9.1.1 to make it more sustainable, repeatable and consistent by using geospatial data and tools, based on trials in four countries (Ghana, Malawi, Myanmar and Nepal), which were selected for their diversity of environment and data.

The Rural Access Index (RAI) is a key indicator that estimates the proportion of the rural population with adequate access to the transport system. It is defined as the proportion of the rural population living within 2 km of an all-season road. Two kilometres is considered to represent a 20-25-minute walk (subject to the topography). An all-season road is one that is motorable all year, but may be temporarily unavailable during inclement weather (Roberts, Shyam, & Rastogi, 2006).
TRL have been given the responsibility to refine the GIS-based methodology for SDG 9.1.1 under a project for the UKAid-funded programme Research for Community Access Partnership (ReCAP).

| BACKGROUND
The RAI is one of the most important global development indicators in the transport sector. It was designed to measure accessibility for rural communities, using an approximate walking distance of 2 km. There is a common understanding that the 2 km threshold is a reasonable extent for people's normal economic and social purposes and equates to approximately 20-25 min of walking time (Figure 1).
The definition is also simple enough to understand and use not only in transport, but also in the broader development context, such as poverty reduction. In the initial study in 2006, the worldwide RAI was estimated at 68.3% based on household surveys, leaving a rural population of about one billion unconnected to a good-quality road network. When the SDGs were established in 2015, the RAI was adopted as the measure for SDG 9.1.1, using the same wording from the original 2006 study. However, the 2006 methodology was based on the interpretation of existing household surveys and had several disadvantages, such as inconsistency across countries, lack of sustainability of regular updates, weak operational relevance and client ownership. It was found to be costly to rely on a household survey, and although it was recommended that a specific RAI question be included in all future censuses, this did not transpire. This limited the accuracy and sustainability of the index. In addition, the household-based approach was felt not to be spatially representative and of limited operational usefulness.
In 2016, the World Bank partnered with the UK Department for International Development (DFID) and ReCAP to develop a new geospatial methodology to measure rural access, to be sustainable, consistent, simple and operationally relevant (World Bank, 2016). The new methodology took advantage of GIS and used geospatial techniques and data collected using innovative technologies, and equated "all-season" with road condition.
FIGURE 1 RAI distance to a road (Roberts et al., 2006)
The results obtained from application of the geospatial methodology in eight pilot countries were inconsistent with the results obtained for those countries using household surveys in 2006, as can be seen in Figure 2.
Some countries had a significantly higher RAI, whilst some others had significantly lower results. It was assumed that the geospatial methodology was more accurate, but further investigation was needed to explain the differences. This led to the development of a project under ReCAP, which was designed to review the previous research and recommend measures to enhance the consistency and sustainability of the geospatial approach.
One of the most challenging aspects of the RAI is to define the all-season status of the road network. The original 2006 methodology defines the term "all-season" as "… a road that is motorable all year round by the prevailing means of rural transport, allowing for occasional interruptions of short duration". Some countries have interpreted the all-season status based on visual assessment of condition, others on roughness, others on speed.
The 2016 methodology defines an all-season road by interpreting road condition. An all-season road is either a paved road with an International Roughness Index (IRI) of less than 6 m/km or an unpaved road with an IRI of less than 13 m/km. If the IRI is not available, it is a paved road in excellent, good or fair condition or an unpaved road in excellent or good condition.
Whilst this provides an initial coarse estimate of all-season access, it ignores the fact that, for example, poor-condition paved roads and fair-condition unpaved roads could, and often do, provide all-season access. It also presupposes that accurate and recent road condition data are available, and that all countries are able to collect this information on a regular basis. Low-income countries (LICs) often struggle with data collection for roads because it is an onerous and resource-hungry task that requires a high level of administration.
Thus, consistency of assessment of the all-season status of road networks in different countries is challenging.
This issue was the main problem associated with the research project under ReCAP and forms the focus of this article.

| PROBLEM STATEMENT
The problems with the original RAI in 2006 are clear; the results were based on household surveys, but the surveys did not include specific questions that related to the RAI, so the researchers had to interpret questions already in the surveys to try to estimate the distance to all-season roads. Also, questions sometimes used different terminology (e.g., "all-weather" instead of "all-season"), and people's perception and estimation of walking distance often proved to be inaccurate because rural communities who do not use vehicular transport regularly had no experience of measuring distance.
FIGURE 2 Geospatial RAI measurement in 2016 compared to 2006 HH surveys. Source: World Bank (2016)
The principles of the index were sound, but the methods used to measure it in 2006 were unreliable or unsustainable. Inserting questions into household surveys across the world, and training enumerators as to the intended meaning of the question, is difficult to organize and to scale up. It is also difficult to check the accuracy of the results.
The surveys were based on the World Bank Living Standards Measurement Study (LSMS) data collection, and although the data are still available, the specific methods used to interpret them are not detailed in the 2006 report.
Because the geospatial results were significantly different from the 2006 results, it was not possible to definitively say which were the more reliable, although it was accepted at the time that the geospatial approach using GIS should be more accurate and operationally relevant.
One potential issue with the 2016 results was the fact that the all-season status of roads was mainly interpreted from road condition. Rural roads are most important with respect to the RAI, but many countries do not formally measure the condition of their rural roads, and although research is under way to explore measuring road condition from satellite imagery, a system has not yet been established (Workman, 2018). Where condition data were present and reliable they were used, but in some cases extra condition surveys were commissioned using a smartphone app called RoadLab (World Bank, 2015), which measures road condition according to the IRI.
Although smartphone apps have not proven to be a particularly reliable medium to measure IRI in the past, especially for unpaved roads of fair and poor condition, this does provide a rapid and approximate assessment of IRI. The issues with smartphone apps are partly due to their inability to accurately record IRI at slow speeds, coupled with the quality of smartphone accelerometers, but also due to the variation in driving style and the variables associated with using a response-based method (Workman, 2017). However, there is considerable potential for smartphones to provide crowdsourced data on roads and their condition by monitoring cell-to-cell movements and reporting issues, perhaps via social media and OpenStreetMap (OSM).
The other major issue with use of road condition on unpaved roads is that it can change very quickly, so a survey conducted one day might produce very different results from one conducted just a few days or weeks later. This meant that consistency of measurement was a serious concern in terms of defining the all-season status of a road.
There will be inevitable issues with data collection, data quality and completeness. Although GIS tools are able to effectively manipulate the data to provide rapid results across a large number of countries, if the data are inaccurate the results will be inconsistent. RAI is a very visible indicator, so when the graph in Figure 2 was published, some countries robustly questioned the results.
FIGURE 3 GIS layers used to measure the RAI

| OVERALL GIS PROCESS TO MEASURE THE RAI
There are three layers in GIS that are used to measure RAI: population, road network and the all-season status of the network, as shown in Figure 3.
The process to measure RAI using GIS was established with the geospatial methodology in 2016 and used available open data, due to resource constraints. The sources identified were as follows.

| Population
The source for population data recommended by the custodian was WorldPop. This uses national census data, projections and other ancillary data from countries to produce aggregated population data on a grid of approximately 100 m resolution, which can be downloaded and used in a local GIS platform. Other services such as the Gridded Population of the World (GPW) exist at a lower resolution, although they may change in future and so can still be evaluated as potential sources of data.
Because the indicator measures the "rural" population, it is necessary to define rural/urban boundaries. Attempts have been made to instil some consistency into the definitions based on population density on a 1-km grid, but adjusted for local situations. This may be recommended in future, but at present the boundaries defined by the National Statistics Office (NSO) in each country take precedence for RAI. In many countries, the local government agencies publish their urban/rural boundaries on their website, and many are able to provide shapefiles of the boundaries that can be included as a GIS layer to define the rural population when combined with the population layer. This same definition is used for all statistics in the country that depend on an urban/rural breakdown and would therefore be useful for measuring other SDG indicators.
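The combination of a boundary layer with the population layer can be illustrated with a minimal sketch. It assumes that population cells are given as hypothetical (x, y, population) tuples, that urban boundaries are simple polygons without holes in the same coordinate system, and it uses a standard ray-casting test rather than any particular GIS package. The function names are illustrative only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Cell = Tuple[float, float, float]  # (x, y, population), hypothetical layout


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: True if (x, y) lies inside a simple polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rural_cells(cells: Sequence[Cell],
                urban_polygons: Sequence[Sequence[Point]]) -> List[Cell]:
    """Keep only the cells whose centre falls outside every urban polygon."""
    return [
        cell for cell in cells
        if not any(point_in_polygon(cell[0], cell[1], poly)
                   for poly in urban_polygons)
    ]
```

In practice a GIS platform would perform this overlay directly on the shapefile and the population raster; the sketch only shows the logic of the masking step.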

| Road network
The source for road network data recommended by the custodian was OSM. There are many freely available sources of road mapping data. OSM is a collaborative project to create a free editable map of the world, and has almost become a de-facto standard for crowdsourced mapping not only for roads but also for health services, education services, etc. Other major online mapping platforms and GIS tools often link to live OSM map services for background mapping.
In OSM, the main roads and urban roads tend to be more accurately and more completely recorded than the rural roads. However, that is changing as more and more countries become accustomed to working with online datasets, and as universities and humanitarian agencies work to improve online mapping of road networks, especially for the purposes of disaster planning and disaster relief. A definitive online map of the road network in any country provides a wealth of benefits to national and international, government and non-government agencies, to local businesses and the general public.

| All-season status of roads
There is no reliable, current open database for road condition that could be applied geospatially. The 2016 trials therefore used any available source they could find, usually from the local roads authority, and in some cases they measured road condition using smartphone applications that measure road roughness.
Where road condition was available, the following interpretation was applied for IRI and a visual-based good/fair/poor assessment:
• Paved road with IRI less than 6 m/km and unpaved road with IRI less than 13 m/km, when IRI data are available.
• Paved road in excellent, good or fair condition and unpaved road in excellent or good condition, when IRI data are not available but other road condition data, such as the PCI or visual assessment by class value, are available.
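The interpretation above can be written as a simple decision rule. The following is a minimal sketch, assuming each road record carries a paved/unpaved flag, an optional IRI value in m/km and an optional condition class; the function and constant names are hypothetical and the thresholds are those quoted in the text.

```python
from typing import Optional

# Thresholds quoted in the text for the 2016 methodology (IRI in m/km).
PAVED_IRI_LIMIT = 6.0
UNPAVED_IRI_LIMIT = 13.0

# Condition classes treated as all-season when no IRI is available.
PAVED_OK = {"excellent", "good", "fair"}
UNPAVED_OK = {"excellent", "good"}


def is_all_season(paved: bool,
                  iri: Optional[float] = None,
                  condition: Optional[str] = None) -> Optional[bool]:
    """Return True/False for all-season status, or None if no data exist."""
    if iri is not None:
        # IRI takes precedence whenever it has been measured.
        limit = PAVED_IRI_LIMIT if paved else UNPAVED_IRI_LIMIT
        return iri < limit
    if condition is not None:
        accepted = PAVED_OK if paved else UNPAVED_OK
        return condition.strip().lower() in accepted
    return None  # neither IRI nor a condition class was recorded
```

Returning None for roads with no data, rather than guessing, makes the gap in condition data explicit, which is the situation discussed in the following paragraph.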
Although this does provide an overall approximate assessment of all-season status, it requires road condition to be measured and puts the onus on countries to provide resources for carrying out the measurement. Many countries do not measure rural road condition, or measure it only infrequently, so this would impose a burden of data collection that would be difficult to sustain.
The overall process for measuring RAI using GIS is therefore summarized in the diagram in Figure 4.
Under the 2016 methodology, road condition would be used to adjust the road network map to show only all-season roads. This diagram demonstrates how the assessment of all-season roads using accessibility factors is applied to the GIS results after the population living within 2 km of a road has been defined.
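The core of this procedure, the share of rural population within 2 km of a road, can be sketched in a few lines. This is an illustration only, not the project's tool: it assumes population cells have already been restricted to rural areas, that all coordinates are in a projected system measured in metres, that distance is straight-line rather than along paths, and that roads are supplied as hypothetical lists of vertices.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Cell = Tuple[float, float, float]  # (x, y, population), hypothetical layout


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the segment a-b, in metres."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def rural_access_index(cells: Sequence[Cell],
                       roads: Sequence[Sequence[Point]],
                       threshold_m: float = 2000.0) -> float:
    """Share of rural population living within threshold_m of any road."""
    total = 0.0
    served = 0.0
    for x, y, pop in cells:
        total += pop
        nearest = min(
            (point_segment_distance((x, y), line[i], line[i + 1])
             for line in roads for i in range(len(line) - 1)),
            default=math.inf,  # no roads at all: nobody is served
        )
        if nearest <= threshold_m:
            served += pop
    return served / total if total > 0 else 0.0
```

For example, with a single road along the x-axis and three cells at 1 km, 3 km and 2 km from it, the cells at 1 km and 2 km count as served. A GIS package would normally do this with a buffer and overlay operation instead of a brute-force distance search.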
FIGURE 4 GIS procedure for measuring the RAI

| Issues with open GIS data
There are a number of issues that need to be considered with using open data. The most important is the consideration that in the future the country itself will be reporting SDG indicators to the United Nations, normally through the NSO or equivalent, so the information they report needs to be officially ratified before it can be published.
This means that the open data must match the national data, or at least be authorized by a relevant authority.

| DEFINING THE ALL-SEASON STATUS OF RURAL ROADS
The key focus of this article is how to geospatially measure the accessibility of rural roads in LICs for RAI (SDG 9.1.1) using the original definition of the all-season status of roads in a sustainable and reliable way using GIS. One of the earliest studies on unpaved road deterioration was in Kenya in the 1970s (Hodges, Rolt, & Jones, 1975), and studied 38 test sections of various gravel materials and 8 earth sections over a period of 2 years. Jones (1984a) also looked at the type and extent of deterioration on engineered gravel roads in Kenya. He followed this with a report recommending optimum maintenance strategies (Jones, 1984b).
These studies kicked off the work done to develop the Highway Design and Maintenance model (HDM-I), as first reported by Harral (1980). Further research led to the development of HDM-III in 1987 (Watanatada et al., 1987a, 1987b) and to a computerized version in 1989 (Archondo-Callao & Purohit, 1989). The initial studies defined and quantified the relationships between various defect mechanisms on unpaved roads and led to the development of HDM-IV (Kerali, 2000).
The concept of road roughness was developed in the 1980s, starting with the International Road Roughness Experiment (IRRE), held in Brazil from 1982 onwards (Sayers, Gillespie, & Queiroz, 1986), to define an IRI. TRL also published Overseas Road Note 20, Management of Rural Road Networks (TRL, 2003), which reflected best practice in condition surveys.
The report "Passability criteria for unpaved roads" (Paige-Green & Bam, 1994) discusses the wet weather passability criteria for South African conditions based on a performance-related study of typical road problems.
The key factor was found to be drainage and ensuring that water is removed from the road surface as quickly as possible, but even with good drainage, extended and penetrating rain can still cause damage to an unpaved road.
Passability equates to the all-season status of a road, and typically the wet season is when roads become impassable (i.e., not all-season). Climate will therefore have a key influence on passability. Landslides can also be prevalent in steep areas, so terrain also has a significant effect on all-season access.

| Accessibility factors
As a way to address the issues outlined in the problem statement and the research shown in the status section, the research team have proposed using an accessibility index to supplement the RAI in countries where road condition is not collected or not reliable. This removes the need for onerous data collection by making an overall assessment of a network or part of a network based on surface type, climate and terrain, which are the three key factors in determining whether a road is all-season or not. This can then be applied to the GIS results, as shown in Figure 4.
The proposed use of accessibility factors relies on the following three aspects:
• Surface type. Many rural roads in LICs (and even in large high-income countries including the USA and Australia) are unpaved. As mentioned before, unpaved roads deteriorate rapidly and in a different way to paved roads. They are very susceptible to water ingress to the surface, which softens the materials and makes them very vulnerable to the action of traffic. So, when a road surface becomes saturated and is subject to traffic, the deterioration is accelerated.
• Climate. Precipitation has a significant effect on the condition of a road, especially on unpaved roads, which predominate in LICs and provide much of the extended connectivity to rural and poor areas. As mentioned above, the rainfall on a road is a significant factor in its deterioration, but the extent depends on the type of rainfall in terms of duration and intensity, and how well the roadside drainage copes with this.
• Terrain. The gradient and altitude of roads also have an effect on their accessibility. Steep roads become impassable more easily due to the potential for scour during heavy rainfall, and also due to slipperiness as a result of the road surface materials used. Many road authorities use particular technologies for steep areas, such as stone soling, cobblestones or concrete, which are more durable than unpaved surfaces and resist scour better than bituminous surfaces. Although they are often rougher than bituminous surfaces, they reliably maintain access during times of excessive rainfall and give the transport operators confidence that the road will remain motorable.

| Development of accessibility factors
Factors for accessibility can be applied to the GIS-produced population data, so that the final RAI results are effectively weighted by population (Figure 5). When calculated, the RAI will therefore include the probability that the roads people are using in each area are all-season. This is measured on a scale of 0 to 1.0, with 1.0 being 100% probability that the roads are all-season. For example, a paved road in a flat area with low rainfall would have an accessibility factor of 1.0, as this road is designed to be accessible all year round and the environmental effects on its passability are minimal.
However, an unpaved road in a mountainous area with high rainfall would have a much lower accessibility factor, perhaps 0.90. Being unpaved it is naturally vulnerable to rainfall, and the terrain heightens the chances that it will become impassable. These factors would vary by road surface type, because unpaved roads are more vulnerable to climate and terrain than paved roads.
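The weighting described here can be sketched as a lookup table keyed on surface type, terrain and rainfall, applied to the population found within 2 km of a road in each area. This is an illustrative sketch only: the table layout and names are hypothetical, the two factor values marked as quoted are the examples given in the text, and the remaining value is a placeholder that a country would replace with its own ground-truthed figure.

```python
from typing import Dict, Sequence, Tuple

FactorKey = Tuple[str, str, str]  # (surface, terrain, rainfall)

ACCESSIBILITY_FACTORS: Dict[FactorKey, float] = {
    ("paved", "flat", "low"): 1.00,            # example value quoted in the text
    ("unpaved", "mountainous", "high"): 0.90,  # example value quoted in the text
    ("unpaved", "flat", "low"): 0.97,          # hypothetical placeholder
}


def weighted_rai(areas: Sequence[Tuple[float, FactorKey]],
                 total_rural_pop: float,
                 factors: Dict[FactorKey, float] = ACCESSIBILITY_FACTORS) -> float:
    """Accessibility-weighted RAI.

    areas: (rural population within 2 km of a road, factor key) per area.
    total_rural_pop: all rural population, including people beyond 2 km.
    """
    if total_rural_pop <= 0:
        raise ValueError("total_rural_pop must be positive")
    served = 0.0
    for pop_within_2km, key in areas:
        # Scale each area's served population by the probability that
        # the roads it relies on are all-season.
        served += pop_within_2km * factors[key]
    return served / total_rural_pop
```

With 1,000 people near paved roads in flat, dry terrain, 500 near unpaved mountain roads in a high-rainfall zone and a total rural population of 2,000, the weighted result is (1,000 × 1.0 + 500 × 0.90) / 2,000 = 0.725, compared with 0.75 if every road were assumed to be all-season.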
Each country would need to determine its own accessibility factors, based on the local conditions and in line with the guidelines produced under the RAI project. It is recommended that accessibility factors are determined using ground-truthing in the various different climatic and geographical zones within a country. It is expected that local engineers should be able to provide a good overview of the all-season status of a small sample of roads, which would then be used to inform the accessibility factor for the overlap of different environments, as shown in the potential example in Figure 5.
FIGURE 5 Potential example of accessibility factors
Figure 6 shows how the RAI would be defined and calculated for a simple area with three roads. The example below shows how accessibility factors could be used as a proxy for the all-season status of a road by applying them to the GIS results for rural population within 2 km of a road.
How well a road network is maintained is also important. Simple tasks such as drain clearing and vegetation control can affect the condition of a road, so in extreme cases this can affect whether a road is classed as all-season or not. Thus, accessibility factors may be improved if more efficient maintenance practices are introduced, and/or if maintenance funding is increased.
Accessibility factors should therefore not be changed frequently, and only in response to a significant change in funding, policies or standards. The rationale for defining those factors should be documented.
The calculation of the RAI can be achieved relatively easily in GIS software, using layers of population data (with urban/rural boundaries applied), road network mapping, and maps of road condition and/or accessibility factors. However, careful thought needs to be given to deciding which datasets to use as inputs, and assessing their quality and coverage for this purpose.
FIGURE 6 Example of accessibility factor application

| Sustainability
A key factor to address is the sustainability of any measurement process, which would enable countries to measure RAI using their own resources. In this respect any measurement should ideally utilize existing data and not impose any additional data collection regime. This is particularly relevant to the all-season aspect of the RAI, which has in the past needed an assessment of road condition to measure the indicator. This implies that if road condition data do not exist, as is the case for many rural roads in LICs, then they need to be collected.
The resources to measure the RAI using GIS are usually available, and do not require specialist or very experienced inputs. The project has proposed to develop online tutorials and videos to support country measurement of the RAI with minimal inputs from external consultants.

| Publication of the RAI
The ReCAP programme has also engaged a software specialist company to develop a GIS tool to calculate the RAI.
This tool is expected to be integrated into the United Nations Global Platform (UNGP). At present it is focused on measuring the RAI for all countries using readily available open data, but the ultimate aim is to develop a tool that will allow countries to enter their own, authorized, data into the tool and calculate the RAI independently.
FIGURE 7 Publication process for the RAI
The United Nations Global SDG database (available at unstats.un.org/sdgs/indicators/database/) provides access to data compiled through the United Nations system in preparation for the Secretary General's annual report on "Progress towards the Sustainable Development Goals". The SDG metadata repository is part of the Global SDG database at unstats.un.org/sdgs/metadata. It reflects the latest reference metadata information provided by the United Nations System and other international organizations on data and statistics for the Tier I and II indicators in the global indicator framework.
The UNGP (https://officialstatistics.org/) provides a platform for learning about trusted data, projects, applications, services and partners. It is hoped that eventually it will provide data, tools and services with which to calculate SDGs and other indicators (including RAI). It currently contains tools and services to make imagery, mobile phone network data and social media data available for statistical practitioners.
Further details on reporting of the SDGs are contained in the United Nations Development Group publication "Guidelines to support country reporting on the Sustainable Development Goals" (United Nations Development Group, 2017). This is a useful guideline which takes the reader through the steps necessary to produce an SDG report, including examples and a checklist. This document is aimed primarily at data custodians.
The diagram in Figure 7 shows how the RAI measurement will be made, who is responsible for the various steps in the process and how it will be published.
FIGURE 8 Presentation of the RAI (SDG 9.1.1)
The map in Figure 8 shows one potential way that the measurement of the RAI could be presented. It would also be powerful to indicate the RAI measurement as the number of people who do not have access to an all-season road.

| Improvement of the RAI
The main ways that a country can improve its RAI score are to extend the network into more remote rural areas, to improve the quality of roads to ensure that a higher proportion are classed as all-season, or through population movement so that people live closer to the all-season network. Such changes will take time, and the RAI would not need to be measured more than once every 3 years because the changes would be very small. A recommendation from this project is for the RAI measurement to be made at a sub-national level, which would enable resources for infrastructure to be focused in the most deserving areas. This should be possible in most areas because the enumeration area for population data collection is usually at district level or lower.
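A sub-national breakdown of this kind could be produced by aggregating the GIS results by district. The sketch below is a minimal illustration, assuming each record is a hypothetical (district, rural population, rural population within 2 km of an all-season road) tuple produced by an earlier GIS step. It also reports the number of people without access, as suggested in the discussion of Figure 8.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Record = Tuple[str, float, float]  # (district, rural_pop, served_pop)


def rai_by_district(records: Iterable[Record]) -> Dict[str, Dict[str, Optional[float]]]:
    """Aggregate enumeration-area results into a per-district RAI summary."""
    totals = defaultdict(lambda: [0.0, 0.0])  # district -> [rural_pop, served]
    for district, rural_pop, served in records:
        totals[district][0] += rural_pop
        totals[district][1] += served

    summary: Dict[str, Dict[str, Optional[float]]] = {}
    for district, (rural_pop, served) in totals.items():
        summary[district] = {
            # RAI is undefined where a district has no rural population.
            "rai": served / rural_pop if rural_pop > 0 else None,
            # People living more than 2 km from an all-season road.
            "unserved": rural_pop - served,
        }
    return summary
```

Ranking districts by the "unserved" count rather than by the RAI alone would highlight where the largest numbers of people lack access, which may differ from where the percentage is lowest.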

| Alternative uses of GIS products
GIS is an essential tool to measure the RAI (SDG 9.1.1). However, the products from this project and the processes used also have the potential for much wider usage, especially for SDG measurement and beyond. The mapping clearly shows how developments are concentrated in areas served by roads. The process of community expansion around roads can be explored by looking at archive data for population (WorldPop has archive GIS data for every year back to 2000, for every country in the world) and historical mapping. The concentration of populations in respect to roads can be seen in Figure 9.

| CONCLUSIONS
In conclusion, it is essential to find an accurate, replicable and sustainable method of measuring SDG 9.1.1 in the future, to ensure its continued use as the key rural accessibility indicator globally. Sustainability depends on the data collection being kept simple and undemanding on local resources. A key aspect of making this process sustainable is to maximize the use of GIS software and tools to process the data, and to define the all-season status of the road without putting extra burden on countries to collect additional data.
FIGURE 9 Population and road network in GIS
The GIS procedures are relatively straightforward and should be implementable by a competent GIS technician; specialist expertise and extensive experience in GIS should not be necessary. This is an important consideration as although many road authorities do have GIS technicians, they may not necessarily be specialist operators.
A universal, consistent measurement of the RAI will undoubtedly rely on GIS technology. The calculation tool being developed on the UNGP will make measurement simple and quick, and will give countries the independence to carry out the measurement with minimal support.
A country can improve its RAI score by extending the road network into more remote rural areas, by improving the quality of roads or through population movements so that people live closer to the all-season network. It is not expected that such changes will happen quickly.