Site Visit Frequency Policies for Mobile Family Planning Services

Improving access to family planning services is key to achieving many of the United Nations sustainable development goals. To scale up access in remote areas and urban slums, many developing countries deploy mobile family planning teams that visit “outreach sites” several times per year. Visit frequencies have a significant effect on the total number of clients served and hence the impact of the outreach program. Using a large dataset of visits in Madagascar, Uganda, and Zimbabwe, our study models the relationship between the number of clients seen during a visit and the time since the last visit and uses this model to analyze the characteristics of optimal frequencies. We use the latter to develop simple frequency policies for practical use, prove bounds on the worst‐case optimality gap, and test the impact of the policies with a simulation model. Our main finding is that despite the complexity of the frequency optimization problem, simple policies yield near‐optimal results. This holds even when few data are available and when the relationship between client volume and the time since the last visit is misspecified or substantially biased. The simulation for Uganda shows a potential increase in client numbers of between 7% and 10%, which corresponds to more than 12,000 additional families to whom family planning services could be provided. Our results can assist policymakers in determining when to start data‐driven frequency determination and which policies to implement.


Introduction
Access to family planning plays a crucial role in achieving many of the United Nations (UN) sustainable development goals (Starbird et al. 2016), notably goal 3 (good health and well-being) by preventing maternal deaths and controlling the spacing of pregnancies. Universal access to contraception is estimated to reduce unintended pregnancies by 75%, maternal deaths by 25% (Darroch et al. 2017), and infant mortality by 10% (Cleland et al. 2006). Additionally, family planning allows women to postpone the birth of their first child and advance their education, which aids goals 4 (quality education), 1 (no poverty), and 5 (gender equality). For example, if adolescent girls in Brazil and India could postpone childbearing until their early twenties, the economic productivity of those countries would increase by more than US$3.5 billion and US$7.7 billion, respectively (UNFPA 2014). The UN estimates that, for "every dollar spent in family planning, between two and six dollars can be saved in interventions aimed at achieving other development goals" (UN Population Division 2009).
Despite these compelling facts, for at least one out of five women in Sub-Saharan Africa the need for family planning goes unmet (United Nations 2015). Many live in rural areas and urban slums, where access to family planning is limited or non-existent (Eva and Ngo 2010, Solo and Bruce 2010). Mobile (i.e., traveling) outreach teams are crucial to scale up access to family planning in these areas. Yet with a widening gap in funding of family planning organizations (UNFPA 2018), mobile services have to reach more people with fewer resources.
Mobile teams, consisting of doctors and nurses, are often the only provider of family planning services and have to allocate time to the sites in the catchment area. Naturally, they tend to visit sites that attract many clients more often than those that attract relatively few. Marie Stopes International (MSI), an NGO with 500 mobile family planning teams, has therefore introduced the guideline that (besides other factors) "The level of client demand [...] should be used to determine the optimal frequency of visits" (MSI 2016). Although this guideline is intuitive, it is loosely specified. Our interviews with outreach leads from five African countries show that country programs interpret these guidelines in different ways, and our data analyses reveal that adherence to them is low. This, as our study shows, has a negative impact on client volumes per outreach day, used in this study as the metric of effectiveness.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Developing more firmly specified policies is a necessary but difficult task. Policies not only depend on client numbers but also affect them: more frequent site visits may lead to fewer clients per visit. In the absence of research on this effect, however, it has so far not been factored into decisions on visit frequency. Our study thus focuses on the frequency determination problem (FDP): Given a team, a set of sites, a lower and an upper bound on the frequency of visits to each site, and a certain number of service delivery days per month, what frequencies should be assigned to each site so as to maximize the effectiveness of the visits?
We use a large dataset from MSI to fit a fixed-effects model that captures the link between effectiveness and visit frequency. We then mathematically model the FDP and analyze this model to obtain characteristics of optimal frequencies. Although the latter can be efficiently calculated by algorithms, there are strong arguments for basing frequency decisions on simple policies. We therefore use the characteristics of optimal frequencies to develop practical frequency policies and test them using a simulation of MSI's mobile teams in Uganda. Our findings were discussed with staff at MSI headquarters in London and with the Marie Stopes Uganda team.
The main finding is that, despite the complexity of the frequency optimization problem, simple visit frequency policies yield near-optimal results. This holds even when little data is available, parameter estimates are substantially biased, and the objective function is misspecified, and even though several simplifying assumptions are made. Based on the data for Marie Stopes Uganda, we estimate that putting these policies in place could increase client numbers by between 7% and 10%. If this were representative of MSI as a whole, an increase of 7% would correspond to at least 175,000 clients globally. In a context of severely reduced funding, such an increase in effectiveness would go some way to achieving the UN sustainable development goals.
Our paper is organized as follows. Section 2 gives further details on the way mobile teams operate and the FDP. Section 3 discusses the relevant literature. In Section 4, we develop an econometric model for client numbers to examine the effect of visit frequencies on the number of clients per visit. In Section 5, we formally model and analyze the FDP and propose exact solution methods and policies. Section 6 presents numerical results on the effectiveness of the policies for Marie Stopes Uganda.

Problem Description
This section describes the FDP faced by outreach teams. The description is based on discussions with outreach leads from five countries in which MSI operates, the international outreach lead, and the directors of the research team.
Each outreach team has a fixed set of sites, determined in cooperation with the government, which it visits regularly to deliver family planning services. These are often small healthcare facilities that do not provide (long-term) family planning methods. Teams also use pop-up tents. They provide short-term and long-term family planning methods as well as counselling, treatment of side effects, and implant and intrauterine device (IUD) removal. Some also offer permanent methods of contraception (i.e., sterilization).
Each team has a limited number of days per month to visit sites. Nearly all site visits last a full day. Sometimes two neighboring sites can be visited on the same day. In most countries, the visits are day trips that start and end at the team's base, but teams may camp on site to reach remote locations on multi-day trips. Before the visit, marketing or "demand generation" is done through radio, posters, or community health workers to encourage people to come. During the visit, drivers sometimes tour the village with a megaphone to announce that the team is on site. Since demand generation typically occurs locally and travel times between sites in rural areas are typically long, clients rarely come from different sites.
Client volumes per site visit are determined by factors such as the intensity of demand generation, staffing, the service package, whether a market-day or vaccination campaign is taking place, the time of arrival/departure, and visit frequencies. We focus on optimizing the latter. MSI recognizes its importance in its guideline "The level of client demand [...] should be used to determine the optimal frequency of visits" (MSI 2016). Similarly, USAID's "handbook for program planners" for mobile outreach services mentions selecting "frequency of mobile outreach services" as the second program decision (after selecting sites to be visited) and states that the optimal frequency "will depend on demand for services" (Solo and Bruce 2010).
Visit frequencies determine the "return time," or the time between visits. They should take into account ethical and medical considerations, since it is neither fair nor medically advisable to have long return times. Clients may want to have a contraceptive device replaced or removed or require treatment for a side effect. MSI has set a maximum interval between visits of 6 months. Teams can be flexible on return times only if government providers can assure removal of contraceptive devices and provide access to short-term methods such as condoms. If not, they aim for a return time of 3 months at most. Our analyses assume a 6-month upper bound. There is also a minimum interval between visits, as it takes about a month to generate sufficient demand and coordinate the visit with those managing the local facilities.
These bounds leave room to optimize visit frequencies. Although it makes sense to visit sites that attract many clients more often, this has not yet been translated into a concrete policy. Should a site that attracts twice as many clients be visited twice as often? What would higher visit frequencies look like? How far should it depend on historic frequencies? Questions such as these are presently unanswered.
Perhaps unsurprisingly, many teams link their return times to demand in a relatively ad hoc way. Data from Marie Stopes Uganda show that for those sites with return times below the 6-month upper bound, there is little to no alignment with client numbers (as illustrated in Figure 1). There are quite a few sites with a long return time but a large number of clients. Conversely, many sites have low average client numbers but are visited frequently. Shifting resources from sites with short return times and low average client numbers to those with longer return times and higher numbers would seem likely to improve effectiveness (measured as the average client volume per outreach day).
To optimize frequency policies, it is key to understand how return times affect client volumes. Our conversations with MSI staff suggest that client numbers tend to increase with an increase in return time, which was attributed to more time for demand generation, word-of-mouth effects, and existing clients demanding contraceptive device renewals or removals. However, the strength of this effect may be limited. For example, adequate demand generation can result in substantial client volumes irrespective of the return time. Long return times were said to reduce client satisfaction and induce a loss of trust in the teams, which also suggests that the effect diminishes at some point. More specific evidence on the effect of return time on client numbers is lacking (an issue addressed in Section 4).
The effectiveness of simple policies as opposed to advanced algorithms is also unclear. Despite the complexity of the FDP, organizations such as MSI have good reason to prefer simple policies, the main one being that they fit with the prevailing decentralized decision-making culture. Another is that advanced algorithms push up the costs of implementation, roll-out, and training (De Vries and Van Wassenhove 2020). Organizations also value flexibility, that is, policies that simply recommend a frequency and leave the teams to decide on day-to-day planning. For this reason, the policies proposed in this study only recommend frequencies, allowing each team to incorporate local knowledge into its decision making. For example, MSI team leaders know when it is market day in a village, which tends to lead to higher numbers of clients. According to Peter Schaffler, MSI's outreach lead, "Team leaders are also extremely aware of road conditions and incorporate this knowledge into their routing decisions" (personal communication, May 1, 2018). Section 6 analyses the optimality gap of visit frequency policies.
A third question considered is how the effectiveness of a policy depends on the amount of data available. MSI started systematic data collection in Uganda in 2015. Other countries started more recently, so few data points are available. This can be perceived as a barrier to data-driven decision making. We analyze the impact of data quantity in Section 6.
Based on discussions with MSI, we have chosen to analyze policies in terms of numbers of clients per outreach day. One alternative measure would be to use the total couple years of protection (CYPs) provided per outreach day. The CYPs of a particular contraceptive method is the number of years that it protects a couple from an unwanted pregnancy. Another measure could be weighted client numbers depending on specific target groups: the young, the poor, and clients who have no other way to access family planning. This would take into account that seeing 20 clients in a relatively wealthy, well-served site is not the same as seeing the same number of clients at a poorer, remote, underserved site. The main reason for our choice (clients per outreach day) is that it fits the way outreach teams presently make decisions. The policies we encountered during our interviews were all expressed in terms of client numbers. Marie Stopes Madagascar, for example, aims to carry out monthly visits for sites that attract on average more than 20 clients per visit. Maximizing client volumes also fits well with maximizing CYPs and weighted client volumes, since differences between sites in the same catchment area in terms of either CYPs or the percentage of clients in each target group are small. In Section 6.2, we examine how the performance of our policies is affected by this choice.

Literature
Three streams of literature are closely related to our work: mobile (family planning) outreach, global health operations management, and related scheduling problems. The FDP most closely resembles a knapsack problem (KP) with a separable concave non-linear objective function and linear constraints (see Ibaraki and Katoh 1988, for an overview of solution algorithms). In particular, the FDP can be modeled as a KP where all items have the same weight, the objective value is a non-linear function of the number of a given item included (i.e., the visit frequency), and this number is bounded from above and below. The specific objective function of the FDP is what makes this problem distinctive.
Mobile (Family Planning) Outreach. Access to a health service in hard-to-reach areas can be enhanced by, among other things, scaling up telehealth/eHealth programs (see, e.g., Medlife in India), optimizing the match of supply and demand (see, e.g., platforms such as StanPlus), expanding and strengthening the network of static health units, and deploying mobile health units (MHUs). The latter are the primary source of healthcare services for millions of people (Oriol et al. 2009), both in high-income areas and low- and middle-income countries. They deliver primary care, maternal and newborn care, family planning services, tuberculosis screening and treatment, HIV screening and treatment, cancer screening, dental services, treatment of childhood pneumonia, and many other services (Khanna and Narula 2016). Several studies have shown MHUs to be capable of improving health, reducing treatment delay, improving treatment adherence, decreasing mortality, and decreasing healthcare costs (Khanna and Narula 2016).
Mobile family planning units have been shown to significantly affect contraceptive use. Using data from Zambia, White and Speizer (2007) estimate that the modern contraceptive prevalence rate in rural areas would increase by 5.9 percentage points if all women had at least one outreach visit. Joshi and Schultz (2013) study the impact of introducing outreach services in villages in Bangladesh, and show substantial benefits in terms of birth spacing and fertility rates in comparison to "control villages." Similarly, Lutalo et al. (2010) use a randomized controlled trial in Uganda to show that introducing outreach services increases the use of hormonal contraceptives and decreases pregnancy rates. Outreach has also been shown to increase adoption of long-term family planning methods and reach poor and underserved populations (Ngo et al. 2017, Noccio and Reichwein 2013).
Access to and impact of family planning is commonly measured through indicators such as (modern) contraceptive prevalence, unmet need, the number of unintended pregnancies, unsafe abortions, and maternal deaths averted, and adult birth rates (cf. FP2020 2020). Since it is hard and time-consuming to measure site-level changes in these indicators, mobile outreach programs commonly use service statistics such as the number of (young) clients served, the number of adopters (i.e., clients who (re)started using contraception), and CYPs (cf. Ngo et al. 2017). Weinberger et al. (2013) propose a model to translate these statistics into an estimated impact on contraceptive prevalence.
Though a significant body of research has covered the impact of mobile family planning, little is known about what drives that impact. This study contributes to the literature on mobile (family planning) outreach by exploring how visit frequencies affect client volumes and how to incorporate this in visit frequency policies.
Global Health Operations Management. Our work also contributes to the growing stream of research that uses OR/MS knowledge, models, and techniques to address global health challenges (Kraiselburd and Yadav 2013). One branch considers enhancing access to health products by studying, for example, root causes of stockouts (Karamshetty et al. 2021, Karimi et al. 2021), supply network (re)design (Vledder et al. 2019), inventory policy design (Leung et al. 2016), funding systems (Gallien et al. 2017, Taylor and Xiao 2014), inventory allocation (Natarajan and Swaminathan 2017), distribution (Parvin et al. 2018), and procurement (Martin et al. 2020). A second considers improving access to health services. Examples include studies of network design for clinics (De Vries et al. 2020), community healthcare (Cherkesly et al. 2019), and diagnostics (Deo and Sohoni 2015, Jónasson et al. 2017), incentive design for patients and providers (Mehrotra and Natarajan 2020), and patient enrolment/capacity allocation (McCoy and Eric Johnson 2014).
We identified five OR/MS papers that specifically consider the deployment of mobile healthcare units when capacity is restricted. Deo et al. (2013) study mobile asthma care units that visit schools, taking the timing of visits as given, and optimize the scheduling of patients to maximize health gains. Our work is similar in that we also consider (a proxy of) outcomes, but differs in that we consider optimizing the frequency of visits. Hodgson et al. (1998) and Doerner et al. (2007) do consider the scheduling of site visits, and model this as a covering tour problem (CTP). The aim is to construct a tour through a subset of locations in a network, subject to the constraint that a pre-specified percentage of the population (or the locations considered) must be within a given distance from a location in the tour. There are two structural differences between the CTP and the FDP. First, a CTP does not consider optimizing visit frequencies, nor how demand is dependent on frequency. Second, the objective of a CTP is to minimize travel time, whereas we aim to maximize the number of clients reached in a setting where outreach teams use one-day trips to visit sites.
McCoy and Lee (2014) consider the allocation of motorbike visits to outreach sites, with the objective being to maximize effectiveness and fairness. Each site visit is said to satisfy a certain "need." Effectiveness is defined as the total "need" satisfied. Fairness is incorporated by assuming diminishing utility of additional visits to a site. The authors quantify these objectives as a weighted sum of the number of visits to each site, raised to the power of some constant. As detailed later, the functional form of our objective function is different and fits with the data from MSI. Unlike them, we also consider lower and upper bounds on the number of visits. We focus on policies and insight into the effectiveness of family planning operations, whereas they focus mainly on analytical insights into optimal allocation decisions. De Vries et al. (2021) analyze the scheduling problem for mobile teams in medical screening. The problem clusters villages that can be easily incorporated into a daily schedule in terms of the travelling distance between them. It then assigns each mobile team to one cluster in each planning period. A key difference is that they consider how to optimize schedules (i.e., which sites to visit in which month), whereas our work considers the more tactical question of how to design frequency policies. They consider a curve describing how health evolves between visits; their objective is to minimize the total expected disease burden over the planning horizon, that is, the area under the curve. In contrast, we consider the specific value of the curve for each visit, namely the number of clients. As a consequence, the functional form of the objective function differs. As in this study, De Vries et al. (2021) consider both optimal solution approaches and planning rules.
Related Scheduling Problems. Two other problems are mathematically closely related to the FDP. Chemical product line scheduling problems deal with one or multiple parallel machines that process multiple products (see Floudas and Lin (2004) for a review). The problem is to determine the frequency of production for each product and the length of each production period so as to minimize inventory and setup costs. Despite obvious differences between the concepts, this can be seen as similar to the FDP, if machines are seen as teams and products as sites. Frequency decisions affect the objective function, as applies to the FDP. The corresponding functional form of the objective function is, however, substantially different.
Maintenance scheduling problems (MSPs) consider a fixed number of machines to be maintained and a maximum number of maintenance visits in a given period of time (see Nicolai and Dekker (2008) for a review). If we substitute machines for sites and maintenance visits for site visits, the similarities are clear. However, the objective function is substantially different. In addition, our focus is on the development of simple policies, whereas the MSP literature typically considers exact approaches.

Data Description and Analysis
This section describes the data used in this study and analyses the effect of return time (the time since the last visit) on client numbers. The results are used in the modeling of the FDP in Section 5 and in the case study in Section 6.

Data Description
We use data on outreach visits from MSI. MSI is exceptional in the sense that it is one of the few humanitarian organizations to systematically capture data on operations and use it to improve policies, guidelines, and operations.
We use data from Marie Stopes Uganda from May 2015 to September 2017, from Marie Stopes Madagascar from January 2016 to March 2018, and from Marie Stopes Zimbabwe from January 2017 to September 2017. These countries were chosen by MSI because the data were of good quality. The start dates correspond to the start of systematic data gathering and the end dates to the time that access to the data was terminated. The cleaned datasets for Uganda, Madagascar, and Zimbabwe include 10,293 visits to 1581 sites, 10,498 visits to 1794 sites, and 732 visits to 243 sites, respectively. For each site visit, the datasets show the name of the site, the date, the team, and a list of clients. For each client, they list, among other things, the age bracket, number of children, and the family planning method. We use these data to determine the total client volume and the return time in months for each site visit. Descriptive statistics are presented in Table 1.
For each site, the first visit included in the data is excluded, because the return time is unknown. We also exclude all sites that have only one visit for which return time is known. This excludes 23%, 46%, and 71% of the sites for Uganda, Madagascar, and Zimbabwe, respectively. With the Uganda data, we exclude visits where multiple teams go to the same site. These visits represent special events, such as youth focus days, which result in unusually high numbers of clients. Some exceptionally high numbers are also observed in Zimbabwe. Based on our conversations with MSI staff, we set the cut-off value for the number of clients that can realistically be seen by one team to 120. We exclude the 3.5% of visits that exceed this threshold. From the Madagascar data, we exclude the 0.4% of visits where no family planning methods are provided, because these are special events such as AIDS prevention or vaccination campaigns.
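The exclusion rules above can be illustrated with a minimal Python sketch. The record layout (`site`, `date`, `clients`), the function name `clean_visits`, the average month length of 30.44 days, and the order in which the rules are applied are assumptions made for illustration; only the 120-client cut-off, the removal of each site's first visit, and the removal of sites with a single usable visit come from the text.

```python
from collections import defaultdict
from datetime import date

DAYS_PER_MONTH = 30.44  # assumption: average month length used to express return time
MAX_CLIENTS = 120       # cut-off for one team, as stated in the text


def clean_visits(visits):
    """Return (site, return_time_in_months, clients) tuples after the exclusions."""
    by_site = defaultdict(list)
    for v in visits:
        by_site[v["site"]].append(v)

    cleaned = []
    for site, rows in by_site.items():
        rows.sort(key=lambda r: r["date"])
        kept = []
        # Pairing each visit with its predecessor skips the first visit,
        # whose return time is unknown.
        for prev, cur in zip(rows, rows[1:]):
            rt = (cur["date"] - prev["date"]).days / DAYS_PER_MONTH
            if cur["clients"] <= MAX_CLIENTS:
                kept.append((site, rt, cur["clients"]))
        # Drop sites with fewer than two visits with a known return time.
        if len(kept) >= 2:
            cleaned.extend(kept)
    return cleaned


# Hypothetical records, not taken from the MSI data.
visits = [
    {"site": "A", "date": date(2016, 1, 10), "clients": 25},
    {"site": "A", "date": date(2016, 3, 11), "clients": 31},
    {"site": "A", "date": date(2016, 4, 10), "clients": 140},
    {"site": "A", "date": date(2016, 7, 10), "clients": 28},
    {"site": "B", "date": date(2016, 2, 1), "clients": 12},
    {"site": "B", "date": date(2016, 5, 1), "clients": 15},
]
cleaned = clean_visits(visits)
```

In this toy input, site B is dropped because only one of its visits has a known return time, and the 140-client visit to site A is dropped as exceeding the cut-off.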

Effect of Return Time on Client Numbers
In this section, we aim to model the effect of the return time for visit v to site s, $RT_{vs}$, on the client number for visit v to site s, $CL_{vs}$. Specifically, we want to discover the shape and magnitude of the effect and the distribution of the error terms. We use the following general function to model $CL_{vs}$:

$$CL_{vs} = m_s \cdot f(RT_{vs}) + \varepsilon_{vs} \qquad (1)$$

Here $\varepsilon_{vs}$ denotes an error term, which captures visit-specific factors for which accurate data are lacking (e.g., weather, marketing efforts, and the start time of a visit). Parameter $m_s$ represents the expected client volume obtained in site s for some "baseline return time," which we set at one month (without loss of generality; we return to this in Online Appendix C). Function $f(RT_{vs})$ represents a return-time-dependent multiplier for the baseline client volume. For example, f(3) = 1.2 expresses that, when the return time is three months, the expected client volume is 20% higher than the expected client volume for the baseline return time.
As stressed during conversations with MSI outreach leads for Madagascar, Tanzania, and Sierra Leone and with HQ staff (see Section 2), client numbers are determined by many factors that are not related to return time, such as the population density of the area and marketing efforts. We therefore split $f(RT_{vs})$ into a constant part and a return-time-dependent part:

$$f(RT_{vs}) = \alpha + \beta \cdot g(RT_{vs}) \qquad (2)$$

Here, $\alpha$ and $\beta$ are non-negative constants. We estimate both parameters for Uganda, Madagascar, and Zimbabwe. The larger $\beta$, the more client numbers grow in return time, possibly because of word of mouth, community health workers, or renewals. $\alpha$ indicates the percentage of clients reached when $g(RT_{vs}) = 0$. We consider three functional forms for $g(RT_{vs})$: $RT_{vs}$, $\log(RT_{vs})$, and $\sqrt{RT_{vs}}$. In line with input obtained from MSI staff, all forms can model client numbers as a strictly increasing function in return time. The logarithmic and square root functions model this increase to be diminishing over time.
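As an illustration of the three candidate forms, the following sketch evaluates the expected client volume $m_s \cdot (\alpha + \beta \cdot g(RT))$ for each of them. The parameter values ($m_s = 20$, $\alpha = 1.0$, $\beta = 0.2$) are hypothetical and are not the estimates reported in Table 2; the names `G_FORMS` and `expected_clients` are ours, and the natural logarithm is assumed for the logarithmic form.

```python
import math

# The three candidate shapes for g(RT); natural log is an assumption.
G_FORMS = {
    "linear": lambda rt: rt,
    "log": lambda rt: math.log(rt),
    "sqrt": lambda rt: math.sqrt(rt),
}


def expected_clients(m_s, rt, alpha, beta, form="log"):
    """Expected clients for one visit: m_s * (alpha + beta * g(rt))."""
    return m_s * (alpha + beta * G_FORMS[form](rt))


# Hypothetical parameter values, for illustration only.
m_s, alpha, beta = 20.0, 1.0, 0.2
for form in G_FORMS:
    volumes = [round(expected_clients(m_s, rt, alpha, beta, form), 1) for rt in (1, 3, 6)]
    print(form, volumes)
```

With the logarithmic form, $g(1) = 0$, so the expected volume at a one-month return time equals $m_s \cdot \alpha$; with the linear and square root forms, $g(1) = 1$.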
The formulation reflects the implicit assumption that client numbers for different sites are mutually independent. As the distances between sites are large and use of health services tends to drop off sharply the farther clients have to travel, this assumption is realistic (see, e.g., Tanser et al. (2006)). Another implicit assumption is that the effect of return time is proportional to $m_s$. The results presented in Online Appendix B suggest that this assumption is realistic.
Appendix A describes how we fit function (2) and how we deal with endogeneity (site characteristics affect return times) and heteroskedasticity (the variation in client volumes is larger for sites that attract more clients). In short, to handle endogeneity in return times, we first estimate $\beta$ by relating within-site variations in return times to within-site variations in client volumes. Plugging this estimate into function (2), we estimate $\alpha$ and $m_s$ using weighted least squares (WLS) with weights $m_s$ to reduce heterogeneity. Since fitting $m_s$ requires solving a complex non-linear, non-convex optimization problem, we test two heuristic procedures in Online Appendix C, including using average historical client volumes (multiplied by a constant) as an estimate. We show that different procedures have hardly any effect on the resulting estimates and we therefore use the average historical client volumes as estimates for $m_s$.

Results
Estimates for $\alpha$ and $\beta$. Table 2 describes the estimated values of $\alpha$ and $\beta$ for each function $g(\cdot)$. The effect of return time on client numbers is significant and positive for all the countries considered. Surprisingly, the strength of this effect appears to be rather limited. For example, the "logarithmic model" estimates that increasing the return time from 1 month to 6 months increases the expected client volume by 6%, 35%, and 27% for Madagascar, Uganda, and Zimbabwe, respectively. In Uganda and Zimbabwe, the client numbers "build up" more strongly over time than in Madagascar. This could be because in Uganda and Zimbabwe more of the clients come to see the outreach team because of word of mouth or encouragement from community health workers, which takes time. In all three countries, however, a significant part of the client volume is not related to demand that "builds up" over time. The MSI staff to whom we presented these results clarified that there is a large unmet need in many sites, so attracting clients is largely a matter of reaching this population through proper "demand generation." These results have important implications for visit frequencies, as we explore in Section 6.
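To see what the reported percentages imply about the relative size of the two parameters, note that under the logarithmic form $g(1) = \log(1) = 0$, so the relative increase from a 1-month to a 6-month return time is $\beta \log(6) / \alpha$. The sketch below inverts this relation. It assumes the natural logarithm and that the percentages are measured relative to the 1-month volume; the resulting ratios are back-of-the-envelope values derived from the rounded percentages in the text, not the estimates of Table 2.

```python
import math

# Relative increases (1 month -> 6 months) reported for the logarithmic model.
reported_increase = {"Madagascar": 0.06, "Uganda": 0.35, "Zimbabwe": 0.27}


def implied_beta_over_alpha(increase, rt=6.0):
    # (alpha + beta * ln(rt)) / alpha - 1 = increase  =>  beta / alpha = increase / ln(rt)
    return increase / math.log(rt)


ratios = {country: implied_beta_over_alpha(x) for country, x in reported_increase.items()}
for country, ratio in ratios.items():
    print(f"{country}: beta/alpha is roughly {ratio:.3f}")
```

Under these assumptions, the return-time-dependent part is small relative to the constant part in all three countries, which is consistent with the observation that a significant part of the client volume does not "build up" over time.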
Comparison of functions $g(RT_{vs})$. The root mean square errors of calibration (RMSEC) presented in Table 2 are based on the residuals resulting from the weighted least squares regression, calculated as in (3).
We note that this metric does not differ greatly between the different functional forms $g(RT_{vs})$. For Madagascar, the three models perform almost equally well. The RMSECs corresponding to the "logarithmic model" are slightly lower for Uganda and Zimbabwe. We observe the same when considering the mean absolute (unweighted) error (MAE), calculated as the absolute difference between predicted and actual client numbers (i.e., calculated as $\sqrt{m_s}\, e_{vs}$, with $e_{vs}$ as defined in (3)).
The table also presents the (weighted) root mean square error of prediction (RMSEP) based on out-of-sample predictions. Specifically, the models were calibrated using all data except the last visit for each site. The fitted model was subsequently used to predict the client volume for the last visit. The RMSEP for the logarithmic model is lowest for each country, suggesting that this function provides the best overall fit among the datasets considered.
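The hold-out scheme described above can be sketched as follows. The weighting is an assumption on our part: since the unweighted error is described as $\sqrt{m_s}\, e_{vs}$, we take the weighted residual to be the prediction error divided by $\sqrt{m_s}$. The function names and the numbers in the example are hypothetical.

```python
import math


def split_last_visit(visits_by_site):
    """Hold out the last (chronologically ordered) visit of every site."""
    train, held_out = {}, {}
    for site, visits in visits_by_site.items():
        train[site] = visits[:-1]
        held_out[site] = visits[-1]
    return train, held_out


def weighted_rmse(actual, predicted, m):
    """Root mean square of (actual - predicted) / sqrt(m_s); the weighting is assumed."""
    squared = [(a - p) ** 2 / ms for a, p, ms in zip(actual, predicted, m)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical client counts per site, in chronological order.
visits_by_site = {"A": [20, 24, 22], "B": [9, 11, 13]}
train, held_out = split_last_visit(visits_by_site)

# Hypothetical predictions for the held-out visits and site multipliers m_s.
rmsep = weighted_rmse(actual=[22, 13], predicted=[20, 10], m=[16, 9])
```

The same `weighted_rmse` function applied to in-sample residuals would give the calibration error, so the two metrics differ only in which visits are scored.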
Error term distribution. To simulate the performance of various visit frequency policies (see Section 6), insight into the distribution of $e_{vs}$ is required. Using the distribution fitter of MATLAB version R2015b, we find that the shifted log-normal distribution fits the residuals $e_{vs}$ most closely. Figure 2 shows the fit on the residuals for Uganda for $g(RT_{vs}) = \log(RT_{vs})$. The calibrated parameter values are: μ = 1.955, σ = 0.310, and shift = −7.4.
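For readers who want to reproduce such error terms in a simulation, the sketch below draws residuals from a shifted log-normal distribution with the parameter values reported above. It assumes the parameterization $e = \text{shift} + \exp(Z)$ with $Z \sim N(\mu, \sigma^2)$; whether this matches the parameterization used by the MATLAB fitter is an assumption. The function name `sample_residual` is ours.

```python
import math
import random

MU, SIGMA, SHIFT = 1.955, 0.310, -7.4  # values reported for Uganda, logarithmic model


def sample_residual(rng):
    """Draw one residual: a log-normal variate moved left by the shift."""
    return rng.lognormvariate(MU, SIGMA) + SHIFT


# Mean of a shifted log-normal: exp(mu + sigma^2 / 2) + shift.
theoretical_mean = math.exp(MU + SIGMA ** 2 / 2) + SHIFT

rng = random.Random(0)  # fixed seed so the sketch is reproducible
draws = [sample_residual(rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
print(f"theoretical mean {theoretical_mean:.3f}, sample mean {sample_mean:.3f}")
```

Under this parameterization the mean of the fitted distribution is close to zero, as one would expect for regression residuals, while the distribution remains right-skewed and bounded below by the shift.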
We can summarize our findings as follows. First, the results indicate that the effect of return time on client numbers is statistically significant and positive for all the countries considered. Second, return time has a relatively limited effect on client numbers. Finally, we showed that the error terms approximately follow a shifted log-normal distribution. The relations identified in this section will be used in the modeling of the FDP in Section 5 and in the case study in Section 6.

Model and Solution Methods
This section has three objectives: to derive insights on determinants of optimal visit frequencies, to develop methods to determine optimal frequencies, and to explain various (heuristic) policies for choosing visit frequencies. In Section 5.1, we formally model the FDP. Section 5.2 develops algorithms to solve the model and Section 5.3 analyzes the resulting frequencies. In Section 5.4, we use insights from the algorithms developed in Section 5.3 to develop several frequency policies.

Model
The model considers one outreach team and a set of outreach sites $S$ for which it is responsible. Decision variable $y_s$ represents the number of full-day visits to site $s$ per month. The assumption of full-day visits is generally met, because travel times in developing countries are long. Two small sites close together that are always combined in a single visit can be incorporated into the model as one artificial site. We optimize over $y_s$ only and leave out the exact planning of which site to visit on which day. We make this simplification because our work is about the strategic/tactical decision to choose a policy for visit frequencies.
We assume that the $y_s$ visits per month are evenly spread over time such that the return time in months, $RT_s$, is $1/y_s$. Marie Stopes International is increasingly focusing on enhancing schedule regularity, as evidenced, for example, by this statement from MSI's "Success Model" handbook: "A regular schedule increases the community's trust in the care that they will receive, especially for those who rely on MSI for the provision of short-term methods [STMs] and follow-up care." MSI is also considering implementing tools and the policies presented in this work, which would likely also decrease variations in return times. There will nevertheless remain reasons to choose a return time that differs from $1/y_s$. Examples include bad road conditions, team vacations, and non-availability of the site. Section 6.2 therefore examines how our results are affected when return times are stochastic.
Since return times can be formulated in days, $y_s$ is in practice close to continuous, and we model it as such. Parameter $D$ represents the total number of full-day site visits that can be scheduled in a month. In line with Section 4, we represent the relationship between the expected number of clients per visit in site $s$ and $y_s$ by the function $m_s \cdot (\alpha + \beta \cdot g(1/y_s))$ with a site-specific multiplier $m_s$. The number of visits to site $s$ is constrained by an upper bound, $UB_s$, and a lower bound, $LB_s$. As explained in Section 2, these bounds ensure that the return time is large enough to adequately prepare for the visit and small enough to be medically and ethically acceptable. For MSI, $LB_s = 1/3$ (once per 3 months) when government providers can assure removals of contraceptive devices and provide access to short-term methods such as condoms, and $LB_s = 1/6$ otherwise.
The FDP can then be modeled as in model (M1). The objective (4a) maximizes the total expected number of clients reached in a month. Capacity constraint (4b) restricts the total number of visits, and (4c) and (4d) are boundary constraints.
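For reference, one way to write a model that matches this description is sketched below. It is assembled from the surrounding definitions (the per-visit function $m_s \cdot (\alpha + \beta \cdot g(1/y_s))$, capacity $D$, and bounds $LB_s$, $UB_s$). Writing the capacity constraint as an inequality, and assigning (4c) to the lower bound and (4d) to the upper bound, are assumptions of this sketch.

```latex
\begin{align}
\text{(M1)}\quad \max_{y}\quad & \sum_{s \in S} m_s\, y_s \bigl(\alpha + \beta\, g(1/y_s)\bigr) \tag{4a}\\
\text{s.t.}\quad & \sum_{s \in S} y_s \le D \tag{4b}\\
& y_s \ge LB_s \qquad \forall s \in S \tag{4c}\\
& y_s \le UB_s \qquad \forall s \in S \tag{4d}
\end{align}
```

The objective multiplies the expected number of clients per visit by the number of visits per month, which gives the expected number of clients per month for each site.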
Problem (M1) is a non-linear simple allocation problem with boundary constraints (Derman 1959). Throughout this study, we consider the typical case when $\sum_{s \in S} LB_s < D$ and $\sum_{s \in S} UB_s > D$. Otherwise, the problem is infeasible or trivial. In the following

Solution Algorithms
In this section, we explain the algorithms used to solve (M1) to optimality for multiple functions $g(1/y_s)$ and analyze the characteristics of optimal frequencies of each functional form. Section 5.2.1 considers the simplest case, where client numbers are independent of return time: $g(1/y_s) = 0$. This may be realistic when services are relatively unknown (i.e., there is no natural build-up of demand) and where site-specific factors such as population size and efforts to market the visit determine the number of clients. Section 5.2.2 considers the simplest increasing effect in which client numbers are linearly dependent on return time, that is, $g(1/y_s) = 1/y_s$. Section 5.2.3 considers the non-linear dependencies $g(1/y_s) = \log(1/y_s)$ and $g(1/y_s) = \sqrt{1/y_s}$.
5.2.1. Client Numbers Independent of Return Time. When client numbers are independent of return time and $g(1/y_s) = 0$, the problem becomes a knapsack problem with bounds. For this problem, the optimal solution can easily be found by a greedy algorithm (Kellerer et al. 2004). Online Appendix D describes this algorithm in the notation of this study.
In short, we initially set all visit frequencies at the lower bound. That leaves $D - \sum_{s \in S} LB_s$ days to allocate. In optimality, we assign as many days as possible to the sites with the highest $m_s$. We thus increase the visit frequencies of the "unsaturated" sites for which $m_s$ is highest to their upper bound until all available days are used. Consequently, at most one site will be assigned a frequency not equal to the lower or upper bound.
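The greedy procedure summarized above can be sketched as follows. This is a minimal illustration of the prose description, not the algorithm as stated in Online Appendix D; the function name and the small four-site instance are hypothetical.

```python
def greedy_frequencies(m, lb, ub, capacity):
    """Allocate visit days greedily when client numbers ignore return time.

    m, lb, ub are equal-length lists (multiplier and frequency bounds per site);
    capacity is D, the number of visit days per month.
    """
    y = list(lb)                        # start every site at its lower bound
    remaining = capacity - sum(lb)      # days still to allocate
    if remaining < 0:
        raise ValueError("lower bounds alone exceed capacity")
    # Raise the highest-multiplier sites to their upper bound first.
    for s in sorted(range(len(m)), key=lambda i: m[i], reverse=True):
        extra = min(ub[s] - lb[s], remaining)
        y[s] += extra
        remaining -= extra
        if remaining <= 0:
            break
    return y

# Hypothetical instance: four sites, bounds 0.25 and 1.0, D = 2.25 days per month.
m = [10.0, 40.0, 25.0, 5.0]
y = greedy_frequencies(m, lb=[0.25] * 4, ub=[1.0] * 4, capacity=2.25)
print(y)  # [0.25, 1.0, 0.75, 0.25]
```

In this instance only the site with $m_s = 25$ ends up strictly between its bounds, which illustrates the "at most one site" property.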

5.2.2. Client Numbers Linearly Dependent on Return Time. If client numbers are linearly dependent on return time, that is, $g(1/y_s) = 1/y_s$, we can rewrite objective function (4a) as $\sum_{s \in S} m_s y_s (\alpha + \beta / y_s) = \alpha \sum_{s \in S} m_s y_s + \beta \sum_{s \in S} m_s$. The first term equals the objective function for constant demand. The second term is constant and can be discarded in optimization. What remains is an objective function of the form considered in Section 5.2.1. The same algorithm can thus be applied to find the optimal solution. There will therefore still be at most one site with an optimal frequency not equal to the lower or upper bound.
We use the iterative procedure from Bretthauer and Shetty (1995) to compute $\lambda^*$, the optimal value of $\lambda$.
Online Appendix D describes this algorithm in more detail. The optimal frequencies are then $y_s^{\log *}(\lambda^*)$ and $y_s^{\mathrm{sqrt} *}(\lambda^*)$, which can be in between the bounds.
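To make the role of the multiplier concrete, the sketch below solves the logarithmic case with a plain bisection on $\lambda$ rather than the Bretthauer and Shetty (1995) procedure used in the paper. It assumes the objective $\sum_s m_s f(y_s)$ with $f(y) = y(\alpha + \beta \log(1/y))$, so that $f'(y) = \alpha - \beta - \beta \log y$, common bounds for all sites, and the Uganda parameter values quoted with the optimality-gap results. All function names are illustrative.

```python
import math

ALPHA, BETA = 0.8475, 0.1665   # assumed: Uganda fit for the logarithmic model

def y_log(lam, m_s, lb, ub):
    """Frequency solving m_s * f'(y) = lam, clipped to [lb, ub]."""
    y = math.exp((ALPHA - BETA - lam / m_s) / BETA)
    return min(max(y, lb), ub)

def solve_log_model(m, lb, ub, capacity, tol=1e-10):
    """Bisection on lam until the clipped frequencies use the whole capacity."""
    lo = 0.0                                             # every site at ub
    hi = max(m) * (ALPHA - BETA - BETA * math.log(lb))   # every site at lb
    while hi - lo > tol:
        lam = (lo + hi) / 2
        used = sum(y_log(lam, m_s, lb, ub) for m_s in m)
        if used > capacity:
            lo = lam    # too many visit days requested: raise the multiplier
        else:
            hi = lam
    lam = (lo + hi) / 2
    return lam, [y_log(lam, m_s, lb, ub) for m_s in m]

# 40 sites with m_s = 2.5, 5, ..., 100, bounds 1/6 and 1, and D = 20.
m = [2.5 * k for k in range(1, 41)]
lam, y = solve_log_model(m, lb=1 / 6, ub=1.0, capacity=20.0)
on_bounds = sum(1 for v in y if v in (1 / 6, 1.0))
print(f"lambda* = {lam:.4f}, sites on a bound: {on_bounds} of {len(y)}")
```

Bisection works here because each clipped frequency is non-increasing in $\lambda$, so the total number of visit days is monotone in $\lambda$ as well.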

Numerical Insights into Optimal Visit Frequencies
The algorithms in Sections 5.2.1 and 5.2.2 provide a valuable insight: There is at most one site with an optimal frequency different from the lower or upper bound if $g(1/y_s) = 0$ or $g(1/y_s) = 1/y_s$. Under a nonlinear dependency, such insights cannot be derived directly from the algorithm referred to in Section 5.2.3. In the remainder of this section, we use that algorithm to solve several small problem instances to gain insight into the characteristics of an optimal solution.
In particular, we assess to what extent optimal solutions resemble lower and upper bound frequencies and how this depends on context. We specifically consider three contextual factors. The first is the magnitude of the effect of return time on client numbers, i.e., the value of the parameter β. We compare optimal frequencies for the values of β fitted for Uganda and Madagascar in Section 4, since these values were the highest and lowest, respectively. Second, we examine how frequencies are affected by the level of variation in $m_s$. We consider three instances. Instance 1 has 40 sites with $m_s = 2.5, 5, \ldots, 100$. Instances 2 and 3 have the same number of sites but $m_s = 31, 32, \ldots, 70$ and $m_s = 40.5, 41, \ldots, 60$, respectively. Third, to study the effect of capacity on optimal frequencies, we consider what happens when $D = 20$ (240 days per year) and $D = 16\tfrac{2}{3}$ (200 days per year). Furthermore, we set $LB_s = 1/6$ and $UB_s = 1$ for all sites $s$, which corresponds to return times of between one and six months.
We compute the optimal frequencies for each instance using $g(1/y_s) = \log(1/y_s)$. Table 3 shows, for each instance, the number of frequencies on the lower bound, the upper bound, and in between. The results are similar for $g(1/y_s) = \sqrt{1/y_s}$, and the corresponding table is provided in Online Appendix D.
Table 3 shows that for several sites the frequencies are not set to the bounds. This happens more frequently in Uganda than in Madagascar. The reason is that the value of β is higher for Uganda. The higher the value of β, the greater the increase in client numbers on the return visit. The model then "deviates" more from the constant model, where it is optimal to set frequencies to the bounds. Additionally, the numbers show that a larger variation in $m_s$, as in Instance 1, results in more frequencies on the bounds. To explain the rationale, let us consider a situation in which there is just enough capacity to serve one of two sites, $s_1$ and $s_2$, with the upper bound frequency and the other with the lower bound frequency. If baseline client volume $m_{s_1} = 20$ and $m_{s_2} = 80$, then it is intuitively clear that serving $s_1$ with the lower bound frequency and $s_2$ with the upper bound frequency is optimal. However, when $m_{s_1} = 50$ and $m_{s_2} = 51$, assigning the lower bound frequency to $s_1$ and the upper bound frequency to $s_2$ is not optimal. The reason lies in the concave shape of $g(RT_{vs})$: Increasing the frequency of $s_1$ by Δ would lead to a relatively large increase in client volume, and decreasing the frequency of $s_2$ by Δ would lead to a relatively small decrease in client volume.
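The two-site argument can be checked numerically by comparing marginal gains and losses at the bounds. The sketch below assumes the logarithmic model with the Uganda parameters quoted with the optimality-gap results and bounds of 1/6 and 1; the paper's example does not state which parameters it has in mind, so this is an illustration only.

```python
import math

ALPHA, BETA = 0.8475, 0.1665   # assumed: Uganda fit for the logarithmic model
LB, UB = 1 / 6, 1.0

def marginal(y):
    """Derivative of f(y) = y * (ALPHA + BETA * log(1 / y))."""
    return ALPHA - BETA - BETA * math.log(y)

def bounds_solution_is_optimal(m_low, m_high):
    """True if (LB for the low site, UB for the high site) cannot be improved
    by shifting a little frequency from the high site to the low site."""
    gain = m_low * marginal(LB)     # clients gained per unit of added frequency
    loss = m_high * marginal(UB)    # clients lost per unit of removed frequency
    return gain <= loss

print(bounds_solution_is_optimal(20, 80))   # True
print(bounds_solution_is_optimal(50, 51))   # False
```

Because the objective is concave, this first-order comparison is enough to decide whether the bound solution can be improved.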
In Online Appendix E, we also show that frequencies that are not on the bounds are about equally distributed across the interval [LB, UB]. That is, they are not concentrated around the bounds. This suggests that it is beneficial to develop frequency policies which allow for at least one frequency value between the bounds.
Finally, the number of sites for which the assigned frequency is not on the upper or lower bound does not seem to depend strongly on capacity. In all instances, a decrease in capacity of 40 days makes hardly any change to this. In short, under a logarithmic effect of return time on client numbers, the optimal frequencies seem to differ most strongly from the bounds when the variation in $m_s$ is small and the effect of return time on client numbers is large.

Policies
In this section, we use insights from Section 5.3 to develop frequency policies. All our data-driven policies rely on an estimate of the site-specific multiplier $m_s$. We consider two ways to estimate this, which define two types of policy. The first type estimates $m_s$ as the historical client average. This is simple and close to current practice, where teams tend to consider past client numbers informally; that is, they do not correct for the impact that return times had on historical client numbers. We refer to these policies as "ignore return time" (IRT) policies. The second type of policy does make this correction and we refer to these policies as "correct for return time" (CRT) policies. CRT policies estimate $m_s$ by $\hat{m}_s$ as defined in (7). Here, $V_s$ is the set of the $|V_s|$ previous visits to site $s$.
This is an unbiased estimator that assumes client numbers $CL_{vs}$ are determined by (2). CRT policies are considered more complex, because they require estimates for α and β and more complex calculations.
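The difference between the two estimation approaches can be illustrated as follows. The exact form of estimator (7) is not reproduced here, so the CRT function below is an assumed form: it divides each observed client number by the return-time factor $\alpha + \beta \log(RT)$ and averages the result. The parameters are the Uganda values quoted with the optimality-gap results, and the site history is hypothetical and noise-free.

```python
import math

ALPHA, BETA = 0.8475, 0.1665   # assumed: Uganda fit for the logarithmic model

def estimate_irt(clients):
    """IRT estimate: plain historical average of clients per visit."""
    return sum(clients) / len(clients)

def estimate_crt(clients, return_times):
    """CRT-style estimate (assumed form, not necessarily estimator (7)):
    average of clients divided by the factor ALPHA + BETA * log(RT)."""
    ratios = [c / (ALPHA + BETA * math.log(rt))
              for c, rt in zip(clients, return_times)]
    return sum(ratios) / len(ratios)

# Noise-free history of a hypothetical site with m_s = 40,
# visited after 1, 2, and 6 months.
m_true = 40.0
return_times = [1.0, 2.0, 6.0]
clients = [m_true * (ALPHA + BETA * math.log(rt)) for rt in return_times]

print(f"IRT: {estimate_irt(clients):.2f}, CRT: {estimate_crt(clients, return_times):.2f}")
```

Without noise, the corrected estimate recovers the multiplier exactly, whereas the plain average depends on which return times happened to occur in the history.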
Given a choice on how to estimate $m_s$, various frequency policies are possible. We next explain four of them in increasing order of complexity: the equal policy, two-category policy, three-category policy, and infinite category policy. For ease of exposition, we consider the case when all sites have the same lower bound $LB_s = LB$ and upper bound $UB_s = UB$. The policies can easily be adjusted to the case when bounds differ per site.
Equal Policy (EP). In this baseline policy, each team divides its monthly capacity of $D$ days equally across all sites for which it is responsible. Each site is assigned the same frequency $D/|S|$.
Two-Category Policy (TwoCP). The previous section shows that where client numbers are not dependent on return time or have a linear connection to it, all frequencies are set to the bounds with at most one exception per team. Our numerical experiments show that many frequencies are also set to the bound when client numbers depend nonlinearly on return time. This leads to the hypothesis that a TwoCP, a policy that specifies two frequency categories, could perform in a close to optimal way. In such a policy, sites are assigned to either the low-frequency category (LFC), that is, $y_s = LB$, or the high-frequency category (HFC), that is, $y_s = UB$. The policy initially assigns all sites to the LFC and reassigns them to the HFC in decreasing order of estimated $m_s$ for as long as capacity allows. This may leave some spare capacity after the last reassignment. Assuming the upper bound is a hard bound, we increase the frequency of sites in the LFC to slightly above the lower bound so that all capacity is fully utilized.
Three-Category Policy (ThreeCP). Under nonlinear dependencies of client numbers on return time, the optimal frequencies can deviate from the bounds, as shown in Section 5.3. It might therefore be desirable to include a middle-frequency category (MFC). It is possible to optimize the size of the MFC and its corresponding frequency, but this would be a study in its own right. We therefore fix the frequencies corresponding to the three categories, and suggest that a national or team guideline should be issued on the percentage of sites to allocate to the MFC.
Such a guideline on the percentage of sites in the MFC can, for instance, be developed by solving model (3C), which models the problem of allocating sites to frequency categories.
Here we define $C = \{LFC, MFC, HFC\}$ as the set of categories and $y_c$ as the frequency corresponding to category $c \in C$. Furthermore, $a_{sc}$ represents the expected number of clients for site $s$ if this site is in category $c$, which we estimate through formula (2). Binary variable $x_{sc}$ denotes whether site $s$ is in category $c$. In the case of team-specific guidelines on the percentage of sites in the MFC, model (3C) can be solved for each team. For national guidelines, organizations can solve (3C) for each team and use the aggregated percentage.
Given the percentage of sites per category, the assignment of sites to the HFC and then to the MFC and LFC is done in decreasing order of estimated $m_s$. Implementation is discussed in more detail in Section 6.
Infinite Category Policy (InfCP). The InfCP applies the exact algorithms from Section 5.2 using estimates for $m_s$. This policy can recommend all frequencies between the lower and upper bounds. It is of course more difficult to implement.
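The equal policy and the two-category policy described above can be sketched in a few lines. How the spare capacity is divided among the low-frequency sites is not specified in the text; spreading it equally is an assumption of this sketch, as are the function names and the six-site example.

```python
def equal_policy(n_sites, capacity):
    """EP: every site gets the same frequency D / |S|."""
    return [capacity / n_sites] * n_sites

def two_category_policy(m_est, lb, ub, capacity):
    """TwoCP: start all sites at lb, move sites to ub in decreasing order of
    estimated m_s while capacity allows, then spread leftover days equally
    over the low-frequency sites (equal spreading is an assumption)."""
    n = len(m_est)
    y = [lb] * n
    spare = capacity - n * lb
    high = set()
    for s in sorted(range(n), key=lambda i: m_est[i], reverse=True):
        if spare < ub - lb:
            break                      # not enough capacity for another HFC site
        y[s] = ub
        spare -= ub - lb
        high.add(s)
    low = [s for s in range(n) if s not in high]
    for s in low:
        y[s] += spare / len(low)       # slightly above the lower bound
    return y

# Hypothetical team with six sites, bounds 0.25 and 1.0, D = 3.5 days per month.
m_est = [12.0, 55.0, 30.0, 8.0, 41.0, 20.0]
y = two_category_policy(m_est, lb=0.25, ub=1.0, capacity=3.5)
print(y)  # [0.375, 1.0, 0.375, 0.375, 1.0, 0.375]
```

The two sites with the highest estimated multipliers end up in the high-frequency category, and the remaining half day is shared by the other four sites.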

Optimality Gaps
We now analytically prove a bound on the optimality gap when using two frequency categories instead of infinitely many and then generalize this to the case when k frequency categories are distinguished. We specifically consider the realistic case when $g(RT_s)$ is concave and first derive the following lemma (see Appendix A for the proof):
LEMMA 1. Objective function (4a) is concave in $y_s$ for any function $g(RT_s)$ that is concave in $RT_s$.
In addition, we consider the realistic scenario when visiting a site more frequently never leads to fewer clients in that site in total and state without proof (it follows when applying the product rule) that: We note that, for the fitted values of α and β presented in Section 4.3, this condition is always met.
Let us write the objective function again as $\sum_{s} m_s f(y_s)$, let $f'(y)$ denote the derivative of function $f(y)$, and let $\tilde{y}(\delta) = \{y : f'(y) = \delta\}$, the point where the derivative equals $\delta$. Furthermore, let $n_c$ denote the optimal number of sites to be assigned to frequency category $c \in \{1, \ldots, k\}$, $y_c$ the corresponding frequency with $y_1 = LB$ and $y_k = UB$, and $S_c$ the set of sites assigned to category $c$. Assume that this leaves no spare capacity. Assume further that the decision maker knows the exact value of $m_s$ and assigns the $n_1$ sites with the lowest value of $m_s$ to the lowest frequency category, the $n_2$ remaining sites with the lowest value of $m_s$ to the second lowest frequency category, etc. Proposition 1 states the corresponding optimality gap for the two-category policy.
PROPOSITION 1. The optimality gap of TwoCP, measured as the absolute difference with the optimal InfCP solution value, is bounded from above by $\sum_{s \in S} m_s \Delta$, where $\Delta$ is calculated as $f(\tilde{y}(\delta)) + \ldots$
We refer to Appendix B for the proof, which utilizes the two conditions stated in Lemmas 1 and 2. Specifically, we replace the objective function by a linear "outer approximation." The optimal solution of the approximated problem is the same as the TwoCP solution and its value provides an upper bound on the optimal solution value. Taking the difference between this value and the "real" value of the TwoCP solution yields the optimality gap. We apply the same approach to generalize the result to the case when there are k frequency categories.
PROPOSITION 2. The optimality gap of a policy that distinguishes k frequency categories, measured as the absolute difference with the optimal InfCP solution value, is bounded from above by $\sum_{c \in C} \sum_{s \in S_c} m_s \max_{c \in C} \{\Delta_c\}$, where $\Delta_c$ is calculated as $f(\tilde{y}(\delta_c)) + \ldots$
For the case study presented in the next section, with $f(y_s) = y_s(\alpha + \beta \log(1/y_s))$, α = 0.8475 and β = 0.1665 (i.e., the fitted function for Uganda), we derive that this corresponds to a worst-case optimality gap of only 7.7% for TwoCP and 4.0% for ThreeCP, measured as a percentage of the InfCP solution value. This worst-case bound is not tight, so the actual gap is likely even smaller. This is a very important result, since a large worst-case gap would likely be seen as a barrier in practice.
We do note that these results are obtained for the case when the decision maker knows the exact value of $m_s$, whereas in practice it must be estimated. The next section therefore examines the optimality gap numerically.

Case Study: Outreach Teams of Marie Stopes Uganda
In this section, we study the performance of the policies proposed in Section 5.4, using Marie Stopes Uganda as a case study. We use real data from 10,293 visits to 1,581 sites in Uganda. We first explain the simulation model before presenting the results.

Simulation Model
The simulation model uses function (2) with the parameters α, β, $m_s$ and the error distribution found in Section 4.3 to simulate client numbers. It assumes a logarithmic dependency of client numbers on return times, because Section 4.3 suggests that this functional form provides the best fit. Our simulations follow the outline depicted in Figure 3. They start by simulating n historical data points which the policies take as input. We change the value of n in our experiments to assess the impact of the amount of data available. For each historical data point, we first randomly draw a historical return time from our data for the corresponding site and use the error distribution fitted in Section 4.3 to draw an error. Function (2) transforms these return times and errors into simulated client volumes. In Step 2, the policies use these simulated client numbers to estimate $m_s$ for each site $s$. Such estimates can differ from the value of $m_s$ used by the simulation model, which we refer to as the "actual" value of $m_s$, due to stochasticity and the estimation method used by the policy. We refer to these cases as imperfect information, because $m_s$ is estimated. For comparison, we also consider the case of perfect information in which the actual value of $m_s$ is known. In Step 3, the policies subsequently use these estimates to determine frequencies. Finally, we use function (2) in Step 4 to compute for each policy the expected average number of clients per visit when the frequencies recommended by the policy are implemented. To obtain a precise estimate of expected average client numbers, we repeat these steps 100 times. Appendix C explains each of the four steps in more detail.
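The four steps can be sketched compactly as below for a two-category policy with the simple (IRT) estimate. Several elements are assumptions rather than the paper's implementation: the form of function (2) (an expected volume $m_s(\alpha + \beta \log RT)$ plus an error scaled by $\sqrt{m_s}$, floored at zero), return times drawn uniformly between 1 and 6 months instead of from historical data, a synthetic set of 40 sites, and a simplified policy that leaves spare capacity unused.

```python
import math
import random

ALPHA, BETA = 0.8475, 0.1665            # assumed: Uganda fit, logarithmic model
MU, SIGMA, SHIFT = 1.955, 0.310, -7.4   # shifted log-normal error parameters
LB, UB = 1 / 6, 1.0

def factor(rt):
    """Return-time factor of the logarithmic model."""
    return ALPHA + BETA * math.log(rt)

def simulate_history(m_s, n, rng):
    """Step 1: n simulated client counts for one site (uniform return times
    are a stand-in for sampling from historical data)."""
    history = []
    for _ in range(n):
        rt = rng.uniform(1.0, 6.0)
        error = rng.lognormvariate(MU, SIGMA) + SHIFT
        history.append(max(0.0, m_s * factor(rt) + math.sqrt(m_s) * error))
    return history

def two_category(m_est, capacity):
    """Step 3: simplified TwoCP (spare capacity is left unused here)."""
    y = [LB] * len(m_est)
    spare = capacity - LB * len(m_est)
    for s in sorted(range(len(m_est)), key=lambda i: m_est[i], reverse=True):
        if spare < UB - LB:
            break
        y[s], spare = UB, spare - (UB - LB)
    return y

def clients_per_visit(m, y):
    """Step 4: expected clients per visit under the actual multipliers."""
    total = sum(m_s * y_s * factor(1 / y_s) for m_s, y_s in zip(m, y))
    return total / sum(y)

rng = random.Random(1)
m_actual = [2.5 * k for k in range(1, 41)]   # synthetic "actual" multipliers
runs = []
for _ in range(100):                          # repeat the steps 100 times
    # Step 2: IRT estimate, the plain average of n = 5 simulated visits.
    m_est = [sum(h) / len(h)
             for h in (simulate_history(m_s, 5, rng) for m_s in m_actual)]
    runs.append(clients_per_visit(m_actual, two_category(m_est, 20.0)))

imperfect = sum(runs) / len(runs)
perfect = clients_per_visit(m_actual, two_category(m_actual, 20.0))
print(f"imperfect information: {imperfect:.2f}, perfect information: {perfect:.2f}")
```

In this simplified setting the policy with estimated multipliers can never beat the one with perfect information, because both promote the same number of sites and ranking by the actual multipliers is the best possible ranking.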

Simulation Results
We first assess the trade-off between the complexity and effectiveness of policies when historical return times are used in Step 1. Second, we analyze how the effectiveness of the policies depends on the amount of historical data available for each site. Third, we analyze the sensitivity of results to different simulated return times.
Comparison of Performance of Policies. Figure 4 shows the average number of clients per visit that can be served when applying the various policies for n = 5 as well as the current average number in the data. The policies that assign different frequencies to different sites clearly perform better than either the equal policy (EP) or the current situation. Interestingly, the simplest differentiation policy, the two-category policy with a simple estimate for $m_s$ (TwoCP-IRT), already yields over 7% more clients than the current situation. Moving from two to three categories leads to another 1% increase. The effect of more complex policies is relatively small. Moving from three to an infinite number of categories results in a further increase of less than 0.4%. The increase in client volume achieved by correcting for historical return times when categorizing sites is less than 0.5% for all policies.
Although the increases in client numbers might seem small, even a small increase per visit means that, over the course of a year, a very large number of additional families can be provided with family planning services. With D = 20 and 24 teams, the increase in client numbers when moving from the current situation to TwoCP-IRT corresponds to an increase of over 12,000 clients a year in Uganda alone.
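The scale of this figure can be made tangible with a short calculation. The sketch assumes that every team uses all D = 20 visit days in each of 12 months; the per-visit figure it derives is implied by the numbers in the text rather than reported in the paper.

```python
DAYS_PER_MONTH = 20               # D in the case study
TEAMS = 24
MONTHS_PER_YEAR = 12
EXTRA_CLIENTS_PER_YEAR = 12_000   # lower bound stated in the text

# Total number of full-day site visits per year across all teams.
visits_per_year = DAYS_PER_MONTH * TEAMS * MONTHS_PER_YEAR
# Additional clients per visit needed to reach the stated yearly total.
extra_per_visit = EXTRA_CLIENTS_PER_YEAR / visits_per_year
print(visits_per_year, round(extra_per_visit, 2))   # 5760 2.08
```

An increase of only about two clients per visit is thus enough to reach more than 12,000 additional clients per year.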
Impact of Number of Observations. Many countries currently have little historical data. In Uganda, Zimbabwe, and Madagascar, for example, 36%, 91%, and 47% of the sites have five or fewer data points, respectively. Lack of data may induce reluctance to implement the proposed data-driven methodology. We therefore examine the impact of n, the number of observations available to estimate $m_s$, on the effectiveness of policies. Figure 5 plots the average client number for n ∈ {1, 2, 5, 10, 20, 50}. The equal policy is not data-driven and is therefore not influenced by limited data availability. For all other policies, effectiveness increases with n, but this increase is small when n is large. With n = 50, all policies perform to within 0.6% of the number of clients obtained when there is perfect information.
Even with n = 1, all policies outperform the EP by at least 3%. The ThreeCP consistently performs around 1% better than the TwoCP. It is interesting that the InfCP performs worse than the TwoCP and ThreeCP if n = 1. The policy appears to assign frequencies that are too specific, based on too little information. The InfCP only outperforms the ThreeCP when n ≥ 5. With perfect information, the TwoCP results in 11% more clients than is currently the case, the ThreeCP in close to 1% more clients than the TwoCP, and the InfCP in less than 0.5% more than the ThreeCP. The effect of correcting for return time diminishes with n from around 0.5% when n = 1 to around 0.25% when n = 50.
The Importance of Correcting Historical Client Volume for Return Times. We previously used historical return times in the first step of the simulation. Once policies are in use, the return times will be very different. We therefore also examine how the policies perform when in use. Instead of sampling from actual historical return times in Step 1 of the simulation model, we consider the case when return times were optimized by the policy itself. All other simulation steps remain the same. Figure 6 shows the resulting average client numbers per visit. When n = 1, the policies that correct for return times now reach over 5% more clients than their counterparts that ignore return times. Even for n = 50, the TwoCP-CRT, ThreeCP-CRT, and InfCP-CRT perform better than their counterparts by 3.5%, 2.0%, and 1.7%, respectively.
These results can be explained by the observation that the variation in client numbers will decrease when mobile teams start using frequency policies. This happens because sites with a high $m_s$ are visited often, resulting in relatively low client numbers per visit, and the opposite happens for sites with a low $m_s$. This leads to a bias in the $m_s$ estimates used in IRT policies: When $\hat{m}_s$ is based solely on client numbers, the baseline client volume $m_s$ for sites that are visited very frequently will be underestimated and the volume for sites that are visited much less often will be overestimated. The importance of correcting historical client volume for return times therefore increases when frequencies are already data-driven.
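The direction and rough size of this bias can be illustrated under the logarithmic model. The sketch assumes the Uganda parameters quoted with the optimality-gap results and ignores noise, so that the average client number of a site that is always visited with return time RT equals $m_s(\alpha + \beta \log RT)$; the function name is illustrative.

```python
import math

ALPHA, BETA = 0.8475, 0.1665   # assumed: Uganda fit for the logarithmic model

def irt_bias_factor(rt_months):
    """Ratio of the uncorrected (IRT) estimate to the true m_s for a site
    that is always visited with the given return time, ignoring noise."""
    return ALPHA + BETA * math.log(rt_months)

for rt in (1.0, 6.0):
    print(f"return time {rt:.0f} months: IRT estimate = {irt_bias_factor(rt):.3f} x m_s")

# Return time at which the uncorrected estimate would be unbiased.
neutral_rt = math.exp((1 - ALPHA) / BETA)
print(f"no bias at a return time of about {neutral_rt:.1f} months")
```

Under these assumptions, a site visited monthly is underestimated by roughly 15% and a site visited every six months is overestimated by a similar amount, which compresses the apparent differences between sites.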

Sensitivity Analyses
Our analyses and policies implicitly make several simplifying assumptions. This section summarizes the results of extensive sensitivity analyses presented in Online Appendix G.
Return Time Stochasticity. Our evaluation of frequency policies implicitly assumes that each site is visited with a fixed time interval. As explained in Section 5.1, return times are stochastic in reality, which may bias our evaluation. Our results, however, show that this bias is very small and that it hardly affects the optimality gap.
Client Characteristics. Family planning organizations perceive young clients and adopters, among others, as key target groups, while our policies assign equal weight to each client. Our results show that optimality gaps are hardly affected when the "real" weight of young clients and adopters is significantly higher. The explanation is that differences in the percentage of young clients and adopters among a given team's sites are typically minor.
Biased Progression Rates. The strength of demand progression, captured by parameter β, has to be estimated from historical data. Our analyses show that the policies are robust to significant overestimation of β. The explanation is that the policies set many frequencies to the bounds, which is optimal when β = 0. The policies are also insensitive to substantial underestimations of β. TwoCP appears more sensitive than the other policies. The reason is that, when β is large, it becomes relatively attractive to set many frequencies between the bounds (see Section 5.3).
Heterogeneous Progression Rates. Our models implicitly assume that parameter β is the same across all sites. Due to differences in client characteristics, however, demand progression may differ per site. We therefore examine the performance of our policies when demand progression is represented by a site-specific constant $\beta_s$. As expected, optimality gaps increase with the amount of heterogeneity in $\beta_s$, but this increase is rather minor, even when the level of heterogeneity is substantial. The reason appears to be that, for many sites with a high (low) baseline client volume, even a substantially biased estimate of $m_s$ correctly identifies the site as one to be assigned to the high-frequency category (low-frequency category).
Functional Form of $g(RT_s)$. Section 4.3 suggests minor differences in the fit of various functional forms for the relationship between client volumes per visit and return time. Our results show that the performance of our policies is hardly impacted when the "real" functional form differs from the assumed one (logarithmic). The explanation appears to be that, for the fitted values of α and β, the "distance" between the logarithmic function and the other functions is small.

Conclusion and Discussion
Mobile outreach teams are crucial to scale up access to family planning in rural areas and urban slums. Gaps in funding mean that mobile teams have to reach more people with fewer resources, without sacrificing the quality of the service. This study contributes to this endeavor by studying the frequency determination problem (FDP), that is, how to determine site visit frequencies for mobile outreach teams in order to maximize client volume per outreach day. Visit frequencies have an upper and lower bound based on medical, ethical, and planning concerns and are further constrained by the number of days each mobile team is available for site visits. We specifically consider the FDP in the context of family planning and use data from Marie Stopes International (MSI).
The complexity of the FDP lies in how the frequency of site visits affects client numbers, specifically through the return time (i.e., the time since the last visit). We use MSI data for three different countries to analyze the strength of this effect. We find that the client numbers are higher when the return time is longer, but that the magnitude of this effect differs across the countries considered. For example, we estimate that increasing the return time from one month to six months increases the expected client volume by 6%, 35%, and 27% for Madagascar, Uganda, and Zimbabwe, respectively.
We examine how optimal frequencies depend on the shape and magnitude of the effect of return time on client numbers. When client numbers are nonlinearly dependent on return times (we consider a logarithmic and square root shape), the optimal frequencies can be largely on the bounds, depending on various factors. They deviate widely from the bounds when the effect of return time on client numbers is large (that is, when demand progresses strongly over time), because the objective function then differs most strongly from the linear case. Deviations from bounds are also more frequent when there is only a small variation in the so-called baseline client volume, the expected client volume for a given baseline return time. In extreme cases, when all sites have the same baseline client volume or when demand increases very strongly with the return time, it can be optimal to set none of the frequencies to the bounds. These results have important implications for outreach program managers. First, they show that in none of the realistic functional forms and parameter settings considered is it optimal to have only one frequency category (i.e., to assign the same frequency to each site). Second, they indicate when it is beneficial to have three or more frequency categories: when demand progresses strongly over time and when there are only small variations in baseline client volumes across sites.
We used these insights to develop simple frequency policies that use only two (TwoCP) or three (ThreeCP) different frequencies and present analytical results on their worst-case performance for the case when decision makers know the exact baseline client volume of each site. For our case study on Marie Stopes Uganda, they reveal a worst-case optimality gap of 7.7% for TwoCP and 4.0% for ThreeCP. This is an important finding, as a large optimality gap could be seen as a barrier to adoption of the policies.
We also tested the policies' effectiveness in an extensive simulation analysis for Marie Stopes Uganda. The main takeaway for outreach program managers is that the effectiveness of mobile teams can increase substantially when visit frequencies are based on historical client numbers. We estimate that the expected number of clients can increase by over 7% compared to the current number when using policies that base visit frequencies on historical client numbers. An increase on this scale would correspond to more than 12,000 additional families that could be provided with family planning services over a year in Uganda alone. Though reaching this number does not require dropping sites altogether, it does require decreasing visit frequencies for sites with few clients, which is a difficult decision to make.
Interestingly, we find that simple planning policies yield solutions that are close to those of exact methods. For example, the results from a policy with two possible frequencies are less than 2% below those obtained using exact methods. This is important, because such a policy fits well with the decentralized decision-making culture in humanitarian organizations and because the implementation, training, and maintenance costs of advanced tools are high (De Vries and Van Wassenhove 2020). A simple policy might thus be more cost-effective. It would also allow local contextual knowledge and experience to be incorporated into the decisions, and help to safeguard the teams' autonomy.
A second finding is that the effectiveness of policies depends strongly on the variation in historical return times. We analyzed policies that correct historical client volumes for return times and found that, when there is little variation in historical return times, these perform less than 0.5% better than policies that do not apply this correction. The latter are easier to implement, as they only require the outreach team to extract the average historical client volume (i.e., no additional calculations). In addition, such a policy resembles the informal decision rules observed in practice. This suggests that the "costs" of correcting historical client volumes for return times outweigh the benefits when there is little variation in historical return times. However, when the variation increases, which happens once data-driven frequency policies are implemented, correcting for return times can increase expected client numbers by more than 5% (see Note 8). This calls upon outreach program managers to develop ways to facilitate such a correction. It could be done by developing a simple spreadsheet model, by integrating the correction into existing software, or through a correction table. Additionally, awareness of this matter can be raised in training sessions.
Third, we find that data-driven frequency determination performs well even when little historical data is available. Even if data on only one visit are available for each site, all data-driven policies outperform current practice. This has important implications for the uptake of this approach to determining visit frequencies, because limited data availability is often seen as a barrier in practice.
As explained, a strength of our policies is that they are easy to adopt and require little data. Implicitly, they thereby make several simplifying assumptions: (1) the strength of demand progression (captured by β) is the same across sites, (2) young clients and adopters have the same weight as other clients, and (3) return times are constant. Our sensitivity analyses show that the performance of the policies is hardly affected by these assumptions. We also show that performance is robust to over- or underestimating demand progression and to misspecifying the functional form of the relationship between return time and client volume. These findings are important for practitioners, as they reduce or remove potential barriers to implementation.
A limitation of this study is that we assume that the part of the client numbers not affected by return time will be constant over time, regardless of the policy. This is realistic in the short and medium term, but in the long term there may be fewer clients in need of family planning around a site that is visited often, because the needs of many have been met in earlier visits. Conversely, the very young populations in developing countries could lead to large cohorts entering their reproductive lives and hence increase the need. Other aspects of a policy besides visit frequency can influence long-term client numbers. One is trust in mobile teams. In all the policies considered in this study, a site is visited with the same frequency over time. The resulting predictability of future visits increases trust in the organization and thus client numbers. Future research is needed to analyze the magnitude of such long-term effects. Follow-up research is also needed to study the relationship between return times and health or inconvenience. Our model captures this relationship in a dichotomous manner: a return time is either medically and ethically acceptable or not. Future empirical research could develop more precise models and subsequently test the performance of guidelines that solely consider client volumes.
Teams may have reasons to deviate from the recommended return times on some occasions. The weather, accessibility, security, a market day, or a vaccination day, for example, may prompt them to visit a site sooner or later than recommended by the policy. Visits also need to be planned with the outreach sites (often local health facilities), which can impose additional constraints. This clearly has an impact on performance. In future research, we hope to pilot our policies and assess them empirically, thereby closing the modeling cycle. If the pilot confirms our findings and leads to a global roll-out of the policy, it could substantially increase the number of women with access to family planning. As mentioned, one dollar invested in family planning is estimated to save between two and six dollars in interventions aimed at meeting other SDGs. Despite convincing statistics, however, there continue to be major funding gaps. Our study exemplifies how knowledge and tools from the operations management discipline can help to scale up access without additional investments, and can thus aid progress towards many of the UN sustainable development goals.
The concavity of $f$ implies that $\hat{f}(y) \geq f(y)$. Replacing objective function (4a) by maximize $\sum_{s \in S} m_s \hat{f}(y_s)$ therefore yields an upper bound on the optimal solution value. The corresponding problem is trivial: the linear, nondecreasing objective function makes it optimal to greedily assign the $n_{HFC}$ sites with the highest value of $m_s$ to the HFC (see Section 5.2.2), yielding solution value $\sum_{s \in S_{LFC}} m_s \hat{f}(y_{LFC}) + \sum_{s \in S_{HFC}} m_s \hat{f}(y_{HFC})$.
Before we finish our proof, we note that: Using these results, we now finish our proof:
for $c < k$. The line segment itself is represented by $\bar{f}_c(y) = f(y_c) + \delta_c (y - y_c)$ for $y \in [y_c, y_{c+1}]$. The concavity of $f(y)$ implies that $f(y) \geq \bar{f}(y)$. We now construct a function $\hat{f}(y) \geq f(y)$ by "shifting" function $\bar{f}(y)$ up by some constant $\Delta$: $\hat{f}(y) = \bar{f}(y) + \Delta$. We represent the "shifted" line segments by $\hat{f}_c(y) = \bar{f}_c(y) + \Delta$. Note that line segment $c$ needs to be shifted up by at least $\Delta_c$ to ensure that $\hat{f}_c(y) \geq f(y)$, where

$$\Delta_c = f(\tilde{y}(\delta_c)) + (y_c - \tilde{y}(\delta_c))\,\delta_c - f(y_c).$$

Hence, setting $\Delta = \max_{c < k} \{\Delta_c\}$ ensures that $\hat{f}(y) \geq f(y)$. Replacing objective function (4a) by maximize $\sum_{s \in S} m_s \hat{f}(y_s)$ therefore yields an upper bound on the optimal solution value. The corresponding problem is trivial. Since $f(y_s)$ is assumed to be nondecreasing in $y_s$, it is optimal to assign the $n_k$ sites with the highest value of $m_s$ to category $k$, the $n_{k-1}$ sites with the highest value of $m_s$ among the remaining sites to category $k-1$, etc. This yields solution value $\sum_{c \in C} \sum_{s \in S_c} m_s \hat{f}(y_c) = \sum_{c \in C} \sum_{s \in S_c} m_s (f(y_c) + \Delta)$. The difference with the lower bound solution hence amounts to $\sum_{c \in C} \sum_{s \in S_c} m_s \Delta$, which completes the proof.
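The shift $\Delta_c$ can be read as the largest vertical gap between $f$ and its line segment on $[y_c, y_{c+1}]$. The following short derivation is a sketch of why, under two assumptions that are made here for illustration only: that $f$ is differentiable and strictly concave, and that $\tilde{y}(\delta_c)$ denotes the point at which the slope of $f$ equals $\delta_c$. The helper function $h_c$ is introduced only for this sketch, and $\bar{f}_c$ denotes the line segment $f(y_c) + \delta_c (y - y_c)$.

```latex
\begin{align*}
h_c(y) &= f(y) - \bar{f}_c(y) = f(y) - f(y_c) - \delta_c\,(y - y_c),
  \qquad y \in [y_c, y_{c+1}], \\
h_c'(y) &= f'(y) - \delta_c = 0
  \;\Longleftrightarrow\; y = \tilde{y}(\delta_c), \\
\Delta_c &= h_c\bigl(\tilde{y}(\delta_c)\bigr)
  = f\bigl(\tilde{y}(\delta_c)\bigr) + \bigl(y_c - \tilde{y}(\delta_c)\bigr)\,\delta_c - f(y_c).
\end{align*}
```

Because $h_c$ is concave and vanishes at both breakpoints, its stationary point is a maximum, and by the mean value theorem it lies strictly between $y_c$ and $y_{c+1}$. Shifting segment $c$ up by $\Delta_c$ therefore makes it touch $f$ at $\tilde{y}(\delta_c)$ and lie above $f$ elsewhere on the segment.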

Appendix C. Simulation Details
Step 1: Simulate historical return times and client numbers. In step 1, we simulate client numbers using Equation (2) with the actual $m_s$ and the α, β, and estimated error distribution found for Uganda in Section 4. We simulate client numbers for $n$ visits per site. We initially randomly draw return times from the set of historical return times for that site.
Step 2: Estimate $m_s$ for each site. In the perfect information case, we use the actual $m_s$ for each site. In the imperfect information case, we estimate $m_s$ as the average simulated client number for IRT policies and by Equation (7) for CRT policies.
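The two estimators in Step 2 can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that the expected client number of a visit equals $m_s(\alpha + \beta g(RT))$, so that the correction in Equation (7) amounts to averaging client numbers after dividing each by $\alpha + \beta g(RT)$; that reading is inferred from the note on Equation (7) and is an assumption. The function names, the default $g = \log$, and the parameter values in the usage note are hypothetical.

```python
import math


def return_time_effect(rt, alpha, beta, g=math.log):
    """Assumed return-time multiplier alpha + beta * g(rt)."""
    return alpha + beta * g(rt)


def estimate_m_irt(clients):
    """IRT-style estimate: plain average of the (simulated) client numbers."""
    return sum(clients) / len(clients)


def estimate_m_crt(clients, return_times, alpha, beta, g=math.log):
    """CRT-style estimate: average of client numbers corrected for return time.

    Each client number is divided by the assumed return-time multiplier
    before averaging (our reading of Equation (7), labeled as an assumption).
    """
    corrected = [
        c / return_time_effect(rt, alpha, beta, g)
        for c, rt in zip(clients, return_times)
    ]
    return sum(corrected) / len(corrected)
```

For example, with the hypothetical values `alpha=1.0` and `beta=1.0`, two visits with 30 and 60 clients after return times of 1 and e give an uncorrected average of 45 but a corrected estimate of 30, because the second visit's higher volume is attributed to its longer return time.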
Step 3: Apply policies to find frequencies. The policies explained in Section 5.4 take the estimated values of $m_s$ and return the recommended visit frequency per site. We next state the parameter choices and then the implementation details for ThreeCP. As explained in Section 2, we use an upper bound of one visit per month and a lower bound of one visit per six months. We set $D$ equal to 20 days per month.
We next explain the implementation of ThreeCP. For the middle category, we choose a visit frequency of one visit per three months: MB = 1/3. This value is chosen because access to short-term family planning methods (such as condoms and birth control pills) should ideally be provided every three months. We solve model (3C) to derive a national guideline for the percentage of sites in the MFC. For a given team, we operationalize this guideline as follows. Let $x$ be the number of sites this team should assign to the MFC according to the national guideline. As this number may be fractional and may not utilize all the capacity, we propose rounding it to the closest multiple of five. Starting from the allocation of TwoCP, the policy then moves the site from the UFC with the lowest estimated value of $m_s$ to the LFC (which frees up a capacity of UB − LB = 5/6 days per month) and then moves the five sites from the LFC with the highest estimated value of $m_s$ to the MFC (which costs MB − LB = 1/6 days per site per month, thus 5/6 days in total). This swap is repeated once for every five sites assigned to the MFC.
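The allocation described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: it assumes that each visit uses one team day, so that the TwoCP allocation places as many of the highest-$m_s$ sites in the UFC as the capacity $D$ allows, and it rounds ties in "closest multiple of five" upward. The function names `two_cp` and `three_cp` and the argument `x_guideline` are hypothetical.

```python
from fractions import Fraction

UB = Fraction(1)      # upper bound: one visit per month
MB = Fraction(1, 3)   # middle category: one visit per three months
LB = Fraction(1, 6)   # lower bound: one visit per six months
D = 20                # team capacity in visit days per month


def two_cp(m_est):
    """Split sites into (ufc, lfc), each sorted by descending estimated m."""
    ranked = sorted(m_est, key=m_est.get, reverse=True)
    n_sites = len(ranked)
    # Assumption: one visit costs one day, so n_ufc*UB + (n_sites - n_ufc)*LB <= D.
    n_ufc = (D - n_sites * LB) // (UB - LB)
    n_ufc = max(0, min(n_sites, n_ufc))
    return ranked[:n_ufc], ranked[n_ufc:]


def three_cp(m_est, x_guideline):
    """Return a dict site -> monthly visit frequency under the swap rule."""
    ufc, lfc = two_cp(m_est)
    mfc = []
    n_mfc = 5 * int(x_guideline / 5 + 0.5)  # closest multiple of five (ties up)
    for _ in range(n_mfc // 5):
        if not ufc or len(lfc) < 4:
            break  # not enough sites left to perform another swap
        # The lowest-m UFC site moves to the LFC, where it ranks highest.
        lfc.insert(0, ufc.pop())
        # The five highest-m LFC sites move to the MFC (capacity-neutral).
        mfc.extend(lfc[:5])
        del lfc[:5]
    freq = {s: UB for s in ufc}
    freq.update({s: MB for s in mfc})
    freq.update({s: LB for s in lfc})
    return freq
```

Exact fractions are used so that the capacity check is not affected by floating-point rounding. For a team with 40 sites, TwoCP in this sketch places 16 sites in the UFC and 24 in the LFC, which uses exactly 20 days per month, and each swap leaves that total unchanged.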
Step 4: Compute expected client numbers. For the case of imperfect information, we simulate client numbers, estimate $m_s$, and compute frequencies. We then compute the expected client number per month per site, $ECL_s$, by multiplying the frequency by the expected client number per visit from (2). The total client number per month is then computed by summing across all sites, $\sum_{s \in S} ECL_s$. We repeat this process 100 times and average the total over the 100 repetitions. Dividing this average by the total number of monthly visits (i.e., $D = 20$ times the number of teams) then yields the average client number per visit.
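The arithmetic of Step 4 for a single repetition can be sketched as below. The sketch assumes that the expected client number per visit from (2) equals $m_s(\alpha + \beta g(RT_s))$ and that the return time, measured in months, is the reciprocal of the monthly frequency; both are assumptions made for illustration, as are the function names and the default $g = \log$. Averaging over the 100 repetitions is left out.

```python
import math


def expected_clients_per_month(m, freq, alpha, beta, g=math.log):
    """ECL_s: frequency times the assumed expected client number per visit."""
    rt = 1.0 / freq  # assumed return time in months between visits
    return freq * m * (alpha + beta * g(rt))


def average_clients_per_visit(m_by_site, freq_by_site, alpha, beta,
                              n_teams, days_per_month=20):
    """Total expected monthly clients divided by the number of monthly visits."""
    total = sum(
        expected_clients_per_month(m_by_site[s], freq_by_site[s], alpha, beta)
        for s in m_by_site
    )
    return total / (days_per_month * n_teams)
```

In the perfect information case, `m_by_site` would hold the actual $m_s$; in the imperfect information case, the frequencies come from estimated values while the expectation is still evaluated with the actual $m_s$.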

Notes
1 The data concern married or in-union women only. Women are defined to have an unmet need for family planning if they indicate that they want to stop or delay childbearing but are not using any method of contraception.
4 This follows when dividing both sides of (2) by $(\alpha + \beta g(RT_{vs}))$, subtracting $\varepsilon_{vs}$ from both sides, and taking the expectation on both sides.
5 This current average client number per visit is the average across all client numbers in the cleaned dataset. The official constraints in the current situation are somewhat less restrictive than those used in the models; in practice, frequencies are currently not always within the bounds.
6 That is, the return times suggested by the policy when the actual value of $m_s$ is known.
7 I.e., the gap with the exact InfCP solution value, measured as a percentage of the latter.
8 N.B., these percentages were obtained for a specific organization and country and might differ in other situations or contexts.

Figure 1 Current Alignment of Return Time with Demand. Each Dot Represents a Site. The Bottom Left Circle Highlights Sites with a Low Client Volume and Low Average Return Time. The Other Circle Highlights Sites with a High Client Volume and High Average Return Time. Shifting Resources from the Former to the Latter would Seem Likely to Improve Effectiveness.

Figure 2 Shifted Log Normal Distribution Fitted to Errors for Uganda

Figure 3 Overview of Simulation

Figure 5 Average Number of Clients Per Visit for Multiple Values of n and all Policies

$$\sum_{s \in S_{LFC}} m_s \bigl(\hat{f}(y_{LFC}) - f(y_{LFC})\bigr) + \sum_{s \in S_{HFC}} m_s \bigl(\hat{f}(y_{HFC}) - f(y_{HFC})\bigr)$$

2. Consider the solution of the policy that distinguishes $k$ frequency categories. It assigns the $n_k$ sites with the highest value of $m_s$ to category $k$, the $n_{k-1}$ sites with the highest value of $m_s$ among the remaining sites to category $k-1$, etc. The corresponding solution value provides a lower bound on the optimal solution value, and is given by $\sum_{c \in C} \sum_{s \in S_c} m_s f(y_c)$. Now consider the piecewise linear function $\bar{f}(y)$ with breakpoints $(y_c, f(y_c))$. This function hence has $k-1$ line segments. Let $\delta_c$ represent the slope of segment $c$, calculated as $\delta_c = \frac{f(y_{c+1}) - f(y_c)}{y_{c+1} - y_c}$.

3 This number is based on 500 teams who work at least 200 days a year and see on average 25 clients per day.

Table 1 Descriptive Statistics of the Datasets
De Vries, Swinkels, and Van Wassenhove: Visit Frequency Policies
next sections. Section 6.2 shows that the main insights we present later are insensitive to biased parameter estimates.

Table 2 Estimated WLS Parameters (Standard Errors in Parentheses), Root Mean Squared Error of Calibration (RMSEC), Mean Absolute Error (MAE), and Root Mean Squared Error of Prediction (RMSEP) for Multiple Functional Forms
Notes: All parameters are significant at the 1% significance level. Lowest value among the functional forms in bold.

Table 3 Frequencies of Sites in Several Instances under $g(RT_{vs}) = \log(RT_{vs})$, $LB_s = LB$, and $UB_s = UB$