A multi-program analysis of cleft lip with cleft palate prevalence and mortality using data from 22 International Clearinghouse for Birth Defects Surveillance and Research programs, 1974–2014

Background: Cleft lip with cleft palate (CLP) is a congenital condition that affects both the oral cavity and the lips. This study estimated the prevalence and mortality of CLP using surveillance data collected from birth defect registries around the world. Methods: Data from 22 population- and hospital-based surveillance programs affiliated with the International Clearinghouse for Birth Defects Surveillance and Research (ICBDSR) in 18 countries on live births (LB), stillbirths (SB), and elective terminations of pregnancy for fetal anomaly (ETOPFA) for CLP from 1974 to 2014 were analyzed. Prevalence and survival (survival for LB only) estimates were calculated for total CLP and its subclassifications, and by pregnancy outcome. Results: The pooled prevalence of total CLP was 6.4 per 10,000 births. The prevalence of CLP and the distribution of pregnancy outcomes varied across programs. Higher ETOPFA rates were recorded in most European programs compared to programs in other continents. In programs reporting low ETOPFA rates or where there was no ascertainment of ETOPFA, the rate of CLP among LB and SB was higher compared to those where ETOPFA rates were ascertained. Overall survival for total CLP was 91%. For isolated CLP, the survival was 97.7%. CLP associated with multiple congenital anomalies had an overall survival of 77.1%, and for CLP associated with genetic/chromosomal syndromes, overall survival was 40.9%. Conclusions: Total CLP prevalence reported in this study is lower than estimates from prior studies, with variation by pregnancy outcomes between programs. Survival was lower when CLP was associated with other congenital anomalies or syndromes compared to isolated CLP.


| INTRODUCTION
Cleft lip with cleft palate (CLP) describes a congenital condition that affects both the oral cavity and the lips (Kadir et al., 2017). The condition results from the failure of the left and right palatal shelves and lips to fuse during the first 9 weeks of fetal development (Berkowitz, 2013). CLP can arise as part of a syndrome or as an isolated disorder, and its causes are thought to involve a range of both genetic and environmental factors (Berkowitz, 2013; Cobourne & Sharpe, 2012). The degree of clefting varies from case to case and does not affect each person equally. This article focuses on undifferentiated CLP, for which current estimates of prevalence are 1.7 per 1,000 live births (LB) (Mossey, Little, Munger, Dixon, & Shaw, 2009).
The prevalence data available for orofacial clefts (OFC) vary internationally due to differences in ascertainment ability, registry resources, and comparability of the conditions classified in reported studies. A European study carried out across 17 different nations demonstrated variation between 6.3 and 26.2 per 10,000 births for all orofacial clefts (cleft palate or cleft lip +/− cleft palate) (mean prevalence 15.2 per 10,000 births) (Calzolari, Rubies, Neville, & Bianchi, 2002). As with many conditions, high-income countries have a greater ability to conduct birth defect surveillance due to more advanced health systems and centrally organized registries (Swanson, 2021). In low- and middle-income countries, the resources available for birth defect surveillance are reduced, which impacts data availability and prevents accurate international comparisons and inferences (Cobourne & Sharpe, 2012).
Mortality of infants born with OFC is associated with the lack of access to appropriate care and surgical intervention (Cobourne & Sharpe, 2012). The diagnosis and treatment available for children with OFC vary internationally, leading to inequalities in health outcomes (Mossey et al., 2009). Understanding where mortality rates are high could help to target further research and interventions to reduce mortality, improve quality of life, and provide greater equity of care. Prevalence data from multiple countries would guide future research to identify risk factors, policies, or ascertainment methods that give rise to variation globally, including nutrition/fortification policies, policies regarding elective termination of pregnancy for fetal anomaly (ETOPFA), prenatal care arrangements, and prevalence of underlying genetic/chromosomal anomalies in the parent population. Understanding more about these associations could enable development and testing of preventative interventions.
The International Clearinghouse for Birth Defects Surveillance and Research (ICBDSR) was founded in 1974 and is affiliated with the World Health Organization. It has a stated mission to "bring together birth defect programs from around the world with the aim of conducting worldwide surveillance and research to prevent birth defects and to ameliorate their consequences" (ICBDSR, n.d.). The ICBDSR includes 42 programs spread across the world with a mixture of population- and hospital-based registries. Data collected by ICBDSR programs enable analysis of the prevalence, pregnancy outcomes, and survival for a range of congenital anomalies on an international basis.
The aim of this retrospective cohort study was to analyze undifferentiated CLP birth surveillance data from participating ICBDSR programs to estimate the prevalence and survival of CLP by pregnancy outcomes while identifying areas for improvement in data collection processes for this type of study.

| METHODS
The structure and content of this article are informed by the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: guidelines for reporting observational studies (von Elm et al., 2007).

| Case definition
The primary congenital anomaly reported in this study is undifferentiated CLP. This includes all CLP cases including isolated CLP (no other orofacial anomalies identified), CLP associated with multiple congenital anomalies, and CLP associated with syndromes. Isolated cleft palate and isolated cleft lip have been reported separately to this data set and are, therefore, not included here. Cases included all identified conceptions, resulting in an individual with CLP, regardless of outcome. Where data were available, subclassifications including isolated CLP, CLP associated with multiple congenital anomalies, and CLP associated with syndromes are reported as mutually exclusive categories.
Keeping with accepted terminology, birth prevalence is used in this article to describe the point prevalence of CLP in discrete populations included in ICBDSR programs (Mason, Kirby, Sever, & Langlois, 2005). Mason et al. (2005) suggested using total births alone as the denominator, but those data were not available in this study, so a slightly modified equation was used. Birth prevalence was calculated as follows:

$$\text{Birth prevalence} = \frac{\text{Number of cases}}{\text{Total live births} + \text{Total stillbirths}}$$
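As a minimal sketch of this calculation, expressed per 10,000 births as in the Results, the Python function below divides the number of cases by the combined count of live births and stillbirths. The function and argument names are illustrative assumptions and are not taken from the study's own R workflow; the example figures are the pooled totals reported in the Results, with the reported total births assumed to be the combined LB plus SB denominator.

```python
def birth_prevalence_per_10000(cases: int, total_births: int) -> float:
    """Birth prevalence per 10,000 births.

    total_births is assumed to be live births plus stillbirths,
    matching the modified denominator described in the text.
    """
    if total_births <= 0:
        raise ValueError("total_births must be positive")
    return 10_000 * cases / total_births


# Pooled totals reported in the Results section of this article.
pooled = birth_prevalence_per_10000(cases=15_103, total_births=23_523_031)
print(round(pooled, 1))
```

Rounded to one decimal place, this reproduces the pooled prevalence of 6.4 per 10,000 births.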

| Data source
All ICBDSR programs were invited to participate. Twenty-three birth surveillance programs from 18 countries provided data covering a range of time periods within the date range 1974-2014. Each program returned a single data set except for the Registry of the Spanish Collaborative Study of Congenital Malformations (ECEMC). ECEMC provided data from two different hospital cohorts, one reporting ETOPFA and the other where data regarding ETOPFA was explicitly not recorded. The ECEMC data sets have been treated separately in the data analysis.
The programs are a mix of population and hospital-based registries. Raw data were provided in the form of MS Excel documents with cases classified by pattern into the following categories: isolated CLP, CLP associated with multiple congenital anomalies, CLP associated with syndromes, and CLP unclassified. No program returned data for CLP unclassified and, therefore, analysis is focused on the other categories only. Further to this, all programs reported a "Total CLP" value that combines all the aforementioned subclassifications.
Data sets included varying amounts of data on LB, stillbirths (SB) and ETOPFA for each of the CLP subclassifications. The most complete data set across all programs was 'Total CLP' and this was selected for more detailed analysis with descriptive statistics presented for subclassifications where possible.

| Data quality assessment/Data analysis
Data were extracted and combined using Microsoft Excel. Primary inspection and analysis of the data were conducted using R (R Core Team, 2021). The data were inspected and cleaned. Data anomalies identified in the reporting triggered dialog with the reporting programs for clarification and correction of errors where possible. Data quality issues were considered by NMG, GR, and PM following early data cleaning. Data sets with overwhelming data errors or omissions following attempts to clarify were excluded from the analysis (n = 1). Therefore, data from 22 surveillance programs amounting to 23 data sets have been included in the analysis.
The quantities of data and formatting of the data files varied considerably between programs, so automated importation was not practical and the data had to be manually imported one file at a time. The data set from each program was individually copied and pasted into a large Excel "master" file, which was of a format suitable for analysis in R. The data were inspected primarily through the use of the aggregate command and ggplot2 to produce summary statistic tables and graphs. Where possible, the data were further checked for obvious errors (e.g., extremely low or high prevalence or mathematical errors such as more deaths than reported cases in a given year).
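The kind of check described above can be sketched in a few lines. The example below is written in Python for illustration, whereas the study itself used R; the field names, the sample rows, and the plausibility thresholds (`low`, `high`) are all hypothetical and are not the values used by the authors.

```python
def find_obvious_errors(rows, low=0.5, high=30.0):
    """Flag rows with more deaths than cases or an implausible prevalence.

    rows: dicts with keys program, year, cases, deaths, births.
    low/high: hypothetical plausibility bounds, per 10,000 births.
    """
    problems = []
    for r in rows:
        label = f"{r['program']} {r['year']}"
        if r["deaths"] > r["cases"]:
            problems.append((label, "more deaths than reported cases"))
        if r["births"] <= 0:
            problems.append((label, "missing or non-positive denominator"))
            continue
        prevalence = 10_000 * r["cases"] / r["births"]
        if not low <= prevalence <= high:
            problems.append((label, f"implausible prevalence {prevalence:.1f} per 10,000"))
    return problems


# Invented rows for illustration only.
sample = [
    {"program": "Program A", "year": 2001, "cases": 10, "deaths": 1, "births": 20_000},
    {"program": "Program A", "year": 2002, "cases": 4, "deaths": 6, "births": 20_000},
    {"program": "Program B", "year": 2002, "cases": 90, "deaths": 2, "births": 10_000},
]
flags = find_obvious_errors(sample)
```

Flagged rows would then be referred back to the reporting program for clarification rather than corrected automatically.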
Not all programs provided data for the entire observation period; therefore, data provided were averaged for the period that each program provided results. For example, when calculating prevalence, while the number of years returned varied, the denominator in all cases was the total number of LB plus SB reported by that program for all the years they returned data, and the numerator was the total cases observed during that same period.
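A minimal Python sketch of this period-level calculation follows: cases and births are summed over whichever years a program returned, and the prevalence is taken from the two totals. The program names and counts are invented for illustration, and the function name is an assumption, not part of the study's code.

```python
from collections import defaultdict

# Hypothetical yearly returns: (program, year, cases, live_births, stillbirths).
yearly_returns = [
    ("Program A", 2001, 12, 19_900, 100),
    ("Program A", 2002, 8, 19_850, 150),
    ("Program B", 2002, 30, 49_700, 300),
]


def period_prevalence(rows):
    """Sum cases and births over each program's reported years,
    then return prevalence per 10,000 births (LB + SB) per program."""
    cases = defaultdict(int)
    births = defaultdict(int)
    for program, _year, n_cases, live, still in rows:
        cases[program] += n_cases
        births[program] += live + still
    return {p: 10_000 * cases[p] / births[p] for p in cases}


prevalence = period_prevalence(yearly_returns)
```

Taking the ratio of totals, rather than averaging yearly rates, weights each year by its number of births, so a year with few births does not dominate a program's estimate.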
Survival was calculated using data for LB only. Survival data are presented as percentages surviving at timepoints from <1 day to 5 years + where this was ascertained. Overall survival includes the timepoint survival data and any death confirmed but without a timepoint attached.
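One way to read this definition is sketched below in Python. It assumes that survival at each timepoint is cumulative (all deaths up to that timepoint are subtracted from live births) and that deaths without a recorded timepoint count toward overall survival only. The timepoint labels and counts are hypothetical.

```python
def survival_percentages(live_births, deaths_by_timepoint, deaths_unknown_time=0):
    """Return (cumulative % surviving at each timepoint, overall % surviving).

    deaths_by_timepoint: ordered list of (label, deaths in that interval).
    deaths_unknown_time: confirmed deaths with no timepoint attached;
    these are counted in overall survival only.
    """
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    surviving = {}
    cumulative_deaths = 0
    for label, deaths in deaths_by_timepoint:
        cumulative_deaths += deaths
        surviving[label] = 100 * (live_births - cumulative_deaths) / live_births
    total_deaths = cumulative_deaths + deaths_unknown_time
    overall = 100 * (live_births - total_deaths) / live_births
    return surviving, overall


# Invented counts for a single hypothetical program.
by_timepoint, overall = survival_percentages(
    live_births=200,
    deaths_by_timepoint=[("<1 day", 2), ("<1 week", 4), ("<1 year", 4)],
    deaths_unknown_time=2,
)
```

In this sketch overall survival can be lower than survival at the last timepoint, because it also subtracts the deaths with no timepoint attached.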

| Ethical consideration
ICBDSR programs providing data for this study have done so according to local ethical procedures and review. Only aggregated data without any personal identifiers were used in this study, and therefore, further ethical review was not required.

| RESULTS
Data from 22 programs amounting to 23 separate data sets were included in the analysis. This included a total number of 23,523,031 births and 15,103 CLP cases. Table 1 provides a description (location, type of registry, area covered, ascertainment period, stillbirth definition, whether ETOPFA is permitted, and prenatal screening services) of ICBDSR programs providing data and included in this study. A description of the follow-up method for LB for each program is presented in Table 2. Table 3 presents descriptive statistics including pregnancy outcomes for each of the included programs (total number of births, total number of CLP cases, prevalence per 10,000 births, percentage of LB among CLP cases, percentage of SB among CLP cases, and percentage of ETOPFA among CLP cases) for the observation period 1974-2014. Tables 4, 5, and 6 present similar descriptive statistics for programs that provided data for each of the subclassifications of CLP described in the methods.
The prevalence of total CLP for each of the included programs ranged from 1.3 per 10,000 births (Mexico Nuevo León) to 10.4 per 10,000 births (Mexico RYVEMCE) and is presented in Figure 1. The pooled average for total CLP prevalence from all programs across the observation period was 6.4 per 10,000 births.
The mean prevalence of total CLP varied each year, with a maximum of 11 per 10,000 births (live and still births) reported in 1979, followed by a range from 4.5 to 8.5 per 10,000 births (live and still births). There is a wide spread of data around the mean, illustrated in Figure 2.
The survival rates of LB varied across programs. A description of the percentage of LB surviving at timepoints varying from 1 day through 5 years for total CLP followed by each of the subclassifications is presented in Table 7. Overall survival is also presented for each program. The pooled average of surviving LB for total CLP was 91% when considering all-cause mortality. The pooled average for isolated CLP was 97.7%. For CLP associated with multiple congenital anomalies, the average surviving LB once all-cause mortality was considered was 77.1%, and for CLP associated with genetic or chromosomal syndromes was 40.9%.

| DISCUSSION
Strengths and limitations of this study are discussed throughout this section. The prevalence of total CLP varied substantially across programs, ranging from 1.26 to 10.37 per 10,000 births for the observation period. Surveillance methods and ascertainment of the presence of clefts varied during the observation period and between programs. Important differences include hospital- versus population-based registries, with hospital programs serving a select sample of a wider population. The geographical area covered by the program is also important to consider, as estimates from a local registry may be skewed by local clusters, although this can be very useful when aiming to identify whether a perceived higher prevalence in a program may be due to local environmental or genetic influences. Analyses in this study do not account for the heterogeneity between programs. The context of the country, culture, and health system where each registry is based should be taken into account when interpreting the data presented in this article. While these data may point to further questions related to causality, it is not possible to draw inferences on causality from these data. The data will be useful to reporting programs for local discussion and interpretation.
The pooled birth prevalence presented for total CLP of 6.4 per 10,000 births is slightly lower than what would have been expected from the global literature. Mossey (Swanson, 2021). The variability in case definition among published studies makes direct comparison difficult. A range of factors may contribute to the low prevalence reported in our study, such as the inclusion of SB in the denominator data, which may impact the prevalence calculation, variance in the ability of programs to ascertain all cases, and variance in the source used for denominator populations between hospital-based or regional programs.
The rate of ETOPFA for total CLP in most European programs was higher than in other continents, with notable exceptions such as Malta, where termination of pregnancy is not legal (including for anomalies that are fatal beyond the womb), and the reported rate of ETOPFA was 0% for that surveillance program. A 19% stillbirth rate for total CLP was reported for Malta; it is important to note that the SB were related to chromosomal syndromes and multiple congenital abnormality cases, and the total number of CLP cases for Malta was small, 42 cases, highlighting the need to apply caution when comparing programs. Previous reports of ETOPFA with CLP among European populations averaged 11.8% (Calzolari et al., 2002). A further program of note is Israel, where terminations or SB are available but are not registered. For programs that provided data on the subclassifications, the percentage of ETOPFA for isolated CLP was low when compared to the other subclassifications, CLP associated with multiple congenital anomalies and CLP associated with genetic/chromosomal syndromes. This is a positive finding for isolated CLP, as this birth anomaly can be surgically repaired, resulting in effective cure for the majority of cases (Williams et al., 2001). Unfortunately, access to quality surgical care is not universal, as demonstrated by a 2015 Lancet Commission (Meara et al., 2015). The low rates of ETOPFA for isolated clefts are similar to those reported in other studies (Calzolari et al., 2002; Yazdy, Honein, & Xing, 2007).
Identification of variations in survival is an essential component of the Global Burden of Disease (GBD) project for monitoring progress in global health and alleviation by access to care (Horton, 2012). The variation in the overall survival of LB for total CLP presented in Table 7 shows a tendency for higher survival rates among programs in Europe but with some exceptions such as Malta (where termination of pregnancy is illegal) with an overall survival of 85% compared to the cohort average of 91%. The survival data for the subclassifications demonstrate clear differences among subclassifications, with isolated CLP cases having the highest rates of survival. Issues surrounding infant mortality in the presence of birth defects are important in the context of primary prevention, and in the case of CLP, timely access to primary cleft repair results in survival rates equivalent to unaffected infants (Christensen, Juel, Herskind, & Murray, 2004; Cobourne & Sharpe, 2012). Christensen et al. demonstrated an increase in mortality among CLP cases compared to standardized rates in a Danish population (Christensen et al., 2004). Mossey and Modell (2002) have explored the influence of access to care on survival, finding that access to comprehensive (multidisciplinary) cleft care coincides with improved survival; most programs in this study are in countries providing comprehensive care (Cobourne & Sharpe, 2012). Furthermore, cleft lip repair has been suggested as a marker of the provision of essential pediatric care, particularly in low- and middle-income countries (Vanderburg et al., 2021). It is important to note that the majority of programs (64%) included in this study are from countries that are classified as high-income by the World Bank. Further efforts to improve ascertainment and recording of CLP cases in low- and low-middle-income countries should be pursued.
As has been reported in other studies of this kind, some of the variation in findings may be due, in part, to variation in ascertainment, available diagnostics, and data quality issues associated with data submitted for this study; therefore, the results should be interpreted with caution, reflecting on the limitations described above (Calzolari et al., 2002; Cobourne & Sharpe, 2012). A detailed diagnostic document has been produced exploring the source data used for this article (Revie, 2020). This presents a number of areas for improvement in data extraction and processes to standardize the approach within reporting programs internationally. It also suggests adding certain information to the data collection that would aid comparison of data across programs and allow for further statistical analysis, for example, calculation of risk ratios for specific outcomes.
Following discussion and further consideration, the authors propose a draft set of data quality indicators that may improve data quality in future studies of this kind (Figure 3). The indicators reflect some of the analytical and data management issues presented with this large and complex set of data. Factors are separated into "critical" and "less critical". Critical missing data and evident calculation errors are given a heavier weighting, whereas empty cells in otherwise complete data sets receive a lower weighting. The draft quality indicators and the detailed diagnostic analysis document (not published) may form the basis for discussion and future work to improve the quality of the data extracted and reported from programs. A working group focused on standardizing data collection forms, guidance, and regular reporting mechanisms for CLP may prove beneficial, enabling an improvement in the validity and reliability of similar data sets in the future.
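To illustrate how a weighting of this kind could be applied, the Python sketch below sums weighted issue counts for a data set. The issue names and the numeric weights are hypothetical placeholders chosen for this example; they are not the indicators or weights shown in Figure 3.

```python
# Hypothetical weights: heavier for "critical" issues, lighter for
# "less critical" ones. These are not the values used in Figure 3.
WEIGHTS = {
    "critical_missing_data": 3,
    "calculation_error": 3,
    "empty_cell": 1,
}


def quality_penalty(issue_counts):
    """Return a weighted penalty score; higher means poorer data quality."""
    unknown = set(issue_counts) - set(WEIGHTS)
    if unknown:
        raise KeyError(f"unrecognised issue types: {sorted(unknown)}")
    return sum(WEIGHTS[issue] * count for issue, count in issue_counts.items())


# Invented counts for one hypothetical data set.
score = quality_penalty(
    {"critical_missing_data": 1, "calculation_error": 2, "empty_cell": 4}
)
```

A score of this sort could support a transparent inclusion threshold, although any such threshold would need to be agreed by the reporting programs.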
Adopting a health economics approach and incorporating this in future analysis of orofacial clefts and other congenital anomalies should be considered. Similar analysis that includes the suggestions above could be applied to other data sets held by ICBDSR, including cleft palate with and without Robin sequence and cleft lip without cleft palate.

| CONCLUSION
Total CLP prevalence of 6.4 per 10,000 births as reported in this study is regarded as slightly lower than previous global estimates. There was variation across included programs for prevalence, pregnancy outcomes, and survival. The survival of LB with CLP was greatest among cases with isolated CLP (97.7%) and worst among CLP cases associated with genetic or chromosomal syndromes (40.9%). Data quality and heterogeneity among data sets have both been highlighted in this study, and efforts to improve data quality related to future CLP epidemiology studies deserve consideration.

ACKNOWLEDGMENTS
With many thanks to Simonetta Zezza, ICBDSR General Manager, for her help and administrative support when preparing this manuscript.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

[Figure 2: Prevalence by year of undifferentiated cleft lip with cleft palate.]

[Figure 3: Quality indicators used by authors to assess data quality of data sets and inform inclusion criteria.]