Is There a Crisis in Clergy Health?: Reorienting Research Using a National Sample

Are religious leaders unusually unhealthy? This question has long occupied scholars interested in the study of religious institutions, and a significant body of research has examined the causes, correlates, and effects of poor health among clergy. In this study, we aimed to: (1) outline the development of, and bias inherent to, the scholarly understanding of clergy health over the past 50 years; (2) test, using a recently collected nationally representative sample of clergy, the standing assumption that clergy are an especially unhealthy vocational group, specifically in terms of depression, obesity, and self-rated health; and (3) identify the major correlates of health among clergy using these data. Contrary to the recent tenor of scholarly research on this subject, our research revealed that clergy are not a particularly unhealthy group. We suggest potential pathways forward to ameliorate the bias inherent in the research into clergy well-being.


Introduction
The physical and mental health of religious leaders has long interested scholars of both religion and occupational health. Over the past 50 years, the narrative on the state of clergy health has shifted considerably. In the 1960s and 1970s, research pointed to clergy having more favorable health outcomes (King and Bailar 1969), while the contemporary consensus is that the rates of poor health among clergy have reached the level of "crisis" (Hough et al. 2019; Proeschold-Bell and Byassee 2018). In this study, we critically assess whether the evidence supports this change in perspective. We present results from a recent, nationally representative sample of clergy in the United States to assess the validity of the current narrative about the health of clergy and offer guidance for future research and intervention priorities of the field.
We find scant evidence that the health of clergy in the United States, as indicated by obesity levels, symptoms of depression, and self-rated health, differs significantly from the health of other Americans with similar demographic characteristics. While we identify specific clerical subgroups whose indicators point to worse health than other groups (most notably in the mental health of White Mainline Protestant clergy), the evidence does not support the claim that clergy, as an occupational group, are at elevated risk for poor health as compared to the general population. We argue that the crisis narrative emerged because past studies have focused on either regional or denominational subgroups of clergy, leading to a biased picture of the state of clergy health. While not without limitations, this work highlights the inherent dangers in extrapolating from nonrepresentative samples to an entire occupational group, presses for more care by researchers in the field when making generalized statements, and advocates for research on more diverse groups of clergy.

Background
In their 1969 article reviewing past literature on clergy mortality, Haitung King and John Bailar find that "the mortality experience of clergymen has been consistently more favorable than that of the general male population" (p. 27). Between 1968 and 1980, King and colleagues published a variety of studies of clergy physical health and mortality, which included both comprehensive literature reviews and statistical analyses using large samples from multiple theological traditions (King 1971; King and Bailar 1968, 1969; King and Locke 1980; King, Zafros, and Hass 1975; Locke and King 1980). King and coauthors found that on several physical health outcomes, including life expectancy, cancer, and cardiovascular disease, clergy fared decidedly better than the U.S. population. They argued that this health advantage was driven by clergy's relatively high socioeconomic status, especially their high levels of educational attainment. By 2006, the story began to change. Based on a nationally representative sample of clergy collected in 2001, Jackson Carroll shifted the narrative to one where "[clergy health fares] not very well, though not much worse than the US population as a whole" (p. 124).
Since 2006, the narrative has further shifted to one where scholars broadly accept that clergy face significant threats to their physical and mental health. It has become commonplace to assert that clergy health has reached "crisis levels" (Hough et al. 2019), with some arguing that "there is a true crisis in clergy physical health" (Proeschold-Bell and Byassee 2018) and "significant levels of psychological suffering" among clergy (Reynolds 2015). One scholar goes so far as to claim that, because of the levels of burnout and physical and emotional exhaustion experienced by pastors, "pastoral work is not only tough; it also may be dangerous" (Bloom 2019). This shift over the last 50 years, from clergy as exceptionally healthy to exceptionally unhealthy, raises important empirical, methodological, and theoretical questions.

Review of the Literature
To trace the development of the literature on clergy health, we conducted a comprehensive literature review of the topic of clergy health in the United States. We had four main criteria for inclusion into our literature review. First, the work had to be published after the analyses done by King and colleagues between 1968 and 1980. Second, in order for us to situate the results within the health trends of the larger population in the United States, the work had to focus on clergy
Another potential answer lies in the shift in theoretical perspective. In this more recent body of research, authors follow the theoretical perspective of a larger body of scholarship that demonstrates how social characteristics, including gender, race, educational attainment, and occupation, are fundamental drivers of health inequities (Phelan, Link, and Tehranifar 2010). The major pathway through which these social factors are thought to influence the health of clergy is chronic stress exposure in their occupational role and the body's corresponding response to that stress (Adler and Rehkopf 2008). The clergy profession is typically portrayed as inherently difficult and stressful, involving long hours, role overload, and lack of social support (Bloom 2019; Carroll 2006; Knox et al. 2005; Lee 1999; Miner 2007; Rowatt 2001; Virginia 1998). These factors, coupled with the aging of the clergy population, the loss of social prestige, and ongoing and high-profile cases of sexual abuse in several religious groups (Chaves 2017), are theorized to predispose clergy to develop poor mental and physical health. In addition to chronic occupational stress, clergy spend considerable time socializing with people in homes and at church events and meetings, and thus are often around food they feel obligated to consume. They may also eat out frequently, which is associated with weight gain (Lachat et al. 2012). The U.S. food system has been demonstrated to be a fundamental driver of poor health through large portion size and the preponderance of cheap, calorie-dense foods (Shannon et al. 2015).
However, it is important to note that there are also social factors that are associated with both being a religious leader and having better health, such as socioeconomic status and religiosity. These social factors are thought to provide a stress-buffering effect (Cohen and Wills 1985; Ellison et al. 2001). This theoretical perspective was put forth by King and coauthors in the original body of research on clergy health in the 1960s and 1970s; it has largely been abandoned by clergy health researchers in the past 20 years. This change in theoretical perspective may stem from bias in the samples used in recent literature on clergy health, which may have led researchers to overemphasize the potential role of chronic occupational stress among clergy and to underemphasize the potential protective social factors associated with being a clergyperson.

Bias in the Literature
A closer look at the 40 works on clergy health published after 2006 shows that Mainline Protestant clergy make up most of the research participants. Of these 40 works, 22 (55 percent) are based solely on samples of Mainline clergy. This is compared to four (10 percent) based on Catholic clergy, four (10 percent) based on Evangelical Protestant clergy, three (8 percent) based on Black Protestant clergy, and one (3 percent) based on Eastern Orthodox clergy. For comparison, according to the recently collected National Survey of Religious Leaders (NSRL), 21 percent of lead clergy are Mainline Protestant, 6 percent are Catholic, 43 percent are Evangelical Protestant, 22 percent are Black Protestant, and 9 percent are from non-Christian religions (Chaves, Roso, and Holleman 2022). While six works (15 percent) use samples of clergy from multiple religious traditions, only three of these works include any clergy from non-Christian religious traditions (Ferguson et al. 2015; Webb, Bopp, and Fallon 2013; Wells 2013). This raises important questions about bias, as contemporary research can draw conclusions about the health of mostly White Mainline Protestant clergy, but not about the health of all clergy groups.
In addition, research into clergy health after 2006 rarely draws from nationally representative data and relies on both denominationally and regionally specific samples. Of the 40 works published after 2006, 26 (65 percent) are based on data from a single denomination in a single state, six (15 percent) on data from a single denomination from a national sample, and four (10 percent) from multiple denominations in a single state. Only four (10 percent) are based on data from multiple denominations from a national sample, with only two (Ferguson et al. 2015; Wells 2013) being a national random sample from multiple denominations (though Wells [2013] uses the same data from the 2001 Pulpit and Pew Study used by Carroll [2006]).
Looking at this research on an outcome-by-outcome basis, bias is even more of a concern. Of the 27 works written after 2006 that include at least one mental health measure, two-thirds (67 percent) are based solely on samples of Mainline Protestant clergy. Although there is more diversity among the 24 works that include a measure of physical health, still almost half (42 percent) come from samples of Mainline clergy, who represent a minority of clergy in the United States. This means that even when past studies have adjusted comparison populations to match the demographic characteristics of the sample, their results only allow inference about the specific clergy subgroup studied. Clergy are a diverse vocational category, with variability by theology, gender and racial characteristics, organizational structure, and requirements for leadership, among other factors. Despite some researchers proclaiming a crisis of health among clergy in recent years, we know very little about the health of Black Protestant clergy or Conservative Protestant clergy, and almost nothing about the health of non-Christian clergy.

The Current Study
In this study, we analyzed data from a recent, nationally representative sample of religious leaders from all faith backgrounds to present a more holistic picture of the state of clergy health in the United States. To put clergy in perspective, we compared this sample of religious leaders to a sample of the U.S. population, which we adjusted to be demographically similar to clergy. We also explored the correlates of good and poor health among this sample of clergy to look for important differences among clergy in terms of health outcomes.

Data
We used data from the National Survey of Religious Leaders (Chaves, Roso, and Holleman 2022), a nationally representative study of the leaders of religious congregations collected in 2019-20. The National Survey of Religious Leaders (NSRL) was collected in conjunction with the 2018 General Social Survey (GSS) and 2018-19 National Congregations Study (NCS) (Chaves et al. 2020b; Smith et al. 2019). The 2018 GSS asked respondents who said they attended religious services at least once a year where they attended. The NCS then contacted those congregations and interviewed a key informant about the people, programs, and characteristics of the congregation. The NSRL sampling frame was made up of the religious leaders of the congregations in the NCS sample. NSRL data collection was conducted primarily through self-administered questionnaires online. The response rate for primary leaders (solo or senior leaders of the religious congregation) in the NSRL was 70 percent, with 890 total primary leaders in the final sample. In our analysis, we focused only on these primary leaders, excluding primary leaders of Catholic congregations who were not priests. Our analytical sample of primary leaders from the NSRL had 884 congregational leaders.
We also used data from the National Health and Nutrition Examination Survey (NHANES) and the National Health Interview Survey (NHIS) to compare clergy to the general U.S. population (CDC 2019; Chen et al. 2020). The NHANES and NHIS are nationally representative data sets, collected by the National Center for Health Statistics. We used data from the in-person interview portion of the NHANES, which was collected in 2017 and 2018. In 2017-18, the NHANES collected data from 9,245 respondents between the ages of 0 and 80 years and had a response rate of 51.9 percent. Similarly, the NHIS was made up of in-person interviews, collected in 2019. The 2019 NHIS collected data on 31,997 respondents over the age of 18 and had a response rate of 61.1 percent. NHANES respondents know their heights and weights will be physically verified, which may reduce bias in reporting; obesity prevalence based on self-reported data is higher in the NHANES than in the NHIS (Flegal et al. 2019; Stommel and Schoenborn 2009). The NHANES and NHIS data sets are weighted to be representative of the U.S. population. We restricted our analysis to respondents participating in the labor force.

Measures
We focused on the three health measures in the NSRL that have corresponding measures in nationally representative studies of health and well-being. As an indicator of mental health, we focused on depressive symptoms, operationalized by the Patient Health Questionnaire (PHQ-2) (Kroenke, Spitzer, and Williams 2003; Levis et al. 2020). This measure asked respondents the frequency with which they had been bothered over the past 2 weeks by (1) having little interest or pleasure in doing things, and (2) feeling down, depressed, or hopeless. Response options were "not at all," "several days," "more than half the days," and "nearly every day." Scores ranged from 0 to 6, with higher scores reflecting more depressive symptoms. PHQ-2 scores greater than or equal to 3 identified respondents as likely meeting criteria for major depression (Kroenke, Spitzer, and Williams 2003; Levis et al. 2020). We operationalized "depressive symptoms" as the numerical PHQ-2 score, and "depressed status" as a dichotomized variable indicating whether the PHQ-2 score was greater than or equal to 3.
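The PHQ-2 scoring described above can be sketched in a few lines. This is a minimal illustration, not the authors' code (their models were run in R): it assumes each response option is scored 0 to 3 in the order listed, which is consistent with the stated 0 to 6 range, and the function and variable names are hypothetical.

```python
# Assumed item scoring: response options in the order given in the text, 0-3.
PHQ2_RESPONSE_SCORES = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}

# Threshold stated in the text: a PHQ-2 score of 3 or more.
DEPRESSED_CUTOFF = 3


def phq2_score(little_interest: str, feeling_down: str) -> int:
    """Sum the two item scores; the result ranges from 0 to 6."""
    return PHQ2_RESPONSE_SCORES[little_interest] + PHQ2_RESPONSE_SCORES[feeling_down]


def depressed_status(score: int) -> bool:
    """Dichotomized indicator: True when the PHQ-2 score is at least 3."""
    return score >= DEPRESSED_CUTOFF
```

Under these assumptions, a respondent answering "several days" to one item and "more than half the days" to the other would score 3 and be flagged as having depressed status.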
As a measure of physical health, we used body mass index (BMI). Respondents were asked to self-report their height in inches and their weight in pounds. BMI was calculated with the formula: BMI = weight (kg) / [height (m)]^2, where weight is a person's weight in kilograms and height is their height in meters (Flegal et al. 2019). Using the NSRL and NHANES data, we analyzed BMI as a continuous variable. We did not analyze BMI for the NHIS. In the NHIS, heights and weights of people with unusually high or low values were suppressed to protect respondents' confidentiality, making the mean values not representative of the population. We also used the indicator for respondents who qualified for obesity with a BMI greater than or equal to 30 (CDC 2021), provided in the public use data sets (the categorical obesity variable is calculated using the full range of data in the NHIS). Research has documented the relationship between obesity and an increased risk of developing several health conditions, including cardiovascular disease, mortality, diabetes, arthritis, hypertension, angina, and asthma, among others (Nystad andMeyer 2004 et al. 2004; Reynolds and McIlvane 2009; Taylor et al. 2010).
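Because height and weight were self-reported in inches and pounds while the BMI formula uses metric units, a unit conversion is implied. The sketch below shows one way this could be done; it is an illustration rather than the authors' procedure, and the function names are hypothetical. The conversion factors are the standard exact definitions of the pound and the inch.

```python
LB_TO_KG = 0.45359237  # kilograms per pound (exact by definition)
IN_TO_M = 0.0254       # meters per inch (exact by definition)
OBESITY_CUTOFF = 30.0  # BMI threshold for obesity stated in the text


def bmi_from_us_units(weight_lb: float, height_in: float) -> float:
    """Convert pounds and inches to metric, then apply BMI = kg / m^2."""
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    weight_kg = weight_lb * LB_TO_KG
    height_m = height_in * IN_TO_M
    return weight_kg / height_m ** 2


def is_obese(bmi: float) -> bool:
    """Dichotomized indicator: True when BMI is at least 30."""
    return bmi >= OBESITY_CUTOFF
```

For instance, a hypothetical respondent reporting 200 pounds and 70 inches would have a BMI of roughly 28.7, just below the obesity cutoff.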
The third health measure analyzed was self-rated health (Idler and Benyamini 1997). Respondents were asked "in general, would you say your health is…," with response options being "excellent," "very good," "good," "fair," and "poor." Scores ranged from 0 to 4, with higher scores indicating worse health. We included self-rated health numerically in analyses, as well as a dichotomized variable indicating respondents who had reported their health being "poor" or "fair." Research has documented the relationship between this measure of self-rated health and mortality, hypertension, diabetes, and cholesterol, among others (Idler and Benyamini 1997; Wu et al. 2013).
Our predictor of interest was the religious tradition of the leader, a categorical variable corresponding to the religious tradition of the congregation the leader served, indicating Roman Catholic, evangelical Protestant, Black Protestant, Mainline Protestant, or Non-Christian.
We employed several controls in our analyses. For individual demographics, we included respondents' gender, race, nativity, and education. For respondent gender, we included a dichotomized variable that indicated whether the respondent reported their gender was female. For respondent race, we included a dichotomized variable that indicated whether the respondent reported being White and non-Hispanic. For respondent nativity, we included a dichotomized variable that indicated whether the respondent reported being born in the United States or Canada. For respondent education, we included a dichotomized measure indicating if respondents reported attaining any graduate degree, whether a Master of Divinity or other graduate degree.
In addition, because the NSRL respondents can be linked with their congregations in the NCS, we also included six congregational measures (Chaves et al. 2020a). First, we included three measures of the geographic location of the congregation: a categorical measure indicating if the congregation was located in the Northeast, Midwest, Southeast, or West/Pacific Census region of the United States; a dichotomized measure indicating the congregation was in a census tract in which at least 30 percent of residents were below the poverty level; and a dichotomized measure indicating whether the congregation was in a rural census tract, meaning fewer than 2,500 people lived in the census tract. We also included congregation size, operationalized by a continuous measure of the number of regularly attending adults of the leader's congregation (logged in the regression models); a measure of congregational growth or decline, operationalized by the percent change in the number of regularly attending adults over the past 2 years; and a measure of parishioner involvement in the congregation, operationalized by the percent of parishioners who served in a leadership role in the congregation in the last year.
Finally, we included two measures of occupational conditions. First, we included the total number of hours the respondent worked in a typical week, including work activities related to jobs the respondent held other than their congregational work (logged in the regression analysis). Second, we included a measure of support felt from congregants. Respondents were asked "To what extent do you feel truly cared for by people in your congregation?," with response options being "very much," "quite a bit," "a moderate amount," "a little bit," or "not at all." We created a dichotomized measure indicating if respondents indicated they felt cared for by their congregants "very much" or "quite a bit."

Statistical Analysis
To understand the reality of clergy health, we focused on bivariate relationships alongside estimating multiple regression models. For all bivariate analyses, we weighted the NSRL using the weight "WT_NSRL_PRIMARY_DUP." For multivariate analyses, we used a series of tests to determine if the weights should be included in the analyses (Bollen et al. 2016; Winship and Radbill 1994) by regressing the outcome variable on religious tradition. When justified, unweighted models are preferred because standard errors on the regression coefficients are smaller. The DuMouchel and Duncan tests on the regression models (DuMouchel and Duncan 1983) indicated that across outcome variables, the weights were necessary; however, the Pfeffermann and Sverchkov (Pfeffermann 1996) test indicated weights were not necessary. Upon further examination of the models, the interaction term between the indicator variable for Roman Catholic respondents and the survey weight was statistically significant, indicating that the survey weights were correlated with the outcome variable only in the case of Roman Catholic leaders. In addition, we ran regression models with and without the survey weights and compared the results. Only in the case of the coefficient on Roman Catholic did the weighted model produce a different result than the unweighted model, adding credence to the conclusion that the weights were necessary only for Roman Catholic respondents. In all final regression models, we therefore used weighted regression but set the survey weight to 1 for all non-Roman Catholic cases. We used R to run all the statistical models.
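The final weighting rule (keep the survey weight for Roman Catholic leaders, use a weight of 1 for everyone else) can be sketched as follows. This is an illustration only: the authors worked in R, the function name and the tradition labels are hypothetical, and the example weights are made up.

```python
def hybrid_weights(traditions, survey_weights, weighted_group="Roman Catholic"):
    """Return regression weights: the survey weight for the weighted group, 1.0 otherwise.

    `traditions` and `survey_weights` are parallel sequences, one entry per respondent.
    """
    if len(traditions) != len(survey_weights):
        raise ValueError("traditions and survey_weights must have the same length")
    return [
        float(weight) if tradition == weighted_group else 1.0
        for tradition, weight in zip(traditions, survey_weights)
    ]


# Hypothetical example values, not taken from the NSRL.
example = hybrid_weights(
    ["Roman Catholic", "Mainline Protestant", "Black Protestant"],
    [2.5, 0.4, 1.7],
)
```

The resulting vector would then be passed as the weight argument to whatever weighted regression routine is in use.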
When comparing the NSRL results to the general population, we removed NHANES and NHIS respondents under 20 years of age (the age of the youngest respondent in the NSRL) and respondents who were unemployed and not actively looking for work. We then used raking, or sample balancing (Battaglia et al. 2009; Lumley 2004), to adjust the survey weights in the NHANES and NHIS by age, gender, and race to match the distributions of these variables in the NSRL. Raking was done via the survey package in R (Lumley 2021).
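To make the raking step concrete, the sketch below implements basic iterative proportional fitting in plain Python. It is not the survey package's implementation and not the authors' code; the function name, data layout, and toy categories are assumptions for illustration. Each pass rescales the weights so that the weighted share of every category matches its target share, cycling through the variables until the adjustments become negligible. Every category present in the records must have a target.

```python
def rake(records, base_weights, targets, max_iter=100, tol=1e-8):
    """Adjust weights so weighted category shares match `targets`.

    records: list of dicts mapping variable name -> category
    base_weights: starting weight for each record
    targets: {variable: {category: target proportion}}, proportions summing to 1
    """
    weights = [float(w) for w in base_weights]
    for _ in range(max_iter):
        max_change = 0.0
        for var, target in targets.items():
            total = sum(weights)
            # Current weighted total for each category of this variable.
            current = {}
            for rec, w in zip(records, weights):
                current[rec[var]] = current.get(rec[var], 0.0) + w
            # Rescale each record so its category reaches the target share.
            for i, rec in enumerate(records):
                cat = rec[var]
                factor = target[cat] * total / current[cat]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:
            break
    return weights


# Toy example with made-up categories and target shares.
toy_records = [
    {"gender": "F", "age": "<50"},
    {"gender": "F", "age": "50+"},
    {"gender": "M", "age": "<50"},
    {"gender": "M", "age": "50+"},
]
toy_targets = {
    "gender": {"F": 0.2, "M": 0.8},
    "age": {"<50": 0.3, "50+": 0.7},
}
toy_weights = rake(toy_records, [1, 1, 1, 1], toy_targets)
```

In the study's setting, the records would be NHANES or NHIS respondents, the base weights their survey weights, and the targets the age, gender, and race distributions observed in the NSRL.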

Results
Respondents averaged a PHQ-2 score of 0.46, indicating that, on average, clergy experienced only one of the depressive symptoms less than several days over the past 2 weeks. Furthermore, 4.1 percent of our sample met the diagnostic criteria to be classified as likely depressed based on their response to the PHQ-2. Respondents' average BMI was 29.7, with close to half (42.3 percent) of our sample qualifying as obese. The average self-rated health of the sample was 1.24, meaning the average clergyperson in our sample rated their health somewhere between "very good" and "good." Only 5.3 percent reported "fair" or "poor" health.

Bivariate Results
Physical and mental health metrics differed significantly by religious tradition, as did other demographic factors correlated with health. Table 3 shows this variability, specifically assessing bivariate differences among religious traditions for each health measure and demographic factor. We stratified our analyses by religious tradition. In regression analyses, Mainline Protestant clergy served as the reference category, as research on Mainline Protestant clergy samples makes up the largest proportion of research on clergy health.
First, at the bivariate level, there were differences based on religious tradition for every health measure. Mainline Protestant clergy reported a PHQ-2 value of 0.69, significantly larger than the 0.22 among Catholic clergy and 0.25 among Black Protestant clergy (p < .001, p < .01, respectively). Non-Christian clergy had slightly higher PHQ-2 scores than Mainline Protestant clergy, and evangelical Protestant clergy had slightly lower scores, although these differences were not statistically significant (p = .96; p = .45, respectively). These basic patterns remained when looking at depressed status. While 4.1 percent of all head clergy qualified as having elevated depressive symptoms, 8.3 percent of Mainline Protestant clergy qualified. This was significantly higher than the elevated depressive symptom rate of 0.6 percent among Catholic clergy (p < .001). Evangelical Protestant, non-Christian, and Black Protestant clergy demonstrated slightly lower rates of elevated depressive symptoms than Mainline Protestant clergy (4.0, 1.9, and 2.5 percent, respectively), although these differences were not statistically significant (p = .40, p = .80, p = .50, respectively). Turning to BMI, Catholic clergy and non-Christian clergy reported significantly lower BMI values than Mainline Protestants. Average BMI scores were, respectively, 27.5 and 25.1 (p < .05 for both). Black Protestant clergy reported an average BMI of 31.0, a value which is significantly higher than that of Mainline clergy (p < .05). Evangelical Protestant clergy demonstrated very similar BMI averages to Mainline clergy (p = .499). These basic patterns are replicated for obesity prevalence. A total of 43.6 percent of Mainline clergy were classified as obese, which is on par with the 42.3 percent of all head clergy. Only 20.6 percent of Catholic clergy were obese, which is significantly lower than the rate among Mainline Protestant clergy (p < .05). By contrast, 51.7 percent of Black Protestant clergy were obese, a value which is significantly higher than that of Mainline Protestant clergy (p < .05). Non-Christian clergy demonstrated lower levels of obesity than Mainline Protestant clergy at 10.8 percent, and evangelical Protestant clergy demonstrated slightly higher levels of obesity at 45.2 percent, but these differences were not statistically significant (p = .12, p = .44, respectively).
Across religious traditions, there was similar variability in terms of self-rated health. Mainline Protestant clergy reported an average self-rated health of 1.2. There were no significant differences among Christian clergy, with Catholic, evangelical Protestant, and Black Protestant clergy reporting averages of 1.69, 1.23, and 1.30, respectively (p = .173, p = .133, p = .235, respectively). By contrast, non-Christian clergy had significantly better self-rated health than Mainline Protestant clergy, with an average score of 0.9 (p < .05). However, when isolating clergy who reported especially bad self-rated health by answering "fair" or "poor," these patterns shifted slightly. While 7.6 percent of Mainline Protestant clergy reported fair or poor health, slightly fewer evangelical Protestant clergy did so, at 2.2 percent (p < .05). Notably, the largest difference was Catholic clergy, 33.4 percent of whom reported fair or poor health. This was equivalent to over five times the average for all clergy and significantly higher than Mainline (p < .05). Among evangelical Protestant and non-Christian clergy, 3.4 and 1.3 percent, respectively, reported fair or poor health, rates that did not differ significantly from that of Mainline Protestant clergy (p = .31, p = .30, respectively).
In addition, we found that there was significant variability by religious tradition pertaining to demographic factors that are known to be correlated with physical and mental health. Mainline Protestant clergy had a significantly higher representation of women than Catholic, evangelical Protestant, and Black Protestant clergy (p < .001, p < .001, p < .001, respectively); higher representation of White clergy than Catholic, evangelical Protestant, and Black Protestant clergy (p < .001, p < .1, p < .001, respectively); higher representation of U.S. nativity than Catholic and non-Christian clergy (p < .001, p < .1, respectively); and higher representation of graduate education than Catholic, evangelical Protestant, Black Protestant, and non-Christian clergy (p < .1, p < .001, p < .001, p < .01, respectively).
We also found that the characteristics of congregations that may be related to clergy health differed by religious tradition. Mainline Protestant clergy were significantly less likely to serve congregations in the Southeastern United States, compared to evangelical and Black Protestant clergy (p < .001 for both); significantly more likely to serve congregations in the Northeastern United States than evangelical and Black Protestant clergy (p < .001 and p < .01, respectively); significantly more likely to serve in the Midwestern United States than Catholic and Black Protestant clergy (p < .1 and p < .01, respectively); significantly more likely to serve in the Western United States than Black Protestant clergy (p < .05); and significantly less likely to serve in the Western United States than non-Christian clergy (p < .01). Mainline Protestant clergy were also significantly more likely to serve congregations located in census tracts with high rates of poverty compared to non-Christian clergy (p < .05); and significantly less likely than Black Protestant clergy to serve in a poor census tract (p < .001). Mainline clergy served in slightly smaller congregations than evangelical clergy (p < .05), and Mainline clergy served in congregations whose laity are significantly more likely to hold volunteer leadership positions than non-Christian clergy (p < .01).
Finally, the occupational conditions in which religious leaders work varied by religious tradition. Mainline clergy worked significantly more hours per week compared to Black Protestant clergy (p < .01). In addition, significantly fewer Mainline clergy reported feeling very cared for by their congregation as compared to Catholic, evangelical, and Black Protestant clergy (p < .1 for all three).

Multivariate Results
To understand the ways demographic differences between religious traditions influenced differences in health, we conducted a series of multivariate analyses, which are reported in Table 4. We found that, even when controlling for clergy demographics, congregational demographics, and occupational conditions, Catholic clergy demonstrated lower depressive symptom scores, lower likelihood of depressed status, worse self-rated health, and higher rates of reporting fair or poor health than Mainline Protestant clergy (p < .1, p < .01, p < .1, p < .01, respectively). When controlling for relevant demographic characteristics, evangelical Protestant clergy demonstrated better self-rated health and lower rates of reporting fair or poor health than Mainline clergy (p < .1 for both), Black Protestant clergy demonstrated a lower likelihood of reporting fair or poor health than Mainline Protestant clergy (p < .05), and non-Christian clergy demonstrated better self-rated health than Mainline clergy, even when including controls (p < .05).
We also found that, when accounting for religious tradition, individual demographic variables, congregational demographics, and occupational factors in the multivariate analyses, religious tradition was one of the two most consistent significant factors correlated with health outcomes. We did find some demographic patterns in the multivariate analyses: White clergy demonstrated better self-rated health and a lower likelihood of reporting poor or fair self-rated health than non-White clergy (p < .05 and p < .1, respectively), and clergy born in the United States reported higher BMIs and a greater likelihood of obesity than clergy born outside of the United States (p < .05, p < .1, respectively). Clergy serving congregations in the Northeastern United States reported lower BMIs, better self-rated health, and a lower likelihood of poor or fair self-rated health compared to clergy in the Southeastern United States (p < .05, p < .1, and p < .1, respectively); and clergy serving in the Western and Pacific United States had a lower likelihood of reporting poor or fair self-rated health compared to clergy in the Southeastern United States (p < .05).
Clergy serving congregations located in rural census tracts were more likely to be obese and were less likely to report poor or fair self-rated health (p < .05, p < .1, respectively); and clergy serving congregations located in areas with high levels of poverty reported a lower likelihood of reporting fair or poor health (p < .05). Clergy leading larger congregations demonstrated lower BMI levels and better self-rated health (p < .05, p < .001, respectively). Clergy leading congregations that had demonstrated higher levels of growth in attendance over the past 2 years had significantly lower depression scores (p < .01); and clergy who led congregations that had higher percentages of laity participate in congregational leadership demonstrated higher rates of obesity and higher likelihood of reporting poor or fair self-rated health (p < .1 for both).
Along with religious tradition, occupational conditions were some of the most consistent predictors of variations in clergy health. Clergy who reported feeling cared for by their congregation reported lower depressive symptoms, a lower likelihood of being depressed, lower BMI, better

Discussion and Conclusions
In this analysis, we found that, as an occupational group, clergy in the United States exhibit similar levels of elevated depressive symptoms, obesity, and fair/poor self-rated health when compared to a sample of the U.S. population weighted to look similar on age, race, and gender characteristics. Looked at broadly, these results do not reflect an occupational group facing a health crisis. On the contrary, we echo Carroll's findings from 20 years ago that clergy health is not significantly better or worse than the overall health of the general U.S. population. That said, our results do suggest that clergy may have somewhat higher obesity rates and somewhat lower rates of fair/poor health than a matched population sample. Given that the percent qualifying for obesity in clergy is 4.3 points higher than in NHANES and that the percent reporting fair/poor health is half of that in the general population, there is a possibility that clergy are different from the population on these two measures. Previous research has shown a tendency for clergy to underreport poor self-rated health, which might account for the discrepancy on self-rated health (Proeschold-Bell and LeGrand 2012). A systematic meta-analysis of existing studies would be helpful to assess the evidence of elevated obesity in clergy.
In addition, in both the bivariate and multivariate analyses, we found major variations in health outcomes by religious tradition. We showed that Mainline Protestant clergy differ from other religious traditions in both their physical and mental health, even when controlling for relevant individual- and congregational-level demographic information. Mainline Protestant clergy have higher mean scores and rates of elevated depressive symptoms than Roman Catholic clergy; worse self-rated health than evangelical Protestant and non-Christian clergy; better self-rated health than Catholic clergy; higher rates of reporting poor or fair self-rated health than evangelical and Black Protestant clergy; and lower rates of reporting poor or fair self-rated health than Catholic clergy. We know Mainline Protestant clergy have constituted the bulk of research participants in studies exploring clergy health. Because our results show they have a different health profile than other subgroups of this population, recent literature that has sought to characterize "clergy health" as one cohesive phenomenon is likely presenting misleading conclusions.
Beyond Mainline Protestant distinctiveness, we found that clergy from each major religious tradition demonstrated unique health patterns. These present avenues for further research. For example, Catholic clergy report the lowest levels of depressive symptoms. This is surprising given that past national studies reported rates of elevated depressive symptoms among Catholic priests of 18 percent (Knox et al. 2005), 20 percent (Knox, Virginia, and Lombardo 2002), and 72 percent (Virginia 1998, with the rate in religious/monastic priests being 40 percent). These studies used different measures of depressive symptoms, but these highly discrepant results call for more comprehensive studies of mental health symptoms in Catholic clergy. We also found that Catholic clergy reported significantly worse self-rated health than their Protestant counterparts. Other research on United Methodist clergy has shown clergy are overly optimistic about their self-rated health (Proeschold-Bell and LeGrand 2012), a pattern which is contradicted among the Catholic priests in our sample.
Among Protestants, Black Protestant clergy were less likely to report fair or poor self-rated health in the multivariate analyses. And while Black Protestant clergy do have the highest obesity rates among clergy from any religious tradition, the differences were not statistically significant when accounting for other factors. Given that Black Americans have elevated rates of obesity compared to White Americans, we expect that, if there were a larger sample of Black Protestant clergy in this study, we would find elevated rates of obesity in this subgroup. And while Black Protestant clergy may have elevated rates of obesity, they are not less healthy across all measures than the rest of the predominantly White Protestant sample, a finding which contradicts past literature (Ferguson et al. 2015).
In addition, although we have noted that Mainline Protestant clergy differ from leaders in other religious traditions in important ways, additional research needs to be done to understand the source of these differences, especially in terms of their mental health. The rate of depression among Mainline Protestant clergy is more than twice as high as in the rest of the sample. This may be partially due to lower levels of mental health stigma in this group, as past research has found that White liberal pastors are more likely than clergy from other religious traditions to endorse medical and biological causes of mental illness rather than a lack of faith or other spiritual cause (Holleman and Chaves 2023; Payne 2009, though notably Holleman and Chaves [2023] found that Catholic clergy were as likely as Mainline Protestant clergy to endorse medical and biological causes of mental illness). However, the differences we find are large enough to indicate that a significant amount of psychological distress is occurring among Mainline clergy in a manner distinct from clergy in other religious traditions.
This research has several important limitations. First, the data we employed for this study include few health measures with comparable items in nationally representative studies of health and well-being. The three measures we used in this work (depression, self-reported obesity, and self-rated health) were the only three measures in the NSRL that allowed for this kind of comparison. While we believe we have demonstrated our argument with the health measures at hand, future nationally representative samples of clergy should include a greater number of health measures. Second, we acknowledge that the mode of survey administration can introduce systematic biases in reporting. The NSRL is self-administered, while the NHIS and NHANES are in-person interviews. It is not clear how self-administered surveys may affect reporting of body weight. Because of the anonymity inherent to self-administered surveys, we presume less underreporting of weight in the NSRL than in an in-person interview like the NHIS or NHANES, but we do not have systematic analysis to prove this.
Third, like past research on religious leaders in the United States, the NSRL contains very few non-Christian leaders. Although we found much lower BMI and better self-rated health among non-Christian clergy, we cannot speak to the reasons why. Aggregating clergy from traditions as diverse as Buddhism, Islam, and Judaism, among others, is an oversimplification of this category. Additional research should be done to understand clergy health among specific non-Christian religious traditions. Finally, the survey did not allow us to differentiate secular and religious Roman Catholic clergy. Previous research has shown large health differences between these two groups (Virginia 1998).
A major conclusion of our study is a word of caution: researchers who study specific subgroups of clergy or work with nonrepresentative samples of clergy should be more circumspect in generalizing their findings to the entire occupational group. Different subgroups are subject to different selection pressures (e.g., Roman Catholic priests are male and take vows of sexual abstinence), have different geographic distributions (e.g., a majority of Black Protestants live in the U.S. South), have different proportions of foreign-born clergy (e.g., a majority of Roman Catholic clergy are born outside of the United States), and have different occupational structures (e.g., United Methodist clergy are appointed annually to their positions by the denomination), which could drive differences in health. We do not mean to imply research should not be done on specific subgroups. On the contrary, we believe that more research is needed on a diverse array of subpopulations to understand the specific mechanisms driving health outcomes in these groups, especially those that have most often been overlooked in past research: Black Protestants, evangelical Protestants, and non-Christians. Determining the unique challenges to health that being a clergy person entails requires studying the full range of people in this profession. It is only by doing so that scholars can both establish the empirical reality of the state of clergy health and understand the theoretical mechanisms at play.

Table 1
when only nine works were published on clergy health, the period between 2007 and 2022 saw 40 works published on the subject. In addition, while research between 1981 and 2006 focused primarily on Catholic clergy, research on clergy health since then has included clergy from all religious traditions, but it has disproportionately focused on Mainline Protestant clergy.

Table 2: Comparison of head clergy to U.S. population
Note: The NHANES and NHIS data were adjusted by age, gender, and race to ensure distributions of these variables are the same as the NSRL sample. All percentages from the NSRL are weighted using the WT_NSRL_PRIMARY_DUP weights in the NSRL data set. Sample includes all head clergy, excluding primary leaders of Catholic congregations who are not priests. We did not report BMI for the NHIS. In the NHIS, heights and weights of people with unusually high or low values were suppressed to protect respondents' confidentiality, making the mean values not representative of the population. The categorical obesity variable is calculated using the full range of data. Source: National Survey of Religious Leaders 2019-20; National Health and Nutrition Examination Survey 2017-18; National Health Interview Survey 2019.
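The adjustment described in this note can be illustrated with a minimal post-stratification sketch. It is not the authors' procedure: the function names, the cell labels, and the toy numbers below are all hypothetical, and the only assumption carried over from the note is that reference respondents are reweighted so their joint distribution on age, gender, and race matches the clergy sample.

```python
from collections import Counter

def poststratify(reference_cells, target_shares):
    """Return one weight per reference respondent so that the weighted
    share of each demographic cell equals its share in the target sample.
    Both arguments are hypothetical inputs for illustration."""
    counts = Counter(reference_cells)
    n = len(reference_cells)
    weights = []
    for cell in reference_cells:
        reference_share = counts[cell] / n
        # Cells absent from the target sample receive weight 0.
        weights.append(target_shares.get(cell, 0.0) / reference_share)
    return weights

def weighted_rate(flags, weights):
    """Weighted proportion of respondents for whom the flag is True."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)

# Toy reference sample (made-up values): one respondent in cell A, three in cell B.
cell_a = ("45-64", "male", "White")
cell_b = ("18-44", "female", "Black")
reference_cells = [cell_a, cell_b, cell_b, cell_b]
obese = [True, True, False, False]

# Made-up target distribution standing in for the clergy sample.
target_shares = {cell_a: 0.75, cell_b: 0.25}

weights = poststratify(reference_cells, target_shares)
unadjusted = sum(obese) / len(obese)
adjusted = weighted_rate(obese, weights)
print(f"unadjusted: {unadjusted:.3f}, adjusted: {adjusted:.3f}")
```

In this toy case the adjusted rate rises because the cell that is common in the target sample is rare in the reference sample, so its one respondent is weighted up. A comparison such as the one in Table 2 is only meaningful after an adjustment of this kind, since clergy differ sharply from the general population on age and gender.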

Table 3: Summary statistics by religious tradition

Table 3
Statistical significance indicates that values are significantly different from Mainline. All percentages and means are weighted using the WT_NSRL_PRIMARY_DUP weights in the NSRL data set. Statistical significance is determined by logistic and linear regression equations, in which Catholic leaders are weighted using WT_NSRL_PRIMARY_DUP, and non-Catholic leaders are left unweighted. See main text for more information. Sample includes all head clergy, excluding primary leaders of Catholic congregations who are not priests. ˆp < .1, * p < .05, ** p < .01, *** p < .001.
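The mixed weighting rule in this note (Catholic leaders weighted, all others unweighted) can be sketched as the construction of a single analysis-weight vector. Only the variable name WT_NSRL_PRIMARY_DUP comes from the note; the `tradition` field, the "Catholic" label, the `depression` field, and the toy values are hypothetical, and a weighted mean stands in here for the regression equations the authors actually estimated.

```python
def analysis_weights(rows, weight_key="WT_NSRL_PRIMARY_DUP",
                     tradition_key="tradition"):
    """Use the survey weight for Catholic leaders and a weight of 1.0
    for everyone else. Field names other than the weight variable are
    illustrative assumptions."""
    return [
        row[weight_key] if row[tradition_key] == "Catholic" else 1.0
        for row in rows
    ]

def weighted_mean(values, weights):
    """Weighted mean of a numeric outcome."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Toy records with made-up values.
rows = [
    {"tradition": "Catholic", "WT_NSRL_PRIMARY_DUP": 2.0, "depression": 1.0},
    {"tradition": "Catholic", "WT_NSRL_PRIMARY_DUP": 0.5, "depression": 3.0},
    {"tradition": "Mainline", "WT_NSRL_PRIMARY_DUP": 4.0, "depression": 5.0},
]

w = analysis_weights(rows)
mean_depression = weighted_mean([r["depression"] for r in rows], w)
print(w, round(mean_depression, 3))
```

The same vector could be passed as case weights to a linear or logistic regression routine; the reasons for weighting only Catholic leaders are given in the main text rather than in this note.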

Table 4: Linear and logistic regression predicting health outcomes
Statistical significance is determined by logistic and linear regression equations, in which Catholic leaders are weighted using WT_NSRL_PRIMARY_DUP, and non-Catholic leaders are left unweighted. See main text for more information. Sample includes all head clergy, excluding primary leaders of Catholic congregations who are not priests.
-rated health, and a lower likelihood of reporting fair or poor health (p < .001, p < .001, p < .05, p < .001, p < .05, respectively).