Pre-pregnancy participation and performance in world's largest cross-country ski race as a proxy for physical exercise and fitness, and perinatal outcomes: Prospective registry-based cohort study

Objective: Investigate associations between pre-pregnancy participation and performance in a demanding cross-country ski race (proxy for exercise volume and fitness) and perinatal outcomes. Pre-registered protocol: osf.io/aywg2. Design: Prospective cohort study. Setting


| INTRODUCTION
Declining general exercise and cardiovascular fitness levels are of global concern given their strong association with health and longevity. [1][2][3][4][5][6][7][8] However, the phenomenon's full importance for public health remains to be elucidated, especially associations between exercise and perinatal outcomes. During pregnancy, moderate-intensity exercise in randomised controlled trials (RCTs) impacts positively on a range of maternal and fetal outcomes. [9][10][11][12] Still, exercise interventions initiated during pregnancy are necessarily short and their intensity typically does not exceed light jogging. [13] There are important perinatal outcomes where interventions have shown no benefit, [14] but these interventions may not capture the range of exercise habits in the population.
The time before pregnancy is a neglected period for potential interventions and policies aiming to improve perinatal and next-generation health. [15][16] Benefits are plausible, given that exercise habits before pregnancy strongly predict those during pregnancy, [17][18] and given the physiological effects that follow longer exposure to exercise. [19] Current literature on pre-pregnancy exercise and perinatal outcomes is sparse, methodologically limited and has mixed results for both benefits and potential harms. [13][20][21] Most studies are cross-sectional, use retrospectively recalled self-reported exercise, and do not pre-register analyses, all of which are potential sources of bias. Complementary evidence is needed to guide exercise recommendations and policy decisions to target the wide-ranging societal determinants of exercise. [4][22]
This prospective cohort study is based on the electronic registry of the world's largest cross-country ski race (Vasaloppet), held annually in Sweden, with recreational and elite participants. [23] Cross-country skiing is a demanding endurance sport. Vasaloppet participants engage in many types of exercise and high-performers report the largest exercise volumes. [24] The Vasaloppet registry has been leveraged to study other health outcomes, using Swedish national registries to adjust for socio-demographic factors and comorbidity. [24][25][26][27][28][29][30][31][32] Our aim was to investigate whether pre-pregnancy participation and performance in a Vasaloppet cross-country ski race, as proxies for higher exercise volumes and better fitness, are associated with important perinatal outcomes.

| METHODS
This cohort study constitutes the full overlap between the Vasaloppet registry and the population-based Pregnancy Register of births in Sweden, and is reported according to RECORD (Reporting of Studies Conducted Using Observational Routine Health Data). [33] Data were prospectively collected in registries and the study protocol was prospectively registered before data access (Open Science Framework, www.osf.io/aywg2; amendments to protocol in Table S1). This study had no patient and public involvement, but outcomes were selected based on a prior study with patient involvement (see below).

| Study population
The annual competitive race Vasaloppet (90 km) is preceded by the 'Winter Week' that hosts several ski races of 30-90 km. We included all women registered for participation in at least one ski race during Vasaloppet Winter Week in 1991-2017 (Vasaloppet, Open Trail, Half Vasa, Short Vasa, Women's Vasa, or Skate Vasa), who subsequently had a delivery recorded in the Pregnancy Register from inception to 31 December 2017 (skiers). Registration in the ski race was done using unique Swedish personal identification numbers (PINs). Twin and other multiple pregnancies were excluded, as were women without a Swedish PIN (e.g. non-Swedish citizens).
Deliveries were categorised by 5-year age bands, county of residence and calendar year. For each year between 2013 and 2017, a ten times larger group of deliveries (by women never registered for a ski race, non-skiers) was frequency-matched on the combined age-county categories. The Pregnancy Register holders were requested to perform data linkage (using unique Swedish PINs), matching and pseudonymisation.
No delivery was drawn as a match twice. Among several deliveries of the same mother, only one was included: for skiers, the first delivery following their first ski race, and for non-skiers a randomly selected delivery. Additional exclusion criteria applied for data (Tables S2 and S3).
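The frequency matching described above can be illustrated with a minimal sketch. It assumes each delivery carries a stratum key (age band, county, calendar year) and that non-skier deliveries are sampled without replacement at a 10:1 ratio within each stratum. The function name, data layout and seed are hypothetical and are not taken from the study's own code, which was run by the Pregnancy Register holders.

```python
import random
from collections import Counter, defaultdict


def frequency_match(skier_strata, non_skier_pool, ratio=10, seed=0):
    """Draw `ratio` non-skier deliveries per skier delivery within each stratum.

    skier_strata: one stratum key per skier delivery,
                  e.g. ("25-29", "Uppsala", 2015).
    non_skier_pool: (delivery_id, stratum) pairs for non-skier deliveries.
    Returns the matched delivery ids; no id is drawn twice.
    """
    rng = random.Random(seed)

    # Group candidate non-skier deliveries by stratum.
    by_stratum = defaultdict(list)
    for delivery_id, stratum in non_skier_pool:
        by_stratum[stratum].append(delivery_id)

    matched = []
    for stratum, n_skiers in Counter(skier_strata).items():
        candidates = by_stratum.get(stratum, [])
        # Cap at the pool size if a stratum has too few candidates (assumption).
        k = min(ratio * n_skiers, len(candidates))
        matched.extend(rng.sample(candidates, k))  # without replacement
    return matched
```

Because sampling is done without replacement and every delivery belongs to exactly one stratum, the sketch respects the rule that no delivery is drawn as a match twice. The one-delivery-per-mother rule would need a separate step.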

| Variables
The Pregnancy Register was used for outcome and covariable extraction. The inception year 2013 covers the Stockholm and Gotland regions and years 2014-2017 contain 17/21 Swedish regions (90% of national deliveries; 98-100% within included regions). [34]
We used two main exposure variables: ski race participation (non-skiers versus skiers with the latter as baseline, model 1) and ski race performance (continuous variable of relative finish times, skiers only, model 2). As weather and snow properties cause wide variation in absolute finishing times, performance was standardised as the percentage of the fastest female result for the same day and ski race (fastest time = 100%, e.g. if the fastest skier finished at 5 hours, 7.5 hours would be 150%). If skiers competed several times, we chose the fastest relative finish time ≤5 years before delivery, or most recent race, in that order. [35]
No core outcome set exists for the study question; however, we investigated 'critically important' and 'important' outcomes selected by a panel of obstetric, exercise, public health and methodological experts in collaboration with patients, [13] which were also available in the Pregnancy Register (diagnostic codes in Table S2). We added perinatal venous thromboembolism (VTE) and psychiatric morbidity, two major causes of maternal mortality and morbidity. [36][37] Gestational diabetes mellitus (GDM) screening was selective and risk-based in most maternal health care centres during the study period, using the oral glucose tolerance test or fasting glucose. [38] Recommended gestational weight gain (GWG) per BMI category was coded as 12.5-18 kg (<18.5 kg/m²), 11.5-16 kg (18.5-24.9 kg/m²), 7-11.5 kg (25-29.9 kg/m²) and 5-9 kg (≥30 kg/m²), assuming 0.5-2 kg weight gain in the first trimester. [39] Pre-eclampsia was defined as high blood pressure (>140/90 mmHg) with onset after 20 weeks of gestation together with proteinuria. Any pregnancy-induced hypertension included women with blood pressure >140/90 mmHg regardless of proteinuria. Participants with certain conditions were excluded for some outcomes (Table S2, e.g. excluding pre-pregnancy diabetes mellitus for GDM analyses). Covariable selection was based on literature review and availability in the Pregnancy Register (Table S3).
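As an illustration of the performance exposure described above, the following sketch computes the relative finish time and applies the selection rule for women with several races. The `Race` record and function names are hypothetical, and the reading of the selection rule (fastest relative time among races ≤5 years before delivery, otherwise the most recent race) is an interpretation of the text rather than the study's actual code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Race:
    years_before_delivery: float  # time from race to delivery
    finish_hours: float           # the skier's own finish time
    fastest_female_hours: float   # fastest female result, same day and race


def relative_finish_time(finish_hours: float, fastest_hours: float) -> float:
    """Finish time as a percentage of the fastest female result (fastest = 100)."""
    return 100.0 * finish_hours / fastest_hours


def select_exposure(races: List[Race], window_years: float = 5.0) -> float:
    """Fastest relative time within the window, else the most recent race."""
    in_window = [r for r in races if r.years_before_delivery <= window_years]
    if in_window:
        return min(
            relative_finish_time(r.finish_hours, r.fastest_female_hours)
            for r in in_window
        )
    latest = min(races, key=lambda r: r.years_before_delivery)
    return relative_finish_time(latest.finish_hours, latest.fastest_female_hours)
```

With the example from the text, `relative_finish_time(7.5, 5.0)` gives 150.0, so a higher value means a slower race relative to the day's fastest woman.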

| Statistical analyses
Bayesian logistic regressions as implemented in R package rstanarm [40][41] were used to calculate odds ratios (ORs) and 95% highest density intervals (HDIs), a range that contains the 95% most probable OR values. [42] Statistical inference criteria were based on how the 95% HDI placed itself relative to a prespecified 'region of practical equivalence' (Methods in Appendix S1). Our conclusions for each outcome considered statistical inference in both models 1 and 2 (i.e. only outcomes that met statistical inference criteria across both models were assumed to show an association). For choice of priors, see Methods in Appendix S1. Missing data originated predominantly from the Pregnancy Register's lower coverage during 2013 (approximately 20% missing outcome information except labour duration, 48%) and were imputed using multiple imputations (chained random forests, five datasets).
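To make the HDI and region-of-practical-equivalence logic concrete, here is a minimal sketch in Python (the study itself used R). The HDI is taken as the narrowest interval covering the requested share of posterior draws. The decision rule and its labels are assumptions in the style of common HDI-plus-ROPE practice; the exact criteria are in Appendix S1, and the default bounds 0.975-1.025 are the ones quoted in the Results for evidence of a null association.

```python
import math


def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the posterior draws."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one posterior draw")
    k = min(n, max(1, math.ceil(mass * n)))  # draws inside the interval
    # Slide a window of k consecutive sorted draws and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


def rope_decision(low, high, rope=(0.975, 1.025)):
    """Classify an OR interval against the ROPE (assumed decision rule)."""
    if high < rope[0] or low > rope[1]:
        return "association"             # HDI entirely outside the ROPE
    if low >= rope[0] and high <= rope[1]:
        return "practically equivalent"  # HDI entirely inside the ROPE
    return "undecided"                   # HDI overlaps the ROPE boundary
```

Under this rule, an interval of 1.1 to 1.7 would count as an association, while an interval that straddles a ROPE boundary would remain undecided.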
For labour duration, the only continuous outcome, we used Bayesian linear regression with log-transformed outcome after assessment of normality of residuals and homoscedasticity. Exponentiated regression coefficients are presented and can be interpreted as relative duration.
Ski race performance (model 2) has a continuous exposure variable. For interpretation, we present ORs per standard deviation increase in relative finish time (instead of per percentage unit). ORs are not directly comparable between models 1 and 2 (ORs for a binary exposure correspond approximately to a 2 standard deviation change in a continuous exposure). For all model 2 analyses, we tested the fit of nonlinear (spline) effects (Methods in Appendix S1), and for all outcomes, the linear model was either superior or noninferior to the spline model.
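The two rescalings described above (exponentiated coefficients for the log-transformed labour duration, and ORs per standard deviation rather than per percentage unit) follow from the models being linear on the log scale. A small sketch, with hypothetical function names and purely illustrative numbers:

```python
import math


def relative_duration(beta_log_scale: float) -> float:
    """Exponentiated coefficient from a linear model of log(duration).

    A value of 1.05 would mean a 5% longer duration in the exposed group.
    """
    return math.exp(beta_log_scale)


def or_per_sd(or_per_unit: float, sd_units: float) -> float:
    """Rescale an OR per one exposure unit to an OR per standard deviation.

    On the log-odds scale the effect is linear, so the log OR is multiplied
    by the standard deviation (in the same units as the exposure).
    """
    return math.exp(math.log(or_per_unit) * sd_units)
```

For instance, an illustrative OR of 1.01 per percentage unit with a standard deviation of 20 percentage units corresponds to `or_per_sd(1.01, 20.0)`, i.e. 1.01 raised to the power 20. These numbers are not study results.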
We prespecified covariables for adjustment: parity, age, maternal country of birth, educational level, cohabitation status, smoking, alcohol consumption, delivery location, year and pre-pregnancy comorbidities (Table S3). Early pregnancy body mass index (BMI), a potential mediator, was introduced in an exploratory second step of adjustments. We also explored whether potential associations with ski race participation (model 1) differed between primiparous and multiparous women.
Prespecified sensitivity analyses were (1) only skiers ≤5 years before delivery; (2) only skiers in the longest (90-km) races; (3) only race finishers; (4) complete case analysis; and (5) other Bayesian model specification: less informative prior for coefficients, 'normal (location = 0, scale = 5)'. Non-prespecified analyses were further added: (6) only women born in Scandinavia; (7) only primiparous women; (8) frequentist models (logistic regressions, except linear regression for labour duration) with Bonferroni correction for the number of main analyses (n = 58: 29 outcomes, two exposure variables); (9) selecting first available delivery among non-skiers with several deliveries per woman; and (10) only skiers ≤2 years before delivery. Finally, we performed exploratory analyses among skiers racing during pregnancy according to dates of ski race and delivery.
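The Bonferroni correction in sensitivity analysis (8) divides the significance level by the number of main analyses (58). A minimal sketch follows; the family-wise level of 0.05 is an assumption, as the text does not state it, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def passes_bonferroni(p_value: float, alpha: float = 0.05, n_tests: int = 58) -> bool:
    """True if a p-value survives correction for n_tests comparisons."""
    return p_value < bonferroni_threshold(alpha, n_tests)
```

With these assumed values the per-test threshold is 0.05/58, roughly 0.00086, which is why the correction is described later as stringent.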

| RESULTS
The final cohort comprised 194 384 non-skiers and 15 377 skiers (14 937 with a finish time; flowchart in Figure S1). On average, 5 years elapsed between ski race and subsequent delivery (median 4 years, interquartile range 2-7 years). Compared with skiers, non-skiers were less often primiparous, born in Scandinavia, university-educated, and living with a partner; they more often smoked; and fewer had normal-range BMI (Table 1). Average BMI and height were 23.6 kg/m² and 168 cm for skiers, and 24.9 kg/m² and 166 cm for non-skiers, respectively. Overall event rates are featured in Table S4.

| Ski race participation (model 1)
In the models comparing non-skiers with skiers (Figure 1, Tables 2 and S5), non-skiers had higher odds of pregnancy complications (GDM, excessive GWG, pelvic girdle pain and psychiatric morbidity) and delivery interventions or complications (any caesarean section [CS], emergency CS, elective CS, induction of labour, epidural pain relief and severe perineal lacerations). Babies of non-skiers had higher odds of small-for-gestational-age (SGA) <3rd percentile, SGA <10th percentile, large-for-gestational-age (LGA) >90th percentile, 5-minute Apgar score <7, and the composite outcome severe neonatal complications. Non-skiers had lower odds of inadequate GWG and perinatal VTE. Models were unstable for perinatal VTE, as the outcome is very rare (0.3%), shown by a large discrepancy between Bayesian and frequentist models that harmonised closely for other outcomes (Table S5). Our statistical approach allowed us to gather evidence for a null hypothesis if the 95% HDI was within 0.975-1.025, ruling out a meaningful effect size (Methods in Appendix S1). For labour duration, these conditions were met, meaning that we found evidence against an association with ski race participation. There was evidence of effect modification by parity on the associations between ski race participation and excessive and inadequate GWG (Table S6).
Observed OR point estimates were 1.1-1.7 for increased and 0.8 for decreased risk associations. Results were similar when adjusting for the potential mediator early pregnancy BMI (for excessive GWG and inadequate GWG, and LGA, associations were attenuated; Table S5).

| Ski race performance (model 2)
Models that examined associations with performance, as measured by a slower relative finish time, largely followed the same pattern (Figure 2, Tables 2 and S7). Lower performance was associated with higher odds of pregnancy complications (GDM, pre-eclampsia/eclampsia, any pregnancy-induced hypertension, excessive GWG and
FIGURE 1 Risk of predefined important perinatal outcomes for model 1, non-skiers compared with skiers (exposure baseline): odds ratios and 95% highest density intervals, from Bayesian logistic regressions unless otherwise specified. Bayesian 95% highest density interval is a range with the 95% most probable values of the odds ratio. For labour duration, figure shows proportion increase/decrease (calculated from Bayesian linear regression with a log-transformed outcome). All models are adjusted for parity, age, maternal country of birth, educational level, cohabitation status, smoking, alcohol consumption, location of delivery, calendar year of delivery and pre-pregnancy comorbidities. Exclusions of participants with certain conditions were made for some outcomes: gestational diabetes mellitus (excluding those with pre-pregnancy diabetes mellitus), pre-eclampsia or eclampsia; any pregnancy-induced hypertension (excluding those with pre-pregnancy hypertension), excessive gestational weight gain (GWG); inadequate GWG, 5-minute Apgar score <7; severe neonatal complications (excluding those with current preterm deliveries), induction of labour (excluding those with elective caesarean section [CS]), instrumental vaginal delivery; labour duration; severe perineal lacerations; shoulder dystocia or brachial plexus injury (excluding those with any CS). Reference for all the outcomes is the inverse (absence of the outcome).

| Sensitivity analyses and exploratory analyses
Frequentist regressions corresponded closely to Bayesian main analyses except regarding perinatal VTE, a rare outcome. With Bonferroni correction for multiple tests, associations passed the threshold except for (model 1) perinatal VTE, severe perineal lacerations, SGA <3rd percentile, SGA <10th percentile, LGA >90th percentile, 5-minute Apgar score <7, and severe neonatal complications; and for (model 2) pre-eclampsia/eclampsia, any pregnancy-induced hypertension, perinatal VTE and psychiatric morbidity. Other sensitivity analyses resulted in effect sizes similar to or larger than in the main analysis (Tables S5 and S7). Few women competed during pregnancy (n = 454), which limited the exploratory analyses; there were no associations with increased risks for any outcomes investigated, including a composite of any adverse fetal/neonatal outcomes (Table S8).
TABLE 2 Risk of predefined important perinatal outcomes from models for ski race participation and performance: ORs and 95% highest density intervals from Bayesian logistic regressions (unless otherwise specified).

| Main findings
Non-skiers compared with skiers, and slower skiers compared with faster skiers, consistently had higher odds of GDM, excessive GWG, psychiatric morbidity, any CS, elective CS, and LGA >90th percentile; and lower odds of inadequate GWG, after adjustment for socio-demographic and lifestyle factors and comorbidities, in Bayesian main models. Among these, psychiatric morbidity and LGA >90 did not meet statistical inference criteria after Bonferroni correction in frequentist sensitivity analyses. There were no associations with fetal/neonatal complications.

| Strengths and limitations
This large study combined a sports event registry with a comprehensive population-based birth registry to identify non-skier controls and extract detailed outcome and covariable information (with exact linkage methods and virtually no loss to follow-up). Prospectively registered data, with an objective measure of exercise (finish time), were used; in contrast, previous literature has relied on self-reports. We applied a form of evidence triangulation [43] using two analyses with different strengths and biases: skiers versus non-skiers (possibly lower type II error rate because of large numbers, but with ski race self-selection), and ski race performance (expectedly less 'confounding by indication', run among skiers only). We prespecified our analysis plan in a detailed protocol and report full results on predefined outcomes, with sensitivity analyses adjusting for multiple comparisons. Outcomes constituted the full overlap between published expert consensus [13] and Pregnancy Register variables.
Important limitations exist. First, we used detailed registry data to adjust for important potential confounders, but residual and unmeasured confounding cannot be excluded. Adoption of cross-country skiing is likely dependent on sociocultural and financial factors, and exercise habits may align with, for example, diet. Secondly, exercise lacks a standard measurement. [44] Our measures are unconventional and we cannot differentiate between implied high volumes of exercise preceding the ski race and immutable factors contributing to better performance (e.g. genetic predisposition for fitness). As non-skiers may frequently perform other types of exercise, and fit persons with little skiing experience can have slow finish times, associations could be underestimated or missed. Thirdly, some diagnoses can be under-reported, e.g. psychiatric disorders, [45] which is why we combined several measures (Table S2) that also include pre-pregnancy psychiatric morbidity. Because registry information was gathered during visits to maternal healthcare, we believe that currently symptomatic disorders are better represented than lifetime disorders. Pelvic girdle pain, measured according to a doctor's diagnosis, should be interpreted as a selected group of severe cases. Postpartum conditions (such as postpartum VTE) may not be fully captured, as the Pregnancy Register is mainly composed of data from prenatal healthcare. However, we expect no discrepancy in sensitivity contingent on the exposure. Finally, multiplicity correction (Bonferroni) and stringent inference criteria may have led to type II errors.

| Interpretation (in light of other evidence)
Few observational studies have examined whether exercise volumes or fitness before a pregnancy have a bearing on perinatal outcomes, [35][46][47][48][49][50][51][52][53] and these have provided mixed evidence, e.g. for GDM (lower risk [46]; no association [50]). RCTs exist only for the related but different question of whether exercise interventions during pregnancy decrease the risk of adverse perinatal outcomes. [9][10][11][12][14][54][55][56] Our findings on GDM, excessive GWG, inadequate GWG and psychiatric morbidity have scarce or no precedent in pre-pregnancy exercise studies [35][46][47]; contradict two studies with self-reported exercise [50][57]; and align with moderate to strong RCT evidence on pregnancy exercise. [9][10][11][20] Swedish GDM rates are generally low (1-3%), mainly because of differences in screening methods and diagnostic criteria. [38] Still, this study found even lower risk in skiers. We found no studies on pre-pregnancy exercise and perinatal psychiatric morbidity. Exercise during pregnancy reduces the risk of antepartum depression, while effects on anxiety disorders and postpartum depression are less clear. [11]
We also report small but consistent associations between pre-pregnancy exercise and lower risk of CS (consistent for elective CS; partly for emergency CS) and LGA >90th percentile. Pre-pregnancy data exist only from a very small cohort (97 US Marines with high fitness levels overall) with null findings on mode of delivery and birthweight. [52] Pregnancy exercise RCTs have not demonstrated effects on CS rates or LGA, but have shown effects on macrosomia (birthweight >4000 g). [12][14]
Perinatal VTE, an extremely rare but severe outcome, was more common among all skiers and faster skiers, but the models were not robust. We found no previous work on exercise and perinatal VTE. General population studies show associations between exercise and lower VTE risk, but there are reports of higher risk after vigorous exercise. [58] Short-term physiological effects such as dehydration and vessel wall injury could theoretically increase the VTE risk. [59] Although these results need further investigation, recommendations of pregnancy exercise should include compensating water loss.
Two outcomes reflected prolonged delivery (labour duration and dystocia). For the former we found evidence against an association and, for the latter, insufficient evidence, aligned with null findings from pregnancy exercise RCTs. [14] The fetal outcomes besides LGA (e.g. preterm birth, SGA, malformations, mortality) showed no associations, in agreement with pregnancy exercise studies. [12][55][56][60] See Discussion in Appendix S1 regarding other outcomes.
The literature implies broad beneficial metabolic and cardiovascular effects of exercise [5] beyond maintenance of normal-range BMI. Our perinatal results were in accordance and were only partly attenuated by early pregnancy BMI. Associations are unlikely to be explained by a single endurance race, but rather by being physically fit. In previous surveys, 21% of Vasaloppet skiers versus 4% in the general population reported engaging in regular strenuous exercise and 58% versus 25% in strenuous exercise. [29] Whereas 56% of female Vasaloppet skiers exercised ≥4 hours/week, only 18% in the general population reported ≥1.5 hours/week. [61] Pre-pregnancy and pregnancy exercise habits are closely related, but whether the latter mediate any associations cannot be derived from this material. [17][18]
Regarding the knowledge gap for elite athlete exercise before and during pregnancy and perinatal outcomes, [62] our mixed cohort (with few elite-level performers) provides some indirect evidence. We found no evidence for any non-linear associations (i.e. no evidence for 'U-shaped' associations with higher risks at the highest exercise levels). In exploratory analyses restricting the sample to skiers during pregnancy, findings were in accordance with main models, although precision is very limited because of the sparse numbers. A previous systematic review found no associations between preconception elite-level training and, for example, caesarean sections and birthweight, but the number of studies was small (for caesarean section, n = 3 studies, 324 persons). [62]

| CONCLUSIONS
Ski race participation and performance before pregnancy, as proxies for higher volumes of exercise and better fitness, were associated with benefits for important perinatal metabolic and mental health outcomes and vaginal delivery, without adverse fetal/neonatal outcomes.
Pregnancy is perhaps a window of opportunity for interventions, but a challenging time for ambitious lifestyle changes. This paper focuses on habits before pregnancy, which makes it relevant for obstetricians meeting patients between pregnancies, any clinicians meeting women of fertile age, and policy makers in public health. Our results add reasons to further promote existing physical activity recommendations in the general population, [63] adding new knowledge about perinatal disease burden that could potentially be alleviated with increased population-level exercise. Besides clinical interventions focused on risk groups, community-wide actions could target access to and affordability of exercise and everyday physical activity. [63][64] Questions that arise are whether, when and at which intensity individual exercise interventions or community-level policies could diminish the risk of future pregnancy complications. Any previous large exercise RCTs with women participants could potentially be followed up using national birth registries or other comprehensive sources of routinely collected data. Also, prospective trials cluster-randomising communities to exercise-facilitating actions could be encouraged, as could quasi-experimental designs to evaluate community-wide policies regarding population-level perinatal statistics.

AUTHOR CONTRIBUTIONS
CA: conceptualisation; data curation; formal analysis; investigation; methodology; project administration; visualisation; writing - original draft. AKW, ISP, UH, KM, JW: methodology; writing - review & editing. RAW: formal analysis; investigation; methodology; writing - review & editing. AS: conceptualisation; funding acquisition; investigation; methodology; resources; supervision; writing - review & editing. All authors approved the final article.

ACKNOWLEDGEMENTS
The authors thank Uppsala University for open access funding.

FUNDING INFORMATION
This work was supported by the Uppsala University Hospital and Uppsala Region ALF grants (Alkistis Skalkidou). The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the article; or decision to submit the paper for publication.

CONFLICT OF INTEREST STATEMENT
CA declares funding outside this work from the Knut and Alice Wallenberg Foundation (KAW 2019.0561), Uppsala University (E o R Börjesons stiftelse; Medicinska fakultetens i Uppsala stiftelse för psykiatrisk och neurologisk forskning), The Sweden-America Foundation, Foundation Blanceflor, Swedish Society of Medicine, and Märta och Nicke Nasvells fond. ISP declares to have acted outside this work as invited speaker at scientific meetings for Gedeon Richter and Novartis during the past 36 months. AS declares support for this work from Uppsala Region and declares to have received lecture honoraria outside this work from Svensk Förening för Obstetrik och Gynekologi, being a member of the advisory board of the patient organisation Mamma till Mamma, and a member of university-related boards at Uppsala University as well as the Nordic Marcé Board, Swedish Medical Society Research Delegation board, and Svensk Telepsykiatri board. AKW, UH, KM, JW and RAW declare no competing interests. Completed disclosure of interest forms are available to view online as supporting information.

ETHICS APPROVAL
The study was approved by the Uppsala Regional Ethics Committee (2017/347). Following Swedish regulations, the study did not require individual informed consent.

DATA AVAILABILITY STATEMENT
Restrictions apply to the availability of the data in this study according to Swedish legislation. Access to the data is dependent on permission from the Regional Ethics Committee of Uppsala and the holders of the Vasaloppet Register and Pregnancy Register.