Thyroid hormone replacement therapy patterns in pregnant women and perinatal outcomes in the offspring

It remains unknown to what degree thyroid hormone replacement therapy (THRT) during and initiation after pregnancy determines pregnancy outcomes. The present study primarily aimed to quantify the impact of THRT patterns (including trajectories) on gestational age, birth weight, and head circumference of infants. The secondary aim was to compare results of trajectory with traditional analysis.


| INTRODUCTION
Approximately 3% of women of reproductive age experience overt or subclinical hypothyroidism. 1 In addition, during pregnancy, women are more vulnerable to developing hypothyroidism because of the increased demand for thyroid hormone production. 2 Inadequate treatment of hypothyroidism during gestation has been associated with adverse pregnancy outcomes, like preterm delivery. 3 Therefore, thyroid hormone replacement therapy (THRT) is recommended. 4 The literature has reported conflicting results on the beneficial effect of THRT on pregnancy outcomes. 5,6 However, there is evidence that the effectiveness of THRT depends on timing (first trimester) and dosage to match severity of the condition. 3 A number of previous studies failed to include first trimester exposure information or lacked information on dosage and severity levels, eg, thyroid hormone blood levels. 6,7 Exposure groups reflecting variations in THRT use with respect to timing, duration, and dosage during gestation may be biologically more appropriate than simply grouping women into users and nonusers when assessing the impact of THRT on pregnancy outcomes. 8 The current study builds on our prior work showing that Group-Based Trajectory Models (GBTMs) could be used to identify women with distinct patterns of THRT use in pregnancy. 8 Our primary aim is to analyze the association between THRT patterns during pregnancy (using GBTM) and immediate pregnancy outcomes, such as infant birth weight, gestational age at birth, and head circumference. To address confounding by maternal underlying disease, we also compared women initiating THRT after delivery with the nonhypothyroid group. After analyzing these associations for all women with a prescription of THRT during pregnancy, a secondary aim is to compare the analysis of this joint group with the one that splits THRT users during pregnancy into disjoint trajectories. 
This helps illustrate the relevance of clustering techniques in studies of medication safety in pregnancy. An important advantage of this study over prior observational studies is the use of data on maternal thyroid hormone blood levels in pregnancy to account for severity levels of hypothyroidism.

| Data sources
The following data sources were linked using the unique 11-digit person identification number given to all legal residents in Norway.
The Norwegian Mother, Father and Child Cohort Study (MoBa) is a prospective, population-based cohort study of pregnancies in Norway that was initiated in 1999 by the Norwegian Institute of Public Health. 9 18. 11 The method of selecting participants for the Biobank subgroup is described by Caspersen et al. 12 The Medical Birth Registry of Norway (MBRN) is a nationwide health registry of information about all births in Norway. 13 The registry includes confirmed medical records related to maternal health before and during pregnancy. 13

Key points
• This study extends existing literature on the association between THRT and immediate perinatal outcomes by including thyroid hormone blood levels.
• Pregestational and first trimester THRT use is important for infant health.
• Underlying or undiagnosed hypothyroid condition during pregnancy increases the risk of LGA infants.
• Compared with analyses grouping women into users or nonusers of medication, GBTM enables a more biologically tailored investigation of the association between THRT and immediate pregnancy outcomes.
• Clustering approaches might be useful in future studies on drug safety in pregnancy, when timing, duration, and dose of exposure are important for fetal safety.
H03B, a combination of ATC H03AA and H03B, and ICD-10 code "e0-other") or a diagnosis of hyperthyroidism (ICD-code "e05") in the MBRN. Diagnosis information from the NPR was not considered a selection criterion for the study population, because the NPR was established in 2008.
From the original cohort of participants (n = 55 243), we excluded around 2.2% of women because of incomplete or inconsistent information on THRT use, missing information, and registration outliers on pregnancy outcomes. 17,18 We restricted the study sample to the first 32 gestational weeks to ensure that all women had equal time available in relation to the exposure (see Supporting Information).

| Exposure definitions
Women with hypothyroidism were identified based on dispensed THRT prescriptions before and during pregnancy or for the first time within 1 year after delivery. Women were assigned to the medicated group (n = 1233) if they received at least one THRT prescription during the period starting 6 months prior to the pregnancy and ending at 32 gestational weeks. The THRT after delivery group (n = 1397) included women who received THRT prescriptions within 1 year after delivery and served as a proxy for a disease-comparator group. This group is important to study, as women might develop postpartum hypothyroidism because of postpartum thyroiditis. 19 Hence, these women might have thyroid antibodies present during pregnancy, which has previously been connected to adverse immediate pregnancy outcomes. 20,21 The nonhypothyroid group (n = 51 390) was defined as the reference group and included women who did not receive a THRT prescription before, during, or after pregnancy.
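As a consistency check on the reported numbers, the following minimal sketch sums the three exposure groups and compares the total with the original cohort. It assumes (this is not stated explicitly in the text) that the three groups together make up the final analysis sample; the variable and function names are illustrative only.

```python
# Group sizes and original cohort size as reported in the text.
GROUP_SIZES = {
    "medicated": 1233,
    "thrt_after_delivery": 1397,
    "nonhypothyroid": 51390,
}
ORIGINAL_COHORT = 55243


def excluded_fraction(group_sizes, original_n):
    """Share of the original cohort that is not in any exposure group.

    Assumption: the exposure groups partition the final analysis sample.
    """
    analysed = sum(group_sizes.values())
    return (original_n - analysed) / original_n


analysed_n = sum(GROUP_SIZES.values())
excluded_pct = round(100 * excluded_fraction(GROUP_SIZES, ORIGINAL_COHORT), 1)
print(analysed_n)    # 54020
print(excluded_pct)  # 2.2
```

Under this assumption, the group sizes are in line with the stated exclusion of around 2.2% of the original cohort.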
THRT was classified based on the ATC Classification System and included thyroid hormones (ATC code H03AA). 22 GBTMs are finite mixture models, which split a population into a finite number of disjoint groups based on the latent mixture probability of group membership. 23 First, we split the exposure period, starting from 6 months prior to pregnancy and ending with gestational week 32, into months (in total 14 months). To classify women in the medicated hypothyroid group into trajectories, we then applied the GBTM approach to each woman's prescription dispensing over the 14 months (ie, exposed/unexposed), similar to Frank et al 8 , who used GBTM to define long-term THRT adherence patterns before, during, and after pregnancy.
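The 14-month exposed/unexposed sequence that serves as GBTM input can be sketched as follows. This is an illustration in Python, not the authors' Stata or R code. It assumes that every month is 28 days long (four weeks), that day 0 is the first day of gestational week 1, and that a month counts as exposed if at least one dispensing falls in it; the function name and the day-offset input format are hypothetical.

```python
from typing import Iterable, List

MONTH_LENGTH_DAYS = 28  # assumption: one "month" equals four weeks
FIRST_MONTH = -6        # exposure window starts 6 months before pregnancy
N_MONTHS = 14           # months -6..-1 plus pregnancy months 0..7 (weeks 1 to 32)


def monthly_exposure(dispense_days: Iterable[int]) -> List[int]:
    """Return 14 binary indicators (1 = at least one dispensing in that month).

    dispense_days are day offsets relative to pregnancy start; negative
    values lie before pregnancy. Dispensings outside the window are ignored.
    """
    exposure = [0] * N_MONTHS
    for day in dispense_days:
        month = day // MONTH_LENGTH_DAYS  # floor division also works for negative days
        index = month - FIRST_MONTH
        if 0 <= index < N_MONTHS:
            exposure[index] = 1
    return exposure


# Hypothetical woman: dispensings 100 days before pregnancy and on pregnancy
# days 10, 95, and 230 (the last one falls after gestational week 32).
example = monthly_exposure([-100, 10, 95, 230])
print(example)  # [0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
```

Each woman's 14-element vector would then be one row of the longitudinal input to a trajectory model.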

| Outcome definitions
We retrieved data on child outcomes from the MBRN, including birth weight (g), gestational age (days), and head circumference (cm), all modelled as continuous. In the Norwegian population, the average birth weight is 3,489 g, with a standard deviation (SD) of 591 g; the average gestational age is 275.1 days with SD of 13.3 days; the average head circumference is 35.30 cm with SD of 0.04 cm. 24,25 The large-for-gestational age (LGA) infant outcome was analyzed and dichotomized at the 90th percentile in the MoBa population. Other potential outcomes, such as small-for-gestational age, low birth weight, premature (gestational weeks <37) birth, and small or large head circumference, could not be analyzed because of the low number of cases (≤5) within the trajectory groups. Q3, and they were categorized as medicated or nonmedicated, depending on whether the woman reported psychotropic drug use (ATC codes N05 and N06). Thyroid hormone blood levels, TSH, FT4, FT3, and TPOAb levels, were retrieved from the Biobank subsample. 27 We considered the sufficient set of confounders to be maternal age, BMI, parity, marital status, comorbidities, fiber intake, educational level, income, supplement use, smoking and alcohol habits, gender of child, FT3, FT4, TSH severity, and the TPOAb category.
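The dichotomization of the LGA outcome at the 90th percentile can be illustrated with a short sketch. It makes two assumptions that are not confirmed by the text: that the percentile is computed within gestational-week strata, and that a nearest-rank percentile definition is used. All names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def flag_lga(births, pct=90):
    """births: list of (gestational_week, birth_weight_g) tuples.

    Returns a 0/1 flag per birth: 1 if the birth weight exceeds the
    pct-th percentile of its own gestational-week stratum.
    """
    by_week = defaultdict(list)
    for week, weight in births:
        by_week[week].append(weight)
    cutoffs = {week: percentile_nearest_rank(w, pct) for week, w in by_week.items()}
    return [int(weight > cutoffs[week]) for week, weight in births]


# Toy data: ten births in gestational week 40, weights 3000 g to 3900 g.
toy_births = [(40, w) for w in range(3000, 4000, 100)]
toy_flags = flag_lga(toy_births)
print(toy_flags)  # [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
```

In the toy stratum the 90th percentile is 3800 g, so only the 3900 g infant is flagged.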

| Ethical approval
The establishment and data collection in MoBa were previously based on a license from the Norwegian Data Protection Agency, with approval from The Regional Committee for Medical Research Ethics.
Currently, it is based on regulations related to the Norwegian Health Registry Act.
The overall MoBa study was approved by the Norwegian Data Inspectorate (01/4325) and The Regional Committee for Medical Research Ethics (S-97045, S-95113). The current study was approved by The Regional Committee for Medical Research Ethics (2015/1241, REK Sør-Øst B). All participants provided written informed consent prior to participation.
Blood samples were obtained from both parents during pregnancy and from mothers and infants (umbilical cord) at birth.

| Statistical analysis
To take into account differences in characteristics across women in the various treatment groups, we performed propensity score analysis, with inverse probability of treatment weighting (IPTW), after multiple imputation. 28,29 The subsample of thyroid hormone blood levels from the Biobank was multiply imputed together with other missing covariate information. 30 After exploring the patterns of missing data, we assumed that data were missing at random (MAR), as in prior MoBa studies. 31 Although MAR is untestable, including a wide variety of predictors in the imputation model makes the assumption more plausible (see also Supporting Information). 32,33 The optimal number of THRT trajectories was selected by the highest (least negative) Bayesian information criterion (BIC) value, subject to all estimated group proportions being greater than 5.0%. 8 Boosted logistic regression models were applied to determine the conditional probability of six group comparisons, where the THRT trajectories from the medicated and the THRT after delivery group were compared with the nonhypothyroid group. 28 The propensity scores were calculated conditioned on the sufficient set of confounders.
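The selection rule for the number of trajectories (highest BIC among models in which every group exceeds 5% of the sample) can be written as a small helper. This is a sketch of the rule only, not of the GBTM fitting itself; the function name and the BIC values and group proportions in the example are hypothetical and are not results from this study.

```python
def select_n_groups(candidates, min_share=0.05):
    """Pick the number of trajectory groups.

    candidates maps number of groups -> (bic, [estimated group proportions]).
    BIC is taken on the scale where higher (least negative) is better.
    Returns the best eligible number of groups, or None if none qualifies.
    """
    eligible = {
        k: bic
        for k, (bic, shares) in candidates.items()
        if min(shares) > min_share
    }
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


# Hypothetical fits for 2 to 5 groups (made-up numbers for illustration).
hypothetical_fits = {
    2: (-5200.0, [0.60, 0.40]),
    3: (-5100.0, [0.50, 0.30, 0.20]),
    4: (-5050.0, [0.40, 0.30, 0.20, 0.10]),
    5: (-5040.0, [0.40, 0.30, 0.20, 0.07, 0.03]),
}
chosen = select_n_groups(hypothetical_fits)
print(chosen)  # 4
```

In this made-up example the five-group model has the highest BIC but is rejected because one group holds only 3% of the women, so the four-group model is chosen.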
We did not adjust for gestational age when analyzing birth weight, head circumference, or LGA infant, as gestational age can introduce collider bias. 34 Weights were truncated at the 99th percentile. For balance assessment, the Maximal Averaged Standardized Difference (MASD) was applied. The MASD is a balance diagnostic for the generalized propensity score after multiple imputations (see Supporting Information). In the final weighted regression model, we took repeated pregnancy participation in MoBa into account. A summary of the analytical procedure can be found in Algorithm S1.
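The weighting and truncation step can be sketched as follows. This is a stdlib-only Python illustration, not the "twang" or "survey" workflow used in the study. It assumes that the weight for each woman is the inverse of her estimated probability of belonging to the group she was actually in, and that truncation caps weights at a nearest-rank 99th percentile; the exact percentile definition used by the authors is not given, and all names here are hypothetical.

```python
import math


def iptw_weights(observed_groups, propensities):
    """Inverse probability of treatment weights for several treatment groups.

    observed_groups[i] is the group label of woman i; propensities[i] is a
    dict mapping each group label to her estimated membership probability.
    """
    return [1.0 / propensities[i][g] for i, g in enumerate(observed_groups)]


def truncate_weights(weights, pct=99):
    """Cap weights at the pct-th percentile (nearest-rank definition, an assumption)."""
    ordered = sorted(weights)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    cap = ordered[rank - 1]
    return [min(w, cap) for w in weights]


def weighted_mean(values, weights):
    """Weighted mean of an outcome, eg, birth weight within one group."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical example: two women with made-up propensity scores.
groups = ["nonhypothyroid", "constant_high"]
scores = [
    {"nonhypothyroid": 0.5, "constant_high": 0.5},
    {"nonhypothyroid": 0.75, "constant_high": 0.25},
]
raw = iptw_weights(groups, scores)
print(raw)  # [2.0, 4.0]
```

Truncation limits the influence of women with very small estimated propensities, at the cost of some bias in exchange for lower variance.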

| Sensitivity analyses
We compared the trajectory analysis to a more traditional approach.
With "traditional analysis," we specifically refer to the analysis where the medicated (previously split into disjoint trajectories) and THRT after delivery groups are compared with the nonhypothyroid group.
Furthermore, the trajectory analysis was done without inclusion of blood levels. Finally, we performed a complete case analysis.
A priori power calculations are presented in detail in the Supporting Information.
GBTMs were built with the "traj" Stata plugin (Stata version 15.1). 35 The remaining analyses were performed in R (version 3.4.4). For multiple imputation, we used the "mice" R package; 29 for the generalized propensity score, we used the "twang" R package and its "mnps" function for multiple treatments. 28 Regression analysis with IPTW was performed with the "survey" R package. 36
In the medicated group, we identified four disjoint trajectories. Maternal characteristics are presented according to treatment groups in Table 1. Differences among groups were observed in socioeconomic characteristics, concomitant health, medication use, and lifestyle factors. The frequency of thyroid diagnoses also varied between groups (Table 1). Drug utilization for each THRT trajectory and the medicated hypothyroid group is shown in Table S1. Individual covariates had missing information ranging from 1.8% to 11.7%. In total, 29.0% had missing data in one or several variables. Table S2 presents the data on thyroid hormone blood levels before and after multiple imputation.

| Trajectory analysis
The results of the trajectory analysis are shown in Table 2.

| Traditional and other sensitivity analyses
In the traditional analysis (Table S4), only the THRT after delivery group showed a significant difference in the risk of LGA infants (aOR = 1.19; 95% CI, 1.00-1.42) compared with the nonhypothyroid group.
Only small, insignificant deviations in the traditional analysis without blood levels were observed compared with the analysis including these data as confounders (Table S5).
Figure caption: Dashed thin lines are approximated 95% pointwise confidence intervals on the estimated trajectories. Vertical lines mark the start of the 6-month period prior to pregnancy (at month "−6") and the start of the pregnancy period (at month "0"). A month represents four gestational weeks. For example, month "0" stands for gestational weeks 1 to 4. The Y-axis presents the group-average adherence rate to THRT per month. For example, women in the Constant-Medium group took THRT, on average, 60% of days in each month before and during gestation. On average, women in the Increasing-Medium trajectory took no THRT 4 months before gestation, but in pregnancy month 5, they took THRT on, on average, 60% of days. Abbreviations: THRT, thyroid hormone replacement therapy.
were significantly larger than in the nonhypothyroid group (Table S6).

| DISCUSSION
This study is the first to examine the impact of THRT trajectories in pregnancy on immediate perinatal outcomes and to include information on maternal thyroid hormone blood levels during gestation.
An increased risk of LGA infants was observed among women in the Increasing-Medium (69% magnitude) and THRT after delivery (19% magnitude) groups compared with the nonhypothyroid group.
Given that the fetus relies entirely on maternal thyroid hormone production in the first 20 weeks, 37 late initiation of treatment during early gestation in the Increasing-Medium group might explain the observed increased risk of LGA. 38 Pregestational THRT use might also be necessary for women with diagnosed hypothyroidism to maintain maternal thyroid hormone levels within the reference range during early gestation. 39 Hypothyroidism might be latent, for some time, before it is diagnosed and treated. 40 An underlying thyroid disorder during gestation among women in the THRT after delivery group might explain the LGA results. 40 Currently, there is no consensus about whether pregnant women should be screened for hypothyroidism in pregnancy. 40 Our results support current guidelines of selective screening of pregnant women for hypothyroidism early in gestation. Moreover, we recommend monitoring and tailoring of THRT to women with hypothyroidism in need of pharmacological treatment. There is however a need to further examine potential benefits of universal screening for the health of mother and child.
We found no significantly increased risk of LGA infant for women who consistently used THRT (ie, women in the Constant-High and Constant-Medium groups). This is in contrast to results from a recent Finnish study, which showed that consistent users of THRT had 26% higher odds of an LGA infant (aOR = 1.26; 95% CI, 1.10-1.45) when compared with mothers without thyroid disease. 6 However, we excluded pregnancies with gestational age before 32 weeks, which might explain why we did not observe an increased risk of LGA infant for consistent THRT users, as opposed to Turunen et al. 6 In the complete case analysis, infants of women in the Constant-High group had significantly higher birth weight and head circumference. This could be explained by the fact that women in this group were most often obese and had gestational diabetes. 41,42 Although we adjusted for BMI and maternal diabetes in the analysis, residual confounding and a role for unobserved maternal metabolic factors cannot be ruled out. Greater risk of selection bias in the complete case analysis might explain these significant effects, which disappeared after multiple imputation.

| Strengths
The clear strengths of this study are the large size of the study population, the combination of multifaceted data sources, inclusion of thyroid hormone blood levels, and advanced statistical analysis. To our knowledge, combining the generalized propensity score with exposure trajectories is a novel approach. This approach enabled us to minimize confounding bias; 43 in particular, we reduced the confounding by severity by including maternal thyroid hormone blood levels. A potential concern might be the use of prescription records rather than maternal self-reported THRT use. Prescription records do not necessarily represent actual medication use. 14

| Limitations
Selection bias is a well-known, acknowledged limitation of the MoBa cohort study. 9 Compared with the general Norwegian population, women in MoBa are known to be older and healthier, to have higher educational levels, and to be less likely to smoke during pregnancy. 10 All pregnancy outcomes were, however, within the normal range for a Norwegian infant.
Although we adjusted for measured confounders and thyroid hormone blood levels, we cannot rule out, for example, the influence of residual confounding by maternal disease severity in mid-late pregnancy, given that blood samples were taken in gestational week 18.
Given that the Biobank is a nonrandom sample of MoBa, there is a possibility that information on blood levels is missing not at random. However, imputation of blood levels based on the MAR assumption did not appear to bias our results, as the sensitivity analysis without inclusion of blood levels showed. This study highlights the need for future methodological development on using biological material from small, selected (non)random subsamples in statistical analysis. 9 According to power calculations, the present study could only detect large effect sizes.

| CONCLUSIONS
We found an increased risk of LGA infants among women initiating THRT late in pregnancy or after delivery. However, there was no evidence that the various THRT patterns had a substantial, differential effect on the other outcomes. The results of this study support current guidelines on the importance of THRT use during pregnancy and selective screening of pregnant women for hypothyroidism.

ACKNOWLEDGEMENTS
We are grateful to all of the participating families in Norway that took part in this ongoing cohort study. We also thank Cathrine Thomsen,
manuscript was revised during Anna Simone Frank's short-term