Familial segregation of venous thromboembolism


John A. Heit, Hematology Research, Stabile 660, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA.
Tel.: +1 507 284 4634; fax: +1 507 266 9302; e-mail: heit.john@mayo.edu


Background: Venous thromboembolism (VTE) is postulated to be a complex disease, but its heritability and mode of inheritance are uncertain. Objective: To determine (i) whether VTE segregates in families; (ii) whether it is attributable to inheritance, shared environment, or both; and (iii) the possible mode of inheritance. Patients and methods: In a family-based study of relatives from 751 probands (60% female) with objectively diagnosed VTE (without cancer), we performed complex segregation analyses corrected for mode of ascertainment, considering age-specific, non-gender- and gender-specific liability classes under Mendelian and non-Mendelian assumptions. We tested 12 models categorized into four model sets: (i) sporadic (assumes no genetic effect); (ii) Mendelian inheritance of a major gene (including dominant, additive, recessive or codominant classes); (iii) mixed model (Mendelian inheritance including the same four classes plus the effect of polygenes); and (iv) non-Mendelian. Results: Among the 16 650 relatives (48% female), 753 were affected with VTE, of whom 62% were first-degree relatives. The sporadic model was rejected in both non-gender- and gender-specific liability class analyses. Among the remaining gender-specific models, the unrestricted (non-Mendelian) inheritance model was favored, with an estimated heritability of 0.52. Among the Mendelian models, the dominant mixed model was preferred, with an estimated heritability and major disease allele frequency of 0.62 and 0.25, respectively, suggesting an effect of several minor genes. Conclusion: A multifactorial non-Mendelian inheritance model was favored as the cause of VTE, while a model postulating a purely environmental cause was rejected. VTE is probably a result of multigenic action as well as environmental exposures.


Venous thromboembolism (VTE) is a major national health problem, with over 200 000 incident cases per year in the USA [1]. Thirty percent of all pulmonary embolism patients die within 7 days, and 25% die suddenly [2]. Thus, prevention of VTE by identification and prophylaxis of patients at risk is necessary to improve survival. However, despite improved prophylaxis regimens [3] and more widespread utilization of prophylaxis, the incidence of VTE has been relatively constant at about 1 per 1000 since 1979 [1,4]. We believe the failure to reduce the incidence of VTE is largely due to our inability to recognize those persons at greatest risk. Currently recognized clinical risk factors include surgery, trauma, hospital inpatient status, malignant neoplasm, central vein catheterization or transvenous pacemaker, prior superficial vein thrombosis, varicose veins, neurological disease with extremity paresis, congestive heart failure, pregnancy or the postpartum period, oral contraceptives, and hormone replacement therapy [5,6]. While such characteristics can identify populations at risk, these characteristics have a low predictive value for the individual patient.

Inherited thrombophilia, such as reduction in plasma antithrombin, protein C, or protein S activity, is strongly associated with VTE and could potentially further stratify patients into high- and low-risk groups. However, the combined prevalence of these recognized familial thrombophilias among VTE patients is only about 5% [7]. Additional specific mutations (e.g. factor V Leiden, prothrombin nucleotide G20210A) may account for another 15–20% of VTE occurring in the community [8–10]. Because there are few estimates of the heritability of thrombosis risk restricted to the VTE phenotype, the genetic influence on the risk of common late-onset VTE remains uncertain. As a first step toward the ultimate goal of identifying novel genes involved in VTE susceptibility, we performed a large family-based study addressing the heritability of VTE and the potential mode of inheritance.


Study population

Over the 6-year period 1998–2003, all patients with idiopathic, objectively diagnosed VTE who were referred to the Mayo Clinic Special Coagulation Laboratory/Clinic for a clinical suspicion of thrombophilia were approached for study inclusion. Consenting patients (probands; n = 751, 60% women) completed questionnaires providing a family pedigree and ethnic ancestry, and indicated all other family members who were affected with VTE. All patients were non-Hispanic Caucasian-Americans, in keeping with the racial composition of the region.

VTE incidence rates

Age-specific and age- and gender-specific VTE incidence rates for Olmsted County (MN, USA) updated through 1995 were used to estimate the affection probabilities [1]. To determine the incidence numerator, all incident VTE (deep vein thrombosis, pulmonary embolism, and chronic thromboembolic pulmonary hypertension) cases in Olmsted County were identified using the resources of the Rochester Epidemiology Project [11]. Person-years used for the denominator were calculated using US census data for the Olmsted County population for the years 1970, 1980, 1990, and 2000 with linear interpolation for the intercensal years. The non-gender-specific models used overall age-specific VTE population incidence rates derived from Olmsted County, while the gender-specific models used age- and sex-specific VTE population incidence rates derived from Olmsted County. Individuals of unknown age were assigned the same incidence rate as 60–69-year-olds of the same gender.
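The person-year denominator described above can be illustrated with a minimal sketch. It assumes a single age-sex stratum with hypothetical census counts (not study data); the function names are illustrative and are not taken from the Rochester Epidemiology Project or any published software.

```python
def interpolate_population(census, year):
    """Linearly interpolate the population for a year from census counts."""
    years = sorted(census)
    if year < years[0] or year > years[-1]:
        raise ValueError(f"{year} lies outside the census range")
    for start, end in zip(years, years[1:]):
        if start <= year <= end:
            fraction = (year - start) / (end - start)
            return census[start] + fraction * (census[end] - census[start])


def person_years(census, first_year, last_year):
    """Sum the interpolated yearly populations over an observation window."""
    return sum(interpolate_population(census, y)
               for y in range(first_year, last_year + 1))


def incidence_per_100k(cases, census, first_year, last_year):
    """Incidence rate per 100 000 person-years."""
    return 1e5 * cases / person_years(census, first_year, last_year)


# Hypothetical census counts for one age-sex stratum (assumed values).
census = {1970: 1000, 1980: 1200, 1990: 1500, 2000: 1800}
print(interpolate_population(census, 1975))
print(incidence_per_100k(12, census, 1970, 1979))
```

In practice the same calculation would be repeated for every age (and, for the gender-specific models, sex) stratum to obtain the liability-class incidence rates.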

Statistical analysis

Our analyses included all families of probands referred over the entire study period (n = 751). We conducted complex segregation analysis using the unified version of the mixed model [12] as implemented in Pedigree Analysis Package (PAP) [13]. The data consisted of extended families of the 751 probands diagnosed with VTE. Since the families were ascertained through one individual (i.e. the proband) presenting with VTE, we corrected for the method of ascertainment [14] in order to obtain parameter estimates appropriate for the population.

The most general statistical model includes the independent and additive contribution of a single gene, small additive genetic effects of a large number of independent polygenes, and individual-specific environmental effects. This general model assumes Hardy–Weinberg equilibrium at all loci, no linkage disequilibrium between pairs of loci, no epistasis, and no genotype–environment interactions. We ran the model under both Mendelian and non-Mendelian assumptions. The single major gene was assumed to consist of two classes of alleles, A (affected) and a (unaffected). The frequencies of A and a in the sample population were assumed to be p and q = 1 − p, respectively. Two further parameters, the dominance (d) and the displacement (t), defined the distribution of VTE within the three possible genotypic classes, AA, Aa, and aa. The distribution of the trait (VTE) for the ith genotypic class (i = 1 for AA, 2 for Aa, and 3 for aa) was assumed to be normal with mean μi and variance σ²; the variance was assumed to be the same for all three genotypic classes. The dominance was defined so that d = 0 corresponds to a recessive gene, d = 0.5 to an additive gene, d = 1 to a dominant gene, and 0 ≤ d ≤ 1 to a codominant gene. The displacement refers to the mean difference between the homozygous genotypes (AA and aa). Within each genotype, the proportion of the variance due to polygenes was denoted by h² (heritability).
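As a rough illustration of this parameterization, the sketch below computes Hardy–Weinberg genotype frequencies from the allele frequency p and places the three genotype means using the dominance d and displacement t. Placing the heterozygote mean at a fraction d of the displacement above the aa mean is the conventional reading of these definitions and is an assumption here, as are the function names and the example values.

```python
def genotype_frequencies(p):
    """Hardy-Weinberg genotype frequencies for allele frequency p of allele A."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


def genotype_means(d, t, baseline=0.0):
    """Genotype means on the liability scale.

    Assumption: the aa mean sits at `baseline`, the AA mean at baseline + t,
    and the Aa mean at baseline + d * t, so that d = 0 is recessive,
    d = 0.5 additive and d = 1 dominant.
    """
    return {"aa": baseline, "Aa": baseline + d * t, "AA": baseline + t}


# Illustrative values only (assumed, not fitted estimates).
freqs = genotype_frequencies(0.25)
means = genotype_means(d=1.0, t=2.0)
carrier_frequency = freqs["AA"] + freqs["Aa"]
print(freqs, means, carrier_frequency)
```

Under a dominant model (d = 1) the heterozygote mean coincides with the AA mean, so the proportion of the population at elevated liability is the carrier frequency 1 − q².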

This model also included three transmission probabilities (τAA, τAa, τaa), each defined as the probability that a parent with the corresponding genotype transmits the disease allele A to its offspring. If Mendelian transmission was assumed, the transmission probabilities were fixed to τAA = 1, τAa = 0.5, and τaa = 0. The penetrance function was calculated using the incidence rates. The traditional likelihood-ratio test (LRT) was used to test nested models (i.e. when one model is a submodel of the other). Asymptotically, the LRT statistic has a χ² distribution with k degrees of freedom, where k is the difference in the number of parameters between the two nested models. When the models were not nested, Akaike's information criterion (AIC) was used to compare models [AIC = −2 ln L + 2 × (number of parameters), where ln L is the natural logarithm of the likelihood of the model] [15]. The best-fitting model is the one with the lowest AIC, although there is no specific statistical test for significance [16].
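The two comparison criteria can be sketched in a few lines. The example uses the −2 ln L values and parameter counts reported in Table 4 for the dominant major-locus and dominant mixed models, which differ only in whether heritability is estimated. The helper names are illustrative, and the tail probability is written only for the 1-degree-of-freedom case; treating this comparison as a plain 1-df χ² test is a simplifying assumption of the sketch.

```python
import math


def aic(neg2_log_lik, n_params):
    """Akaike's information criterion: -2 ln L + 2 * (number of parameters)."""
    return neg2_log_lik + 2 * n_params


def lrt_statistic(neg2_log_lik_restricted, neg2_log_lik_general):
    """Likelihood-ratio statistic for a restricted model nested in a general one."""
    return neg2_log_lik_restricted - neg2_log_lik_general


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# Values as reported in Table 4 (gender-specific analysis).
major_dominant = (8167.103, 3)   # (-2 ln L, number of parameters)
mixed_dominant = (7791.220, 4)

print(aic(*major_dominant), aic(*mixed_dominant))
stat = lrt_statistic(major_dominant[0], mixed_dominant[0])
print(stat, chi2_sf_1df(stat))
```

The AIC values reproduce those listed in the table, and the large drop in −2 ln L when heritability is estimated is what favors the mixed model over the pure major-locus model.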

We used 12 models to analyze these data, and categorized them into four model sets: (i) sporadic, which assumes no genetic effect; (ii) Mendelian inheritance of a major gene, which was divided into four classes: dominant, additive, recessive and codominant; (iii) mixed model (Mendelian inheritance), which was divided into the same four classes as (ii), plus the effect of polygenes, represented by the parameter h²; and (iv) non-Mendelian (unrestricted τAa), which assumes no Mendelian segregation (the transmission probability τAa was not fixed to 0.5). The sporadic model assumes p = 0 and h² = 0, which is equivalent to the assumption of equal transmission (τAA = τAa = τaa). The major gene and mixed models assume Mendelian transmission (τAA = 1, τAa = 0.5, and τaa = 0). To identify the most parsimonious model of inheritance, log-likelihood comparisons were made for the restrictive models (sporadic and non-Mendelian vs. Mendelian), with adjustment for the number of parameters estimated through the use of the AIC.
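The role of the transmission probabilities in separating these model sets can be sketched as follows. The function below gives the offspring genotype distribution for a pair of parental genotypes under any set of τ values; it is an illustration of the definitions above, not the PAP implementation, and the non-Mendelian and equal-transmission τ values are hypothetical.

```python
MENDELIAN = {"AA": 1.0, "Aa": 0.5, "aa": 0.0}


def offspring_distribution(father, mother, tau=MENDELIAN):
    """Offspring genotype probabilities given parental genotypes.

    tau[g] is the probability that a parent with genotype g transmits allele A.
    """
    tf, tm = tau[father], tau[mother]
    return {
        "AA": tf * tm,
        "Aa": tf * (1.0 - tm) + (1.0 - tf) * tm,
        "aa": (1.0 - tf) * (1.0 - tm),
    }


# Mendelian segregation for two heterozygous parents.
print(offspring_distribution("Aa", "Aa"))

# Hypothetical unrestricted (non-Mendelian) model: tau for Aa is free.
unrestricted = {"AA": 1.0, "Aa": 0.2, "aa": 0.0}
print(offspring_distribution("Aa", "aa", unrestricted))

# Equal transmission: offspring genotype no longer depends on the parents.
equal = {"AA": 0.3, "Aa": 0.3, "aa": 0.3}
print(offspring_distribution("AA", "aa", equal))
print(offspring_distribution("aa", "Aa", equal))
```

With equal τ values the two final calls return the same distribution, which is why equal transmission corresponds to the absence of parent-to-offspring segregation.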


Among the 16 650 family members from the 751 families, 753 were affected with VTE, of whom 62% were first- and 38% were second- or higher-degree relatives, respectively (Table 1). Of the 753 affected relatives, 466 (62%) were women. Among the 287 affected males, 60% and 40% were first- and second- or higher-degree relatives, respectively. Among 466 affected females, 63% and 37% were first- and second- or higher-degree relatives, respectively. At the time of questionnaire completion, the mean ± SD proband age was 56 ± 16 (range 9–103) years, while among the 9439 family members with age data available, the mean ± SD age was 47 ± 26 (range 0–105) years.

Table 1.  Distribution of family members by relationship to proband, gender, and venous thromboembolism affection

Relative type   Relationship    Total*  Affected (%)  Male                        Female
                to proband                            Total (%)     Affected (%)  Total (%)     Affected (%)
First-degree    Parents         1468    230 (15.7)    734 (50.0)    93 (12.7)     734 (50.0)    137 (18.7)
                Siblings        2146    183 (8.5)     1122 (52.3)   66 (5.9)      1024 (47.7)   117 (11.4)
                Offspring       1743    53 (3.0)      854 (49.0)    14 (1.6)      889 (51.0)    39 (4.4)
                Total           5357    466 (8.7)     2710 (50.6)   173 (6.4)     2647 (49.4)   293 (11.1)
Second-degree   Grandparents    2311    112 (4.9)     1156 (50.0)   44 (3.8)      1155 (50.0)   68 (5.9)
                Aunts/uncles    4379    100 (2.3)     2246 (51.3)   40 (1.8)      2133 (48.7)   60 (2.8)
                Other           4603    75 (1.6)      2507 (54.5)   30 (1.2)      2096 (45.5)   45 (2.2)
                Total           11293   287 (2.5)     5909 (52.3)   114 (1.9)     5384 (47.7)   173 (3.2)
Overall total                   16650   753 (4.5)     8619 (51.8)   287 (3.3)     8031 (48.2)   466 (5.8)

* With complete information on cancer status.

VTE incidence rates from 1966 to 1995 are shown in Table 2. As previously noted [1], VTE incidence increased with age and was generally higher in men except for the childbearing years among women. Overall VTE incidence from 1966 to 1995 was 125.8 per 100 000 (age- and sex-adjusted to 1990 US white population).

Table 2.  Venous thromboembolism incidence rates for the non-gender and gender-specific models

Age, years (liability class)    Incidence per person-year

Tables 3 and 4 present the segregation analysis results for the 751 families. We performed the analyses in two ways: first with non-gender-specific incidence rates (Table 3), and second with gender-specific incidence rates (Table 4). The sporadic model was rejected in both analyses. Because of a significant gender difference, we discuss only the results obtained under gender-specific incidence rate models (Table 4). Among all the remaining models tested, the general (unrestricted) inheritance model fit the observed data best (lowest AIC value), with an estimated heritability of 0.52 (with unrestricted τAa being zero for M–F and F–M transmission, and non-zero for F–F and M–M transmission). For all Mendelian major locus model analyses except the recessive model, the estimated disease allele frequency was < 0.05, suggesting an effect of a major gene. In all Mendelian mixed model analyses, the estimated disease allele frequency was > 0.20, which may suggest an effect of several minor genes. The mixed dominant model was the best-fitting of the Mendelian models. The heritability of VTE under this model was estimated to be 0.62, with a major disease allele frequency of 0.25.
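The AIC-based ranking of the sporadic, polygenic and Mendelian models can be reproduced from the −2 ln L values and parameter counts reported in Table 4. The sketch below is illustrative only: the dictionary keys and the function name are assumptions, and the unrestricted models are left out because they are compared separately in the text.

```python
# (-2 ln L, number of parameters) as reported in Table 4.
models = {
    "sporadic": (8841.890, 0),
    "major locus, dominant": (8167.103, 3),
    "major locus, additive": (8148.767, 3),
    "major locus, recessive": (8164.014, 3),
    "major locus, codominant": (8095.654, 5),
    "mixed, dominant": (7791.220, 4),
    "mixed, additive": (7958.192, 4),
    "mixed, recessive": (7958.192, 4),
    "mixed, codominant": (7858.719, 6),
    "polygenic": (8158.497, 1),
}


def rank_by_aic(candidates):
    """Return (AIC, model name) pairs sorted from best (lowest AIC) to worst."""
    scored = [(neg2_ll + 2 * k, name) for name, (neg2_ll, k) in candidates.items()]
    return sorted(scored)


for score, name in rank_by_aic(models):
    print(f"{name:26s} AIC = {score:.3f}")
```

The ranking places the mixed dominant model first and the sporadic model last, in line with the model preferences described above.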

Table 3.  Segregation analysis for all venous thromboembolism families using a non-gender-specific model

Model                    τAa      p       d        t       h²       −2ln L    AIC       Number of parameters
Major locus
Unrestricted τAa
  Fixed heritability     (0.000)  0.1834  0.045    1.19    [0.000]  7722.026  7730.026  4
  Maximize heritability  0.158    0.5346  (0.000)  104.77  0.569    7543.650  7553.650  5

Square brackets denote the parameter that was fixed; parentheses denote that the parameter went to a bound.

Table 4.  Segregation analysis for all venous thromboembolism families using a gender-specific model

Model                   τAa      p         dM       dF       tM      tF      h²       −2ln L    AIC       Number of parameters
Sporadic                         [0.0000]                                    [0.00]   8841.890  8841.890  0
Major locus
  Dominant              [0.5]    0.0081    [1.000]  [1.000]  1.17    1.78    [0.00]   8167.103  8173.103  3
  Additive              [0.5]    0.0144    [0.500]  [0.500]  2.11    3.21    [0.00]   8148.767  8154.767  3
  Recessive             [0.5]    0.1123    [0.000]  [0.000]  1.66    2.43    [0.00]   8164.014  8170.014  3
  Codominant            [0.5]    0.0456    0.193    0.331    2.55    3.47    [0.00]   8095.654  8105.654  5
Mixed model
  Dominant              [0.5]    0.2518    [1.000]  [1.000]  38.81   23.04   0.619    7791.220  7799.220  4
  Additive              [0.5]    0.4475    [0.500]  [0.500]  (0.00)  94.94   0.563    7958.192  7966.192  4
  Recessive             [0.5]    0.4475    [0.000]  [0.000]  (0.00)  32.08   0.563    7958.192  7966.192  4
  Codominant            [0.5]    0.6374    (0.000)  (0.000)  32.12   29.57   0.596    7858.719  7870.719  6
Polygenic                        [0.0000]                                    0.476    8158.497  8160.497  1
Unrestricted τAa
  Fixed heritability
    F–F                 (0.000)  0.1811    (0.000)  (0.000)  0.86    1.42    [0.000]  7660.698  7678.698  9
  Maximize heritability

Square brackets denote the parameter that was fixed; parentheses denote that the parameter went to a bound.


This study demonstrates that VTE is not caused solely by environmental exposures. Genetic factors clearly account for a significant percentage of familial cases of VTE. Our study is the first family-based study strictly limited to the VTE phenotypes (deep vein thrombosis or pulmonary embolism). Our findings are consistent with those from the Genetic Analysis of Idiopathic Thrombophilia (GAIT) Study, a family-based study among Spanish families [17,18]. The estimated heritabilities of thrombosis (arterial or venous) in the GAIT study and of VTE in our study were similar (0.61 for the GAIT study vs. 0.52–0.62 for our study). However, in the GAIT study, family members were counted as affected with thrombosis if they had VTE, superficial vein thrombosis, or arterial thrombosis (e.g. stroke or myocardial infarction). When the diagnoses considered were restricted to VTE (i.e. arterial thromboses were not counted), the heritability was not significantly changed.

Our study is also the first family-based study to assess the potential mode of VTE inheritance. Among all models tested, the general or unrestricted model fits the observed data best, suggesting a genetically complex mode of inheritance that involves no major disease susceptibility gene. Among the Mendelian models tested, a mixed dominant mode of inheritance was preferred, suggesting both a major disease susceptibility gene (with an allele frequency = 0.25) as well as multiple additional modifier genes (polygenes). Because the heritabilities for these two models did not reach 1.0, an additional effect of environmental exposures is suggested. These findings support the hypothesis that VTE is a complex (multifactorial) disease caused by multigenic action as well as environmental exposures [19,20].

Studies of a large French-Canadian kindred (n = 710) with type I protein C deficiency and a high incidence of VTE also suggest the presence of an additional thrombosis-susceptibility gene(s) [21]. This kindred contains a founder mutation in the protein C gene (3363 insertion C; protein C Vermont IIb) that ultimately causes a frameshift and truncation of the mature protein [22,23]. Of particular note, thrombosis in protein C-deficient family members segregates in certain branches of the family and not in others, although all branches carry the same mutation [21,22,24]. This incomplete penetrance suggests that additional genes interact with protein C deficiency to potentiate thrombosis in certain branches of the family. A two-locus segregation analysis supported the presence of a single, interacting thrombosis-susceptibility gene (modifying locus) segregating in this family [24], although as yet this susceptibility gene has not been identified [25–27].

Further support for the multigenic inheritance of VTE comes from studies showing that many hemostasis-related plasma protein concentrations (traits) both correlate with thrombosis and show a high degree of heritability [17,18,28,29]. In the GAIT study, after accounting for the effects of age, gender, smoking, and oral contraceptives, essentially all of the hemostasis-related traits studied showed significant genetic components, ranging from 17% to 83% of the residual phenotypic variability [17,18]. For most traits, genes were the largest identifiable determinant of quantitative variation. Similarly, in a study of 1002 female twins from the St Thomas' UK Adult Twin Registry, genetic factors contributed to 38–82% of the variation in concentrations of plasma hemostasis-related proteins [28,30]. These results have recently been corroborated in the French-Canadian protein C Vermont IIb kindred [29,31], and several quantitative trait loci have been identified [32–35].

It is important to address the potential limitations of our study. One possible source of bias was the ascertainment sampling scheme. However, we controlled for ascertainment bias in our analyses. While we made every effort to identify all family members affected with documented VTE, VTE is mainly a disease of older age and many of the probands were older at the time of ascertainment. Thus, the grandparents and parents of the probands were often either deceased or were clinically diagnosed with VTE because objective diagnostic testing was unavailable. These factors may have led to either under- or overreporting, respectively. Similarly, many of the offspring had not yet reached the age range of common, late-onset VTE. These unrecognized or unreported affected family members may have reduced the power of our study. The finding of the mixed dominant model as the best Mendelian model could be due to referral bias (lack of reliable information on the second- and higher-degree relatives) or the effect of genetic heterogeneity. Genetic heterogeneity may affect the ability to identify genetic and non-genetic factors associated with the risk of disease, since the phenotype alone may not be able to distinguish etiologically distinct subgroups. For example, age at onset may be one factor that distinguishes subgroups of VTE. However, in a subset analysis restricted to the families of probands referred over the initial 3-year study period, 1998–2000 (n = 588), we found essentially the same degree of heritability and mode of inheritance (data not shown).

In summary, we have demonstrated that VTE is not solely due to environmental exposures. VTE is highly heritable and probably a result of multigenic action as well as environmental exposures. These findings support a family-based linkage analysis approach to the discovery of VTE disease-susceptibility genes.


Funded, in part, by grants from the Doris Duke Charitable Foundation; the National Institutes of Health (HL66216), US Public Health Service; and by Mayo Foundation.