Identifying novel biomarkers for cardiovascular disease risk prediction


  • Y. Ge
    Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  • T. J. Wang (corresponding author)
    Cardiology Division, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
    Correspondence: Thomas J. Wang, MD, Cardiology Division, GRB-800, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, USA (fax: +1 617 726 4105).


Abstract

The primary prevention of cardiovascular disease relies on the ability to identify at-risk individuals long before the development of overt events. In the past decade, research into circulating, genetic and imaging biomarkers to augment traditional methods of risk prediction has achieved only modest success. Emerging technologies in the fields of genomics, metabolomics and proteomics are providing new platforms for biomarker discovery. Here, we review current concepts in the evaluation and discovery of cardiovascular biomarkers. Further research is needed to identify new biomarkers to successfully stratify risk of cardiovascular disease in low-risk populations, as well as to test whether management strategies informed by biomarker testing are better than standard of care.

Introduction: do we need novel biomarkers in primary cardiovascular disease risk prediction?

Despite recent trends of decreased incidence and improved outcomes in acute myocardial infarction [1], cardiovascular disease remains a major health concern in industrialized countries. For instance, in the USA, cardiovascular disease remains the leading cause of death by a large margin [2]. Thus, improved strategies for the primary prevention of cardiovascular disease are a public health priority. An important challenge to implementing such strategies in a cost-effective manner is the limited predictive value of current risk-assessment models. This ‘detection gap’ is illustrated by the observation that traditional risk factors (cigarette smoking, diabetes, hyperlipidaemia and hypertension) are identified in only a subset of individuals who develop coronary events. Up to 20% of patients with coronary disease have no traditional risk factors, and 40% have only one [3, 4]. Furthermore, on a population level, adults categorized as at low or intermediate risk of disease by traditional criteria are responsible for the majority of the risk burden [5]. Thus, identifying novel risk markers for cardiovascular disease has significant potential to improve the selection of individuals for preventative strategies. In this review, we will address the discovery and evaluation of novel biomarkers, focusing on both current evidence and newer platforms for biomarker discovery.

What are biomarkers and how do we evaluate them?

Any molecular, cellular, tissue or imaging measurement of a physiological, pathological or therapeutic response can be considered a biomarker [6]. To be considered useful, biomarkers must meet the following criteria: (i) accuracy, that is, the ability to identify individuals at risk; (ii) reliability, that is, stability of results when repeated; and (iii) therapeutic impact with early intervention [7]. In primary cardiovascular disease risk prediction, the existence of established clinical risk-assessment models such as the Framingham risk score means that a novel marker should also provide incremental information over and above existing algorithms [8].

The process of biomarker discovery and validation has been well described in the field of cancer research. In the first step, new target biomarkers are identified, or old ones are refined in a preclinical phase, leading to assay validation in individuals with and without disease. Second, retrospective repository studies are used to determine whether the biomarker in case subjects deviates from that in control subjects to establish the threshold for a positive screening result. Third, prospective screening studies are applied to large cohorts. Finally, the biomarker is validated as a disease control tool during randomized controlled trials [9]. So far, in cardiovascular disease risk prevention, all studied biomarkers have been limited to the first three steps, with the exception of C-reactive protein (CRP).

In each of the steps involved in biomarker discovery, important statistical concepts apply. Relative risk indicators, such as odds ratios or risk ratios, are the most frequently reported measures of association. These indicators measure the strength of association between a measurement and an outcome, but do not provide direct information as to whether a biomarker affects risk prediction. This is because the distributions of biomarker levels in cases and controls typically overlap substantially even when risk ratios are high, such that no cut-off value can achieve both high sensitivity and high specificity [10]. The ability to differentiate between cases and controls is referred to as discrimination. One measure of discrimination is the c-statistic, which is equivalent to the area under the receiver operating characteristic (ROC) curve. An ideal biomarker would have perfect sensitivity and specificity; in practice, however, an increase in one comes at the expense of the other. The ROC curve, a plot of sensitivity vs. 1-specificity (the false-positive rate), offers a summary across all possible threshold values. The c-statistic therefore condenses the trade-off between the sensitivity and specificity of a prediction model into a single statistic. It represents the probability that, in a random case–control pair, the model will assign a higher predicted value to the case. Thus, the c-statistic ranges from 0.5 (no better than random guessing) to 1 (perfect discrimination) [11].
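The pairwise interpretation of the c-statistic can be made concrete with a short calculation. The following is a minimal sketch in Python, not a method taken from any of the cited studies; the function name and the predicted risks are invented for illustration, and ties are counted as half-concordant, which is a common convention assumed here.

```python
from itertools import product


def c_statistic(case_scores, control_scores):
    """Fraction of case-control pairs in which the case has the higher score.

    Ties are counted as half-concordant (an assumed, commonly used convention).
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            concordant += 1.0
        elif case == control:
            concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical predicted 10-year risks, invented for illustration only.
cases = [0.30, 0.22, 0.15, 0.08]
controls = [0.20, 0.10, 0.07, 0.05, 0.03]

# 17 of the 20 case-control pairs are concordant.
print(c_statistic(cases, controls))
```

With these made-up values, 17 of 20 pairs rank the case above the control, so the sketch yields a c-statistic of 0.85.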

Despite its usefulness, there are important limitations to the c-statistic when used alone to evaluate risk prediction models. Because it is a rank-based measure, the c-statistic does not provide any information on the ability of a model to predict probabilities. For example, a model that would always assign 0.6 to cases and 0.5 to controls would have perfect discrimination, but the probabilities themselves would be meaningless. In a clinical context, the assigned probabilities are often important for decision-making, which relies on balancing predicted health benefits with the probability of treatment-related toxicities [11]. It is also worth noting that, in contrast to classical understanding, both sensitivity and specificity may vary with disease prevalence and clinical features [12, 13]. The c-statistic of a model may therefore depend upon the characteristics of the studied population. Finally, there is increasing recognition that even modest improvements in c-statistics require extremely strong associations between a novel biomarker and an outcome (e.g. odds ratios of 20 or more) [14, 15]. Because most novel markers fail to achieve this level of association, other statistical methods have been developed to assess their usefulness.
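The example of a model that always assigns 0.6 to cases and 0.5 to controls can be checked numerically. The sketch below is illustrative only: the cohort of 10 people with 2 events is hypothetical, and the simple difference between mean predicted risk and observed event rate is used as an assumed, crude stand-in for a formal calibration measure.

```python
def c_statistic(case_scores, control_scores):
    """Fraction of case-control pairs ranked correctly (ties count one half)."""
    pairs = [(a, b) for a in case_scores for b in control_scores]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a, b in pairs)
    return wins / len(pairs)


def calibration_gap(predicted, outcomes):
    """Mean predicted risk minus observed event rate (0 would mean agreement)."""
    return sum(predicted) / len(predicted) - sum(outcomes) / len(outcomes)


# Hypothetical cohort: 2 events among 10 people (observed event rate 20%).
outcomes = [1, 1] + [0] * 8
# The model from the text: 0.6 for every case, 0.5 for every control.
predicted = [0.6 if y == 1 else 0.5 for y in outcomes]

cases = [p for p, y in zip(predicted, outcomes) if y == 1]
controls = [p for p, y in zip(predicted, outcomes) if y == 0]

print(c_statistic(cases, controls))          # every case outranks every control
print(calibration_gap(predicted, outcomes))  # predicted risks far exceed observed rate
```

In this toy cohort the ranking is perfect, yet the average predicted risk (about 52%) is far from the observed event rate (20%), which is the sense in which the probabilities carry little meaning.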

Reclassification is a newer statistical concept that explores whether the incorporation of a biomarker leads to a change in assigned risk. For cardiovascular disease, individuals may be classified as having low, intermediate or high risk, depending on their 10-year Framingham risk score [16]. A biomarker may reclassify individuals from one risk category to another. However, not all reclassification is useful. For example, reclassifying individuals at high risk of disease to an intermediate- or low-risk category may be helpful if they do not experience an event, but may be considered harmful if they do. This concept of ‘rightly’ and ‘wrongly’ reclassified individuals underlies a measure termed the net reclassification improvement (NRI) [15]. The NRI is the sum of two net proportions: among individuals who experience an event, the proportion reclassified upward minus the proportion reclassified downward; and among individuals who do not, the proportion reclassified downward minus the proportion reclassified upward. Another measure, the integrated discrimination improvement (IDI), may be regarded as a continuous version of the NRI, with reclassification measured in terms of differences in predicted probabilities.
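The definitions of the NRI and IDI can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a reference implementation: risk categories are coded as ordered integers (0 = low, 1 = intermediate, 2 = high), the function names are hypothetical, and all data are invented.

```python
def net_reclassification_improvement(old_cat, new_cat, event):
    """Categorical NRI with categories coded as ordered integers."""
    net_up = {True: 0, False: 0}   # upward moves minus downward moves, per group
    size = {True: 0, False: 0}
    for old, new, had_event in zip(old_cat, new_cat, event):
        group = bool(had_event)
        size[group] += 1
        net_up[group] += (new > old) - (new < old)
    if 0 in size.values():
        raise ValueError("need at least one event and one nonevent")
    # Upward moves are 'right' for events; downward moves are 'right' for nonevents.
    return net_up[True] / size[True] - net_up[False] / size[False]


def integrated_discrimination_improvement(old_prob, new_prob, event):
    """Mean change in predicted risk among events minus that among nonevents."""
    def mean_change(flag):
        changes = [n - o for o, n, e in zip(old_prob, new_prob, event)
                   if bool(e) == flag]
        return sum(changes) / len(changes)
    return mean_change(True) - mean_change(False)


# Hypothetical cohort: 4 events followed by 6 nonevents.
event = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
old_cat = [1, 1, 2, 0, 1, 1, 0, 2, 1, 0]
new_cat = [2, 1, 2, 1, 0, 1, 0, 1, 2, 0]
print(net_reclassification_improvement(old_cat, new_cat, event))

# Hypothetical predicted probabilities for 2 events and 2 nonevents.
print(integrated_discrimination_improvement(
    [0.2, 0.2, 0.1, 0.1], [0.3, 0.3, 0.05, 0.05], [1, 1, 0, 0]))
```

In the invented cohort, two of four events move up (net +0.5) and, among six nonevents, two move down and one moves up (net +1/6), so the NRI is about 0.67.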

Despite its intuitive appeal, there are still practical limits to the use of the NRI. For example, reclassifying an individual at low risk of disease to an intermediate-risk category may have the same effect upon the NRI as reclassifying an individual at intermediate risk to a high-risk category. Yet, the clinical implications may be quite different. Similarly, incorrect down-classification of a few individuals at high risk may result in significantly more clinical harm than the benefit gained from correct up-classification of many individuals from low to intermediate risk. Two modifications to the NRI have been proposed to address these limitations. First, the ‘clinical NRI’ examines reclassification only for individuals starting in the intermediate-risk group, with the assumption that these are individuals in whom clinical decision-making is most likely to be altered [17]. The other modification is a ‘weighted NRI’, which takes into consideration factors such as perceived costs or savings [18].
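The restriction that defines the clinical NRI can be sketched as follows. This is an illustrative reading of the description above, not the exact procedure of reference [17]; the integer category coding, the function name and the data are all assumptions made for the example.

```python
LOW, INTERMEDIATE, HIGH = 0, 1, 2  # assumed integer coding of risk categories


def clinical_nri(old_cat, new_cat, event, start=INTERMEDIATE):
    """NRI computed only among individuals whose original category is `start`."""
    net_up = {True: 0, False: 0}   # upward moves minus downward moves, per group
    size = {True: 0, False: 0}
    for old, new, had_event in zip(old_cat, new_cat, event):
        if old != start:
            continue  # individuals starting at low or high risk are ignored
        group = bool(had_event)
        size[group] += 1
        net_up[group] += (new > old) - (new < old)
    if 0 in size.values():
        raise ValueError("need an event and a nonevent in the starting category")
    return net_up[True] / size[True] - net_up[False] / size[False]


# Hypothetical cohort: 4 events followed by 6 nonevents.
event = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
old_cat = [1, 1, 2, 0, 1, 1, 0, 2, 1, 0]
new_cat = [2, 1, 2, 1, 0, 1, 0, 1, 2, 0]
print(clinical_nri(old_cat, new_cat, event))
```

Only five of the ten invented individuals start at intermediate risk, so the result (0.5 here) can differ noticeably from the NRI computed on the whole cohort.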

What is the current evidence to support the use of novel cardiovascular biomarkers?

Biomarker research in cardiology has largely focused on circulating and imaging biomarkers. The American College of Cardiology Foundation and the American Heart Association (ACCF/AHA) recently released guidelines on the use of existing markers [19]. A summary of the current ACCF/AHA recommendations is presented in Table 1. Despite the large number of studies examining a host of candidate biomarkers, only the assessment of family history of cardiovascular disease was granted a class I recommendation, the highest possible level. Key biomarkers, such as CRP and B-type natriuretic peptide (BNP), have been shown in general to only modestly improve discrimination and reclassification: approximately 0.02 increment in c-statistic and 3% to 5% NRI [20-24]. The literature pertaining to CRP, BNP and other commonly studied biomarkers has been reviewed in detail previously [25-27].

Table 1. Biomarkers for primary cardiovascular disease risk prediction
| Biomarker | ACCF/AHA 2010 recommendation | Discrimination (c-statistic) | NRI | Reference |
|---|---|---|---|---|
| Lipoprotein and apolipoprotein | III | No improvement | No improvement | [60] |
| Natriuretic peptide | III | No improvement to +0.03 | a | [24] |
| C-reactive protein | To help determine use of statins in men >50 years, women >60 years of age and LDL < 130 mg dL−1; Intermediate-risk adults | No improvement to +0.04 | No improvement to ~5% | [20-23] |
| Lipoprotein-associated phospholipase A2 | Intermediate-risk adults | +0.006 to +0.02 | Not statistically significant | [61-63] |
| Haemoglobin A1C | In adults without known diabetes | | ~5% | [64] |
| Urinary albumin | In adults with diabetes or hypertension; Intermediate-risk adults | +0.01 | 2% men, 13% women in a high-risk population | [65, 66] |
| Resting ECG | In adults with diabetes or hypertension; Asymptomatic adults | +0.02 to +0.05 | ~7% | [67, 68] |
| Exercise ECG | Intermediate-risk adults | +0.03 (for all-cause mortality) | | [69] |
| [biomarker name missing] | Asymptomatic adult with hypertension; Asymptomatic adults | | | |
| Stress echocardiogram | III | | | |
| Flow-mediated dilatation | III | No improvement | | [70] |
| Ankle–brachial index | Intermediate-risk adults | No improvement to +0.05 | No improvement | [71, 72] |
| Carotid intima–media thickness | Intermediate-risk adults | +0.01 to +0.02 | ~8–10% | [73-75] |
| Coronary calcium score | Intermediate-risk adults; Low- to intermediate-risk adults | +0.04 to +0.05 | ~14–25% | [76-78] |
| Myocardial perfusion imaging | High-risk adults; Low- or intermediate-risk adults | | | |
| Family history | Ib | +0.01 | | [79] |

a Reclassification data are only available for natriuretic peptide within a multiple biomarker score.

Since the publication of the 2010 ACCF/AHA guidelines, newer circulating biomarkers have been evaluated. For instance, the Dallas Heart Study has shown that high-sensitivity troponin T (hsTnT) is detectable in up to a quarter of middle-aged men and women [28]. Increasing levels of circulating hsTnT were correlated with an increase in cardiovascular mortality. Addition of this marker to a traditional risk factor model improved the c-statistic by about 0.04. Another novel biomarker under investigation, growth-differentiation factor-15 (GDF-15), is expressed by cardiomyocytes under conditions of ischaemia or pressure-related stress [29]. Adding GDF-15 to a traditional risk factor model moderately improves prediction of all-cause mortality, with an increase in the c-statistic of about 0.01 and an NRI of 6%.

One potential solution to address the limited value of individual biomarkers is to combine them into a ‘multimarker’ score. Intuitively, this approach makes sense. Combining biomarkers from different pathophysiological pathways improves the odds that they are not correlated and are thus able to add predictive value on top of each other. Practically, however, this approach has had only modest success in the few large studies that have examined the strategy.

One of the earliest large studies incorporating the use of multiple biomarkers was from the Framingham Offspring Study [30]. Ten biomarkers were selected based on biological plausibility and known association with cardiovascular disease risk. In backward elimination models, five were retained as predictors of death (BNP, CRP, urinary albumin/creatinine ratio, homocysteine and renin), and two were retained as predictors of major cardiovascular events (BNP and urinary albumin/creatinine ratio). A multimarker score in the top quintile was associated with a fourfold increase in all-cause mortality and a twofold increase in cardiovascular events. The improvement in c-statistic, however, was small: 0.02 for mortality and 0.01 for cardiovascular events (Fig. 1).

Figure 1.

ROC curve of major cardiovascular events at 5 years for both a traditional risk factor model and one with the addition of biomarkers. Adapted with permission from Wang et al. [30].

Similar conclusions were reached from the results of another large study, which was based on the Swedish Malmö Diet and Cancer cohort [20]. In predicting first coronary events, only two of six biomarkers were retained in backward elimination models: BNP and mid-region pro-adrenomedullin, a peptide with important vasodilatory functions. A top quartile multimarker score was associated with a twofold higher risk of coronary events, but with only minimal change in the c-statistic (~0.01). The NRI was nonsignificant for coronary events in the analysed sample. When restricting the analysis to intermediate-risk groups only, the clinical NRI was 15%. However, this was largely driven by down-classification (19%) rather than up-classification (4%) of individuals at intermediate risk.

In the Uppsala Longitudinal Study of Adult Men, troponin I, NT-proBNP, CRP and cystatin C were added to a traditional risk factor model in a population of elderly men (mean age 71 years) [31]. The addition of these biomarkers improved the c-statistic substantially for both cardiovascular and all-cause mortality (0.10 and 0.07). However, 42% of the cohort had known prior cardiovascular disease. When restricting the analysis to participants without a history of cardiovascular disease, results were more modest, with a 0.06 improvement in c-statistic for cardiovascular mortality and a nonsignificant improvement for all-cause death. Nonetheless, the NRI in that study was high (26%). The likely explanation for the discrepancy between the Uppsala results and those from prior studies was the homogeneity of the study sample (all men, same age). Because age is the strongest predictor in the traditional risk factor model, eliminating variation in age decreases the discriminative ability of the baseline model, as evidenced by a c-statistic value of 0.66. Thus, there is a larger ‘window’ to improve the c-statistics and other performance measures.

Recently, two other studies incorporated an even larger number of candidate biomarkers. Blankenberg et al. [32] reported the results of addition of 30 biomarkers to a traditional risk model in the MORGAM study. These biomarkers reflected the numerous physiological and pathological pathways leading to atherosclerosis, including for example pathways involved in inflammation, lipid metabolism, vascular and renal function, oxidative stress and myocardial damage. None of the biomarkers alone improved discrimination significantly, although an optimal combination of NT-proBNP, CRP and troponin I improved the c-statistic by 0.03. This was accompanied by an NRI of 11%, approximately half of which was due to correct up-classification of events and half to down-classification of nonevents. In a case–control study derived from the Women's Health Initiative cohort, five biomarkers (out of 18) were identified that were associated with inflammation, oxidative stress and haemostasis. The authors noted moderate improvements in the c-statistic (0.02) and NRI (6.5%) [33].

Are serial biomarker measurements useful?

An ideal biomarker, in addition to identifying the at-risk population, would be amenable to serial measurement. Modification of risk factors would yield a change in the measured marker, providing a surrogate for effective therapy. Key examples include low-density lipoprotein (LDL) cholesterol and blood pressure. LDL cholesterol plays a causal role in the development of atherosclerosis [34], and clinical trials have shown that lowering of LDL results in decreased risk of cardiovascular events in a dose-dependent manner [35]. Thus, LDL is an integral part of cardiovascular disease risk assessment as well as a target for pharmacological therapy [36].

There is little evidence to support serial measurements of circulating biomarkers other than LDL. CRP, one of the most studied markers of systemic inflammation, has been shown to be strongly associated with cardiovascular disease [37]. Statin therapy reduces both cardiovascular disease risk and CRP levels, but it remains uncertain whether targeting CRP levels per se improves outcomes. In the Justification for the Use of Statins in Primary Prevention: An Intervention Trial Evaluating Rosuvastatin (JUPITER) trial, rosuvastatin was shown to decrease overall cardiovascular events by 44% and CRP level by 37%, in a population with a baseline LDL < 130 mg dL−1 and CRP concentration >2 mg L−1 [38]. However, this was also accompanied by a 50% reduction in LDL level. A recent study from the Heart Protection Study Collaborative Group included individuals at high risk of vascular events [39]. Statins were again shown to reduce cardiovascular events and levels of LDL and CRP. When stratified by CRP level, however, the risk reduction was consistent across the groups, showing that elevated baseline CRP levels did not predict greater risk reduction. Experimental and genetic studies attempting to address whether CRP directly contributes to the development of atherosclerosis have yielded mixed results. For instance, the results of Mendelian randomization studies do not support a link between genetic variants determining CRP levels and cardiovascular disease risk, which suggests that CRP is a marker rather than a cause of atherosclerosis [40].

With regard to imaging biomarkers, measurement of carotid intima–media thickness (CIMT) has long been of interest for its value in tracking disease progression. CIMT has been used widely as an intermediate end-point in clinical trials. A meta-analysis of trials of statin therapy showed that treatment was associated with decreased progression of CIMT by 0.012 mm year−1 and a 50% reduced risk of overall cardiovascular events [41]. However, inter-operator measurement variability may limit the utility of this tool in clinical settings. A formal training and certification process for both ultrasonographers and readers has yet to be established outside of clinical trials [42].

Although coronary artery calcium (CAC) is a strong biomarker of cardiovascular disease risk, the value of serial CAC measurements in primary prevention is unclear. In randomized trials, statin therapy, compared to placebo, has been shown to have little effect upon the rate of CAC progression [43, 44]. Thus, although CAC may improve risk stratification of individuals at baseline intermediate risk and serve as a potential motivational tool for risk factor reduction [45], it may be of limited value when measured serially.

Current and future research directions

Traditional biomarker discovery has focused on identifying signals in known pathways leading to coronary disease, such as those involved in lipid metabolism or inflammation. Although such approaches can provide valuable biological insight, they may have relatively limited predictive utility. As shown in Fig. 2, the addition of ‘correlated’ biomarkers (e.g. as would be typical for biomarkers derived from common pathways) results in relatively little change in the c-statistic [26]. On the other hand, the addition of ‘uncorrelated’ biomarkers promotes greater improvements in discrimination.

Figure 2.

Hypothetical c-statistic as a function of a model containing traditional risk factors and a variable number of new biomarkers, each with hazard ratio of 1.35 for cardiovascular events per SD increment in biomarker. The curves simulate three scenarios of inter-marker correlation (r). Adapted with permission from Wang [26].
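The qualitative behaviour described for correlated and uncorrelated biomarkers can be reproduced with a deliberately simplified calculation. The sketch below is not the simulation behind Fig. 2: it assumes no traditional risk factors, normally distributed markers with equal variance in cases and controls, an identical standardized case–control difference for every marker, and a common pairwise correlation r. Under those assumptions the best linear combination of k markers has a c-statistic of Φ(d/√2), where d² = kδ²/(1 + (k − 1)r). The effect size of 0.3 SD and the correlation values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def combined_c_statistic(n_markers, effect_per_sd, correlation):
    """c-statistic of the optimal linear score for equicorrelated normal markers.

    effect_per_sd: case-control mean difference of each marker, in SD units.
    correlation:   common pairwise correlation r between markers (0 <= r < 1).
    """
    if n_markers == 0:
        return 0.5  # no markers, no discrimination
    d_squared = n_markers * effect_per_sd ** 2 / (1 + (n_markers - 1) * correlation)
    return NormalDist().cdf(sqrt(d_squared / 2))


# Hypothetical scenarios: each marker separates cases and controls by 0.3 SD.
for r in (0.0, 0.2, 0.4):
    row = [round(combined_c_statistic(k, 0.3, r), 3) for k in (1, 5, 10, 20)]
    print(f"r = {r}: {row}")
```

In this toy model the gain from each additional marker shrinks quickly once the markers are correlated, because d² approaches a ceiling of δ²/r as k grows, whereas for uncorrelated markers it keeps increasing in proportion to k.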

The availability of new molecular tools for investigating multiple biological pathways in a relatively unbiased manner introduces the possibility of uncovering biomarkers from new pathways (i.e. uncorrelated to existing measures) [46]. For instance, mass spectrometry–based methods allow the global assessment of hundreds of proteins (proteomics) or small molecules (metabolomics) in blood samples. Metabolomics has proven to be a more tractable initial platform for biomarker discovery than proteomics, because the number of circulating small molecules (probably in the thousands) is several orders of magnitude lower than the number of circulating proteins (probably in the millions) [46].

Shah et al. [47] recently identified and validated branched-chain amino acid and urea cycle metabolites as predictors of cardiovascular disease, with elevations in each associated with a 30–40% increased risk. Models that incorporated principal components for the two measures showed a modest improvement in discrimination, with an increase in c-statistic of 0.02–0.03. Wang et al. [48] identified three phospholipid metabolites associated with cardiovascular disease in a cohort of patients who underwent elective cardiac catheterization. In a murine model, supplementation of these metabolites was shown to upregulate macrophage scavenger receptors linked to atherosclerosis. In a matched case–control study within the Framingham Offspring Study, five branched-chain and aromatic amino acids were identified that predicted diabetes up to 12 years before the onset of disease [49]. A combination score from the three amino acids with the strongest association (isoleucine, tyrosine and phenylalanine) was related to a more than fivefold increased risk of diabetes (top vs. bottom quartile).

Mapping of the human genome and subsequent genome-wide association studies have permitted the identification of single-nucleotide polymorphisms (SNPs) associated with cardiovascular disease and risk factor phenotypes. To date, more than 30 genetic loci have been identified, with per allele increased risks ranging from 5% to 30% [50, 51]. However, the mechanisms linking most of these SNPs with cardiovascular disease risk are unclear. For example, the role played by the 9p21 locus, first identified in 2007 and possessing the most robust association with coronary disease (odds ratio 1.29), remains under investigation [52]. This region of interest, although containing no genes of its own, has been shown to play a regulatory role in downstream expression of two genes involved in vascular smooth muscle proliferation [53] and response to inflammatory signals [54].

Application of genetic loci as novel biomarkers for risk stratification has yielded mixed results, with some [55, 56] but not all [57, 58] studies showing modest improvement over traditional risk factors. In the Malmö Diet and Cancer Study, a genetic score composed of nine SNPs associated with LDL or high-density lipoprotein cholesterol was shown to improve NRI by 6%, compared with a traditional risk model, without affecting the c-statistic [56]. Ripatti et al. [58] examined 13 loci associated with coronary heart disease, which included 9p21. Although participants in the top quintile of the genetic score had a 66% greater risk of coronary heart disease events, there was no overall improvement in c-statistic or reclassification.
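A genetic score of the kind discussed above is often built as a weighted count of risk alleles. The sketch below assumes an additive model on the log-odds scale, which is one common construction and not necessarily the one used in the cited studies. The locus names are placeholders; the odds ratio of 1.29 is the 9p21 estimate quoted earlier, and the other two odds ratios are hypothetical values within the 5% to 30% range mentioned in the text.

```python
from math import exp, log

# Per-allele odds ratios. Only the 9p21 value comes from the text above;
# the other loci and their odds ratios are hypothetical placeholders.
PER_ALLELE_OR = {"locus_9p21": 1.29, "locus_B": 1.10, "locus_C": 1.05}


def genetic_risk_score(risk_allele_counts, per_allele_or=PER_ALLELE_OR):
    """Sum of risk-allele counts weighted by log odds ratio (additive model)."""
    score = 0.0
    for locus, odds_ratio in per_allele_or.items():
        count = risk_allele_counts[locus]
        if count not in (0, 1, 2):
            raise ValueError(f"{locus}: allele count must be 0, 1 or 2")
        score += count * log(odds_ratio)
    return score


# Hypothetical individual: two 9p21 risk alleles, one at locus_B, none at locus_C.
person = {"locus_9p21": 2, "locus_B": 1, "locus_C": 0}
score = genetic_risk_score(person)

# Odds relative to someone carrying no risk alleles, under the assumed model.
print(round(exp(score), 2))
```

Because per-allele effects in this range are small, even a favourable combination of alleles implies only a modest shift in odds, which is consistent with the limited gains in discrimination reported above.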

Other techniques, such as gene expression profiling, have been investigated for their role in cardiovascular disease risk prediction. Rosenberg and colleagues recently derived and tested a 23-gene expression profile from circulating whole blood cells in a high-risk population referred for coronary angiography [59]. Compared with a clinical risk model, the expression profile improved the predictive ability for obstructive coronary disease (defined as >50% stenosis in at least one major artery). The c-statistic was improved by 0.01, and the profile reclassified 22% of individuals, with an NRI of 16%. Further validation of these findings is necessary. In addition, whether this model can improve risk stratification in an unselected low- or intermediate-risk population has not yet been determined.


Conclusions

Numerous promising biomarkers for cardiovascular disease risk prediction have been evaluated in the past decade, but, so far, evidence to support their use in routine clinical practice is limited. Further research is needed to identify new biomarkers that can successfully stratify cardiovascular disease risk in low-risk populations, as well as to determine whether management strategies informed by biomarker testing are better than standard of care. The application of powerful novel discovery platforms, such as genomics, proteomics and metabolomics, is still in the developmental stages, but there is potential for rapid and exponential growth. Translating these discoveries into clinical practice will be critical for reducing the population burden of cardiovascular disease.

Conflict of interest statement

Dr Wang has been part of the scientific advisory board of Diasorin Inc., has received technical support or research funding from Siemens, Brahms, Critical Diagnostics, Singulex, Diasorin, Pfizer, Roche and LabCorp and has received honoraria from Roche, Diasorin and Quest Diagnostics. Dr Wang is named as a co-inventor on patent applications relating to the use of metabolomic or neurohormonal biomarkers in risk prediction.