Multiplexed measurement of candidate blood protein biomarkers of heart failure

Abstract
Aims: There is a critical need for better biomarkers so that heart failure can be diagnosed at an earlier stage and with greater accuracy. The purpose of this study was to design a robust mass spectrometry (MS)-based assay for the simultaneous measurement of a panel of 35 candidate protein biomarkers of heart failure in blood. The overall aim was to evaluate the potential clinical utility of this biomarker panel for prediction of heart failure in a cohort of 500 patients.
Methods and results: Multiple reaction monitoring (MRM) MS assays were designed with Skyline and Spectrum Mill PeptideSelector software and developed using nanoflow reverse phase C18 chromatographic Chip Cube-based separation, coupled to a 6460 triple quadrupole mass spectrometer. Optimized MRM assays were applied, in a sample-blinded manner, to serum samples from a cohort of 500 patients comprising patients with heart failure and non-heart failure (non-HF) controls who had cardiovascular risk factors. Both heart failure with reduced ejection fraction (HFrEF) patients and heart failure with preserved ejection fraction (HFpEF) patients were included in the study. Peptides for the apolipoprotein A-I (APOA1) protein were the most significantly differentially expressed between non-HF and heart failure patients (P = 0.013 and P = 0.046). Four proteins were significantly differentially expressed between non-HF and the specific subtypes of HF (HFrEF and HFpEF): leucine-rich-alpha-2-glycoprotein (LRG1, P < 0.001), zinc-alpha-2-glycoprotein (P = 0.005), serum paraoxonase/arylesterase (P = 0.013), and APOA1 (P = 0.038). A statistical model found that combined measurements of the candidate biomarkers in addition to BNP were capable of correctly predicting heart failure with 83.17% accuracy and an area under the curve (AUC) of 0.90. This was a notable improvement on the predictive capacity of BNP measurements alone, which achieved 77.1% accuracy and an AUC of 0.86 (P = 0.005). The peptides for LRG1, which contributed most significantly to model performance, were significantly associated with future new onset HF in the non-HF cohort [peptide 1: odds ratio (OR) 2.345, 95% confidence interval (CI) (1.456–3.775), P < 0.001; peptide 2: OR 2.264, 95% CI (1.422–3.605), P = 0.001].
Conclusions: This study has highlighted a number of promising candidate biomarkers for (i) diagnosis of heart failure and subtypes of heart failure and (ii) prediction of future new onset heart failure in patients with cardiovascular risk factors. Furthermore, this study demonstrates that multiplexed measurement of a combined biomarker signature that includes BNP is a more accurate predictor of heart failure than BNP alone.


Introduction
Heart failure (HF) is a major global health issue, with recent estimates suggesting that there are more than 26 million HF patients worldwide. The burden of this disease on the healthcare system is significant. In the United Kingdom, 1-2% of the National Health Service's budget is spent on HF, with 60-70% of this estimated to be on the cost of hospitalizations. 1 HF is a complex pathology and often presents with non-specific symptoms. Hence, delays in accurate diagnosis and appropriate treatment interventions contribute further to HF-associated healthcare costs. 2 Echocardiography remains key to accurate diagnosis of HF; however, the resources required for it are an important factor in the healthcare burden of HF. 1 Blood-based biomarkers offer a lower cost, minimally invasive alternative for HF diagnosis and prognosis, with much quicker turnaround times for test results. The current gold-standard biomarkers for HF are the natriuretic peptides: N-terminal proBNP and BNP. However, these biomarkers have a number of limitations in the context of HF management; the biological variation of BNP or N-terminal proBNP is ~30% in chronic HF, and levels are further influenced by patient weight, comorbidities, and medications. 2 The natriuretic peptides are also not useful in classifying types of HF, which is key to managing the disease effectively, especially in terms of risk stratification. Ultimately, follow-up echocardiography is still relied upon to confirm diagnosis of HF, even if elevated levels of BNP are detected. The underlying mechanisms that contribute to HF remain poorly understood, and it is thought that biomarkers that reflect important pathophysiologic pathways involved in cardiovascular dysfunction would likely be of greater clinical value for diagnosis and prognosis of HF.
Recent studies indicate that shifting the focus from conventional risk factors and/or single disease biomarkers toward biomarker 'signatures', made up of multiple disease-relevant proteins, would be of considerable benefit in the management of HF. 3 Aside from cancer, cardiovascular research is the field in which novel disease biomarkers have been most extensively investigated, with more than 150 potential biomarkers for cardiovascular disease documented in the literature. 4 However, no new protein biomarker tests for HF have yet been approved and included in guidelines for use in clinical practice. This is largely due to difficulties in demonstrating that use of novel biomarkers will eventually lead to improved outcomes for HF patients. Some of the difficulty lies in evaluating lengthy lists of candidate biomarker proteins in a statistically relevant number of patient samples, with sufficient sensitivity and specificity toward HF. 5 This bottleneck has previously been attributed to the fact that the technologies being used are unable to provide the combination of sufficient throughput and robust accuracy for analysing multiple biomarker candidates in large patient cohorts. In contrast to the traditional antibody-based assays, mass spectrometry technology allows for more high-throughput, sensitive, and selective measurement of target proteins by detection of their unique peptide fragments. Multiple reaction monitoring (MRM) is a mass spectrometry technique that makes use of multiple mass analysis steps to select a series of predefined ions for detection. 6 MRM is advantageous over alternative biomarker validation techniques such as enzyme-linked immunosorbent assay, SOMAscan, and proximity extension assay, in that it is not array-based and is not restricted to measurement of protein targets for which suitable antibodies or SOMAmers are commercially available. MRM assays are customizable, high-throughput, and suitable for routine use in a clinical setting. 7
The key aim of this study was to develop a set of robust MRM assays for measurement of a panel of pathophysiologically relevant candidate protein biomarkers for HF. With the ultimate aim of developing a clinically useful biomarker test, the assays were optimized for application to crude patient serum samples. Application of these assays to a statistically powered patient cohort has provided evidence that this biomarker panel could have clinical utility for prediction of HF within a mixed population of both disease and control patients.

Methods
Full detailed methods are available in the supporting information.
clinical variables between groups. A one-way analysis of covariance (ANCOVA) was conducted to compare biomarker expression between groups, while controlling for confounding clinical factors. Survival analysis was performed using the Cox-regression method. Random Forest models were used to discriminate between the patient groups (Random Forest package in R V.4.3.2 and pROC package in R V. 3.4.4).

Results
Multiple reaction monitoring assay development - selection of proteotypic peptides
A panel of 35 candidate biomarker proteins relevant to cardiovascular conditions was assembled. This panel included 19 known and 16 novel biomarkers. 'Known' biomarkers were identified from the literature and added to the panel based on their known involvement in the pathogenesis of cardiovascular diseases (Table 1). 'Novel' biomarkers were identified from a previous study by Watson et al. 8 Both Skyline and PeptideSelector were used to select proteotypic peptides for development of MRM assays for the candidate protein biomarkers. Data from in-house experiments and relevant publications on MRM-based investigations also guided the peptide selection process 9-14 (Supporting Information, Table S1). Where possible, at least two peptides were selected per protein. It was found that the peptides identified for natriuretic peptide receptor A (ANP) and BNP were not unique to these proteins, so they were removed from the candidate list.

Multiple reaction monitoring development - assay optimization
The in-silico MRM assays were analysed in pooled patient serum samples. Crude serum was depleted of the 14 most abundant serum proteins to enhance detectability and optimize measurement parameters for the lower abundant proteins. On the basis of the results from these experiments, MRM assays were optimized for 20 proteins (30 peptides and 150 transitions). These assays were evaluated in crude serum samples in order to confirm that target proteins could be reliably measured without need for serum depletion. This resulted in the development of a single working MRM method for measurement of 22 peptides (15 proteins) (Figure S1). Synthetic crude peptides were used to assist in the development of assays for 11 of the remaining proteins, which were determined, based on previous results, to have the most potential for successful measurement in serum and were also considered to be the most biologically relevant. All peptides selected for inclusion in the final MRM method had dot products greater than 0.9. For many of the proteins, only one peptide was included in the final method. To ensure accurate peptide detection, especially for the lower abundant proteins, all five transitions for each peptide were retained in the final method. The resulting final MRM method consisted of 35 peptides for 25 proteins, with a total of 175 transitions. The workflow for peptide selection and assay optimisation is outlined in Figure 1.
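The dot-product criterion above can be illustrated with a short sketch. It assumes the dot product is a normalized (cosine) similarity between measured and reference transition intensities; the exact score reported by Skyline may be defined differently, and all peak areas below are hypothetical values chosen for illustration.

```python
import math

def normalized_dot_product(observed, reference):
    """Cosine similarity between two transition-intensity vectors."""
    if len(observed) != len(reference):
        raise ValueError("vectors must list the same transitions in the same order")
    dot = sum(o * r for o, r in zip(observed, reference))
    norm = math.sqrt(sum(o * o for o in observed)) * math.sqrt(sum(r * r for r in reference))
    return dot / norm if norm else 0.0

# Hypothetical peak areas for the five transitions of one peptide.
library = [100.0, 80.0, 60.0, 30.0, 10.0]        # reference intensity pattern
measured_good = [95.0, 85.0, 55.0, 33.0, 12.0]   # pattern close to the reference
measured_poor = [10.0, 30.0, 20.0, 90.0, 100.0]  # pattern dominated by interference

DOTP_THRESHOLD = 0.9  # acceptance threshold stated in the text

for name, areas in [("good", measured_good), ("poor", measured_poor)]:
    score = normalized_dot_product(areas, library)
    decision = "keep" if score > DOTP_THRESHOLD else "drop"
    print(f"{name}: dot product = {score:.3f} -> {decision}")

# Transition bookkeeping for the final method: five transitions per peptide.
PEPTIDES_FINAL = 35
TRANSITIONS_PER_PEPTIDE = 5
total_transitions = PEPTIDES_FINAL * TRANSITIONS_PER_PEPTIDE  # 175, as reported
```

The same arithmetic reproduces the intermediate method: 30 peptides with five transitions each give the 150 transitions reported for the depleted-serum assays.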

Clinical cohort for assay evaluation
The clinical cohort included 150 HF patients; 75 HF patients with reduced ejection fraction (HFrEF) and 75 HF patients with preserved ejection fraction (HFpEF). A population of 350 non-heart failure (non-HF) patients were selected from the STOP-HF (St. Vincent's Screening to Prevent HF) study. These patients represent a high-risk population for future development of HF and served as a control group within this study. 15 Point of care BNP measurements were taken for each patient. Assays, with sensitivity for BNP at <5 ng/mL, were run singly and the intra-assay variability was <10%.
All samples were analysed singly with the developed MRM method over 21 batches of between 22 and 24 samples. A stock peptide mix was used to confirm system suitability and ensure reproducibility between each batch of samples analysed. The %CV values for the stock peptide mix were around 15%, indicating that instrument performance was consistent throughout the analysis. However, inspection of the quality of peptide peaks in Skyline showed that some batches had a disproportionate number of missing values or very low values for all peptides. These sample batches were removed from subsequent analyses to avoid any potential bias from batch effects. Biomarker analysis was ultimately conducted using data from 406 patient samples (Table S2). Even though samples were removed from the analyses, the ratio of HF to non-HF patients remained the same; 121 HF and 289 non-HF. Patient characteristics for the 406 patients are summarized in Table 1. The non-HF patients were generally younger (66.75 ± 96) than HFpEF (74.19 ± 6.9) and HFrEF (70.12 ± 11.4) patients (P ≤ 0.001). The majority of patients affected by either HFpEF or HFrEF were male in this cohort (63.2% and 72.6%, respectively). This is unsurprising as, although more women die from HF, it is predominantly diagnosed in men. As indicated in this table, the non-HF group demonstrates clinical features that indicate a risk of future development of HF (prevalence of hypertension = 66.6% and prevalence of dyslipidaemia = 67.6%); however, previous incidences of ischaemic heart disease (IHD, 17.4%) and cardiac arrhythmia/atrial fibrillation (AF, 10.1%) are significantly lower than in the HFpEF and HFrEF groups (IHD = 41.1%, AF = 85.5% vs. IHD = 59.7%, AF = 56.5%, respectively, P ≤ 0.001). Within the HF group, HFpEF patients demonstrate
2250 C. Tonry et al.
Values are mean ± SD, n (%). Independent-samples Kruskal-Wallis analysis was performed for numerical variables for the three patient groups.
Chi-square contingency analysis was performed for categorical variables. * P values that indicate significant differences (P < 0.05) between non-HF and HFpEF group. Ŧ P values that indicate significant differences (P < 0.05) between non-HF and HFrEF group. ¥ P values that indicate significant differences (P < 0.05) between HFpEF and HFrEF group.
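The batch quality checks described in this section (the %CV of the stock peptide mix and the screening of batches with many missing values) can be sketched as follows. This is only an illustration: the peak areas are hypothetical, and the 50% missing-value cut-off is an assumed value, since the study describes batch removal based on inspection of peak quality in Skyline rather than a fixed threshold.

```python
import statistics

def percent_cv(values):
    """Coefficient of variation as a percentage: 100 * sample SD / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean

def missing_fraction(batch):
    """Fraction of peptide measurements in a batch that are missing (None)."""
    flat = [value for sample in batch for value in sample]
    return sum(value is None for value in flat) / len(flat)

# Hypothetical peak areas of one stock peptide, one value per batch.
stock_areas = [1.00e6, 1.18e6, 0.87e6, 1.09e6, 0.93e6, 1.21e6]
print(f"stock peptide %CV: {percent_cv(stock_areas):.1f}")

# Hypothetical batches: each sample is a list of peptide areas, None = missing.
batches = {
    "batch_01": [[5.1e4, 2.2e5, 8.0e3], [4.8e4, 2.0e5, 7.5e3]],
    "batch_02": [[None, None, 6.9e3], [None, 1.9e5, None]],
}
MAX_MISSING = 0.5  # illustrative cut-off, not taken from the study

flagged = [name for name, batch in batches.items()
           if missing_fraction(batch) > MAX_MISSING]
print("batches flagged for review:", flagged)
```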

Evaluation of individual candidate biomarkers
Area values for the most intense peptide transition were used to determine peptide expression levels in sera. Any missing values were replaced using a multiple imputation method to remove any potential bias. As expected, BNP was significantly differentially expressed between non-HF and HF patients; however, there was not a significant difference in BNP expression between HFrEF and HFpEF patients once data were adjusted for AF (P = 0.320, Figure S2). This highlights the lack of specificity of BNP for differentiation between types of HF, which is required for appropriate clinical management of HF. Individually, five of the candidate biomarker proteins were significantly differentially expressed between HF and non-HF patients: leucine-rich-alpha-2-glycoprotein (LRG1, P < 0.001), zinc-alpha-2-glycoprotein (ZA2G, P = 0.001), serum paraoxonase/arylesterase (PON1, P = 0.006), apolipoprotein A-I (APOA1, P = 0.009 and P = 0.038), and pentraxin 3 (PTX, P = 0.049) (Table 2). In addition to gender and age, adjustments for AF were made as a large proportion (82%) of patients within this cohort had the condition. All proteins remained significant when adjusted for gender, age, and AF individually, aside from PTX, which was no longer significant after adjustment for these confounders (Table 2). Only peptides for APOA1 protein remained significant when controlled for all three confounders collectively (P = 0.013 and P = 0.046). Four proteins were significantly differentially expressed between non-HF and the subtypes of HF (HFrEF and HFpEF): LRG1 (P < 0.001), ZA2G (P = 0.005), PON1 (P = 0.013), and APOA1 (P = 0.038). LRG1 was significantly differentially expressed between non-HF and HFpEF, and this held true when adjusted for gender and age (P = 0.001 and P = 0.040, respectively). However, significance was lost when adjusted for AF or all three confounders collectively (Table 3).
ZA2G was significantly differentially expressed between non-HF and HFrEF and also between non-HF and HFpEF (P = 0.005 and P = 0.043, respectively). Differences between non-HF and HFrEF remained significant when adjusted for age (P = 0.025) and AF (P = 0.031) but not when adjusted for gender or all three confounders collectively ( Table 3). PON1 was significantly differentially expressed between non-HF and HFrEF (P = 0.005). Significance remained when adjusted for age, gender, and AF individually and all three confounders collectively (P = 0.05). APOA1 was significantly differentially expressed between non-HF and HFrEF (P = 0.038) and remained significant when adjusted for AF (P = 0.048) but not when adjusted for any other confounders ( Table 3).

Predictive capacity of combined biomarkers
A random forest model was developed to determine if collective measurement of all 25 proteins improves accuracy of BNP for prediction of HF. This type of modelling is designed to handle artefactual noise, and thus, all peptides can be included in the model without diminishing the validity of statistical interpretations from the model, that is, variable detectability of some of the lower abundant peptides did not affect model performance. 16 Indeed, removing peptides that had only a modest contribution to the model, resulted in poorer model performance (data not shown). The model developed here was shown to predict HF with an accuracy of 83.17% and area under the curve (AUC) of 0.90 ( Figure 2).
In contrast, a model that included only patient BNP data had 77.1% accuracy and an AUC of 0.86. The contribution of each individual protein to model performance is outlined in Table 4, where proteins are listed in order of importance to the model's predictive capacity. The performance of the model was not enhanced any further by the addition of risk factors such as patient age, sex, and body mass index, and these risk factors alone achieved an accuracy of just 64.4% and an AUC of 0.69 (Figure 2). A one-sided hypothesis test was carried out based on 2000 stratified bootstrap samples to test whether the AUC obtained using only BNP was significantly less than the AUC achieved with the addition of peptide measurements. The difference was found to be significant (P = 0.0045).
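The stratified bootstrap comparison of two AUCs can be sketched in a few lines. This is a minimal illustration rather than the analysis code used in the study: it assumes the test statistic is the observed AUC difference divided by the standard deviation of the bootstrap differences, compared against a standard normal distribution, and it runs on simulated scores. The names `bnp_only` and `bnp_plus_panel` are hypothetical stand-ins for the predictions of the two models.

```python
import math
import random

def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_test(scores_a, scores_b, labels, n_boot=2000, seed=0):
    """One-sided test of H1: AUC(a) < AUC(b) using stratified bootstrap resampling."""
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    observed = auc(scores_a, labels) - auc(scores_b, labels)
    diffs = []
    for _ in range(n_boot):
        # Stratified: resample cases and controls separately so group sizes stay fixed.
        idx = rng.choices(pos_idx, k=len(pos_idx)) + rng.choices(neg_idx, k=len(neg_idx))
        y = [labels[i] for i in idx]
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(a, y) - auc(b, y))
    mean = sum(diffs) / n_boot
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n_boot - 1))
    if sd == 0:
        raise ValueError("bootstrap differences have zero variance")
    z = observed / sd
    p_value = 0.5 * math.erfc(-z / math.sqrt(2))  # lower-tail probability P(Z <= z)
    return observed, z, p_value

# Simulated data (not study data): 30 cases and 60 controls.
sim = random.Random(1)
labels = [1] * 30 + [0] * 60
bnp_only = [sim.gauss(1.0 if y else 0.0, 1.0) for y in labels]
# The combined score adds an independent signal on top of the first score.
bnp_plus_panel = [s + sim.gauss(1.0 if y else 0.0, 1.0) for s, y in zip(bnp_only, labels)]

# n_boot is reduced here for speed; the text describes 2000 resamples.
diff, z, p = bootstrap_auc_test(bnp_only, bnp_plus_panel, labels, n_boot=500, seed=1)
print(f"AUC difference = {diff:.3f}, z = {z:.2f}, one-sided P = {p:.4f}")
```

Resampling cases and controls separately keeps the case-to-control ratio fixed in every bootstrap sample, which matters when the groups are as unbalanced as they are in this cohort.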
The net reclassification was also calculated [net reclassification 0.1111, 95% confidence interval (CI) (0.0209-0.2013), P = 0.01578]; however, results varied based on which cut-off threshold values were chosen. Hence, the bootstrapping approach provided more stable results for comparison of model performance. Age is a confounding factor in the prediction of HF. Logistic regression analysis was therefore also performed on a subset of age-matched HF and non-HF patients (n = 95 vs. n = 208, respectively) to ensure that the overall model performance was not influenced by effects of age. The overall predictive performance of all peptides combined remained similar (AUC 0.83) to that in the non-matched cohort (data not shown).
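The dependence of net reclassification on the chosen cut-off can be seen in a small worked example. The sketch below assumes a two-category net reclassification improvement (one risk threshold separating 'low' from 'high' risk); the study does not state which categories were used, and the predicted probabilities are hypothetical.

```python
def net_reclassification(old_probs, new_probs, labels, threshold):
    """Two-category NRI: net correct upward moves in events plus net correct downward moves in non-events."""
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_events = sum(labels)
    n_nonevents = len(labels) - n_events
    for old, new, y in zip(old_probs, new_probs, labels):
        old_high = old >= threshold
        new_high = new >= threshold
        if new_high and not old_high:      # reclassified upward
            if y == 1:
                up_event += 1
            else:
                up_nonevent += 1
        elif old_high and not new_high:    # reclassified downward
            if y == 1:
                down_event += 1
            else:
                down_nonevent += 1
    nri_events = (up_event - down_event) / n_events
    nri_nonevents = (down_nonevent - up_nonevent) / n_nonevents
    return nri_events + nri_nonevents

# Hypothetical predicted probabilities from a baseline and an extended model.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
baseline = [0.60, 0.40, 0.30, 0.70, 0.50, 0.20, 0.45, 0.10]
extended = [0.70, 0.55, 0.45, 0.80, 0.35, 0.25, 0.30, 0.15]

for cut in (0.5, 0.4):
    nri = net_reclassification(baseline, extended, labels, cut)
    print(f"threshold {cut}: NRI = {nri:.2f}")
```

With identical predictions, moving the threshold from 0.5 to 0.4 changes the NRI from 0.50 to 0.75 in this toy data set, which illustrates why a threshold-free comparison such as the bootstrap AUC test gives more stable results.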

Prediction of future heart failure
Over a period of 10 years following original sample collection, 17 out of the 287 non-HF patients developed HF. One of the protein biomarkers, LRG1, was found to be significantly associated with future HF, even when adjusted for age and gender (Figure 3).
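As a consistency check on the odds ratios for LRG1 reported in the abstract, the standard error, z statistic, and two-sided P value can be recovered from each odds ratio and its 95% confidence interval. The sketch assumes the intervals are Wald intervals computed on the log-odds scale, which is an assumption rather than something stated in the text.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log OR), z, and two-sided P from an OR and its 95% CI (Wald, log scale)."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_two_sided

# Values as reported in the abstract for the two LRG1 peptides.
reported = {
    "peptide 1": (2.345, 1.456, 3.775),
    "peptide 2": (2.264, 1.422, 3.605),
}

results = {}
for name, (or_value, low, high) in reported.items():
    se, z, p = wald_from_ci(or_value, low, high)
    results[name] = (se, z, p)
    print(f"{name}: SE = {se:.3f}, z = {z:.2f}, two-sided P = {p:.5f}")
```

Under this assumption both P values fall below 0.001 before rounding, which is why a value printed as P = 0.000 by statistical software is better reported as P < 0.001.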

Discussion
In this study, an MRM-mass spectrometry approach was employed for multiplexed measurement of 25 candidate protein biomarkers for HF. The 'novel candidates' were identified in a previous study by Watson et al., in which 2D-DIGE proteomics analysis of coronary sinus blood revealed differences in protein expression between asymptomatic hypertensive patients, stratified according to BNP levels. In that study, only the protein LRG1 was further validated via enzyme-linked immunosorbent assay, whereas here, we were able to collectively measure all identified proteins via MRM as part of a biomarker 'signature'. 8 For implementation into clinical use, biomarker signatures must bring significant added value to what is already used routinely in the clinic. Hence, a random forest algorithm was used to test the predictive utility of the biomarker signature in combination with BNP. Random forest has desirable properties in that it does not over-fit the data and can informally allow the assessment of complex high-order interactions. 17 This analysis revealed that, by combining measurement of a panel of disease-relevant proteins with BNP measurements in a predictive model for HF, the accuracy increases from 77.1% (AUC 0.86) for BNP alone to 83.17% (AUC 0.90) for BNP plus biomarkers. Importantly, the positive predictive value also increases from 58.4% to 65.4%. The majority of peptides were significantly correlated with each other, although the strength of this association was very weak; only peptides associated with the same protein had strong correlations with each other (r ≥ 0.8, Table S3). LRG1, ZA2G, and PON1 were the only proteins to show a significant correlation with BNP, but again these correlations were weak (r < 0.4) (Table S3). Individually, a number of the protein candidates showed significant differences in expression between HF patients and non-HF controls and were found to be associated with particular HF aetiologies, that is, HFrEF or HFpEF.
This is of interest as, due to poorly understood differences in pathophysiology between HFpEF and HFrEF, many patients with HFpEF are not diagnosed correctly using conventional biomarkers for HF. 18 Indeed, from the information provided in Table 1, it is evident that the majority of measurements taken from the blood and urine do not discriminate between HFpEF and HFrEF. LRG1 was significantly elevated in HF and particularly associated with HFpEF. Indeed, LRG1 was the only biomarker candidate to show a significant association with HFpEF as opposed to HFrEF, even when adjusted for gender and AF (Table 3). This is noteworthy considering the challenges in diagnosing HFpEF. LRG1 expression was also significantly correlated with BNP expression (Table S3). Watson et al. have previously reported overexpression of LRG1 in asymptomatic patients with elevated BNP, who are at risk for HF. Indeed, there is growing evidence to support the functional relevance of LRG1 as a biomarker for early onset myocardial infarction. 19 LRG1 was also the only biomarker found to be predictive of future HF, with non-HF patients in the 'high LRG1' group found to be twice as likely to develop HF in the future. Overall, these data add weight to the suggestion that LRG1 could be a valuable biomarker for ventricular dysfunction and HF and may have particular utility in diagnosis of HFpEF. 8 ZA2G was significantly elevated in HF patients, with a more significant association with HFrEF. Although ZA2G has generally been researched in the context of cancer, 20 it has previously been shown that serum levels of ZA2G are increased in HF patients. 21 PON1 was also significantly differentially expressed between HF and non-HF patients and significantly associated with HFrEF.
Studies in two Japanese patient cohorts have revealed that the presence of certain alleles of PON1 increases the risk of carotid artery atherosclerotic disease 22 ; however, the clinical significance of PON1 activity in cardiovascular conditions remains controversial. Some studies have associated low baseline PON1 activity with increased severity of coronary artery disease, while other studies report an association between high baseline PON1 activity and coronary artery disease severity. 23 In a study by Hammadah et al., however, no correlation between HF events or hospitalizations and PON1 activity was observed. 23 In this study, PON1 was down-regulated in HF, although the more important observation regarding this protein was that, like ZA2G, it contributed strongly to the predictive performance of the combined biomarker model (Table 4). Apolipoprotein A-I was significantly differentially expressed between HF and non-HF samples, even when adjusted for gender and AF, and had a more significant association with HFrEF. Apolipoproteins have been more strongly linked with HF and cardiovascular disease. Low APOA1 expression has been linked with more severe disease, 24 and conversely, higher levels of APOA1 are associated with reduced risk of major cardiovascular events. 25 In our data, reductions in APOA1 were also observed in both HFpEF and HFrEF patients, when compared with non-HF patients. The protein pentraxin 3 was also significantly elevated in serum from patients with HF; however, these changes were no longer significant when adjusted for AF. Individually, peptides for the proteins described above This finding again highlights the benefit of multiplexed measurement of biomarker combinations, rather than relying solely on the statistical significance of individual proteins for identification of clinically useful predictive biomarkers.
Although some of the candidate proteins did show potential to differentiate between HFpEF or HFrEF and non-HF controls, the combined measurement of all candidate biomarker proteins did not have sufficient sensitivity or specificity to differentiate between either of the two disease subtypes (HFpEF and HFrEF). In this instance, it is likely that this is due to the low numbers of both HFpEF and HFrEF patients (n = 59 and n = 62, respectively) in relation to non-HF controls (n = 289). To further elucidate the potential clinical utility of the biomarker panel, or selected candidate biomarkers within the panel, for specific diagnosis of HFpEF the biomarkers will have to be assessed in a more appropriately powered cohort of HFpEF patients and matched controls.
This study has some limitations. This Irish cohort is not representative of a diverse racial and ethnic population. Furthermore, all patients recruited, including non-HF, represent an at-risk population for future development of HF. Isoforms of BNP could not be included in the MRM method as there were limited options of proteotypic peptides for this protein and the only one that was deemed suitable was below the limits of detection. Indeed, 10 other biologically relevant candidate protein biomarkers were found to be below the limits of detection during assay development. These proteins, mainly the 'known' biomarkers, are routinely measured via immunoassay-based techniques. However, it is likely that, due to their structure (short sequence length) and low abundance in serum, these proteins will always prove difficult to measure via MRM unless alternative sample preparation and MRM methods are employed. Mass spectrometry technology is continually evolving, and it will be possible to further refine MRM assays for the low abundant proteins and peptides as part of on-going work for development of the biomarker assay. Inclusion of these 'known' biomarkers in the MRM assay may further improve on the performance of the biomarker model described here. In addition, the cohort was not age matched, and age is a confounding risk factor in prediction of HF. However, when the samples were retrospectively matched based on age, the predictive performance of the combined biomarker model was not significantly impacted. Only a small number of non-HF patients went on to develop HF, and so only exploratory survival analysis could be performed as part of this study. Definitive conclusions on the potential clinical utility of this combined biomarker panel for prediction of HF cannot be made until the panel is validated in an external cohort.
This will include defining clinical thresholds for the complete biomarker panel, which will facilitate more accurate analysis of biomarker utility for prediction of future HF. These investigations will form part of a larger collaborative study, adhering to the standards required for Clinical Laboratory Improvement Amendments-accredited assays. In order to be clinically useful for diagnosis and management of HF, candidate protein biomarkers should be easily measured in blood in a high-throughput, cost-effective, and reproducible manner in large sample numbers. Accordingly, a number of clinical tests now on offer use mass spectrometry to measure multiple proteins and/or protein isoforms in patient blood samples with short turnaround times. 26 The field of proteomics and mass spectrometry is developing rapidly, and advances have been made that will help bridge the gap between biomarker discovery and development of a clinical test. 27,28 The cost of running an assay such as that described here has been estimated to be between €35 and €40 (£30-£35) per sample, and turnaround time from sample receipt to provision of data would be 3-4 hours. In the United Kingdom, point-of-care BNP tests were previously estimated to cost an average of £25 per patient. 29,30 In Ireland, data from previous studies within the STOP-HF cohort estimated the average cost of point-of-care BNP to be €20 per patient. 31 Although it is more expensive than point-of-care BNP, it should be noted that the MRM assay described here is being developed to progress an evolving era in precision medicine, which BNP and other single markers cannot support. Therefore, the potential clinical value of such tests will justify the marginal increase in costs. This demonstrates the perceived clinical value in developing multimarker signatures such as that reported here.
Thus, future work will focus on further validating this biomarker panel in additional independent sample cohorts from different countries, which will require the establishment of clinical research collaborations.

Conflict of interest
None declared.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Figure S1. Measurement of crude and synthetic peptides in serum.
Figure S2. Significant protein expression changes between non-HF and HF patients.
Table S2. Multiple reaction monitoring (MRM) measurement data for all peptides in n = 406 patient samples. HF = heart failure; DHF = diastolic heart failure (HFpEF); SHF = systolic heart failure (HFrEF).
Table S3. Spearman correlation coefficients between all measured peptides. Correlation coefficients greater than 0.8 indicate a strong colinear relationship (orange highlight). 'Pep 1' = peptide 1, 'Pep 2' = peptide 2.