Optimised plasma sample preparation and LC‐MS analysis to support large‐scale proteomic analysis of clinical trial specimens: Application to the Fenofibrate Intervention and Event Lowering in Diabetes (FIELD) trial

Abstract
Purpose: Robust, affordable plasma proteomic biomarker workflows are needed for large-scale clinical studies. We evaluated aspects of sample preparation to allow liquid chromatography-mass spectrometry (LC-MS) analysis of more than 1500 samples from the Fenofibrate Intervention and Event Lowering in Diabetes (FIELD) trial of adults with type 2 diabetes.
Methods: Using LC-MS with data-independent acquisition, we evaluated four variables: plasma protein depletion, EDTA or citrated anti-coagulant blood collection tubes, plasma lipid depletion strategies and plasma freeze-thaw cycles. Optimised methods were applied in a pilot study of FIELD participants.
Results: LC-MS of undepleted plasma conducted over a 45 min gradient yielded 172 proteins after excluding immunoglobulin isoforms. Cibacron blue-based depletion yielded additional proteins but at added cost and time, while immunodepleting albumin and IgG provided few additional identifications. Only minor variations were associated with blood collection tube type, delipidation methods and freeze-thaw cycles. From 65 batches involving over 1500 injections, the median intra-batch quantitative differences in the top 100 proteins of the plasma external standard were less than 2%. Fenofibrate altered seven plasma proteins.
Conclusions and clinical relevance: A robust plasma handling and LC-MS proteomics workflow for abundant plasma proteins has been developed for large-scale biomarker studies; it balances proteomic depth with time and resource costs.


INTRODUCTION
Proteomics is increasingly used in clinical research to identify novel biomarkers for the prediction of health outcomes, including responses to interventions, suggest mechanisms of disease prediction and prevention, and facilitate the development of novel therapeutics [1,2].
Such clinical studies are often large and involve heterogeneous participants, and sample collection and handling can affect the proteomic analyses [3][4][5]. Here, we set out to develop a robust, cost-effective liquid chromatography-mass spectrometry (LC-MS) proteomic workflow to analyse large numbers of clinical plasma samples. Consideration was given to minimising the number of sample processing steps to control pre-analytical variance while achieving detailed proteomic coverage of abundant plasma proteins. Given the large scale of many clinical studies, including the one we will undertake, there is a need to establish robust data-quality methods for both sample preparation and subsequent mass spectrometric analysis that would be generally acceptable for reporting clinical trial data.
Mass spectrometry-based proteomic analysis is often performed in discrete, relatively small studies where cohorts of samples are prepared in a single instance and then measured in the mass spectrometer in a single batch. This ensures that variation introduced through sample handling and instrument performance is kept to a minimum [6]. However, this is not feasible when analysing thousands of plasma samples from clinical observational studies and trials, requiring us to develop quality control (QC) methods and standardised protocols to ensure that each batch of samples was acquired within a pre-defined quantitative threshold. Some reports have provided blueprints for proteomic biomarker discovery workflows compatible with large-scale studies. For example, Bruderer et al. [6] established a capillary LC-MS/MS method with 300 µm I.D. columns to quantitate more than 1500 plasma samples, while Geyer et al. [7] quantified 319 plasma proteomes in quadruplicate. These

Materials
A citrated plasma sample was used as an external standard for batch QC assessment (Sigma-Aldrich, North Rocks, Sydney, Australia).

Subjects
The study was conducted according to the guidelines of the Dec-

Processing of un-delipidated plasma
Plasma tryptic digests were prepared by transferring 10 µL of plasma to a 1.5 mL Eppendorf tube and adding 800 µL of 1% w/v SDC in

Statement of Clinical Relevance
Plasma biomarkers hold great promise to improve diagnosis and influence aspects of clinical practice, but their discov-

Offline delipidation
Three different methods of offline delipidation were tested. was removed for BCA protein assay. The sample was processed with trypsin and peptides desalted as described above.

Acetone precipitation
Acetone precipitation of proteins was performed by taking 10 µL of plasma and adding 1 mL of ice-cold acetone. Samples were then left at −20 °C for 1 h before being centrifuged at 18,000 × g for 5 min to pellet the precipitated protein. 200 µL of 1% SDC, 100 mM TEAB was added to solubilise the protein pellet before 10 µL was removed for BCA protein assay. The sample was processed with trypsin and peptides desalted as described above.

STAGE Tip processing
As described previously, a combined STAGE Tip digestion/desalting/delipidation workflow was conducted [9]. Following tryptic digestion, 250 µL of 99% ethyl acetate, 1% TFA was added to each sample and vortexed for 10 s. Samples were then processed in a multiplex batch utilising a 3D printed apparatus described in Harney et al. [10].

Freeze-thaw cycles
EDTA plasma samples (1.2 mL in triplicates) from three volunteers were aliquoted (200 µL) into six separate tubes and frozen at −80 °C. Five tubes per subject were then removed from the freezer and allowed to thaw on a roller mixer at room temperature for 30 min, then refrozen at −80 °C. Samples remained at −80 °C for 1 day before the next cycle. This procedure was continued stepwise until the final tubes had been frozen and thawed six times. Each sample was prepared using the STAGE Tip method and LC-MS data collected in DDA mode as described below.

Blood collection tubes
Matched K2 EDTA and sodium citrate plasma from five volunteers were processed as per the STAGE Tip methodology described above.
LC-MS data was collected in data-independent acquisition (DIA) mode as described below.
For experiments investigating impacts of abundant protein depletion, delipidation and freeze-thaw cycles, a Top15 data-dependent acquisition (DDA) mode was used on a Q Exactive HF mass spectrometer, using the same chromatography, columns and run length as the DIA experiments. The following instrument settings were applied for DDA experiments: MS1: AGC 3e6, resolution 60 K, scan range 300-1650 m/z. MS2: AGC 1e5, resolution 15 K, loop count 15.
For samples prepared using the STAGE Tip workflow, DIA mode using variable m/z windows over the 300-1600 m/z range was used as previously described [9].

Proteomic data analysis
LC-MS data acquired using DIA were analysed using Spectronaut Pulsar X v12.0.2 (Biognosys, Switzerland) following our methods using an enhanced spectral library derived from publicly accessible human plasma datasets as we previously described [9]. Raw files were searched in Spectronaut against the spectral library [9] with the follow-

Abundant protein depletion
It is well established that the high concentration of some proteins in plasma, such as immunoglobulins and albumin, limits the depth of proteome analysis that can be achieved in a single-run LC-MS experiment [12,13]. Affinity-based depletion methods have been used to remove these highly abundant proteins to address this high dynamic range issue [14]. However, for large-scale studies as proposed here, This is a relevant finding as Apo-AI, a key component of high-density lipoprotein (HDL), is a biomarker of the activity of the lipid-modifying drug fenofibrate. We therefore considered analysis of neat plasma to be the most facile approach, one that affords time and cost advantages and minimises quantitative variability due to sample handling. This came at the expense of 40 fewer proteins than could be detected using Dep G.

Plasma delipidation methods
Given that plasma contains numerous lipid species that may accumulate on the stationary phase [15], and having selected neat plasma as the starting matrix, we tested various delipidation procedures with the objective of maximising the LC column and MS system stability needed to analyse hundreds of clinical samples. We tested four delipidation methods: acetone, MTBE, methanol chloroform (MEOHCL) and ethyl acetate integrated into the STAGE Tip purification protocol [16].
We used DDA to assess the recovery of proteins from triplicate preparations of each method ( Figure 2

Comparison of blood collection tubes
Most plasma proteomic studies utilise EDTA plasma, and there is little information regarding comparisons with sodium citrate plasma. This is

Effects of multiple freeze-thaw cycles on stored plasma
We conducted sample freeze-thaw events to assess the effects of six freeze-thaw cycles on the quantitation of proteins. Overall, there was no significant variation in protein abundance between the cycles (ANOVA on 270 proteins was non-significant) (see Supplementary Data 4). To further illustrate this, Figure 3 shows the relative abundance of apolipoproteins, displaying minimal variance across six freeze-thaw events.
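The per-protein comparison across cycles can be illustrated with a minimal sketch that computes a one-way ANOVA F statistic for a single protein. This is not the authors' analysis code: the function name, the example intensities and the plain one-way design (which ignores the matching of tubes by donor) are assumptions made for illustration. Converting the F statistic to a p-value additionally requires the F distribution, for example from a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and replicate measurements")
    grand_mean = mean(v for g in groups for v in g)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    if ms_within == 0:
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


# Hypothetical log2 intensities of one protein: one inner list per
# freeze-thaw cycle, one value per donor (illustrative numbers only).
cycles = [[20.1, 20.4, 19.9], [20.0, 20.5, 19.8], [20.2, 20.3, 20.0]]
f_stat = one_way_anova_f(cycles)
```

A small F statistic, as in this made-up example, indicates that differences between cycles are no larger than the spread between donors within a cycle.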

A pilot study of FIELD samples
We applied the above determined optimal conditions, using neat,  Table S5). A differential analysis comparing pre- and post-treatment groups identified seven significantly differentially regulated proteins (p < 0.05) (see Table 2). The major HDL protein constituents, Apo-AI and Apo-AII, showed a non-significant increase after 6 weeks of fenofibrate in this small cohort. There was a significant decrease in the abundance of kallistatin (log2 FC −0.13, p = 0.04), a serine proteinase inhibitor that some studies suggest is a biomarker for microvascular complications in diabetes patients [18].
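To make the reported effect size easier to interpret, the short sketch below converts a log2 fold change into a percentage change and shows one way a paired log2 fold change could be computed. The function names are hypothetical, and averaging log2(post/pre) over matched samples is an assumption; the source does not state how the fold change was calculated.

```python
import math


def log2_fc_to_percent_change(log2_fc):
    """Convert a log2 fold change into a percentage change in abundance."""
    return (2.0 ** log2_fc - 1.0) * 100.0


def paired_log2_fc(pre, post):
    """Mean log2(post/pre) over matched pre- and post-treatment intensities.

    Assumed definition for illustration; intensities must be positive.
    """
    ratios = [math.log2(b / a) for a, b in zip(pre, post)]
    return sum(ratios) / len(ratios)


# The reported kallistatin log2 FC of -0.13 is roughly a 9% decrease.
kallistatin_change = log2_fc_to_percent_change(-0.13)
```

On this scale, a log2 FC of −0.13 is a modest change, which is consistent with the need for tight quantitative reproducibility to detect it.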

Quantitative reproducibility in large scale analyses
The above methods were applied to a cohort of FIELD trial participants (1560 samples; data to be reported elsewhere). Here, we report a summary of the intra-batch quantitative reproducibility associated with LC-MS analysis of the external QC plasma standard. The external plasma standard was acquired before and after each of 65 batches of 24 samples, and the difference in areas of the top 100 most abundant proteins was used to assess batch acceptability. The median intra-batch difference was 1.6%, with the medians for 95% of the data within the range −9% to 11%.
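A minimal sketch of how such an intra-batch metric could be computed from the two bracketing injections of the external standard is given below. The exact formula is not stated in the text, so the signed percentage difference relative to the pre-batch injection, the ranking of proteins by pre-batch area and the function name are all assumptions.

```python
from statistics import median


def median_intra_batch_difference(pre_areas, post_areas, top_n=100):
    """Median signed % difference (post vs pre) over the top_n most abundant proteins.

    pre_areas and post_areas map protein identifiers to quantified areas from
    the QC standard injected before and after a batch (assumed input format).
    """
    shared = [p for p in pre_areas if p in post_areas and pre_areas[p] > 0]
    # Rank by pre-batch area (assumption) and keep the most abundant proteins
    top = sorted(shared, key=lambda p: pre_areas[p], reverse=True)[:top_n]
    if not top:
        raise ValueError("no proteins shared between the two QC injections")
    diffs = [100.0 * (post_areas[p] - pre_areas[p]) / pre_areas[p] for p in top]
    return median(diffs)


# Hypothetical areas for three proteins (illustrative identifiers and values)
pre = {"ALB": 200.0, "APOA1": 100.0, "TF": 50.0}
post = {"ALB": 190.0, "APOA1": 110.0, "TF": 50.0}
batch_metric = median_intra_batch_difference(pre, post, top_n=2)
```

Using the median rather than the mean keeps the metric insensitive to a few proteins with unusually large differences.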

Removal of highly abundant plasma proteins
It has been previously established that there is an advantage to The limitation to this approach is proteome depth, which can only be addressed using more extensive sample fractionation methods and come with added time and cost impacts.

Evaluation of delipidation methods
Human plasma contains as much as 471 mg/dL of lipids, primarily sterol lipids, glycerophospholipids, glycerolipids and sphingolipids, amongst other classes and species [20]. Moreover, large, hydrophobic lipids may be retained on reversed-phase LC columns and contribute to poor reproducibility [15] and column blocking [10], so we reasoned that we should deplete them to meet our objective of a robust and stable LC-MS system for quantitating thousands of clinical specimens. Three different offline lipid removal methods were trialled: a one-step protein precipitation using acetone, and more complex workflows that separate lipids from proteins by inducing separate aqueous and organic phases. A drawback of the two-phase methods is the need to carefully pipette around the protein pellet, which sits at the interface of the two phases, to remove the aqueous and organic phases. The use of offline methods of delipidation also creates a need to perform offline desalting. The manual handling required to perform these two separate steps of the workflow adds to the potential for introducing quantitative bias. We also trialled an online method of delipidation and desalting in a single multiplexed workflow. We found that the online delipidation method was the most convenient for handling multiple samples. With the aid of a combination-style repeating pipette (e.g., Eppendorf Combipette M4), priming, washing and eluting peptides was rapid and reproducible. One downside of the online method is that the solvents used for washing and elution are highly volatile and hazardous, meaning that venting had to be employed to remove vapours following centrifugation. The performance of our workflow without a delipidation step was not assessed.

Blood anti-coagulants
Most plasma proteomic studies deploying LC-MS utilise samples collected into EDTA anti-coagulant tubes [21,22]. There is a paucity of proteomic studies examining plasma proteins collected from other types of anti-coagulant tubes. This is relevant to the FIELD study, where citrated plasma samples are available for specimens collected over 12 years ago [8]. We performed a pairwise comparison of matched citrate- and EDTA-preserved plasma from a group of donors and observed minimal quantitative differences except for haemoglobin, fibrinogen and fibronectin levels. The suspected antagonistic nature of fenofibrate on cardiovascular complications has been linked to fibrosis-related cardiovascular remodelling [23]; therefore, the elevated detection of these clotting-related proteins when using citrate was deemed useful. In contrast, the use of EDTA has previously been reported to increase the osmotic fragility of erythrocytes, leading to increased haemolysis [24]. With EDTA, we found elevated levels of haemoglobin, consistent with minor red blood cell haemolysis, which was absent in the citrated samples. These data confirmed the utility of sodium citrate plasma for LC-MS proteomic analyses.

Effects of multiple freeze-thaw cycles on stored plasma
One of the considerations of this study was the effect of multiple freeze-thaw cycles. Parallel mRNA analyses were also planned for the specific cohort of chosen samples, so it was important to determine which analysis would need to take priority. RNA is typically less stable and prone to degradation when frozen and thawed multiple times; therefore, the effects of repeated freeze-thaw cycles were evaluated to see the impact on the detectable proteome. A subset of (non-FIELD study) donor plasma was collected and aliquoted before being frozen and thawed six times. After observing no significant global variation, we chose a subset of apolipoproteins and complement proteins for this analysis. Our results showed that there was little effect on the abundance or detection of these markers of fenofibrate action. We attribute this to the denaturing conditions used for protein digestion, meaning that if proteins were degraded by freeze-thaw, this might not have been detectable in the subsequent LC-MS peptide analysis. We note that it has previously been reported that the concentration of plasma proteins remains stable when specimens are stored at −80 °C [25].

FIELD sample pilot study
To evaluate the effectiveness of the optimised workflow, a pilot study was conducted using stored FIELD study citrate plasma samples from patients treated with fenofibrate for 6 weeks, with fasted venous blood being taken pre- and post-treatment. Minor increases in the major HDL lipoproteins Apo-AI and Apo-AII were observed after 6 weeks of fenofibrate, although this trend did not reach statistical significance in this small cohort. Despite the small cohort size, several proteome changes were observed, including a change in the amount of circulating kallistatin, reported previously as being associated with cardiovascular complications in people with diabetes [18]. While the small sample size of this pilot cohort means that these preliminary conclusions need to be further validated, the pilot demonstrated a fit-for-purpose analytical workflow for studying FIELD samples.

Evaluation of quality control measures over time
We utilised book-ended batching with an external plasma standard to monitor batch quality. We used the median area difference for the top 100 proteins, aiming to achieve <15% intra-batch difference. On the occasions when this metric was exceeded, it prompted us to examine the causes, which were primarily declining MS detection sensitivity or, occasionally, human errors in sample preparation pipetting. The sample preparation workflow itself is robust, as we never experienced a failure using that procedure. With our capillary-flow LC-MS setup, it was typical to achieve 150 injections of 2 µg raw plasma digest before the system sensitivity deteriorated and failed batch QC. This was rectified by replacing the self-packed 150 µm LC column, changing the transfer capillary and cleaning the MS front-end ion optics to restore sensitivity, a process which would consume 1-3 days. This was a necessary process to enable quantitation of less abundant proteins, which would otherwise result in missing data.
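The acceptance rule described above can be expressed as a simple check. The sketch below is illustrative only: the function name, the input format and the use of the absolute value of a signed median difference are assumptions, with only the 15% target taken from the text.

```python
def flag_failed_batches(batch_medians, threshold_pct=15.0):
    """Return IDs of batches whose absolute median difference is not below the threshold.

    batch_medians maps a batch ID to its median intra-batch difference in percent.
    """
    return [batch_id for batch_id, diff in batch_medians.items()
            if abs(diff) >= threshold_pct]


# Hypothetical median differences (%) keyed by batch ID
example = {"batch_01": 1.6, "batch_02": -9.0, "batch_03": 17.2}
failed = flag_failed_batches(example)
```

In this made-up example only `batch_03` is flagged, which in practice would prompt inspection of instrument sensitivity and sample preparation before the batch is accepted or re-acquired.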
In summary, we have reported a facile and robust human plasma sample preparation and LC-MS workflow suitable for large-scale proteomic analysis using neat plasma. The use of neat plasma has the advantage of avoiding complications due to additional sample processing steps, as well as saving time and reagent costs. We envisage that this workflow, which balances proteomic depth against cost and time, can be applied to most clinical proteomic studies using human plasma where quantitative analysis of the major plasma proteins is the objective.

Associated data
Mass spectrometry data are available via ProteomeXchange with the identifier PXD029732.