Extracellular vesicles as a promising source of lipid biomarkers for breast cancer detection in blood plasma

Abstract
Extracellular vesicles (EVs), including exosomes and microvesicles, mediate intercellular communication in cancer, from development to metastasis. EV‐based liquid biopsy is a promising strategy for cancer diagnosis as EVs can be found in cancer patients’ body fluids. In this study, the lipid composition of breast cancer‐derived EVs was studied as well as the potential of blood plasma EVs for the identification of lipid biomarkers for breast cancer detection. Initially, an untargeted lipidomic analysis was carried out for a panel of cancerous and non‐cancerous mammary epithelial cells and their secreted EVs. We found that breast cancer‐derived EVs are enriched in sphingolipids and glycerophospholipids compared to their parental cells. The initial in vitro study showed that EVs and their parental cells can be correctly classified (100% accuracy) between cancerous and non‐cancerous, as well as into their respective breast cancer subtypes, based on their lipid composition. Subsequently, an untargeted lipidomic analysis was carried out for blood plasma EVs from women diagnosed with breast cancer (primary or progressive metastatic breast cancer) as well as healthy women. Correspondingly, when blood plasma EVs were analysed, breast cancer patients and healthy women were correctly classified with an overall accuracy of 93.1%, based on the EVs’ lipid composition. Similarly, the analysis of patients with primary breast cancer and healthy women showed an overall accuracy of 95% for their correct classification. Furthermore, primary and metastatic breast cancers were correctly classified with an overall accuracy of 89.5%. This reveals that the blood plasma EVs’ lipids may be a promising source of biomarkers for detection of breast cancer. Additionally, this study demonstrates the usefulness of untargeted lipidomics in the study of EV lipid composition and EV‐associated biomarker discovery studies. This is a proof‐of‐concept study and a starting point for further analysis on the identification of EV‐based biomarkers for breast cancer.


INTRODUCTION
Breast cancer continues to be the most commonly diagnosed cancer worldwide, and its burden is predicted to keep increasing in the decades to come (Arnold et al., 2022). Over 2.3 million new breast cancer cases occurred in 2020, with an incidence rate of over 80 per 100,000 females in parts of Europe, Northern America and Australia/New Zealand (Arnold et al., 2022). By 2040, the number of new cases is predicted to increase by over 40% (Arnold et al., 2022). Breast cancer can be classified into different molecular subtypes based on the expression of the estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2, also known as HER2/neu or ErbB2), or on the lack of expression of these receptors (triple-negative breast cancer, TNBC) (Langlands et al., 2013). Self-examination, clinical breast examination and imaging are the first steps towards a breast cancer diagnosis, but a tissue biopsy using invasive methods is required if malignancy is suspected (Newell & Mahoney, 2014). A liquid biopsy could bring significant benefits in the screening, early diagnosis, prognosis and monitoring of the progression of breast cancer, which would in turn contribute to patients' early access to treatment (Alimirzaie et al., 2019; Tay & Tan, 2021; Venetis et al., 2023).
Extracellular vesicles (EVs), including exosomes and microvesicles, are found in body fluids and can be a source of biomarkers for cancer diagnosis (Logozzi et al., 2009; Skotland et al., 2017). Cancer cells as well as immune cells produce EVs, which play an important role in cancer development, progression and metastasis (Bell & Taylor, 2016; Boelens et al., 2014; Tickner et al., 2014). Proteins and microRNA found in EVs have been widely analysed (Kalra et al., 2012; Keerthikumar et al., 2016). In contrast, their lipid content remains underexplored. The lipid composition of cancer-derived EVs has mainly been studied by analysing melanoma, colorectal, breast and prostate cancer cell lines (Brzozowski et al., 2018; Hosseini-Beheshti et al., 2012; Llorente et al., 2013; Lobasso et al., 2021; Lydic et al., 2015; Nishida-Aoki et al., 2020; Skotland et al., 2017). In breast cancer, a lipidomic study of two metastatic TNBC cell lines found enrichment of unsaturated diacylglycerols in EVs produced by high-metastatic cells compared to EVs produced by low-metastatic cells (Nishida-Aoki et al., 2020). Lipidomic analysis of urinary EVs from prostate cancer patients has suggested that a combination of two phosphatidylserines (PS) and one lactosylceramide (LacCer), namely PS(18:1/18:1), PS(18:0_18:2) and LacCer(d18:1/16:0), can distinguish between healthy individuals and prostate cancer patients (Skotland et al., 2017). This demonstrates the importance of lipidomic analysis of EVs found in cancer patients' body fluids and its potential for the identification of lipid biomarkers for cancer diagnosis.
Lipidomic analysis using mass spectrometry (MS)-based approaches such as liquid chromatography-MS (LC-MS) and imaging MS allows a comprehensive study of lipid alterations in cancer, including breast cancer. High levels of sphingolipids have been detected in breast cancer tissue when compared to normal breast tissue, using LC-MS-based approaches (Nagahashi et al., 2016). Similarly, differences in the lipid profile between breast cancer subtypes have been found by analysing breast cancer cell lines (Eiriksson et al., 2020). In addition, MS-based technologies using matrix-assisted laser desorption/ionization (MALDI) (Kang et al., 2011), desorption electrospray ionisation (DESI) (Guenther et al., 2015) and rapid evaporative ionization mass spectrometry (REIMS) (St John et al., 2017) have proven to be robust technologies for breast cancer detection, thanks to the high sensitivity and relative straightforwardness of studying the lipid content of cancerous tissues by MS.
EV-based liquid biopsy could overcome the limitations associated with tissue biopsy, which include access to tumour tissue and invasiveness, as well as tumour heterogeneity (Armakolas et al., 2023; Liang et al., 2021; Martins et al., 2021). Lipidomic analysis of EVs circulating in cancer patients' body fluids combines the robustness and sensitivity of MS-based technologies for cancer detection with the clinical advantage of using non- or minimally invasive methods for sample collection. This approach has great potential not only for cancer diagnosis, but also for cancer prognosis, monitoring and stratified medicine. In this proof-of-concept study, we studied the lipid composition of EVs produced by breast cancer cell lines from a range of molecular subtypes and the potential of EVs found in breast cancer patients' blood plasma for the identification of lipid biomarkers for breast cancer detection. We carried out an untargeted lipidomic analysis of a comprehensive panel of cancerous and non-cancerous mammary epithelial cells and their secreted EVs, as well as EVs found in blood plasma from breast cancer patients and healthy women. To date, this approach has not been explored.

Isolation of EVs from conditioned medium
EV isolation was performed as described previously (Horgan et al., 2020; Penders et al., 2021). Cells were expanded in ten T225 cell culture flasks and, once they reached ∼80%-90% confluency, they were maintained in their respective supplemented medium in the absence of serum for 72 h. For HuMEC Ready Medium (serum-free), the bovine pituitary extract was omitted due to the presence of abundant particulates. Conditioned media were harvested, filtered with a 0.45 μm membrane bottle-top filter, and stored at −80 °C until further processing. The conditioned media were concentrated to a final volume of 500 μL using 100 kDa MWCO Amicon Ultra-15 centrifugal filter devices (Sigma-Aldrich) and centrifugation at 5000 × g at 4 °C. Subsequently, EVs were isolated by size exclusion chromatography using a Sepharose CL-2B (Sigma-Aldrich) column (1 cm × 30 cm) packed to approximately 28 cm and equilibrated using particle-free Dulbecco's Phosphate-Buffered Saline (DPBS, Thermo Fisher Scientific). Concentrated conditioned media were loaded onto the column and DPBS was used as the mobile phase. Fractions of 1 mL were eluted by gravity and the peak fractions containing the EVs (fractions 8-12) were pooled and stored at −80 °C until further analyses.

Whole blood collection
Before surgery was undertaken, blood samples were collected from ten women with primary breast cancer (age range 36-68 years and median age 52 years). All primary breast cancer samples, except one, were obtained from treatment-naive patients. Blood samples were obtained from an additional nine women with progressive metastatic breast cancer (age range 36-72 years and median age 53 years), who were under cancer treatment but not responding to it. In addition, samples were obtained from ten healthy women who had neither pre-existing medical conditions nor infections at the time the samples were collected or in the previous weeks (age range 36-62 years and median age 50 years). Whole blood samples were collected under fasting conditions by venepuncture using EDTA (K2E) vacutainer tubes. Blood plasma was obtained by centrifugation for 10 min at 1000 × g at 4 °C, followed by collection of the supernatant and its centrifugation in a new tube for 10 min at 2000 × g at 4 °C. The blood plasma obtained was stored at −80 °C until further analyses.

Ethics approval
Human samples used in this research project were obtained from the Imperial College Healthcare Tissue Bank (ICHTB

Isolation of EVs from human blood plasma
The blood plasma samples were centrifuged at 5000 × g for 15 min at 4 °C in order to remove cell debris, if still present. The supernatant was transferred into a new tube and centrifuged at 16,000 × g for 30 min at 4 °C to remove larger microvesicles. The supernatant was then transferred into a new tube and kept at 4 °C to be used the same day. Iodixanol density gradients were prepared using OptiPrep (60%, w/v, Sigma-Aldrich) according to the manufacturer's recommendations. Initially, a working solution of 50% OptiPrep was obtained by mixing five volumes of 60% OptiPrep with one volume of a homogenisation solution consisting of 0.25 M sucrose, 6 mM EDTA and 6 mM Tris-HCl at pH 7.4. The OptiPrep working solution was diluted to 30% and 6% in a buffer solution consisting of 0.25 M sucrose, 1 mM EDTA and 1 mM Tris-HCl at pH 7.4. One mL of 30% and 1 mL of 6% OptiPrep were layered in ultra-clear centrifuge tubes (Beckman Coulter, CA, USA) from bottom to top. This was followed by 3 mL of the blood plasma that had been cleared of debris and larger microvesicles. The samples were centrifuged in a Beckman Coulter SW 55 Ti swinging-bucket rotor for 2 h at 120,000 × g_avg at 4 °C, with acceleration set to maximum and deceleration set to zero. The centrifugation tubes were carefully removed from the rotor and ten 500 μL fractions were collected from top to bottom. Fractions nine and ten, containing the EVs (but also lipoproteins with the same density, mainly High-Density Lipoproteins (HDL)), were added on top of a chromatography column (Bio-Rad, CA, USA) containing 2 mL of Capto Core 700 (Sigma-Aldrich), which combines size separation and binding chromatography to deplete co-isolated non-EV components. Once the mix of the two fractions containing EVs had entered the stationary phase completely, DPBS was used as the mobile phase and three 1 mL fractions were collected by gravity for EV analysis, pooled and stored at −80 °C until further analyses.
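The dilution steps above follow from simple volume arithmetic (C1·V1 = C2·V2). The sketch below is illustrative only: the function names are hypothetical, and the 1 mL final volume per layer is taken from the layering step in the text, not from a stated preparation volume.

```python
def mix_concentration(stock_pct, stock_volumes, diluent_volumes):
    """Concentration after mixing a stock with an iodixanol-free diluent."""
    return stock_pct * stock_volumes / (stock_volumes + diluent_volumes)


def dilution_volumes(stock_pct, target_pct, final_volume_ml):
    """Return (stock_ml, diluent_ml) needed to reach target_pct, from C1*V1 = C2*V2."""
    if not 0 < target_pct <= stock_pct:
        raise ValueError("target must be positive and no greater than the stock")
    stock_ml = target_pct * final_volume_ml / stock_pct
    return stock_ml, final_volume_ml - stock_ml


# Five volumes of 60% OptiPrep plus one volume of homogenisation solution.
working_pct = mix_concentration(60.0, 5, 1)

# Volumes for 1 mL each of the 30% and 6% layers (final volume is an assumption).
layers = {target: dilution_volumes(working_pct, target, 1.0) for target in (30.0, 6.0)}

for target, (stock_ml, buffer_ml) in layers.items():
    print(f"{target:g}% layer: {stock_ml:.2f} mL working solution + {buffer_ml:.2f} mL buffer")
```

The same calculation also explains the buffer recipe: one volume of 6 mM EDTA/Tris diluted six-fold gives the 1 mM concentrations of the dilution buffer.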

Density gradient measurement
The absorbance of 6%, 10%, 20% and 30% iodixanol solutions (OptiPrep) was measured at 340 nm by spectrophotometry (as suggested by the manufacturer) and a standard curve was created. The absorbance of each fraction collected was also measured and its density calculated based on the iodixanol standard curve.
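As a rough illustration of how such a standard curve can be used, the sketch below fits a straight line to absorbance readings and inverts it for an unknown fraction. All absorbance values are hypothetical placeholders, and the conversion to density assumes additive volumes with assumed stock and diluent densities; none of these numbers come from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def iodixanol_percent(absorbance, slope, intercept):
    """Invert the standard curve to estimate % iodixanol in a fraction."""
    return (absorbance - intercept) / slope


def density_from_percent(pct, stock_pct, stock_density, diluent_density):
    """Linear interpolation between diluent and stock (assumes additive volumes)."""
    fraction = pct / stock_pct
    return diluent_density + fraction * (stock_density - diluent_density)


standards_pct = [6.0, 10.0, 20.0, 30.0]      # concentrations named in the text
standards_a340 = [0.12, 0.20, 0.40, 0.60]    # hypothetical absorbance readings
slope, intercept = fit_line(standards_pct, standards_a340)

pct = iodixanol_percent(0.50, slope, intercept)   # hypothetical fraction reading
# Assumed densities (g/mL): 1.32 for the 60% stock, 1.03 for the sucrose buffer.
density = density_from_percent(pct, 60.0, 1.32, 1.03)
print(f"{pct:.1f}% iodixanol, approx. {density:.3f} g/mL")
```

In practice the density of each standard can be measured or taken from the manufacturer's tables, which avoids the additive-volume assumption.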

Nanoparticle tracking analysis
EV concentration was measured using a NanoSight NS300 (Malvern Panalytical Ltd, England, United Kingdom) equipped with a 532 nm laser and an sCMOS camera. EVs were diluted in particle-free DPBS to a concentration of 1-10 × 10^8 particles/mL. Five 60 s videos of the diluted EVs were analysed using NTA V3.0 software.

Protein quantification
Protein concentration analysis was carried out using the Pierce BCA protein assay kit (Thermo Fisher Scientific) according to the manufacturer's instructions. Samples were incubated at 37 °C for 30 min and absorbance was measured at 562 nm.

Immunoblotting analyses
A pooled sample of EVs isolated from blood plasma and a pooled sample of blood plasma were analysed using western blotting. Proteins were extracted from the pooled samples by adding RIPA lysis buffer containing protease and phosphatase inhibitors and sonicating (pulse mode) for 20 s in ice-water. Samples were mixed for 1 h at 4 °C and then centrifuged at 20,000 × g for 10 min at 4 °C. Supernatants (10 μL) were used for protein concentration analysis using the BCA protein assay kit (Thermo Fisher Scientific). Protein extracts (15 μg) were mixed with Laemmli sample buffer (Bio-Rad) without reducing agent, loaded and separated on 4%-20% Mini-PROTEAN TGX precast protein gels (Bio-Rad). Proteins were transferred to polyvinylidene fluoride (PVDF) membranes (Bio-Rad) and blocked in 5% (w/v) non-fat dry milk (Bio-Rad) in TBS-T (Tris-Buffered Saline with 0.1% (v/v) Tween-20 (Sigma-Aldrich)) for 1 h at room temperature, followed by three 10 min washes in TBS-T. Three primary antibodies were evaluated: mouse anti-CD9 (Cat#10626D, RRID:AB_2532982), anti-CD81 (Cat#10630D, RRID:AB_2532984) and anti-apolipoprotein A1 (ApoA1, Cat#MIA1405, RRID:AB_11152905), all from Thermo Fisher Scientific. The membranes were incubated overnight at 4 °C in one of the primary antibodies diluted 1:1000 in 5% (w/v) bovine serum albumin (Sigma-Aldrich). Subsequently, membranes were washed three times in TBS-T and incubated for 1 h at room temperature with secondary HRP-linked antibody against mouse IgG (1:2000 in blocking buffer, Cell Signaling Technology (MA, USA) Cat#7076, RRID:AB_330924). After three additional washes in TBS-T, the membranes were scanned using the LAS-3000 Imaging system (Fujifilm, Japan).

Transmission electron microscopy (TEM)
EVs isolated from blood plasma were analysed using TEM by simple drop casting. In a typical preparation, 1 μL of sample was drop cast directly onto a copper grid (Carbon support film square grid, 400 mesh, 5-6 nm, copper, Electron Microscopy Sciences) and allowed to dry at room temperature overnight. Dried samples were then imaged with a JEOL 2100Plus (JEOL, Japan) transmission electron microscope at 200 kV.

Lipid extraction
Lipids were extracted from EVs and cells based on the Bligh and Dyer method (Bligh & Dyer, 1959). To summarise, 3.75 mL of chloroform:methanol (1:2, v/v) was added to 1 mL of the sample homogenized in LC-MS grade water and vortexed. Samples were processed on ice and incubated for 30 min for the extraction of the lipids into the organic phase. After the incubation time, 1.25 mL of chloroform was added, and the samples were vortexed. Next, 1.25 mL of LC-MS grade water was added and the mixture vortexed again. The samples were centrifuged at 1000 × g for 10 min at 4 °C to separate the chloroform layer (bottom layer) containing the lipids from the aqueous or methanolic layer (top layer) containing non-lipids. In parallel, 'blank samples' were prepared by replacing the sample with 1 mL of LC-MS grade water to allow a washing step to be included and obtain a much cleaner sample of lipids. The washing step consisted of interchanging the organic and aqueous layers between the 'blank samples' and the samples, and then centrifuging at 1000 × g for 10 min at 4 °C. After the final centrifugation, the organic layer was collected and dried using nitrogen. The lipids extracted from mammary epithelial cells and their EVs were reconstituted in acetonitrile/isopropanol (1:1, v/v), and the samples were normalised based on the protein concentration estimated from the protein disc obtained during the lipid extraction. Lipids extracted from blood plasma EVs were reconstituted in LC-MS grade water/isopropanol (1:5, v/v) based on the number of EVs per sample.
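A quick volume check shows how the additions above arrive at the solvent proportions usually quoted for a Bligh and Dyer extraction. The sketch below is only an arithmetic illustration; the function name is hypothetical and the sample is treated as pure water.

```python
def bligh_dyer_volumes(sample_ml=1.0, solvent_ml=3.75,
                       extra_chloroform_ml=1.25, extra_water_ml=1.25):
    """Final chloroform, methanol and water volumes (mL) after all additions.

    The initial solvent is chloroform:methanol 1:2 (v/v), so one third of it
    is chloroform and two thirds is methanol. The sample counts as water.
    """
    chloroform = solvent_ml / 3 + extra_chloroform_ml
    methanol = solvent_ml * 2 / 3
    water = sample_ml + extra_water_ml
    return chloroform, methanol, water


chloroform, methanol, water = bligh_dyer_volumes()

# Express the ratio relative to 2 parts chloroform for easier comparison.
scale = 2.0 / chloroform
ratio = (chloroform * scale, methanol * scale, water * scale)
print("chloroform:methanol:water = %.1f:%.1f:%.1f" % ratio)
```

With the volumes given in the text this yields 2.5 mL chloroform, 2.5 mL methanol and 2.25 mL water, that is, a 2:2:1.8 ratio.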

UPLC-MS and UPLC-MS/MS analyses
Lipids extracted from cells and their released EVs, as well as from EVs isolated from blood plasma, were analysed by Ultra Performance Liquid Chromatography (UPLC)-MS. An ACQUITY UPLC System coupled to a Waters XEVO G2 Q-TOF mass spectrometer (Waters Corporation, MA, USA) was used. The liquid chromatographic separation of the compounds was carried out using an ACQUITY UPLC BEH C8 Column (1.7 μm, 2.1 × 100 mm, column temperature 55 °C). Mobile phase A consisted of LC-MS grade water:acetonitrile:isopropanol (2:1:1, v/v/v), 5 mM ammonium acetate, 0.05% acetic acid and 20 μM phosphoric acid. Mobile phase B consisted of acetonitrile:isopropanol (1:1, v/v), 5 mM ammonium acetate and 0.05% acetic acid. The binary gradient profile for the LC-MS analyses of the mammary cells and their EVs was 99.9% mobile phase A (0.0-2.0 min), 70% A (2.0-11.5 min), 10% A (11.5-12.0 min), 0.1% A (12.0-12.50 min) and 99.9% A (12.50-14.6 min). LC-MS analyses of lipids extracted from EVs isolated from blood plasma were carried out using the facilities and methodologies defined by the National Phenome Centre at Imperial College London for the untargeted lipidomic analysis of human blood plasma. The binary gradient profile was 99% mobile phase A (0.0-2.0 min), 70% A (2.0-11.5 min), 10% A (11.5-12.0 min), 0.1% A (12.0-12.50 min), 35% A (12.50-12.55 min), 70% A (12.55-12.65 min), 99% A (12.65-12.75 min) and 99% A (12.75-13.25 min). For both sets of analyses, the MS and MS/MS data were acquired for 0.1 s in centroid mode in both positive electrospray ionization mode (ESI+) and negative electrospray ionization mode (ESI−). Mass spectrometer source parameters were as follows: capillary voltage 2 kV (ESI+) or 1.5 kV (ESI−), sampling cone voltage 25 V, source temperature 120 °C, desolvation temperature 600 °C, cone gas flow 150 L/h and desolvation flow 1000 L/h. Quality Control (QC) samples containing a pool of all the samples evaluated and lipid internal standards were included in the set of samples analysed, in order to assess the quality of the data throughout the whole running time. Data-dependent acquisition (DDA) was performed on QC samples to carry out MS/MS experiments to identify lipid species. A collision energy ramp of 22-37 V was applied for the fragmentation of the parent ions for both electrospray ionization modes. In addition, individual MS/MS experiments were also carried out for the relevant lipid species identified.

Data analysis
MS data pre-processing was undertaken using XCMS open-source software (Smith et al., 2006). The XCMS outputs included m/z, retention time and intensity for each peak detected. Features with coefficients of variation (CV%) in QC samples higher than 30% (non-reproducible measurements) were removed from the outputs. Sample intensities were normalised by probabilistic quotient normalisation for univariate analysis. For principal components analysis (PCA), the intensities were also log10 transformed. Univariate and multivariate analyses were carried out using the Python programming language (IPython, RRID:SCR_001658) and the package scikit-learn (Pedregosa et al., 2011). To identify the lipid species significantly enriched in EVs when compared to their parental cells, log2 fold changes were calculated for each detected feature based on the mean of the normalised relative abundances for each group studied (i.e., cells and EVs). A threshold of log2 fold change > 2 or < −2 was defined for lipid enrichment in EVs or in cells, respectively. Statistically significant differences between the means of EVs and cells were calculated by one-way ANOVA. P-values were adjusted by Benjamini-Hochberg correction (q-values) and the threshold of statistical significance was set at q-value < 0.05. The Spearman rank-order correlation coefficient was calculated for paired cells and their secreted EVs to identify lipids correlated between these two groups (i.e., cells and EVs). Logistic regression classification with recursive feature elimination (LR-RFE) analysis was performed to select a subset of the most relevant features for the analysis between groups (i.e., cancer and non-cancer, or between the three breast cancer subtypes studied). To evaluate the performance of the LR-RFE model, leave-one-group-out cross-validated LR classification was performed for the study of EVs and their parental cells (biological replicates were defined as one group) and leave-one-individual-out cross-validated LR classification for the study of EVs found in blood plasma. The Area Under the Receiver Operating Characteristic (AUROC) curve was determined to measure the accuracy of the LR classification models generated with a reduced number of features. The LC-MS datasets obtained by ESI+ and ESI− modes were fused by concatenation, with all variables being auto-scaled (subtraction of mean and division by standard deviation) for LR-RFE classification; box plots also show auto-scaled data. Lipid annotations were performed using in-house databases and matching them with online lipid databases (LIPID MAPS, HMDB and METLIN) and by manual verification of the MS/MS spectra. Mann-Whitney U tests were performed to evaluate whether significant differences were observed in the number of EVs isolated from the blood plasma samples of the three groups studied: (i) healthy volunteers, (ii) primary breast cancer and (iii) progressive metastatic breast cancer.
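The QC-based feature filter and the probabilistic quotient normalisation described above can be sketched in a few lines. This is a minimal illustration using only the standard library, not the authors' pipeline: the function names are hypothetical, the CV uses the sample standard deviation, and the PQN reference profile is assumed to be the per-feature median across the supplied samples.

```python
from statistics import mean, median, stdev


def cv_percent(values):
    """Coefficient of variation in percent (sample standard deviation)."""
    return 100.0 * stdev(values) / mean(values)


def reproducible_features(qc_matrix, max_cv=30.0):
    """Indices of features whose CV% across QC injections does not exceed max_cv.

    qc_matrix is a list of QC injections, each a list of feature intensities.
    """
    n_features = len(qc_matrix[0])
    return [j for j in range(n_features)
            if cv_percent([row[j] for row in qc_matrix]) <= max_cv]


def pqn(matrix):
    """Probabilistic quotient normalisation of a samples x features matrix.

    Each sample is divided by the median of its feature-wise quotients
    against a reference profile (assumed here: per-feature median).
    """
    n_features = len(matrix[0])
    reference = [median(row[j] for row in matrix) for j in range(n_features)]
    normalised = []
    for row in matrix:
        quotients = [v / r for v, r in zip(row, reference) if r > 0]
        factor = median(quotients)
        normalised.append([v / factor for v in row])
    return normalised


# Toy data: feature 0 is reproducible in QCs, feature 1 is not.
qc = [[100.0, 10.0], [110.0, 30.0], [90.0, 20.0]]
kept = reproducible_features(qc)

# Toy samples that differ only by an overall dilution factor.
samples = [[1.0, 2.0, 4.0], [2.0, 4.0, 8.0], [4.0, 8.0, 16.0]]
normalised = pqn(samples)
```

In the toy example the three samples become identical after normalisation, which is the intended effect when samples differ only in overall concentration.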

RESULTS
Untargeted lipidomic analysis using reversed-phase LC-MS was carried out for cancerous and non-cancerous human mammary epithelial cells and their secreted EVs, and subsequently for EVs found in blood plasma from women with breast cancer as well as healthy women. The methodology implemented in this study allowed the identification of a wide range of lipids. These included sphingolipids, among them sphingomyelins (SM), ceramides (Cer) and hexosylceramides (HexCer); glycerophospholipids, including PS, phosphatidylcholines (PC), ether-PC, phosphatidylinositols (PI), phosphatidylethanolamines (PE), ether-PE, phosphatidic acids (PA), phosphatidylglycerols (PG), lysophosphatidic acids (LPA), lysophosphatidylethanolamines (LPE) and lysophosphatidylcholines (LPC); glycerolipids, including triglycerides (TG); and fatty acids (FA).

Sphingolipids and glycerophospholipids are the most abundant lipids in breast cancer-derived EVs
Initially, an in vitro study was carried out to understand the lipid composition of EVs produced by breast cancer cells. For this, an untargeted lipidomic analysis of a comprehensive panel of cultured cancerous human mammary epithelial cell lines and their secreted EVs was carried out. The cell panel included nine breast cancer cell lines (MCF-7, T47D, HCC1954, JIMT-1, MDA-MB-436, MDA-MB-231, MDA-MB-468, HCC1937 and Hs578T) representative of the ER+/PR+, HER2+ and TNBC molecular subtypes. The EVs produced in vitro by the mammary epithelial cells analysed in this study had a vesicular morphology and were positive for the EV marker proteins CD9, CD63 and CD81, as previously reported for this cell panel (Penders et al., 2021). The PCA carried out to understand the distribution of the LC-MS data in both ESI+ and ESI− modes showed a separation between EVs and cells based on their lipid compositions (Figure 1a). PCA showing the distribution of the LC-MS data for each biological replicate in both ESI+ and ESI− modes is shown in Figure S1. A volcano plot was generated to display the lipids found to be significantly (q-value < 0.05) enriched in breast cancer-derived EVs (log2 fold change > 2) or in their parental cells (log2 fold change < −2), as depicted in Figure 1b. Sphingolipids (Cer, SM and HexCer) and glycerophospholipids (LPC, LPE, PC, ether-PE and PI) were significantly enriched in EVs when compared to their parental cells (Figure 1b and Table S1). In contrast, cells were mainly enriched in TG, FA and PG when compared to EVs (Figure 1b and Table S2). EVs could be produced by multivesicular bodies, the plasma membrane as well as the Golgi apparatus, and consequently do not represent the entirety of cellular components.
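The two quantities plotted in a volcano plot of this kind, the log2 fold change and the Benjamini-Hochberg adjusted p-value, can be computed as in the sketch below. It is an illustration with hypothetical function names and toy inputs; the p-values themselves would come from the one-way ANOVA described in the methods, which is not reproduced here.

```python
from math import log2


def log2_fold_change(ev_values, cell_values):
    """log2 of the ratio of group means (EVs over cells)."""
    ev_mean = sum(ev_values) / len(ev_values)
    cell_mean = sum(cell_values) / len(cell_values)
    return log2(ev_mean / cell_mean)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def enrichment_call(log2fc, q_value, fc_cut=2.0, q_cut=0.05):
    """Label a feature using the thresholds stated in the text."""
    if q_value < q_cut and log2fc > fc_cut:
        return "EV-enriched"
    if q_value < q_cut and log2fc < -fc_cut:
        return "cell-enriched"
    return "not significant"


q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])   # toy p-values
fc = log2_fold_change([8.0, 8.0], [1.0, 1.0])       # toy abundances
label = enrichment_call(fc, q[0])
```

Features labelled "EV-enriched" or "cell-enriched" correspond to the upper right and upper left regions of the volcano plot, respectively.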
To identify lipids which were highly correlated between EVs and their parental cells, the cells and their respective secreted EVs were paired and Spearman rank-order correlation coefficients calculated (Table S3). Ether-PC, ether-PE and HexCer showed a very strong (Spearman r_s > 0.9) correlation between EVs and their parental cells (Table S3). In addition, PE, PC and PA species were also identified within those lipids that had a very strong (Spearman r_s > 0.8) correlation between cells and EVs. This suggests that the lipid compositions of cells and their secreted EVs are to some extent related and that the lipid arrangement in EVs could be connected to their cellular origin, including their cancerous origin.
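For readers unfamiliar with the statistic, the Spearman coefficient is the Pearson correlation of the ranks. The sketch below shows this for one lipid feature measured in paired cells and EVs; the function names and the toy intensities are hypothetical, and ties receive average ranks.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank-order correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy intensities of one lipid in four cell lines and in their paired EVs.
cells = [1.0, 2.0, 3.0, 4.0]
evs = [1.0, 4.0, 9.0, 16.0]
r_s = spearman(cells, evs)
```

Because only ranks matter, any monotonic relationship between the cell and EV intensities gives a coefficient of 1, even when the relationship is not linear.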

EVs and their parental cells can be correctly classified into cancerous and non-cancerous
To understand whether mammary epithelial cells and their secreted EVs could be classified into cancerous and non-cancerous, we analysed the lipid composition of cancerous (MCF-7, T47D, HCC1954, JIMT-1, MDA-MB-436, MDA-MB-231, MDA-MB-468, HCC1937 and Hs578T) and non-cancerous human mammary epithelial cells (MCF-10A and HuMEC) as well as their secreted EVs. The LC-MS data from the ESI+ and ESI− modes were fused and auto-scaled to ensure equal contribution of features in both datasets. An LR-RFE analysis for the fused and auto-scaled LC-MS datasets allowed us to identify PE and PC species which allowed EVs and cells to be correctly classified into cancerous or non-cancerous (Figure 2a obtained for PE(18:2_22:3) and PC(16:1_22:6) when analysed separately, indicating these are the main drivers of the classification of EVs and cells into cancerous and non-cancerous. Additionally, a leave-one-group-out cross-validated LR classification analysis showed that the EVs secreted by the non-cancerous and cancerous mammary epithelial cells can be correctly classified with an overall accuracy of 87.9% (Figure 2c), based on the sphingolipids found enriched in breast cancer-derived EVs when compared to their parental cells (Figure 1b, Table S1 and Figure S2).
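A workflow of this kind (auto-scaling, LR-RFE feature selection and leave-one-group-out cross-validation) can be sketched with scikit-learn as below. This is an illustration on synthetic data, not the authors' code: the number of selected features, `max_iter`, the random data and the choice to repeat scaling and selection inside every fold are all assumptions.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def logo_accuracy(X, y, groups, n_features=2):
    """Overall accuracy of LR with RFE under leave-one-group-out cross-validation.

    Scaling and feature selection are refitted on each training fold so that
    the held-out group never influences them (a design choice of this sketch).
    """
    correct = 0
    for train, test in LeaveOneGroupOut().split(X, y, groups):
        model = make_pipeline(
            StandardScaler(),  # auto-scaling: subtract mean, divide by SD
            RFE(LogisticRegression(max_iter=1000), n_features_to_select=n_features),
            LogisticRegression(max_iter=1000),
        )
        model.fit(X[train], y[train])
        correct += int((model.predict(X[test]) == y[test]).sum())
    return correct / len(y)


# Synthetic stand-in: 6 hypothetical cell lines x 3 biological replicates,
# 20 features, of which the first two separate the classes.
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(6), 3)
y = (groups >= 3).astype(int)
X = rng.normal(size=(18, 20))
X[:, :2] += 6.0 * y[:, None]

accuracy = logo_accuracy(X, y, groups)
print(f"leave-one-group-out accuracy: {accuracy:.3f}")
```

Treating all replicates of a cell line as one group keeps replicates of the same line from appearing in both the training and the test fold, which would otherwise inflate the reported accuracy.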

EVs and their parental cells can be correctly classified into their respective breast cancer subtype
For the fused LC-MS datasets, an LR-RFE analysis was performed to study whether EVs and their parental cancerous cells could be correctly classified into the three breast cancer subtypes studied (ER+/PR+, HER2+ and TNBC), based on their lipid composition. The LR-RFE model allowed the identification of phospholipid species contributing to the classification of the cancerous cells and EVs into their respective breast cancer subtypes (Figure 3a). These relevant lipid species included PE( 14 3b). In addition, a leave-one-group-out cross-validated LR classification analysis showed that breast cancer-derived EVs can be classified into their respective breast cancer subtype with an overall accuracy of 70.4% (Figure 3c), based on the sphingolipids found enriched in breast cancer-derived EVs (Figure 1b, Table S1 and Figure S3).
[Figure caption fragment] ... S1. The overall accuracy of the model is 87.9%. Biological replicates were defined as one group for leave-one-group-out cross-validated LR classification analysis. As per the scale on the right-hand side of the confusion matrix, the colour is driven by the number of observations (n) rather than the percentages.

There is potential in the lipid composition of blood plasma EVs for breast cancer detection
Subsequently, we studied EVs isolated from blood plasma from women diagnosed with breast cancer (primary and progressive metastatic breast cancer samples), and healthy women. The blood plasma from breast cancer patients was obtained from patients with invasive ductal carcinoma (IDC), which is the most common histological type of breast cancer, as well as invasive lobular carcinoma (ILC) and ductal carcinoma in situ (DCIS). These samples included hormone receptor-positive breast cancers (ER+/PR+), HER2+ breast cancers as well as TNBC. Table 1 summarises the pathological characteristics. The EVs isolated from blood plasma were found positive for the EV protein markers CD9 and CD81, and depleted in ApoA1, which is a major protein component of circulating HDL (Figure 4a). These EVs had a density of approximately 1.18-1.23 g/mL and showed a vesicular morphology (Figure 4b). They had an average size of 169.4 nm (mode = 134.9 nm) (Table S4). Significant differences were observed in the number of EVs isolated from blood plasma samples from the healthy volunteers (mean = 2.1E+9 particles/mL) when compared to the primary breast cancer patients (mean = 8.5E+9 particles/mL, p-value < 0.0005) and when compared to the patients with progressive metastatic breast cancer (mean = 3.8E+9 particles/mL, p-value < 0.005) (Figure 4c and Table S4). These results could be attributed to the fact that cancer cells as well as other types of cells triggered by the disease produce EVs, which could contribute to a higher number of EVs circulating in blood from cancer patients. Similarly, significant differences (p-value < 0.05) were observed in the number of EVs isolated from blood plasma samples from primary breast cancer patients when compared to the patients with progressive metastatic breast cancer (Figure 4c and Table S4). The metastatic samples are from breast cancer patients who were under treatment but not responding to it, and the effect of the treatment on EV production by cancer cells and other cells triggered by the disease is not clear. Conclusions should not be drawn from the number of EVs produced but from their biomolecular composition, the EVs' cellular origin and the patients' clinical information.
[Figure caption fragment] ... S1. Biological replicates were defined as one group for cross-validation analysis. As per the scale on the right-hand side of the confusion matrix, the colour is driven by the number of observations (n) rather than the percentages.
Untargeted lipidomic analysis was carried out for the EVs isolated from blood plasma from breast cancer patients (N = 19) and healthy volunteers (N = 10). PCA for the LC-MS data in both ESI+ and ESI− modes showed a separation between breast cancer samples and samples from healthy women, for most of the samples evaluated (Figure 5a). The LC-MS data from the ESI+ and ESI− modes were fused and auto-scaled, and then an LR-RFE analysis was performed. This analysis allowed the identification of phospholipid species which distinguish breast cancer samples from the samples from healthy volunteers (Figure 5b). Leave-one-individual-out cross-validated LR classification showed that EVs from breast cancer patients and healthy volunteers can be correctly classified with an overall accuracy of 93.1% based on those lipid species (Figure 5c). In addition, the combination of those phospholipids showed an AUROC equal to 0.94 (Figure 5d).
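The two summary metrics reported here can be illustrated with a short sketch. The AUROC is computed as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one; the scores below are toy values, and the 27-of-29 count is only the arithmetic that would be consistent with an accuracy of 93.1% for 29 individuals, not a figure taken from the confusion matrix.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def overall_accuracy(n_correct, n_total):
    """Fraction of held-out individuals classified correctly."""
    return n_correct / n_total


# Toy predicted probabilities of "breast cancer" for four held-out individuals.
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_labels = [1, 1, 0, 0]
toy_auc = auroc(toy_scores, toy_labels)

# With 19 patients and 10 healthy volunteers, 27 correct calls out of 29
# would round to the reported 93.1% (an inference, not a reported count).
pct = round(100 * overall_accuracy(27, 29), 1)
```

Accuracy depends on the chosen decision threshold, whereas the AUROC summarises performance across all thresholds, which is why the two values need not coincide.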
Importantly, the same phospholipids can also distinguish primary breast cancer patients from healthy volunteers (Figure 6a). Leave-one-individual-out cross-validation analysis showed that samples from primary breast cancer patients (N = 10) and healthy volunteers (N = 10) can be correctly classified with an overall accuracy of 95% (Figure 6b). Similarly, a ROC analysis showed an AUROC equal to 0.97 for the combination of the five phospholipids (Figure 6c). The primary breast cancer samples included samples from early-stage breast cancer patients, including patients with negative lymph nodes and patients with DCIS (Table 1).

Classification of primary and metastatic breast cancers based on blood plasma EVs' lipid composition
An LR-RFE analysis showed that blood plasma EVs' phospholipid and ceramide species (LPC(16:1) [M-H]−, LPC(O-18:2) [M+OAc]−, PS(18:1/18:1) [M-H]−, PS(18:0_18:2) [M-H]− and HexCer(36:1) [M+K]+) allow us to distinguish between primary and metastatic breast cancers (Figure 7a). The leave-one-individual-out cross-validation analysis for the LR-RFE model showed that primary and metastatic cancers can be classified with an overall accuracy of 89.5% (Figure 7b). In addition, a ROC analysis showed an AUROC equal to 0.97 for the combination of those phospholipid and ceramide species (Figure 7c).
TABLE 1 Pathological characteristics of breast cancer samples evaluated. Ten samples from patients with primary cancer were evaluated, as well as nine samples from patients with progressive metastatic breast cancer. The information in the table for progressive metastatic breast cancer samples corresponds to the information reported for the initial breast cancer diagnosis (primary cancer).
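Because leave-one-individual-out cross-validation yields exactly one prediction per sample, each reported overall accuracy should correspond to a whole number of correct predictions. The sketch below is a consistency check only, assuming that overall accuracy is the number of correct predictions divided by the number of samples; the function name is hypothetical.

```python
def consistent_correct_counts(n_samples, reported_pct, decimals=1):
    """Return every count of correct predictions that rounds to reported_pct."""
    return [
        k
        for k in range(n_samples + 1)
        if round(100 * k / n_samples, decimals) == round(reported_pct, decimals)
    ]


# (number of samples, reported overall accuracy in %) for the three comparisons.
comparisons = {
    "breast cancer vs healthy": (29, 93.1),
    "primary vs healthy": (20, 95.0),
    "primary vs metastatic": (19, 89.5),
}

for label, (n, pct) in comparisons.items():
    print(label, consistent_correct_counts(n, pct))
```

Under that assumption the reported accuracies correspond to 27 of 29, 19 of 20 and 17 of 19 correct predictions, respectively.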

DISCUSSION
EVs' lipid composition is an underexplored field, including the potential of the lipid composition of EVs circulating in blood for breast cancer diagnosis. We demonstrate that untargeted lipidomics is a useful approach for the study of EVs' lipid composition and reveal that there is potential in the lipid composition of EVs found in blood plasma for the identification of lipid biomarkers for breast cancer detection. Initially, we studied the lipid composition of EVs and their cancerous parental cell lines, and we found that sphingolipids (Cer, SM and HexCer) and glycerophospholipids (LPC, LPE, PC, ether-PE, and PI) are enriched in EVs produced by breast cancer cells. Similar results have been reported in the study of EVs secreted by prostate cancer cells (Brzozowski et al., 2018). High abundances of sphingolipids have also been detected in other studies of cancer-derived EVs (Brzozowski et al., 2018; Hosseini-Beheshti et al., 2012; Skotland et al., 2017), including EVs produced by TNBC cells (Nishida-Aoki et al., 2020). Ether phospholipids were also identified in EVs, which together with SM could play an important role in cellular signalling as well as in the stability of the EVs' lipid bilayer, as their role as endogenous antioxidants has been suggested (Ogretmen, 2018; Dean & Lodhi, 2018). Ether phospholipids have also been identified in EVs released by prostate cancer cell lines (Llorente et al., 2013) and urinary EVs from prostate cancer patients (Skotland et al., 2017). Similarly, lysophospholipids were identified in breast cancer-derived EVs; they can act as signalling molecules, but they could also participate in the EV membrane curvature (Fuller & Rand, 2001). Lysophospholipids have also been identified in prostate cancer-derived EVs (Brzozowski et al., 2018; Llorente et al., 2013) and EVs found in blood plasma (Jakubec et al., 2020).
In contrast to EVs, breast cancer cells were enriched in TG and FA. A high content of TG (Eiriksson et al., 2020) and FA (Guenther et al., 2015) was previously reported in cancer cells. Active cell proliferation is fundamental in cancer development and progression and requires the continuous supply of FA (Koundouros & Poulogiannis, 2020). Arachidonic acid (AA) was one of the FA found to be enriched in cells when compared to EVs. AA is important in cancer development and activation of the PI3K/AKT/mTOR signalling pathway, whose over-activation contributes to cell proliferation, growth, and cell migration in tumour cells, including breast cancer (Fruman et al., 2017; Wymann & Schneiter, 2008). Cells and their secreted EVs can differ in their lipid composition, but there are also lipid species correlated between them. Ether phospholipid species showed a very strong correlation between EVs and their parental cells. A high level of ether lipids has been observed in cancer cells and their participation in cell differentiation and signalling pathways has been suggested (Benjamin et al., 2013; Dean & Lodhi, 2018). In addition, PC and PE species also showed a strong correlation between cells and their EVs. PC and PE are fundamental lipids in biological membranes, which include EV membranes, but they are also a source of signalling molecules. PE is important in cell curvature and fluidity, facilitating cell budding, fission and fusion (van Meer et al., 2008), and the same function could be played in EVs.
Our in vitro study showed that phospholipid species allow correct classification (100% accuracy) of mammary epithelial cells and their secreted EVs based on their cancerous or non-cancerous origin. In addition, we also found that the EVs produced by the non-cancerous and cancerous cells can be correctly classified with an overall accuracy of 87.9%, based on the sphingolipid species found enriched in EVs when compared to their parental cancer cells. This reflects the fact that metabolic changes are often found in breast cancer tissue (Kang et al., 2011; Nagahashi et al., 2016). We also found that cancerous mammary epithelial cells and their secreted EVs can be correctly classified into their respective ER+/PR+, HER2+ or TNBC subtypes based on their lipid composition. This is in agreement with other studies that have suggested a differentiation of breast cancer subtypes based on the lipid profile of breast cancer cells (Eiriksson et al., 2020). Although the number of cell lines studied per breast cancer subtype is low, and the test can be underpowered because of this, the results of this in vitro study suggest that cancer-derived EVs carry metabolic phenotype information from the cancer subtype from which they originated.
We translated our in vitro approach and findings into the lipidomic analysis of EVs found in blood plasma from breast cancer patients and healthy women. We found that blood plasma EVs' lipid species allowed us to correctly classify patients with breast cancer and healthy women with an overall accuracy of 93.1%. This included detection of ER+/PR+, HER2+ and TNBC from patients with primary or progressive metastatic cancer. Importantly, we found that when analysing only the samples from the primary breast cancer patients and healthy women, the same combination of lipids allowed us to correctly classify the samples with an overall accuracy of 95%. Only one sample out of the ten primary breast cancer samples analysed was misclassified. The patient's clinical history indicates that adjuvant treatment was given to the patient before the breast surgery, which suggests that the treatment may have affected the EV composition and/or production. It is important to highlight the fact that our lipidomic analysis of plasma EV-based liquid biopsies allowed correct classification of samples from patients with primary cancer at an early stage (0-2); this includes pre-invasive breast cancer (DCIS, stage 0) and cancers that had not spread to the lymph nodes. Additionally, we found that primary and progressive metastatic breast cancers can be correctly classified with an overall accuracy of 89.5% based on blood plasma EVs' phospholipid and ceramide species. These findings demonstrate that EVs circulating in breast cancer patients' blood are a promising source of lipid biomarkers for breast cancer detection, including primary and metastatic cancers, with potential application in detection of breast cancer at an early stage as well as monitoring its progression. Among the lipids identified in blood plasma EVs relevant in breast cancer detection we found PC and PE, which have also been detected in breast cancer tissues as highly abundant lipids (Ide et al., 2013; Punnonen et al., 1989). PE, PC and PI could act as precursors of second messengers including PA, DG and AA, which participate in the activation of signalling pathways including PI3K/AKT/mTOR (Foster, 2009; Fruman et al., 2017; Wymann & Schneiter, 2008). PS was found in higher levels in metastatic breast cancers compared to primary breast cancers. The importance of PS as a marker of tumour cells and metastases has been suggested (Riedl et al., 2011). PS species have also been identified in prostate cancer-derived EVs and urinary EVs from prostate cancer patients (Llorente et al., 2013; Skotland et al., 2017). LPA(21:0) was identified as one of the relevant lipids in breast cancer detection. LPA has also been identified in EVs found in human blood plasma in other studies (Jakubec et al., 2020). Similarly, odd-carbon lipid species have also been detected in both human blood plasma (Huynh et al., 2019) and EVs found in blood plasma (Jakubec et al., 2020). Odd-chain fatty acids could be produced endogenously or could be associated with the patients' diet and/or microbiota-derived vesicles. The key role of the human microbiome in carcinogenesis is a growing area of scientific focus (Xavier et al., 2020), and it has been suggested it may play a role in the risk of estrogen-dependent cancers (Plottel & Blaser, 2011). Interestingly, LPA has been associated with breast cancer metastasis, breast cancer treatment failure and postmenopausal breast cancers (Moolenaar et al., 2004). LPA receptors are abundant in breast cancer and have been associated with activation and upregulation of PI3K/AKT, p38-MAPK and ERK/MAPK signalling pathways, leading to cancer development and progression (Panupinthu et al., 2010).
EVs circulating in blood from cancer patients provide a snapshot of the complex cellular mechanisms triggered by the disease; this includes the release of EVs not only from cancer cells but also from non-malignant cells such as immune cells. For instance, neutrophil-derived EVs could be detected in blood plasma. It has been suggested that a high neutrophil to lymphocyte ratio is associated with a poor breast cancer prognosis (Gago-Dominguez et al., 2020). In addition, tumour-associated neutrophils can act as tumour promoters depending on the microenvironment, and this cellular crosstalk can be facilitated by both cancer and neutrophil-derived EVs (Rubenich et al., 2021). This reinforces the importance of the analysis of EVs found in breast cancer patients' body fluids, as they are a representative picture of all the molecular and cellular processes involved in cancer. This explains differences that can be found between in vitro and clinical studies. It also demonstrates the importance of appropriate selection of methods of isolation of EVs from blood plasma that deplete the co-isolation of blood plasma components non-associated with the EV protein corona. This also applies for direct analysis of blood plasma, as EV-associated cancer biomarkers could be masked by other major components of blood plasma.
The identification of biomarker candidates for breast cancer detection was beyond the scope of this proof-of-concept study. It serves as a starting point for future clinical studies which will include analysis of larger cohorts of breast cancer patients as well as those with benign breast conditions. Such studies should also include the analysis of tumour tissues as well as blood plasma EVs from patients before and after breast surgery, to establish whether there is a direct relationship between the presence of a tumour in the body and EV-associated cancer biomarkers circulating in the patients' blood. This analysis of blood plasma EV populations combined with studies of multiple types of cancers will be required for identification of specific EV-based biomarkers for breast cancer detection.
In conclusion, this proof-of-concept study reveals the potential in the lipid composition of EVs found in blood plasma for breast cancer detection. Furthermore, this study demonstrates that untargeted lipidomic analysis is a useful approach for the study of the lipid composition of EVs and for EV-based biomarker discovery studies.
,b). The relevant lipid species included PE(18:2_22:3) [M-H]−, PC(16:1_22:6) [M+H]+, PE(O-18:2_22:6) [M-H]− and PC(18:0_18:1) [M+OAc]−. The leave-one-group-out cross-validated LR classification, carried out to evaluate the performance of the LR-RFE model, showed that EVs and cells can be classified into cancerous or non-cancerous with 100% accuracy, based on those lipid species. In addition, ROC analysis based on the combination of these four relevant lipid species showed an outstanding AUROC value equal to 1, indicating the ability of these PE and PC species to distinguish EVs and cells between cancerous and non-cancerous. Equally, an AUROC equal to 1 can be
FIGURE 1 Lipidomic analysis of EVs and their parental cells. (a) PCA showing the LC-MS data distribution for both cells (N = 11 cell lines and 3 biological replicates; n = 33) and their secreted EVs (n = 33). This includes both cancerous and non-cancerous cells and their EVs. The LC-MS data was acquired in both ESI+ and ESI− modes. (b) Volcano plot showing the lipids significantly (q-value < 0.05) enriched in breast cancer-derived EVs (log2 fold change > 2, red stars) when compared to their parental cells (log2 fold change < −2, purple stars). n = total number of observations.
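The volcano plot thresholds stated in the caption (q-value < 0.05 and an absolute log2 fold change above 2) can be expressed as a simple decision rule. The sketch below assumes that the fold change is the mean abundance in EVs divided by the mean abundance in cells; the function name and the example numbers are hypothetical.

```python
from math import log2


def volcano_call(mean_ev, mean_cell, q_value, fc_cut=2.0, q_cut=0.05):
    """Label a lipid from its EV/cell fold change and its adjusted p-value."""
    log_fc = log2(mean_ev / mean_cell)  # assumed direction: EV over cell
    if q_value < q_cut and log_fc > fc_cut:
        return "enriched in EVs"
    if q_value < q_cut and log_fc < -fc_cut:
        return "enriched in cells"
    return "not significant"


# Invented abundances: (mean in EVs, mean in cells, q-value).
examples = {
    "lipid_1": (8.0, 1.0, 0.01),   # log2 FC = 3
    "lipid_2": (1.0, 8.0, 0.01),   # log2 FC = -3
    "lipid_3": (8.0, 1.0, 0.20),   # large change, q-value too high
    "lipid_4": (3.0, 1.0, 0.01),   # significant, change too small
}

for name, (ev, cell, q) in examples.items():
    print(name, volcano_call(ev, cell, q))
```

Both criteria must hold at once, so a lipid with a large fold change but a q-value above the cut-off is not flagged.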

FIGURE 2 Classification of EVs and cells into cancerous and non-cancerous. (a) Box plots for the PE and PC species identified by LR-RFE analysis that distinguish EVs and cells between cancerous (N = 9 breast cancer cell lines and their respective EVs, 3 biological replicates; n = 54) and non-cancerous (N = 2 non-cancerous cell lines and their respective EVs, 3 biological replicates; n = 12). Box plots are based on the fused and auto-scaled ESI− and ESI+ datasets. (b) Confusion matrix of the leave-one-group-out cross-validated LR classification model of EVs and cells into cancerous (n = 54) and non-cancerous (n = 12) with an overall accuracy of 100%. (c) Confusion matrix of the leave-one-group-out cross-validated LR classification model of EVs into cancerous (EVs obtained from N = 9 breast cancer cell lines, 3 biological replicates; n = 27) and non-cancerous (EVs obtained from N = 2 non-cancerous cell lines, 3 biological replicates; n = 6) based on the sphingolipids found enriched in breast cancer-derived EVs indicated in Figure 1b and Table

FIGURE 3 Classification of breast cancer cells and their secreted EVs into their respective breast cancer subtypes. (a) Box plots for the five lipid species identified by the LR-RFE analysis that classifies breast cancer cells and their EVs into their respective breast cancer subtypes. Nine breast cancer cell lines were studied from different molecular subtypes: ER+/PR+ (N = 2 cell lines and their respective EVs, 3 biological replicates; n = 12), HER2+ (N = 2 cell lines and their respective EVs, 3 biological replicates; n = 12), and TNBC (N = 5 cell lines and their respective EVs, 3 biological replicates; n = 30). Box plots are based on the fused and auto-scaled ESI− and ESI+ datasets. (b) Confusion matrix of the leave-one-group-out cross-validated LR classification model of EVs and cells into breast cancer subtypes (100% accuracy). (c) Confusion matrix of the leave-one-group-out cross-validated LR classification of EVs into their respective breast cancer subtype (overall accuracy of 70.4%) based on the sphingolipids enriched in breast cancer-derived EVs indicated in Figure 1b and Table
Those phospholipids included PE(16:0_16:1) [M+OAc]−, PI(22:0_18:3) [M-H]−, PC(16:0_18:1) [M+K]+, LPA(21:0) [M-H]− and LPC(18:0) [M+H-H2O]+.

FIGURE 4 Characterisation of EVs isolated from blood plasma by combination of density gradient ultracentrifugation and size exclusion/bind-elute chromatography. (a) Western blot analysis of EV protein markers (CD9 and CD81) and ApoA1 in a pooled sample of all blood plasma (PL) samples, and a pooled sample of EVs (EV) isolated from the blood plasma samples studied. (b) EV morphology analysed by TEM. (c) Box plots for the distribution of the EV concentration per mL obtained from the blood plasma samples from healthy volunteers (HV, N = 10), patients with primary breast cancer (Primary, N = 10), and patients with progressive metastatic breast cancer (Metastatic, N = 9). P-values were obtained by Mann-Whitney U tests for comparisons between the groups studied (HV vs. metastatic, HV vs. primary, and primary vs. metastatic).
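For readers unfamiliar with the Mann-Whitney U test named in the caption, the sketch below computes the U statistic by pairwise comparison and an approximate two-sided p-value. It uses a normal approximation without tie or continuity correction, which is an assumption of this sketch; statistical packages may instead use exact p-values for groups of this size. The concentrations are invented and are not the study's measurements.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U for x, U for y, approximate two-sided p-value)."""
    n1, n2 = len(x), len(y)
    # U for x counts the pairs in which the x value exceeds the y value.
    u_x = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    u_y = n1 * n2 - u_x
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (min(u_x, u_y) - mean_u) / sd_u
    p_two_sided = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u_x, u_y, p_two_sided


# Hypothetical EV concentrations in units of 1E+09 particles/mL.
healthy = [1.8, 2.0, 2.3, 2.1, 1.9]
primary = [7.9, 8.5, 9.1, 8.0, 8.8]

u_primary, u_healthy, p_value = mann_whitney_u(primary, healthy)
print(u_primary, u_healthy, p_value)
```

The test depends only on the ordering of the values, which makes it a reasonable choice for particle counts that are unlikely to be normally distributed.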

FIGURE 5 Lipidomic analysis of blood plasma EVs from breast cancer patients and healthy volunteers. (a) PCA showing the LC-MS data distribution for the breast cancer patients' samples (BC, N = 19), and healthy volunteers (HV, N = 10), in both ESI+ and ESI− modes. BC samples include primary (N = 10) and progressive metastatic (N = 9) breast cancer samples. (b) Box plots for the five lipid species identified by the statistical model obtained by LR-RFE analysis which distinguish BC samples from samples from HV. Box plots are based on the fused and auto-scaled ESI− and ESI+ datasets. (c) Confusion matrix of the leave-one-individual-out cross-validated LR classification of BC and HV samples showing an overall accuracy of 93.1%. The colour scheme of the confusion matrix is driven by the number of observations rather than the percentages. (d) AUROC for the combination of the five relevant lipid species was equal to 0.94.
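The phrase "fused and auto-scaled" can be illustrated with a short sketch. It assumes that fusion means concatenating the ESI+ and ESI− feature columns for each sample and that auto-scaling means centring each feature on its mean and dividing by its sample standard deviation; the function names and intensities are hypothetical.

```python
from statistics import mean, stdev


def fuse(esi_pos, esi_neg):
    """Concatenate the ESI+ and ESI- feature blocks sample by sample."""
    return [pos + neg for pos, neg in zip(esi_pos, esi_neg)]


def autoscale(matrix):
    """Mean-centre each column and divide it by its sample standard deviation."""
    columns = list(zip(*matrix))
    centres = [mean(col) for col in columns]
    spreads = [stdev(col) for col in columns]  # a constant column would give 0 here
    return [
        [(value - c) / s for value, c, s in zip(row, centres, spreads)]
        for row in matrix
    ]


# Invented intensities: 3 samples, 2 ESI+ features and 1 ESI- feature.
esi_pos = [[10.0, 200.0], [20.0, 400.0], [30.0, 600.0]]
esi_neg = [[1.0], [2.0], [3.0]]

scaled = autoscale(fuse(esi_pos, esi_neg))
for row in scaled:
    print(row)
```

After scaling, every feature has zero mean and unit variance, so lipids with high raw intensities do not dominate the model simply because of their scale.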

FIGURE 6 Lipidomic analysis of blood plasma EVs from primary breast cancer patients and healthy volunteers. (a) Box plots for the five lipid species identified by the statistical model obtained by LR-RFE analysis that distinguish breast cancer patients from healthy volunteers (HV, N = 10), but also primary breast cancer patients (BC-primary, N = 10) from HV. Box plots are based on the fused and auto-scaled ESI− and ESI+ datasets. (b) Confusion matrix of the leave-one-individual-out cross-validated LR classification of primary breast cancer and HV samples showing an overall accuracy of 95%. The colour scheme of the confusion matrix is driven by the number of observations rather than the percentages. (c) AUROC curve for the combination of the five relevant lipid species was equal to 0.97.

FIGURE 7 Lipidomic analysis of blood plasma EVs from primary and metastatic breast cancer patients. (a) Box plots for the five lipid species identified by the statistical model obtained by LR-RFE analysis that distinguishes primary cancers (N = 10) from metastatic cancers (N = 9). Box plots are based on the fused and auto-scaled ESI− and ESI+ datasets. (b) Confusion matrix of the leave-one-individual-out cross-validated LR classification of primary breast cancers and metastatic cancers showing an overall accuracy of 89.5%. The colour scheme of the confusion matrix is driven by the number of observations rather than the percentages. (c) AUROC curve for the combination of the five relevant lipid species was equal to 0.97.

Histological grade Tumour size (mm) Histopathology ER/PR/HER Lymph node
Abbreviation: n.a., not available.
a Patient received adjuvant treatment before breast surgery.