Automatically digital extracellular vesicles analyzer for size‐dependent subpopulation analysis in surface plasmon resonance microscopy

The heterogeneity of extracellular vesicles (EVs) adds a level of complexity to small EV‐based liquid biopsy for cancer diagnosis. Surface plasmon resonance microscopy (SPRM) offers sensitive, multifunctional analysis of single small EVs. However, EV SPRM image analysis has been extremely laborious and time‐consuming because of the low contrast and poor signal‐to‐noise ratio. Herein, we present a digital EV analyzer (DEA), a software package for the automatic analysis of EV data on the SPRM platform. The package enables fully automated single EV identification, digital counting, EV sizing, and quantification of dwell time. Integrated with a classification algorithm and an intelligent search algorithm for subpopulation analysis, DEA was able to accurately group EVs of five different origins. We hope this software tool will promote the application of SPRM technology to EV‐based cancer diagnosis.

Heterogeneity in both size and composition hinders the accurate analysis of EVs' origin. 8,9 One of the main reasons is that the protein markers differ among size-dependent EV subpopulations. Currently, size and protein analyses are performed separately by techniques such as nanoparticle tracking analysis, 10 transmission electron microscopy, 11 enzyme-linked immunosorbent assay, and other fluorescence methods. Methods based on microfluidic devices can isolate size-based subgroups of EVs, but they are complicated and not widely used in common labs. 12 Other approaches based on surface plasmon resonance microscopy (SPRM) offer specific capture and sensitive imaging of single EVs without separating subsets. 13 However, the tiny size and low contrast of a single EV make SPRM images difficult to analyze. 15 Automated nanoparticle analysis of SPRM images has been explored with Au nanoparticles, which give an acceptable signal-to-noise ratio (SNR) owing to the localized surface plasmon resonance effect. 15 In contrast, the biggest problem in EV image analysis lies in the low contrast and low SNR, which cause failures in EV identification. Here, we present a software package for fully automatic digital EV analysis of SPRM images (Figure 1).

Overview of automatic digital EV analyzer in SPRM
As shown in Figure 1, EVs in biological samples are of mixed sizes and are usually isolated from cell culture medium or plasma by ultracentrifugation. In SPRM assays, the mixed-size EV solutions were analyzed directly. In a single measurement, EVs of different sizes were specifically captured onto sensor chips, and their intensities were recorded. An offline software package named digital EV analyzer (DEA) was developed to process the SPRM images. Finally, the size distributions, binding times, and dynamic tracking of single EVs were obtained. Combined with the search and classification algorithms, the EVs from different cell lines were grouped according to the similarity of their size-dependent surface protein signatures.

2.2
Workflow of the automatic EV image processing
DEA includes the following steps: 1) data import, 2) image processing, 3) EV identification, 4) single EV auto-tracking, 5) EV analysis of size, binding time, and dynamic tracking, and 6) EV classification (Figure 2). We provide a graphical user interface (Figure S1) that allows interactive parameter settings for image preprocessing. A set of experimental data (called a project, consisting of a stack of images) was imported and preprocessed to remove the static background with a differential algorithm. Spatial filtering and contrast enhancement were then applied to improve the image quality. EV identification was based on the YOLOv4 model implemented in the PyTorch framework on a Linux system. About 3500 manually labeled SPRM images of EVs with different sizes and contrast were fed into the model for training. After training, the model could automatically recognize and label all single EVs in a single input SPRM image. Information from the neighboring frames in the stack was used to determine the nature of each single EV event (binding or unbinding), as previously described. 15 The intensity of an EV was proportional to the increase in intensity from the former image (background) to the current one. This difference, evaluated at the EV location, was used as the size indicator for single EVs (Figure S2). After calibration with size curves from silica nanoparticles, the diameter distributions of EVs were obtained. To meet the demands of EV-based cancer diagnosis, size-dependent subpopulation analysis was integrated into DEA. With the EV classification function, EVs of different origins were grouped according to their unique size-based signatures. The classification accuracy and receiver operating characteristic (ROC) curves obtained with the different search algorithms were also analyzed.

FIGURE 2 Workflow of the automatic extracellular vesicle (EV) image processing. The EV analyzer for surface plasmon resonance microscopy (SPRM) includes 1) data import, 2) image processing, 3) EV identification, 4) single EV auto-tracking, 5) EV analysis of size, binding time, and dynamic tracking, and 6) EV classification.
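The differential background removal and the intensity readout in this workflow can be illustrated with a minimal sketch. It assumes frames are plain nested lists of pixel values; the function names `differential_stack` and `intensity_step` and the toy pixel values are hypothetical and are not taken from the DEA source code.

```python
from typing import List

Frame = List[List[float]]


def differential_stack(frames: List[Frame]) -> List[Frame]:
    """Subtract each frame from its successor so that static background cancels."""
    diffs = []
    for prev, curr in zip(frames, frames[1:]):
        diffs.append([[c - p for p, c in zip(prev_row, curr_row)]
                      for prev_row, curr_row in zip(prev, curr)])
    return diffs


def intensity_step(diff: Frame, row: int, col: int) -> float:
    """Intensity change at one location between two consecutive frames."""
    return diff[row][col]


# Toy 2x2 stack (made-up values): an EV binds at pixel (1, 1) in the last frame.
background = [[10.0, 10.0], [10.0, 10.0]]
with_ev = [[10.0, 10.0], [10.0, 13.5]]
frames = [background, background, with_ev]

diffs = differential_stack(frames)
print(diffs[0])                        # static frames cancel to zeros
print(intensity_step(diffs[1], 1, 1))  # intensity increase caused by the binding event
```

A positive step marks a binding event and a negative step an unbinding event at that location, which is how the neighboring-frame information can separate the two cases.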
Single nanoparticle identification and intensity analysis for silica nanoparticles and polystyrene nanoparticles (PSNPs) of low contrast
To obtain the desired EV recognition, a spatial filtering algorithm was integrated to improve the EV SNR (Figure 3A and Methods). Before spatial filtering, the SNR of an EV was 2.29; after spatial filtering, the SNR of the same EV increased to 12.97, a 5.6-fold increase (Figure 3B). As a model, PSNPs with a refractive index of 1.59 to 1.60 were tested with DEA (Figure 3C). The PSNP images show directly that the contrast and SNR decrease as the PSNP diameter decreases. For 50 nm PSNPs, the particles could hardly be distinguished from the background by eye even after image processing, but they were still identified by DEA. These results indicate that DEA is adequate for identifying nanoparticles with low contrast. The images of silica nanoparticles resemble those of EVs because of the similar refractive index. The intensities of silica nanoparticles with diameters of 30, 50, 70, 100, and 160 nm were therefore analyzed with DEA to establish the standard curve of EV size (Figure 3D). The distribution of the maximum intensity of the silica nanoparticles was calculated and recorded as they bound to the sensor chips. Because the silica nanoparticles are not uniform in size, the recorded intensity follows a Gaussian distribution. The peak value of the intensity distribution was used to construct the standard curve for EVs (Figure 3E).
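As an illustration of how such a standard curve can convert a measured intensity into a diameter, the sketch below interpolates linearly between calibration points. The five diameters are those of the silica standards named above; the peak intensities are made-up placeholders, and piecewise-linear interpolation is an assumption rather than the fitting method used in DEA.

```python
from bisect import bisect_left

# Silica standard diameters from the text (nm).
SILICA_DIAMETERS_NM = [30.0, 50.0, 70.0, 100.0, 160.0]
# Hypothetical peak intensities (arbitrary units), NOT measured values.
PEAK_INTENSITIES = [5.0, 12.0, 25.0, 60.0, 200.0]


def intensity_to_diameter(intensity: float) -> float:
    """Map an intensity to a diameter by piecewise-linear interpolation.

    Values outside the calibrated range are clamped to the nearest standard.
    """
    xs, ys = PEAK_INTENSITIES, SILICA_DIAMETERS_NM
    if intensity <= xs[0]:
        return ys[0]
    if intensity >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, intensity)  # first calibration point >= intensity
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (intensity - x0) / (x1 - x0)


print(intensity_to_diameter(18.5))  # halfway between the 50 nm and 70 nm standards
```

Clamping outside the calibrated range is a conservative choice for the sketch; extrapolating beyond the smallest or largest standard would need a physical model of the intensity-size relationship.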

Automatic identification and intensity analysis of single EV
Next, EV samples derived from the LNCaP cell line were isolated and imaged by SPRM. The sensor chips were modified with a CD63 aptamer to specifically capture EVs. 16-19 As before, the raw images were processed by static background removal, spatial filtering, and contrast enhancement (Figure 4A). After filtering, the SNR of an EV increased from 2.16 to 11.26, a 5.21-fold improvement (Figure 4B). EVs were automatically identified by DEA. Using the single EV tracking algorithm, repeated binding events at the same location were also recorded, and the dwell time of EVs bound to the sensor chip was analyzed (Figure 4C). Compared with non-specific binding, specific binding between EVs and the sensor chip usually lasts longer (>10 s in this work). We were therefore able to detect surface protein markers on EVs. Using the standard curve from silica nanoparticles, the size of each EV can be calculated from the intensity change at a location after the EV binds.
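The dwell-time criterion can be expressed as a short sketch. The 10 s threshold comes from the text; the `BindingEvent` record, the frame numbers, and the 0.5 s frame interval are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BindingEvent:
    """One tracked EV: pixel location plus binding and unbinding frame indices."""
    x: int
    y: int
    bind_frame: int
    unbind_frame: int


def dwell_time_s(event: BindingEvent, frame_interval_s: float) -> float:
    """Time the EV stayed on the chip, in seconds."""
    return (event.unbind_frame - event.bind_frame) * frame_interval_s


def is_specific(event: BindingEvent, frame_interval_s: float,
                threshold_s: float = 10.0) -> bool:
    """Treat events that last longer than the threshold as specific binding."""
    return dwell_time_s(event, frame_interval_s) > threshold_s


FRAME_INTERVAL_S = 0.5  # assumed camera frame interval, not from the paper

long_event = BindingEvent(x=12, y=40, bind_frame=10, unbind_frame=40)
short_event = BindingEvent(x=12, y=40, bind_frame=50, unbind_frame=54)

for ev in (long_event, short_event):
    print(dwell_time_s(ev, FRAME_INTERVAL_S), is_specific(ev, FRAME_INTERVAL_S))
```

The two events share a location, mirroring the repeated binding events at one site that the tracking step records separately.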

2.4
EV classification with linear discriminant analysis, support vector machine, and decision tree algorithm
A panel of surface protein markers (Table S1), including CD63, EpCAM, HER2, PSMA, and PTK7, was applied to profile the signature of these EV samples. 20 First, a suitable classification algorithm for grouping EVs had to be found. Three commonly used algorithms, linear discriminant analysis (LDA), 20 support vector machine (SVM), 21 and the decision tree algorithm, were tested in this work. Empirically, the EVs were divided into three subgroups: EV-S (30-70 nm), EV-M (70-120 nm), and EV-L (120-160 nm). Using the combined EV surface protein markers together with the three-subset information, the three algorithms were used to classify the five kinds of EV samples.
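One way to picture the input to these classifiers is a feature vector with one entry per marker and size subgroup. The sketch below builds such a vector from a list of detections. The marker names and size ranges come from the text; the use of raw counts as features, the half-open bin edges, and the example detections are assumptions made for illustration.

```python
MARKERS = ["CD63", "EpCAM", "HER2", "PSMA", "PTK7"]
# (name, lower bound, upper bound) in nm; bins are treated as [lower, upper).
SUBGROUPS = [("EV-S", 30.0, 70.0), ("EV-M", 70.0, 120.0), ("EV-L", 120.0, 160.0)]


def subgroup_of(diameter_nm):
    """Return the subgroup name for a diameter, or None if it is out of range."""
    for name, lower, upper in SUBGROUPS:
        if lower <= diameter_nm < upper:
            return name
    return None


def signature(detections):
    """Count detections per (marker, subgroup) and flatten to a 15-entry vector.

    `detections` is a list of (marker, diameter_nm) pairs.
    """
    counts = {(m, s): 0 for m in MARKERS for s, _, _ in SUBGROUPS}
    for marker, diameter_nm in detections:
        sub = subgroup_of(diameter_nm)
        if sub is not None and marker in MARKERS:
            counts[(marker, sub)] += 1
    return [counts[(m, s)] for m in MARKERS for s, _, _ in SUBGROUPS]


# Made-up detections: the 200 nm event falls outside all subgroups and is ignored.
detections = [("CD63", 45.0), ("CD63", 80.0), ("CD63", 85.0),
              ("PSMA", 130.0), ("PSMA", 200.0)]
vector = signature(detections)
print(vector)
```

Each EV sample then contributes one such vector, and a classifier such as LDA is trained on vectors labeled by cell line.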
As shown in Figure 5, LDA showed the best performance, with an overall accuracy of 73% (Figure 5A,D). SVM was slightly worse than LDA, with an average accuracy of 70% (Figure 5B,E). The decision tree algorithm was the worst of the three, with an average accuracy of 63% (Figure 5C,F). The ROC curve analysis gave the same ranking as the average accuracy. The area under the ROC curve (AUC) using LDA was 0.997, 0.86, 0.91, 0.75, and 0.997 for EVs from A549, HepG2, MCF-7, LNCaP, and L-02, respectively (Figure 5G). The AUC using SVM was 0.93, 0.85, 0.88, 0.76, and 0.96 for EVs from A549, HepG2, MCF-7, LNCaP, and L-02, respectively (Figure 5H). The AUC using the decision tree algorithm was 0.87, 0.69, 0.77, 0.76, and 0.99 for EVs from A549, HepG2, MCF-7, LNCaP, and L-02, respectively (Figure 5I). From this analysis, LDA was the most suitable algorithm for EV grouping in this work.
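For readers unfamiliar with the AUC values reported here, the sketch below computes a one-vs-rest AUC from classifier scores using the rank (Mann-Whitney) formulation: the AUC equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The scores are made up, and this is a generic illustration rather than the routine used in DEA.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve from positive-class and negative-class scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Made-up scores for one cell line (positive) versus the other four (negative).
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2, 0.1]
print(auc(pos, neg))
```

An AUC of 1 means every positive sample outranks every negative one, and 0.5 corresponds to chance-level ranking.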

Optimizing the subgrouping of EVs
EV sizes ranged from 30 to 150 nm, and dividing them into only three subpopulations was an intuitive choice. In previous works, the thresholds for EV subgroups were simply set by the limits of the isolation technologies. 12 To date, few approaches have been able to finely separate EV subgroups. In SPRM imaging assays, however, a sizing accuracy of 10 nm was achieved, so setting more groups with narrower size ranges was feasible. Therefore, different search algorithms, including the hill climbing algorithm, simulated annealing algorithm, and genetic algorithm, were applied to find the optimal subgrouping of EVs. After the subgrouping was optimized, LDA was used to classify the five EV samples. As shown in Figure 6, the average accuracy increased by more than 10 percentage points compared with the three-subgroup division (73% with LDA), which underlines the importance of EV subgroup analysis. The three search algorithms gave similar average accuracies: 88%, 87%, and 86% for the hill climbing algorithm (Figure 6A,D), simulated annealing algorithm (Figure 6B,E), and genetic algorithm (Figure 6C,F), respectively. The AUC values of these three methods were also higher. The AUC using the hill climbing algorithm was 0.99, 0.96, 0.97, 0.95, and 1 for EVs from A549, HepG2, MCF-7, LNCaP, and L-02, respectively (Figure 6G). The AUC using the simulated annealing algorithm was 0.99, 0.96, 0.97, 0.95, and 0.999 for EVs from A549, HepG2, MCF-7, LNCaP, and L-02, respectively (Figure 6H). The AUC using the genetic algorithm was 0.99, 0.94, 0.98, 0.92, and 1 for EVs from A549, HepG2, MCF-7, LNCaP, and L-02, respectively (Figure 6I).

CONCLUSION
We have presented a software package for size-dependent subpopulation analysis of surface plasmon resonance microscopy images of single EVs. The software can automatically identify EVs, track their positions, record dwell times, and classify EV groups of various origins. The results show that five EV samples isolated from five kinds of cell culture medium were accurately classified using subgroup information. We therefore anticipate that these technological advances have potential for clinical cancer diagnosis.

SPRM system and image recording
The SPRM setup has been described previously. Briefly, the system is based on an inverted microscope (IX-83; Olympus) equipped with a high numerical aperture oil immersion objective (NA = 1.49). Collimated light from a superluminescent diode illuminated the Au/glass sensor chip at a highly inclined angle. Reflected light was collected by a CMOS camera (Prime TM; Photometrics).

Cell culture and EVs isolation
The human lung cancer cell line (A549, ATCC), liver cancer cell line (HepG2, ATCC), breast cancer cell line (MCF-7, ATCC), and normal liver cell line (L-02, ATCC) were cultured in high-glucose Dulbecco's modified Eagle's medium (DMEM) (Gibco) with 10% extracellular vesicle-free fetal bovine serum (EV-free FBS) (Gibco) and 1% penicillin-streptomycin (Gibco). The human prostate cancer cell line (LNCaP, ATCC) was cultured in Roswell Park Memorial Institute 1640 (RPMI-1640) (Hyclone) with 10% EV-free FBS and 1% penicillin-streptomycin (Gibco). The cells were seeded in culture dishes (Corning) at 30% confluence. The culture medium was collected after 48 h of culture, when the cells were 70%-80% confluent. EVs were isolated by differential centrifugation as described previously. Cell culture medium (200 mL) was first centrifuged at 300 g for 5 min, followed by centrifugation at 2000 g for 45 min to remove cells. The treated medium was centrifuged at 10,000 g for 60 min. The supernatant was then filtered through a 0.22 μm membrane (Millipore). Finally, the filtrate was ultracentrifuged at 100,000 g for 120 min. The small EVs were washed with 50 mL of 1× phosphate buffer saline (PBS, Hyclone), followed by ultracentrifugation at 100,000 g for 120 min. The purified small EVs were resuspended in 50 μL of 1× PBS.

Hill Climbing algorithm
The hill climbing algorithm is a local greedy search algorithm that repeatedly moves from the current solution to a better neighboring solution. We set a step size, start from an initial node, randomly pick the next node within the step-size range, and take whichever of the two particle-size division schemes gives the higher classification accuracy as the new initial node. We then randomly pick the next node again within the step size and repeat the comparison until the search converges on an optimum.
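A minimal sketch of this search over size boundaries is given below. In DEA the score is the classification accuracy obtained with a given size division; here `toy_accuracy` is a hypothetical stand-in with a single peak, and the step size, iteration budget, and starting cut points are assumed values. A fixed iteration budget replaces an explicit convergence test.

```python
import random

SIZE_MIN_NM, SIZE_MAX_NM = 30.0, 160.0


def toy_accuracy(cuts):
    """Hypothetical stand-in for classification accuracy; peaks at cuts (60, 100)."""
    target = (60.0, 100.0)
    err = sum((c - t) ** 2 for c, t in zip(cuts, target))
    return 1.0 / (1.0 + err / 1000.0)


def random_neighbor(cuts, step, rng):
    """Shift every cut point by a random amount within +/- step, staying in range."""
    moved = [min(SIZE_MAX_NM, max(SIZE_MIN_NM, c + rng.uniform(-step, step)))
             for c in cuts]
    return tuple(sorted(moved))


def hill_climb(score, start, step=5.0, iterations=500, seed=0):
    """Keep the candidate only when it scores higher than the current node."""
    rng = random.Random(seed)
    current = tuple(start)
    best = score(current)
    for _ in range(iterations):
        candidate = random_neighbor(current, step, rng)
        candidate_score = score(candidate)
        if candidate_score > best:
            current, best = candidate, candidate_score
    return current, best


start_cuts = (70.0, 120.0)  # the empirical three-subgroup boundaries
best_cuts, best_acc = hill_climb(toy_accuracy, start_cuts)
print(best_cuts, best_acc)
```

Because only improving moves are accepted, the returned score can never be lower than the score of the starting division, but the search may stop at a local optimum.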

Simulated Annealing algorithm
The simulated annealing algorithm is a similar greedy algorithm that obtains a better solution by comparing neighboring nodes, but it also accepts solutions that are worse than the current one with a given probability. After finding the particle-size partitioning that gives the locally highest classification accuracy, the search can still move on to the next node with some probability, which avoids being trapped in a local optimum.
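The acceptance of worse solutions can be sketched as follows. The text only states that worse solutions are accepted "with a given probability"; the Metropolis rule exp(Δ/T) with a geometric cooling schedule used here is a common choice and an assumption, as are `toy_accuracy`, the step size, and the temperature settings.

```python
import math
import random


def toy_accuracy(cuts):
    """Hypothetical stand-in for classification accuracy; peaks at cuts (60, 100)."""
    target = (60.0, 100.0)
    err = sum((c - t) ** 2 for c, t in zip(cuts, target))
    return 1.0 / (1.0 + err / 1000.0)


def neighbor(cuts, rng, step=5.0, lower=30.0, upper=160.0):
    """Shift every cut point by a random amount within +/- step, staying in range."""
    moved = [min(upper, max(lower, c + rng.uniform(-step, step))) for c in cuts]
    return tuple(sorted(moved))


def anneal(score, start, t_start=1.0, t_end=1e-3, cooling=0.95, seed=0):
    """Simulated annealing with Metropolis acceptance (assumed rule)."""
    rng = random.Random(seed)
    current, current_score = tuple(start), score(start)
    best, best_score = current, current_score
    temperature = t_start
    while temperature > t_end:
        candidate = neighbor(current, rng)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = current, current_score
        temperature *= cooling  # geometric cooling
    return best, best_score


start_cuts = (70.0, 120.0)
best_cuts, best_acc = anneal(toy_accuracy, start_cuts)
print(best_cuts, best_acc)
```

The sketch returns the best division seen during the run rather than the final one, so an accepted downhill move late in the schedule cannot degrade the reported result.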

Genetic algorithm
The genetic algorithm is a search algorithm that simulates natural evolution to find the optimal solution, treating each particle-size partitioning method as an evolving organism. The partition positions, converted to binary, are used as the individual code, and ten particle-size division methods are randomly selected for genetic operations to find the optimal solution.
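One possible reading of this binary encoding is sketched below: each bit marks whether a cut is placed at a candidate position on a 10 nm grid, and a population of ten individuals evolves by selection, single-point crossover, and bit-flip mutation. The grid, the operators, the elitism step, the mutation rate, and `toy_fitness` are all assumptions made for illustration; in DEA the fitness would be the classification accuracy.

```python
import random

# Candidate cut positions on a 10 nm grid between 30 and 160 nm (assumed grid).
CANDIDATE_CUTS = list(range(40, 160, 10))  # 40, 50, ..., 150


def decode(bits):
    """Turn a bit string into the list of active cut positions (nm)."""
    return [cut for cut, bit in zip(CANDIDATE_CUTS, bits) if bit]


def toy_fitness(bits):
    """Hypothetical fitness: reward cuts at 60 and 100 nm, penalize extra cuts."""
    cuts = set(decode(bits))
    target = {60, 100}
    return len(cuts & target) - 0.1 * len(cuts - target)


def evolve(fitness, pop_size=10, generations=40, mutation_rate=0.05, seed=0):
    """Evolve a population and return the best individual and the fitness history."""
    rng = random.Random(seed)
    n = len(CANDIDATE_CUTS)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0][:]]          # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            point = rng.randrange(1, n)        # single-point crossover
            child = a[:point] + b[point:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]         # bit-flip mutation
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


best_bits, history = evolve(toy_fitness)
print(decode(best_bits), history[-1])
```

Keeping the best individual unchanged in every generation guarantees that the best fitness never decreases, which makes the run easy to monitor.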

EV analysis software
The software was developed mainly with Python's PyQt 5.15.2 framework and integrates EV image preprocessing, EV recognition, EV intensity analysis, size-dependent subpopulation analysis, and EV classification. EV images are preprocessed by differential and spatial filtering methods to obtain EV particle images with high signal-to-noise ratios. EV recognition is performed by the YOLOv4 particle detection algorithm built with PyTorch. The computer workstation contains four NVIDIA GeForce RTX 2080 Ti GPUs and an Intel Xeon Silver 4116 CPU. The model was trained with 3500 manually labeled SPRM images containing EV particles; after training, it automatically identifies and labels EV particles in the images and records the particle locations. EV intensity was obtained by differencing the intensity values before and after the appearance of a particle, and the maximum brightness of each particle was determined by exhaustively scanning the frames in which the particle appears. The EV source analysis uses the classification algorithm to classify EV samples from the multidimensional information matrix of mixed EV samples, yielding the EV source together with the classification accuracy, ROC, and other parameters.

Spatial filtering
The spatial filtering algorithm was used to improve the image signal-to-noise ratio and thus the particle recognition rate. The SPR signal appears as two circles in the frequency domain: the signal lies on the circles and is mainly concentrated at their intersection, whereas the background noise lies inside and outside the rings. We construct a circular filter, multiply it with the SPRM signal in the frequency domain after extracting the intersection of the two circles, and then apply the inverse Fourier transform to obtain an SPR image with a high signal-to-noise ratio.
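The general idea of masking in the frequency domain and transforming back can be shown with a simplified sketch. It uses a plain ring-shaped band-pass mask centered on zero frequency, which is a simplification and not the two-circle intersection filter described above. The naive discrete Fourier transform is written out so the example has no dependencies and is only practical for tiny images; the test image and ring radii are made up.

```python
import cmath
import math


def dft2(img, inverse=False):
    """Naive 2D discrete Fourier transform (forward or inverse)."""
    n_rows, n_cols = len(img), len(img[0])
    sign = 1 if inverse else -1
    out = [[0j] * n_cols for _ in range(n_rows)]
    for u in range(n_rows):
        for v in range(n_cols):
            total = 0j
            for r in range(n_rows):
                for c in range(n_cols):
                    angle = sign * 2 * math.pi * (u * r / n_rows + v * c / n_cols)
                    total += img[r][c] * cmath.exp(1j * angle)
            out[u][v] = total / (n_rows * n_cols) if inverse else total
    return out


def ring_mask(n_rows, n_cols, r_min, r_max):
    """1 where the frequency radius lies in [r_min, r_max], 0 elsewhere."""
    mask = []
    for u in range(n_rows):
        fu = min(u, n_rows - u)  # distance from zero frequency along rows
        row = []
        for v in range(n_cols):
            fv = min(v, n_cols - v)
            row.append(1.0 if r_min <= math.hypot(fu, fv) <= r_max else 0.0)
        mask.append(row)
    return mask


def spatial_filter(img, r_min, r_max):
    """Multiply the spectrum by the ring mask and transform back to image space."""
    spectrum = dft2(img)
    mask = ring_mask(len(img), len(img[0]), r_min, r_max)
    filtered = [[s * m for s, m in zip(s_row, m_row)]
                for s_row, m_row in zip(spectrum, mask)]
    return [[z.real for z in row] for row in dft2(filtered, inverse=True)]


# Made-up 8x8 test image: constant background + wanted fringe + fast ripple.
N = 8
image = [[5.0 + math.cos(2 * math.pi * 2 * c / N) + 0.5 * math.cos(math.pi * c)
          for c in range(N)] for r in range(N)]
filtered = spatial_filter(image, 1.5, 2.5)
print([round(x, 6) for x in filtered[0]])
```

In this toy case the ring keeps only the fringe at spatial frequency 2 and rejects both the constant background and the fast ripple. A practical implementation would use a fast Fourier transform and a mask shaped to the measured SPR rings.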

AUTHOR CONTRIBUTIONS
Hui Yu and Yuting Yang conceived the idea; Xie Feng developed the software package; Xie Feng and Chunhui Zhai analyzed data; Chunhui Zhai and Jiaying Xu performed the experiments; Chunhui Zhai, Xie Feng, and Hui Yu wrote the paper.

ACKNOWLEDGMENTS
This work was supported by the National Natural Science Foundation of China (grant no. 22027807).

CONFLICT OF INTEREST STATEMENT
The authors declare no conflict of interest.